C++ Remove punctuation from String

Using the std::remove_copy_if algorithm:

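// Needs <algorithm>, <cctype>, <iterator> and <string>.
std::string text, result;
std::remove_copy_if(text.begin(), text.end(),
                    std::back_inserter(result), // Store output
                    // Take unsigned char: std::ispunct is undefined for negative values.
                    [](unsigned char c) { return std::ispunct(c) != 0; });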

POW already has a good answer if you need the result as a new string. This answer shows how to handle it if you want an in-place update.

The first part of the recipe is std::remove_if, which can remove the punctuation efficiently, packing all the non-punctuation as it goes.

std::remove_if (text.begin (), text.end (), ispunct)

Unfortunately, std::remove_if doesn't shrink the string to the new size. It can't, because it only sees iterators and has no access to the container itself. Therefore, there are junk characters left in the string after the packed result.

To handle this, std::remove_if returns an iterator that marks the end of the part of the string that's still needed. This can be used with string's erase method, leading to the following idiom…

text.erase (std::remove_if (text.begin (), text.end (), ispunct), text.end ());
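To make the two steps concrete, here is a minimal sketch. The helper names kept_after_remove and strip_punct_in_place are hypothetical, and the lambda with an unsigned char parameter is my own choice (it avoids passing a negative char to std::ispunct); it is not part of the original answer.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: runs only the std::remove_if step and reports how many
// characters were kept. The string's size() is deliberately left untouched.
std::size_t kept_after_remove(std::string& text) {
    auto new_end = std::remove_if(text.begin(), text.end(),
        [](unsigned char c) { return std::ispunct(c) != 0; });
    // Only [begin, new_end) is meaningful now; the tail holds leftover characters.
    return static_cast<std::size_t>(new_end - text.begin());
}

// Hypothetical helper: the full idiom, remove_if followed by erase.
void strip_punct_in_place(std::string& text) {
    text.erase(std::remove_if(text.begin(), text.end(),
                   [](unsigned char c) { return std::ispunct(c) != 0; }),
               text.end());
}
```

Calling kept_after_remove on "a,b.c!" should report 3 kept characters while the string still has length 6; strip_punct_in_place trims that tail as well.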

I call this an idiom because it's a common technique that works in many situations. Types other than string provide suitable erase methods, and std::remove (and probably some other algorithm library functions I've forgotten for the moment) takes the same approach of closing the gaps left by the items it removes, but leaving the container resizing to the caller.
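As an illustration of the same idiom on another container, the sketch below applies std::remove plus erase to a std::vector<int>. The helper name erase_value is hypothetical and only serves the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: drop every element equal to `value`.
// std::remove packs the kept elements to the front; erase trims the tail.
void erase_value(std::vector<int>& items, int value) {
    items.erase(std::remove(items.begin(), items.end(), value), items.end());
}
```

For a vector holding 1, 0, 2, 0, 3, calling erase_value(v, 0) should leave 1, 2, 3.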

Using a loop with erase:

#include <string>
#include <iostream>
#include <cctype>

int main() {

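    std::string text = "this. is my string. it's here.";

    for (int i = 0, len = static_cast<int>(text.size()); i < len; i++)
    {
        // Cast to unsigned char: std::ispunct is undefined for negative values.
        if (std::ispunct(static_cast<unsigned char>(text[i])))
        {
            text.erase(i--, 1);                  // step back to recheck this slot
            len = static_cast<int>(text.size()); // the string just got shorter
        }
    }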

    std::cout << text;
    return 0;
}

Output

this is my string its here

When you delete a character, the size of the string changes, so len has to be updated whenever a deletion occurs. Deleting the current character also shifts the next character into the current position. If you don't decrement the loop counter, the character that follows a punctuation character will not be checked.
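The effect of the decrement is easiest to see with two punctuation characters in a row. The sketch below mirrors the loop with a hypothetical helper, strip_with_loop, whose step_back flag (also hypothetical) switches the decrement on or off.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper mirroring the loop above.
// step_back == true applies the decrement after each erase; false omits it.
std::string strip_with_loop(std::string text, bool step_back) {
    for (int i = 0; i < static_cast<int>(text.size()); ++i) {
        if (std::ispunct(static_cast<unsigned char>(text[i]))) {
            text.erase(i, 1);
            if (step_back) {
                --i;  // the next character now sits at index i, so revisit it
            }
        }
    }
    return text;
}
```

With the input "a,,b", the version without the decrement should return "a,b" because the second comma slides into the slot that was just checked, while the version with the decrement returns "ab".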
