regex – How can I remove punctuation from input text in Java?

This first removes every character that is not a letter or a space, folds the result to lowercase, then splits it on whitespace, doing all the work in a single line:

String[] words = instring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");

Spaces are initially left in the input so the split will still work.

By removing the rubbish characters before splitting, you avoid having to loop through the elements.
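As a minimal runnable sketch of the one-liner above: the class name WordSplitter and the method name toWords are hypothetical, chosen only for illustration. The trim() call is an addition that is not in the original answer; without it, input that starts with a space would produce an empty first element.

```java
import java.util.Arrays;

final class WordSplitter {
    // Hypothetical helper wrapping the one-liner from the answer.
    static String[] toWords(String text) {
        return text.replaceAll("[^a-zA-Z ]", "") // keep only ASCII letters and spaces
                   .toLowerCase()                 // fold to lowercase
                   .trim()                        // assumption: drop leading/trailing spaces
                   .split("\\s+");                // split on runs of whitespace
    }

    public static void main(String[] args) {
        // prints [hello, world, its, am]
        System.out.println(Arrays.toString(toWords("Hello, World! It's 9 a.m.")));
    }
}
```

Note that only the literal space character survives the replacement, so tabs and newlines are removed as well and the words on either side of them are joined.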

You can use the following regular expression construct:

Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

// Strings are immutable, so the result has to be assigned
inputString = inputString.replaceAll("\\p{Punct}", "");
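A small sketch of the \p{Punct} approach in use; the class name PunctuationStripper and the method name strip are hypothetical. By default \p{Punct} covers only the ASCII characters listed above, so letters, digits, and whitespace are left alone.

```java
final class PunctuationStripper {
    // Hypothetical helper: removes every ASCII punctuation character.
    static String strip(String text) {
        return text.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        // prints Hello World ab 50 off
        System.out.println(strip("Hello, World! (a_b) 50% off?"));
    }
}
```

Unlike the letters-only pattern, this keeps digits and all whitespace, but it does remove the underscore.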

You may try this:

import java.util.Scanner;

Scanner scan = new Scanner(System.in);
System.out.println("Type a sentence and press enter.");
String input = scan.nextLine();
// \W matches every character that is not a letter, digit, or underscore
String strippedInput = input.replaceAll("\\W", "");
System.out.println("Your string: " + strippedInput);

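\W (equivalent to [^\w]) matches a non-word character, so the above regular expression will match and remove every character that is not a letter, digit, or underscore. Note that this includes spaces, so the remaining words run together.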
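To see what \W actually removes, here is a short sketch; the class and method names are hypothetical. The second method is a variation that is not in the original answer: it adds \s to the negated class so that whitespace is kept.

```java
final class NonWordDemo {
    // Removes everything except letters, digits, and underscores.
    static String stripNonWord(String text) {
        return text.replaceAll("\\W", "");
    }

    // Variation (an assumption, not from the answer): also keep whitespace.
    static String stripNonWordKeepSpaces(String text) {
        return text.replaceAll("[^\\w\\s]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("Hello, World_1!"));           // prints HelloWorld_1
        System.out.println(stripNonWordKeepSpaces("Hello, World_1!")); // prints Hello World_1
    }
}
```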
