java – Regular Expressions on Punctuation

Java does support POSIX character classes, though with its own syntax. For punctuation, the Java equivalent of [:punct:] is \p{Punct}.

Please see the following link for details.

Here is a concrete, working example that uses the expression from the comments:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFindPunctuation {

    public static void main(String[] args) {
        // The backslash must be doubled inside a Java string literal.
        Pattern p = Pattern.compile("\\p{Punct}");

        Matcher m = p.matcher("One day! when I was walking. I found your pants? just kidding...");
        int count = 0;
        // find() advances to the next match and returns false when none remain.
        while (m.find()) {
            count++;
            System.out.println("\nMatch number: " + count);
            System.out.println("start() : " + m.start());
            System.out.println("end()   : " + m.end());
            System.out.println("group() : " + m.group());
        }
    }
}

I would try a character class regex similar to:

[.!?\-]

Add whatever characters you wish to match inside the brackets. Be careful to escape any characters that have a special meaning inside a character class, such as -, ^, ] and \.

You then iterate through the matches by calling Matcher.find() until it returns false.
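As a rough sketch of that loop, the following collects every character matched by the class above. The class name CustomPunctuationDemo, the helper findMarks, and the sample sentence are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomPunctuationDemo {

    // Character class from the answer: period, exclamation mark,
    // question mark, and an escaped hyphen. The backslash is doubled
    // because this is a Java string literal.
    private static final Pattern MARKS = Pattern.compile("[.!?\\-]");

    // Hypothetical helper: returns each matched character, in order,
    // by calling find() until it returns false.
    static List<String> findMarks(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = MARKS.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [-, !, ?, .]
        System.out.println(findMarks("Well-done! Really? Yes."));
    }
}
```

To match more characters, add them inside the brackets; a comma, for instance, needs no escaping.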

I would try

\W

It matches any non-word character. This includes spaces and punctuation, but not underscores. It’s equivalent to [^A-Za-z0-9_].
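A small sketch of how \W behaves, assuming the default (non-Unicode) matching mode. The class name NonWordDemo, the helper findNonWord, and the input string are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonWordDemo {

    // \W written as a Java string literal; by default it matches
    // anything outside [A-Za-z0-9_].
    private static final Pattern NON_WORD = Pattern.compile("\\W");

    // Hypothetical helper: returns every non-word character found, in order.
    static List<String> findNonWord(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = NON_WORD.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The underscore is a word character, so it is skipped;
        // the space and the exclamation mark are both reported.
        for (String s : findNonWord("a_b c!")) {
            System.out.println("matched: '" + s + "'");
        }
    }
}
```

Because \W also matches whitespace, it is a looser test than \p{Punct}. Note too that \p{Punct} does match the underscore, while \W does not.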
