Java Regular Expressions
http://www.tutorialspoint.com/java/java_regular_expressions.htm
Copyright tutorialspoint.com
Java provides the java.util.regex package for pattern matching with regular expressions. Java's
regular expression syntax is very similar to Perl's and is easy to learn.
A regular expression is a special sequence of characters that helps you match or find other strings
or sets of strings, using a specialized syntax held in a pattern. Regular expressions can be used to
search, edit, or manipulate text and data.
The java.util.regex package primarily consists of the following three classes:
Pattern Class: A Pattern object is a compiled representation of a regular expression. The
Pattern class provides no public constructors. To create a pattern, you must first invoke one
of its public static compile methods, which will then return a Pattern object. These methods
accept a regular expression as the first argument.
Matcher Class: A Matcher object is the engine that interprets the pattern and performs
match operations against an input string. Like the Pattern class, Matcher defines no public
constructors. You obtain a Matcher object by invoking the matcher method on a Pattern
object.
PatternSyntaxException: A PatternSyntaxException object is an unchecked exception that
indicates a syntax error in a regular expression pattern.
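As a minimal sketch of how the three classes fit together, the example below compiles a pattern, obtains a matcher from it, and catches the exception raised by a malformed expression. The class name RegexBasics and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexBasics {
    // Returns true if the regular expression is found anywhere in the input.
    static boolean contains(String regex, String input) {
        Pattern p = Pattern.compile(regex);  // static factory; Pattern has no public constructor
        Matcher m = p.matcher(input);        // a Matcher is obtained from the Pattern
        return m.find();
    }

    // Returns true if the expression compiles, false if it has a syntax error.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception, so catching it is optional.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(contains("\\d+", "QT3000"));  // true
        System.out.println(isValid("(unclosed"));        // false: the group is never closed
    }
}
```

Because PatternSyntaxException is unchecked, code that compiles only fixed, known-good expressions usually does not need the try/catch; it matters mostly when the expression comes from user input.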
Capturing Groups:
Capturing groups are a way to treat multiple characters as a single unit. They are created by
placing the characters to be grouped inside a set of parentheses. For example, the regular
expression (dog) creates a single group containing the letters "d", "o", and "g".
Capturing groups are numbered by counting their opening parentheses from left to right. In the
expression ((A)(B(C))), for example, there are four such groups:
((A)(B(C)))
(A)
(B(C))
(C)
To find out how many groups are present in the expression, call the groupCount method on a
matcher object. The groupCount method returns an int showing the number of capturing groups
present in the matcher's pattern.
There is also a special group, group 0, which always represents the entire expression. This group is
not included in the total reported by groupCount.
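The numbering rule can be checked with a short sketch. It assumes an illustrative pattern with four pairs of parentheses, ((A)(B(C))), matched against the input "ABC"; the class name GroupNumbering and the method groups are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    // Returns groups 0..groupCount() for a full match, or an empty array if there is no match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        // groupCount() excludes group 0, so the array needs one extra slot.
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups("((A)(B(C)))", "ABC");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
        // group 0: ABC, group 1: ABC, group 2: A, group 3: BC, group 4: C
    }
}
```

Group 0 and group 1 hold the same text here only because the outermost parentheses happen to wrap the whole expression.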
Example:
The following example illustrates how to find a digit string in a given alphanumeric string:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatches
{
    public static void main(String[] args) {
        // String to be scanned to find the pattern.
        String line = "This order was placed for QT3000! OK?";
        String pattern = "(.*)(\\d+)(.*)";
        // Create a Pattern object.
        Pattern r = Pattern.compile(pattern);
        // Now create a Matcher object.
        Matcher m = r.matcher(line);
        if (m.find()) {
            System.out.println("Found value: " + m.group(0));
            System.out.println("Found value: " + m.group(1));
            System.out.println("Found value: " + m.group(2));
        } else {
            System.out.println("NO MATCH");
        }
    }
}
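One detail of the pattern above is worth noting: the first (.*) is greedy, so it consumes as much as it can and leaves only the final digit for (\\d+). The sketch below compares it with a reluctant variant, (.*?), which is an alternative introduced here for illustration; the class name GreedyGroups and the method digits are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyGroups {
    // Returns the text captured by group 2 of the first match, or null if nothing matches.
    static String digits(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Greedy (.*) backtracks only far enough to leave one digit for \d+.
        System.out.println(digits("(.*)(\\d+)(.*)", line));   // prints 0
        // Reluctant (.*?) stops at the first digit, so \d+ captures the whole number.
        System.out.println(digits("(.*?)(\\d+)(.*)", line));  // prints 3000
    }
}
```

If the goal is only to extract the number, a simpler pattern such as \\d+ with group(0) avoids the question entirely.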
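Subexpression    Matches
[...]            Any single character listed in the brackets.
[^...]           Any single character not listed in the brackets.
\A               Beginning of the entire input.
\z               End of the entire input.
\Z               End of the entire input, except for a final line terminator, if any.
re*              0 or more occurrences of the preceding expression.
re+              1 or more occurrences of the preceding expression.
re?              0 or 1 occurrence of the preceding expression.
re{n}            Exactly n occurrences of the preceding expression.
re{n,}           n or more occurrences of the preceding expression.
re{n,m}          At least n and at most m occurrences of the preceding expression.
a|b              Either a or b.
(re)             Groups regular expressions and remembers the matched text.
(?:re)           Groups regular expressions without remembering the matched text.
(?>re)           Independent (atomic) group; matches without backtracking into the group.
\w               Word characters.
\W               Nonword characters.
\s               Whitespace.
\S               Nonwhitespace.
\d               Digits.
\D               Nondigits.
\G               The point where the previous match ended.
\n               Back-reference to capturing group number n, where n is a digit.
\b               A word boundary.
\B               A nonword boundary.
\Q               Quotes (escapes) all characters up to \E.
\E               Ends quoting begun with \Q.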
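Two of these constructs, the back-reference and \Q...\E quoting, are easier to follow in code. The following is a small sketch with hypothetical names (SyntaxSamples, hasRepeatedWord, matchesLiterally), not part of the java.util.regex API.

```java
import java.util.regex.Pattern;

class SyntaxSamples {
    // True if the input contains the same word twice in a row, such as "is is".
    // \1 refers back to whatever group 1 captured; \b keeps the match on word boundaries.
    static boolean hasRepeatedWord(String input) {
        return Pattern.compile("\\b(\\w+) \\1\\b").matcher(input).find();
    }

    // True if the whole input equals the literal text, with metacharacters taken literally.
    // Assumes the literal does not itself contain "\E"; Pattern.quote handles that case.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.matches("\\Q" + literal + "\\E", input);
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedWord("this is is fine"));  // true
        System.out.println(hasRepeatedWord("the then"));         // false
        System.out.println(matchesLiterally("1+1", "1+1"));      // true
        System.out.println(Pattern.matches("1+1", "1+1"));       // false: + is a quantifier here
    }
}
```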
Index Methods:
Index methods provide useful index values that show precisely where the match was found in the
input string:
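SN  Method and description
1   public int start(): returns the start index of the previous match.
2   public int start(int group): returns the start index of the subsequence captured by the given group during the previous match operation.
3   public int end(): returns the offset after the last character matched.
4   public int end(int group): returns the offset after the last character of the subsequence captured by the given group during the previous match operation.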
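As a rough sketch of how the index values relate to capturing groups, the example below returns the start and end offsets of one group from the first match. The pattern, the input, and the names IndexMethods and span are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndexMethods {
    // Returns {start, end} of the given group for the first match, or null if nothing matches.
    static int[] span(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(group), m.end(group) };
    }

    public static void main(String[] args) {
        String input = "mail bob@example now";
        String regex = "(\\w+)@(\\w+)";
        for (int group = 0; group <= 2; group++) {
            int[] s = span(regex, input, group);
            // end is exclusive, so substring(start, end) reproduces the group text.
            System.out.println(group + ": [" + s[0] + ", " + s[1] + ") "
                    + input.substring(s[0], s[1]));
        }
        // 0: [5, 16) bob@example
        // 1: [5, 8) bob
        // 2: [9, 16) example
    }
}
```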
Study Methods:
Study methods review the input string and return a Boolean indicating whether or not the pattern
is found:
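SN  Method and description
1   public boolean lookingAt(): attempts to match the input sequence, starting at the beginning of the region, against the pattern.
2   public boolean find(): attempts to find the next subsequence of the input sequence that matches the pattern.
3   public boolean find(int start): resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
4   public boolean matches(): attempts to match the entire region against the pattern.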
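The difference between lookingAt, matches, and find is mostly a question of anchoring: lookingAt must match from the start of the input, matches must consume the whole input, and find may match anywhere. A minimal sketch, with the hypothetical names StudyMethods and study:

```java
import java.util.regex.Pattern;

class StudyMethods {
    // Returns { lookingAt, matches, find } for the given expression and input.
    // A fresh matcher is used for each call so that no state carries over.
    static boolean[] study(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(input).lookingAt(),  // must match at the beginning of the input
            p.matcher(input).matches(),    // must match the entire input
            p.matcher(input).find()        // may match anywhere in the input
        };
    }

    public static void main(String[] args) {
        boolean[] a = study("foo", "fooooo");
        System.out.println(a[0] + " " + a[1] + " " + a[2]);  // true false true
        boolean[] b = study("foo", "xfoo");
        System.out.println(b[0] + " " + b[1] + " " + b[2]);  // false false true
    }
}
```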
Replacement Methods:
Replacement methods are useful methods for replacing text in an input string:
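SN  Method and description
1   public Matcher appendReplacement(StringBuffer sb, String replacement): implements a non-terminal append-and-replace step.
2   public StringBuffer appendTail(StringBuffer sb): implements a terminal append-and-replace step.
3   public String replaceAll(String replacement): replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
4   public String replaceFirst(String replacement): replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
5   public static String quoteReplacement(String s): returns a literal replacement string for the specified string.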
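A short sketch of the simpler replacement calls may help here. It assumes illustrative patterns and inputs, and the wrapper names (ReplacementMethods, replaceAllLiteral, and so on) are hypothetical. Note that a replacement string treats $n as a reference to group n, which is why Matcher.quoteReplacement is useful when the replacement is arbitrary text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementMethods {
    // Replaces every match; $n in the replacement refers to capturing group n.
    static String replaceAll(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Replaces only the first match.
    static String replaceFirst(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceFirst(replacement);
    }

    // Replaces every match with the literal text, so "$" and "\" are not interpreted.
    static String replaceAllLiteral(String regex, String input, String literal) {
        return Pattern.compile(regex).matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(replaceAll("\\bcat\\b", "cat cat cattie", "dog"));     // dog dog cattie
        System.out.println(replaceFirst("\\bcat\\b", "cat cat cattie", "dog"));   // dog cat cattie
        System.out.println(replaceAll("(\\w+)@(\\w+)", "bob@example", "$2@$1"));  // example@bob
        System.out.println(replaceAllLiteral("X", "price: X", "$5"));             // price: $5
    }
}
```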
The following example counts the number of times the word "cat" appears in the input
string:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatches
{
    private static final String REGEX = "\\bcat\\b";
    private static final String INPUT =
        "cat cat cat cattie cat";
    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get a matcher object
        int count = 0;
        while (m.find()) {
            count++;
            System.out.println("Match number " + count);
            System.out.println("start(): " + m.start());
            System.out.println("end(): " + m.end());
        }
    }
}
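Running this program reports four matches, numbered 1 through 4, at indices 0 to 3, 4 to 7,
8 to 11, and 19 to 22 (start inclusive, end exclusive). The word "cattie" is skipped.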
You can see that this example uses word boundaries to ensure that the letters "c" "a" "t" are not
merely a substring in a longer word. It also gives some useful information about where in the input
string the match has occurred.
With no argument, the start method returns the start index of the previous match, and end returns
the index of the last character matched, plus one. The overloads start(int group) and
end(int group) return the same offsets for the subsequence captured by the given group.
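The following example uses appendReplacement and appendTail to replace every match of a pattern;
the REGEX, INPUT, and REPLACE values shown are illustrative assumptions:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexMatches
{
    // Illustrative values (assumed for this example).
    private static final String REGEX = "a*b";
    private static final String INPUT = "aabfooaabfooabfoob";
    private static final String REPLACE = "-";
    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        // get a matcher object
        Matcher m = p.matcher(INPUT);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, REPLACE);
        }
        // Copy whatever remains after the last match.
        m.appendTail(sb);
        System.out.println(sb.toString());
    }
}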
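The appendReplacement and appendTail loop is most useful when the replacement has to be computed from each match, which replaceAll with a fixed string cannot do. The sketch below doubles every number in a string; the class name ComputedReplacement and the method doubleNumbers are hypothetical, and the input is assumed to contain only numbers small enough for int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ComputedReplacement {
    // Doubles every run of digits in the input and leaves all other text unchanged.
    // Assumes each run of digits fits in an int; longer runs make parseInt throw.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // Appends the text since the previous match, then the computed replacement.
            m.appendReplacement(sb, String.valueOf(doubled));
        }
        // Appends the text after the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("a1b22c"));  // a2b44c
    }
}
```

StringBuffer is used because the appendReplacement overload that accepts a StringBuilder is only available from Java 9 onward.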