Regular Expressions
Where can I use them?
Powerful in:
- Search, search and replace, text processing
- Text editors: vi, EditPlus
- Programming languages: Perl, Java
- Command-line tools: grep, awk
07/09/15Friday, January 28,
Before RegEx
Wildcard
*.txt
My_report*.doc
Here the * stands for any number of any characters.
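The two wildcard patterns above can be tried with Python's standard fnmatch module. This is only a sketch: Python is an assumption here (the slides do not name a tool for wildcards), and the file names are invented for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, made up for this example.
names = ["notes.txt", "My_report_2015.doc", "My_report.doc", "summary.doc"]

# * stands for any number of any characters.
txt_files = [n for n in names if fnmatchcase(n, "*.txt")]
reports = [n for n in names if fnmatchcase(n, "My_report*.doc")]

print(txt_files)
print(reports)
```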
Regular expressions (RegEx) tend to be easier to write than they are to read.
What is RegEx
A regular expression is a pattern that describes or matches a set of strings.
Matched text: the chunk of text that matches the regular expression.
ca[trn]
Matches car, can, cat
EditPlus is used throughout this presentation as the tool to demonstrate regular expressions.
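The ca[trn] pattern can also be checked outside an editor. The sketch below uses Python's re module as a stand-in for EditPlus (an assumption, since the presentation itself only uses EditPlus); the word list is invented for illustration.

```python
import re

pattern = re.compile(r"ca[trn]")

# Sample words, made up for this example.
words = ["car", "can", "cat", "cab", "cap"]

# fullmatch requires the whole word to fit the pattern.
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)
```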
Structure of RegEx
Made up of normal characters and metacharacters.
Metacharacters have a special function:
$ ^ . \ [] \( \) + ?
$ means end of line
^ means start of line
Literal match
RegEx: cat will match the word cat.
It will also match words that contain cat, like concatenation, delicate, located, modification.
This is sometimes not desired. Solution?
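A small sketch of the behaviour described above, using Python's re module as a stand-in for the editor search (an assumption). The literal pattern cat is found inside every listed word, not only in the word cat itself.

```python
import re

# Words from the slide plus one non-matching word added for contrast.
words = ["cat", "concatenation", "delicate", "located", "modification", "dog"]

# search looks for the pattern anywhere inside the string.
hits = [w for w in words if re.search(r"cat", w)]
print(hits)
```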
Matching
Match the space before and after cat:
" cat " (space, cat, space)
Still a problem?
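To see why requiring a space on both sides is still a problem, here is a sketch in Python's re module (an assumption; the slides use EditPlus). The sample lines are invented: the pattern skips concatenation as intended, but it also misses cat at the start of a line and cat followed by punctuation.

```python
import re

pattern = re.compile(r" cat ")

# Sample lines, made up for this example.
lines = [
    "the cat sat",            # space on both sides
    "concatenation is fun",   # cat inside another word
    "cat naps all day",       # cat at the start of the line
    "I fed the cat.",         # cat followed by a period
]

hits = [s for s in lines if pattern.search(s)]
print(hits)
```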
Character class
Want to search for in or on.
Searching with RegEx: [io]n will match both in and on.
[ ] : used to specify a set of characters to select from.
[a-h] : the set of all characters from a to h
[4-9A-T] : the digits 4 to 9 and the capital letters A to T
Character class
It can also contain individual characters, as in [acV5y0]
[0-9] : ?
[0-9][0-9] : ?
18[0-9][0-9] : ?
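One way to explore the three questions above is to test each pattern against a few strings. This sketch uses Python's re module (an assumption, not the tool used in the presentation), and the sample strings are invented.

```python
import re

# Sample strings, made up for this example.
samples = ["7", "42", "1857", "1957", "18a7"]

# fullmatch requires the entire string to fit the pattern.
one_digit = [s for s in samples if re.fullmatch(r"[0-9]", s)]
two_digits = [s for s in samples if re.fullmatch(r"[0-9][0-9]", s)]
years_18xx = [s for s in samples if re.fullmatch(r"18[0-9][0-9]", s)]

print(one_digit)   # a single digit
print(two_digits)  # exactly two digits
print(years_18xx)  # 18 followed by two digits
```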
Example
Set of vowels:
[aeiou]
Set of consonants:
[bcdfghjklmnpqrstvwxyz]
Consider matching words which start with 2 vowels and end with a consonant:
[aeiou][aeiou][bcdfghjklmnpqrstvwxyz] ?
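The question mark after the pattern can be investigated with a quick experiment. In this Python sketch (the re module and the word list are assumptions made for illustration), the pattern describes exactly three characters, so it fits only three-letter words as a whole, although it can still be found inside a longer word.

```python
import re

pattern = re.compile(r"[aeiou][aeiou][bcdfghjklmnpqrstvwxyz]")

# Sample words, made up for this example.
words = ["eat", "out", "aim", "audit", "tree"]

# Whole-word fit: vowel, vowel, consonant and nothing else.
whole = [w for w in words if pattern.fullmatch(w)]

# Found anywhere inside the word, e.g. "aud" in "audit".
partial = [w for w in words if pattern.search(w)]

print(whole)
print(partial)
```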
Negation
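The absence of a character or a set of characters is expressed with the ^ symbol at the start of a character class.
[^ab^8] : means any character other than a, b, ^ or 8 (a ^ that is not first inside the brackets is an ordinary character)
[^c-p] : means any character other than c..p
[^t]ion : selects text ending with ion where the character before ion is not t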
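A sketch of negated character classes in Python's re module (an assumption; the slides use EditPlus). The sample characters and words are invented, and the $ in the second pattern is added here only to pin ion to the end of each word.

```python
import re

# Single characters tested against [^ab^8].
chars = ["a", "b", "^", "8", "c", "9"]
allowed = [c for c in chars if re.fullmatch(r"[^ab^8]", c)]

# Words ending in ion; [^t] needs some character other than t before ion.
words = ["nation", "region", "fusion", "ion"]
no_t_before = [w for w in words if re.search(r"[^t]ion$", w)]

print(allowed)
print(no_t_before)
```

Note that the bare word ion is not matched, because [^t] must consume one character.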
Start/End of line
^ : indicates start of line
$ : indicates end of line
Example:
Search for lines starting with I
Use RegEx: ^I
Search for lines ending with is
Use RegEx: is$
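The two anchors can be tried line by line. This is a minimal sketch in Python's re module (an assumption), with sample text invented for illustration.

```python
import re

# Sample text, made up for this example.
text = "I like regular expressions\nThis is\nIt is here\nthat is"
lines = text.splitlines()

# ^ anchors the match to the start of the line, $ to the end.
starts_with_i = [ln for ln in lines if re.search(r"^I", ln)]
ends_with_is = [ln for ln in lines if re.search(r"is$", ln)]

print(starts_with_i)
print(ends_with_is)
```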
Any character match
. : matches any single character
e.e : matches e, then any one character, then e.
Try e.e
If only words should be found, restrict the middle character to a letter by changing the query to
e[a-z]e
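The difference between e.e and e[a-z]e shows up as soon as the middle character is not a letter. A sketch using Python's re module (an assumption), with invented sample strings:

```python
import re

# Sample strings, made up for this example.
samples = ["eye", "eve", "e e", "e9e", "ee", "even"]

# . accepts any single character in the middle, including a space or digit.
any_middle = [s for s in samples if re.fullmatch(r"e.e", s)]

# [a-z] accepts only a lowercase letter in the middle.
letter_middle = [s for s in samples if re.fullmatch(r"e[a-z]e", s)]

print(any_middle)
print(letter_middle)
```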
Repeated match
* : matches the previous character or character class zero or more times
be* : matches b followed by zero or more e (b, be, bee, ...)
+ : similar to *; the only difference is that it matches one or more.
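The only difference between * and + is whether zero repetitions are accepted. A short sketch in Python's re module (an assumption), with invented sample strings:

```python
import re

# Sample strings, made up for this example.
samples = ["b", "be", "bee", "beee", "e"]

# be* : b followed by zero or more e.
star = [s for s in samples if re.fullmatch(r"be*", s)]

# be+ : b followed by one or more e, so a lone b no longer fits.
plus = [s for s in samples if re.fullmatch(r"be+", s)]

print(star)
print(plus)
```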
Selecting a number
Single digit : [0-9]
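A single digit repeated zero or more times forms a number (zero repetitions also allow an empty match).
(digit)repeat
[0-9]*
$[0-9]* : ?
\$[0-9]* : a literal $ followed by digits, because an unescaped $ means end of line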
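The effect of escaping the dollar sign can be observed directly. This sketch uses Python's re module (an assumption) and an invented sentence. Because [0-9]* can also match nothing at all, the plain-number search here uses + (one or more) instead.

```python
import re

# Sample sentence, made up for this example.
text = "Tickets cost $250 or $75 for 2 people"

# One or more digits: every number in the text.
numbers = re.findall(r"[0-9]+", text)

# Escaped $: a literal dollar sign followed by digits.
prices = re.findall(r"\$[0-9]*", text)

# Unescaped $: end of the text, then zero digits, so only an empty match.
unescaped = re.findall(r"$[0-9]*", text)

print(numbers)
print(prices)
print(unescaped)
```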
Selecting a word
A word is composed of letters.
A word is: [a-z]*
A word in all capital letters: ??
A word starting with a capital letter: [ ][ ]*
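One possible set of answers to the exercise above, sketched in Python's re module. Both the tool and the two filled-in patterns ([A-Z]* and [A-Z][a-z]*) are assumptions made for illustration, not the presenter's stated solution.

```python
import re

# Sample words, made up for this example.
samples = ["word", "WORD", "Word", "wORD"]

lower = [s for s in samples if re.fullmatch(r"[a-z]*", s)]
upper = [s for s in samples if re.fullmatch(r"[A-Z]*", s)]             # assumed answer
capitalized = [s for s in samples if re.fullmatch(r"[A-Z][a-z]*", s)]  # assumed answer

print(lower)
print(upper)
print(capitalized)
```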
Alternate match
| : used to specify an alternate match
Search: (above)|(below)
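The search (above)|(below) finds either word. A minimal sketch in Python's re module (an assumption), run on an invented sentence:

```python
import re

# Sample sentence, made up for this example.
text = "See the figure above and the table below, not beside."

# group(0) is the whole matched text, whichever alternative matched.
found = [m.group(0) for m in re.finditer(r"(above)|(below)", text)]
print(found)
```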
Search
Day words
[a-z]*day
[a-z]+day
[A-Z][a-z]+day
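The three day patterns differ in what they require before day. A sketch using Python's re module (an assumption), with invented sample words:

```python
import re

# Sample words, made up for this example.
samples = ["day", "today", "Monday", "friday", "daylight"]

# Zero or more lowercase letters before day: day itself also fits.
star = [s for s in samples if re.fullmatch(r"[a-z]*day", s)]

# One or more lowercase letters before day: day alone no longer fits.
plus = [s for s in samples if re.fullmatch(r"[a-z]+day", s)]

# A capital letter, then one or more lowercase letters, then day.
capital = [s for s in samples if re.fullmatch(r"[A-Z][a-z]+day", s)]

print(star)
print(plus)
print(capital)
```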
Escaping special meaning
How to match (, ) or *?
To match characters that are used as metacharacters, add \ before them as an escape character,
i.e. to match ( write \( and to match a period . write \.
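Escaping can be checked on a string that contains several metacharacters. This sketch uses Python's re module (an assumption; escaping rules can differ slightly between tools), and the sample string is invented.

```python
import re

# Sample string, made up for this example.
text = "f(x) = 3.5 * x"

open_parens = re.findall(r"\(", text)  # literal (
periods = re.findall(r"\.", text)      # literal .
stars = re.findall(r"\*", text)        # literal *

# Without the backslash, . matches every character in the string.
any_char = re.findall(r".", text)

print(open_parens, periods, stars)
print(len(any_char))
```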
Search patterns
has, have, had
not, n't
((have)|(had)|(has))
(( )|(n't)|( not ))*
((have)|(had)|(has))(( )|(n't)|( not ))*
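The combined pattern can be tested against a few phrases. This is a sketch in Python's re module (an assumption; the slides use EditPlus, whose matching rules may differ), with invented phrases.

```python
import re

pattern = re.compile(r"((have)|(had)|(has))(( )|(n't)|( not ))*")

# Sample phrases, made up for this example.
phrases = ["has", "hasn't", "have not ", "had not ", "having"]

# fullmatch requires the entire phrase to fit the pattern.
matched = [p for p in phrases if pattern.fullmatch(p)]
print(matched)

# When searching inside a sentence, Python tries alternatives left to right,
# so ( ) takes the space before ( not ) is considered.
first = pattern.search("I have not seen it").group(0)
print(repr(first))
```

In Python, placing ( not ) before ( ) in the alternation would let the search pick up the longer text; other tools may behave differently.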
Thank you!