Introduction to Regular Expressions

These lecture notes are based on Prof. Doreen De Leon's lectures.

Reading

Learning Perl, Chapter 7

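Definition

A regular expression, often called a pattern in Perl, is a template that either matches or does not match a given string. A pattern written between slashes is matched against $_ by default; the binding operator =~ matches it against another variable instead.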

$_ = "Perl is cool!";
if(/cool/) { # if $_ contains cool
    print "It matched.\n"; 
} else {
    print "It didn't match.\n";
}

# Output
# It matched.

# The binding operator =~ matches against $str instead of $_
$str = "Perl is cool!";
if($str =~ /cool/) { # if $str contains cool
    print "It matched.\n"; 
} else {
    print "It didn't match.\n";
}

if($str =~ /cold/) { # if $str contains cold
    print "It matched.\n"; 
} else {
    print "It didn't match.\n";
}

# Output
# It matched.
# It didn't match.

$_ = "Coke";
if(/Coke\tSprite/) { # Coke, a tab, then Sprite
    print "It matched.\n";
} else {
    print "It didn't match.\n";
}

# Output
# It didn't match.

# A program to print each line that contains
# MILLION DOLLAR BABY
while(<>) {
    if(/MILLION DOLLAR BABY/) {
        print $_;
    }
}

Oscar Winner List:

ACTOR IN A LEADING ROLE: Jamie Foxx, RAY
ACTOR IN A SUPPORTING ROLE: Morgan Freeman, MILLION DOLLAR BABY
ACTRESS IN A LEADING ROLE: Hilary Swank, MILLION DOLLAR BABY
ACTRESS IN A SUPPORTING ROLE: Cate Blanchett, THE AVIATOR
ANIMATED FEATURE FILM: THE INCREDIBLES
BEST PICTURE: MILLION DOLLAR BABY
CINEMATOGRAPHY: THE AVIATOR, Robert Richardson
COSTUME DESIGN: THE AVIATOR, Sandy Powell
DIRECTING: MILLION DOLLAR BABY, Clint Eastwood

Output

ACTOR IN A SUPPORTING ROLE: Morgan Freeman, MILLION DOLLAR BABY
ACTRESS IN A LEADING ROLE: Hilary Swank, MILLION DOLLAR BABY
BEST PICTURE: MILLION DOLLAR BABY
DIRECTING: MILLION DOLLAR BABY, Clint Eastwood
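
The same filter can be tried without an input file. The following is a minimal sketch, assuming a few of the winner lines are copied into an array; the names @winners and @hits are illustrative and do not come from the lecture.

```perl
use strict;
use warnings;

# Illustrative copy of four lines from the winner list above.
my @winners = (
    "ACTOR IN A LEADING ROLE: Jamie Foxx, RAY",
    "ACTRESS IN A LEADING ROLE: Hilary Swank, MILLION DOLLAR BABY",
    "BEST PICTURE: MILLION DOLLAR BABY",
    "CINEMATOGRAPHY: THE AVIATOR, Robert Richardson",
);

# Collect every line that contains the title.
my @hits;
foreach my $line (@winners) {
    push @hits, $line if $line =~ /MILLION DOLLAR BABY/;
}

print "$_\n" for @hits;
```

With these four lines, only the Hilary Swank and BEST PICTURE lines should be printed.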

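Metacharacters

A dot (.) matches any single character except a newline. A backslash in front of a metacharacter, as in \., makes it match itself.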

$_ = "Perl is the best langauge!";
if(/ga.g/) { # "gaug" matches ga.g (note the spelling "langauge")
    print "It matched.\n";
} else {
    print "It didn't match.\n";
}

if(/l.n.a/) { # "langa" matches l.n.a
    print "It matched.\n";
} else {
    print "It didn't match.\n";
}

if(/a.b/) { # no match: nowhere is "a" followed by one character and then "b"
    print "It matched.\n";
} else {
    print "It didn't match.\n";
}

$_ = 3.1415926;

if(/3\.1415/) { # \. is a literal dot, so this matches 3.1415
    print "It matched.\n";
} else {
    print "It didn't match.\n";
}
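
To see why the backslash matters, the sketch below runs the pattern with and without it. The second test string, "3x1415926", is an assumption made up for this comparison, as are the array names.

```perl
use strict;
use warnings;

# Illustrative test strings; only the first appears in the notes.
my @tests = ("3.1415926", "3x1415926");

my (@any_char, @literal_dot);
foreach my $s (@tests) {
    push @any_char,    $s if $s =~ /3.1415/;    # . matches any one character
    push @literal_dot, $s if $s =~ /3\.1415/;   # \. matches only a real dot
}

print "3.1415  matched: @any_char\n";
print "3\\.1415 matched: @literal_dot\n";
```

The unescaped pattern accepts both strings, while the escaped one accepts only the string that really contains a dot.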

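Quantifiers

A quantifier says how many times the preceding item may repeat: * means zero or more times and + means one or more times. Parentheses group several characters so that a quantifier applies to the whole group.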

file

XaaaX
XabcX
abcXaaXdef
aaaaaaXaaaaa
xyzXaaaaaXzyx
gggXXggg

program

while(<>) {
    if(/Xa*X/) { # X, zero or more a's, then X
        print $_;
    }
}

output

XaaaX
abcXaaXdef
xyzXaaaaaXzyx
gggXXggg

program

while(<>) {
    if(/Xa+X/) { # X, one or more a's, then X
        print $_;
    }
}

output

XaaaX
abcXaaXdef
xyzXaaaaaXzyx
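
The two programs above can be compared side by side. This is a sketch that assumes the six lines of the sample file are held in an array, so no input file is needed; @star and @plus are illustrative names.

```perl
use strict;
use warnings;

# The six lines of the sample file above.
my @lines = qw(XaaaX XabcX abcXaaXdef aaaaaaXaaaaa xyzXaaaaaXzyx gggXXggg);

my (@star, @plus);
foreach my $line (@lines) {
    push @star, $line if $line =~ /Xa*X/;   # zero or more a's between the X's
    push @plus, $line if $line =~ /Xa+X/;   # one or more a's between the X's
}

print "Xa*X matched: @star\n";
print "Xa+X matched: @plus\n";
```

The only difference between the two lists should be gggXXggg, where nothing sits between the two X's.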

file

abcXababXabc
ababX
XabcdefYZW
ccbaXbabaX
XabX
XX

program

while(<>) {
    if(/X(ab)*X/) { # X, zero or more copies of "ab", then X
        print $_;
    }
}

output

abcXababXabc
XabX
XX
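
The parentheses decide what the * repeats. The sketch below compares the grouped pattern with an ungrouped variant, /Xab*X/, which is not in the lecture and is introduced here only for contrast; the test strings are also made up.

```perl
use strict;
use warnings;

# Illustrative test strings, not from the sample file.
my @tests = qw(XabX XababX XabbbX XaX);

my (@grouped, @ungrouped);
foreach my $s (@tests) {
    push @grouped,   $s if $s =~ /X(ab)*X/;   # * repeats the group "ab"
    push @ungrouped, $s if $s =~ /Xab*X/;     # * repeats only the "b"
}

print "X(ab)*X matched: @grouped\n";
print "Xab*X   matched: @ungrouped\n";
```

The grouped pattern accepts XabX and XababX, while the ungrouped one accepts XabX, XabbbX and XaX.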

file

Charles Li
Charles abcdefghLi
Charles something
something Li
Charles Charles
Charles something Li

program

while(<>) {
    if(/Charles.*Li/) { # Charles, then any characters, then Li
        print $_;
    }
}

output

Charles Li
Charles abcdefghLi
Charles something Li
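
A few edge cases of .* are worth trying. The strings in this sketch are assumptions chosen for illustration and are not part of the sample file.

```perl
use strict;
use warnings;

# Illustrative strings, not part of the lecture's sample file.
my @tests = ("CharlesLi", "Li and Charles", "Charles Lin");

my @matched;
foreach my $s (@tests) {
    if ($s =~ /Charles.*Li/) {
        push @matched, $s;
        print "$s: matched\n";
    } else {
        print "$s: no match\n";
    }
}
```

"CharlesLi" matches because .* may match nothing at all, "Li and Charles" fails because Li must come after Charles, and "Charles Lin" matches because the pattern only asks for the letters Li, not the whole word.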

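Alternation

The vertical bar (|) means that either the pattern on its left or the pattern on its right may match. Parentheses limit how far the alternation extends.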

file

Charles Li
Amy
Ben
Amy and Charles
Dianne
Eric and Charles
.... Amy .... 
Ben and Eric

program

while(<>) {
    if(/Amy|Charles/) { # Amy or Charles
        print $_;
    }
}

output

Charles Li
Amy
Amy and Charles
Eric and Charles
.... Amy ....

file

abcccc
deffff
ababab
defabccc
xyzxyzz
abxyz
baddefg

program

while(<>) {
    if(/abc+|(ab)+/) { # "ab" plus one or more c's, or one or more copies of "ab"
        print $_;
    }
}

output

abcccc
ababab
defabccc
abxyz

file

yyyXaaa
zzzXbbb
zXcccXzz
##XaaaX
XbbX
oooXabababXppp
XaabbX
XabcdX
aaaaaX
aaXXX
XbbbbbbbbbbbbbbbX

program

while(<>) {
    # Xaaa...X or Xbbb...X
    if(/X(a+|b+)X/) {
        print $_;
    }
}

output

##XaaaX
XbbX
XbbbbbbbbbbbbbbbX

file

yyyXaaa
zzzXbbb
zXcccXzz
##XaaaX
XbbX
oooXabababXppp
XaabbX
XabcdX
aaaaaX
aaXXX
XbbbbbbbbbbbbbbbX

program

while(<>) {
    # X, then any mix of a's and b's, then X
    if(/X(a|b)+X/) {
        print $_;
    }
}

output

##XaaaX
XbbX
oooXabababXppp
XaabbX
XbbbbbbbbbbbbbbbX
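
The difference between the last two patterns is where the + sits. The sketch below runs both on the same strings; the strings are shortened, illustrative versions of lines from the sample file, and the variable names are assumptions.

```perl
use strict;
use warnings;

# Shortened, illustrative versions of lines from the sample file.
my @tests = qw(XaaaX XbbX XabababX XaabbX XabcdX XX);

my @only_mixed;
foreach my $s (@tests) {
    my $one_letter = ($s =~ /X(a+|b+)X/) ? 1 : 0;   # all a's or all b's
    my $any_mix    = ($s =~ /X(a|b)+X/)  ? 1 : 0;   # any mix of a's and b's
    print "$s: X(a+|b+)X=$one_letter X(a|b)+X=$any_mix\n";
    push @only_mixed, $s if $any_mix && !$one_letter;
}

print "Only X(a|b)+X matched: @only_mixed\n";
```

XabababX and XaabbX are matched only by X(a|b)+X, because each repetition of the group may pick a different letter.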

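A pattern test program

After a successful match, $& holds the matched text, $` holds the text before it, and $' holds the text after it. The program below prints all three so you can see exactly what a pattern matched.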

file

oooXaabaXooo
ooXXooo
XabcXoo
XabaabX
oooXabXabXooo
ooooXaaaaa

program

while(<>) {
    chomp;
    if(/X(a|b)+X/) { # replace with the pattern you want to test
        print "Matched: |$`<$&>$'|\n";
    } else {
        print "No match.\n";
    }
}

output

Matched: |ooo<XaabaX>ooo|
No match.
No match.
Matched: |<XabaabX>|
Matched: |ooo<XabX>abXooo|
No match.
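
To try several patterns without editing the loop, the test can be wrapped in a subroutine. This is only a sketch: show_match is a hypothetical helper name, and passing the pattern as a qr// object is a choice made here, not something the lecture prescribes.

```perl
use strict;
use warnings;

# show_match is a hypothetical helper, not from the lecture.
# It returns the same report line that the test program prints.
sub show_match {
    my ($pattern, $line) = @_;
    if ($line =~ $pattern) {
        return "Matched: |$`<$&>$'|";
    }
    return "No match.";
}

my $pattern = qr/X(ab)*X/;   # swap in any pattern to try
foreach my $line (qw(oooXabXabXooo ooXXooo XabcXoo)) {
    print show_match($pattern, $line), "\n";
}
```

With X(ab)*X the first two lines should report |ooo<XabX>abXooo| and |oo<XX>ooo|, and the third should report no match.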