Unit 4

The document discusses string matching algorithms, particularly focusing on the naive string-matching algorithm and the Rabin-Karp algorithm, which utilizes hash values for efficient pattern searching. It also covers the classification of problems into P and NP categories, explaining the characteristics of NP complete and NP hard problems. Applications of these algorithms in fields like bioinformatics and the complexity of solving such problems are highlighted.

Uploaded by

saiscount01

UNIT 4

~Ravi Sheth
String matching
Introduction
• Finding all occurrences of a pattern in a text is a
problem that arises frequently in text-editing
programs.
• Typically, the text is a document being edited,
and the pattern searched for is a particular word
supplied by the user.
• Efficient algorithms for this problem can greatly
aid the responsiveness of the text-editing
program. String-matching algorithms are also
used, for example, to search for particular
patterns in DNA sequences.
• We formalize the string-matching problem as
follows. We assume that the text is an array
T[1 . . n] of length n and that the pattern is an
array P[1 . . m] of length m.
• We further assume that the elements of P
and T are characters drawn from a finite
alphabet Σ. For example, we may have Σ = {0, 1}
or Σ = {a, b, . . . , z}. The character
arrays P and T are often called strings of
characters.
Naive string-matching algorithm
• NAIVE-STRING-MATCHER(T, P)
• 1 n ← length[T]
• 2 m ← length[P]
• 3 for s ← 0 to n - m
• 4     do if P[1 . . m] = T[s + 1 . . s + m]
• 5         then print "Pattern occurs with shift" s
• The naive string-matching procedure can be interpreted
graphically as sliding a "template" containing the pattern
over the text, noting for which shifts all of the characters on
the template equal the corresponding characters in the
text, as illustrated in Figure.

• The for loop beginning on line 3 considers each possible
shift explicitly. The test on line 4 determines whether the
current shift is valid or not; this test involves an implicit
loop to check corresponding character positions until all
positions match successfully or a mismatch is found.
• Line 5 prints out each valid shift s.
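The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `naive_string_matcher` is an assumption, and because Python strings are 0-indexed, the slice `text[s:s + m]` plays the role of T[s + 1 . . s + m], so the reported shifts keep the same values as in the pseudocode.

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Return every shift s at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):        # line 3: try each shift 0 .. n - m
        if text[s:s + m] == pattern:  # line 4: implicit character-by-character test
            shifts.append(s)          # line 5: record the valid shift
    return shifts


if __name__ == "__main__":
    for s in naive_string_matcher("acaabc", "aab"):
        print("Pattern occurs with shift", s)
```

The sketch returns the shifts instead of printing them so that callers can reuse the result.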

Exercises
• Show the comparisons the naive string
matcher makes for the pattern P = 0001 in the
text T = 000010001010001.
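One way to check an answer to this exercise is to make the implicit loop of line 4 explicit and count the character comparisons at each shift. The sketch below is a hypothetical helper (the name `trace_comparisons` is an assumption); it stops comparing at the first mismatch, as the description of the naive matcher suggests.

```python
def trace_comparisons(text: str, pattern: str) -> list[tuple[int, int, bool]]:
    """Return (shift, comparisons made, matched?) for every shift 0 .. n - m."""
    n, m = len(text), len(pattern)
    trace = []
    for s in range(n - m + 1):
        k = 0
        # Compare left to right until the pattern is exhausted or a mismatch occurs.
        while k < m and text[s + k] == pattern[k]:
            k += 1
        matched = (k == m)
        # A mismatch costs one extra comparison beyond the k successful ones.
        comparisons = k if matched else k + 1
        trace.append((s, comparisons, matched))
    return trace


if __name__ == "__main__":
    for shift, comparisons, matched in trace_comparisons("000010001010001", "0001"):
        print(shift, comparisons, "match" if matched else "mismatch")
```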
Definition of Rabin-Karp
• A string search algorithm which compares a
string's hash values, rather than the strings
themselves. For efficiency, the hash value of
the next position in the text is easily
computed from the hash value of the current
position.
How Rabin-Karp works
• Let characters in both arrays T and P be digits in
radix-|Σ| notation (here Σ = {0, 1, ..., 9}, so the radix is 10).
• Let p be the value of the characters in P, read as an m-digit number.
• Choose a prime number q such that the radix times q fits within a
computer word, to speed computations.
• Compute (p mod q)
– The value of p mod q is what we will be using to find all
matches of the pattern P in T.
How Rabin-Karp works (continued)
• Compute (T[s+1 .. s+m] mod q) for s = 0 .. n-m
• Test against P only those sequences in T
having the same (mod q) value
• (T[s+1 .. s+m] mod q) can be incrementally
computed by subtracting the high-order digit,
shifting, adding the low-order digit, all in
modulo q arithmetic.
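The incremental update described above can be sketched as follows. This is an illustration under stated assumptions: the text is a string of decimal digits, the radix `d` defaults to 10, and the function name `window_hashes` is hypothetical. The value `h` is the weight of the high-order digit, d^(m-1) mod q.

```python
def window_hashes(text: str, m: int, q: int, d: int = 10) -> list[int]:
    """Return the hash (mod q) of every length-m window of a digit string."""
    n = len(text)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)          # weight of the high-order digit, mod q
    t = 0
    for ch in text[:m]:           # first window, evaluated by Horner's rule
        t = (d * t + int(ch)) % q
    hashes = [t]
    for s in range(n - m):
        high = int(text[s])       # digit leaving the window
        low = int(text[s + m])    # digit entering the window
        # Subtract the high-order digit, shift by one position, add the low-order digit.
        t = (d * (t - high * h) + low) % q
        hashes.append(t)
    return hashes


if __name__ == "__main__":
    print(window_hashes("31415926535", 2, 11))
```

Each update costs a constant number of arithmetic operations, regardless of the window length m. Python's `%` returns a non-negative result for a positive modulus, so the subtraction needs no extra correction here; in languages where the remainder can be negative, q would have to be added back.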
A Rabin-Karp example
• Given T = 31415926535 and P = 26
• We choose q = 11
• P mod q = 26 mod 11 = 4

Sliding a two-digit window over T = 3 1 4 1 5 9 2 6 5 3 5:
• s = 0: 31 mod 11 = 9, not equal to 4
• s = 1: 14 mod 11 = 3, not equal to 4
• s = 2: 41 mod 11 = 8, not equal to 4
• s = 3: 15 mod 11 = 4, equal to 4 -> spurious hit
• s = 4: 59 mod 11 = 4, equal to 4 -> spurious hit
• s = 5: 92 mod 11 = 4, equal to 4 -> spurious hit
• s = 6: 26 mod 11 = 4, equal to 4 -> an exact match!!
• s = 7: 65 mod 11 = 10, not equal to 4
• s = 8: 53 mod 11 = 9, not equal to 4
• s = 9: 35 mod 11 = 2, not equal to 4

As we can see, when the hash values are equal, further testing is
done to ensure that a match has indeed been found.
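Putting the pieces together, the whole procedure can be sketched as below. This is a minimal sketch assuming decimal-digit strings and radix 10; the function name `rabin_karp` and the choice to return spurious hits separately are assumptions made for illustration.

```python
def rabin_karp(text: str, pattern: str, q: int, d: int = 10) -> tuple[list[int], list[int]]:
    """Return (valid shifts, spurious-hit shifts) for a digit pattern in a digit text."""
    n, m = len(text), len(pattern)
    valid, spurious = [], []
    if m == 0 or m > n:
        return valid, spurious
    h = pow(d, m - 1, q)                     # weight of the high-order digit
    p = t = 0
    for i in range(m):                       # hash the pattern and the first window
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    for s in range(n - m + 1):
        if p == t:                           # hash hit: verify character by character
            if text[s:s + m] == pattern:
                valid.append(s)
            else:
                spurious.append(s)
        if s < n - m:                        # roll the hash to the next window
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return valid, spurious


if __name__ == "__main__":
    print(rabin_karp("31415926535", "26", 11))
```

With the values from the example (T = 31415926535, P = 26, q = 11), the windows 15, 59 and 92 land in the spurious list and the window 26 in the valid list.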
Complexity
• The running time of the Rabin-Karp algorithm in the
worst-case scenario is O((n-m+1)m), but it has a good
average-case running time.
• If the expected number of valid shifts is small, O(1),
and the prime q is chosen to be quite large, then the
Rabin-Karp algorithm can be expected to run in time
O(n+m) plus the time required to process spurious
hits.
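To make the two cases concrete, the sketch below counts how many hash hits occur and how many character comparisons are spent verifying them. It is an illustration only: the name `verification_cost` is hypothetical and decimal-digit strings with radix 10 are assumed.

```python
def verification_cost(text: str, pattern: str, q: int, d: int = 10) -> tuple[int, int]:
    """Return (hash hits, character comparisons spent verifying those hits)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0, 0
    h = pow(d, m - 1, q)
    p = t = 0
    for i in range(m):
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    hits = comparisons = 0
    for s in range(n - m + 1):
        if p == t:
            hits += 1
            k = 0
            while k < m and text[s + k] == pattern[k]:
                k += 1
            comparisons += k if k == m else k + 1
        if s < n - m:
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return hits, comparisons


if __name__ == "__main__":
    # Every shift is valid, so each of the n - m + 1 windows is fully verified.
    print(verification_cost("1" * 20, "1" * 5, 11))
    # Few hits, so verification adds little to the linear hashing pass.
    print(verification_cost("31415926535", "26", 11))
```

When every shift is valid (a text and pattern made of one repeated digit), all n - m + 1 windows are verified in full, which is the worst case; when hits are rare, the verification work stays small next to the linear pass over the text.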
Applications
• Bioinformatics
– Used in looking for similarities of two or more proteins;
i.e. high sequence similarity usually implies significant
structural or functional similarity.

Example:
Hb A_human
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL
G+ +VK+HGKKV A++++++AH+ D++ ++ +++LS+LH KL
Hb B_human
GNPKVKAHGKKVLGAFSDGLAH LDNLKGTF ATLSELH CDKL
+ similar amino acids
The Class P and NP problem
• P problem
• NP problem
P - Problem
• P problems are the set of problems that can be
solved in polynomial time by a deterministic algorithm.
• P problems are tractable to solve and easy to
verify.
• Most of the problems we have discussed so
far are P problems.
• P excludes all problems that cannot be
solved in polynomial time.
• Examples of P problems are searching for an
element in an array, inserting an element, sorting
data, and finding the height of a tree.
NP- Problem
• NP problems are problems that can be
solved in polynomial time by a nondeterministic algorithm.
• NP stands for nondeterministic polynomial
time.
• Nondeterministic stage: guessing
• Deterministic stage: verification
• For the hard problems in NP, no polynomial-time
algorithm for obtaining a solution is known, but a given
solution can be verified in polynomial time.
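The deterministic verification stage can be illustrated with a small sketch. The decision version of the knapsack problem is used here as an assumed example: "is there a set of items with total weight at most `capacity` and total value at least `target`?" The function name and parameters are hypothetical. The certificate is the guessed set of item indices, and checking it takes time linear in the number of items.

```python
def verify_knapsack_certificate(weights, values, capacity, target, chosen):
    """Check a guessed solution (list of item indices) in polynomial time."""
    if len(set(chosen)) != len(chosen):                  # each item at most once
        return False
    if any(i < 0 or i >= len(weights) for i in chosen):  # indices must be valid
        return False
    total_weight = sum(weights[i] for i in chosen)
    total_value = sum(values[i] for i in chosen)
    return total_weight <= capacity and total_value >= target


if __name__ == "__main__":
    weights, values = [2, 3, 4, 5], [3, 4, 5, 6]
    print(verify_knapsack_certificate(weights, values, 5, 7, [0, 1]))
    print(verify_knapsack_certificate(weights, values, 5, 7, [0, 2]))
```

Verifying one guess is cheap; the difficulty lies in the number of possible guesses, which is 2^n for n items.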
NP- Problem
• NP includes all problems of P, i.e. P is a subset of
NP
• Examples (decision versions)
– Knapsack problem
– Travelling salesman problem
• The hardest problems in NP are called NP complete.
NP hard problems are at least as hard as these and
need not belong to NP.
[Figure: problems divide into P class and NP class; the NP class branch leads to NP complete and NP hard]
NP Complete problem
• A decision problem p is called NP complete if it
has the following properties:

– It belongs to class NP
– Every other problem in NP can be transformed (reduced) to p
in polynomial time
NP complete
• A solution of any NP complete problem can be
verified in polynomial time, but no polynomial-time
algorithm for obtaining one is known.
• NP complete problems are often tackled using
randomized algorithms, heuristic approaches
and approximation algorithms.
• Examples
– Knapsack problem
– Travelling salesman problem
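As a small illustration of the heuristic approach, the sketch below applies a greedy rule to the 0/1 knapsack problem: take items in decreasing order of value per unit weight while they still fit. The function name is hypothetical, positive weights are assumed, and the rule is a heuristic only; it runs in polynomial time but does not guarantee an optimal answer.

```python
def greedy_knapsack(weights, values, capacity):
    """Greedy heuristic for 0/1 knapsack; returns (chosen indices, total value)."""
    # Consider items from best to worst value-to-weight ratio (weights assumed > 0).
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i],
                   reverse=True)
    chosen, remaining, total = [], capacity, 0
    for i in order:
        if weights[i] <= remaining:   # take the item only if it still fits
            chosen.append(i)
            remaining -= weights[i]
            total += values[i]
    return chosen, total


if __name__ == "__main__":
    print(greedy_knapsack([10, 20, 30], [60, 100, 120], 50))
```

On this input the heuristic picks the first two items for a value of 160, while the last two items together fit exactly and are worth 220, which shows the trade-off between speed and optimality.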
NP Hard
• NP hard problems are at least as hard as the
hardest problems in NP.
• An NP hard problem might not be a decision
problem.
• NP hard problems may not be in NP.
Example
• Halting problem:
– “Given an algorithm and an input, will it run
forever?”
– The answer to this question is YES or NO.
– No algorithm can decide the answer for every
given input, in polynomial time or otherwise: the
problem is undecidable.
– Every problem in NP can be reduced to it, so it is
considered NP hard, but it is not in NP.
Conclusion
