
Rajkumar V Pattern Searching

The Rabin-Karp Algorithm is a string-searching method that utilizes hashing to efficiently find patterns in text, performing brute-force comparisons only when hash values match. It is particularly effective for multiple pattern searches and large texts, with an average time complexity of O(N + M) but a worst-case of O(N × M) due to potential hash collisions. Applications include spam detection, plagiarism detection, and bioinformatics.

The Rabin-Karp Algorithm

RAJKUMAR V
The Rabin-Karp Algorithm
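The Rabin-Karp Algorithm is a string-searching algorithm that uses hashing to find a pattern of M characters within a text. It computes a hash value for the pattern and compares it with the hash value of each M-character sequence of the text.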

If the hash values are unequal, the algorithm will calculate the hash value for the next M-character sequence.

If the hash values are equal, the algorithm will do a brute-force comparison between the pattern and the M-character sequence at this location (in case of a hash value collision causing a false match).

It is particularly useful for finding multiple patterns in the same text or for searching in streams of data.

In this way, each text subsequence costs only one hash comparison, and a brute-force comparison is needed only when the hash values match.
KEY IDEAS
Main Concept:

• Convert pattern and text substrings into numbers (hashes).

• Compare numbers (hashes) instead of whole strings.

• If hash matches, confirm by checking the actual text.
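
The first idea, turning a string into a number, can be sketched as a polynomial hash. This is a minimal illustration, not necessarily the hash used in the slides: the function name `poly_hash`, the base 256, and the modulus 1_000_000_007 are assumptions chosen for the example.

```python
def poly_hash(s: str, base: int = 256, mod: int = 1_000_000_007) -> int:
    """Read s as a number written in `base`, reduced modulo `mod`.

    base and mod are illustrative assumptions, not values from the slides.
    """
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h


pattern = "abc"
text = "xxabcxx"
m = len(pattern)
target = poly_hash(pattern)

for i in range(len(text) - m + 1):
    window = text[i:i + m]
    # Compare numbers first; compare characters only when the numbers agree.
    if poly_hash(window) == target and window == pattern:
        print("match at index", i)
```

This version rehashes every window from scratch, so it shows the compare-then-confirm idea but not yet the speed-up that a rolling hash provides.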


HOW IT WORKS
Compute hash of the pattern.

Compute hash of first window of text.

Slide window ➔ Calculate next hash using rolling hash.

If hashes match ➔ Confirm by comparing characters.

Repeat till end of text.
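
The steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions: the name `rabin_karp`, the base 256, and the modulus 1_000_000_007 are illustrative choices, and returning an empty list for an empty pattern is a design decision made for this sketch.

```python
def rabin_karp(text: str, pattern: str,
               base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return every index where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the window's leading character: base^(m-1) mod `mod`.
    high = pow(base, m - 1, mod)

    # Hash of the pattern and hash of the first window of the text.
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    matches = []
    for i in range(n - m + 1):
        # Hashes match: confirm by comparing the actual characters.
        if w_hash == p_hash and text[i:i + m] == pattern:
            matches.append(i)
        # Slide the window: drop text[i], append text[i + m].
        if i < n - m:
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return matches


print(rabin_karp("abracadabra", "abra"))  # [0, 7]
```

The rolling update costs a constant number of arithmetic operations per shift, which is what keeps the scan close to linear when hash matches are rare.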


EXAMPLES
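
As an illustrative worked example (the text, pattern, base, and modulus here are made up for this sketch and are not taken from the original slides), the trace below searches for "26" in a string of digits using base 10 and the deliberately small modulus 11, so that collisions are easy to see.

```python
text, pattern = "3141592653", "26"
base, mod = 10, 11                  # tiny modulus chosen to force collisions
m = len(pattern)
high = pow(base, m - 1, mod)        # weight of the leading digit

p_hash = int(pattern) % mod         # 26 % 11 = 4
w_hash = int(text[:m]) % mod        # 31 % 11 = 9

trace = []
for i in range(len(text) - m + 1):
    window = text[i:i + m]
    if w_hash != p_hash:
        verdict = "skip"            # hashes differ: no character comparison
    elif window == pattern:
        verdict = "match"           # hashes equal and characters equal
    else:
        verdict = "spurious hit"    # hashes equal but characters differ
    trace.append((i, window, w_hash, verdict))
    if i + m < len(text):
        # Rolling hash: remove the leading digit, append the next digit.
        w_hash = ((w_hash - int(text[i]) * high) * base
                  + int(text[i + m])) % mod

for row in trace:
    print(row)
```

The windows 15, 59, 92, and 26 all hash to 4, the same value as the pattern. Only the window at index 6 is a real match; the other three are spurious hits that the character comparison rejects.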
TIME COMPLEXITY
Average Case: O(N + M)

Worst Case: O(N × M)

Here N is the length of the text and M is the length of the pattern.

• Best/Average: Fast when no collisions.

• Worst: Slow if many hash collisions ➔ Needs character-by-character check.


Advantages

• Fast for multiple pattern searches.

• Efficient when working with large texts.

• Uses rolling hash ➔ Fast hash updates.
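
The multiple-pattern advantage can be sketched as follows: hash every pattern once, then look up each window hash in a table instead of scanning the text once per pattern. This is a simplified sketch that assumes all patterns have the same length; the name `rabin_karp_multi`, the base, and the modulus are illustrative assumptions.

```python
def rabin_karp_multi(text: str, patterns: list[str],
                     base: int = 256,
                     mod: int = 1_000_000_007) -> dict[str, list[int]]:
    """Find all occurrences of several equal-length patterns in one pass."""
    if not patterns:
        return {}
    m = len(patterns[0])
    if m == 0 or any(len(p) != m for p in patterns):
        raise ValueError("this sketch assumes non-empty, equal-length patterns")

    def full_hash(s: str) -> int:
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % mod
        return h

    # Map each pattern hash to the patterns that produce it.
    by_hash: dict[int, list[str]] = {}
    for p in set(patterns):
        by_hash.setdefault(full_hash(p), []).append(p)

    found: dict[str, list[int]] = {p: [] for p in patterns}
    n = len(text)
    if m > n:
        return found

    high = pow(base, m - 1, mod)
    w_hash = full_hash(text[:m])
    for i in range(n - m + 1):
        # One table lookup per window, regardless of how many patterns exist.
        for p in by_hash.get(w_hash, ()):
            if text[i:i + m] == p:
                found[p].append(i)
        if i < n - m:
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return found


print(rabin_karp_multi("the cat sat on the mat", ["cat", "sat", "mat", "the"]))
```

Patterns of different lengths would need one rolling hash per distinct length; that extension is left out to keep the sketch short.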

Disadvantages

• Hash collisions possible ➔ Requires full comparison.

• Not always faster for a single pattern ➔ Other algorithms may work better.

• Depends on quality of hash function.

• Spurious hits: hashes match but the strings differ ➔ Wasted comparison.
Applications

• Spam Detection (e.g., emails).

• Plagiarism Detection.

• Digital Forensics.

• Anti-virus Programs.

• Bioinformatics (DNA, proteins).

Conclusion

• Rabin-Karp ➔ Powerful for searching many patterns in large texts.

• Strength: Hashing makes searching fast.

• Limitation: Must handle hash collisions.


⏱ Time Complexity:
• Best case: O(N + M) (hashes rarely match)

• Worst case: O(N × M) (when hashes match but strings don't → character-by-character comparison needed)

• Average case: better than brute force for many practical cases
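
The gap between the best and worst case can be made concrete by counting character comparisons. The sketch below is illustrative only: `verification_cost` is a hypothetical helper, and passing `mod=1` is an artificial way to make every window collide with the pattern, standing in for a very poor hash function.

```python
def verification_cost(text: str, pattern: str,
                      base: int = 256, mod: int = 1_000_000_007) -> int:
    """Count character comparisons made while confirming hash matches."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    high = pow(base, m - 1, mod)
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    comparisons = 0
    for i in range(n - m + 1):
        if w_hash == p_hash:
            # Character-by-character check, stopping at the first mismatch.
            for j in range(m):
                comparisons += 1
                if text[i + j] != pattern[j]:
                    break
        if i < n - m:
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return comparisons


text = "a" * 20
pattern = "aaaab"
print(verification_cost(text, pattern))         # 0: no window hash matches
print(verification_cost(text, pattern, mod=1))  # 80: every window collides
```

With the degenerate hash, each of the 16 windows is compared over all 5 characters before the mismatch is found, giving 16 × 5 = 80 comparisons, which is the (N - M + 1) × M behaviour behind the O(N × M) bound.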
