Steps

The Rabin-Karp String Matching Algorithm efficiently finds occurrences of a pattern string in a larger text using hashing to compare substrings. It involves preprocessing the hash values of the pattern and the text, sliding the pattern over the text while calculating new hashes in constant time, and confirming matches through character comparison. The algorithm has a best/average time complexity of O(n + m) but can degrade to O(n * m) in the worst case due to hash collisions.

Uploaded by

dhansanushree
Copyright
© All Rights Reserved

Steps:

Preprocess:

Calculate the hash value of the pattern P.

Calculate the hash value of the first substring of T of length m.

Sliding Window:

Slide the pattern over the text one character at a time.

For each window, calculate the hash of the new substring in constant time using a
rolling hash.

Match Check:

If hash values match, then compare the strings character by character to confirm
the match.

If they don’t match, skip to the next window.

---

## 🧠 **Rabin-Karp String Matching Algorithm – Detailed Explanation**

---

### ✅ **Purpose:**
The **Rabin-Karp algorithm** **finds occurrences of a pattern string `P` in a
larger text string `T`**. It is especially useful when several patterns must be
searched for at once.

---

### 🔍 **Key Idea:**


It uses **hashing** to **compare substrings of the text** with the pattern, instead
of comparing every window character by character.

- If the hash of the substring matches the hash of the pattern, then do an exact
match check.
- If the hash doesn't match, skip checking the characters.

This greatly reduces the number of comparisons in practice.

---

### 🔧 **How It Works:**

Let:
- `n` = length of text `T`
- `m` = length of pattern `P`
- `q` = a large prime number used as the modulus (a larger `q` makes hash collisions less likely)
- `d` = number of characters in the alphabet (e.g., 256 for ASCII)

---

### 🔁 **Steps:**
1. **Preprocess**:
   - Calculate the **hash value of the pattern `P`**.
   - Calculate the **hash value of the first substring of `T` of length `m`**.

2. **Sliding Window**:
   - Slide the pattern over the text one character at a time.
   - For each window, calculate the **hash of the new substring** in constant time
     using a **rolling hash**.

3. **Match Check**:
   - If hash values **match**, then compare the strings character by character to
     **confirm the match**.
   - If they don’t match, skip to the next window.
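The three steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `rabin_karp` and the defaults `d=256` and `q=1_000_000_007` are assumptions chosen for the example, and the hash gives the window's leading character the highest power of `d`.

```python
def rabin_karp(text, pattern, d=256, q=1_000_000_007):
    """Return the start index of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the window's leading character: d^(m-1) mod q.
    high = pow(d, m - 1, q)

    # Step 1 (preprocess): hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * d + ord(pattern[i])) % q
        t_hash = (t_hash * d + ord(text[i])) % q

    matches = []
    for s in range(n - m + 1):
        # Step 3 (match check): compare characters only when the hashes agree.
        if t_hash == p_hash and text[s:s + m] == pattern:
            matches.append(s)
        # Step 2 (sliding window): roll the hash forward by one character.
        if s < n - m:
            t_hash = ((t_hash - ord(text[s]) * high) * d + ord(text[s + m])) % q
    return matches
```

For instance, `rabin_karp("ABCCDDAEFG", "CDD")` returns `[3]`, the 0-based start of the match.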

---

### 🧮 **Hash Function:**

The hash function treats the substring as a number in base `d`.

For a string `S[0..m-1]`, where `S[i]` is the numeric code of the character at
position `i`, the hash value is:

```
hash(S) = (d⁰*S[0] + d¹*S[1] + ... + dᵐ⁻¹*S[m-1]) % q
```

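The rolling hash allows updating the hash in **O(1)** time as the window slides:
the contribution of the character leaving the window is removed and the
contribution of the character entering it is added, so the remaining `m - 1`
characters are never rehashed.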
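A small Python sketch of the O(1) update may help here. Note that it uses the common variant in which the leading character carries the highest power of `d` (the reverse of the ordering in the formula above), because that variant rolls with one subtraction and one multiplication and needs no modular inverse. The names `poly_hash` and `roll`, and the constants `D` and `Q`, are assumptions made for illustration.

```python
D = 256        # alphabet size (assumed: 8-bit character codes)
Q = 1_000_003  # a prime modulus, chosen here only for illustration


def poly_hash(s):
    """Hash s as a base-D number, leading character most significant."""
    h = 0
    for ch in s:
        h = (h * D + ord(ch)) % Q
    return h


def roll(h, outgoing, incoming, high):
    """Slide the window one position: drop `outgoing`, append `incoming`."""
    return ((h - ord(outgoing) * high) * D + ord(incoming)) % Q


text, m = "ABCCDDAEFG", 3
high = pow(D, m - 1, Q)  # weight of the leading character, D^(m-1) mod Q
h = poly_hash(text[:m])
for s in range(1, len(text) - m + 1):
    h = roll(h, text[s - 1], text[s + m - 1], high)
    # The rolled value must equal a hash computed from scratch.
    assert h == poly_hash(text[s:s + m])
```

Each call to `roll` does a fixed amount of arithmetic regardless of `m`, which is where the constant-time claim comes from.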

---

### **Time Complexity:**

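- **Best/Average Case:** `O(n + m)` (few windows share the pattern's hash, so few character checks run)
- **Worst Case:** `O(n * m)` (when many windows hash to the same value as the pattern, for example through hash collisions, each one triggers an `O(m)` character check)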
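To make the gap between the two cases concrete, the sketch below counts how many windows trigger the character-by-character check. The helper `count_verifications` is hypothetical, and `q=1` is a deliberately degenerate modulus (not a prime) used only to force every window to collide. It assumes `1 <= len(pattern) <= len(text)`.

```python
def count_verifications(text, pattern, q, d=256):
    """Count how many windows trigger the character-by-character check."""
    n, m = len(text), len(pattern)
    high = pow(d, m - 1, q)
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * d + ord(pattern[i])) % q
        t_hash = (t_hash * d + ord(text[i])) % q

    checks = 0
    for s in range(n - m + 1):
        if t_hash == p_hash:
            checks += 1  # each check costs up to m character comparisons
        if s < n - m:
            t_hash = ((t_hash - ord(text[s]) * high) * d + ord(text[s + m])) % q
    return checks


text, pattern = "ABCCDDAEFG", "CDD"
good = count_verifications(text, pattern, q=1_000_000_007)  # 1: only the real match
bad = count_verifications(text, pattern, q=1)               # 8: every window collides
```

With the large prime, only the genuine match is verified. With the degenerate modulus, all `n - m + 1` windows are verified, which is the `O(n * m)` behaviour.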

---

### ✅ **Advantages:**
- Very efficient for **multiple pattern matching**.
- Hashing avoids character-by-character comparison in most cases.
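One way the multiple-pattern case might look in Python is sketched below: the pattern hashes go into a dictionary, and each window's rolling hash is looked up once. The function name `find_any` is hypothetical, and the sketch assumes a non-empty list of patterns that all share the same length.

```python
def find_any(text, patterns, d=256, q=1_000_000_007):
    """Return (index, pattern) for every occurrence of any pattern in text."""
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("this sketch assumes all patterns share one length")
    n = len(text)
    if m == 0 or m > n:
        return []

    def h(s):
        value = 0
        for ch in s:
            value = (value * d + ord(ch)) % q
        return value

    # Group patterns by hash so one lookup covers all of them.
    by_hash = {}
    for p in patterns:
        by_hash.setdefault(h(p), set()).add(p)

    high = pow(d, m - 1, q)
    t_hash = h(text[:m])
    hits = []
    for s in range(n - m + 1):
        candidates = by_hash.get(t_hash)
        # Slice and compare only when some pattern shares this hash.
        if candidates and text[s:s + m] in candidates:
            hits.append((s, text[s:s + m]))
        if s < n - m:
            t_hash = ((t_hash - ord(text[s]) * high) * d + ord(text[s + m])) % q
    return hits
```

The text is scanned once no matter how many patterns are supplied, which is why hashing suits this use case.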

---

### ❌ **Disadvantages:**
- Can suffer from **hash collisions**, leading to unnecessary comparisons.
- Needs a good hash function and a large prime `q`.

---

### 📌 **Example:**

Text: `T = "ABCCDDAEFG"`
Pattern: `P = "CDD"`

1. Compute the hash of `"CDD"`.
2. Slide through the substrings of length 3 in the text:
   - `"ABC"` → hash doesn’t match
   - `"BCC"` → hash doesn’t match
   - `"CCD"` → hash doesn’t match
   - `"CDD"` → hash match → compare → ✅ match found!
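The walk-through above can be reproduced with a short Python trace. The helper `trace_windows` is hypothetical, the small modulus `q=101` is an assumption picked so the hash values stay easy to read, and each window is rehashed from scratch for readability rather than rolled.

```python
def trace_windows(text, pattern, d=256, q=101):
    """List (window, hash, verdict) for each window up to the first match."""
    def h(s):
        value = 0
        for ch in s:
            value = (value * d + ord(ch)) % q
        return value

    m, target = len(pattern), h(pattern)
    rows = []
    for s in range(len(text) - m + 1):
        window = text[s:s + m]
        if h(window) != target:
            verdict = "hash differs, skip"
        elif window == pattern:
            verdict = "hash equal, characters equal: match"
        else:
            verdict = "hash equal, characters differ: spurious hit"
        rows.append((window, h(window), verdict))
        if window == pattern:
            break
    return rows


for row in trace_windows("ABCCDDAEFG", "CDD"):
    print(row)
```

Under these assumed parameters the first three windows hash to values different from the pattern's, so only `"CDD"` reaches the character comparison.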

---

### ✨ Summary Table:

| Feature | Rabin-Karp Algorithm |
|--------------------|----------------------------------------|
| Uses | Pattern matching in strings |
| Technique | Hashing + Sliding Window |
| Time Complexity | Best: O(n + m), Worst: O(n * m) |
| Suitable For | Large texts, Multiple pattern matching |
| Checks | First hash, then actual characters |

---
