What is market basket analysis? How is it used?
Ans. Market Basket Analysis (MBA) is a data mining technique used to uncover patterns or
relationships between items in large datasets, typically transaction data from retail or e-commerce.
The goal is to identify items that frequently co-occur in transactions — essentially, what items are
often "bought together."
How is it used?
1. Cross-Selling:
Suggest related items to customers.
E.g., “Customers who bought a laptop also bought a mouse.”
2. Product Placement:
Place related items near each other in stores.
E.g., Chips placed near soda.
3. Promotions & Bundles:
Create combo offers or discounts on related items.
E.g., Pizza + soft drink deal.
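As a rough illustration of the co-occurrence counting behind MBA, here is a minimal Python sketch (the transaction data below is made up for the example; real input would come from sales records):

from itertools import combinations
from collections import Counter

# Hypothetical baskets, one set of items per transaction.
transactions = [
    {"laptop", "mouse", "usb hub"},
    {"laptop", "mouse"},
    {"chips", "soda"},
    {"chips", "soda", "dip"},
    {"pizza", "soft drink"},
]

pair_counts = Counter()
for basket in transactions:
    # Count every unordered pair of items bought together in one basket.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Pairs that co-occur in at least two baskets are cross-selling candidates.
for pair, count in pair_counts.items():
    if count >= 2:
        print(pair, count)

Algorithms such as Apriori and FP-Growth (worked through below) turn these raw counts into frequent itemsets and association rules.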
Solve an example using the FP-Tree (FP-Growth) algorithm.
Ans.
Given five transactions (as item sets) T1 = {A, B, D, E}, T2 = {B, C, E}, T3 = {A, B, D, E}, T4 = {A, B, C, E}, T5 = {A, B, C, D, E}, with minimum support = 3.
Step 1: Count Item Frequency
A–4
B–5
C–3
D–3
E–5
Step 2: Reorder Items in Each Transaction (by descending frequency: B, E, A, C, D)
Transaction   Reordered Items
T1            B, E, A, D
T2            B, E, C
T3            B, E, A, D
T4            B, E, A, C
T5            B, E, A, C, D
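Steps 1 and 2 can be sketched in a few lines of Python (ties between items with equal frequency are broken alphabetically, which reproduces the ordering shown above):

from collections import Counter

transactions = [
    {"A", "B", "D", "E"},       # T1
    {"B", "C", "E"},            # T2
    {"A", "B", "D", "E"},       # T3
    {"A", "B", "C", "E"},       # T4
    {"A", "B", "C", "D", "E"},  # T5
]

# Step 1: support count of each item (number of transactions containing it).
freq = Counter(item for t in transactions for item in t)
print(sorted(freq.items()))  # [('A', 4), ('B', 5), ('C', 3), ('D', 3), ('E', 5)]

# Step 2: sort the items of every transaction by descending frequency.
ordered = [sorted(t, key=lambda i: (-freq[i], i)) for t in transactions]
for tid, t in enumerate(ordered, 1):
    print(f"T{tid}:", t)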
Step 3: Build FP-Tree
T1 (B → E → A → D)
T2 (B → E → C)
T3 (B → E → A → D) (merge paths, increase counts)
T4 (B → E → A → C)
T5 (B → E → A → C → D)
Tree:
[NULL]
 └─ B(5)
     └─ E(5)
         ├─ A(4)
         │   ├─ D(2)
         │   └─ C(2)
         │       └─ D(1)
         └─ C(1)
(A has count 4 because T1, T3, T4 and T5 all pass through the B → E → A path.)
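Below is a minimal Python sketch of this construction (the Node class and helper names are my own, not from any particular library). Each node stores an item, a count and its children; inserting a reordered transaction follows an existing branch where possible, incrementing counts, and creates new nodes otherwise.

class Node:
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}  # item -> child Node

def insert(root, transaction):
    node = root
    for item in transaction:
        # Reuse the existing child for this item if present, else create one.
        node = node.children.setdefault(item, Node(item))
        node.count += 1

def show(node, depth=0):
    print("  " * depth + f"{node.item}({node.count})")
    for child in node.children.values():
        show(child, depth + 1)

root = Node(None)  # the NULL root
ordered_transactions = [
    ["B", "E", "A", "D"],
    ["B", "E", "C"],
    ["B", "E", "A", "D"],
    ["B", "E", "A", "C"],
    ["B", "E", "A", "C", "D"],
]
for t in ordered_transactions:
    insert(root, t)
for child in root.children.values():
    show(child)  # prints B(5), E(5), A(4), D(2), C(2), D(1), C(1) with indentation

A full FP-Growth implementation would also maintain a header table linking all nodes of the same item, which the mining phase (conditional pattern bases) relies on; that part is omitted here for brevity.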
Final Output: Frequent Itemsets (support ≥ 3)
Examples:
• {B} – 5
• {E} – 5
• {A} – 4
• {C} – 3
• {D} – 3
• {B, E} – 5
• {A, B, E} – 4
• {B, C, E} – 3
• {A, B, D, E} – 3
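FP-Growth derives these itemsets from the tree via conditional pattern bases. At this tiny scale, the same result can be cross-checked by brute-force subset counting, as in the sketch below (this is only a verification aid, not how FP-Growth works internally):

from itertools import combinations
from collections import Counter

transactions = [
    {"A", "B", "D", "E"},
    {"B", "C", "E"},
    {"A", "B", "D", "E"},
    {"A", "B", "C", "E"},
    {"A", "B", "C", "D", "E"},
]

support = Counter()
for t in transactions:
    # Count every non-empty subset of each transaction.
    for size in range(1, len(t) + 1):
        for itemset in combinations(sorted(t), size):
            support[itemset] += 1

frequent = {s: c for s, c in support.items() if c >= 3}
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, count)  # includes all of the example itemsets listed above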
Solve an example using the Apriori algorithm.
Ans.
The Apriori Algorithm finds frequent itemsets and then derives association rules from those
itemsets.
It works level-by-level: from single items to pairs, then triples, and so on — using the Apriori
principle:
If an itemset is frequent, all its subsets must also be frequent.
Example Dataset:
Let’s use 5 transactions:
TID   Items
T1    A, B, C
T2    A, C
T3    A, D
T4    B, E
T5    A, B, C, E
Step 1: Set Minimum Support
Let’s say: Minimum support = 2
Step 2: Find Frequent 1-itemsets (L1)
Count support of each item:
A–4
B–3
C–3
D–1
E–2
Keep only items with support ≥ 2:
L1 = { A, B, C, E }
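As a sketch, this counting step looks as follows in Python (the transaction sets are written out from the table above):

from collections import Counter

transactions = [
    {"A", "B", "C"},       # T1
    {"A", "C"},            # T2
    {"A", "D"},            # T3
    {"B", "E"},            # T4
    {"A", "B", "C", "E"},  # T5
]
MIN_SUPPORT = 2

# Support of a single item = number of transactions that contain it.
item_counts = Counter(item for t in transactions for item in t)
L1 = {item for item, count in item_counts.items() if count >= MIN_SUPPORT}
print(sorted(item_counts.items()))  # [('A', 4), ('B', 3), ('C', 3), ('D', 1), ('E', 2)]
print(sorted(L1))                   # ['A', 'B', 'C', 'E']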
Step 3: Generate Candidate 2-itemsets (C2)
Make all 2-item combinations from L1:
{A,B}, {A,C}, {A,E}, {B,C}, {B,E}, {C,E}
Count their supports from the dataset:
{A,B} – 2 (T1, T5)
{A,C} – 3 (T1, T2, T5)
{A,E} – 1
{B,C} – 2 (T1, T5)
{B,E} – 2 (T4, T5)
{C,E} – 1
Keep those with support ≥ 2:
L2 = { {A,B}, {A,C}, {B,C}, {B,E} }
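Continuing the sketch above (it reuses transactions, L1 and MIN_SUPPORT), candidate generation and support counting for 2-itemsets can be written as:

from itertools import combinations

# C2: every pair of frequent single items.
C2 = [frozenset(pair) for pair in combinations(sorted(L1), 2)]

def support(itemset):
    # Number of transactions that contain the whole itemset.
    return sum(1 for t in transactions if itemset <= t)

L2 = [c for c in C2 if support(c) >= MIN_SUPPORT]
for c in L2:
    print(sorted(c), support(c))
# ['A', 'B'] 2, ['A', 'C'] 3, ['B', 'C'] 2, ['B', 'E'] 2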
Step 4: Generate Candidate 3-itemsets (C3)
From L2, generate combinations:
{A,B,C}, {A,B,E}, {B,C,E}
Check support:
{A,B,C} – 2 (T1, T5)
{A,B,E} – 1
{B,C,E} – 1
Only {A,B,C} has enough support. (By the Apriori principle, {A,B,E} and {B,C,E} could have been pruned before counting, since their subsets {A,E} and {C,E} are not in L2.)
L3 = { {A,B,C} }
No more candidates can be generated after this.
Final Frequent Itemsets:
L1: {A}, {B}, {C}, {E}
L2: {A,B}, {A,C}, {B,C}, {B,E}
L3: {A,B,C}
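The whole level-wise procedure can be sketched as a single loop in Python (self-contained; the join step unions frequent k-itemsets into (k+1)-item candidates and the prune step drops candidates with an infrequent subset). Running it reproduces L1, L2 and L3 above.

from itertools import combinations

transactions = [
    {"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E"}, {"A", "B", "C", "E"},
]
MIN_SUPPORT = 2

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

# L1: frequent single items.
items = {i for t in transactions for i in t}
Lk = {frozenset([i]) for i in items if support(frozenset([i])) >= MIN_SUPPORT}
all_frequent = {}
k = 1
while Lk:
    all_frequent.update({s: support(s) for s in Lk})
    # Join: union pairs of frequent k-itemsets into (k+1)-item candidates.
    candidates = {a | b for a in Lk for b in Lk if len(a | b) == k + 1}
    # Prune: every k-item subset of a candidate must itself be frequent.
    candidates = {c for c in candidates
                  if all(frozenset(s) in Lk for s in combinations(c, k))}
    Lk = {c for c in candidates if support(c) >= MIN_SUPPORT}
    k += 1

for itemset, count in sorted(all_frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)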
Step 5: Generate Association Rules
From the largest frequent itemset {A, B, C} (support = 2):
• A, B → C (confidence = 2/2 = 100%)
• A, C → B (confidence = 2/3 ≈ 67%)
• B, C → A (confidence = 2/2 = 100%)
• A → B, C (confidence = 2/4 = 50%)
• B → A, C (confidence = 2/3 ≈ 67%)
• C → A, B (confidence = 2/3 ≈ 67%)
Rules whose confidence meets the chosen minimum confidence threshold are kept.
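The rule-generation step for {A, B, C} can be sketched as follows, where confidence = support(whole itemset) / support(antecedent); rules below a chosen minimum confidence would be discarded:

from itertools import combinations

transactions = [
    {"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E"}, {"A", "B", "C", "E"},
]

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

itemset = frozenset({"A", "B", "C"})
for r in range(1, len(itemset)):
    for antecedent in map(frozenset, combinations(sorted(itemset), r)):
        consequent = itemset - antecedent
        conf = support(itemset) / support(antecedent)
        print(f"{sorted(antecedent)} -> {sorted(consequent)}: "
              f"confidence = {support(itemset)}/{support(antecedent)} = {conf:.0%}")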