
Chapter 9

MBA is a set of techniques, Association Rules being the most common, that focuses on point-of-sale (p-o-s) transaction data. It can be used to:
• Identify who customers are (not by name)
• Understand why they make certain purchases
• Gain insight about its merchandise (products): fast and slow movers, products which are purchased together, products which might benefit from promotion
• Take action: store layouts; which products to put on specials, promote, coupons.

Uploaded by

vinay_bhandari_2
Copyright
© Attribution Non-Commercial (BY-NC)
 
Data Mining Techniques So Far…
• Chapter 5 – Statistics
• Chapter 6 – Decision Trees
• Chapter 7 – Neural Networks
• Chapter 8 – Nearest Neighbor Approaches: Memory-Based Reasoning and Collaborative Filtering

What can be inferred?
• I purchase diapers
• I purchase a new car
• I purchase OTC cough medicine
• I purchase a prescription medication
• I don't show up for class

Market Basket Analysis
• Retail – each customer purchases a different set of products, in different quantities, at different times
• MBA uses this information to:
  – Identify who customers are (not by name)
  – Understand why they make certain purchases
  – Gain insight about its merchandise (products):
    • Fast and slow movers
    • Products which are purchased together
    • Products which might benefit from promotion
  – Take action:
    • Store layouts
    • Which products to put on specials, promote, coupons…
• Combining all of this with a customer loyalty card makes it even more valuable

Association Rules
• DM technique most closely allied with Market Basket Analysis
• AR can be automatically generated
  – AR represent patterns in the data without a specified target variable
  – Good example of undirected data mining
  – Whether patterns make sense is up to humanoids (us!)

Association Rules Apply Elsewhere
• Besides retail – supermarkets, etc…
• Purchases made using credit/debit cards
• Optional Telco Service purchases
• Banking services
• Unusual combinations of insurance claims can be a warning of fraud
• Medical patient histories

Market Basket Analysis Drill-Down
• MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
• 3 types of market basket data (p-o-s data)
  – Customers
  – Orders (basic purchase data)
  – Items (merchandise/services purchased)

Transaction Data
• Lots of questions can be answered
  – Avg # of orders/customer
  – Avg # unique items/order
  – Avg # of items/order
  – For a product
    • What % of customers have purchased it
    • Avg # orders/customer that include it
    • Avg quantity of it purchased/order
  – Etc…
• Visualization is extremely helpful…next slide

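The summary questions on this slide can be answered with a few passes over order-line data. The sketch below is only an illustration: the `order_lines` records, their field layout, and the `product_stats` helper are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical order lines (made-up data): (customer_id, order_id, item, quantity).
# Assumes each item appears at most once per order.
order_lines = [
    ("c1", "o1", "OJ", 2),
    ("c1", "o1", "soda", 1),
    ("c1", "o2", "OJ", 1),
    ("c2", "o3", "milk", 1),
    ("c2", "o3", "OJ", 3),
    ("c2", "o3", "soda", 2),
]

orders_by_customer = defaultdict(set)
items_by_order = defaultdict(set)
quantity_by_order = defaultdict(int)
for customer, order, item, qty in order_lines:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    quantity_by_order[order] += qty

# Avg # of orders/customer
avg_orders_per_customer = (
    sum(len(orders) for orders in orders_by_customer.values()) / len(orders_by_customer)
)
# Avg # unique items/order
avg_unique_items_per_order = (
    sum(len(items) for items in items_by_order.values()) / len(items_by_order)
)
# Avg # of items/order (total quantity)
avg_items_per_order = sum(quantity_by_order.values()) / len(quantity_by_order)


def product_stats(product):
    """Return (share of customers who bought the product, avg quantity per order containing it)."""
    buyers = {c for c, _, item, _ in order_lines if item == product}
    quantities = [q for _, _, item, q in order_lines if item == product]
    return len(buyers) / len(orders_by_customer), sum(quantities) / len(quantities)


print(avg_orders_per_customer, avg_unique_items_per_order, avg_items_per_order)
print(product_stats("milk"))
```

In practice these aggregates would usually come from a database query or a data-frame library; plain dictionaries are used here only to keep the sketch self-contained.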
Sales Order Characteristics

Sales Order Characteristics
• Did the order use gift wrap?
• Billing address same as Shipping address?
• Did purchaser accept/decline a cross-sell?
• What is the most common item found on a one-item order?
• What is the most common item found on a multi-item order?
• What is the most common item for repeat customer purchases?
• How has ordering of an item changed over time?
• How does the ordering of an item vary geographically?
• Yada…yada…yada…

Pivoting for Cluster Algorithms

Association Rules
• Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars [Forbes, Sept 8, 1997]
• Customers who purchase maintenance agreements are very likely to purchase large appliances (author experience)
• When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners (author experience)
• So what…

Association Rules
• Association rule types:
  – Actionable Rules – contain high-quality, actionable information
  – Trivial Rules – information already well-known by those familiar with the business
  – Inexplicable Rules – no explanation and do not suggest action
• Trivial and Inexplicable Rules occur most often

How Good is an Association Rule?

POS Transactions

Customer  Items Purchased
1         OJ, soda
2         Milk, OJ, window cleaner
3         OJ, detergent
4         OJ, detergent, soda
5         Window cleaner, soda

Co-occurrence of Products

                OJ  Window cleaner  Milk  Soda  Detergent
OJ               4               1     1     2          2
Window cleaner   1               2     1     1          0
Milk             1               1     1     0          0
Soda             2               1     0     3          1
Detergent        2               0     0     1          2

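As a rough illustration of how the co-occurrence counts follow from the five transactions, here is a minimal Python sketch. The variable names are illustrative; the diagonal holds the number of orders containing each item, and the off-diagonal cells hold the number of orders containing both items.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the slide.
transactions = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]

items = ["OJ", "window cleaner", "milk", "soda", "detergent"]

counts = Counter()
for basket in transactions:
    for item in basket:
        counts[(item, item)] += 1          # diagonal: orders containing the item
    for a, b in combinations(sorted(basket), 2):
        counts[(a, b)] += 1                # off-diagonal: orders containing both items
        counts[(b, a)] += 1                # keep the matrix symmetric

# Missing pairs default to 0 because Counter returns 0 for unseen keys.
matrix = [[counts[(row, col)] for col in items] for row in items]

for row_name, row in zip(items, matrix):
    print(f"{row_name:15}", *row)
```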
How Good is an Association Rule?

                OJ  Window cleaner  Milk  Soda  Detergent
OJ               4               1     1     2          2
Window cleaner   1               2     1     1          0
Milk             1               1     1     0          0
Soda             2               1     0     3          1
Detergent        2               0     0     1          2

Simple patterns:
1. OJ and soda are purchased together more often than most other pairs (2 of 5 orders, tied with OJ and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent

How Good is an Association Rule?

POS Transactions

Customer  Items Purchased
1         OJ, soda
2         Milk, OJ, window cleaner
3         OJ, detergent
4         OJ, detergent, soda
5         Window cleaner, soda

• What is the confidence for this rule:
  – If a customer purchases soda, then customer also purchases OJ
  – 2 out of 3 soda purchases also include OJ, so 67%
• What about the confidence of this rule reversed?
  – 2 out of 4 OJ purchases also include soda, so 50%
• Confidence = ratio of the number of transactions with all the items to the number of transactions with just the “if” items

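The confidence figures for the soda/OJ rule can be checked with a short Python sketch. The `confidence` function is an illustrative helper, not part of any library; it counts the transactions with all the items and divides by the transactions with just the “if” items.

```python
# The five point-of-sale transactions from the slide.
transactions = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def confidence(if_items, then_items, baskets):
    """Share of baskets containing the "if" items that also contain the "then" items."""
    if_items, then_items = set(if_items), set(then_items)
    with_if = [b for b in baskets if if_items <= b]
    if not with_if:
        raise ValueError("no transaction contains the 'if' items")
    with_all = [b for b in with_if if then_items <= b]
    return len(with_all) / len(with_if)


soda_then_oj = confidence({"soda"}, {"OJ"}, transactions)
oj_then_soda = confidence({"OJ"}, {"soda"}, transactions)
print(f"soda -> OJ: {soda_then_oj:.0%}")  # prints "soda -> OJ: 67%"
print(f"OJ -> soda: {oj_then_soda:.0%}")  # prints "OJ -> soda: 50%"
```

The two directions differ because the denominators differ: soda appears in 3 orders, OJ in 4.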
How Good is an Association Rule?
• How much better than chance is a rule?
• Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
• Lift is the ratio of the records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
• Calculating lift…p. 310… When lift > 1, the rule is better at predicting the result than guessing
• When lift < 1, the rule is doing worse than informed guessing, and using the negative rule (predicting that the result does not occur) produces a better rule than guessing
• Co-occurrence can occur in 3, 4, or more dimensions…

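A minimal sketch of the lift calculation, applied to the five-transaction example from the earlier slides. It assumes the common formulation of lift as the observed share of transactions supporting the whole rule divided by the share expected if the “if” and “then” items were independent. The helper names `support` and `lift` are illustrative.

```python
# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)


def lift(if_items, then_items, baskets):
    """Observed support of the whole rule divided by the support expected under independence."""
    observed = support(set(if_items) | set(then_items), baskets)
    expected = support(if_items, baskets) * support(then_items, baskets)
    return observed / expected  # undefined (division by zero) if either side never occurs


soda_oj = lift({"soda"}, {"OJ"}, transactions)
oj_detergent = lift({"OJ"}, {"detergent"}, transactions)
print(f"soda -> OJ lift: {soda_oj:.2f}")            # below 1
print(f"OJ -> detergent lift: {oj_detergent:.2f}")  # above 1
```

On this toy data the soda/OJ rule has a lift of about 0.83 even though its confidence is 67%, because OJ already appears in 80% of all orders. The OJ/detergent rule has a lift of 1.25.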
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items

Overcoming Practical Limits for
Association Rules
1. Generate co-occurrence matrix for single items…
2. Generate co-occurrence matrix for two items…
3. Generate co-occurrence matrix for three items… “if … and … and … Cleaner” then soda
4. Etc…

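The size-by-size progression above can be sketched as repeated counting of item combinations of increasing size. This is a brute-force illustration on the five example transactions from the earlier slides, not the method prescribed by the slides. It applies no pruning, so it would not scale to thousands of items. The `itemset_counts` helper is hypothetical.

```python
from collections import Counter
from itertools import combinations

# The five point-of-sale transactions from the earlier slides.
transactions = [
    {"OJ", "soda"},
    {"milk", "OJ", "window cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "soda"},
    {"window cleaner", "soda"},
]


def itemset_counts(baskets, size):
    """Count how many baskets contain each combination of `size` items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each combination one canonical tuple key.
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return counts


for size in (1, 2, 3):
    counts = itemset_counts(transactions, size)
    print(f"{size}-item combinations seen: {len(counts)}")
    for combo, n in counts.most_common(3):
        print("   ", combo, n)
```

Only combinations that actually occur in some basket are counted, which keeps the tables far smaller than the full set of possible combinations.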
Final Thought on Association Rules:
The Problem of Lots of Data
• Fast Food Restaurant…could have 100 items on its menu
  – How many combinations are there with 3 different menu items? 161,700!
• Supermarket…10,000 or more unique items
  – 50 million 2-item combinations
  – 100 billion 3-item combinations
• Use of product hierarchies (groupings) helps address this common issue
• Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)

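The combination counts on this slide are binomial coefficients ("n choose k"), which can be checked with Python's standard `math.comb` (Python 3.8+). The variable names below are illustrative.

```python
from math import comb

menu_triples = comb(100, 3)            # 3 different items from a 100-item menu
supermarket_pairs = comb(10_000, 2)    # 2-item combinations from 10,000 items
supermarket_triples = comb(10_000, 3)  # 3-item combinations from 10,000 items

print(f"{menu_triples:,}")         # 161,700
print(f"{supermarket_pairs:,}")    # 49,995,000
print(f"{supermarket_triples:,}")  # 166,616,670,000
```

The exact three-item count for 10,000 items is roughly 167 billion, so the slide's three-item figure is best read as an order of magnitude.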
End of Chapter 9
