
Multi Level Association Rules

The document discusses different types of association rule mining including: 1) Multi-level association rule mining which generates rules at different levels of abstraction to provide both common sense and low-level rules. 2) Multi-dimensional association rule mining which involves rules with more than one dimension or predicate, such as rules involving both customer attributes and purchases. 3) Techniques for mining quantitative association rules including static and dynamic discretization of numeric attributes as well as clustering-based approaches. The goal is to generate meaningful rules involving both categorical and quantitative attributes.

Uploaded by

Uttam Singh
Copyright
© Attribution Non-Commercial (BY-NC)

Support for an itemset X in a transactional database D is defined as count(X) / |D|.

For an association rule X ⇒ Y, we can calculate

support(X ⇒ Y) = support(X ∪ Y).


confidence(X ⇒ Y) = support(X ∪ Y) / support(X).

     Support (S) and Confidence (C) can also be related to joint probabilities and conditional
probabilities as follows.

support(X ⇒ Y) = P(X ∪ Y).


confidence(X ⇒ Y) = P(Y | X).
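As a minimal sketch, these definitions can be computed directly over a toy transaction database (transactions and item names are invented for illustration):

```python
# Toy transaction database D; each transaction is a set of items.
D = [
    {"computer", "printer"},
    {"computer", "printer", "scanner"},
    {"computer"},
    {"printer"},
    {"computer", "printer"},
]

def support(itemset, db):
    """support(X) = count(X) / |D|: fraction of transactions containing X."""
    return sum(1 for t in db if itemset <= t) / len(db)

def confidence(x, y, db):
    """confidence(X => Y) = support(X u Y) / support(X)."""
    return support(x | y, db) / support(x, db)

print(support({"computer", "printer"}, D))       # 0.6
print(confidence({"computer"}, {"printer"}, D))  # ~0.75
```

Here confidence is exactly the conditional probability P(Y | X) estimated from the database: the fraction of transactions containing X that also contain Y.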

     The number of association rules that can be derived from a dataset D is exponentially
large. Interesting association rules are those whose support and confidence exceed the
user-specified thresholds minSupp and minConf.

Frequent itemsets (also called large itemsets) are those itemsets whose support is at least
minSupp. The Apriori property (downward closure property) states that every subset of a
frequent itemset is also frequent.
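The downward closure property can be verified mechanically on a small invented database: if an itemset is frequent, every non-empty subset of it must be frequent too.

```python
from itertools import combinations

# Toy transaction database (invented for illustration).
D = [
    {"computer", "printer"},
    {"computer", "printer", "scanner"},
    {"computer"},
    {"printer"},
    {"computer", "printer"},
]
minSupp = 0.4

def support(itemset, db):
    """support(X) = count(X) / |D|."""
    return sum(1 for t in db if itemset <= t) / len(db)

X = frozenset({"computer", "printer"})
assert support(X, D) >= minSupp  # X is frequent (support = 0.6)

# Apriori property: every non-empty proper subset of a frequent
# itemset must itself be frequent.
for k in range(1, len(X)):
    for sub in combinations(X, k):
        assert support(frozenset(sub), D) >= minSupp
print("all subsets of X are frequent")
```

Apriori exploits the contrapositive for pruning: if any subset of a candidate is infrequent, the candidate cannot be frequent and need not be counted.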

Multi Level Association Rules – Concepts:

o Rules generated by mining data at different levels of abstraction

o Mining at different levels is essential to support business decision making

o Data is massive and highly sparse at the primitive level

o Rules at a high concept level add common-sense knowledge

o Rules at a low concept level may not always be interesting

Example:

o Items in the task-relevant data are at the primitive level

o Primitive data items occur least frequently


buys (hp-laptop computer) ⇒ buys (canon-inkjet printer)

Vs

buys (laptop computer) ⇒ buys (inkjet printer)

Vs

buys (computer) ⇒ buys (printer)

o Support–Confidence framework

o Top-down strategy in accumulating counts

o Algorithms – Apriori & its variations

o Variations include

o Uniform support for all levels

o Reduced support at lower levels
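A minimal sketch of the two threshold variations, using an invented three-level item hierarchy and invented counts:

```python
# Invented item counts along one branch of a concept hierarchy.
total = 100
counts = {
    "computer": 40,           # level 1 (most general)
    "laptop computer": 15,    # level 2
    "hp-laptop computer": 4,  # level 3 (primitive)
}
level = {"computer": 1, "laptop computer": 2, "hp-laptop computer": 3}

def frequent(min_sup_by_level):
    """Items whose support meets the threshold set for their level."""
    return [item for item, c in counts.items()
            if c / total >= min_sup_by_level[level[item]]]

uniform = {1: 0.10, 2: 0.10, 3: 0.10}  # same min_sup at every level
reduced = {1: 0.10, 2: 0.08, 3: 0.03}  # lower min_sup at lower levels

print(frequent(uniform))  # ['computer', 'laptop computer']
print(frequent(reduced))  # ['computer', 'laptop computer', 'hp-laptop computer']
```

With uniform support the primitive-level item is lost; reduced support keeps it, at the cost of examining more candidates at the lower levels.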

Mining (UNIFORM SUPPORT):

o Same support for all levels of abstraction

o Descendants of an itemset that fails minimum support are not examined

o A higher support threshold loses interesting associations at lower abstraction levels

o A lower support threshold produces many uninteresting associations at higher abstraction levels


o Alternative search strategies

o Level-by-level independent

 Full-breadth search

 No background knowledge is used for pruning

 Leads to examining a lot of infrequent items

o Level-cross filtering by single item

 A node at level i is examined only if its parent node at level i−1 is frequent

 May miss frequent items at lower abstraction levels (due to reduced support)

o Level-cross filtering by k-itemset

 A k-itemset at level i is examined only if the corresponding k-itemset at level i−1 is frequent

 May miss frequent k-itemsets at lower abstraction levels (due to reduced support)

o Controlled level-cross filtering by single item

o A modified level-cross filtering by single item

o Sets a level passage threshold for each level

o Allows inspection of lower abstractions even if an ancestor fails to satisfy the
min_sup threshold
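A simplified sketch of the passage-threshold idea, with invented counts and thresholds: a node's children are still inspected when the node meets the (lower) passage threshold, even though it fails min_sup at its own level.

```python
# Invented counts along one branch of a three-level hierarchy.
total = 100
counts = {"computer": 12, "laptop computer": 9, "hp-laptop computer": 8}
parent = {"laptop computer": "computer",
          "hp-laptop computer": "laptop computer"}

min_sup = {"computer": 0.10, "laptop computer": 0.10,
           "hp-laptop computer": 0.07}
passage = {"computer": 0.08, "laptop computer": 0.08}  # level passage thresholds

def examined(item):
    """A node is examined if every ancestor passes either its min_sup
    or its (lower) level passage threshold."""
    p = parent.get(item)
    if p is None:
        return True  # root of the hierarchy is always examined
    return examined(p) and counts[p] / total >= min(min_sup[p], passage[p])

# 'laptop computer' (support 9%) fails min_sup = 10% at its level, but
# passes the 8% passage threshold, so its child is still inspected.
print(examined("hp-laptop computer"))  # True
```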

Computer ⇒ Printer

(Same abstraction level)


Computer ⇒ InkJet Printer (cross-level association rule)

(Different abstraction levels)

Redundancy:

Laptop Computer ⇒ InkJet Printer

(Support = 10%, confidence = 70%)

Vs

HP Laptop Computer ⇒ InkJet Printer

(Support = 5%, confidence = 68%)

o The second rule is redundant: "laptop computer" is an ancestor of "HP laptop computer", and the second rule's support and confidence are close to the values expected from the first rule
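A hedged sketch of this redundancy test. The support and confidence figures are those from the example above; the 50% share of HP laptops among laptop purchases and the 5% tolerance are assumed figures chosen for illustration.

```python
# R1: laptop computer => inkjet printer        (ancestor rule)
# R2: hp-laptop computer => inkjet printer     (descendant rule)
ancestor = {"support": 0.10, "confidence": 0.70}
descendant = {"support": 0.05, "confidence": 0.68}
hp_share = 0.50  # assumed: half of all laptop purchases are HP laptops

# Expected support of R2 if it carries no information beyond R1:
expected_support = ancestor["support"] * hp_share  # 0.05

def redundant(desc, exp_sup, anc_conf, tol=0.05):
    """R2 is redundant if its support and confidence fall within a small
    tolerance of the values expected from its ancestor rule."""
    return (abs(desc["support"] - exp_sup) <= tol * exp_sup
            and abs(desc["confidence"] - anc_conf) <= tol * anc_conf)

print(redundant(descendant, expected_support, ancestor["confidence"]))  # True
```

Because R2's support (5%) equals the expected value and its confidence (68%) is close to the ancestor's (70%), R2 adds nothing beyond R1 and can be pruned.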

Multi Dimensional Association Rules – Concepts:


=> Rules involving more than one dimension or predicate

• buys (X, "IBM Laptop Computer") ⇒ buys (X, "HP Inkjet Printer")

(Single-dimensional)

• age (X, "20..25") ∧ occupation (X, "student") ⇒ buys (X, "HP Inkjet Printer")

(Multi-dimensional: inter-dimension association rule)

• age (X, "20..25") ∧ buys (X, "IBM Laptop Computer") ⇒ buys (X, "HP Inkjet Printer")

(Multi-dimensional: hybrid-dimension association rule)

• Attributes can be categorical or quantitative

• Quantitative attributes are numeric and have an implicit ordering or hierarchy (e.g., age, income)

• Numeric attributes must be discretized

• Three approaches to mining multi-dimensional association rules:

o Using static discretization of quantitative attributes

o Using dynamic discretization of quantitative attributes

o Using distance-based discretization with clustering

Mining using Static Discretization:

• Discretization is static and occurs prior to mining

• Discretized attributes are treated as categorical


• Use the Apriori algorithm to find all frequent k-predicate sets

• Every subset of a frequent predicate set must also be frequent

• In a data cube, if the 3-D cuboid (age, income, buys) is frequent, then the cuboids (age, income),
(age, buys), and (income, buys) must also be frequent
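A minimal sketch of the static step: numeric attributes are binned once, before mining, and the resulting bins are then treated as categorical predicate values. Records and bin edges are invented for illustration.

```python
# Invented customer records with two numeric and one categorical attribute.
records = [
    {"age": 23, "income": 32000, "buys": "laptop"},
    {"age": 24, "income": 38000, "buys": "laptop"},
    {"age": 41, "income": 62000, "buys": "printer"},
]

def bucket(value, upper_edges, labels):
    """Return the label of the first bin whose upper edge exceeds value."""
    for hi, lab in zip(upper_edges, labels):
        if value < hi:
            return lab
    return labels[-1]

def discretize(r):
    """Static discretization: fixed bins chosen before mining begins."""
    return {
        "age": bucket(r["age"], [30, 50], ["20..29", "30..49", "50+"]),
        "income": bucket(r["income"], [40000], ["<40K", ">=40K"]),
        "buys": r["buys"],
    }

print(discretize(records[0]))
# {'age': '20..29', 'income': '<40K', 'buys': 'laptop'}
```

The discretized records can then be fed to Apriori as ordinary categorical data, with each (attribute, bin) pair acting as one predicate value.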

Mining using Dynamic Discretization:

• Known as Mining Quantitative Association Rules

• Numeric attributes are dynamically discretized

• Consider rules of the form

A_quan1 ∧ A_quan2 ⇒ A_cat

(2-D quantitative association rules)

age(X, "20..25") ∧ income(X, "30K..40K") ⇒ buys (X, "Laptop Computer")

• ARCS (Association Rule Clustering System) is an approach for mining quantitative
association rules.

• Two-step mining process:

o Perform clustering to find the intervals of the attributes involved

o Obtain association rules by searching for groups of clusters that occur together
• The resultant rules must satisfy:

o Clusters in the rule antecedent are strongly associated with clusters in
the consequent

o Clusters in the antecedent occur together

o Clusters in the consequent occur together
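The two steps can be sketched, much simplified, on invented data: bin the two quantitative antecedent attributes onto a 2-D grid, mark cells where the categorical consequent occurs often enough, then merge adjacent marked cells into clusters that become rule intervals. (Real ARCS is more elaborate; this is only the shape of the idea.)

```python
from collections import Counter

# Invented records: (age, income in K, item bought).
data = [
    (22, 31, "laptop"), (23, 33, "laptop"), (24, 36, "laptop"),
    (23, 34, "laptop"), (45, 60, "printer"),
]

def cell(age, income):
    """Map a record onto a 2-D grid cell (5-year x 10K bins)."""
    return (age // 5, income // 10)

# Step 1: count, per grid cell, how often the consequent "laptop" occurs,
# and mark cells with enough support.
grid = Counter(cell(a, i) for a, i, item in data if item == "laptop")
min_count = 2
marked = {c for c, n in grid.items() if n >= min_count}

# Step 2 (simplified): merge marked cells that share a grid edge into
# clusters; each cluster yields the intervals of one rule antecedent.
def adjacent(c1, c2):
    return abs(c1[0] - c2[0]) + abs(c1[1] - c2[1]) == 1

clusters = []
for c in sorted(marked):
    for cl in clusters:
        if any(adjacent(c, m) for m in cl):
            cl.add(c)
            break
    else:
        clusters.append({c})

# One cluster covering ages 20..24 and incomes 30K..39K, suggesting:
# age(X, "20..24") ^ income(X, "30K..39K") => buys(X, "laptop")
print(clusters)  # [{(4, 3)}]
```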
