FP Growth Algorithm Example Problems

Uploaded by

ragnarock1902
Example-1

Consider the following data:-

The above-given data is a hypothetical dataset of transactions with each letter representing an
item. The frequency of each individual item is computed:-

Let the minimum support be 3. A Frequent Pattern set is built which will contain all the
elements whose frequency is greater than or equal to the minimum support. These elements are
stored in descending order of their respective frequencies. After insertion of the relevant items,
the set L looks like this:-

L = {K : 5, E : 4, M : 3, O : 3, Y : 3}

Now, for each transaction, the respective Ordered-Item set is built. It is done by iterating over the
Frequent Pattern set and checking whether the current item is contained in the transaction in question.
If the current item is contained, it is inserted into the Ordered-Item set for the current
transaction. The following table is built for all the transactions:

Now, all the Ordered-Item sets are inserted into a Trie Data Structure.

a) Inserting the set {K, E, M, O, Y}:

Here, all the items are simply linked one after the other in the order of occurrence in the set, and
the support count for each item is initialized to 1.

b) Inserting the set {K, E, O, Y}:

Up to the insertion of the elements K and E, the support count is simply increased by 1. On
inserting O, we can see that there is no direct link between E and O, so a new node for the
item O is initialized with a support count of 1, and item E is linked to this new node. On
inserting Y, we first initialize a new node for the item Y with a support count of 1 and link the new
node of O to the new node of Y.

c) Inserting the set {K, E, M}:

Here simply the support count of each element is increased by 1.

d) Inserting the set {K, M, Y}:

Similar to step b), first the support count of K is increased, then new nodes for M and Y are
initialized and linked accordingly.

e) Inserting the set {K, E, O}:

Here simply the support counts of the respective elements are increased. Note that the support
count of the new node of item O is increased.
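
The insertion steps a) to e) can be sketched in a few lines of Python. This is only a minimal illustration, not a full FP-Growth implementation: the names `Node`, `insert` and `paths` are hypothetical, and the input is the five Ordered-Item sets listed in the steps above.

```python
class Node:
    """One FP-tree node: an item label, a support count, and child links."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, ordered_items):
    """Walk down from the root, reusing matching children and creating new ones."""
    node = root
    for item in ordered_items:
        child = node.children.get(item)
        if child is None:
            # No direct link exists yet, so a new node is initialized.
            child = Node(item)
            node.children[item] = child
        child.count += 1
        node = child


def paths(node, prefix=()):
    """Yield every root-to-node path as a tuple of (item, count) pairs."""
    for child in node.children.values():
        here = prefix + ((child.item, child.count),)
        yield here
        yield from paths(child, here)


# Ordered-Item sets from steps a) to e).
ordered_sets = [
    ["K", "E", "M", "O", "Y"],
    ["K", "E", "O", "Y"],
    ["K", "E", "M"],
    ["K", "M", "Y"],
    ["K", "E", "O"],
]

root = Node()
for items in ordered_sets:
    insert(root, items)

for path in paths(root):
    print(" -> ".join(f"{item}:{count}" for item, count in path))
```

After all five insertions, K has a count of 5, E has 4, and E has two children: M (count 2) and the newer O node (count 2), which matches the note in step e).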

Now, for each item, the Conditional Pattern Base is computed, which is the path labels of all the
paths which lead to any node of the given item in the frequent-pattern tree. Note that the items in
the below table are arranged in ascending order of their frequencies.

Now, for each item, the Conditional Frequent Pattern Tree is built. It is done by taking the set
of elements that are common to all the paths in the Conditional Pattern Base of that item and
calculating its support count by summing the support counts of all the paths in the Conditional
Pattern Base.

From the Conditional Frequent Pattern Tree, the Frequent Pattern rules are generated by pairing
the items of the Conditional Frequent Pattern Tree set with the corresponding item, as given
in the below table.

For each row, two types of association rules can be inferred. For example, from the first row
the rules K -> Y and Y -> K can be inferred. To determine the valid rule,
the confidence of both rules is calculated, and a rule is retained only if its confidence is greater
than or equal to the minimum confidence value.
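
As a rough sketch of the confidence check for K -> Y and Y -> K: confidence(X -> Y) is support(X and Y together) divided by support(X). The supports below are counted from the five Ordered-Item sets of steps a) to e), which preserve every frequent item of each transaction. The threshold `min_confidence = 0.8` is an assumed value for illustration only; the example does not state one.

```python
# Ordered-Item sets from steps a) to e), used here as the transactions.
transactions = [
    {"K", "E", "M", "O", "Y"},
    {"K", "E", "O", "Y"},
    {"K", "E", "M"},
    {"K", "M", "Y"},
    {"K", "E", "O"},
]


def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    return support(antecedent | consequent) / support(antecedent)


min_confidence = 0.8  # assumed threshold, not given in the example

rules = {
    "K -> Y": confidence({"K"}, {"Y"}),  # 3 / 5 = 0.6
    "Y -> K": confidence({"Y"}, {"K"}),  # 3 / 3 = 1.0
}
valid_rules = [name for name, conf in rules.items() if conf >= min_confidence]
print(rules)
print(valid_rules)
```

Under this assumed threshold only Y -> K is retained, because every transaction containing Y also contains K, while only 3 of the 5 transactions containing K also contain Y.
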
Example-2

Support threshold=50%

Table 1

Transaction List of items


T1 I1,I2,I3
T2 I2,I3,I4
T3 I4,I5
T4 I1,I2,I4
T5 I1,I2,I3,I5
T6 I1,I2,I3,I4

Solution:

Support threshold=50% => 0.5*6= 3 => min_sup=3

1. Count of each item

Table 2

Item Count
I1 4
I2 5
I3 4
I4 4
I5 2

2. Sort the items in descending order of count. I5 is dropped because its count (2) is below min_sup.

Table 3

Item Count
I2 5
I1 4
I3 4
I4 4
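
Steps 1 and 2 can be reproduced with a short Python sketch over the transactions in Table 1. The variable names are illustrative, and breaking ties between equal counts by item name is an assumption that happens to reproduce the order of Table 3.

```python
import math
from collections import Counter

# Transactions from Table 1.
transactions = {
    "T1": ["I1", "I2", "I3"],
    "T2": ["I2", "I3", "I4"],
    "T3": ["I4", "I5"],
    "T4": ["I1", "I2", "I4"],
    "T5": ["I1", "I2", "I3", "I5"],
    "T6": ["I1", "I2", "I3", "I4"],
}

support_threshold = 0.5
# 50% of 6 transactions gives a minimum support count of 3.
min_sup = math.ceil(support_threshold * len(transactions))

# Step 1: count every item (Table 2).
counts = Counter(item for items in transactions.values() for item in items)

# Step 2: keep items meeting min_sup, sorted by descending count (Table 3).
# Ties are broken by item name (an assumption).
frequent = sorted(
    ((item, n) for item, n in counts.items() if n >= min_sup),
    key=lambda pair: (-pair[1], pair[0]),
)

print(min_sup)
print(frequent)
```
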
3. Build FP Tree

1. Start with a root node labelled null.

2. T1: I1, I2, I3. Sorted by the order in Table 3, the sequence is I2, I1, I3. I2 is linked as a
child of the root, I1 is linked to I2 and I3 is linked to I1. The counts are {I2:1}, {I1:1}, {I3:1}.
3. T2: I2, I3, I4. I2 is linked to the root, I3 is linked to I2 and I4 is
linked to I3. This branch shares the I2 node, as it was already created for T1.
4. The count of I2 is therefore incremented by 1, a new I3 node is linked as a child of I2, and I4 is
linked as a child of that I3. The counts are {I2:2}, {I3:1}, {I4:1}.
5. T3: I4, I5. Similarly, a new branch is created in which I5 is linked to I4 as a child.
6. T4: I1, I2, I4. The sequence will be I2, I1, and I4. I2 is already linked to the root node,
hence its count is incremented by 1. Similarly, the count of I1 is incremented by 1, as it is already
linked to I2 from T1, and a new I4 node is linked to I1. Thus {I2:3}, {I1:2}, {I4:1}.
7. T5: I1, I2, I3, I5. The sequence will be I2, I1, I3, and I5. Thus {I2:4}, {I1:3}, {I3:2},
{I5:1}.
8. T6: I1, I2, I3, I4. The sequence will be I2, I1, I3, and I4. Thus {I2:5}, {I1:4}, {I3:3},
{I4:1}.

4. Mining of FP-tree is summarized below:

1. The lowest node item, I5, is not considered because it does not meet the minimum support count,
hence it is deleted.
2. The next lowest node is I4. I4 occurs in 4 branches: {I2,I1,I3,I4:1}, {I2,I3,I4:1}, {I2,I1,I4:1}
and {I4:1}. Therefore, considering I4 as the suffix, the non-empty prefix paths are {I2,I1,I3:1},
{I2,I3:1} and {I2,I1:1}. This forms the conditional pattern base.
3. The conditional pattern base is treated as a transaction database and an FP-tree is
constructed from it. This will contain only {I2:3}; I1 and I3 are not considered, as each has a
count of 2 and does not meet the min support count.
4. This path will generate the frequent pattern {I2,I4:3}. The itemsets {I3,I4} and {I2,I3,I4}
occur only twice (in T2 and T6), so they are not frequent.
5. For I3, the prefix paths would be {I2,I1:3}, {I2:1}. This will generate a 2-node FP-tree,
{I2:4, I1:3}, and the frequent patterns generated are {I2,I3:4}, {I1,I3:3}, {I2,I1,I3:3}.
6. For I1, the prefix path would be {I2:4}. This will generate a single-node FP-tree, {I2:4},
and the frequent pattern generated is {I2,I1:4}.

Item  Conditional Pattern Base              Conditional FP-tree  Frequent Patterns Generated

I4    {I2,I1,I3:1}, {I2,I3:1}, {I2,I1:1}    {I2:3}               {I2,I4:3}
I3    {I2,I1:3}, {I2:1}                     {I2:4, I1:3}         {I2,I3:4}, {I1,I3:3}, {I2,I1,I3:3}
I1    {I2:4}                                {I2:4}               {I2,I1:4}
Example-3
Consider the below dataset. The minimum support given is 3.

TID Items Bought


100 f, a, c, d, g, i, m, p
200 a, b, c, f, l, m, o
300 b, f, h, j, o
400 b, c, k, s, p
500 a, f, c, e, l, p, m, n

In the frequent pattern growth algorithm, first, we find the frequency of each item. The following
table gives the frequency of each item in the given data.

Item Frequency Item Frequency


a 3 j 1
b 3 k 1
c 4 l 2
d 1 m 3
e 1 n 1
f 4 o 2
g 1 p 3
h 1 s 1
i 1

A Frequent Pattern set (L) is built, which will contain all the elements whose frequency is
greater than or equal to the minimum support, which is 3 here.

These elements are stored in descending order of their respective frequencies.

After insertion of the relevant items, the set L looks like this:-

L = { (f:4), (c:4), (a:3), (b:3), (m:3), (p:3) }

Now, for each transaction, the respective Ordered-Item set is built.


Frequent Pattern set L = { (f:4), (c:4), (a:3), (b:3), (m:3), (p:3) }

TID   Items Bought              (Ordered) Frequent Items

100   f, a, c, d, g, i, m, p    f, c, a, m, p
200   a, b, c, f, l, m, o       f, c, a, b, m
300   b, f, h, j, o             f, b
400   b, c, k, s, p             c, b, p
500   a, f, c, e, l, p, m, n    f, c, a, m, p
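
The ordering step can be sketched as follows: iterate over L in its given order and keep each item that the transaction contains. The order of L is taken directly from the set above, including its order among items of equal frequency, and the helper name `ordered_frequent_items` is hypothetical.

```python
# Frequent Pattern set L, in the order given above.
L = ["f", "c", "a", "b", "m", "p"]

# Transactions from the dataset (TID -> items bought).
transactions = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}


def ordered_frequent_items(items, frequent_order):
    """Keep only frequent items, arranged in the order of the Frequent Pattern set."""
    present = set(items)
    return [item for item in frequent_order if item in present]


ordered = {tid: ordered_frequent_items(items, L) for tid, items in transactions.items()}
for tid, items in ordered.items():
    print(tid, ", ".join(items))
```
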

Now, all the Ordered-Item sets are inserted into a Trie Data Structure (frequent pattern tree).

a) Inserting Ordered frequent items of TID-100

b) Inserting Ordered frequent items of TID-200

c) Inserting Ordered frequent items of TID-300

d) Inserting Ordered frequent items of TID-400

e) Inserting Ordered frequent items of TID-500


Now, for each item, the Conditional Pattern Base is computed which is the path labels of all
the paths which lead to any node of the given item in the frequent-pattern tree.

Item Conditional Pattern Base


p {{f, c, a, m : 2}, {c, b : 1}}
m {{f, c, a : 2}, {f, c, a, b : 1}}
b {{f, c, a : 1}, {f : 1}, {c : 1}}
a {{f, c : 3}}
c {{f : 3}}
f Φ
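
One way to derive these conditional pattern bases programmatically is to build the tree with parent links, keep a list of nodes per item, and walk from each node of an item back up to the root. The following is a sketch under those assumptions; `Node`, `build_tree` and `conditional_pattern_base` are illustrative names, not part of any library.

```python
from collections import defaultdict


class Node:
    """FP-tree node with a link back to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(ordered_sets):
    """Insert every ordered set and record the nodes created for each item."""
    root = Node(None, None)
    nodes_by_item = defaultdict(list)
    for items in ordered_sets:
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = Node(item, node)
                nodes_by_item[item].append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, nodes_by_item


def conditional_pattern_base(item, nodes_by_item):
    """Collect (prefix path, count) for every node of the item; skip empty prefixes."""
    base = []
    for node in nodes_by_item[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((tuple(reversed(path)), node.count))
    return base


# Ordered frequent items of TID 100 to 500.
ordered_sets = [
    ["f", "c", "a", "m", "p"],
    ["f", "c", "a", "b", "m"],
    ["f", "b"],
    ["c", "b", "p"],
    ["f", "c", "a", "m", "p"],
]

root, nodes_by_item = build_tree(ordered_sets)
for item in ["p", "m", "b", "a", "c", "f"]:
    print(item, conditional_pattern_base(item, nodes_by_item))
```
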

Now, for each item, the Conditional Frequent Pattern Tree is built. It is done by taking the set
of elements that are common to all the paths in the Conditional Pattern Base of that item and
calculating its support count by summing the support counts of all the paths in the Conditional
Pattern Base.

Item Conditional Pattern Base Conditional FP-Tree


p {{f, c, a, m : 2}, {c, b : 1}} {c : 3}
m {{f, c, a : 2}, {f, c, a, b : 1}} {f, c, a :3}
b {{f, c, a : 1}, {f : 1}, {c : 1}} Φ
a {{f, c : 3}} {f, c : 3}
c {{f : 3}} {f : 3}
f Φ Φ
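
The table can be cross-checked with a small sketch. Note that it uses the more general count-and-filter rule rather than the "common to all paths" wording above: each item's count is summed over the paths in the base, and only items reaching the minimum support of 3 are kept. For this dataset both approaches give the same sets. The function name `conditional_items` is hypothetical, and the sketch returns only the surviving items with their counts, not an actual tree structure.

```python
from collections import Counter

MIN_SUP = 3

# Conditional pattern bases from the table above: (prefix path, count).
pattern_bases = {
    "p": [(("f", "c", "a", "m"), 2), (("c", "b"), 1)],
    "m": [(("f", "c", "a"), 2), (("f", "c", "a", "b"), 1)],
    "b": [(("f", "c", "a"), 1), (("f",), 1), (("c",), 1)],
    "a": [(("f", "c"), 3)],
    "c": [(("f",), 3)],
    "f": [],
}


def conditional_items(base, min_sup=MIN_SUP):
    """Sum each item's count over all paths and keep the items meeting min_sup."""
    totals = Counter()
    for path, count in base:
        for item in path:
            totals[item] += count
    return {item: n for item, n in totals.items() if n >= min_sup}


conditional_fp_trees = {item: conditional_items(base) for item, base in pattern_bases.items()}
for item, tree_items in conditional_fp_trees.items():
    print(item, tree_items)
```

For b, no prefix item reaches a summed count of 3 (f and c reach 2, a reaches 1), which is why its conditional FP-tree is empty.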

From the Conditional Frequent Pattern tree, the Frequent Pattern rules are generated by pairing
the items of the Conditional Frequent Pattern Tree set to the corresponding item.

Item  Conditional Pattern Base             Conditional FP-Tree  Frequent Patterns Generated

p     {{f, c, a, m : 2}, {c, b : 1}}       {c : 3}              {<c, p : 3>}
m     {{f, c, a : 2}, {f, c, a, b : 1}}    {f, c, a : 3}        {<f, m : 3>, <c, m : 3>, <a, m : 3>, <f, c, m : 3>, <f, a, m : 3>, <c, a, m : 3>, <f, c, a, m : 3>}
b     {{f, c, a : 1}, {f : 1}, {c : 1}}    Φ                    {}
a     {{f, c : 3}}                         {f, c : 3}           {<f, a : 3>, <c, a : 3>, <f, c, a : 3>}
c     {{f : 3}}                            {f : 3}              {<f, c : 3>}
f     Φ                                    Φ                    {}
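
The pairing step can be sketched as enumerating every non-empty combination of the items in a conditional FP-tree and appending the suffix item. This shortcut assumes each conditional FP-tree is a single path, which holds for every row here; a branching tree would need recursive mining instead. The function name `frequent_patterns` is hypothetical.

```python
from itertools import combinations

# Conditional FP-trees from the table above (item -> count along the single path).
conditional_fp_trees = {
    "p": {"c": 3},
    "m": {"f": 3, "c": 3, "a": 3},
    "b": {},
    "a": {"f": 3, "c": 3},
    "c": {"f": 3},
    "f": {},
}


def frequent_patterns(suffix, tree_items):
    """Pair every non-empty subset of the tree's items with the suffix item."""
    patterns = []
    items = list(tree_items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            # The support of a pattern is the smallest count among its items.
            support = min(tree_items[item] for item in combo)
            patterns.append((combo + (suffix,), support))
    return patterns


for suffix, tree_items in conditional_fp_trees.items():
    print(suffix, frequent_patterns(suffix, tree_items))
```

For m, the three items f, c and a yield seven patterns in total, including the full combination <f, c, a, m : 3>.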

For each row, two types of association rules can be inferred.
