Apriori Itemset Generation
Join step: in pass k, the candidate itemsets in Ck are generated by joining the frequent itemsets in Lk-1 with themselves:
insert into Ck
select p.item1, p.item2, . . . , p.itemk-1, q.itemk-1
from Lk-1 p, Lk-1 q
where p.item1 = q.item1, . . . , p.itemk-2 = q.itemk-2, p.itemk-1 < q.itemk-1
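The join condition above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `apriori_join` and the sample input are assumptions made for this sketch, and itemsets are represented as sorted tuples so that "the first k - 2 items" is well defined.

```python
def apriori_join(frequent_prev):
    """Join L(k-1) with itself to build the raw candidate list for C(k)."""
    # Normalize every itemset to a sorted tuple and sort the whole list.
    prev = sorted(tuple(sorted(s)) for s in frequent_prev)
    candidates = []
    for i, p in enumerate(prev):
        for q in prev[i + 1:]:
            # p.item1 = q.item1, ..., p.item(k-2) = q.item(k-2)
            # and p.item(k-1) < q.item(k-1)
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                candidates.append(p + (q[-1],))
    return candidates


# Hypothetical set of frequent 2-itemsets, for illustration only.
sample_l2 = [("A", "B"), ("A", "C"), ("B", "C")]
print(apriori_join(sample_l2))  # [('A', 'B', 'C')]
```

Only AB and AC share their first item, so they are the only pair that joins. The result still has to go through the pruning step before its support is counted.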
Example 1: Assume the user-specified minimum support is 50%
TID | A | B | C | D | E | F |
T1 | 1 | 0 | 1 | 1 | 0 | 0 |
T2 | 0 | 1 | 0 | 1 | 0 | 0 |
T3 | 1 | 1 | 1 | 0 | 1 | 0 |
T4 | 0 | 1 | 0 | 1 | 0 | 1 |
Pass 2: candidate 2-itemsets (E and F are dropped in pass 1 because each has only 25% support)
Itemset X | supp(X) |
{A,B} | 25% |
{A,C} | 50% |
{A,D} | 25% |
{B,C} | 25% |
{B,D} | 50% |
{C,D} | 25% |
Frequent 2-itemsets (support of at least 50%)
Itemset X | supp(X) |
{A,C} | 50% |
{B,D} | 50% |
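The support values in Example 1 can be checked with a short Python sketch. The variable and function names here (`transactions`, `support`, `MIN_SUPPORT`) are chosen for this illustration. The transaction sets are read directly from the table above.

```python
from itertools import combinations

# Transactions from Example 1: the items marked 1 in each row.
transactions = [
    {"A", "C", "D"},       # T1
    {"B", "D"},            # T2
    {"A", "B", "C", "E"},  # T3
    {"B", "D", "F"},       # T4
]
MIN_SUPPORT = 0.5


def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if set(itemset) <= t) / len(db)


# Pass 1: keep the single items that meet the minimum support.
items = sorted(set().union(*transactions))
l1 = [i for i in items if support({i}, transactions) >= MIN_SUPPORT]

# Pass 2: every pair of frequent items is a candidate; keep the frequent ones.
c2 = list(combinations(l1, 2))
l2 = [pair for pair in c2 if support(pair, transactions) >= MIN_SUPPORT]

for pair in c2:
    print(pair, f"{support(pair, transactions):.0%}")
print("L2 =", l2)  # L2 = [('A', 'C'), ('B', 'D')]
```

Because AC and BD do not share a first item, the join step produces no 3-item candidates, and the example stops after pass 2.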
Example 2: Assume the user-specified minimum support is 40%. Generate all frequent itemsets.
Given: The transaction database shown below
TID | A | B | C | D | E |
T1 | 1 | 1 | 1 | 0 | 0 |
T2 | 1 | 1 | 1 | 1 | 1 |
T3 | 1 | 0 | 1 | 1 | 0 |
T4 | 1 | 0 | 1 | 1 | 1 |
T5 | 1 | 1 | 1 | 1 | 0 |
Pass 1
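Itemset X | supp(X) |
A | 100% |
B | 60% |
C | 100% |
D | 80% |
E | 40% |
- All five items meet the 40% minimum support, so all are frequent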
Pass 2
Itemset X | supp(X) |
A,B | ? |
A,C | ? |
A,D | ? |
A,E | ? |
B,C | ? |
B,D | ? |
B,E | ? |
C,D | ? |
C,E | ? |
D,E | ? |
- Nothing pruned since all subsets of these itemsets are frequent
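- Scan transactions in the database
Itemset X | supp(X) |
A,B | 60% |
A,C | 100% |
A,D | 80% |
A,E | 40% |
B,C | 60% |
B,D | 40% |
B,E | 20% |
C,D | 80% |
C,E | 40% |
D,E | 40% |
- B,E falls below the 40% minimum support, so it is not frequent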
Pass 3
- To create C3, only join itemsets that have the same first item (in pass k, the first k - 2 items must match)
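Join | Itemset X |
combine AB with AC | A,B,C |
combine AB with AD | A,B,D |
combine AB with AE | A,B,E |
combine AC with AD | A,C,D |
combine AC with AE | A,C,E |
combine AD with AE | A,D,E |
combine BC with BD | B,C,D |
combine CD with CE | C,D,E |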
- Pruning eliminates ABE since BE is not frequent
- Scan transactions in the database
Itemset X | supp(X) |
A,B,C | 60% |
A,B,D | 40% |
A,C,D | 80% |
A,C,E | 40% |
A,D,E | 40% |
B,C,D | 40% |
C,D,E | 40% |
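The pruning rule used in pass 3 (drop a candidate if any of its (k - 1)-item subsets is not frequent) can be sketched as follows. The function name `prune` is an assumption of this sketch. The list of frequent 2-itemsets is what counting the Example 2 transactions gives: every pair except B,E.

```python
from itertools import combinations


def prune(candidates, frequent_prev):
    """Keep only candidates whose every (k-1)-item subset is in frequent_prev."""
    prev = {tuple(sorted(s)) for s in frequent_prev}
    kept = []
    for cand in candidates:
        cand = tuple(sorted(cand))
        # combinations() on a sorted tuple yields sorted (k-1)-item subsets.
        if all(sub in prev for sub in combinations(cand, len(cand) - 1)):
            kept.append(cand)
    return kept


# Frequent 2-itemsets of Example 2 (B,E is missing because it is not frequent).
l2 = [tuple(s) for s in ("AB", "AC", "AD", "AE", "BC", "BD", "CD", "CE", "DE")]
# Raw output of the join step for pass 3, before pruning.
c3_joined = [tuple(s) for s in ("ABC", "ABD", "ABE", "ACD", "ACE", "ADE", "BCD", "CDE")]

c3 = prune(c3_joined, l2)
print(c3)
```

ABE is the only candidate removed, because its subset BE is not in the list of frequent 2-itemsets. The remaining seven candidates are the ones whose supports are counted in the table above.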
Pass 4
- First k - 2 = 2 items must match in pass k = 4
Join | Itemset X | supp(X) |
combine ABC with ABD | A,B,C,D | ? |
combine ACD with ACE | A,C,D,E | ? |
- Pruning:
- For ABCD we check whether ABC, ABD, ACD, and BCD are frequent. All four are, so we do not prune ABCD.
- For ACDE we check whether ACD, ACE, ADE, and CDE are frequent. All four are, so we do not prune ACDE.
Itemset X | supp(X) |
A,B,C,D | 40% |
A,C,D,E | 40% |
- Both are frequent
Pass 5: No candidates can be formed, because no two frequent 4-itemsets begin with the same 3 items.
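The whole of Example 2 can be reproduced with a compact level-wise loop. This is a minimal sketch that assumes the database fits in memory and rescans it in full for every candidate. The function name `apriori` and its return structure (a dict from pass number to the list of frequent itemsets) are choices made for this illustration.

```python
from itertools import combinations

# Transactions from Example 2: the items marked 1 in each row.
transactions = [
    {"A", "B", "C"},            # T1
    {"A", "B", "C", "D", "E"},  # T2
    {"A", "C", "D"},            # T3
    {"A", "C", "D", "E"},       # T4
    {"A", "B", "C", "D"},       # T5
]


def apriori(db, min_support):
    """Return {k: frequent k-itemsets as sorted tuples} for every non-empty pass."""
    n = len(db)

    def support(itemset):
        return sum(1 for t in db if set(itemset) <= t) / n

    # Pass 1: frequent single items, kept in sorted order.
    items = sorted(set().union(*db))
    level = [(i,) for i in items if support((i,)) >= min_support]

    frequent = {}
    k = 1
    while level:
        frequent[k] = level
        prev = set(level)
        # Join: pair itemsets that agree on everything except the last item.
        joined = [p + (q[-1],)
                  for i, p in enumerate(level)
                  for q in level[i + 1:]
                  if p[:-1] == q[:-1]]
        # Prune: every k-item subset of a (k+1)-item candidate must be frequent.
        candidates = [c for c in joined
                      if all(sub in prev for sub in combinations(c, k))]
        # Count: scan the database and keep candidates meeting min_support.
        level = [c for c in candidates if support(c) >= min_support]
        k += 1
    return frequent


frequent = apriori(transactions, 0.4)
for size, itemsets in frequent.items():
    print(size, ["".join(s) for s in itemsets])
```

With a minimum support of 40%, the loop stops after pass 4 with ABCD and ACDE, which matches the walkthrough: no pair of frequent 4-itemsets shares its first 3 items, so pass 5 has no candidates.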