 A frequent itemset is an itemset whose support is greater than some userspecified minimum support (denoted L_{k}, where k is the size of the itemset)
 A candidate itemset is a potentially frequent itemset (denoted C_{k}, where k is the size of the itemset)
Apriori Algorithm
A Java applet which combines DIC, Apriori and Probability Based Objected Interestingness Measures can be found here.
Apriori Algorithm: (by Agrawal et al at IBM Almaden Research Centre) can be used to generate all frequent itemset
Pass 1
 Generate the candidate itemsets in C_{1}
 Save the frequent itemsets in L_{1}
Pass k
 Generate the candidate itemsets in C_{k} from the frequent
itemsets in L_{k}_{1}
 Join L_{k}_{1} p with L_{k}_{1}q, as follows:
insert into C_{k}
select p.item_{1}, p.item_{2}, . . . , p.item_{k1}, q.item_{k1}
from L_{k}_{1} p, L_{k}_{1}q
where p.item_{1} = q.item_{1}, . . . p.item_{k2} = q.item_{k2}, p.item_{k1} < q.item_{k1}
 Generate all (k1)subsets from the candidate itemsets in C_{k}
 Prune all candidate itemsets from C_{k} where some (k1)subset of the candidate itemset is not in the frequent itemset L_{k}_{1}
 Scan the transaction database to determine the support for each candidate itemset in C_{k}
 Save the frequent itemsets in L_{k}
Implementation: A working Apriori Itemset Generation program can be found on the Itemset Implementation page.
Example 1: Assume the userspecified minimum support is 50%
 Given: The transaction database shown below
TID 
A 
B 
C 
D 
E 
F 
T_{1} 
1 
0 
1 
1 
0 
0 
T_{2} 
0 
1 
0 
1 
0 
0 
T_{3} 
1 
1 
1 
0 
1 
0 
T_{4} 
0 
1 
0 
1 
0 
1 
 The candidate itemsets in C_{2} are shown below
Itemset X 
supp(X) 
{A,B} 
25% 
{A,C} 
50% 
{A,D} 
25% 
{B,C} 
25% 
{B,D} 
50% 
{C,D} 
25% 
 The frequent itemsets in L_{2} are shown below
Itemset X 
supp(X) 
{A,C} 
50% 
{B,D} 
50% 
Example 2: Assume the userspecified minimum support is 40%, then generate all frequent itemsets.
Given: The transaction database shown below
TID 
A 
B 
C 
D 
E 
T_{1} 
1 
1 
1 
0 
0 
T_{2} 
1 
1 
1 
1 
1 
T_{3} 
1 
0 
1 
1 
0 
T_{4} 
1 
0 
1 
1 
1 
T_{5} 
1 
1 
1 
1 
0 
Pass 1
C_{1}
Itemset X 
supp(X) 
A 
? 
B 
? 
C 
? 
D 
? 
E 
? 

L_{1}
Itemset X 
supp(X) 
A 
100% 
B 
60% 
C 
100% 
D 
80% 
E 
40% 

Pass 2
C_{2}
Itemset X 
supp(X) 
A,B 
? 
A,C 
? 
A,D 
? 
A,E 
? 
B,C 
? 
B,D 
? 
B,E 
? 
C,D 
? 
C,E 
? 
D,E 
? 
 Nothing is pruned since all subsets of these itemsets are frequent
L_{2}
Itemset X 
supp(X) 
A,B 
60% 
A,C 
100% 
A,D 
80% 
A,E 
40% 
B,C 
60% 
B,D 
40% 
B,E 
20% 
C,D 
80% 
C,E 
40% 
D,E 
40% 

L_{2} after saving only the frequent itemsets
Itemset X 
supp(X) 
A,B 
60% 
A,C 
100% 
A,D 
80% 
A,E 
40% 
B,C 
60% 
B,D 
40% 
C,D 
80% 
C,E 
40% 
D,E 
40% 

Pass 3
 To create C3 only look at items that have the same first item (in pass k, the first k  2 items must match)
C_{3}

Itemset X 
supp(X) 
join AB with AC 
A,B,C 
? 
join AB with AD 
A,B,D 
? 
join AB with AE 
A,B,E 
? 
join AC with AD 
A,C,D 
? 
join AC with AE 
A,C,E 
? 
join AD with AE 
A,D,E 
? 
join BC with BD 
B,C,D 
? 
join CD with CE 
C,D,E 
? 

C_{3} after pruning
Itemset X 
supp(X) 
A,B,C 
? 
A,B,D 
? 
A,C,D 
? 
A,C,E 
? 
A,D,E 
? 
B,C,D 
? 
C,D,E 
? 

 Pruning eliminates ABE since BE is not frequent
 Scan transactions in the database
L_{3}
Itemset X 
supp(X) 
A,B,C 
60% 
A,B,D 
40% 
A,C,D 
80% 
A,C,E 
40% 
A,D,E 
40% 
B,C,D 
40% 
C,D,E 
40% 
Pass 4
 First k  2 = 2 items must match in pass k = 4
C_{4}

Itemset X 
supp(X) 
combine ABC with ABD 
A,B,C,D 
? 
combine ACD with ACE 
A,C,D,E 
? 
 Pruning:
 For ABCD we check whether ABC, ABD, ACD, BCD are frequent. They are in all cases, so we do not prune ABCD.
 For ACDE we check whether ACD, ACE, ADE, CDE are frequent. Yes, in all cases, so we do not prune ACDE
L_{4}
Itemset X 
supp(X) 
A,B,C,D 
40% 
A,C,D,E 
40% 
Pass 5: For pass 5 we can't form any candidates because there aren't two frequent 4itemsets beginning with the same 3 items.