The Apriori algorithm has been implemented as Apriori.java (version 2).
A Java applet that combines DIC, Apriori and Probability Based Objective Interestingness Measures can be found here.
Note: Requires Java 1.6.0_07 or newer.
Download the following files:
- Apriori.java: Simple implementation of the Apriori Itemset Generation algorithm.
- Version 2: Apriori Itemset Generation algorithm that uses a hash tree.
- config.txt: Consists of four lines:
  - Number of items
  - Number of transactions
  - Minimum support as a percentage, e.g. 20 represents 20% minsup
  - Size of step M for the DIC algorithm. This line is ignored by the Apriori algorithm.
- transa.txt: Contains the transaction database as an m x n matrix with one transaction per row; transaction 1 appears in row one. Columns are separated by a space and represent items. A 1 indicates that the item is present in the transaction and a 0 indicates that it is not. The transaction file for the following example (and many other datasets) can be found on the datasets page.
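The two input formats described above can be read with only a few lines of code. The following is a minimal sketch, not the parsing code of Apriori.java itself: the class name InputFormatSketch, the Config fields and both method names are hypothetical, and the input lines are passed in as strings rather than read from disk so that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;

class InputFormatSketch {
    /** The four values of config.txt; the field names are hypothetical. */
    static final class Config {
        final int numItems;
        final int numTransactions;
        final int minSupPercent; // e.g. 20 means 20% minsup
        final int dicStepSize;   // step M; only the DIC algorithm uses it

        Config(int numItems, int numTransactions, int minSupPercent, int dicStepSize) {
            this.numItems = numItems;
            this.numTransactions = numTransactions;
            this.minSupPercent = minSupPercent;
            this.dicStepSize = dicStepSize;
        }
    }

    /** Parses the four lines of config.txt in the order listed above. */
    static Config parseConfig(List<String> lines) {
        if (lines.size() < 4) {
            throw new IllegalArgumentException("config.txt needs four lines");
        }
        return new Config(
                Integer.parseInt(lines.get(0).trim()),
                Integer.parseInt(lines.get(1).trim()),
                Integer.parseInt(lines.get(2).trim()),
                Integer.parseInt(lines.get(3).trim()));
    }

    /** Parses the rows of transa.txt into a matrix: db[t][i] is true if transaction t contains item i. */
    static boolean[][] parseTransactions(List<String> rows, Config cfg) {
        if (rows.size() < cfg.numTransactions) {
            throw new IllegalArgumentException("too few transaction rows");
        }
        boolean[][] db = new boolean[cfg.numTransactions][cfg.numItems];
        for (int t = 0; t < cfg.numTransactions; t++) {
            String[] cols = rows.get(t).trim().split("\\s+");
            if (cols.length != cfg.numItems) {
                throw new IllegalArgumentException("row " + (t + 1) + " has " + cols.length + " columns");
            }
            for (int i = 0; i < cfg.numItems; i++) {
                db[t][i] = cols[i].equals("1");
            }
        }
        return db;
    }

    public static void main(String[] args) {
        Config cfg = parseConfig(Arrays.asList("5", "5", "40", "5"));
        boolean[][] db = parseTransactions(Arrays.asList(
                "1 1 1 0 0", "1 1 1 1 1", "1 0 1 1 0", "1 0 1 1 1", "1 1 1 1 0"), cfg);
        System.out.println(cfg.numItems + " items, " + cfg.numTransactions
                + " transactions, minsup = " + cfg.minSupPercent + "%");
        System.out.println("T1 contains item 4: " + db[0][3]);
    }
}
```

Checking the column count of every row against the number of items in config.txt catches the most common mistake, a transaction file that does not match the configuration.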
Place Apriori.java in a directory called apriori
Compile the .java file:
hercules[1]% javac apriori/Apriori.java
Change config.txt and transa.txt to represent the database and criteria to be tested.
Run the programs:
hercules[2]% java apriori/AprioriExample
We use the database from Apriori Itemset Generation example #2. The minsup is 40%.
TID | A | B | C | D | E |
T1 | 1 | 1 | 1 | 0 | 0 |
T2 | 1 | 1 | 1 | 1 | 1 |
T3 | 1 | 0 | 1 | 1 | 0 |
T4 | 1 | 0 | 1 | 1 | 1 |
T5 | 1 | 1 | 1 | 1 | 0 |
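To make the 40% threshold concrete: with five transactions, an itemset must appear in at least two of them to be frequent. The sketch below computes that count and the support of each single item in the table. It assumes that an itemset whose support equals the threshold exactly counts as frequent; the class and method names are hypothetical and are not taken from Apriori.java.

```java
import java.util.Arrays;

class SupportCountSketch {
    /** Smallest number of transactions that meets a percentage threshold (integer ceiling). */
    static int minSupportCount(int minSupPercent, int numTransactions) {
        return (minSupPercent * numTransactions + 99) / 100;
    }

    /** Column sums: the number of transactions that contain each item. */
    static int[] itemSupports(int[][] db) {
        int[] counts = new int[db[0].length];
        for (int[] row : db) {
            for (int i = 0; i < row.length; i++) {
                counts[i] += row[i];
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Rows T1..T5, columns A..E, copied from the table above.
        int[][] db = {
            {1, 1, 1, 0, 0},
            {1, 1, 1, 1, 1},
            {1, 0, 1, 1, 0},
            {1, 0, 1, 1, 1},
            {1, 1, 1, 1, 0},
        };
        System.out.println("Minimum support count: " + minSupportCount(40, db.length)); // 2
        System.out.println("Item supports A..E: " + Arrays.toString(itemSupports(db))); // [5, 3, 5, 4, 2]
    }
}
```

Every item reaches the minimum count of 2, so all five items are frequent 1-itemsets.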
transa.txt contains a row for each of the five transactions and a column for each of the five items.
1 1 1 0 0
1 1 1 1 1
1 0 1 1 0
1 0 1 1 1
1 1 1 1 0
config.txt: The value of the last line is ignored by Apriori.java.
5
5
40
5
Output:
hercules[71]% java apriori
Algorithm apriori starting now.....
Press 'C' to change the default configuration and transaction files
or any other key to continue.
Input configuration: 5 items, 5 transactions, minsup = 40%
Frequent 1-itemsets:
[1, 2, 3, 4, 5]
Frequent 2-itemsets:
[1 2, 1 3, 1 4, 1 5, 2 3, 2 4, 3 4, 3 5, 4 5]
Frequent 3-itemsets:
[1 2 3, 1 2 4, 1 3 4, 1 3 5, 1 4 5, 2 3 4, 3 4 5]
Frequent 4-itemsets:
[1 2 3 4, 1 3 4 5]
Execution time is: 0 seconds.
hercules[72]%
These are the same results we obtained earlier when we worked through the Apriori algorithm by hand.
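For readers who want to check the itemsets without downloading the files, here is a compact sketch of level-wise Apriori on the example database. It is not the code of Apriori.java: it omits the hash tree of version 2 and the subset-based pruning step, relying on direct support counting to filter candidates, and it assumes that support equal to minsup counts as frequent. The class name AprioriSketch and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AprioriSketch {
    // Example database: rows are transactions T1..T5, columns are items 1..5 (A..E).
    static final int[][] DB = {
        {1, 1, 1, 0, 0},
        {1, 1, 1, 1, 1},
        {1, 0, 1, 1, 0},
        {1, 0, 1, 1, 1},
        {1, 1, 1, 1, 0},
    };

    /** Number of transactions that contain every item of the itemset (items are 1-based). */
    static int support(int[][] db, int[] itemset) {
        int count = 0;
        for (int[] row : db) {
            boolean all = true;
            for (int item : itemset) {
                if (row[item - 1] == 0) { all = false; break; }
            }
            if (all) count++;
        }
        return count;
    }

    /** Join step: merge pairs of sorted k-itemsets that share their first k-1 items. */
    static List<int[]> candidates(List<int[]> frequent) {
        List<int[]> out = new ArrayList<int[]>();
        for (int i = 0; i < frequent.size(); i++) {
            for (int j = i + 1; j < frequent.size(); j++) {
                int[] a = frequent.get(i);
                int[] b = frequent.get(j);
                int k = a.length;
                if (!Arrays.equals(Arrays.copyOf(a, k - 1), Arrays.copyOf(b, k - 1))) continue;
                int[] c = Arrays.copyOf(a, k + 1); // a followed by the last item of b
                c[k] = b[k - 1];
                out.add(c);
            }
        }
        return out;
    }

    /** Returns the frequent itemsets level by level: index 0 holds the 1-itemsets, and so on. */
    static List<List<int[]>> run(int[][] db, int minSupPercent) {
        int minCount = (minSupPercent * db.length + 99) / 100; // integer ceiling
        List<int[]> level = new ArrayList<int[]>();
        for (int item = 1; item <= db[0].length; item++) level.add(new int[] {item});
        List<List<int[]>> result = new ArrayList<List<int[]>>();
        while (!level.isEmpty()) {
            List<int[]> frequent = new ArrayList<int[]>();
            for (int[] c : level) {
                if (support(db, c) >= minCount) frequent.add(c);
            }
            if (frequent.isEmpty()) break;
            result.add(frequent);
            level = candidates(frequent);
        }
        return result;
    }

    /** Formats one level in the style of the program output, e.g. [1 2, 1 3]. */
    static String format(List<int[]> itemsets) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < itemsets.size(); i++) {
            if (i > 0) sb.append(", ");
            int[] s = itemsets.get(i);
            for (int p = 0; p < s.length; p++) {
                if (p > 0) sb.append(' ');
                sb.append(s[p]);
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        List<List<int[]>> levels = run(DB, 40);
        for (int k = 0; k < levels.size(); k++) {
            System.out.println("Frequent " + (k + 1) + "-itemsets:");
            System.out.println(format(levels.get(k)));
        }
    }
}
```

Run on the example database with a 40% minsup, the sketch prints the same four levels of frequent itemsets as the output above. Skipping the pruning step does not change the result, because every candidate is still checked against the database; it only means that some candidates are counted that a full implementation would discard earlier.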