Back Decision Tree Construction HomeNext

References:

A decision tree is constructed by looking for regularities in data.

From Data to Trees: Quinlan's ID3 Algorithm for Constructing a Decision Tree

General Form

Until each leaf node is populated by as homogeneous a sample set as possible:

Specific Form

  1. Examine the attributes to add at the next level of the tree using an entropy calculation.
  2. Choose the attribute that minimizes the entropy.

Tests Should Minimize Entropy

It is computationally impractical to find the smallest possible decision tree, so we use a procedure that tends to build small trees.

Choosing an Attribute using the Entropy Formula

The central focus of the ID3 algorithm is selecting which attribute to test at each node in the tree.

Procedure:

  1. See how the attribute distributes the instances.
  2. Minimize the average entropy.

Review of Log2

On a calculator:

eg) (0)log2(0) = - (the formula is no good for a probability of 0)
eg) (1)log2(1) = 0
eg) log2(2) = 1
eg) log2(4) = 2
eg) log2(1/2) = -1
eg) log2(1/4) = -2
eg) (1/2)log2(1/2) = (1/2)(-1) = -1/2

Entropy Formulae

Entropy, a measure from information theory, characterizes the (im)purity, or homogeneity, of an arbitrary collection of examples.

Given:

Probability

Entropy

The disorder in a set containing members of two classes A and B, as a function of the fraction of the set beloning to class A.
In the diagram above, the total number of instances in both classes combined is two.
In the diagram below, the total number of instances in both classes is eight.

Average Entropy

 

Click here for an exercise in decision tree construction.

 

Comments on the ID3 Algorithm

Unlike the version space candidate-elimination algorithm,

Approximate inductive bias of ID3: shorter trees are preferred over larger trees.

A closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees. Trees that place high information gain attributes close to the root are preferred over those that do not.

By viewing ID3 in terms of its search space and search strategy, we can get some insight into its capabilities and limitations:

(Although the pure ID3 algorithm does not backtrack, we discuss an extension that adds a form of backtracking: post-pruning the decision tree. See decision rules.)


Back   HomeNext