Reference: P. Winston, 1992.

Factors Affecting Sunburn (continued)

Phase 2: From Tree to Rules

We may now establish rules from the decision tree.

Rule  If                                   then
1     the person's hair color is blonde    the person gets sunburned.
      the person uses no lotion
2     the person's hair color is blonde    nothing happens.
      the person uses lotion
3     the person's hair color is red       the person gets sunburned.
4     the person's hair color is brown     nothing happens.
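The four rules can be written as a small function, which makes it easy to check them by hand. This is only an illustrative sketch: the function name `predict`, the string labels, and the boolean `uses_lotion` argument are assumptions introduced here, not part of the original notes.

```python
def predict(hair, uses_lotion):
    """Apply the four rules read off the decision tree (hypothetical helper)."""
    if hair == "blonde" and not uses_lotion:
        return "sunburned"       # Rule 1
    if hair == "blonde" and uses_lotion:
        return "nothing"         # Rule 2
    if hair == "red":
        return "sunburned"       # Rule 3
    if hair == "brown":
        return "nothing"         # Rule 4
    # The rules only cover the three hair colors listed above.
    raise ValueError(f"no rule covers hair color {hair!r}")


print(predict("blonde", False))  # sunburned
print(predict("brown", False))   # nothing
```

Rules 3 and 4 ignore lotion use entirely, which is why only the blonde rules carry two antecedents.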

Phase 3: Simplify Rules

Once a rule set has been generated, simplify it.

[The training data has been multiplied by a factor of 4 to permit the use of the chi-square test below]

Assume a significance level of alpha = 0.05

a) Eliminate Unnecessary Rule Antecedents

  1. Blonde

    Actual:

                    No Change   Sunburned   Marginal Sum
    Blonde              16          16           32
    Not Blonde          20          12           32
    Marginal Sum        36          28           64

    Expected:

    Sample expected value calculation for contingency cell x11 (Blonde, No Change):

    e11 = (row 1 marginal sum)(column 1 marginal sum) / (grand total) = (32)(36) / 64 = 18

                    No Change   Sunburned
    Blonde              18          14
    Not Blonde          18          14
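The expected table can be reproduced from the marginal sums of the actual table. The sketch below assumes the table is stored as a list of rows; the helper name `expected_counts` is hypothetical.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row sum * column sum / total."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


# Rows: Blonde, Not Blonde. Columns: No Change, Sunburned.
blonde_observed = [[16, 16], [20, 12]]
print(expected_counts(blonde_observed))  # [[18.0, 14.0], [18.0, 14.0]]
```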

    Sample chi square calculation, where oij and eij are the observed (actual) and expected values for cell xij, respectively, and 1 <= i <= 2, 1 <= j <= 2:

    chi square = sum over i, j of (oij - eij)^2 / eij
               = (16 - 18)^2 / 18 + (16 - 14)^2 / 14 + (20 - 18)^2 / 18 + (12 - 14)^2 / 14
               = 1.02 (approximately)

    Sample degrees of freedom calculation:

    df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1

    From the chi-square table, chi square alpha = 3.84

    Since chi square = 1.02 is less than chi square alpha = 3.84, we do not reject the null hypothesis of independence, H0.

    We thus conclude, according to the training examples, that sunburn is independent of blonde hair, so we may eliminate this antecedent from Rule #1 and Rule #2.
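The whole test for the blonde antecedent can be sketched in a few lines of plain Python. The function name `chi_square` and the constant `CRITICAL_VALUE` are assumptions made for this sketch. The critical value 3.84 is the table value quoted above for alpha = 0.05 and df = 1, and no continuity correction is applied.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table of observed counts."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / total  # expected count e_ij
            stat += (o - e) ** 2 / e
    return stat


CRITICAL_VALUE = 3.84  # chi-square table, alpha = 0.05, df = 1

# Rows: Blonde, Not Blonde. Columns: No Change, Sunburned.
blonde = [[16, 16], [20, 12]]
stat = chi_square(blonde)
reject_h0 = stat > CRITICAL_VALUE
print(round(stat, 2), reject_h0)  # 1.02 False
```

Because `reject_h0` is false, the data give no evidence that sunburn depends on blonde hair, which matches the decision to drop the antecedent.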

  2. Lotion

    For argument's sake we will also examine the lotion antecedent for independence.

    Actual:

                    No Change   Sunburned   Marginal Sum
    Lotion              12           0           12
    No Lotion            8          12           20
    Marginal Sum        20          12           32

    Expected:

                    No Change   Sunburned
    Lotion              7.5         4.5
    No Lotion          12.5         7.5

    chi square = 11.52

    df = 1

    From the chi-square table, chi square alpha = 3.84

    Since chi square is greater than chi square alpha, we reject the null hypothesis of independence, H0, and accept the alternative hypothesis of dependence, Ha.

    Therefore, according to the training examples, sunburn is clearly dependent upon the use of lotion, so we cannot eliminate this antecedent.
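To see where the value 11.52 comes from, it can help to break the statistic into its four per-cell terms. The helper `cell_contributions` below is a hypothetical name used only for this sketch.

```python
def cell_contributions(observed):
    """Per-cell terms (o - e)^2 / e of the chi-square statistic."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    contributions = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / total  # expected count e_ij
            out_row.append((o - e) ** 2 / e)
        contributions.append(out_row)
    return contributions


# Rows: Lotion, No Lotion. Columns: No Change, Sunburned.
lotion = [[12, 0], [8, 12]]
parts = cell_contributions(lotion)
total_stat = sum(sum(row) for row in parts)
for label, row in zip(["Lotion", "No Lotion"], parts):
    print(label, [round(x, 2) for x in row])
print("chi square =", round(total_stat, 2))  # chi square = 11.52
```

The largest single term is the Lotion / Sunburned cell, where 0 cases were observed against an expected 4.5.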

b) Eliminate Unnecessary Rules (under construction)

The tentative rule set is as follows:

Rule  If                                   then
1     the person uses no lotion            the person gets sunburned.
2     the person uses lotion               nothing happens.
3     the person's hair color is red       the person gets sunburned.
4     the person's hair color is brown     nothing happens.
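With the hair-color antecedent removed from Rules 1 and 2, more than one rule can now fire for the same person. The sketch below enumerates every combination of hair color and lotion use and reports the cases where the firing rules disagree. The encoding of the rules as lambdas and the names `RULES`, `firing_rules`, and `conflicts` are assumptions made for illustration, not a method taken from the notes.

```python
from itertools import product

# Each entry: (rule number, condition on (hair, lotion), conclusion).
RULES = [
    (1, lambda hair, lotion: not lotion, "sunburned"),
    (2, lambda hair, lotion: lotion, "nothing"),
    (3, lambda hair, lotion: hair == "red", "sunburned"),
    (4, lambda hair, lotion: hair == "brown", "nothing"),
]


def firing_rules(hair, lotion):
    """Return (rule number, conclusion) for every rule whose condition holds."""
    return [(n, concl) for n, cond, concl in RULES if cond(hair, lotion)]


def conflicts():
    """Map each (hair, lotion) case to the rules that fire with differing conclusions."""
    found = {}
    for hair, lotion in product(["blonde", "red", "brown"], [False, True]):
        fired = firing_rules(hair, lotion)
        if len({concl for _, concl in fired}) > 1:
            found[(hair, lotion)] = [n for n, _ in fired]
    return found


print(conflicts())
# {('red', True): [2, 3], ('brown', False): [1, 4]}
```

The two reported cases show that the tentative rule set needs either a rule ordering or further pruning before it can be applied unambiguously.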