Homonym Confusion Matrix
Purpose: to illustrate how C4.5 may be used to produce a confusion matrix.
Problem: apply the rules generated for the homonym pair bare/bear in the previous example to a set of unseen sentences to test their fitness.
Files for "Bare" versus "Bear": bare.names, bare.data, bare.test
The data in bare.test were derived from the following fourteen sentences (seven for each class):
Bare (as an adjective)
- The cupboard was completely bare, spare a thin layer of dust.
- The judge wanted nothing but bare facts.
- Soon, frostbite was sure to affect his bare skin.
- They decided the bare corner in the living room was an ideal location for the new house plant.
- The walls of her studio were bare.
Bare (as a verb)
- The school counselor asked him to bare his feelings.
- The doctor asked his sick patient if he could bare his back for him so that he could examine his breathing.
Bear (as a noun)
- The park ranger spotted the bear fifty meters from the camp site.
- I once tranquilized a polar bear during a research expedition in the arctic.
Bear (as a verb)
- Her supervisor told her to bear in mind that sometimes the optimal solution is not always the best solution.
- She could not bear the suspense any longer.
- If he could bear the strain for another ten seconds, he could break the old record.
- I bear grudges against all types of pollutants.
- They were forced to stay behind and bear witness.
Summary of Results
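Running C4.5 with the -u switch, which evaluates the classifier on the unseen cases in bare.test,
i.e., % c4.5 -f bare -u
generates this grid from the test data:
| |classified as bare|classified as bear|
|class bare|2|5|
|class bear|1|6|
This grid is called a confusion matrix. Each row is the actual class of a test sentence and each column is the class assigned to it. The results shown are interpreted as follows: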
- 2 sentences of class bare were correctly classified,
- 5 sentences of class bare were misclassified as bear,
- 1 sentence of class bear was misclassified as bare, and
- 6 sentences of class bear were correctly classified.
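The four counts above can also be summarized numerically. The following is a minimal Python sketch, not part of C4.5 itself, that stores the counts as a 2x2 table (rows are the actual class, columns are the assigned class) and derives the overall accuracy and the fraction of each class that was classified correctly. The names CLASSES, MATRIX and summarize are illustrative assumptions, not anything C4.5 defines.

```python
# Counts copied from the interpretation above.
# Rows are the actual class; columns are the class assigned by the classifier.
CLASSES = ["bare", "bear"]
MATRIX = [
    [2, 5],  # actual bare: 2 classified as bare, 5 classified as bear
    [1, 6],  # actual bear: 1 classified as bare, 6 classified as bear
]


def summarize(matrix, classes):
    """Return totals, overall accuracy, and per-class correct fractions."""
    total = sum(sum(row) for row in matrix)
    # Correct classifications lie on the main diagonal.
    correct = sum(matrix[i][i] for i in range(len(classes)))
    per_class = {
        name: matrix[i][i] / sum(matrix[i])
        for i, name in enumerate(classes)
    }
    return {
        "total": total,
        "correct": correct,
        "errors": total - correct,
        "accuracy": correct / total,
        "per_class": per_class,
    }


if __name__ == "__main__":
    result = summarize(MATRIX, CLASSES)
    print(f"{result['correct']} of {result['total']} test sentences correct "
          f"({result['accuracy']:.1%})")
    for name, fraction in result["per_class"].items():
        print(f"class {name}: {fraction:.1%} classified correctly")
```

With these counts, 8 of the 14 test sentences are classified correctly (about 57%). Only 2 of the 7 bare sentences are recognized (about 29%), compared with 6 of the 7 bear sentences (about 86%), so most of the error comes from bare being mistaken for bear.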