Homework 5

Techniques for "Learning"


Due: Monday, March 30th, 2:00 PM


The table below contains 11 training examples with three attributes each:

ID#  Texture  Temperature  Size    Classification
1    smooth   cold         large   yes
2    smooth   cold         small   no
3    smooth   cool         large   yes
4    smooth   cool         small   yes
5    smooth   hot          small   yes
6    wavy     cold         medium  no
7    wavy     hot          large   yes
8    rough    cold         large   no
9    rough    cool         large   yes
10   rough    hot          small   no
11   rough    warm         medium  yes

Calculate the initial entropy of this problem, that is, the entropy of the Classification column over all 11 examples before any split.

Initial Entropy                             
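As a reminder of the definition being applied here, the following is a minimal sketch of Shannon entropy in bits, assuming class labels are given as a plain Python list. The function name `entropy` and the toy labels are illustrative assumptions and are not the homework data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over each class that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy labels (not the homework data): 2 "yes" and 2 "no".
toy_labels = ["yes", "yes", "no", "no"]
print(entropy(toy_labels))  # 1.0
```

An evenly mixed two-class set has entropy 1 bit, and a set containing a single class has entropy 0.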

 

 

We want to use this data, the ID3 algorithm, and the concept of change in entropy to construct an accurate yet compact decision tree for this domain.  To determine the optimal first attribute, calculate the entropy that results from splitting the data on each of the three attributes independently, as if each were the first choice.  Complete the table below.

 

Attribute    Entropy if splitting on the attribute
texture  

 

temperature  

 

size  
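The entropy "if splitting on the attribute" is usually taken to be the weighted average of the entropies of the subsets the split produces, with each subset weighted by its share of the examples. The sketch below assumes that reading and assumes each example is stored as a dictionary. The names `split_entropy`, `toy_rows`, `color`, `shape`, and `label` are hypothetical and are not the homework data.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def split_entropy(rows, attribute, target="label"):
    """Weighted average entropy of the subsets produced by splitting on `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    total = len(rows)
    return sum(len(g) / total * entropy(g) for g in groups.values())


# Hypothetical toy rows (not the homework data).
toy_rows = [
    {"color": "red", "shape": "round", "label": "yes"},
    {"color": "red", "shape": "square", "label": "yes"},
    {"color": "blue", "shape": "round", "label": "no"},
    {"color": "blue", "shape": "square", "label": "no"},
]

for attribute in ("color", "shape"):
    print(attribute, split_entropy(toy_rows, attribute))
```

In this toy set, splitting on `color` separates the classes perfectly (entropy 0), while splitting on `shape` leaves each subset evenly mixed (entropy 1).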

 

 

As we know from our study of entropy and the ID3 algorithm, the attribute whose split yields the lowest entropy provides the most information gain.  Using your results from the table above, split the 11 training examples on the best attribute and begin to construct the resulting tree.  Some of the resulting branches will be perfectly classified and are therefore leaves of the decision tree; label each such leaf with its classification.  Label each node that is not yet a leaf with the number of training examples in each classification.
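The link between lowest entropy and highest information gain can be written out as follows. The notation is an assumption introduced here for illustration: S is the set of training examples, A is a candidate attribute, S_v is the subset of S in which A takes the value v, and p_c is the fraction of examples in class c.

```latex
H(S) = -\sum_{c} p_c \log_2 p_c
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)
```

Because H(S) is the same for every candidate attribute, the attribute that minimizes the weighted sum on the right is also the one that maximizes the gain.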

 

 

 

 

 

 

For each node that is not yet a leaf, recursively (and independently of the other nodes) determine which of the two remaining attributes makes the best second choice by calculating the entropy of that portion of the tree under each candidate split.  Continue making these calculations until you can complete the tree.
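The recursive procedure described above can be sketched as follows. This is a simplified, ID3-style illustration under stated assumptions, not a reference implementation: examples are dictionaries, ties between attributes go to the first one listed, and a node that runs out of attributes is reported as its class counts. The names `build_tree`, `partition`, and `toy_rows` are hypothetical and are not the homework data.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def partition(rows, attribute):
    """Group rows by the value they take for `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    return groups


def split_entropy(rows, attribute, target):
    """Weighted average entropy of the subsets produced by the split."""
    total = len(rows)
    return sum(
        len(subset) / total * entropy([r[target] for r in subset])
        for subset in partition(rows, attribute).values()
    )


def build_tree(rows, attributes, target="label"):
    """Return a nested dict tree; leaves are class labels."""
    labels = [r[target] for r in rows]
    counts = Counter(labels)
    if len(counts) == 1:
        return labels[0]        # perfectly classified: a leaf
    if not attributes:
        return dict(counts)     # no attributes left: report class counts
    # Choose the attribute whose split leaves the lowest entropy.
    best = min(attributes, key=lambda a: split_entropy(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: build_tree(subset, remaining, target)
            for value, subset in partition(rows, best).items()
        }
    }


# Hypothetical toy rows (not the homework data).
toy_rows = [
    {"color": "red", "shape": "round", "label": "yes"},
    {"color": "red", "shape": "square", "label": "no"},
    {"color": "blue", "shape": "round", "label": "no"},
    {"color": "blue", "shape": "square", "label": "no"},
    {"color": "blue", "shape": "round", "label": "no"},
]

print(build_tree(toy_rows, ["color", "shape"]))
```

In the toy set, `color` is chosen first because the `blue` branch is already pure; only the `red` branch needs a second split on `shape`.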