The table below contains 11 training examples with three attributes each:
| ID# | Texture | Temperature | Size | Classification |
| 1 | smooth | cold | large | yes |
| 2 | smooth | cold | small | no |
| 3 | smooth | cool | large | yes |
| 4 | smooth | cool | small | yes |
| 5 | smooth | hot | small | yes |
| 6 | wavy | cold | medium | no |
| 7 | wavy | hot | large | yes |
| 8 | rough | cold | large | no |
| 9 | rough | cool | large | yes |
| 10 | rough | hot | small | no |
| 11 | rough | warm | medium | yes |
Calculate the initial entropy of this training set.
| Initial Entropy |
|
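As a way to check a hand calculation, the initial entropy can be computed from the Classification column alone. This is a minimal sketch that assumes the usual Shannon entropy in bits (base-2 logarithm). The names `entropy`, `labels`, and `initial_entropy` are illustrative and not part of the assignment.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


# Classification column of the training table, in ID order 1..11.
labels = ["yes", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "no", "yes"]

initial_entropy = entropy(labels)
print(f"Initial entropy: {initial_entropy:.4f} bits")
```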
We want to use this data, the ID3 algorithm, and the concept of change in entropy to construct an accurate yet compact decision tree for this domain. To determine the best first attribute, calculate the entropy that remains after dividing the data on each of the three attributes independently, treating each one as the first choice. The entropy after a split is the weighted average of the entropies of the resulting subsets, with each subset weighted by its share of the examples. Complete the table below.
| Attribute | Entropy if splitting on the attribute |
| texture | |
| temperature | |
| size | |
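The entries for the table above can be checked with a short script. This is a sketch, assuming that "entropy if splitting on the attribute" means the weighted average of the subset entropies, which is the quantity ID3 minimizes. The names `DATA`, `ATTRIBUTES`, and `split_entropy` are illustrative.

```python
from collections import Counter, defaultdict
from math import log2

# Rows copied from the training table: (texture, temperature, size, classification).
DATA = [
    ("smooth", "cold", "large", "yes"),
    ("smooth", "cold", "small", "no"),
    ("smooth", "cool", "large", "yes"),
    ("smooth", "cool", "small", "yes"),
    ("smooth", "hot", "small", "yes"),
    ("wavy", "cold", "medium", "no"),
    ("wavy", "hot", "large", "yes"),
    ("rough", "cold", "large", "no"),
    ("rough", "cool", "large", "yes"),
    ("rough", "hot", "small", "no"),
    ("rough", "warm", "medium", "yes"),
]
# Column index of each attribute within a row.
ATTRIBUTES = {"texture": 0, "temperature": 1, "size": 2}


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, column):
    """Weighted average entropy of the subsets produced by splitting on `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


split_entropies = {name: split_entropy(DATA, col) for name, col in ATTRIBUTES.items()}
for name, value in split_entropies.items():
    print(f"{name:12s} {value:.4f}")
```

Subtracting each value from the initial entropy gives the information gain for that attribute.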
As we know from our study of entropy and the ID3 algorithm, the attribute that leaves the lowest entropy after the split provides the most information gain. Thus, using your results from the table above, split the eleven training examples on the best attribute and begin to construct the tree that results from this split. Notice that some of the resulting subsets will be perfectly classified and are therefore leaves of the decision tree. Label each of those leaves with the correct classification. Label each node that is not yet a leaf with the number of training examples in each classification.
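The labelling step described above can be sketched as follows. The helper `describe_split` is hypothetical. It groups the examples by the value of one attribute, marks pure branches as leaves, and reports class counts for mixed branches. The attribute in `chosen` is only a placeholder and should be replaced by whichever attribute your table shows to be best.

```python
from collections import Counter, defaultdict

# Rows copied from the training table: (texture, temperature, size, classification).
DATA = [
    ("smooth", "cold", "large", "yes"),
    ("smooth", "cold", "small", "no"),
    ("smooth", "cool", "large", "yes"),
    ("smooth", "cool", "small", "yes"),
    ("smooth", "hot", "small", "yes"),
    ("wavy", "cold", "medium", "no"),
    ("wavy", "hot", "large", "yes"),
    ("rough", "cold", "large", "no"),
    ("rough", "cool", "large", "yes"),
    ("rough", "hot", "small", "no"),
    ("rough", "warm", "medium", "yes"),
]
# Column index of each attribute within a row.
ATTRIBUTES = {"texture": 0, "temperature": 1, "size": 2}


def describe_split(rows, column):
    """Group rows by the value in `column` and label each branch.

    A pure branch becomes a leaf labelled with its class; a mixed branch
    is labelled with its per-class example counts.
    """
    branches = defaultdict(list)
    for row in rows:
        branches[row[column]].append(row[-1])
    labels = {}
    for value, classes in branches.items():
        counts = Counter(classes)
        if len(counts) == 1:
            labels[value] = f"leaf: {classes[0]}"
        else:
            labels[value] = f"{counts['yes']} yes / {counts['no']} no"
    return labels


chosen = "texture"  # placeholder: substitute the attribute with the lowest split entropy
for value, label in describe_split(DATA, ATTRIBUTES[chosen]).items():
    print(f"{chosen} = {value}: {label}")
```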
For each node that is not yet a leaf, recursively determine (independently of the other nodes) which of the remaining two attributes makes the best second choice by calculating the entropy of that portion of the tree under each attribute split. Continue making these calculations until you can complete the tree.
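The recursive procedure described above can be sketched in a few lines, which may help when checking a hand-built tree. This is an illustrative version of ID3 and rests on three assumptions. The first is that ties between attributes are broken by dictionary order, an arbitrary choice, so a tree of the same size but with a different attribute at a tied node is equally valid. The second is that a node with no attributes left falls back to the majority class. The third is that `classify` is only given attribute values that appear in the training data. The names `build_tree` and `classify` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2
from pprint import pprint

# Rows copied from the training table: (texture, temperature, size, classification).
DATA = [
    ("smooth", "cold", "large", "yes"),
    ("smooth", "cold", "small", "no"),
    ("smooth", "cool", "large", "yes"),
    ("smooth", "cool", "small", "yes"),
    ("smooth", "hot", "small", "yes"),
    ("wavy", "cold", "medium", "no"),
    ("wavy", "hot", "large", "yes"),
    ("rough", "cold", "large", "no"),
    ("rough", "cool", "large", "yes"),
    ("rough", "hot", "small", "no"),
    ("rough", "warm", "medium", "yes"),
]
# Column index of each attribute within a row.
ATTRIBUTES = {"texture": 0, "temperature": 1, "size": 2}


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, column):
    """Weighted average entropy of the subsets produced by splitting on `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def build_tree(rows, attributes):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    counts = Counter(row[-1] for row in rows)
    if len(counts) == 1:
        return rows[0][-1]  # perfectly classified: this node is a leaf
    if not attributes:
        return counts.most_common(1)[0][0]  # assumption: majority-class fallback
    # Pick the attribute that leaves the lowest entropy; ties go to the first one.
    best = min(attributes, key=lambda name: split_entropy(rows, attributes[name]))
    remaining = {k: v for k, v in attributes.items() if k != best}
    branches = defaultdict(list)
    for row in rows:
        branches[row[attributes[best]]].append(row)
    return {best: {value: build_tree(subset, remaining)
                   for value, subset in branches.items()}}


def classify(tree, row):
    """Walk the tree using the row's attribute values until a leaf is reached."""
    while isinstance(tree, dict):
        name = next(iter(tree))
        tree = tree[name][row[ATTRIBUTES[name]]]
    return tree


tree = build_tree(DATA, ATTRIBUTES)
pprint(tree)
```

Running `classify(tree, row)` on each training row is a quick check that the finished tree reproduces every classification in the table.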