Below are a few notes on decision trees. I still have a lot to learn but this will become my working document.
The aim of any decision tree is to create a workable model that predicts the value of a target variable from a set of input variables.
Uses
Financial Industry
One fundamental use case is option pricing, where a binary-style tree, with each node branching into an up (bull) or down (bear) market move, is used to estimate the price of an option.
Marketing
Marketers use decision trees to segment customers by type and predict whether a customer will buy a specific type of product.
Medical
In the medical field, decision tree models have been designed to diagnose blood infections or even predict heart attack outcomes in chest pain patients. Variables in the decision tree include diagnosis, treatment, and patient data.
Gaming
The gaming industry now uses multiple decision trees in movement recognition and facial recognition.
Positives
Decision trees are easy to read. After a model is generated, it's easy to explain to others how the tree works. Decision trees can also handle both numerical and categorical data.
Negatives
One of the main issues with decision trees is that they can create overly complex models that fit the training set too closely (overfitting), depending on the data presented in that set.
Types of Algorithms
- ID3 - The ID3 (Iterative Dichotomiser 3) algorithm was invented by Ross Quinlan to create trees from datasets. It calculates the entropy of every attribute in the dataset and splits the dataset into subsets on the attribute that gives the minimum entropy (equivalently, the highest information gain).
- C4.5
- CHAID (Chi-squared Automatic Interaction Detection)
- MARS (multivariate adaptive regression splines) algorithm
Creating a Decision Tree
Decision trees are built around this basic algorithm:
- Check the model for the base cases.
- Iterate through all the attributes (attr).
- Get the normalized information gain from splitting on attr.
- Let best_attr be the attribute with the highest information gain.
- Create a decision node that splits on the best_attr attribute.
- Work on the sublists that are obtained by splitting on best_attr and add those nodes as child nodes.
That's the basic outline of what happens when you build a decision tree. Depending on the algorithm type, like the ones previously mentioned, there might be subtle differences in the way things are done.
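The steps above can be sketched in a few lines of Python. This is only a rough, ID3-style illustration under some assumptions of mine: rows are plain dicts, all attributes are categorical, and it uses plain (not normalized) information gain. The function names (entropy, information_gain, build_tree) and the toy data are hypothetical, not taken from any library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row[target] for row in rows if row[attr] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def build_tree(rows, attrs, target):
    labels = [row[target] for row in rows]
    # Base cases: every row has the same label, or there is nothing left to split on.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Let best_attr be the attribute with the highest information gain.
    best_attr = max(attrs, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attrs if a != best_attr]
    # Decision node: one child subtree per observed value of best_attr.
    node = {best_attr: {}}
    for value in {row[best_attr] for row in rows}:
        subset = [row for row in rows if row[best_attr] == value]
        node[best_attr][value] = build_tree(subset, remaining, target)
    return node


# Made-up toy data, purely for illustration.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

In this toy data "outlook" separates the labels perfectly while "windy" tells you nothing, so the tree splits once on "outlook" and stops. A real implementation would also need to handle numeric attributes, missing values, and pruning.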
Calculating Entropy
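Entropy is a measure of uncertainty and is measured in bits. For a target with two possible outcomes it is a number between 0 and 1; more generally, with k possible outcomes it ranges from 0 to log2(k). (Entropy bits are units of information rather than the storage bits of computing terminology, so fractional values are normal.) Basically, you are measuring the unpredictability of a random variable.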
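To make this concrete, here is a small sketch of the usual Shannon entropy calculation, H = -sum(p * log2(p)) over the proportion p of each class. The function name entropy and the example labels are my own, chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    # p is the proportion of each class; each class contributes -p * log2(p).
    probs = [n / total for n in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)


certain = entropy(["yes", "yes", "yes", "yes"])   # no uncertainty at all
coin_flip = entropy(["yes", "no"])                # two equally likely outcomes
skewed = entropy(["yes"] * 9 + ["no"])            # mostly predictable
four_way = entropy(["a", "b", "c", "d"])          # four equally likely outcomes
print(certain, coin_flip, skewed, four_way)
```

A fully predictable variable has entropy 0 and a fair two-way split has entropy 1. The skewed case falls somewhere in between, and with more than two classes the value can exceed 1 (four equally likely classes give 2 bits).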