Decision Tree

Concept: Decision Tree Algorithm
  1. Instance: A vector of features (attributes) that describes one example from the input space.
  2. Attribute: A single quantity (feature) describing an instance.
  3. Root Node: The topmost node of the tree. It represents the whole data set and is split into two or more subsets that are more homogeneous.
  4. Decision Node: An internal node that tests an attribute and splits into sub-nodes.
  5. Leaf Node: A terminal node that is not split further. It gives the predicted value of the target variable.
  6. Pruning: Splitting continues until a stopping criterion is reached, which produces a fully grown tree. A fully grown tree is likely to overfit the training data and give poor accuracy on unseen data, so pruning removes branches that add little predictive power.
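To make the terms above concrete, here is a minimal sketch of how a tree could be represented and traversed. The `Node` class, the `predict` helper, and the two features (age, income) are hypothetical names chosen for illustration; they do not come from any particular library.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # A decision node sets feature/threshold/left/right.
    # A leaf node sets only prediction.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None

    def is_leaf(self) -> bool:
        return self.prediction is not None


def predict(node: Node, instance: Sequence[float]) -> str:
    """Walk from the root to a leaf, taking one branch per decision node."""
    while not node.is_leaf():
        if instance[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# Hypothetical hand-built tree over instances of the form [age, income].
root = Node(                       # root node: tests attribute 0 (age)
    feature=0, threshold=30.0,
    left=Node(prediction="no"),    # leaf node
    right=Node(                    # decision node: tests attribute 1 (income)
        feature=1, threshold=50.0,
        left=Node(prediction="no"),
        right=Node(prediction="yes"),
    ),
)

print(predict(root, [25, 80]))  # age <= 30, so "no"
print(predict(root, [45, 80]))  # age > 30 and income > 50, so "yes"
```

In this sketch, pruning would amount to replacing a decision node with a single leaf, which is why a pruned tree is smaller and less tailored to the training data.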
Example of Decision Tree
Entropy Formula
Information Gain Formula
Gini Index Formula (Source: KDnuggets)
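The formula images themselves are not reproduced in this text. Assuming they show the standard definitions, with p_i the fraction of samples in class i and S_v the subset of S that falls into branch v of a split on attribute A, they read:

  Entropy:           H(S) = -sum_i p_i * log2(p_i)
  Information gain:  IG(S, A) = H(S) - sum_v (|S_v| / |S|) * H(S_v)
  Gini index:        Gini(S) = 1 - sum_i p_i^2

A small sketch of these three measures in plain Python follows. The function names and the toy labels are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy data: 5 "yes" and 5 "no", split into two mostly pure subsets.
parent = ["yes"] * 5 + ["no"] * 5
split = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]

print(entropy(parent))                           # 1.0
print(gini(parent))                              # 0.5
print(round(information_gain(parent, split), 3))  # 0.278
```

A split is preferred when it yields a higher information gain or, equivalently, children with lower impurity.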
Source: Datacamp
Advantages
  1. Easy to use and understand.
  2. Can handle both categorical and numerical data.
  3. Relatively robust to outliers and requires little data preprocessing (for example, no feature scaling).
  4. Can be used to build larger classifiers by using ensemble methods.
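Item 4 can be illustrated with bagging, one common ensemble method: train many small trees on bootstrap samples of the data and let them vote. The sketch below uses one-split trees (decision stumps) on a single numeric feature to stay short. All function names and the toy data are hypothetical, and a real ensemble would use full trees from a library.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split tree on one numeric feature by minimising training errors."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        l_label = Counter(left).most_common(1)[0][0]
        r_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != l_label for y in left) + sum(y != r_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_label, r_label)
    if best is None:
        # No valid split (all x identical): always predict the majority label.
        label = Counter(ys).most_common(1)[0][0]
        return (float("inf"), label, label)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def bagging_fit(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def bagging_predict(stumps, x):
    """Majority vote over all stumps."""
    votes = Counter(stump_predict(s, x) for s in stumps)
    return votes.most_common(1)[0][0]


# Toy data: class 0 for small x, class 1 for large x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = bagging_fit(xs, ys)
print(bagging_predict(model, 1), bagging_predict(model, 8))
```

Because each stump sees a slightly different sample, individual stumps disagree near the class boundary, and the vote averages out those differences.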
Disadvantages
  1. Prone to overfitting.
  2. Can be unstable: small variations in the data might produce a completely different tree. This is high variance, which ensemble methods such as bagging or boosting can help reduce.
  3. Decision tree learners create biased trees if some classes dominate. Therefore, it is recommended to balance the data set before fitting the decision tree.
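One simple way to balance a data set, as item 3 recommends, is random oversampling: duplicate minority-class rows until every class is as large as the biggest one. The sketch below is only one of several possible approaches (undersampling and class weights are others), and the `oversample` helper and toy data are hypothetical.

```python
import random
from collections import Counter


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes match the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing rows with replacement from the existing ones.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 8 negative rows and only 2 positive rows.
rows = [[i] for i in range(10)]
labels = ["neg"] * 8 + ["pos"] * 2

bal_rows, bal_labels = oversample(rows, labels)
print(Counter(bal_labels))  # Counter({'neg': 8, 'pos': 8})
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.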

If anything is worth doing, do it with all your heart. - Buddha

References
  1. https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html
  2. Hands-On Machine Learning, book by Aurélien Géron.

A machine learning researcher starting a new blog series to make ML concepts clear, simple, and easy for everyone.

sai krishna
