From the course: Supervised Learning Essential Training
Splitting data and limiting decision tree depth - Python Tutorial
- [Instructor] One of the biggest practical challenges in supervised learning, and in building decision trees in particular, is overfitting: the model memorizes the training data and then predicts poorly on the test data. For decision trees, there are two main ways to avoid overfitting. First, we can set constraints on tree size. Second, we can prune the tree. Useful constraints include the minimum number of samples required at a leaf node, the maximum (vertical) depth of the tree, the maximum number of leaf nodes, and the maximum number of features to consider when searching for the best split. We can also prune the tree to produce a robust model that generalizes well to new data. To do this, we first grow the tree; then we start at the bottom and remove sub-trees that don't improve classification accuracy. There are two pruning strategies to consider. First…
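The two approaches described above, constraining tree size and pruning a fully grown tree, can be sketched with scikit-learn. This is a minimal illustration, not the course's own exercise code: the iris dataset and the specific parameter values are assumptions chosen for the example, and pruning is shown via scikit-learn's minimal cost-complexity pruning (`ccp_alpha`), one concrete bottom-up pruning strategy.

```python
# Sketch: constraining tree size and pruning with scikit-learn.
# The dataset (iris) and the parameter values are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Split the data so overfitting shows up as a train/test accuracy gap.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Constrained tree: each keyword argument is one of the constraints above.
constrained = DecisionTreeClassifier(
    min_samples_leaf=5,   # minimum samples required at a leaf node
    max_depth=3,          # maximum (vertical) depth of the tree
    max_leaf_nodes=10,    # maximum number of leaf nodes
    max_features=None,    # features considered per split (None = all)
    random_state=42,
)
constrained.fit(X_train, y_train)

# Pruned tree: grow fully, then remove sub-trees whose cost-complexity
# trade-off does not justify keeping them (larger ccp_alpha = more pruning).
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(
    X_train, y_train
)
pruned = DecisionTreeClassifier(
    ccp_alpha=path.ccp_alphas[-2],  # second-largest alpha: heavy pruning
    random_state=42,
)
pruned.fit(X_train, y_train)

print("constrained depth:", constrained.get_depth())
print("constrained test accuracy:", constrained.score(X_test, y_test))
print("pruned depth:", pruned.get_depth())
```

Sweeping `ccp_alphas` from the pruning path and picking the value with the best cross-validated score is the usual way to choose how aggressively to prune.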
Contents
- Identify common decision trees (1m 55s)
- Splitting data and limiting decision tree depth (3m 41s)
- How to build a decision tree (2m 3s)
- Creating your first decision trees (2m 49s)
- Analyzing decision tree performance (5m 1s)
- Exploring how ensemble methods create strong learners (1m 55s)