From the course: Supervised Learning Essential Training

What is supervised learning? - Python Tutorial

What is supervised learning?

- [Instructor] Machine learning, or ML, is the ability of a technology to learn automatically from the data it consumes. Some types of machine learning require labeled training data and others don't. As with everything, there are pros and cons to the different types of machine learning. You may have heard of unsupervised, semi-supervised, and reinforcement ML, but the most common branch is supervised ML.

In supervised learning, the machine learns much as a human would under the supervision of a teacher. Imagine a one-on-one session where a teacher provides examples for a student to learn about a new subject. The student learns some patterns from these examples and is then given a test on examples they haven't seen before. The teacher grades the test and gives the student feedback on where they went wrong. Retraining can be done if the student does poorly on the test.

Supervised learning problems are categorized into regression and classification problems. A regression problem is one in which you try to predict a continuous variable, typically a numerical value. For example, we may want to predict the price of a prescription medication based on several factors, such as strength in milligrams or the manufacturer. Our training data would have one row for each historical period and columns with details about the medication, like strength and manufacturer. Here's the important part: each row also includes the prescription's price for those particular parameters. Based on this information, we may be able to predict the prices of other medications.

A classification problem is one in which the output variable is a category. Predicting whether an image is a dog or a cat is a classification problem, as we're predicting a discrete class. We can also have more than two classes, like dog, cat, fish, or horse. In this case, the algorithm would be fed images of these animals and told which image belongs to which class.
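To make the contrast between the two problem types concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the strengths, prices, and animal feature values are made up, the helper names `fit_line` and `nearest_label` are hypothetical, and the two-number tuples merely stand in for real image features. It is not the method used in the course.

```python
# Illustrative only: all numbers and labels below are made up for this sketch.

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_label(examples, query):
    """1-nearest-neighbor: return the label of the closest labeled example."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    closest = min(examples, key=lambda ex: squared_distance(ex[0], query))
    return closest[1]


# Regression: the target (price) is a continuous number.
strength_mg = [10, 20, 40, 80]       # feature column (hypothetical)
price = [5.0, 9.0, 17.0, 33.0]       # known prices for those rows (hypothetical)
slope, intercept = fit_line(strength_mg, price)
predicted_price = slope * 60 + intercept   # estimate for an unseen 60 mg strength

# Classification: the target is a discrete class.
# Each entry is (features, label); the features are stand-ins for an image.
labeled_animals = [
    ((0.9, 0.2), "dog"), ((0.8, 0.3), "dog"),
    ((0.2, 0.9), "cat"), ((0.1, 0.8), "cat"),
]
predicted_class = nearest_label(labeled_animals, (0.25, 0.9))

print(f"predicted price: {predicted_price:.2f}")
print(f"predicted class: {predicted_class}")
```

The regression half returns a number on a continuous scale, while the classification half can only return one of the labels it was shown during training.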
The trained algorithm would then predict the class for images that have not yet been classified.

Let's run through the general steps of creating a machine learning model. First, we start with messy or disparate data. We need to clean that data, for example by removing nulls and duplicate rows. Then we separate the data into three sections. The group of data that the model is trained on is called the training data. Then we have a group of data whose answers we know are correct, and we use it to test the model's predictive power. This is our test data. Our model here is just our stored guidelines, so we can't judge how well it performs until we test it on a group of data. The last group is the validation set. During model building, we have several little knobs we can turn to finely adjust how our model learns. We can experiment with changing these values, also known as hyperparameters, and analyze how well our tuned model performs on our validation dataset. This kind of data splitting is crucial to creating models that generalize well to new data, or in other words, can predict well on data they've never seen before.

Whew, that was a very quick primer on supervised learning! Now you're ready to better understand the topics in this course.
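The cleaning, splitting, and tuning steps described above can be sketched roughly as follows, again with the standard library only. The data is synthetic, the 60/20/20 split is an assumed ratio rather than one given in the course, and the function names (`clean`, `split`, `knn_predict`, `accuracy`) are hypothetical. A small k-nearest-neighbors classifier is swapped in purely as an example of a model with one hyperparameter, `k`.

```python
import random


def clean(rows):
    """Drop rows containing None and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned_rows = []
    for row in rows:
        if None in row or row in seen:
            continue
        seen.add(row)
        cleaned_rows.append(row)
    return cleaned_rows


def split(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into train, validation, and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def knn_predict(train_rows, x, k):
    """Majority label among the k training rows whose feature is closest to x."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    labels = [label for _, label in nearest]
    return max(sorted(set(labels)), key=labels.count)


def accuracy(train_rows, rows, k):
    """Fraction of rows whose label the k-NN model predicts correctly."""
    hits = sum(knn_predict(train_rows, x, k) == label for x, label in rows)
    return hits / len(rows)


# Synthetic, messy data: one numeric feature and a "high"/"low" label.
rng = random.Random(42)
xs = [round(rng.random(), 3) for _ in range(200)]
raw = [(x, "high" if x > 0.5 else "low") for x in xs]
raw += raw[:10]                        # duplicate rows
raw += [(None, "low"), (0.7, None)]    # rows with missing values

cleaned = clean(raw)
train, val, test = split(cleaned)

# Tune the hyperparameter k on the validation set only.
best_k = max((1, 3, 5), key=lambda k: accuracy(train, val, k))

# Touch the test set once, at the end, to estimate performance on unseen data.
test_accuracy = accuracy(train, test, best_k)
print(f"best k: {best_k}, test accuracy: {test_accuracy:.2f}")
```

Choosing `k` on the validation set and reporting accuracy on the test set keeps the final score from being influenced by the tuning itself, which is the point of holding out separate groups of data.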
