Note: This course is no longer available for enrollment and has been replaced by Master Machine Learning with scikit-learn.
In this 8-hour course, you'll learn:
How to prepare complex datasets for Machine Learning using scikit-learn
How to handle common scenarios such as missing values, text data, and categorical data
How to build a reusable and efficient workflow that starts with a pandas DataFrame and ends with a trained scikit-learn model
How to integrate feature engineering, selection, and standardization into your workflow
How to avoid data leakage so that you can correctly estimate model performance
How to tune your entire workflow for maximum performance
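To give a rough idea of what such a workflow can look like, here is a minimal sketch. It is not taken from the course materials. The toy DataFrame, its column names, and the parameter grid are all hypothetical, chosen only for illustration. It imputes missing values, encodes a categorical column, vectorizes a text column, and tunes the whole Pipeline with a grid search. Because the preprocessing lives inside the Pipeline, it is re-fit on the training portion of each cross-validation split, which is one way to avoid data leakage.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: a numeric column with missing values,
# a categorical column, a free-text column, and a binary target.
df = pd.DataFrame({
    "age": [22.0, None, 35.0, 58.0, None, 41.0, 19.0, 63.0],
    "city": ["NY", "SF", "NY", "LA", "SF", "LA", "NY", "SF"],
    "note": ["late payment", "paid on time", "late again", "on time",
             "paid early", "late payment", "very late", "paid on time"],
    "defaulted": [1, 0, 1, 0, 0, 1, 1, 0],
})
X = df[["age", "city", "note"]]
y = df["defaulted"]

# Apply a different transformer to each kind of column.
preprocessor = make_column_transformer(
    (SimpleImputer(strategy="median"), ["age"]),         # missing values
    (OneHotEncoder(handle_unknown="ignore"), ["city"]),  # categorical data
    (CountVectorizer(), "note"),  # text data; a string selects a 1D column
)

# Chain preprocessing and the model so they are always fit together.
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=1000))

# Tune a preprocessing parameter and a model parameter at the same time.
# Step names are generated by make_pipeline / make_column_transformer.
param_grid = {
    "columntransformer__simpleimputer__strategy": ["mean", "median"],
    "logisticregression__C": [0.1, 1.0, 10.0],
}
grid = GridSearchCV(pipe, param_grid, cv=2, scoring="accuracy")
grid.fit(X, y)

best_params = grid.best_params_
predictions = grid.predict(X)
```

With only eight rows the scores here mean nothing; the point is the shape of the code, which stays the same on a real dataset.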
By the end of the course, you'll be more confident when tackling new Machine Learning problems because you'll understand what steps you need to take, why you need to take them, and how to correctly execute those steps using scikit-learn.
And because you're learning a better way to work in scikit-learn, your code will be easier to write and to read, and you'll get better Machine Learning results faster than before!
This is the perfect course for you if:
You've taken my introductory course and you're ready to go deeper into scikit-learn
You want to write efficient, readable, and reusable scikit-learn code that integrates well with pandas
You want to properly handle common data issues such as missing values, text data, and categorical data
You want to tune your entire workflow for maximum performance
You want to take advantage of the latest scikit-learn features
Five years ago, I remember sitting at my computer for hours, struggling to figure out how to use "Pipeline" and "FeatureUnion" together. It felt like my head might explode at any moment 🤯
Although I had been teaching scikit-learn for a year, I still couldn't figure out how to turn the concepts in my head into the code I needed to write.
Why was it so hard?
With the benefit of hindsight, I can see two reasons why I was struggling:
First, I didn't have a trusted guide who could show me the easiest way to solve my scikit-learn problems.
Second, I didn't have a complete mental model of how the scikit-learn pieces fit together. (When do you use "fit" vs "transform"? What objects are output by each step of a Pipeline? What happens when the test set differs from the training set? etc.)
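As a small illustration of two of those questions ("fit" vs "transform", and a test set that differs from the training set), here is a minimal sketch using OneHotEncoder. The two tiny DataFrames are hypothetical. The encoder learns its categories during "fit" and only applies them during "transform", so a category that appears only in the test set is encoded as all zeros when handle_unknown="ignore" is set.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: the test set contains a city never seen in training.
train = pd.DataFrame({"city": ["NY", "SF", "LA"]})
test = pd.DataFrame({"city": ["SF", "Boston"]})

encoder = OneHotEncoder(handle_unknown="ignore")

# "fit" learns the categories from the training data only.
encoder.fit(train)
learned = list(encoder.categories_[0])  # sorted: ['LA', 'NY', 'SF']

# "transform" applies what was learned and returns a sparse matrix.
train_matrix = encoder.transform(train)

# The unseen category "Boston" becomes a row of zeros instead of an error.
test_matrix = encoder.transform(test).toarray()
# test_matrix is [[0., 0., 1.], [0., 0., 0.]]
```

Without handle_unknown="ignore", the second transform would raise an error on "Boston", which is one common way a test set that differs from the training set shows up in practice.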
Fast forward to today, and scikit-learn comes much more easily to me:
I know how to find exactly what I need in the documentation
I understand nearly all of the terminology
I've built a clear mental model of how things work in scikit-learn
I know what functions and classes are available, and how to use them for maximum efficiency
These days, working in scikit-learn is a JOY. I know what code I need to write, and I can execute my Machine Learning projects much more quickly!
But it took me FIVE YEARS to get here.
Do you want to struggle for five years? Or do you want to dramatically improve your scikit-learn skills TODAY?
Let me be your guide, so that you can finally work with ease in scikit-learn and get better Machine Learning results faster than before!
My name is Kevin, and I've taught Data Science in Python to over a million students.
My courses explain data science topics in a clear, thorough, and step-by-step manner.
I'd love to teach you, regardless of your educational background or professional experience.
Thanks for joining me! 🙌