Imagine yourself at your next job: You're using Machine Learning to solve exciting problems that matter to you, whether that's helping to cure a disease, detect fraud, or predict what will happen next in the stock market.
Machine Learning is one of the most in-demand skills in the job market today, and you know that finally mastering Machine Learning will help you to build a more fulfilling career at the company of your choice.
But making progress with Machine Learning is hard. You've built a lot of models using scikit-learn, but mostly with clean datasets. When you try to apply your knowledge to more complex scenarios, you get lost. Your code doesn't always work, and when it does, you're not sure that you're actually doing things the right way. The courses and blog posts and Stack Overflow answers that got you to this point aren't helping anymore.
It's SO frustrating because you can't find any resources that help get you to the next level. You're motivated to keep learning, but you keep running into roadblocks like documentation you can't understand, papers with too many formulas, and courses that skip over details that you can't figure out on your own.
It feels like your dream of a Machine Learning job is so close, yet so far away.
I took a few Machine Learning courses during my Master's degree (Business Analytics), which gave me some basic ML knowledge.
Your courses and videos helped me a lot to further understand ML, which I believe is the reason I landed my dream job.
- Maggie Tang (Machine Learning Engineer)
I'm here to help. My name is Kevin Markham, and I'm the founder of Data School. I've taught data science in Python to over a million students. My courses explain data science topics in a clear, thorough, and step-by-step manner that you can understand regardless of your educational background or professional experience.
I know EXACTLY what you're going through because eight years ago, I was there too. I had been teaching scikit-learn for a year, and yet I still couldn't figure out how to turn the concepts in my head into the code I needed to write.
Fast forward to now, and scikit-learn comes SO much easier to me. I know what code I need to write in order to do proper Machine Learning, and I get better Machine Learning results faster than ever.
So how did I get here? I've spent the last eight years researching best practices in Machine Learning, digging through the scikit-learn documentation, practicing what I've learned, sharing my knowledge with the community, getting feedback from experts, and even contributing to the scikit-learn library. It was an undeniably effective process, but the journey was long and challenging.
I want to help you avoid that struggle so that you can get to where I am in a fraction of the time!
Machine Learning is huge, and there are lots of people teaching it out there. But few are as clear and practical as Kevin.
Yes, he teaches you the syntax you need to know. But more importantly, he teaches you the ideas behind the syntax, and shows you why and where you'll want to use various techniques.
If you think that Machine Learning is too complex for you to learn, I cannot recommend Kevin's courses enough. He'll give you the confidence you need, along with the knowledge you want.
- Reuven Lerner (Python trainer)
Imagine what it will be like once you finally master Machine Learning with scikit-learn:
You'll be more confident when tackling new Machine Learning problems because you'll understand what steps you need to take, why you need to take them, and how to execute those steps using scikit-learn.
You'll know what problems you might run into, and you'll know exactly how to solve them.
Your code will be easier to write and read, and you'll get better Machine Learning results faster than before.
All of these skills, and the confidence you'll have when applying them, will set you apart from the competition when you're looking for your next Machine Learning job.
I need to thank you for your videos that let me find my dream job. Thank you so much!!
- Arrigo Coen Coria (Data Scientist)
So how can you build these skills?
Well, you could follow the same path that I did... if you can afford to spend ten years of your life and a lot of time banging your head against the wall!
Or you could spend two years and $50,000 getting a Master's degree in Machine Learning, but even then you might end up with a lot of theoretical rather than practical knowledge.
Or you could drop everything, move to a new city, and spend three months and $20,000 at a bootcamp, though you'd better hope that the teaching staff is good.
Or you could stay right where you are and start building these skills TODAY with my new course!
Master Machine Learning with scikit-learn will teach you how to solve almost any supervised Machine Learning problem using the latest scikit-learn techniques.
It's a distillation of everything I've learned about Machine Learning over the past ten years, packaged into clear, step-by-step lessons that are easy to understand and easy to reference.
Previously, I tested a shorter version of this course with 200 students. Here's what a few of them had to say:
This was one of the best data science classes I have ever taken... I was impressed with Kevin's easy-to-understand teaching style where he clearly explains the 'what' and 'why' of each principle... I highly recommend this course.
This course takes you through some of the challenges we face with real data, which is not always the case in other courses... If you're familiar with Machine Learning but need to know how to apply it using scikit-learn, then this course is definitely for you!
This class will not only save me a lot of time in the future, but will also ensure that my models will be robust to data leakage... The explanations and demonstrations are worth the price of admission.
I've already used the learnings from the course in a Machine Learning competition and got impressive results, while keeping the code clean and easy to understand. Also, I'm much more confident at tackling Machine Learning problems and I'm sure this will contribute a lot to my career.
After the test run, I spent 1,000+ hours refining and expanding the course to make it the clearest and most comprehensive scikit-learn course available today.
For a tiny fraction of the cost of a Master's degree or a bootcamp, you can massively improve your skills and get ready for your dream job in Machine Learning!
Most Machine Learning courses suffer from a host of problems: They're poorly taught, lack the necessary depth, and include unexplained or broken code. They don't teach you how to apply what you're learning, and they don't show you how to handle all of the problems you'll ACTUALLY face in real-world Machine Learning.
But in this course, we'll focus on application from the very beginning. We'll spend most of our time writing scikit-learn code, and you'll understand how every single line relates to the problem we're solving.
You'll learn the best practices for proper Machine Learning and you'll learn how to apply those practices to your own Machine Learning problems.
We'll also cover topics that are critical to effective Machine Learning but are rarely covered by other courses, such as:
Cost-sensitive learning
Class imbalance
Data leakage
Regularization
Multivariate imputation
High-cardinality categorical features
Custom transformers for feature engineering
Multiclass problems
Ensembling
Non-linear models
Binning numeric features
ROC and precision-recall curves
Calculating rates from a confusion matrix
How to read the scikit-learn documentation
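As a rough illustration of the "rates from a confusion matrix" topic in the list above, here is a minimal sketch based on the standard definitions. It is not taken from the course: the function name `confusion_rates` is hypothetical, the counts are made up, and the sketch assumes that every denominator is nonzero.

```python
def confusion_rates(tn, fp, fn, tp):
    """Return common rates derived from binary confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "true_positive_rate": tp / (tp + fn),    # also called recall or sensitivity
        "true_negative_rate": tn / (tn + fp),    # also called specificity
        "false_positive_rate": fp / (fp + tn),   # equals 1 - specificity
        "precision": tp / (tp + fp),
    }


# Made-up counts, given in the same order that scikit-learn's
# confusion_matrix(y_true, y_pred).ravel() returns them for a binary problem
rates = confusion_rates(tn=50, fp=10, fn=5, tp=35)
```

With these counts, the accuracy is 0.85 and the true positive rate is 0.875.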
Your courses are THE most to-the-point (yet) comprehensive tutorials I have come across on ML.
I have reviewed many ML courses out there, and none are as terse and yet as useful as your video materials especially when it comes to applied ML.
- Neil Dias (ML Engineer)
Why scikit-learn is the single most popular Machine Learning framework today (and is usually a better choice than deep learning!)
Why your workflow matters WAY more than which algorithm you choose
Five key factors to consider when deciding whether to impute missing values
How a "missing indicator" can transform your missing values into an asset
How to choose between ordinal and one-hot encoding for your categorical features
How to create meaningful features from unstructured text data
Seven ways to select columns in a ColumnTransformer (this is a huge timesaver!)
Three automated feature selection methods that will improve your model's performance
How to know whether your numerical features should be standardized
How to calculate the confidence level of your predictions (and when NOT to do this)
My five-step process for properly handling class imbalance
Why you need to tune your transformers and your model at the same time
How to speed up your grid search (and when to use a randomized search instead)
Two easy methods for ensembling multiple models
Why using pandas for transformations can lead to "data leakage" (this is critical to avoid!)
How to do ALL of your feature engineering in scikit-learn using custom transformers
How to examine every step of your Pipeline (this is great for troubleshooting!)
How to save your best Pipeline for future predictions
And much, much more!
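To make a couple of the items above more concrete, here is a minimal sketch of tuning a transformer and a model at the same time, with a "missing indicator" as one of the options being searched. It is an illustration only, not code from the course: the dataset is synthetic, and the step names `imputer` and `model` and the parameter values are assumptions.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data (an assumption for illustration): 100 rows, 3 numeric features
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[rng.rand(100, 3) < 0.1] = np.nan  # blank out roughly 10% of the values

# Because the imputer lives inside the Pipeline, it is re-fit on the training
# portion of each fold, so no statistics leak in from the held-out portion
pipe = Pipeline([
    ("imputer", SimpleImputer()),
    ("model", LogisticRegression(solver="liblinear")),
])

# One grid covers both the transformer settings and the model settings
params = {
    "imputer__strategy": ["mean", "median"],
    "imputer__add_indicator": [False, True],  # True appends missing-indicator columns
    "model__C": [0.1, 1, 10],
}

grid = GridSearchCV(pipe, params, cv=5, scoring="accuracy")
grid.fit(X, y)
```

After fitting, `grid.best_params_` holds the winning combination and `grid.best_score_` its mean cross-validated accuracy. Searching both kinds of parameters in one grid matters because the best model setting can depend on how the data was preprocessed.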
You're one of the best teachers out there! Thank you very much for making Data Science and Machine Learning so intuitive and interesting.
- Sohum Rajdev (Master's Student)
149 video lessons (7.5 hours) with transcripts for easy reference
126 quiz questions to check your understanding
900+ lines of code you can adapt for your own projects
Jupyter notebooks with all of the code and lecture notes
Downloadable datasets so you can follow along at home
Certificate of completion at the end of the course
Lifetime access to everything
Free access to future course updates
Choosing Data School's courses is the best decision I have ever made when embarking on my data science journey.
- Duc Nguyen Huu (Data Science Intern)
I'm confident that this course will help you massively improve your Machine Learning skills and move you closer to your dream job in Machine Learning.
But if you're not satisfied with the course, just let me know within 30 days of purchase and I'm happy to give you 100% of your money back, no questions asked!
If you're ready to invest in your Machine Learning career, click the button below for immediate access to the entire course.
I'm so excited for the journey you're about to take, and I can't wait to hear how this course helps you to build a more fulfilling career!
- Kevin Markham (Founder of Data School)
You're ready for this course if you can use scikit-learn to solve simple classification or regression problems, including loading a dataset, defining the features and target, training and evaluating a model, and making predictions with new data. You'll also need to know how to perform a few basic pandas operations, including reading a CSV file and selecting columns from a DataFrame.
If you're not yet ready, I recommend enrolling in my free introductory ML course and completing lessons 1 through 7, after which you'll be ready for this course!
Not at all! I purposefully teach in a way that is accessible to a wide variety of educational backgrounds, even when covering more advanced topics like regularization, multivariate imputation, cost-sensitive learning, and so on.
Here's a brief summary of the topics covered in the course:
Review of the basic Machine Learning workflow
Encoding categorical features
Encoding text data
Handling missing values
Preparing complex datasets
Creating an efficient workflow for preprocessing and model building
Tuning your workflow for maximum performance
Avoiding data leakage
Proper model evaluation
Automatic feature selection
Feature standardization
Feature engineering using custom transformers
Linear and non-linear models
Model ensembling
Model persistence
Handling high-cardinality categorical features
Handling class imbalance
You can scroll down to the Course Outline to see the detailed list of all 149 lessons.
Workflow composition: Pipeline, ColumnTransformer, make_pipeline, make_column_transformer, make_column_selector, make_union
Categorical encoding: OneHotEncoder, OrdinalEncoder
Numerical encoding: KBinsDiscretizer
Text encoding: CountVectorizer
Missing value imputation: SimpleImputer, KNNImputer, IterativeImputer, MissingIndicator
Model building: LogisticRegression, RandomForestClassifier, ExtraTreesClassifier
Model ensembling: VotingClassifier
Model selection: StratifiedKFold, cross_val_score, train_test_split
Model evaluation: accuracy_score, classification_report, confusion_matrix, roc_auc_score, average_precision_score, plot_confusion_matrix, plot_roc_curve, plot_precision_recall_curve
Hyperparameter tuning: GridSearchCV, RandomizedSearchCV
Feature selection: RFE, SelectPercentile, SelectFromModel, chi2
Feature standardization: StandardScaler, MaxAbsScaler
Feature engineering: FunctionTransformer, PolynomialFeatures
Configuration: set_config
Model persistence: joblib, pickle, cloudpickle (these are external libraries)
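As a rough sketch of how a few of the functions and classes listed above can fit together, the example below builds one workflow that encodes a categorical column, vectorizes a text column, imputes a numeric column, and trains a model. It is not taken from the course: the tiny dataset, the column names, and the choice of steps are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Tiny made-up dataset; column names and values are hypothetical
df = pd.DataFrame({
    "city": ["Paris", "Lima", "Paris", "Oslo", "Lima", "Oslo"],
    "note": ["fast reply", "slow reply", "fast and kind", "slow", "kind reply", "fast"],
    "age": [34.0, np.nan, 29.0, 51.0, 42.0, np.nan],
    "bought": [1, 0, 1, 0, 1, 0],
})
X = df[["city", "note", "age"]]
y = df["bought"]

preprocessor = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["city"]),  # list of columns: 2D input
    (CountVectorizer(), "note"),                         # single column name: 1D input
    (SimpleImputer(strategy="median"), ["age"]),
)

# Preprocessing and model are fit together as a single object
pipe = make_pipeline(preprocessor, LogisticRegression(solver="liblinear"))
pipe.fit(X, y)

# New data goes through exactly the same preprocessing before prediction
X_new = pd.DataFrame({"city": ["Rome"], "note": ["fast reply"], "age": [np.nan]})
predictions = pipe.predict(X_new)
```

Note that `CountVectorizer` receives a single column name rather than a list, since it expects one-dimensional input, and that `handle_unknown="ignore"` lets the encoder cope with a city it never saw during training.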
Yes! If you have unlimited time and a lot of patience, you can learn everything I cover in this course by reading countless books, research papers, articles, documentation pages, GitHub pull requests, and so on. (Make sure to ignore all of the erroneous and outdated information!)
Alternatively, you can save yourself a lot of time and frustration by making this small, one-time investment in yourself and your career!
No single course (or degree) can guarantee you a job in Machine Learning. Every job requires a combination of skills, experience, and domain knowledge that are specific to the employer and position.
However, I can guarantee that if you complete this course and commit to applying what you've learned, you will achieve a far greater fluency with Machine Learning and scikit-learn that will set you apart from the competition when applying for your next job!
Definitely! If you browse through the Course Outline below, I think you'll find that I cover a ton of topics that are critical to effective Machine Learning but are rarely covered by other courses (or even the scikit-learn documentation!).
I've found that the workflow will have a far greater impact on your Machine Learning results than your ability to pick between algorithms. In fact, once you've mastered the workflow, you can iterate through different algorithms quickly even if you don't deeply understand them.
Understanding algorithms is still useful, but it's hard to know in advance which algorithm will work best for a particular problem. That's why it's so important to build a flexible workflow that enables you to easily experiment with different algorithms.
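Here is a minimal sketch of what experimenting with different algorithms inside a fixed workflow might look like. It is an illustration under stated assumptions rather than the course's own code: the data is synthetic, and the step names and candidate models are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (an assumption for illustration)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(solver="liblinear")),
])

candidates = {
    "logistic_regression": LogisticRegression(solver="liblinear"),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in candidates.items():
    # Replace only the final step; the preprocessing stays the same
    pipe.set_params(model=model)
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()
```

Each entry in `scores` is the mean cross-validated accuracy for one algorithm, so adding another candidate is a one-line change to the `candidates` dictionary.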
Master Machine Learning with scikit-learn is an updated and significantly expanded version of Building an Effective Machine Learning Workflow with scikit-learn. I spent 1,000+ hours revising every existing lesson and adding countless new lessons to make Master Machine Learning the clearest and most comprehensive scikit-learn course available today.
Yes! I created this course using scikit-learn 0.23.2. Since then, very little has changed in the library that affects the course, and when there has been a relevant change, I note that within the course.
The only libraries you'll need to install are scikit-learn (version 0.20.2 or later), pandas (any version), and matplotlib (any version). To check your scikit-learn version, just open your Python editor and run these two lines of code:
import sklearn
print(sklearn.__version__)
If your scikit-learn version is 0.20.1 or earlier, then it's important that you upgrade it using pip or conda.
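If you would rather check the requirement in code than compare version numbers by eye, one possible sketch is shown below. The helper name `meets_minimum` is hypothetical, and the parsing is deliberately simple: it looks only at the leading digits of the first three dot-separated components of the version string.

```python
import re

MINIMUM_VERSION = (0, 20, 2)  # minimum scikit-learn version stated above


def meets_minimum(version, minimum=MINIMUM_VERSION):
    """Return True if a version string such as '0.23.2' is at least `minimum`."""
    parts = []
    for piece in version.split(".")[:len(minimum)]:
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at a non-numeric piece such as 'dev0'
        parts.append(int(match.group()))
    # Pad with zeros so that '1.4' is compared as (1, 4, 0)
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= minimum
```

For example, `meets_minimum("0.23.2")` returns `True` and `meets_minimum("0.20.1")` returns `False`. With scikit-learn installed, you could call `meets_minimum(sklearn.__version__)` after `import sklearn`.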
I'll be writing code using the Jupyter notebook, though you can use any Python editor you like. If you'd like to install the Jupyter notebook, I recommend downloading the free Anaconda distribution, which also includes scikit-learn, pandas, and matplotlib.
Alternatively, you could participate in the course using Google Colab. Colab is free and runs entirely in your browser, and it provides you with an interface similar to the Jupyter notebook.
Most chapters include a substantial section of Q&A lessons, which answer all of the common questions that students have asked me about that topic. In addition, you can post a question below any video, and I'll do my best to respond!
Once you have watched all of the videos and attempted all of the quizzes, you can request a certificate of completion.
You will have lifetime access to the course so that you can work through lessons at your own pace and reference them later. I expect that it will be useful to you for years to come!
Yes! I offer Purchasing Power Parity discounts (also known as location-based discounts) for all of my paid courses. If you're located in one of the 160+ qualifying countries, you should automatically see a discount code at the top of this page.
I also offer student discounts and hardship-based discounts, regardless of where you live. Please email me at kevin@dataschool.io and I'd be happy to send you the appropriate discount code.
If you decide that the course isn't a good fit for you, I'd be happy to give you a full refund within 30 days of purchase. Simply email me at kevin@dataschool.io and I'll promptly process your refund.
Please email me at kevin@dataschool.io and I'd be happy to answer your question!
You'll notice that the chapters are divided into small, digestible video lessons so that you can work through the course as you have time and easily reference the material later.
In chapter 1, I'll give you an overview of the course and help you to get set up. Then in chapter 2, we'll move on to a review of the Machine Learning workflow in order to establish a foundation for the rest of the course.
In chapters 3 through 9, we'll explore how to handle common issues such as categorical features, text data, and missing values, and also how to integrate those steps into an efficient workflow. Then in chapter 10, we'll cover how to properly evaluate and tune your entire workflow for maximum performance.
In chapters 11 through 16, we'll walk through a variety of advanced techniques that can help to further improve your model's performance, including ensembling, feature selection, feature standardization, and feature engineering. In chapters 17 through 19, we'll dive deep into two common issues you'll run into during real-world Machine Learning, namely high-cardinality categorical features and class imbalance.
Finally, in chapter 20, I'll end the course with my advice for how you can continue to make progress with your Machine Learning education and skill development!
For more details, you can browse through the complete list of lessons below.
Whether or not you enroll, you're still going to want a dream job in Machine Learning.
Sure, you can choose NOT to enroll, and maybe you'll make the time to learn it all on your own. Or, you can enroll in my course and accelerate your Machine Learning skills TODAY!
Just picture yourself in a month:
You'll be more confident when tackling new Machine Learning problems
You'll understand how to write proper, efficient, high-performing scikit-learn code
You'll solve problems at work more quickly and easily
You'll be more ready than ever for your dream job in Machine Learning!
I know that in these uncertain times, it can be hard to invest in yourself.
But think about it: With the dramatic rise in AI technologies, there's no better time to invest in a career in Machine Learning!
There's zero risk, because I offer a 30-day money-back guarantee. What have you got to lose?
I'll see you in the course! 🎓
- Kevin
P.S. Not quite ready? Enroll today to get lifetime access to the course, and then start the course once you're ready!