Book: Master Machine Learning with scikit-learn

A Practical Guide to Building Better Models with Python

About this book

This is a practical guide to help you grow from a Machine Learning novice into a skilled Machine Learning practitioner.

Throughout the book, you’ll learn the best practices for proper Machine Learning and how to apply those practices to your own Machine Learning problems.

By the end of this book, you’ll be more confident when tackling new Machine Learning problems because you’ll understand what steps you need to take, why you need to take them, and how to correctly execute those steps using scikit-learn.

You’ll know what problems you might run into, and you’ll know exactly how to solve them.

Because you’re learning a better way to work in scikit-learn, your code will be easier to write and to read, and you’ll get better Machine Learning results faster than before!

Prerequisite skills

This is an intermediate-level book about scikit-learn, though it also includes many advanced topics.

You’re ready for this book if you can use scikit-learn to solve simple classification or regression problems, including loading a dataset, defining the features and target, training and evaluating a model, and making predictions with new data.
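As a rough self-check, the prerequisite workflow described above might look something like the following sketch. The built-in iris dataset, the choice of logistic regression, and the two made-up observations in `X_new` are assumptions chosen purely for illustration and are not taken from the book.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load a small built-in dataset (assumption: any tabular dataset would do)
iris = load_iris()

# Define the features (X) and the target (y)
X = iris.data
y = iris.target

# Evaluate a model with 5-fold cross-validation
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()

# Train on all of the data, then predict for new (hypothetical) observations
model.fit(X, y)
X_new = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
predictions = model.predict(X_new)

print(f"Mean cross-validated accuracy: {mean_accuracy:.3f}")
print(f"Predicted classes: {predictions}")
```

If each of these steps looks familiar, you likely have the background this book assumes.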

If you’re brand new to scikit-learn, I recommend first taking my free introductory course, Introduction to Machine Learning with scikit-learn. Once you’ve completed lessons 1 through 7, you’ll know the Machine Learning and scikit-learn fundamentals that you’ll need for this book.

If you’ve used scikit-learn before and just need a refresher, you can skip my introductory course, because I review the Machine Learning workflow in chapter 2.

Topics covered

  • Review of the basic Machine Learning workflow

  • Encoding categorical features

  • Encoding text data

  • Handling missing values

  • Preparing complex datasets

  • Creating an efficient workflow for preprocessing and model building

  • Tuning your workflow for maximum performance

  • Avoiding data leakage

  • Proper model evaluation

  • Automatic feature selection

  • Feature standardization

  • Feature engineering using custom transformers

  • Linear and non-linear models

  • Model ensembling

  • Model persistence

  • Handling high-cardinality categorical features

  • Handling class imbalance

Get the paperback edition ($19)

Only available from Amazon.

Forward your Amazon receipt to ebook@dataschool.io and I'll send you the ebook for free!

Get the ebook ($19)

Only available from Data School. Free for All-Access Pass subscribers.

Includes PDF and EPUB formats.

Read it online (free)

Can't afford to buy the book? No problem!

Read the entire book online, for free, with no ads or registration required. That's my gift to you! 🎁

Please consider writing an Amazon review to support the book.

About the author: Kevin Markham

Kevin is the founder of Data School, an online school for learning Data Science with Python.

He has been teaching Machine Learning for more than 10 years and is passionate about teaching people who are new to the field.

He has a degree in Computer Engineering from Vanderbilt University and lives in Asheville, North Carolina.