The Kaggle Workbook: Self-learning exercises and valuable insights for Kaggle data science competitions

Konrad Banachewicz

Luca Massaron

€11.99 ~~€17.99~~

4.8 (25 Ratings)

eBook Feb 2023 172 pages 1st Edition

What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

AI Assistant (beta) to help accelerate your learning

The Kaggle Workbook

Ensembling the results

Now that we have two models, what remains is to blend them and see whether the results improve. As Jahrer suggested, we go straight for a blend, but we do not limit ourselves to a simple average of the two: since our approach ended up differing slightly from Jahrer’s, we also search for optimal blend weights. We start by importing the out-of-fold predictions and preparing our evaluation function.

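import pandas as pd
import numpy as np
from numba import jit

# Normalized Gini coefficient, compiled with Numba for speed
@jit
def eval_gini(y_true, y_pred):
    y_true = np.asarray(y_true)
    # Order the labels by ascending predicted score
    y_true = y_true[np.argsort(y_pred)]
    ntrue = 0
    gini = 0
    delta = 0
    n = len(y_true)
    # Walk from the highest score down, counting negatives ranked above each positive
    for i in range(n-1, -1, -1):
        y_i = y_true[i]
        ntrue += y_i
        gini += y_i * delta
        delta += 1 - y_i
    gini = 1 - 2 * gini / (ntrue * (n - ntrue))
    return gini

# Load the out-of-fold predictions of the two models
lgb_oof = pd.read_csv("../input/workbook-lgb/lgb_oof.csv")
dnn_oof = pd.read_csv...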
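
The search for blend weights mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's code: the names `gini_normalized` and `best_blend_weight` are hypothetical, the two prediction vectors are synthetic stand-ins for the out-of-fold files, and the metric is written in plain NumPy as 2 * AUC - 1, which should agree with `eval_gini` when no predictions are tied.

```python
import numpy as np


def gini_normalized(y_true, y_pred):
    """Normalized Gini as 2 * AUC - 1 (illustrative; assumes no tied predictions)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    # Rank predictions from 1 (lowest score) to n (highest score)
    ranks = np.empty(n, dtype=np.float64)
    ranks[np.argsort(y_pred)] = np.arange(1, n + 1)
    n_pos = y_true.sum()
    n_neg = n - n_pos
    # Rank-sum (Mann-Whitney) form of the AUC
    auc = (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1


def best_blend_weight(y_true, pred_a, pred_b, n_steps=101):
    """Grid-search w in [0, 1] for the blend w * pred_a + (1 - w) * pred_b."""
    weights = np.linspace(0.0, 1.0, n_steps)
    scores = [gini_normalized(y_true, w * pred_a + (1 - w) * pred_b) for w in weights]
    best = int(np.argmax(scores))
    return weights[best], scores[best]


# Synthetic stand-ins for two models' out-of-fold predictions (not competition data)
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
pred_a = y + rng.normal(0.0, 1.0, size=2000)
pred_b = y + rng.normal(0.0, 1.5, size=2000)

gini_a = gini_normalized(y, pred_a)
gini_b = gini_normalized(y, pred_b)
w_best, gini_blend = best_blend_weight(y, pred_a, pred_b)
print(f"model A: {gini_a:.4f}  model B: {gini_b:.4f}")
print(f"best weight for A: {w_best:.2f}  blend: {gini_blend:.4f}")
```

Because the grid includes w = 0 and w = 1, the blended out-of-fold score can never fall below that of the better single model. Weights tuned this way are fitted to the out-of-fold data, so the gain may not fully carry over to the leaderboard. Since the Gini depends only on the ordering of predictions, blending ranks instead of raw scores is a common alternative when the two models output values on different scales.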

Key benefits

Challenge yourself to start thinking like a Kaggle Grandmaster
Fill your portfolio with impressive case studies that will come in handy during interviews
Packed with exercises and notes pages for you to enhance your skills and record key findings

Description

More than 80,000 Kaggle novices currently participate in Kaggle competitions. To help them navigate the often-overwhelming world of Kaggle, two Grandmasters put their heads together to write The Kaggle Book, which made plenty of waves in the community. Now, they’ve come back with an even more practical approach based on hands-on exercises that can help you start thinking like an experienced data scientist.

In this book, you’ll get up close and personal with four extensive case studies based on past Kaggle competitions. You’ll learn how bright minds predicted which drivers would likely avoid filing insurance claims in Brazil and see how expert Kagglers used gradient-boosting methods to model Walmart unit sales time-series data. Get into computer vision by discovering different solutions for identifying the type of disease present on cassava leaves. And see how the Kaggle community created predictive algorithms to solve the natural language processing problem of subjective question-answering.

You can use this workbook as a supplement alongside The Kaggle Book or on its own alongside resources available on the Kaggle website and other online communities. Whatever path you choose, this workbook will help make you a formidable Kaggle competitor.

Who is this book for?

If you’re new to Kaggle and want to sink your teeth into practical exercises, start with The Kaggle Book first. A basic understanding of the Kaggle platform, along with knowledge of machine learning and data science, is a prerequisite. This book is suitable for anyone starting their Kaggle journey and for veterans trying to get better at it. Data analysts and data scientists who want to do better in Kaggle competitions and secure jobs with tech giants will find this book helpful.

What you will learn

Take your modeling to the next level by analyzing different case studies
Boost your data science skillset with a curated selection of exercises
Combine different methods to create better solutions
Get a deeper insight into NLP and how it can help you solve unlikely challenges
Sharpen your knowledge of time-series forecasting
Challenge yourself to become a better data scientist

Frequently bought together

The Kaggle Workbook

Feb 2023 172 pages

4.8 (25)

eBook

€11.99 ~~€17.99~~

Developing Kaggle Notebooks

Dec 2023 370 pages

5 (29)

eBook

€20.98 ~~€29.99~~

The Kaggle Book

Apr 2022 534 pages

4.1 (34)

eBook

€32.99 ~~€47.99~~

Customer reviews

Amznswap Feb 26, 2023

This book is an excellent deep-dive into the nitty-gritties of the Kaggle competition environment. The book is comprehensive, it furnishes diverse competition case-studies in domains like forecasting, NLP and Computer Vision. It provides ample context by distilling the top discussions by leaderboard rankers and complete SotA solution building practise. Particularly impressive are the in-depth sections on the metrics used by these competitions, helping the reader lucidly understand the data-science metrics used by top companies to evaluate ML models. I believe that this book will surely help any novice user get their hands dirty with practical data-science, beyond the theoretical model fundamentals covered in the Kaggle Book.

Amazon Verified review

Daniel Brooks Mar 18, 2023

A hands-on introduction to machine learning. The book covers 4 example competitions on the Kaggle platform - tabular data, time series analysis, computer vision, and NLP. The commentary on each is thorough and reads easily. A great read for those looking to learn more about Kaggle competitions.

Paul Perry Apr 12, 2023

This Kaggle Workbook brings great depth in specific areas and Chapter 4 alone is invaluable, especially now as we all delve deeper into NLP, ChatGPT, and Transformers. This is an excellent resource and I'll tell you why. Firstly, for those aspiring to be experts in AI, ML, and NLP, it's essential to immerse yourself in Kaggle, transcend academic learning, and truly grasp what it takes to achieve top-performing solutions. For those who are new or relatively new to Kaggle, you'll definitely need the broader context provided by the Kaggle Book, but as a practitioner, the Kaggle Workbook goes deeper and dissects the top solutions to 4 specific competitions. As someone who has competed many times and reached the level of Kaggle Master, I find it incredibly valuable to have an in-depth walkthrough of a previous competition. Documenting the top solutions is a ton of work and time I don't have, and here it's as if I have a front row seat to a live competition! It also serves as a fantastic blueprint for how to study past competitions. I've tried to structure my code, document my solutions and store them on GitHub, but all I offer is messy raw code of a ton of failed experiments. But in this Kaggle Workbook Konrad and Luca do all the work and provide all the links and references, and I appreciate their expert view because I don't want to read every forum post to recreate what happened. I'm now using this book to delve into Chapter 4 and looking at the innovative techniques around Transformers. I'm glad to have found this book.

Samuel de Zoete Mar 21, 2023

The Kaggle workbook has important exercises, which accompany the Kaggle book. The Kaggle book was already fantastic for any level of Data Scientist by the way, and now the workbook gives you the confidence and tools to really do it yourself. Having done a few Kaggle competitions in the past, I can highly recommend it to everyone: regardless of your current skill level, there is always something to learn. The Kaggle book and workbook managed to turn this 'always something to learn' into a super practical course in machine learning and I have to use the cliché "A must have... really you do!".

Gianluca Rossi Mar 03, 2023

The book focuses on four practical examples covering tabular data, time series, NLP, and computer vision. The examples are based on high-ranked solutions in recent Kaggle competitions. The authors did a great job describing the reasoning behind every code snippet and sharing tips that can be useful in Kaggle and any ML projects. I particularly appreciated the effort in making the code very readable yet concise. The exercises are educational and require the reader to stop and reason. It's an effortless read, despite the solutions being sophisticated and state-of-the-art. This is a testament to the authors' writing abilities and extensive knowledge. This book is highly recommended for anyone serious about improving their ML skills.

Table of Contents

Ensembling the results
Understanding the competition and the data
Understanding the Evaluation Metric
Examining the 4th place solution’s ideas from Monsaraida
Computing predictions for specific dates and time horizons
Assembling public and private predictions
Summary
Join our book’s Discord space
