Preface
More than six years ago, before I first discovered Kaggle, I was searching for a new path in my professional career. A few years later, I was firmly entrenched in a new job, which Kaggle helped me find. Before discovering this marvelous site, I was looking around on different sites, reading articles, downloading and analyzing datasets, trying out pieces of code from GitHub or other sites, taking online courses, and reading books. With Kaggle, I found more than a source of information; I found a community that shares an interest in machine learning and, more generally, in data science, and that is eager to learn, share knowledge, and solve difficult challenges. I also discovered that in this community, if you want, you can accelerate your learning, because you can learn from the best, sometimes competing against them and other times collaborating with them. You can also learn from the less experienced; after all these years on the platform, I am still learning from both groups.
This mix of continuous challenges and fruitful collaboration makes Kaggle a unique platform, where new and old contributors can feel equally welcome and find things to learn or share. In my first months on the platform, I mostly learned from the vast collections of datasets and from notebooks that analyze competition data and offer solutions for active or past competitions, as well as from the discussion threads. I soon started to contribute, mostly to notebooks, and discovered how rewarding it is to share your own findings and get feedback from other people on the platform. This book is about sharing this joy and what I learned while sharing my findings, ideas, and solutions with the community.
This book is intended to introduce you to the wide world of data analysis, with a focus on how you can use Kaggle Notebooks resources to help you achieve mastery in this field. We will cover topics ranging from simple concepts to more advanced ones. The book is also a personal journey and will take you down a path similar to the one I took while experimenting and learning about analyzing datasets and preparing for competitions.