Python Data Mining Quick Start Guide

A beginner's guide to extracting valuable insights from your data

Product type: Paperback
Published in: Apr 2019
Publisher: Packt
ISBN-13: 9781789800265
Length: 188 pages
Edition: 1st Edition
Author (1):
Nathan Greeneltch
Table of Contents (9 Chapters)

  Preface
  1. Data Mining and Getting Started with Python Tools
  2. Basic Terminology and Our End-to-End Example
  3. Collecting, Exploring, and Visualizing Data
  4. Cleaning and Readying Data for Analysis
  5. Grouping and Clustering Data
  6. Prediction with Regression and Classification
  7. Advanced Topics - Building a Data Processing Pipeline and Deploying It
  8. Other Books You May Enjoy

Data Mining and Getting Started with Python Tools

In a sense, data mining is a necessary and predictable response to the dawn of the information age. Indeed, every piece of the modern global economy relies more each year on information and an immense influx of data. The path from information pool to actionable insights has many steps. Data mining is typically defined as the pattern and/or trend discovery phase of that pipeline.

This book is a quick-start guide for data mining and will include utilitarian descriptions of the most important and widely used methods, including the mainstays among data professionals such as k-means clustering, random forest prediction, and principal component dimensionality reduction. Along the way, I will give you tips I've learned and introduce helpful scripting tools to make your life easier. Not only will I introduce the tools, but I will clearly describe what makes them so helpful and why you should take the time to learn them.

The first half of the book will cover the nuts and bolts of data collection and preparation. The second half will be more conceptual and will introduce the topics of transformation, clustering, and prediction. The conceptual discussions start in the middle of Chapter 4, Cleaning and Readying Data for Analysis, and are written solely as a conversation between me and the reader. These conversations are adapted mostly from the many ad hoc training sessions I've given over the years on Intel office marker boards. The last chapter of the book will cover the deployment of these models. This topic is the natural next step for new practitioners, and I will provide an introduction and references for when you are ready to take it.

The following topics will be covered in this chapter:

  • Descriptive, predictive, and prescriptive analytics
  • What will and will not be covered in this book
  • Setting up Python environments for data mining
  • Installing the Anaconda distribution and Conda package manager
  • Launching the Spyder IDE
  • Launching a Jupyter Notebook
  • Installing a high performance Python distribution
  • Recommended libraries and how to install

Practitioners should be familiar with the data selection, preprocessing, and transformation steps that precede data mining, as well as the pattern and trend evaluation that follows it. Knowing the full process and understanding its goals will orient your data mining efforts and keep you aligned with the overall objective.
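
To make the ordering of these steps concrete, here is a minimal sketch in plain Python. It is not taken from the book: the sample records, the function names, and the 0.5 threshold are all hypothetical and chosen only for illustration. Each function stands in for one stage, with the data mining step sitting between transformation and evaluation.

```python
from statistics import mean

# Hypothetical raw records (illustrative only); None marks a missing reading.
raw_records = [
    {"sensor": "a", "reading": 10.0, "note": "ok"},
    {"sensor": "a", "reading": None, "note": "dropout"},
    {"sensor": "b", "reading": 30.0, "note": "ok"},
    {"sensor": "b", "reading": 50.0, "note": "ok"},
    {"sensor": "c", "reading": 20.0, "note": "ok"},
]


def select(records):
    """Selection: keep only the fields needed for the analysis."""
    return [(r["sensor"], r["reading"]) for r in records]


def preprocess(rows):
    """Preprocessing: drop rows with missing readings."""
    return [(s, v) for s, v in rows if v is not None]


def transform(rows):
    """Transformation: min-max scale readings to the 0-1 range.

    Assumes at least two distinct readings, so the range is non-zero.
    """
    values = [v for _, v in rows]
    low, high = min(values), max(values)
    return [(s, (v - low) / (high - low)) for s, v in rows]


def mine(rows):
    """Data mining: find a simple pattern, the mean scaled reading per sensor."""
    groups = {}
    for s, v in rows:
        groups.setdefault(s, []).append(v)
    return {s: mean(vs) for s, vs in groups.items()}


def evaluate(pattern, threshold=0.5):
    """Evaluation: report sensors whose mean exceeds a (hypothetical) threshold."""
    return sorted(s for s, m in pattern.items() if m > threshold)


# Run the stages in pipeline order.
pattern = mine(transform(preprocess(select(raw_records))))
flagged = evaluate(pattern)
print(pattern)
print(flagged)
```

Keeping each stage as its own function makes it easy to see what the mining step receives and what the evaluation step is expected to judge, which is the orientation the paragraph above describes.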