Getting Started with Python Data Analysis

Learn to use powerful Python libraries for effective data processing and analysis

Product type Paperback
Published in Nov 2015
ISBN-13 9781785285110
Length 188 pages
Edition 1st Edition

Table of Contents

Preface
1. Introducing Data Analysis and Libraries
2. NumPy Arrays and Vectorized Computation
3. Data Analysis with Pandas
4. Data Visualization
5. Time Series
6. Interacting with Databases
7. Data Analysis Application Examples
8. Machine Learning Models with scikit-learn
Index

What this book covers

Chapter 1, Introducing Data Analysis and Libraries, describes the typical steps involved in a data analysis task. In addition, a couple of existing data analysis software packages are described.

Chapter 2, NumPy Arrays and Vectorized Computation, dives right into the core of the PyData ecosystem by introducing the NumPy package for high-performance computing. The basic data structure is a typed multidimensional array which supports various functions, among them typical linear algebra tasks. The data structure and functions are explained along with examples.
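As a rough illustration of what a typed multidimensional array and a linear algebra task look like in NumPy, consider the following minimal sketch. The array values and variable names are made up for this example and do not come from the book.

```python
import numpy as np

# A typed, two-dimensional array: every element is a 64-bit float.
# (Illustrative values, not taken from the book.)
a = np.array([[2.0, 1.0], [1.0, 3.0]], dtype=np.float64)
b = np.array([3.0, 5.0])

# Vectorized arithmetic applies to all elements at once, without a Python loop.
doubled = a * 2

# A typical linear algebra task: solve the linear system a @ x = b.
x = np.linalg.solve(a, b)

print(a.dtype, a.shape)
print(doubled)
print(x)
```

The `dtype` is fixed for the whole array, which is what allows NumPy to store the data compactly and run operations such as `a * 2` in compiled code.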

Chapter 3, Data Analysis with Pandas, introduces a prominent and popular data analysis library for Python called Pandas. It is built on NumPy, but makes a lot of real-world tasks simpler. Pandas comes with its own core data structures, which are explained in detail.

Chapter 4, Data Visualization, focuses on another important aspect of data analysis: the understanding of data through graphical representations. The Matplotlib library is introduced in this chapter. It is one of the most popular 2D plotting libraries for Python and it is well integrated with Pandas.

Chapter 5, Time Series, shows how to work with time-oriented data in Pandas. Date and time handling can quickly become a difficult, error-prone task when implemented from scratch. We show how Pandas can be of great help there, by looking in detail at some of the functions for date parsing and date sequence generation.
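To give a sense of the two kinds of functions mentioned here, the sketch below uses `pd.to_datetime` for date parsing and `pd.date_range` for date sequence generation. The dates and variable names are invented for illustration; the chapter itself may use different functions or examples.

```python
import pandas as pd

# Date parsing: convert date strings into timestamps.
# (Illustrative dates, not taken from the book.)
parsed = pd.to_datetime(["2015-11-01", "2015-11-15"])

# Date sequence generation: five consecutive calendar days.
days = pd.date_range(start="2015-11-01", periods=5, freq="D")

# A Series indexed by dates, a common starting point for time series work.
series = pd.Series(range(5), index=days)

print(parsed)
print(series)
```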

Chapter 6, Interacting with Databases, deals with some typical scenarios. Your data does not live in a vacuum, and it might not always be available as CSV files either. MongoDB is a NoSQL database and Redis is a data structure server, although many people think of it as a key-value store first. Both storage systems are introduced to help you interact with data from real-world systems.

Chapter 7, Data Analysis Application Examples, applies many of the things covered in the previous chapters to deepen your understanding of typical data analysis workflows. How to clean, inspect, reshape, merge, and group data are the concerns of this chapter. The library of choice is again Pandas.
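The workflow steps named above (clean, merge, group) can be sketched in Pandas roughly as follows. The two small tables, their column names, and the values are hypothetical and serve only to make the steps concrete.

```python
import pandas as pd

# Hypothetical example data: orders with one missing amount, plus a lookup table.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, None, 5.0, 7.5],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "country": ["DE", "FR", "DE"],
})

# Clean: drop orders whose amount is missing.
clean = orders.dropna(subset=["amount"])

# Merge: attach each customer's country to the remaining orders.
merged = clean.merge(customers, on="customer_id", how="left")

# Group: total order amount per country.
totals = merged.groupby("country")["amount"].sum()

print(totals)
```

Dropping incomplete rows is only one possible cleaning strategy; filling missing values is another, and which one fits depends on the data.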

Chapter 8, Machine Learning Models with scikit-learn, introduces a popular machine learning package for Python. While it supports dozens of models, we look at only four: two supervised and two unsupervised. Although it is not stated explicitly, this chapter brings together many of the tools covered earlier. Pandas is often used to prepare data for machine learning, and matplotlib is used to create plots that aid understanding.
