Exploratory Data Analysis with Python Cookbook

Over 50 recipes to analyze, visualize, and extract insights from structured and unstructured data

Product type: Paperback
Published: Jun 2023
Publisher: Packt
ISBN-13: 9781803231105
Length: 382 pages
Edition: 1st Edition
Author: Ayodele Oluleye

Table of Contents

Preface
Chapter 1: Generating Summary Statistics
Chapter 2: Preparing Data for EDA
Chapter 3: Visualizing Data in Python
Chapter 4: Performing Univariate Analysis in Python
Chapter 5: Performing Bivariate Analysis in Python
Chapter 6: Performing Multivariate Analysis in Python
Chapter 7: Analyzing Time Series Data in Python
Chapter 8: Analysing Text Data in Python
Chapter 9: Dealing with Outliers and Missing Values
Chapter 10: Performing Automated Exploratory Data Analysis in Python
Index
Other Books You May Enjoy

What this book covers

Chapter 1, Generating Summary Statistics, explores statistical concepts, such as measures of central tendency and variability, that help you summarize and analyze data effectively. It provides practical examples and step-by-step instructions on how to use Python libraries, such as NumPy, Pandas, and SciPy, to compute the mean, median, mode, standard deviation, percentiles, and other critical summary statistics. By the end of the chapter, you will know how to generate summary statistics in Python and will have the foundation needed for the more complex EDA techniques covered in later chapters.
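
As a minimal sketch of the kind of computation this chapter describes (not the book's own recipe), the snippet below derives the statistics named above with Pandas and NumPy. The sample values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, invented for illustration
values = pd.Series([2, 4, 4, 5, 7, 9, 11])

summary = {
    "mean": values.mean(),
    "median": values.median(),
    "mode": values.mode().iloc[0],     # first mode if several values tie
    "std": values.std(),               # sample standard deviation (ddof=1)
    "p25": np.percentile(values, 25),  # 25th percentile
    "p75": np.percentile(values, 75),  # 75th percentile
}
print(summary)
```

For a quick overview of a numerical column, `values.describe()` returns several of these figures in one call.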

Chapter 2, Preparing Data for EDA, focuses on the critical steps required to prepare data for analysis. Real-world data rarely comes in an analysis-ready format, which is why this step is so important in EDA. Through practical examples, you will learn aggregation techniques such as grouping, concatenating, appending, and merging. You will also learn data-cleaning techniques, such as handling missing values, changing data formats, removing records, and replacing records. Lastly, you will learn how to transform data by sorting and categorizing it.

By the end of this chapter, you will have mastered the techniques in Python required for preparing data for EDA.
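
To make these steps concrete, here is a small sketch, assuming Pandas, that fills a missing value, merges two tables, categorizes a column, and then groups and sorts the result. The tables, column names, and bin boundaries are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical tables, invented for illustration
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, None, 30.0, 20.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["north", "south", "north"],
})

# Cleaning: replace the missing amount with the column median
orders["amount"] = orders["amount"].fillna(orders["amount"].median())

# Merging: attach each customer's region to their orders
merged = orders.merge(customers, on="customer_id", how="left")

# Categorizing: label each order by size (bin edges are arbitrary)
merged["size"] = pd.cut(merged["amount"], bins=[0, 15, 100], labels=["small", "large"])

# Grouping and sorting: total amount per region, largest first
totals = merged.groupby("region", as_index=False)["amount"].sum()
totals_sorted = totals.sort_values("amount", ascending=False).reset_index(drop=True)
print(totals_sorted)
```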

Chapter 3, Visualizing Data in Python, covers data visualization tools critical for uncovering hidden trends and patterns in data. It focuses on popular visualization libraries in Python, such as Matplotlib, Seaborn, ggplot, and Bokeh, which are used to create compelling representations of data. It also provides the foundation for subsequent chapters, in which some of these libraries are used. With practical examples and a step-by-step guide, you will learn how to plot charts and customize them to present data effectively. By the end of this chapter, you will be equipped with the knowledge and hands-on experience of Python’s visualization capabilities to uncover valuable insights.
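
As an illustration of plotting and customizing a chart, the following sketch uses Matplotlib, one of the libraries named above. The data and labels are invented, and the non-interactive `Agg` backend is an assumption made so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import matplotlib.pyplot as plt

# Hypothetical monthly figures, invented for illustration
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 128, 150]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, sales, marker="o", color="tab:blue", label="Sales")

# Customization: title, axis labels, and legend
ax.set_title("Monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.legend()
fig.tight_layout()
```

Calling `fig.savefig("monthly_sales.png")` afterwards would write the chart to a file.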

Chapter 4, Performing Univariate Analysis in Python, focuses on essential techniques for analyzing and visualizing a single variable of interest to gain insights into its distribution and characteristics. Through practical examples, it delves into a wide range of visualizations, such as histograms, box plots, bar plots, summary tables, and pie charts, that help you understand the underlying distribution of a single variable and uncover hidden patterns in it. It also covers univariate analysis for both categorical and numerical variables.

By the end of this chapter, you will be equipped with the knowledge and skills required to perform comprehensive univariate analysis in Python to uncover insights.
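
The numbers behind two of these visualizations can be sketched without drawing anything: the bin counts a histogram would show for a numerical variable, and the frequency table a bar plot would show for a categorical one. The example assumes NumPy and Pandas, and the data is invented.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    "age": [22, 25, 31, 35, 38, 41, 47, 52],
    "segment": ["a", "b", "a", "c", "a", "b", "a", "c"],
})

# Numerical variable: counts per bin, as a histogram would display them
counts, bin_edges = np.histogram(df["age"], bins=3)

# Categorical variable: frequency of each category, as a bar plot would display it
freq = df["segment"].value_counts()

print(counts, bin_edges)
print(freq)
```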

Chapter 5, Performing Bivariate Analysis in Python, explores techniques for analyzing the relationship between two variables of interest and uncovering the insights embedded in it. It delves into various techniques, such as correlation analysis, scatter plots, and box plots, that help you understand the relationships, trends, and patterns that exist between two variables. It also explores the bivariate analysis options for different variable combinations, such as numerical-numerical, numerical-categorical, and categorical-categorical. By the end of this chapter, you will have gained the knowledge and hands-on experience required to perform in-depth bivariate analysis in Python.
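
One possible non-graphical counterpart for each of the three variable combinations is sketched below, assuming Pandas: a Pearson correlation, group means, and a cross-tabulation. The dataset and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 64, 70, 73],
    "group": ["x", "x", "x", "y", "y", "y"],
    "passed": ["no", "no", "yes", "yes", "yes", "yes"],
})

# Numerical-numerical: Pearson correlation coefficient
r = df["hours"].corr(df["score"])

# Numerical-categorical: mean of the numerical variable per category
group_means = df.groupby("group")["score"].mean()

# Categorical-categorical: contingency table of counts
table = pd.crosstab(df["group"], df["passed"])

print(round(r, 3))
print(group_means)
print(table)
```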

Chapter 6, Performing Multivariate Analysis in Python, builds on previous chapters and delves into more advanced techniques for gaining insights and identifying complex patterns across multiple variables of interest. Through practical examples, it covers concepts such as clustering analysis, principal component analysis, and factor analysis, which help you understand the interactions among multiple variables. By the end of this chapter, you will have the skills required to apply advanced analysis techniques to uncover hidden patterns in multiple variables.
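
To show what principal component analysis computes, here is a minimal sketch built on NumPy's singular value decomposition rather than on any particular library the chapter may use. The synthetic data is an assumption: the third variable is constructed as a near-exact combination of the first two, so two components should capture almost all of the variance.

```python
import numpy as np

# Synthetic data (assumption): 100 observations of 3 variables,
# where x3 is almost exactly x1 + x2
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.05, size=100)
X = np.column_stack([x1, x2, x3])

# Center each variable, then decompose
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of total variance explained by each principal component
explained_variance_ratio = S**2 / np.sum(S**2)

# Project the observations onto the first two principal components
scores = X_centered @ Vt[:2].T

print(explained_variance_ratio)
```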

Chapter 7, Analyzing Time Series Data, offers a practical guide to analyzing and visualizing time series data. It introduces time series terminology and techniques (such as trend analysis, decomposition, seasonality detection, differencing, and smoothing) and provides practical examples and code showing how to implement them using various Python libraries. It also covers how to spot patterns within time series data to uncover valuable insights. By the end of the chapter, you will be equipped with the skills required to explore, analyze, and derive insights from time series data.
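
Two of the techniques listed above, smoothing and differencing, can be sketched in a few lines with Pandas. The daily series and the 3-day window are invented for illustration.

```python
import pandas as pd

# Hypothetical daily series, invented for illustration
idx = pd.date_range("2023-01-01", periods=8, freq="D")
series = pd.Series([10, 12, 11, 15, 14, 18, 17, 21], index=idx)

# Smoothing: 3-day moving average (the first two values are NaN)
smoothed = series.rolling(window=3).mean()

# Differencing: change from the previous day (the first value is NaN)
differenced = series.diff()

print(smoothed)
print(differenced)
```

The moving average exposes the upward trend, while the differenced series removes it and leaves the day-to-day changes.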

Chapter 8, Analyzing Text Data, covers techniques for analyzing text data, a form of unstructured data. It provides a comprehensive guide on how to analyze and extract insights from text data. Through practical steps, it covers key preprocessing concepts and techniques, such as stop-word removal, tokenization, stemming, and lemmatization. It also covers essential text analysis techniques, such as sentiment analysis, n-gram analysis, topic modeling, and part-of-speech tagging. By the end of this chapter, you will have the skills required to process and analyze various forms of text data to unpack valuable insights.
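
The sketch below illustrates tokenization, stop-word removal, and n-gram counting using only the Python standard library instead of a dedicated NLP package. The stop-word set and the helper names `preprocess` and `ngrams` are hypothetical and deliberately tiny.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list
STOP_WORDS = {"the", "is", "and", "a", "of", "to", "in"}

def preprocess(text):
    """Lowercase, tokenize on letters and apostrophes, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Return the list of consecutive n-token tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))

text = "The data is clean and the data is ready to explore."
tokens = preprocess(text)
bigram_counts = Counter(ngrams(tokens, 2))

print(tokens)
print(bigram_counts.most_common())
```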

Chapter 9, Dealing with Outliers and Missing Values, explores the process of handling outliers and missing values within data. It highlights why dealing with them matters and provides step-by-step instructions on how to handle them using visualization techniques and statistical methods in Python. It also delves into strategies for handling missing values and outliers in different scenarios. By the end of the chapter, you will know the tools and techniques required to handle missing values and outliers in various scenarios.
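
One common statistical approach, shown here only as an example and not necessarily the chapter's own method, is to impute missing values with the median and flag outliers with the 1.5 × IQR rule. The snippet assumes Pandas, and the values are invented.

```python
import pandas as pd

# Hypothetical measurements with one missing value and one extreme value
values = pd.Series([10, 12, 11, 13, None, 12, 95, 11])

# Missing values: impute with the median, which is robust to the extreme value
filled = values.fillna(values.median())

# Outliers: flag points outside 1.5 * IQR beyond the quartiles
q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = filled[(filled < lower) | (filled > upper)]

print(outliers)
```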

Chapter 10, Performing Automated EDA, focuses on speeding up the EDA process through automation. It explores popular automated EDA libraries in Python, such as Pandas Profiling, D-Tale, Sweetviz, and AutoViz. It also provides hands-on guidance on how to build custom functions to automate the EDA process yourself. With step-by-step instructions and practical examples, it shows you how to gain deep insights from data quickly and save time during the EDA process.
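
A custom automation function of the kind mentioned above might look like the following sketch, which assumes Pandas. The function name `quick_eda` and the choice of summary columns are hypothetical, not taken from the book.

```python
import pandas as pd

def quick_eda(df):
    """Return one summary row per column: dtype, missing count, unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN is not counted as a distinct value
    })

# Hypothetical data, invented for illustration
data = pd.DataFrame({"a": [1, 2, None], "b": ["x", "x", "y"]})
report = quick_eda(data)
print(report)
```

Running the same function on every new dataset gives a consistent first look before any deeper analysis.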
