9.1 Description
We need to build an application that validates, cleans, and standardizes data. A data inspection notebook is a handy starting point for this design work. The goal is a fully automated application that reflects the lessons learned from inspecting the data.
A data preparation pipeline has the following conceptual tasks:
Validate the acquired source text to be sure it’s usable and to mark invalid data for remediation.
Clean any invalid raw data where necessary; this expands the available data in those cases where a sensible cleaning rule can be defined.
Convert the validated and cleaned source data from text (or bytes) to usable Python objects.
Where necessary, standardize codes or value ranges in the source data. The requirements here vary with the problem domain.
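The four tasks above can be sketched as a small pipeline of functions. This is a minimal illustration, not the book's implementation: the row shape, the field names date and amount, and the standardization mapping are all hypothetical assumptions chosen to make the stages concrete.

```python
# Sketch of the four conceptual tasks: validate, clean, convert, standardize.
# The row layout ({"date": ..., "amount": ...}) is an illustrative assumption.
import datetime
from decimal import Decimal


def validate(row: dict[str, str]) -> bool:
    """Mark rows missing required fields as invalid (for later remediation)."""
    return all(row.get(key, "").strip() for key in ("date", "amount"))


def clean(row: dict[str, str]) -> dict[str, str]:
    """Repair raw text where a sensible fix exists, e.g. stray whitespace."""
    return {key: value.strip() for key, value in row.items()}


def convert(row: dict[str, str]) -> dict[str, object]:
    """Turn validated, cleaned text into usable Python objects."""
    return {
        "date": datetime.date.fromisoformat(row["date"]),
        "amount": Decimal(row["amount"]),
    }


def standardize(row: dict[str, object]) -> dict[str, object]:
    """Apply domain-specific recoding; this mapping is purely hypothetical."""
    country_to_currency = {"US": "USD", "GB": "GBP"}  # illustrative only
    if "country" in row:
        row["currency"] = country_to_currency.get(row["country"], "???")
    return row


def prepare(raw: list[dict[str, str]]) -> list[dict[str, object]]:
    """Run the pipeline; rows failing validation are simply excluded here."""
    cleaned = (clean(row) for row in raw if validate(row))
    return [standardize(convert(row)) for row in cleaned]
```

In a real application the rows rejected by validate() would be logged or routed to a remediation file rather than silently dropped; that detail is omitted to keep the sketch short.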
The goal is to create clean, standardized data for subsequent analysis. Surprises occur all the time, and they come from several sources:
Technical problems with file formats of the upstream software. The intent of the acquisition...