2.2 Acquisition via Extract
Since data formats are in a constant state of flux, it’s helpful to understand how to add and modify data formats. These projects will all build on Project 1.1 by adding features to the base application. The following projects are designed around alternative sources for data:
Project 1.2: “Acquire Web Data from an API”. This project will acquire data from web services using the JSON format.
Project 1.3: “Acquire Web Data from HTML”. This project will acquire data from a web page by scraping the HTML. A sketch of both approaches follows this list.
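To make these two acquisition styles concrete, here is a minimal sketch using only the standard library. The URL passed in, the assumption of UTF-8 encoding, and the focus on <td> cells are all illustrative choices, not the specifics of the projects themselves:

```python
import json
import urllib.request
from html.parser import HTMLParser


def acquire_json(url: str) -> list:
    """Fetch a JSON document from a web service and parse it."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())


class CellExtractor(HTMLParser):
    """Collect the text content of <td> cells from an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.in_cell = False
        self.cells: list[str] = []

    def handle_starttag(self, tag: str, attrs) -> None:
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag: str) -> None:
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data: str) -> None:
        if self.in_cell and data.strip():
            self.cells.append(data.strip())


def acquire_html(url: str) -> list[str]:
    """Fetch an HTML page and scrape the table cells from it."""
    with urllib.request.urlopen(url) as response:
        parser = CellExtractor()
        parser.feed(response.read().decode("utf-8"))
    return parser.cells
```

A production scraper would likely use a more forgiving parser such as Beautiful Soup; the standard-library HTMLParser is enough to show the overall shape of the work.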
Two separate projects are part of gathering data from a SQL database:
Project 1.4: “Build a Local Database”. This sidebar project builds a local SQL database. It’s needed because publicly accessible SQL databases are a rarity; it’s more secure to build our own demonstration database.
Project 1.5: “Acquire Data from a Local Database”. Once the database is available, we can acquire data from a SQL extract. A sketch of both steps follows this list.
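As a sketch of this pair of projects, the following uses SQLite from the standard library. The file name, table name, columns, and sample rows are illustrative assumptions, not the schema Project 1.4 will define:

```python
import sqlite3


def build_demo_database(path: str) -> None:
    """Project 1.4 idea: build a small local database to extract from."""
    connection = sqlite3.connect(path)
    with connection:  # commits on success, rolls back on error
        connection.execute("CREATE TABLE IF NOT EXISTS series (x REAL, y REAL)")
        connection.executemany(
            "INSERT INTO series (x, y) VALUES (?, ?)",
            [(10.0, 8.04), (8.0, 6.95), (13.0, 7.58)],
        )
    connection.close()


def extract_rows(path: str) -> list[tuple]:
    """Project 1.5 idea: acquire rows from a SQL extract."""
    connection = sqlite3.connect(path)
    try:
        return connection.execute("SELECT x, y FROM series").fetchall()
    finally:
        connection.close()
```

Because SQLite ships with Python and keeps the database in a single local file, it avoids the security and access problems of a public server while still exercising real SQL.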
These projects will focus on data represented as text. For CSV files, the data is already text; an application must convert it to a more useful Python type. HTML pages are also pure text, though markup attributes sometimes suggest that the text should be treated as a number. A SQL database, by contrast, is often populated with non-text data; to be consistent, the SQL data should be serialized as text. The acquisition applications all share a common approach of working with text.
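A small sketch of that text-first approach, with hypothetical field names: the acquisition step keeps every value as a string, and a later processing stage applies the conversions.

```python
# As acquired from CSV, HTML, or a SQL extract: everything is text.
raw_row = {"x": "10.0", "y": "8.04"}

# A later stage converts text to useful Python types.
clean_row = {name: float(text) for name, text in raw_row.items()}
print(clean_row)  # {'x': 10.0, 'y': 8.04}
```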
These applications will also minimize the transformations applied to the source data. To process the data consistently, it’s helpful to shift to a common format. As we’ll see in Chapter 3, Project 1.1: Data Acquisition Base Application, the NDJSON format provides a useful structure that can often be mapped back to the source files.
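NDJSON (newline-delimited JSON) keeps one complete JSON object on each line, so a file can be appended to, processed one row at a time, and each line traced back to a source row. A minimal sketch, assuming a hypothetical file name:

```python
import json
from pathlib import Path

rows = [{"x": "10.0", "y": "8.04"}, {"x": "8.0", "y": "6.95"}]

# Write: one JSON document per line.
with Path("quartet.ndjson").open("w") as target:
    for row in rows:
        target.write(json.dumps(row) + "\n")

# Read: each line parses independently, so a bad line can be
# reported with its line number and mapped back to the source.
with Path("quartet.ndjson").open() as source:
    recovered = [json.loads(line) for line in source]
assert recovered == rows
```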
After acquiring new data, it’s prudent to do a manual inspection. This is often done a few times at the start of application development. After that, inspection is only done to diagnose problems with the source data. The next few chapters will cover projects to inspect data.