Data Exploration and Preparation with BigQuery

You're reading from Data Exploration and Preparation with BigQuery: A practical guide to cleaning, transforming, and analyzing data for business insights

Product type: Paperback
Published in: Nov 2023
Publisher: Packt
ISBN-13: 9781805125266
Length: 264 pages
Edition: 1st Edition
Author (1):
Mike Kahn

Table of Contents (21 chapters)

Preface
Part 1: Introduction to BigQuery
Chapter 1: Introducing BigQuery and Its Components
Chapter 2: BigQuery Organization and Design
Part 2: Data Exploration with BigQuery
Chapter 3: Exploring Data in BigQuery
Chapter 4: Loading and Transforming Data
Chapter 5: Querying BigQuery Data
Chapter 6: Exploring Data with Notebooks
Chapter 7: Further Exploring and Visualizing Data
Part 3: Data Preparation with BigQuery
Chapter 8: An Overview of Data Preparation Tools
Chapter 9: Cleansing and Transforming Data
Chapter 10: Best Practices for Data Preparation, Optimization, and Cost Control
Part 4: Hands-On and Conclusion
Chapter 11: Hands-On Exercise – Analyzing Advertising Data
Chapter 12: Hands-On Exercise – Analyzing Transportation Data
Chapter 13: Hands-On Exercise – Analyzing Customer Support Data
Chapter 14: Summary and Future Directions
Index
Other Books You May Enjoy

Assessing dataset integrity

Dataset integrity refers to the quality and consistency of the data within a dataset: the assurance that the data is accurate, complete, reliable, and free from errors or inconsistencies. Understanding your data’s integrity tells you how usable the data is and how much cleansing or transformation it will need. A dataset with poor integrity can lead to incorrect analysis, inaccurate reports, and misinformed business decisions. In this section, we will discuss techniques and considerations for assessing dataset integrity in BigQuery.
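
As a rough illustration of what a first integrity check might look like, the sketch below builds a BigQuery SQL string that counts total rows, null values per column, and repeated key combinations. It is not taken from the book: the helper name build_profile_query, the table my_project.my_dataset.orders, and the column names are hypothetical, and the function only produces the query text, so running it against BigQuery is left to the reader.

```python
def build_profile_query(table: str, columns: list[str], key_columns: list[str]) -> str:
    """Return a BigQuery SQL string that profiles nulls and repeated keys.

    All names passed in are assumptions for illustration; nothing here
    connects to BigQuery or validates that the table exists.
    """
    if not columns or not key_columns:
        raise ValueError("columns and key_columns must not be empty")

    # One COUNTIF per column: how many rows have a NULL in that column.
    null_checks = ",\n  ".join(
        f"COUNTIF({col} IS NULL) AS {col}_nulls" for col in columns
    )

    # Serialize the key columns to one string so that multi-column keys
    # (and NULL key values) can be compared with COUNT(DISTINCT ...).
    key_expr = ", ".join(key_columns)

    return (
        "SELECT\n"
        "  COUNT(*) AS total_rows,\n"
        f"  {null_checks},\n"
        "  COUNT(*) - COUNT(DISTINCT TO_JSON_STRING(STRUCT("
        f"{key_expr}))) AS duplicate_keys\n"
        f"FROM `{table}`"
    )


if __name__ == "__main__":
    # Hypothetical table and columns, used only to show the generated SQL.
    print(
        build_profile_query(
            "my_project.my_dataset.orders",
            columns=["order_id", "amount"],
            key_columns=["order_id"],
        )
    )
```

A nonzero value in any of the *_nulls columns or in duplicate_keys would suggest the table needs cleansing before analysis; which columns count as keys is a judgment that depends on the dataset.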

The shape of the dataset

Understanding your dataset’s shape helps you form a baseline expectation for the quality of results you will receive from queries. Consider Figure 9.3. Your dataset may be taller than it is wide, meaning it has many rows and few columns. This may present a situation where you want to join...
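
One way to make the idea of shape concrete is to compare a table's row count with its column count. The following is a minimal sketch under stated assumptions, not the book's method: build_shape_queries and describe_shape are hypothetical helpers, the project, dataset, and table names are placeholders, and the "tall", "wide", and "square" labels are a simplification chosen for this example. The column count is read from the dataset's INFORMATION_SCHEMA.COLUMNS view.

```python
def build_shape_queries(project: str, dataset: str, table: str) -> tuple[str, str]:
    """Return (row_query, column_query) SQL strings for a BigQuery table.

    The identifiers are placeholders; this function only builds text.
    """
    row_query = (
        f"SELECT COUNT(*) AS row_count FROM `{project}.{dataset}.{table}`"
    )
    # INFORMATION_SCHEMA.COLUMNS lists one row per column of each table
    # in the dataset, so counting its rows gives the column count.
    column_query = (
        "SELECT COUNT(*) AS column_count "
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS` "
        f"WHERE table_name = '{table}'"
    )
    return row_query, column_query


def describe_shape(row_count: int, column_count: int) -> str:
    """Label a table as 'tall', 'wide', or 'square' from its dimensions."""
    if row_count < 0 or column_count < 0:
        raise ValueError("counts must be non-negative")
    if row_count > column_count:
        return "tall"   # many rows, comparatively few columns
    if column_count > row_count:
        return "wide"   # many columns, comparatively few rows
    return "square"


if __name__ == "__main__":
    rows_sql, cols_sql = build_shape_queries("my_project", "my_dataset", "orders")
    print(rows_sql)
    print(cols_sql)
    # Example counts are made up to show the classification only.
    print(describe_shape(1_000_000, 5))
```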
