CompTIA Data+: DAO-001 Certification Guide

Complete coverage of the new CompTIA Data+ (DAO-001) exam to help you pass on the first attempt

Product type: Paperback
Published in: Dec 2022
Publisher: Packt
ISBN-13: 9781804616086
Length: 370 pages
Edition: 1st Edition
Author: Cameron Dodd

Table of Contents (24 chapters)

Preface
1. Part 1: Preparing Data
2. Chapter 1: Introduction to CompTIA Data+
3. Chapter 2: Data Structures, Types, and Formats
4. Chapter 3: Collecting Data
5. Chapter 4: Cleaning and Processing Data
6. Chapter 5: Data Wrangling and Manipulation
7. Part 2: Analyzing Data
8. Chapter 6: Types of Analytics
9. Chapter 7: Measures of Central Tendency and Dispersion
10. Chapter 8: Common Techniques in Descriptive Statistics
11. Chapter 9: Hypothesis Testing
12. Chapter 10: Introduction to Inferential Statistics
13. Part 3: Reporting Data
14. Chapter 11: Types of Reports
15. Chapter 12: Reporting Process
16. Chapter 13: Common Visualizations
17. Chapter 14: Data Governance
18. Chapter 15: Data Quality and Management
19. Part 4: Mock Exams
20. Chapter 16: Practice Exam One
21. Chapter 17: Practice Exam Two
22. Index
23. Other Books You May Enjoy

Cleaning and Processing Data

Occasionally, you will receive data that is already clean, neat, and ready to use, but an immaculate dataset is the exception, not the rule. More often, the datasets you receive as a data analyst will be messy, incomplete, and unusable without some work. Jumbled data gives jumbled results. This chapter covers the most common issues you will come across and a few approaches to dealing with them.

Here, we will discuss the difference between duplicate data and redundant data, as well as what to do about each. Then, we will discuss why missing data is an issue and the different approaches you can take to deal with it. We will briefly cover invalid data, mismatched data, and data type validation. After that, we will discuss non-parametric data: what it is and how to approach it. Finally, we will discuss outliers, or data points that don’t seem to fit in with...
