You're reading from Data Wrangling with R: Load, explore, transform and visualize data for modeling with tidyverse libraries

Product type Paperback

Published in Feb 2023

Publisher Packt

ISBN-13 9781803235400

Length 384 pages

Edition 1st Edition

Author (1):

Gustavo Santos

Table of Contents (21) Chapters

Preface

1. Part 1: Load and Explore Data

2. Chapter 1: Fundamentals of Data Wrangling

3. Chapter 2: Loading and Exploring Datasets

4. Chapter 3: Basic Data Visualization

5. Part 2: Data Wrangling

6. Chapter 4: Working with Strings

7. Chapter 5: Working with Numbers

8. Chapter 6: Working with Date and Time Objects

9. Chapter 7: Transformations with Base R

10. Chapter 8: Transformations with Tidyverse Libraries

11. Chapter 9: Exploratory Data Analysis

12. Part 3: Data Visualization

13. Chapter 10: Introduction to ggplot2

14. Chapter 11: Enhanced Visualizations with ggplot2

15. Chapter 12: Other Data Visualization Options

16. Part 4: Modeling

17. Chapter 13: Building a Model with R

18. Chapter 14: Build an Application with Shiny in R

19. Conclusion

20. Other Books You May Enjoy

Fundamentals of Data Wrangling

The relationship between humans and data is age-old. Because our brains can capture and store only a limited amount of information, we had to create ways to record and organize data.

The earliest known attempt to record data goes back to around 19,000 BC (as stated in https://www.thinkautomation.com/histories/the-history-of-data/), when a bone is believed to have been used as a tally stick, with marks engraved on it to count things and keep information. Since then, words, writing, numbers, and many other forms of data collection have been developed.

In 1663, John Graunt performed one of the first recognized data analyses, studying births and deaths by gender in the city of London, England.

In 1928, Fritz Pfleumer received a patent for magnetic tape, a medium for recording sound that paved the way for many of the storage technologies still in use today, such as hard disk drives.

Fast forward to the 1970s, at the beginning of the computer age, when IBM researchers Raymond Boyce and Donald Chamberlin created the Structured Query Language (SQL) for accessing and modifying data held in databases. The language is still in use, and many data-wrangling concepts come from it. Operations such as SELECT, WHERE, GROUP BY, and JOIN show up in almost any work you perform with datasets. Therefore, a little knowledge of those basic commands will help you throughout this book, although it is not mandatory.
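To make the connection between those SQL commands and data wrangling in R concrete, here is a minimal sketch using the dplyr package from the tidyverse. The `orders` and `customers` data frames, their columns, and their values are hypothetical, invented only for this illustration. The SQL query in the comment is one possible equivalent, not the only way to write it.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
orders <- data.frame(
  order_id    = 1:5,
  customer_id = c(1, 2, 1, 3, 2),
  amount      = c(20, 35, 15, 50, 10)
)
customers <- data.frame(
  customer_id = c(1, 2, 3),
  country     = c("BR", "US", "BR")
)

# Roughly equivalent SQL:
#   SELECT country, SUM(amount) AS total
#   FROM orders JOIN customers USING (customer_id)
#   WHERE amount > 10
#   GROUP BY country
totals <- orders %>%
  inner_join(customers, by = "customer_id") %>%         # JOIN
  filter(amount > 10) %>%                               # WHERE
  group_by(country) %>%                                 # GROUP BY
  summarise(total = sum(amount), .groups = "drop") %>%  # SUM(...) per group
  select(country, total)                                # SELECT

print(totals)
```

Each dplyr verb handles one step of the query, so a reader who knows SQL can follow the pipeline from top to bottom. Note that SQL writes SELECT first, while the pipeline applies `select()` last, once the columns it needs exist.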

In this chapter, we will cover the following main topics:

What is data wrangling?
Why data wrangling?
The key steps of data wrangling
