You're reading from Data Augmentation with Python Enhance deep learning accuracy with data augmentation methods for image, text, audio, and tabular data

Product type Paperback

Published in Apr 2023

Publisher Packt

ISBN-13 9781803246451

Length 394 pages

Edition 1st Edition

Languages

Python

Tools

BERT

Concepts

Data Science

Author (1):

Duc Haba

Table of Contents (17) Chapters

Preface

1. Part 1: Data Augmentation

2. Chapter 1: Data Augmentation Made Easy

3. Chapter 2: Biases in Data Augmentation

4. Part 2: Image Augmentation

5. Chapter 3: Image Augmentation for Classification

6. Chapter 4: Image Augmentation for Segmentation

7. Part 3: Text Augmentation

8. Chapter 5: Text Augmentation

9. Chapter 6: Text Augmentation with Machine Learning

10. Part 4: Audio Data Augmentation

11. Chapter 7: Audio Data Augmentation

12. Chapter 8: Audio Data Augmentation with Spectrogram

13. Part 5: Tabular Data Augmentation

14. Chapter 9: Tabular Data Augmentation

15. Index

16. Other Books You May Enjoy

Data Augmentation Made Easy

Data augmentation is essential for developing a successful deep learning (DL) project. However, data scientists and developers often overlook this crucial step. It is no secret that in a real-world DL project, you will spend the majority of your time gathering, cleaning, and augmenting the dataset. Thus, learning how to expand the dataset without purchasing new data is essential. This book covers standard and advanced techniques for extending image, text, audio, and tabular datasets. You will also learn about data biases and how to write the code in Jupyter Python Notebooks.

Chapter 1 introduces the core data augmentation concepts, sets up the coding environment, and creates the foundation class. Later chapters explain each technique in detail, with accompanying Python code. The effective use of data augmentation has proven to be the deciding factor between success and failure in machine learning (ML). Many real-world ML projects stay in the conceptual phase because there is insufficient data for training the ML model. Data augmentation is a cost-effective technique that can increase the size of the dataset, lower the training error rate, and produce more accurate predictions and forecasts.

Fun fact

The car gasoline analogy is helpful for students who are first learning about data augmentation and artificial intelligence (AI). You can think of data as the gasoline for the AI engine and data augmentation as the additive, such as the Chevron Techron fuel cleaner, that makes your car engine run faster, smoother, and further without extra gasoline.

In this chapter, we’ll define the role of data augmentation and the limits on extending data without compromising its integrity. We’ll briefly discuss the different types of input data, such as image, text, audio, and tabular data, and the challenges in supplementing each of them. Finally, we’ll set up the system requirements and the programming style in the accompanying Python Notebook.

I designed this book to be a hands-on journey. It will be most effective to read a chapter, run the code, re-read the part of the chapter that confused you, and jump back to hacking the code until you firmly understand the concept or technique that was presented.

You are encouraged to change or add new code to the Python Notebook. The primary purpose of this book is interactive learning. So, if something goes wrong, download a fresh copy from the book's GitHub repository. The surest way to learn is to make mistakes and create something new.

Data augmentation is an iterative process. There is no fixed recipe. In other words, depending on the dataset, you select augmentation functions and adjust their parameters until the results are acceptable. A subject domain expert may provide insight into how much distortion is acceptable. By the end of this chapter, you will know the general rules for data augmentation, what types of input data can be augmented, the programming style, and how to set up a Python Notebook online or offline.
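To make the iterative idea concrete, here is a minimal sketch in plain Python. It is not the foundation class built later in this book. The function names (`flip_horizontal`, `shift_brightness`, `augment`), the `max_delta` parameter, and the tiny nested-list "images" are all hypothetical and chosen only for illustration. The sketch shows how a dataset can grow without new data, and how a single parameter controls the amount of distortion you would tune for your dataset.

```python
import random


def flip_horizontal(image):
    """Mirror a 2D grayscale image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def shift_brightness(image, delta):
    """Add delta to every pixel, clamping the result to the 0-255 range."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


def augment(dataset, max_delta, seed=0):
    """Return each original image plus a flipped copy and a brightness-shifted copy.

    max_delta is the assumed distortion limit: a larger value allows
    stronger brightness changes. The seed keeps the run repeatable.
    """
    rng = random.Random(seed)
    augmented = []
    for image in dataset:
        delta = rng.randint(-max_delta, max_delta)
        augmented.append(image)
        augmented.append(flip_horizontal(image))
        augmented.append(shift_brightness(image, delta))
    return augmented


# Two made-up 2x3 grayscale images stand in for a real dataset.
tiny_dataset = [
    [[0, 50, 100], [150, 200, 250]],
    [[10, 20, 30], [40, 50, 60]],
]

# Try two distortion limits and compare the results, as you would when tuning.
for max_delta in (10, 40):
    expanded = augment(tiny_dataset, max_delta)
    print(max_delta, len(expanded))
```

In practice, you would inspect the augmented samples after each run, or ask a domain expert to do so, and then raise or lower the distortion limit before trying again.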

In particular, this chapter covers the following primary topics:

Data augmentation role
Data input types
Python Notebook
Programming styles

Let’s start with the data augmentation role.
