Data Augmentation with Python

Enhance deep learning accuracy with data augmentation methods for image, text, audio, and tabular data

Product type: Paperback
Published: Apr 2023
Publisher: Packt
ISBN-13: 9781803246451
Length: 394 pages
Edition: 1st Edition
Author: Duc Haba

Table of Contents

Preface
Part 1: Data Augmentation
Chapter 1: Data Augmentation Made Easy
Chapter 2: Biases in Data Augmentation
Part 2: Image Augmentation
Chapter 3: Image Augmentation for Classification
Chapter 4: Image Augmentation for Segmentation
Part 3: Text Augmentation
Chapter 5: Text Augmentation
Chapter 6: Text Augmentation with Machine Learning
Part 4: Audio Data Augmentation
Chapter 7: Audio Data Augmentation
Chapter 8: Audio Data Augmentation with Spectrogram
Part 5: Tabular Data Augmentation
Chapter 9: Tabular Data Augmentation
Index
Other Books You May Enjoy

What this book covers

Chapter 1, Data Augmentation Made Easy, is an introduction to data augmentation. Readers will learn what data augmentation is, which data types it applies to, and what benefits it brings. They will also learn how to choose an online Jupyter Python Notebook environment or install one locally. Chapter 1 concludes with a discussion of coding conventions, GitHub access, and the foundational object-oriented class, named Pluto.

Chapter 2, Biases in Data Augmentation, defines computational, human, and systemic biases, with plenty of real-world examples to illustrate the differences between them. To reinforce their learning, readers will practice identifying data biases by downloading three real-world image datasets and two text datasets from the Kaggle website. Once the datasets are downloaded, readers will learn how to display image and text batches and discuss potential biases in the data.

Chapter 3, Image Augmentation for Classification, has two parts. First, readers will learn the concepts and techniques of augmentation for image classification. Second, they will work through hands-on Python coding, with a detailed explanation of each image augmentation method and its safe level of image distortion. By the end of this chapter, readers will have applied these techniques in Python to six real-world image datasets. In addition, they will examine several open-source Python libraries for image augmentation and write Python wrapper functions using the chosen libraries.

Chapter 4, Image Augmentation for Segmentation, highlights that both Image Segmentation and Image Classification are critical components of the Computer Vision domain. Image Segmentation, also known as pixel-level classification, groups the parts of an image that belong to the same object. Unlike Image Classification, which identifies and predicts the subject or label of a photo, Image Segmentation determines which object or tag, from a given list, each pixel belongs to. The image augmentation methods for segmentation and classification are the same, except that segmentation comes with an additional mask or ground-truth image. Chapter 4 continues with the Geometric and Photometric transformations, this time applied to Image Segmentation.
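
The point that segmentation augmentation must handle the mask alongside the image can be illustrated with a minimal sketch. It is not the book's code: it uses plain nested lists rather than an image library, and the function names `flip_horizontal` and `flip_pair_horizontal` are hypothetical, chosen only for this illustration.

```python
def flip_horizontal(grid):
    """Return a copy of a 2-D grid (a list of rows) mirrored left to right."""
    return [row[::-1] for row in grid]


def flip_pair_horizontal(image, mask):
    """Apply the same geometric transformation to an image and its mask.

    Hypothetical helper: both inputs are lists of rows with identical shape.
    """
    same_height = len(image) == len(mask)
    same_widths = all(len(a) == len(b) for a, b in zip(image, mask))
    if not (same_height and same_widths):
        raise ValueError("image and mask must have the same shape")
    return flip_horizontal(image), flip_horizontal(mask)


# A tiny 2x3 "image" of pixel intensities and its ground-truth mask,
# where 1 marks pixels that belong to the object.
image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 0, 1],
        [0, 1, 1]]

aug_image, aug_mask = flip_pair_horizontal(image, mask)
```

After the flip, the object pixels in `aug_mask` still line up with the same intensities in `aug_image`. A photometric change such as brightness would typically be applied to the image only, since the mask stores labels rather than colors.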

Chapter 5, Text Augmentation, explores text augmentation, a technique used in natural language processing (NLP) to generate additional data by modifying existing text or creating new text from it. Text augmentation can involve techniques such as character swapping, noise injection, synonym replacement, word deletion, word insertion, and word swapping. Image and text augmentation have the same goal: to increase the size of the training dataset and improve AI prediction accuracy. In Chapter 5, you will learn about text augmentation and how to code the methods in the Python Notebooks.
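
Three of the techniques named above (word swapping, word deletion, and character swapping) can be sketched with the Python standard library alone. This is a minimal illustration under simple assumptions, not the book's implementation: words are split on whitespace, and the function names and the sample sentence are hypothetical.

```python
import random


def swap_words(text, rng):
    """Swap one randomly chosen pair of adjacent words."""
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def delete_word(text, rng):
    """Delete one randomly chosen word."""
    words = text.split()
    if len(words) < 2:
        return text
    del words[rng.randrange(len(words))]
    return " ".join(words)


def swap_characters(text, rng):
    """Swap one randomly chosen pair of adjacent characters."""
    if len(text) < 2:
        return text
    chars = list(text)
    i = rng.randrange(len(chars) - 1)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


# A seeded generator keeps the augmented samples reproducible.
rng = random.Random(42)
sentence = "data augmentation increases the size of the training set"
augmented = [
    swap_words(sentence, rng),
    delete_word(sentence, rng),
    swap_characters(sentence, rng),
]
```

Each call yields one new training sample from the original sentence. Passing the random generator in explicitly is a design choice for this sketch, so that the same seed always produces the same augmented dataset.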

Chapter 6, Text Augmentation with Machine Learning, discusses an advanced technique that aims to improve ML model accuracy. Interestingly, the technique uses a pre-trained ML model to create additional NLP training data, so ML is used to improve ML, which makes the process circular. Although ML coding is beyond the scope of this book, understanding the difference between using libraries and using ML for text augmentation can be beneficial.

Chapter 7, Audio Data Augmentation, explains that, as with image and text augmentation, the objective of audio augmentation is to extend the dataset in order to gain higher forecast or prediction accuracy in a Generative AI system. Audio augmentation is a cost-effective and viable option when acquiring additional audio files is expensive or time-consuming. Writing about audio augmentation methods poses unique challenges. The first is that audio is not visual like images or text. In an audiobook, web page, or mobile app, we could simply play the sound, but the medium here is paper. Thus, we will transform the audio signal into a visual representation. Chapter 7 will cover audio augmentation using Waveform transformation. You can play the audio files in the Python Notebook.

Chapter 8, Audio Data Augmentation with Spectrogram, builds on the previous chapter's topic of audio augmentation by exploring additional visualization methods beyond the Waveform graph. An audio spectrogram is another way to visualize the components of an audio signal. Its inputs are a one-dimensional array of amplitude values and the sampling rate, the same inputs as for the Waveform graph. Audio spectrograms are sometimes called sonographs, sonagrams, voiceprints, or voicegrams, and they are typically used for music, human speech, and sonar. A short standard definition is a map of the frequency spectrum over time. In other words, the Y-axis is the frequency in Hz or kHz, and the X-axis is the time in seconds or milliseconds. Chapter 8 will cover the standard spectrogram format, variations of the spectrogram, the Mel-spectrogram, the Chroma Short-time Fourier transform (STFT), and augmentation techniques.
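
To make the spectrogram's inputs and axes concrete, here is a minimal sketch that turns a one-dimensional list of amplitude values and a sampling rate into a grid of magnitudes over time and frequency. It is an assumption-laden illustration, not the book's code: it uses a naive discrete Fourier transform in pure Python to stay dependency-free, a Hann window, and hypothetical names and parameters (`spectrogram`, `frame_size`, `hop`). Real projects would normally rely on an audio library instead.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_size=64, hop=32):
    """Return (times, freqs, magnitudes) for a 1-D list of amplitudes.

    magnitudes[t][k] is the DFT magnitude of the frame starting at
    times[t] seconds, at frequency freqs[k] Hz.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep non-negative frequencies only
    freqs = [k * sample_rate / frame_size for k in range(n_bins)]
    times, magnitudes = [], []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(n_bins)
        ]
        times.append(start / sample_rate)  # X-axis: time in seconds
        magnitudes.append(row)             # Y-axis: one value per frequency
    return times, freqs, magnitudes


# One second of a pure 100 Hz tone sampled at 800 Hz (values chosen for the demo).
sample_rate = 800
tone_hz = 100
samples = [math.sin(2 * math.pi * tone_hz * n / sample_rate)
           for n in range(sample_rate)]

times, freqs, mags = spectrogram(samples, sample_rate)
# The strongest frequency bin in the first frame should match the tone.
peak_bin = max(range(len(freqs)), key=lambda k: mags[0][k])
peak_hz = freqs[peak_bin]
```

Plotting `mags` with `times` on the X-axis and `freqs` on the Y-axis would give the standard spectrogram picture; for this single tone, it would appear as one horizontal band at `peak_hz`.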

Chapter 9, Tabular Data Augmentation, involves taking data from a database, spreadsheet, or table format and extending it for the AI training cycle. The goal is to increase the accuracy of prediction or forecast, which is the same as for image, text, and audio augmentation. Tabular augmentation is a relatively new field for data scientists. It runs counter to the use of analytics for reporting, summarizing, or forecasting. In analytics, altering or adding data to skew the results toward a preconceived outcome is unethical. In data augmentation, the purpose is to derive new data from an existing dataset. The two goals seem incongruent, but they are not. There will be a slight departure from the format of the image, text, and audio augmentation chapters: we will spend more time in Python code studying the real-world tabular dataset.
