You're reading from Natural Language Processing with TensorFlow: The definitive NLP book to implement the most sought-after machine learning models and tasks

Product type Paperback

Published in Jul 2022

Publisher Packt

ISBN-13 9781838641351

Length 514 pages

Edition 2nd Edition

Author (1):

Thushan Ganegedara

Our data

First, we will discuss the data used for text generation and the preprocessing steps employed to clean it.

About the dataset

To begin, we will look at what the dataset contains so that, when we see the generated text, we can assess whether it makes sense given the training data. We will download the first 100 books from the website https://www.cs.cmu.edu/~spok/grimmtmp/. These are English translations of a set of German books by the Brothers Grimm.

Initially, we will download all 209 books from the website with an automated script, as follows:

import os

url = 'https://www.cs.cmu.edu/~spok/grimmtmp/'
dir_name = 'data'

def download_data(url, filename, download_dir):
    """Download a file if not present, and make sure it's the right
    size."""

    # Create the download directory if it doesn't exist
    os.makedirs(download_dir, exist_ok=True)

    # If file doesn't exist download...
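
As a rough illustration of the download-if-missing idea described in the docstring above, here is a minimal sketch. It is not the book's own implementation: the names download_if_missing, story_filenames, and the fetch parameter are hypothetical, the 001.txt, 002.txt, ... naming pattern is an assumption about how the files on the site are named, and the size check mentioned in the docstring is left out.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(base_url, filename, download_dir, fetch=urlretrieve):
    """Download base_url + filename into download_dir unless it is already there.

    Returns the local path of the file. `fetch` is any callable that takes
    (url, destination_path); it defaults to urllib's urlretrieve.
    """
    # Create the target directory if it doesn't exist
    os.makedirs(download_dir, exist_ok=True)

    local_path = os.path.join(download_dir, filename)

    # Only go to the network when the file is not already on disk
    if not os.path.exists(local_path):
        fetch(base_url + filename, local_path)

    return local_path


def story_filenames(count):
    """Assumed naming scheme (hypothetical): 001.txt, 002.txt, ..."""
    return [f"{i:03d}.txt" for i in range(1, count + 1)]
```

Under these assumptions, calling download_if_missing(url, name, dir_name) for each name in story_filenames(209) would fetch the whole collection once and skip files that already exist on later runs. Passing the fetch function as a parameter keeps the helper easy to exercise without network access.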
