In this chapter, we'll use the Internet Movie Database (IMDb) movie review data that is available through the Keras package. There is no need to download this data separately, as it can be accessed directly from the Keras library using code that we will discuss soon. The dataset is also preprocessed: each review has already been converted into a sequence of integers. Text cannot be used directly for model building, so converting it into sequences of integers is a necessary step before the data can serve as input to a deep learning network.
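To make the idea of converting text into a sequence of integers concrete, here is a minimal base R sketch on a toy corpus. It is an illustration under stated assumptions, not the procedure used to build the IMDb dataset: the reviews, the encode helper, and the choice of 0 for words outside the vocabulary are all hypothetical, and the real dataset follows its own indexing conventions.

```r
# Toy corpus standing in for real reviews (hypothetical data, not from IMDb).
reviews <- c("the movie was great",
             "the plot was dull",
             "great cast and great plot")

# Split each review into lowercase word tokens.
tokens <- strsplit(tolower(reviews), " ", fixed = TRUE)

# Rank words by frequency so that the most frequent word gets index 1.
freq <- sort(table(unlist(tokens)), decreasing = TRUE)
word_index <- setNames(seq_along(freq), names(freq))

# Keep only the num_words most frequent words; map all other words to 0.
# (Using 0 for out-of-vocabulary words is a simplification for this sketch.)
num_words <- 3
encode <- function(words) {
  idx <- unname(word_index[words])
  ifelse(idx <= num_words, idx, 0L)
}

# Each review is now an integer vector, one integer per word.
sequences <- lapply(tokens, encode)
print(sequences)
```

Capping the vocabulary at a small number of frequent words keeps the set of distinct integers small, at the cost of collapsing rarer words into a single placeholder value.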
We will start by loading the IMDb data with the dataset_imdb function, using the num_words argument to keep only the 500 most frequent words. Then, we'll split the data into train and test datasets. Let's take a look at the...