What this book covers
This book includes fifteen chapters that take you through a process that starts with understanding what NLU is, moves through selecting applications and developing systems, and ends with figuring out how to improve a system you have developed.
Chapter 1, Natural Language Understanding, Related Technologies, and Natural Language Applications, provides an explanation of what NLU is, and how it differs from related technologies such as speech recognition.
Chapter 2, Identifying Practical Natural Language Understanding Problems, systematically goes through a wide range of potential applications of NLU and reviews the specific requirements of each type of application. It also reviews aspects of an application that might make it difficult for the current state of the art.
Chapter 3, Approaches to Natural Language Understanding – Rule-Based Systems, Machine Learning, and Deep Learning, provides an overview of the main approaches to NLU and discusses their benefits and drawbacks, including rule-based techniques, statistical techniques, and deep learning. It also discusses popular pre-trained models such as BERT and its variants. Finally, it discusses combining different approaches into a solution.
Chapter 4, Selecting Libraries and Tools for Natural Language Understanding, helps you get set up to process natural language. It begins by discussing general tools such as JupyterLab and GitHub, and how to install and use them. It then goes on to discuss installing Python and the many Python libraries that are available for NLU. Libraries that are discussed include NLTK, spaCy, and TensorFlow/Keras.
Chapter 5, Natural Language Data – Finding and Preparing Data, teaches you how to identify and prepare data for processing with NLU techniques. It discusses data from databases, the web, and other documents, as well as privacy and ethics considerations. The Wizard of Oz technique and other simulated data acquisition approaches, such as generation, are covered briefly. For those of you who don’t have access to your own data, or who wish to compare your results to those of other researchers, this chapter also discusses generally available and frequently used corpora. It then goes on to discuss preprocessing steps such as stemming and lemmatization.
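To give a flavor of the stemming step mentioned above, here is a minimal sketch of suffix-stripping stemming in plain Python. The `toy_stem` function and its suffix rules are simplified illustrations invented for this preview, not the Porter algorithm; the chapter itself works with full library implementations such as NLTK's stemmers.

```python
# Toy suffix-stripping stemmer: illustrates the idea behind stemming.
# These rules are a simplified assumption, not NLTK's Porter stemmer.
def toy_stem(word: str) -> str:
    for suffix in ("ization", "ies", "ing", "ed", "es", "s"):
        # Only strip when enough of the word remains to be a plausible stem.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            if suffix == "ies":
                return word[: -len(suffix)] + "y"
            return word[: -len(suffix)]
    return word

words = ["running", "studies", "played", "normalization", "cat"]
print([toy_stem(w) for w in words])  # ['runn', 'study', 'play', 'normal', 'cat']
```

Note that a crude stemmer happily produces non-words like "runn"; lemmatization, by contrast, maps words to dictionary forms, which is why the two are treated as distinct preprocessing steps.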
Chapter 6, Exploring and Visualizing Data, discusses exploratory techniques for getting an overall picture of the data, such as summary statistics (word frequencies, category frequencies, and so on). It also discusses visualization tools such as matplotlib. Finally, it discusses the kinds of decisions that can be made based on visualization and statistical results.
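Word frequencies, the first summary statistic mentioned above, can be computed with nothing more than the Python standard library. This is a minimal sketch over a tiny made-up corpus; the resulting counts are exactly what you would hand to a matplotlib bar chart for visualization.

```python
from collections import Counter

# Count word frequencies in a small sample corpus: the kind of summary
# statistic used to get an overall picture of a dataset.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]
tokens = [tok for sentence in corpus for tok in sentence.split()]
freqs = Counter(tokens)
print(freqs.most_common(3))  # [('the', 4), ('sat', 2), ('on', 2)]

# To visualize: matplotlib.pyplot.bar(freqs.keys(), freqs.values())
```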
Chapter 7, Selecting Approaches and Representing Data, discusses considerations for selecting approaches, for example, amount of data, training resources, and intended application. This chapter also discusses representing language with such techniques as vectors and embeddings in preparation for quantitative processing. It also discusses combining multiple approaches through the use of pipelines.
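As a preview of what "representing language with vectors" means, here is a minimal bag-of-words count-vector sketch in plain Python. The vocabulary construction and `to_vector` helper are illustrative assumptions; the chapter covers library vectorizers and learned embeddings, which are far richer representations.

```python
# Represent each sentence as a count vector over a shared vocabulary
# (a bag-of-words model, the simplest vector representation of text).
sentences = ["the cat sat", "the dog barked"]
vocab = sorted({tok for s in sentences for tok in s.split()})

def to_vector(sentence: str) -> list[int]:
    tokens = sentence.split()
    return [tokens.count(term) for term in vocab]

vectors = [to_vector(s) for s in sentences]
print(vocab)    # ['barked', 'cat', 'dog', 'sat', 'the']
print(vectors)  # [[0, 1, 0, 1, 1], [1, 0, 1, 0, 1]]
```

Once text is in vector form like this, any quantitative technique from the later chapters can be applied to it.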
Chapter 8, Rule-Based Techniques, discusses how to apply rule-based techniques to specific applications. Examples include regular expressions, lemmatization, syntactic parsing, semantic role assignment, and ontologies. This chapter primarily uses the NLTK library.
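Regular expressions are the most compact of the rule-based techniques listed above. This is a minimal sketch using Python's standard `re` module; the pattern and the sample text are made up for illustration and deliberately simple, not a production-ready date extractor.

```python
import re

# Rule-based extraction: a regular expression that finds ISO-style dates.
# The pattern is illustrative only; it does not validate month/day ranges.
text = "The meeting moved from 2023-01-15 to 2023-02-01."
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(dates)  # ['2023-01-15', '2023-02-01']
```

The appeal of rules like this is that they need no training data at all, which is one of the trade-offs the chapter explores.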
Chapter 9, Machine Learning Part 1 – Statistical Machine Learning, discusses how to apply statistical machine learning techniques such as Naïve Bayes, TF-IDF, support vector machines, and conditional random fields to tasks such as classification, intent recognition, and entity extraction. The emphasis is on newer techniques such as SVMs and how they improve performance over more traditional approaches.
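TF-IDF, one of the techniques named above, can be sketched directly from its definition. This minimal plain-Python version (with a made-up three-document corpus) is illustrative only; the chapter works with library implementations rather than hand-rolled code.

```python
import math

# Minimal TF-IDF from the definition: term frequency in a document,
# weighted down by how many documents the term appears in.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "dog", "barked"],
]
N = len(docs)

def tf_idf(term: str, doc: list[str]) -> float:
    tf = doc.count(term) / len(doc)            # term frequency
    df = sum(1 for d in docs if term in d)     # document frequency
    idf = math.log(N / df)                     # inverse document frequency
    return tf * idf

# "the" occurs in every document, so its idf (and hence tf-idf) is zero,
# while "cat" is distinctive for the first document.
print(round(tf_idf("the", docs[0]), 3))  # 0.0
print(round(tf_idf("cat", docs[0]), 3))  # 0.366
```

This weighting is what lets a classifier focus on distinctive words rather than ubiquitous ones like "the".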
Chapter 10, Machine Learning Part 2 – Neural Networks and Deep Learning Techniques, covers applying machine learning techniques based on neural networks (fully connected networks, CNNs, and RNNs) to problems such as classification and information extraction. The chapter compares results using these approaches to the approaches described in the previous chapter. The chapter discusses neural net concepts such as hyperparameters, learning rate, and training iterations. This chapter uses the TensorFlow/Keras libraries.
Chapter 11, Machine Learning Part 3 – Transformers and Large Language Models, covers the currently best-performing techniques in natural language processing – transformers and pretrained models. It discusses the insights behind transformers and includes an example of using transformers for text classification. Code for this chapter is based on the TensorFlow/Keras Python libraries.
Chapter 12, Applying Unsupervised Learning Approaches, discusses applications of unsupervised learning, such as topic modeling, including the value of unsupervised learning for exploratory applications and maximizing scarce data. It also addresses types of partial supervision such as weak supervision and distant supervision.
Chapter 13, How Well Does It Work? – Evaluation, covers quantitative evaluation. This includes segmenting the data into training, validation, and test sets; evaluation with cross-validation; evaluation metrics such as precision and recall; area under the curve; ablation studies; statistical significance; and user testing.
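Precision and recall, the two metrics named above, follow directly from counting true positives, false positives, and false negatives. This is a minimal sketch with made-up binary labels and predictions, purely to show the definitions; the chapter covers these metrics and the rest of the evaluation toolkit in depth.

```python
# Precision and recall computed from their definitions,
# using made-up labels and predictions for illustration.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)  # of the items predicted positive, how many were right
recall = tp / (tp + fn)     # of the truly positive items, how many were found
print(precision, recall)    # 0.75 0.75
```

The two metrics pull in opposite directions (a system that predicts positive for everything has perfect recall but poor precision), which is why both are reported.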
Chapter 14, What to Do If the System Isn’t Working, discusses system maintenance. If the original model isn’t adequate or if the situation in the real world changes, how does the model have to be changed? The chapter discusses adding new data and changing the structure of the application while at the same time ensuring that new data doesn’t degrade the performance of the existing system.
Chapter 15, Summary and Looking to the Future, provides an overview of the book and a look to the future. It discusses where there is potential for improvement in performance, as well as faster training, more challenging applications, and future directions for both the technology and research in this exciting field.