Profiling the code
Profiling involves identifying the parts of the code that need performance tuning because they are either too slow or use a large amount of resources, such as processor power or memory. We will profile a modified version of the sentiment analysis code from Chapter 9, Analyzing Textual Data and Social Media. We made three changes. First, the code is refactored to comply with multiprocessing programming guidelines (you will learn about multiprocessing later in this chapter). Second, we simplified the stopwords filtering. Third, we reduced the number of word features, keeping enough of them that accuracy is not affected. This last change has the most impact. The original code ran for about 20 seconds. The new code runs faster than that and will serve as the baseline in this chapter. Some further changes are related to profiling and will be explained later in this section. Please refer to the prof_demo.py file in this book's code bundle:
import random
from nltk.corpus import movie_reviews
from nltk.corpus...
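To make the idea of profiling concrete before looking at the results, here is a minimal sketch that uses the cProfile and pstats modules from the Python standard library. It is not the prof_demo.py listing, and it may not be the profiler used later in this section. The count_words() and profile_call() functions and the toy documents are hypothetical, introduced only for illustration:

```python
import cProfile
import io
import pstats


def count_words(docs):
    """Hypothetical stand-in for feature extraction: count word frequencies."""
    counts = {}
    for doc in docs:
        for word in doc.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def profile_call(func, *args):
    """Run func under cProfile and return its result plus a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Collect the report in memory instead of printing it directly.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Toy data (an assumption, not the movie_reviews corpus).
docs = ["good movie", "bad movie", "good plot good cast"] * 1000
counts, report = profile_call(count_words, docs)
print(report)
```

The report lists each function with its call count and the time spent in it, which is the kind of information we need to decide where tuning effort is worthwhile.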