Sentiment analysis
In this section, we'll demonstrate how DistilBERT, a lightweight version of BERT, can be used to handle sentiment analysis, a common NLP problem. We will be using data from a Kaggle competition (https://www.kaggle.com/c/tweet-sentiment-extraction): given a tweet and its sentiment (positive, neutral, or negative), participants had to identify the part of the tweet that expresses that sentiment. In business, sentiment analysis is typically employed as part of a system that helps data analysts gauge public opinion, conduct detailed market research, and track customer experience. Another important application is in medicine: the effect of different treatments on patients' moods can be evaluated based on their communication patterns.
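To make the task concrete, here is a minimal sketch of what a training example might look like and how the target phrase can be located inside the tweet as character offsets. The toy rows, the column names (`text`, `sentiment`, `selected_text`), and the helper `locate_span` are assumptions made for illustration; they are not taken from the competition files or from the code that follows.

```python
import pandas as pd

# Hypothetical toy rows; column names are assumed for illustration only.
toy = pd.DataFrame({
    "text": [
        "I really love this phone, best purchase ever",
        "my flight got cancelled, so sad right now",
        "heading to the office",
    ],
    "sentiment": ["positive", "negative", "neutral"],
    "selected_text": ["love", "so sad", "heading to the office"],
})


def locate_span(text, selected):
    """Return (start, end) character offsets of `selected` in `text`.

    `end` is exclusive; (-1, -1) is returned if the phrase is not found.
    """
    start = text.find(selected)
    if start == -1:
        return -1, -1
    return start, start + len(selected)


# Compute the offsets row by row and store them as two integer columns.
spans = [locate_span(t, s) for t, s in zip(toy["text"], toy["selected_text"])]
toy["start"] = [s[0] for s in spans]
toy["end"] = [s[1] for s in spans]

print(toy[["sentiment", "selected_text", "start", "end"]])
```

Framing the problem as predicting a start and an end position is one common way to set up this kind of extraction task for a transformer model; whether it matches the approach taken later in this section is not something this sketch assumes.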
How do we go about it?
As usual, we begin by loading the necessary packages.
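# Data handling and text processing
import pandas as pd
import re
import numpy as np
# Fix the NumPy seed so that results are reproducible
np.random.seed(0)
# Plotting; the magic below renders figures inline in a Jupyter notebook
import matplotlib.pyplot as plt
%matplotlib inline
# Deep learning framework
import keras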
from...