BERT-based fake news classification
In our first experiment, we trained a classical random forest classifier on TF-IDF features to distinguish fake from real news articles and achieved an accuracy of about 93%. In this section, we will train a deep learning model for the same task and see whether it improves on the accuracy of the classical tree-based approach. Deep learning has changed the way we solve NLP problems. Classical approaches required hand-crafted features, most of which were based on how often words appear in a document. Given the complexity of natural language, knowing only the word counts in a paragraph is not enough. The order in which words occur also has a significant impact on the overall meaning of the paragraph or sentence. Deep learning approaches such as Long Short-Term Memory (LSTM) networks also take the sequential dependency of words in sentences or paragraphs into account, which yields a more meaningful feature representation. LSTM has achieved great success in many...
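As a small illustration of why word counts alone can fall short, the following sketch compares two sentences that contain exactly the same words in a different order. It is a toy example of our own, not part of the experiment described above: the two sentences and the helper name bag_of_words are assumptions made for illustration, and plain counts stand in for the TF-IDF features.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase whitespace-separated tokens, discarding their order."""
    return Counter(text.lower().split())


# Two hypothetical sentences with the same words but opposite meanings.
sentence_a = "the dog bit the man"
sentence_b = "the man bit the dog"

counts_a = bag_of_words(sentence_a)
counts_b = bag_of_words(sentence_b)

# The count-based representations are identical.
print(counts_a == counts_b)  # True

# The token sequences, which a sequence model would see, are different.
print(sentence_a.split() == sentence_b.split())  # False
```

A classifier that only sees the counts cannot tell these two sentences apart, whereas a model that reads the tokens in order at least has the information needed to do so.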