Bagging
The term bagging is derived from a technique called bootstrap aggregation. In order to implement a successful predictive model, it's important to know in what situation we could benefit from using bootstrapping methods to build ensemble models. Such models are used extensively both in industry as well as academia.
One such application would be that these models can be used for the quality assessment of Wikipedia articles. Features such as article_length
, number_of_references
, number_of_headings
, and number_of_images
are used to build a classifier that classifies Wikipedia articles into low- or high-quality articles. Out of the several models that were tried for this task, the random forest model – a well-known bagging-based ensemble classifier that we will discuss in our next section – outperforms all other models such as SVM, logistic regression, and even neural networks, with the best precision and recall scores of 87.3% and 87.2%, respectively. This...