Boosting techniques for imbalanced data
Imagine two friends studying together to solve their mathematics assignment. The first student is strong in most topics but weak in two: complex numbers and triangles. So the first student asks the second to spend more time on those two topics. Then, while solving the assignment, they combine their answers. Since the first student knows most of the topics well, they decide to give more weight to the first student's answers to the assignment questions. What these two students are doing is the key idea behind boosting.
In bagging, we saw that all the classifiers can be trained in parallel. Each classifier is trained on its own random subset (a bootstrap sample) of the data, and all of them get an equal say at prediction time.
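As a quick refresher, here is a minimal bagging sketch using scikit-learn's BaggingClassifier; the synthetic dataset and the choice of 50 estimators are just illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

# Placeholder synthetic dataset for illustration
X, y = make_classification(n_samples=500, random_state=0)

# Each of the 50 trees is trained independently on its own bootstrap
# sample (n_jobs=-1 fits them in parallel); predictions are combined
# by a vote in which every tree has an equal say.
bagging = BaggingClassifier(n_estimators=50, n_jobs=-1, random_state=0)
bagging.fit(X, y)
print("Training accuracy:", bagging.score(X, y))
```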
In boosting, the classifiers are trained one after another. Every classifier learns from the whole dataset, but the points in the dataset are assigned different weights based on how difficult they are to classify: points misclassified in one round receive higher weights in the next. Classifiers are also assigned weights based on their accuracy, so the better-performing ones get a larger say in the final prediction.
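To make these mechanics concrete, here is a minimal from-scratch sketch of AdaBoost-style boosting with decision stumps. The synthetic imbalanced dataset, the number of rounds, and the small 1e-10 guard against division by zero are all illustrative assumptions, not prescriptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy imbalanced dataset (assumed 90/10 class split for illustration)
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
y_signed = np.where(y == 1, 1, -1)  # AdaBoost works with labels in {-1, +1}

n_rounds = 10
sample_weights = np.full(len(X), 1 / len(X))  # start with uniform point weights
stumps, stump_says = [], []

for _ in range(n_rounds):
    # Train the next weak learner on the *weighted* data
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y_signed, sample_weight=sample_weights)
    pred = stump.predict(X)

    # The weighted error decides how much say this classifier gets
    err = np.sum(sample_weights[pred != y_signed])
    say = 0.5 * np.log((1 - err) / (err + 1e-10))

    # Misclassified points get heavier weights for the next round;
    # correctly classified points get lighter ones
    sample_weights *= np.exp(-say * y_signed * pred)
    sample_weights /= sample_weights.sum()

    stumps.append(stump)
    stump_says.append(say)

# Final prediction: a vote of all weak learners, weighted by their say
scores = sum(say * s.predict(X) for say, s in zip(stump_says, stumps))
final_pred = np.sign(scores)
print("Training accuracy:", np.mean(final_pred == y_signed))
```

Note how the sketch mirrors the analogy: hard (misclassified) points get more attention in later rounds, just as the second student spends more time on the weak topics, and each classifier's vote is scaled by how well it performed.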