Definition: Boosting also combines predictions from multiple models, but in a sequential manner. Unlike bagging, boosting focuses on giving more weight to instances that the current set of models misclassifies, aiming to correct errors iteratively.
Key Steps:
- Model Training:
- Assign a weight to each data point (initially, all weights are equal) and train a base model on this weighted dataset.
- Weight Adjustment:
- Increase the weights of misclassified instances, making them more influential in the next round of training. This allows subsequent models to focus on the previously misclassified data.
- Sequential Model Building:
- Train additional models sequentially, with each model giving more attention to the instances that were misclassified by the previous models.
- Combining Predictions:
- Combine the predictions of all models, giving more weight to models that achieved a lower weighted error in their training round (see the sketch after this list).
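The four steps above correspond to the classic AdaBoost-style update. Below is a minimal from-scratch sketch, assuming binary labels encoded as -1/+1 and depth-1 decision trees (stumps) as the weak learners; names such as `boost_fit`, `boost_predict`, and `n_rounds` are illustrative, not from the original text.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_fit(X, y, n_rounds=10):
    """Sequential boosting; y must contain labels in {-1, +1}."""
    n = len(y)
    weights = np.full(n, 1.0 / n)          # step 1: equal initial weights
    models, alphas = [], []

    for _ in range(n_rounds):
        # Train a weak learner on the current weighted data.
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)

        # Weighted error of this round's model.
        err = np.sum(weights[pred != y]) / np.sum(weights)
        err = np.clip(err, 1e-10, 1 - 1e-10)        # guard against div-by-zero

        # Model weight: more accurate models get a larger say in the final vote.
        alpha = 0.5 * np.log((1 - err) / err)

        # Step 2: increase weights of misclassified points, then renormalize.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()

        models.append(stump)
        alphas.append(alpha)

    return models, alphas

def boost_predict(models, alphas, X):
    """Step 4: weighted vote over all sequentially trained models."""
    scores = sum(a * m.predict(X) for m, a in zip(models, alphas))
    return np.sign(scores)
```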
Example: AdaBoost. AdaBoost is a popular boosting algorithm that combines weak learners to create a strong learner. It assigns higher weights to misclassified instances, leading subsequent models to focus on these instances.
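For a quick way to try this, here is a short usage sketch with scikit-learn's AdaBoostClassifier. The synthetic dataset and the parameter values (`n_estimators=100`, `random_state=42`) are illustrative choices, not prescribed by the text; AdaBoostClassifier's default base learner is a depth-1 decision tree, so the single stump trained alongside it gives a rough sense of how much boosting helps over one weak learner.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic classification dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single weak learner: a depth-1 decision tree (stump).
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# AdaBoost trains 100 stumps sequentially, reweighting misclassified points each round.
boosted = AdaBoostClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

print("single stump accuracy:", stump.score(X_test, y_test))
print("boosted accuracy:     ", boosted.score(X_test, y_test))
```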
Advantages of Bagging:
- Reduces overfitting by introducing diversity through multiple models.
- Enhances model robustness by averaging out errors or biases present in individual models.
- Particularly effective when the base model is sensitive to the specific data it is trained on (i.e., when it has high variance).