AdaBoost



Distribution 1.0
13 July 2015

Massih R. Amini

Université Grenoble Alpes
Laboratoire d'Informatique de Grenoble





 


Description


This program is an implementation of the Adaptive Boosting (AdaBoost) algorithm proposed in [Schapire, 1999; Freund, 1995] and described in [Amini, 2015, pp. 88-97]. The algorithm generates a set of weak binary classifiers and combines them by a weighted vote; the t-th weak classifier takes into account the errors committed by the previous one. This is done by assigning a weight D(i) to each training example i: an example correctly classified (resp. misclassified) by the previous classifier gets a lower (resp. higher) weight. A new training set is then sampled from the initial training set with respect to these weights and a new classifier is trained on it. In this way, each new classifier focuses on the examples that the previous classifier found hard. The final classifier, called the boosted (or voted) classifier, is a linear combination of the weak classifiers in which the combination weights depend on the misclassification errors of the associated classifiers. The pseudo-code of the algorithm, as given in the cited references, is:
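Input: a training set S = {(x_1, y_1), ..., (x_m, y_m)} with y_i in {-1, +1},
       and a maximum number of iterations T.

1. Initialize the distribution over examples: D_1(i) = 1/m for i = 1, ..., m.
2. For t = 1, ..., T:
   (a) train a weak classifier h_t on the training set weighted by D_t;
   (b) compute its weighted error epsilon_t = sum_i D_t(i) [h_t(x_i) != y_i];
   (c) set its combination weight alpha_t = (1/2) ln((1 - epsilon_t) / epsilon_t);
   (d) update the distribution: D_{t+1}(i) = D_t(i) exp(-alpha_t y_i h_t(x_i)) / Z_t,
       where Z_t is the normalization factor making D_{t+1} a distribution.
3. Output the voted classifier: H(x) = sign(sum_{t=1}^T alpha_t h_t(x)).

(The Epsilon and Alpha values printed during training, see the example below, are epsilon_t and alpha_t.)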

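As a minimal sketch only (this is not the distributed source, and all identifiers are hypothetical), one round of the above loop could be written in C, the language the program is built with:

#include <math.h>

/* One AdaBoost round (illustrative sketch, not the distributed code).
 * y[i] and h[i] are in {-1,+1}; h holds the weak classifier's predictions
 * on the m training examples; D[i] is the current distribution over them.
 * Degenerate cases (epsilon_t = 0 or epsilon_t >= 1/2) are not handled.
 * Returns the combination weight alpha_t.
 */
double boost_round(const int *y, const int *h, double *D, int m)
{
    double eps = 0.0, alpha, Z = 0.0;
    int i;

    for (i = 0; i < m; i++)        /* weighted training error epsilon_t */
        if (h[i] != y[i])
            eps += D[i];

    alpha = 0.5 * log((1.0 - eps) / eps);

    for (i = 0; i < m; i++) {      /* re-weight the examples */
        D[i] *= exp(-alpha * (double)(y[i] * h[i]));
        Z += D[i];
    }
    for (i = 0; i < m; i++)        /* renormalize to a distribution */
        D[i] /= Z;

    return alpha;
}

Such a file would be compiled with gcc and linked with -lm for the math library.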

Download and Installation


The program is free for scientific use only. It was developed on Linux with gcc, and the source code is available from:
http://ama.liglab.fr/~amini/AdaBoost/AdaBoost.tar.bz2

After downloading the file, unpack it:

> bzip2 -cd AdaBoost.tar.bz2 | tar xvf -

then compile the program in the new directory AdaBoost/:

> make

After compilation, two executables are created:

  • AdaBoost-learn (for training the model)
  • AdaBoost-test (for testing it)


Training and Testing


Each example in the training and test files is represented by its class label (+1 or -1) followed by its vector representation. In AdaBoost/example/ there are four such files (training and test sets) from the UCI repository.
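For instance, assuming whitespace-separated values, two lines of a HEART file (13 features per example; the feature values here are purely illustrative) could look like:

+1 63.0 1.0 1.0 145.0 233.0 1.0 2.0 150.0 0.0 2.3 3.0 0.0 6.0
-1 67.0 1.0 4.0 160.0 286.0 0.0 2.0 108.0 1.0 1.5 2.0 3.0 3.0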


Train the model:
> AdaBoost-learn [options] input_file parameter_file

Options are:
-t   (integer) Maximum number of iterations (default 100),
-? Help
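For instance, to limit training to at most 50 iterations:
> ./AdaBoost-learn -t 50 example/HEART-Train Parms-HEART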


Test the model:
> AdaBoost-test input_file parameter_file

Example:
> ./AdaBoost-learn example/HEART-Train Parms-HEART
Training file contains 162 examples in dimension 13
Iteration=1 - Epsilon=0.123457 Alpha=0.980047
Iteration=2 - Epsilon=0.211532 Alpha=0.657858
Iteration=3 - Epsilon=0.229046 Alpha=0.606853
Iteration=4 - Epsilon=0.373132 Alpha=0.259402
Iteration=5 - Epsilon=0.334523 Alpha=0.343899
Iteration=6 - Epsilon=0.340029 Alpha=0.331582
Iteration=7 - Epsilon=0.462733 Alpha=0.074673
Iteration=8 - Epsilon=0.375127 Alpha=0.255141
Iteration=9 - Epsilon=0.474154 Alpha=0.051739
Iteration=10 - Epsilon=0.472774 Alpha=0.054506
Iteration=11 - Epsilon=0.397950 Alpha=0.207007
> ./AdaBoost-test example/HEART-Test Parms-HEART
AdaBoost on a test collection containing 108 examples in dimension 13 with 11 weak-classifiers
Precision:0.774648 Recall:0.846154 F1-measure:0.808824 Error=0.240741
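The reported F1-measure is the harmonic mean of precision and recall, F1 = 2PR/(P + R) = 2 x 0.774648 x 0.846154 / (0.774648 + 0.846154) = 0.808824, and the error is the fraction of misclassified test examples (26 out of 108, i.e. 0.240741).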


Disclaimer


This program is publicly available for research use only. It should not be distributed for commercial use, and the author is not responsible for any (mis)use of this program.


Bibliography



[Amini, 2015] Massih-Reza Amini. Apprentissage Machine: de la théorie à la pratique. Eyrolles, 2015.

[Freund, 1995] Yoav Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256-285, 1995.

[Schapire, 1999] Robert E. Schapire. Theoretical views of boosting and applications. In Proceedings of the 10th International Conference on Algorithmic Learning Theory, pp. 13-25, 1999.