Adaboost example (#739)
* added overview for AdaBoost

* added implementation for AdaBoost

* added example for AdaBoost

* added tests for AdaBoost

* rephrased sentences

* final changes to AdaBoost

* changed adaboost tests to use grade_learner

* grammar check
aswanipranjal authored and norvig committed Feb 23, 2018
1 parent 9ccc092 commit af50f30
Showing 2 changed files with 286 additions and 1 deletion.
271 changes: 270 additions & 1 deletion learning.ipynb
@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {
"collapsed": true
},
@@ -1778,6 +1778,275 @@
"source": [
"The Perceptron didn't fare very well, mainly because the dataset is not linearly separable. On simpler datasets the algorithm performs much better, but unfortunately such datasets are rare in real-life scenarios."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## AdaBoost\n",
"\n",
"### Overview\n",
"\n",
"**AdaBoost** is an algorithm that uses **ensemble learning**. In ensemble learning, the hypotheses in the collection, or ensemble, vote on what the output should be, and the output with the most votes is selected as the final answer.\n",
"\n",
"The AdaBoost algorithm, as described in the book, works with a **weighted training set** and **weak learners** (classifiers whose accuracy is about 50% + epsilon, i.e. slightly better than random guessing). It manipulates the weights attached to the examples that are shown to it; examples with higher weights are given more importance.\n",
"\n",
"All the examples start with equal weights, and a hypothesis is generated from them. The weights of incorrectly classified examples are then increased so that the next hypothesis is more likely to classify them correctly, while the weights of correctly classified examples are reduced. This process is repeated *K* times (*K* is an input to the algorithm), so *K* hypotheses are generated.\n",
"\n",
"These *K* hypotheses are also assigned weights according to their performance on the weighted training set. The final ensemble hypothesis is the weighted-majority combination of these *K* hypotheses.\n",
"\n",
"The strength of AdaBoost is that, given weak learners and a sufficiently large *K*, it can learn a highly accurate classifier on the training data, irrespective of the complexity of the function being learned or the inexpressiveness of the hypothesis space."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Implementation\n",
"\n",
"As seen in the previous section, the `PerceptronLearner` does not perform well on the iris dataset. We'll use the perceptron as the learner for the AdaBoost algorithm and try to increase the accuracy.\n",
"\n",
"Let's first look at the source of `AdaBoost`:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"\n",
" \"http://www.w3.org/TR/html4/strict.dtd\">\n",
"\n",
"<html>\n",
"<head>\n",
" <title></title>\n",
" <meta http-equiv=\"content-type\" content=\"text/html; charset=None\">\n",
" <style type=\"text/css\">\n",
"td.linenos { background-color: #f0f0f0; padding-right: 10px; }\n",
"span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }\n",
"pre { line-height: 125%; }\n",
"body .hll { background-color: #ffffcc }\n",
"body { background: #f8f8f8; }\n",
"body .c { color: #408080; font-style: italic } /* Comment */\n",
"body .err { border: 1px solid #FF0000 } /* Error */\n",
"body .k { color: #008000; font-weight: bold } /* Keyword */\n",
"body .o { color: #666666 } /* Operator */\n",
"body .ch { color: #408080; font-style: italic } /* Comment.Hashbang */\n",
"body .cm { color: #408080; font-style: italic } /* Comment.Multiline */\n",
"body .cp { color: #BC7A00 } /* Comment.Preproc */\n",
"body .cpf { color: #408080; font-style: italic } /* Comment.PreprocFile */\n",
"body .c1 { color: #408080; font-style: italic } /* Comment.Single */\n",
"body .cs { color: #408080; font-style: italic } /* Comment.Special */\n",
"body .gd { color: #A00000 } /* Generic.Deleted */\n",
"body .ge { font-style: italic } /* Generic.Emph */\n",
"body .gr { color: #FF0000 } /* Generic.Error */\n",
"body .gh { color: #000080; font-weight: bold } /* Generic.Heading */\n",
"body .gi { color: #00A000 } /* Generic.Inserted */\n",
"body .go { color: #888888 } /* Generic.Output */\n",
"body .gp { color: #000080; font-weight: bold } /* Generic.Prompt */\n",
"body .gs { font-weight: bold } /* Generic.Strong */\n",
"body .gu { color: #800080; font-weight: bold } /* Generic.Subheading */\n",
"body .gt { color: #0044DD } /* Generic.Traceback */\n",
"body .kc { color: #008000; font-weight: bold } /* Keyword.Constant */\n",
"body .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */\n",
"body .kn { color: #008000; font-weight: bold } /* Keyword.Namespace */\n",
"body .kp { color: #008000 } /* Keyword.Pseudo */\n",
"body .kr { color: #008000; font-weight: bold } /* Keyword.Reserved */\n",
"body .kt { color: #B00040 } /* Keyword.Type */\n",
"body .m { color: #666666 } /* Literal.Number */\n",
"body .s { color: #BA2121 } /* Literal.String */\n",
"body .na { color: #7D9029 } /* Name.Attribute */\n",
"body .nb { color: #008000 } /* Name.Builtin */\n",
"body .nc { color: #0000FF; font-weight: bold } /* Name.Class */\n",
"body .no { color: #880000 } /* Name.Constant */\n",
"body .nd { color: #AA22FF } /* Name.Decorator */\n",
"body .ni { color: #999999; font-weight: bold } /* Name.Entity */\n",
"body .ne { color: #D2413A; font-weight: bold } /* Name.Exception */\n",
"body .nf { color: #0000FF } /* Name.Function */\n",
"body .nl { color: #A0A000 } /* Name.Label */\n",
"body .nn { color: #0000FF; font-weight: bold } /* Name.Namespace */\n",
"body .nt { color: #008000; font-weight: bold } /* Name.Tag */\n",
"body .nv { color: #19177C } /* Name.Variable */\n",
"body .ow { color: #AA22FF; font-weight: bold } /* Operator.Word */\n",
"body .w { color: #bbbbbb } /* Text.Whitespace */\n",
"body .mb { color: #666666 } /* Literal.Number.Bin */\n",
"body .mf { color: #666666 } /* Literal.Number.Float */\n",
"body .mh { color: #666666 } /* Literal.Number.Hex */\n",
"body .mi { color: #666666 } /* Literal.Number.Integer */\n",
"body .mo { color: #666666 } /* Literal.Number.Oct */\n",
"body .sa { color: #BA2121 } /* Literal.String.Affix */\n",
"body .sb { color: #BA2121 } /* Literal.String.Backtick */\n",
"body .sc { color: #BA2121 } /* Literal.String.Char */\n",
"body .dl { color: #BA2121 } /* Literal.String.Delimiter */\n",
"body .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */\n",
"body .s2 { color: #BA2121 } /* Literal.String.Double */\n",
"body .se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */\n",
"body .sh { color: #BA2121 } /* Literal.String.Heredoc */\n",
"body .si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */\n",
"body .sx { color: #008000 } /* Literal.String.Other */\n",
"body .sr { color: #BB6688 } /* Literal.String.Regex */\n",
"body .s1 { color: #BA2121 } /* Literal.String.Single */\n",
"body .ss { color: #19177C } /* Literal.String.Symbol */\n",
"body .bp { color: #008000 } /* Name.Builtin.Pseudo */\n",
"body .fm { color: #0000FF } /* Name.Function.Magic */\n",
"body .vc { color: #19177C } /* Name.Variable.Class */\n",
"body .vg { color: #19177C } /* Name.Variable.Global */\n",
"body .vi { color: #19177C } /* Name.Variable.Instance */\n",
"body .vm { color: #19177C } /* Name.Variable.Magic */\n",
"body .il { color: #666666 } /* Literal.Number.Integer.Long */\n",
"\n",
" </style>\n",
"</head>\n",
"<body>\n",
"<h2></h2>\n",
"\n",
"<div class=\"highlight\"><pre><span></span><span class=\"k\">def</span> <span class=\"nf\">AdaBoost</span><span class=\"p\">(</span><span class=\"n\">L</span><span class=\"p\">,</span> <span class=\"n\">K</span><span class=\"p\">):</span>\n",
"    <span class=\"sd\">&quot;&quot;&quot;[Figure 18.34]&quot;&quot;&quot;</span>\n",
"    <span class=\"k\">def</span> <span class=\"nf\">train</span><span class=\"p\">(</span><span class=\"n\">dataset</span><span class=\"p\">):</span>\n",
"        <span class=\"n\">examples</span><span class=\"p\">,</span> <span class=\"n\">target</span> <span class=\"o\">=</span> <span class=\"n\">dataset</span><span class=\"o\">.</span><span class=\"n\">examples</span><span class=\"p\">,</span> <span class=\"n\">dataset</span><span class=\"o\">.</span><span class=\"n\">target</span>\n",
"        <span class=\"n\">N</span> <span class=\"o\">=</span> <span class=\"nb\">len</span><span class=\"p\">(</span><span class=\"n\">examples</span><span class=\"p\">)</span>\n",
"        <span class=\"n\">epsilon</span> <span class=\"o\">=</span> <span class=\"mf\">1.</span> <span class=\"o\">/</span> <span class=\"p\">(</span><span class=\"mi\">2</span> <span class=\"o\">*</span> <span class=\"n\">N</span><span class=\"p\">)</span>\n",
"        <span class=\"n\">w</span> <span class=\"o\">=</span> <span class=\"p\">[</span><span class=\"mf\">1.</span> <span class=\"o\">/</span> <span class=\"n\">N</span><span class=\"p\">]</span> <span class=\"o\">*</span> <span class=\"n\">N</span>\n",
"        <span class=\"n\">h</span><span class=\"p\">,</span> <span class=\"n\">z</span> <span class=\"o\">=</span> <span class=\"p\">[],</span> <span class=\"p\">[]</span>\n",
"        <span class=\"k\">for</span> <span class=\"n\">k</span> <span class=\"ow\">in</span> <span class=\"nb\">range</span><span class=\"p\">(</span><span class=\"n\">K</span><span class=\"p\">):</span>\n",
"            <span class=\"n\">h_k</span> <span class=\"o\">=</span> <span class=\"n\">L</span><span class=\"p\">(</span><span class=\"n\">dataset</span><span class=\"p\">,</span> <span class=\"n\">w</span><span class=\"p\">)</span>\n",
"            <span class=\"n\">h</span><span class=\"o\">.</span><span class=\"n\">append</span><span class=\"p\">(</span><span class=\"n\">h_k</span><span class=\"p\">)</span>\n",
"            <span class=\"n\">error</span> <span class=\"o\">=</span> <span class=\"nb\">sum</span><span class=\"p\">(</span><span class=\"n\">weight</span> <span class=\"k\">for</span> <span class=\"n\">example</span><span class=\"p\">,</span> <span class=\"n\">weight</span> <span class=\"ow\">in</span> <span class=\"nb\">zip</span><span class=\"p\">(</span><span class=\"n\">examples</span><span class=\"p\">,</span> <span class=\"n\">w</span><span class=\"p\">)</span>\n",
"                        <span class=\"k\">if</span> <span class=\"n\">example</span><span class=\"p\">[</span><span class=\"n\">target</span><span class=\"p\">]</span> <span class=\"o\">!=</span> <span class=\"n\">h_k</span><span class=\"p\">(</span><span class=\"n\">example</span><span class=\"p\">))</span>\n",
"            <span class=\"c1\"># Avoid divide-by-0 from either 0% or 100% error rates:</span>\n",
"            <span class=\"n\">error</span> <span class=\"o\">=</span> <span class=\"n\">clip</span><span class=\"p\">(</span><span class=\"n\">error</span><span class=\"p\">,</span> <span class=\"n\">epsilon</span><span class=\"p\">,</span> <span class=\"mi\">1</span> <span class=\"o\">-</span> <span class=\"n\">epsilon</span><span class=\"p\">)</span>\n",
"            <span class=\"k\">for</span> <span class=\"n\">j</span><span class=\"p\">,</span> <span class=\"n\">example</span> <span class=\"ow\">in</span> <span class=\"nb\">enumerate</span><span class=\"p\">(</span><span class=\"n\">examples</span><span class=\"p\">):</span>\n",
"                <span class=\"k\">if</span> <span class=\"n\">example</span><span class=\"p\">[</span><span class=\"n\">target</span><span class=\"p\">]</span> <span class=\"o\">==</span> <span class=\"n\">h_k</span><span class=\"p\">(</span><span class=\"n\">example</span><span class=\"p\">):</span>\n",
"                    <span class=\"n\">w</span><span class=\"p\">[</span><span class=\"n\">j</span><span class=\"p\">]</span> <span class=\"o\">*=</span> <span class=\"n\">error</span> <span class=\"o\">/</span> <span class=\"p\">(</span><span class=\"mf\">1.</span> <span class=\"o\">-</span> <span class=\"n\">error</span><span class=\"p\">)</span>\n",
"            <span class=\"n\">w</span> <span class=\"o\">=</span> <span class=\"n\">normalize</span><span class=\"p\">(</span><span class=\"n\">w</span><span class=\"p\">)</span>\n",
"            <span class=\"n\">z</span><span class=\"o\">.</span><span class=\"n\">append</span><span class=\"p\">(</span><span class=\"n\">math</span><span class=\"o\">.</span><span class=\"n\">log</span><span class=\"p\">((</span><span class=\"mf\">1.</span> <span class=\"o\">-</span> <span class=\"n\">error</span><span class=\"p\">)</span> <span class=\"o\">/</span> <span class=\"n\">error</span><span class=\"p\">))</span>\n",
"        <span class=\"k\">return</span> <span class=\"n\">WeightedMajority</span><span class=\"p\">(</span><span class=\"n\">h</span><span class=\"p\">,</span> <span class=\"n\">z</span><span class=\"p\">)</span>\n",
"    <span class=\"k\">return</span> <span class=\"n\">train</span>\n",
"</pre></div>\n",
"</body>\n",
"</html>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"psource(AdaBoost)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`AdaBoost` takes two inputs, **L** and *K*, where **L** is the learner and *K* is the number of hypotheses to generate. The learner **L** in turn takes a dataset and the weights associated with the examples in that dataset. The `PerceptronLearner`, however, does not handle weights and takes only a dataset as its input.\n",
"To remedy that, we will give the `PerceptronLearner` a modified dataset in which examples are repeated according to their weights. Intuitively, this forces the learner to see a heavily weighted example again and again until it can classify it correctly.\n",
"\n",
"To make `PerceptronLearner` accept weights as well, we wrap it with the **`WeightedLearner`** function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"psource(WeightedLearner)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`WeightedLearner` will then call `PerceptronLearner` in each iteration with a modified dataset in which examples are replicated according to their weights."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example\n",
"\n",
"We will wrap `PerceptronLearner` with the `WeightedLearner` function and then create an `AdaboostLearner` classifier with the number of hypotheses, *K*, set to 5."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"WeightedPerceptron = WeightedLearner(PerceptronLearner)\n",
"AdaboostLearner = AdaBoost(WeightedPerceptron, 5)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"iris2 = DataSet(name=\"iris\")\n",
"iris2.classes_to_numbers()\n",
"\n",
"adaboost = AdaboostLearner(iris2)\n",
"\n",
"adaboost([5, 3, 1, 0.1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That is the correct answer. Let's check the error rate of AdaBoost with the perceptron."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error ratio for adaboost: 0.046666666666666634\n"
]
}
],
"source": [
"print(\"Error ratio for adaboost: \", err_ratio(adaboost, iris2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"AdaBoost reduced the error rate considerably (measured here on the same data it was trained on). Unlike the plain `PerceptronLearner`, the boosted ensemble was able to capture the complexity of the iris dataset."
]
}
],
"metadata": {
16 changes: 16 additions & 0 deletions tests/test_learning.py
@@ -218,3 +218,19 @@ def test_random_weights():
    assert len(test_weights) == num_weights
    for weight in test_weights:
        assert weight >= min_value and weight <= max_value


def test_adaboost():
    iris = DataSet(name="iris")
    iris.classes_to_numbers()
    WeightedPerceptron = WeightedLearner(PerceptronLearner)
    AdaboostLearner = AdaBoost(WeightedPerceptron, 5)
    adaboost = AdaboostLearner(iris)
    tests = [([5, 3, 1, 0.1], 0),
             ([5, 3.5, 1, 0], 0),
             ([6, 3, 4, 1.1], 1),
             ([6, 2, 3.5, 1], 1),
             ([7.5, 4, 6, 2], 2),
             ([7, 3, 6, 2.5], 2)]
    assert grade_learner(adaboost, tests) > 5/6
    assert err_ratio(adaboost, iris) < 0.1
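
The reweighting loop that the notebook displays via `psource(AdaBoost)` can also be tried without the repository. What follows is a minimal standalone sketch, not the repository's implementation: the names `stump_learner` and `adaboost_sketch`, the +1/-1 labels, and the ten-point toy dataset are all assumptions made for illustration, and the weak learner here accepts weights directly instead of going through `WeightedLearner`.

```python
import math


def stump_learner(examples, weights):
    """Hypothetical weak learner: the 1-D threshold rule with the lowest weighted error."""
    best_error, best_rule = float("inf"), None
    for threshold in sorted({x for x, _ in examples}):
        for sign in (+1, -1):
            # Rule: predict `sign` when x > threshold, otherwise `-sign`.
            error = sum(w for (x, y), w in zip(examples, weights)
                        if (sign if x > threshold else -sign) != y)
            if error < best_error:
                best_error, best_rule = error, (threshold, sign)
    threshold, sign = best_rule
    return lambda x: sign if x > threshold else -sign


def adaboost_sketch(learner, examples, K):
    """Illustrative boosting loop mirroring the update shown in the notebook."""
    N = len(examples)
    epsilon = 1.0 / (2 * N)
    w = [1.0 / N] * N                      # all examples start with equal weight
    hypotheses, votes = [], []
    for _ in range(K):
        h = learner(examples, w)
        error = sum(wj for (x, y), wj in zip(examples, w) if h(x) != y)
        # Clip to avoid dividing by zero at 0% or 100% weighted error.
        error = min(max(error, epsilon), 1 - epsilon)
        # Shrink the weights of correctly classified examples, then renormalize,
        # which raises the relative weight of the misclassified ones.
        w = [wj * error / (1 - error) if h(x) == y else wj
             for (x, y), wj in zip(examples, w)]
        total = sum(w)
        w = [wj / total for wj in w]
        hypotheses.append(h)
        votes.append(math.log((1 - error) / error))   # vote weight of this hypothesis

    def ensemble(x):
        # Weighted-majority vote over labels in {-1, +1}.
        score = sum(z * h(x) for h, z in zip(hypotheses, votes))
        return 1 if score > 0 else -1

    return ensemble


# Toy data (an assumption): label +1 inside the interval [3, 6], -1 outside it.
data = [(x, +1 if 3 <= x <= 6 else -1) for x in range(10)]
single = adaboost_sketch(stump_learner, data, 1)
boosted = adaboost_sketch(stump_learner, data, 3)
single_correct = sum(single(x) == y for x, y in data)
boosted_correct = sum(boosted(x) == y for x, y in data)
print(single_correct, boosted_correct)
```

On this toy interval problem a single stump misclassifies three of the ten points, while three boosted stumps classify all ten correctly. Other datasets or values of `K` may behave differently.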
