<html>
<head>
<link href="style.css" rel="stylesheet" type="text/css"/>
<title>
Design and Analysis of Algorithms: Greedy Algorithms
</title>
</head>
<body>
<h1>
Design and Analysis of Algorithms: Greedy Algorithms
</h1>
<div style="text-align:center">
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/The_worship_of_Mammon.jpg/170px-The_worship_of_Mammon.jpg">
<br>
</p>
</div>
<h2>
A family of problems and appropriate solutions
</h2>
<p>
It is important to recognize that we have been dealing with a
family of problems and solutions to those problems:
</p>
<ul>
<li><b>Optimal substructure</b>:
<br>Recursive solution.
<li><b>Optimal substructure with overlapping sub-problems</b>:
<br>Recursion with memoization or dynamic programming.
<li><b>Optimal substructure where only the locally optimal choice
matters</b>:
<br>Greedy algorithms.
</ul>
<h2>
What is a greedy algorithm?
</h2>
<p>
We have an optimization problem.
<br>
At each step of the algorithm, we have to make a choice, e.g.,
cut the rod here, or cut it there.
<br>
Sometimes, we need to calculate the result of all possible
choices.
</p>
<ul>
<li>When we do so from the top down, we have a <i>recursive
algorithm</i>. A naive recursive algorithm may be very
expensive, but we can significantly reduce its run-time by
<i>memoizing</i> it.
<li>When we do this calculation from the bottom up, we are
employing <i>dynamic programming</i>.
</ul>
<p>
But sometimes, we can do much better than either of those
choices. Sometimes, we don't need to consider the global
situation at all: we can simply make the best choice among the
options provided by the local problem we face, and then
continue that procedure for all subsequent local problems.
<br>
<br>
An algorithm that operates in such a fashion is a <i>greedy
algorithm</i>. (The name comes from the idea that the
algorithm <i>greedily</i> grabs the best choice available to it right
away.)
<br>
<br>
Clearly, not all problems can be solved by greedy algorithms.
Consider this simple shortest path problem:
<br>
<br>
<img src="graphics/ShortPath.png">
<br>
<br>
A greedy algorithm choosing the shortest path from <i>a</i> to <i>d</i> will
wrongly head to <i>b</i> first, rather than to <i>c</i>.
<br>
<br>
And that provides us a <a href="#hw1">homework problem</a>.
<br>
</p>
<h3>
Minimum spanning trees
</h3>
<p>
<b>Tree</b>: A connected graph with no cycles.
<br>
Given a connected graph G, any tree built from G's edges that
includes all of the vertices of G is called a <i>spanning
tree</i>. The lowest-weight such tree is a <i>minimum spanning
tree</i>.
<br>
<br>
These are used to solve problems such as:
</p>
<ul>
<li>Lowest cost way to bring a package between two
cities.
<li>Most efficient way to connect two components on a
circuit board.
<li>Most efficient way to connect two strangers in a
social network.
</ul>
<p>
Finding a minimum spanning tree will be our first example of a
greedy algorithm.
<br>
<br>
It operates on a weighted, connected graph.
</p>
<h4>
Kruskal's Algorithm
</h4>
<p>
<img src="graphics/MinSpanTree1.png">
<br>
<br>
We add edges in increasing-cost order, so long as the edges
don't create a cycle (are "safe").
<br>
Steps:
</p>
<ul>
<li>Include the edge weighted 1.
<li>Include both edges weighted 2.
<li>Include the edge weighted 3.
<li>Include the upper
and lower edges weighted 4.
(Not the middle one!)
<li>Include the lone edge weighted 5.
<li>Include the edge weighted 6.
</ul>
<p>
<img src="graphics/MinSpanTree2.png">
<br>
<br>
<b>Proof</b>: Is Kruskal's algorithm guaranteed to always
find the minimum spanning tree?
<br>
Yes, it is. Let's prove it.
<br>
We suppose that graph <i>G</i> has <i>n</i> vertices.
Then our algorithm will create a tree <i>T</i>
with edges <i>e<sub>1</sub></i>, <i>e<sub>2</sub></i>,
... <i>e<sub>n - 1</sub></i>, where
<i>w(e<sub>1</sub>) < w(e<sub>2</sub>)
< ... < w(e<sub>n - 1</sub>)</i>.
<br>
Suppose that there is a tree <i>T*</i> with a lesser
weight.
<br>
Let <i>e<sub>k</sub></i> be the first edge in <i>T</i>
that is not in <i>T*</i>.
<br>
Now we insert <i>e<sub>k</sub></i> in <i>T*</i>. This will
produce a cycle in <i>T*</i>, by the nature of trees.
There must be some edge <i>e<sup>*</sup></i> that is in
<i>T*</i> but not in <i>T</i> (otherwise <i>T</i> would
have a cycle).
<br>
<img src="graphics/Kruskal.png">
<br>
But the weight of <i>e<sub>k</sub></i> must be less than
the weight of <i>e<sup>*</sup></i>, because after we had
inserted <i>e<sub>1</sub></i> through
<i>e<sub>k - 1</sub></i>, we could have next chosen
<i>e<sup>*</sup></i>... but we did not. Instead we chose
<i>e<sub>k</sub></i>.
<br>
So <i>T*</i> does not have a lesser weight after all.
</p>
<h4>
How to find safe edges
</h4>
<ul>
<li>Place each vertex in its own set. (We now have a
<i>forest</i> of one-vertex, unconnected trees.)
<li>We examine edges (<i>u, v</i>) in non-decreasing order.
<li>If <i>u</i> and <i>v</i> are not in the same set (tree), add the
edge. (It is safe.)
<li>If <i>u</i> and <i>v</i> are in the same set, adding the edge
would create a cycle (by the nature of trees,
|E| = |V| - 1). A union-find sketch of this procedure
appears below.
</ul>
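<p>
Below is a minimal Python sketch of Kruskal's algorithm using a
union-find (disjoint-set) structure for the safe-edge test. The edge
representation, a list of (weight, u, v) tuples, and the tiny sample
graph are assumptions made for this illustration; they are not taken
from the course code.
</p>
<pre>
def kruskal(vertices, edges):
    """Return a list of edges forming a minimum spanning tree."""
    parent = {v: v for v in vertices}   # each vertex starts in its own set

    def find(v):
        # Walk up to the set representative, compressing as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = []
    for weight, u, v in sorted(edges):        # non-decreasing weight order
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                  # different trees: the edge is safe
            parent[root_u] = root_v           # union the two trees
            mst.append((weight, u, v))
    return mst


verts = ["a", "b", "c", "d"]
edges = [(1, "a", "b"), (4, "b", "c"), (3, "a", "c"), (2, "c", "d")]
print(kruskal(verts, edges))   # [(1, 'a', 'b'), (2, 'c', 'd'), (3, 'a', 'c')]
</pre>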
<h2>
An activity selection problem
</h2>
<p>
Suppose we need to schedule a lecture hall with the goal of
maximizing the number of lectures it can hold, given the
constraint that no two lectures can use the hall at the same time.
<br>
We have the activities sorted in order of increasing finish
time:
<br>
<i>f<sub>1</sub> ≤ f<sub>2</sub> ≤ f<sub>3</sub>
≤ ... ≤ f<sub>n - 1</sub> ≤ f<sub>n</sub></i>.
<br>
<br>
We might have the following set of activities:
</p>
<table>
<tr>
<th>
<i>i</i>
</th>
<th>
1
</th>
<th>
2
</th>
<th>
3
</th>
<th>
4
</th>
<th>
5
</th>
<th>
6
</th>
<th>
7
</th>
<th>
8
</th>
<th>
9
</th>
<th>
10
</th>
<th>
11
</th>
</tr>
<tr>
<th>
<i>s<sub>i</sub></i>
</th>
<td>
1
</td>
<td>
3
</td>
<td>
0
</td>
<td>
5
</td>
<td>
3
</td>
<td>
5
</td>
<td>
6
</td>
<td>
8
</td>
<td>
8
</td>
<td>
2
</td>
<td>
12
</td>
</tr>
<tr>
<th>
<i>f<sub>i</sub></i>
</th>
<td>
4
</td>
<td>
5
</td>
<td>
6
</td>
<td>
7
</td>
<td>
9
</td>
<td>
9
</td>
<td>
10
</td>
<td>
11
</td>
<td>
12
</td>
<td>
14
</td>
<td>
16
</td>
</tr>
</table>
<p>
<br>
We have several sets of mutually compatible lectures, e.g.:
<br>
{<i>a</i><sub>3</sub>,
<i>a</i><sub>9</sub>,
<i>a</i><sub>11</sub>}
<br>
However, this set is larger:
<br>
{<i>a</i><sub>1</sub>,
<i>a</i><sub>4</sub>,
<i>a</i><sub>8</sub>,
<i>a</i><sub>11</sub>}
<br>
<br>
How do we find the maximum subset of lectures we can
schedule?
</p>
<h3>
The optimal sub-structure of the problem
</h3>
<p>
It is easy to see that the problem exhibits optimal
substructure. <i>If</i> the optimal solution includes a
lecture from 4 to 6, then we have the sub-problems of
scheduling lectures that end by 4, and lectures that start
at 6 or after. Obviously, we must solve each of those
sub-problems in an optimal way, or our solution will not be
optimal! (Think of our problem of traveling from MetroTech
Center to Yankee Stadium via Grand Central.) Our textbook
calls this a "cut-and-paste" argument: we can cut an
optimal set of lectures ending by 4 and paste it into our
supposed optimal solution: if the solution was different
before 4, we have improved it, so it wasn't actually
optimal!
<br>
<br>
It is straightforward to see that we can solve this problem
with a recursive, memoized algorithm -- we examine each
possible "cut" of a "middle" lecture, and recursively solve
the start-to-middle problem, and the middle-to-end problem.
Or we could use a bottom-up dynamic programming algorithm,
developed from the recursive solution in the same way we
saw in our last lecture.
</p>
<h3>
Making the greedy choice
</h3>
<p>
But we can solve this problem much more efficiently. At
each step in solving it, we can make a completely <i>local</i>
choice: what activity (still possible) finishes earliest?
<br>
<br>
Since we have sorted the activities in order of increasing
finish time, we simply choose <i>a</i><sub>1</sub>, since
it is guaranteed to finish at least tied for first.
<br>
<br>
The proof that this will give us a maximal set of
activities is trivial: Let us suppose <i>a</i><sub>j</sub>
is the activity in some set that finishes first, but that
there is a maximal subset A<sub>k</sub> that does <i>not</i> include
<i>a</i><sub>j</sub>. We simply remove the first element of
A<sub>k</sub> and substitute in <i>a</i><sub>j</sub>. This
is guaranteed to be a compatible set of activities, since
<i>a</i><sub>j</sub> finishes first, and it will be
maximal, since it is the same size as A<sub>k</sub>. (Note
that the textbook offers us an example of such sets on page
415.)
</p>
<h3>
A recursive greedy algorithm
</h3>
<p>
The algorithm is straightforward: find the first compatible
activity, then call the algorithm again with the rest of the activity list.
<br>
<br>
One trick worth noting: the authors put a dummy activity in the
first position of the list, so that there is no special first
call of the function.
<br>
There is a <b>design pattern</b> here: oftentimes, it is
better to modify a data structure than to do special coding for
end cases.
</p>
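<p>
Here is a minimal sketch of that recursive selector, assuming the
activities are already sorted by finish time and that a dummy activity
with finish time 0 occupies index 0, as in the textbook's trick. The
start and finish times are the ones from the table above.
</p>
<pre>
def recursive_activity_selector(s, f, k, n):
    """Indices of a maximum set of compatible activities after activity k."""
    m = k + 1
    while m <= n and s[m] < f[k]:   # skip activities that start before a_k ends
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []


# Index 0 is the dummy activity, so the first call needs no special case.
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selector(s, f, 0, 11))   # [1, 4, 8, 11]
</pre>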
<h3>
An iterative greedy algorithm
</h3>
<p>
As usual, it is fairly simple to transform the recursive
algorithm into an iterative version. The authors discuss tail
recursion in this section: let's review what this is.
</p>
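<p>
And here is a sketch of the iterative version, which replaces the tail
call with a loop (same assumptions and data as the recursive sketch
above).
</p>
<pre>
def greedy_activity_selector(s, f):
    """s and f are start/finish times sorted by finish time; index 0 is a dummy."""
    n = len(s) - 1
    selected = [1]          # a_1 finishes first, so it is always a safe choice
    k = 1                   # index of the most recently selected activity
    for m in range(2, n + 1):
        if s[m] >= f[k]:    # a_m starts after a_k finishes: compatible
            selected.append(m)
            k = m
    return selected


s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(greedy_activity_selector(s, f))   # [1, 4, 8, 11]
</pre>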
<h2>
Elements of the greedy strategy
</h2>
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/da/Greedy_algorithm_36_cents.svg/280px-Greedy_algorithm_36_cents.svg.png">
</p>
<ol>
<li>Determine the optimal substructure of the problem.
<li>Develop a recursive solution.
<li>Show that if we make the greedy choice, only one
sub-problem remains.
<li>Prove that it is always safe to make the greedy
choice.
<li>Develop a recursive algorithm that implements the
greedy strategy.
<li>Convert the recursive algorithm to an iterative algorithm.
</ol>
<h3>
Greedy-choice property
</h3>
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/8/8c/Greedy-search-path-example.gif">
<br>
<br>
For a greedy algorithm to work, the optimal choice must not
depend upon any sub-problems or any future choices.
<br>
<br>
To prove that a greedy choice will be appropriate for some
problem, we typically examine an optimal solution, and then
show that substituting in a greedy choice will also yield an
optimal solution.
</p>
<h3>
Optimal substructure
</h3>
<p>
This is simpler than for dynamic programming problems:
all we need to do is show that a greedy choice combined with
an optimal solution to the sub-problem of the rest of
the data gives us an optimal solution to the
original problem.
We use induction to show that making the
greedy choice at every
step produces an optimal solution.
</p>
<h3>
Greedy versus dynamic programming
</h3>
<p>
Two knapsack problems:
</p>
<ol>
<li>A thief can take entire items from a store or not, and
put them in his knapsack.
<li>A thief can take whatever fraction of an item he wants,
and put it in his knapsack.
</ol>
<p>
The latter can be solved with a greedy algorithm, the former
cannot.
</p>
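<p>
Here is a minimal sketch of the greedy solution to the fractional
(second) knapsack problem: take items in order of decreasing value per
unit of weight, taking a fraction of the last item if it does not fit.
The item values and weights below are made up for this example.
</p>
<pre>
def fractional_knapsack(items, capacity):
    """items: list of (value, weight); returns the maximum attainable value."""
    total = 0.0
    # Consider items by value density, best first.
    for value, weight in sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)       # the whole item, or whatever still fits
        total += value * (take / weight)
        capacity -= take
    return total


# Items worth 60/10kg, 100/20kg, and 120/30kg, with a 50kg knapsack.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))   # 240.0
</pre>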
<h2>
Huffman codes
</h2>
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/82/Huffman_tree_2.svg/220px-Huffman_tree_2.svg.png">
<br>
<br>
Huffman coding is a way of compressing character data by creating
short binary representations for the characters that occur most
frequently in the data, and using longer representations for
characters that occur less frequently.
</p>
<h3>
Prefix codes
</h3>
<p>
Prefix codes are coding schemes in which no codeword
is a prefix of a different codeword.
<br>
This makes decoding easier -- no lookahead is needed.
(<b>else</b> and <b>else-if</b>)
A small decoding sketch follows the table below.
</p>
<table>
<tr>
<td>
</td>
<th>
a
</th>
<th>
b
</th>
<th>
c
</th>
<th>
d
</th>
<th>
e
</th>
<th>
f
</th>
</tr>
<tr>
<th>
Codeword
</th>
<td>
0
</td>
<td>
101
</td>
<td>
100
</td>
<td>
111
</td>
<td>
1101
</td>
<td>
1100
</td>
</tr>
</table>
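<p>
Here is a small sketch of decoding with the codewords above, showing
why no lookahead is needed: we emit a symbol the moment the accumulated
bits match a codeword, which is unambiguous precisely because no
codeword is a prefix of another.
</p>
<pre>
CODES = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
DECODE = {code: ch for ch, code in CODES.items()}

def decode(bits):
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in DECODE:            # first match wins: it cannot be a prefix
            out.append(DECODE[buf])  # of some longer codeword
            buf = ""
    return "".join(out)


print(decode("0101100"))   # "abc": 0 | 101 | 100
</pre>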
<h3>
Constructing a Huffman code
</h3>
<p>
We build a binary tree from the bottom up, starting with the
two least frequent characters and building up from there. This
ensures that the least frequent characters have the longest codes.
<br>
<br>
Consider the phrase "Mississippi River".
This is 136 bits in 8-bit ASCII encoding.
<br>
<br>
Here is the Huffman coding for it:
<br>
<img src="graphics/Huffman.png">
<br>
<br>
I = 00
<br>
S = 01
<br>
P = 100
<br>
R = 101
<br>
M = 1100
<br>
V = 1101
<br>
E = 1110
<br>
_ = 1111
<br>
<br>
The final string (46 bits):
<br>
1100000101000101001001000011111010011011110101
<br>
<br>
Try parsing it, and convince yourself that there is
only one possible interpretation of it. That is what
the prefix coding buys us.
</p>
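<p>
Below is a minimal sketch of the bottom-up construction using a
min-heap, applied to the phrase above (lower-cased for simplicity).
The exact codewords can differ from the ones listed when frequencies
tie, but the total encoded length comes out the same: 46 bits.
</p>
<pre>
import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


codes = huffman_codes("mississippi river")
print(codes)
print(sum(len(codes[ch]) for ch in "mississippi river"))   # 46
</pre>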
<h3>
Correctness of Huffman's algorithm
</h3>
<p>
We begin operating on an alphabet Σ. At each step,
we create a new alphabet, Σ', with two symbols of
Σ replaced by a new "meta-symbol".
<br>
<br>
<b>Base case</b>: For an alphabet of two symbols, the
algorithm outputs 0 for one of them, and 1 for the
other. And that is clearly optimal! (An alphabet of
size 1 is a non-problem!)
<br>
<br>
<b>Inductive step</b>: Assume the algorithm is correct
for input of size <i>n</i> - 1.
<br>
At each step in the algorithm, we replace the two least
frequent remaining symbols <i>a</i> and <i>b</i> with a
new symbol <i>ab</i>, whose probability is the sum of
the probabilities of <i>a</i> and <i>b</i>.
<br>
We always want the lowest frequency symbols to have the
longest encoding length.
<br>
Any choice among symbols at the lowest frequency will
be fine, since they all have equally low frequency. So
we can pair any of those lowest frequency symbols as we
wish.
<br>
Now assume that there is an encoding Σ'' that is
derived from Σ (like Σ') but differs in
choosing to combine <i>x</i> and <i>y</i> into
<i>xy</i>. But since we made the greedy choice,
<i>xy</i> can at best be tied with <i>ab</i>, meaning
Σ'' is at best another optimal encoding, and our
encoding Σ' is optimal after all.
</p>
<h2>
Matroids and greedy methods
</h2>
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/b7/Vamos_matroid.svg/220px-Vamos_matroid.svg.png">
<br>
<br>
</p>
<h3>
Matroids
</h3>
<h4>
A graph
</h4>
<p>
Consider the following graph, where the edges represent
roads and the vertices towns:
<br>
<br>
<img src="graphics/MatroidGraph.png">
<br>
<br>
If we will only build a road when there is no other route
between the towns it connects, what sets of roads are
possible?
<br>
</p>
<ul>
<li>∅
<li>{a} {b} {c} {d} {e}
<li>{a, b} {a, c} {a, d} {a, e} {b, c} {b, d} {b, e}
{c, d} {c, e}
<li>{a, b, c} {a, b, d}, {a, b, e} {a, c, d} {a, c, e}
</ul>
<p>
Notice:
</p>
<ul>
<li>f is useless!
<li>a is in every maximal set.
<li>d and e are never in the same set.
<li>b, c and d are never in the same set.
<li>b, c and e are never in the same set.
</ul>
<h4>
A vector space
</h4>
<p>
Now look at the following vector space in
ℝ<sup>3</sup>:
<br>
<br>
<img src="graphics/MatroidVectors.png">
<br>
(f is the vector 0, 0, 0.)
<br>
<br>
What sets of linearly independent vectors are possible?
</p>
<ul>
<li>∅
<li>{a} {b} {c} {d} {e}
<li>{a, b} {a, c} {a, d} {a, e} {b, c} {b, d} {b, e}
{c, d} {c, e}
<li>{a, b, c} {a, b, d}, {a, b, e} {a, c, d} {a, c, e}
</ul>
<p>
Notice:
</p>
<ul>
<li>f is useless!
<li>a is in every maximal set.
<li>d and e are never in the same set.
<li>b, c and d are never in the same set.
<li>b, c and e are never in the same set.
</ul>
<h4>
A matching problem
</h4>
<p>
You live in a small town with 3 women and 6 men who are
single but might be married off. There is a town
matchmaker who makes money by arranging marriages, and
so he wants to arrange as many as possible.
<br>
In this town, it's "ladies' choice," and the women are
numbered 1-3, and the men are a-f.
<br>
<br>
<img src="graphics/MatroidMatching.png">
<br>
<br>
What sets of matches are possible?
</p>
<ul>
<li>∅
<li>{a} {b} {c} {d} {e}
<li>{a, b} {a, c} {a, d} {a, e} {b, c} {b, d} {b, e}
{c, d} {c, e}
<li>{a, b, c} {a, b, d}, {a, b, e} {a, c, d} {a, c, e}
</ul>
<p>
Notice:
</p>
<ul>
<li>f is useless!
<li>a is in every maximal set.
<li>d and e are never in the same set.
<li>b, c and d are never in the same set.
<li>b, c and e are never in the same set.
</ul>
<h4>
The definition of a matroid
</h4>
<p>
We are looking at general conditions for <i>independence</i>
of choices. All three problems turn out to be
surprisingly similar. They are all <i>matroids</i>.
<br>
<br>
We have a matroid when the following independence
conditions are satisfied:
</p>
<ol>
<li>∅ is independent.
<li>If set J is independent and I is a subset of J,
then I is independent.
<li>If I and J are independent, and cardinality(I) <
cardinality(J), then there is some <i>j</i> in J
and not in I, such that I ∪ {<i>j</i>} is also
independent.
</ol>
<p>
A matroid consists of (S, I), where S is a set, and I
is a set of subsets of S, for which axioms 1-3
above are satisfied.
<br>
<br>
All three situations described above, the vector space,
the graph, and the matching problem, turn out to be the
<i>same</i> matroid. (Technically, they are <i>isomorphic</i>
matroids.)
</p>
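<p>
As a sanity check, here is a small sketch that tests the three axioms
against the family of independent sets listed in the examples above
(the element names and the family itself are copied from those lists).
</p>
<pre>
from itertools import combinations

independent = (
    {frozenset()}
    | {frozenset(x) for x in "abcde"}
    | {frozenset(p) for p in ["ab", "ac", "ad", "ae", "bc", "bd", "be", "cd", "ce"]}
    | {frozenset(t) for t in ["abc", "abd", "abe", "acd", "ace"]}
)

def is_matroid(indep):
    if frozenset() not in indep:                       # axiom 1
        return False
    for J in indep:                                    # axiom 2: subsets stay independent
        if any(frozenset(sub) not in indep
               for r in range(len(J)) for sub in combinations(J, r)):
            return False
    for I in indep:                                    # axiom 3: exchange property
        for J in indep:
            if len(I) < len(J) and not any((I | {j}) in indep for j in J - I):
                return False
    return True


print(is_matroid(independent))   # True
</pre>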
<h4>
The bases of a matroid
</h4>
<p>
A subset in I is a <i>basis</i> for a
matroid if it is a <i>maximal</i>
subset, which here means it is not contained in any
larger subset that is in I.
<br>
<br>
So in the above examples, the bases are:
<br>
{a, b, c} {a, b, d}, {a, b, e} {a, c, d} {a, c, e}
<br>
<br>
<b>Theorem</b>: Every basis of a matroid is of the same size.
<br>
<b>Proof</b>: Let us assume that there are two bases
A and B of a matroid such that |B| < |A|.
<br>
Then by the exchange axiom (#3 above), there is some
element <i>a</i> of A that can be added to B, and B
will still be independent.
<br>
Therefore, B was <i>not</i> a maximal independent subset after
all!
<br>
<br>
Translation into the different matroid realms:
<br>
<br>
The dimension of a vector space is the size of any of
its bases.
<br>
<br>
Every spanning tree has the same number of edges.
<br>
<br>
We are proving things about vector spaces and graphs
simply by examining the axioms of matroids!
</p>
<h4>
Credit
</h4>
<p>
This section draws on lectures by Federico Ardila:
<br>
<a href="https://www.youtube.com/watch?v=4EvpzV_3RXI">https://www.youtube.com/watch?v=4EvpzV_3RXI</a>
<br>
<a href="https://www.youtube.com/watch?v=pe5MaEugAwg&list=PL-XzhVrXIVeSu_b29hbX5xJ0bRThokU8a">https://www.youtube.com/watch?v=pe5MaEugAwg&list=PL-XzhVrXIVeSu_b29hbX5xJ0bRThokU8a</a>
</p>
<h3>
Greedy algorithms on a weighted matroid
</h3>
<p>
<b>Q</b>: Why do greedy algorithms work on a matroid?
<br>
<b>A</b>: Independence!
</p>
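<p>
The generic algorithm is short. Here is a sketch: consider the
elements in order of decreasing weight, and keep each one whose
addition leaves the chosen set independent. The independence oracle
and the uniform-matroid example used to exercise it are assumptions
made for this illustration.
</p>
<pre>
def greedy_matroid(elements, weight, is_independent):
    """Greedily build a maximum-weight independent set of a matroid."""
    chosen = set()
    for x in sorted(elements, key=weight, reverse=True):
        if is_independent(chosen | {x}):   # keep x only if independence survives
            chosen.add(x)
    return chosen


# Example with the uniform matroid U(2, 4): any set of at most two of
# the four elements is independent.
weights = {"a": 5, "b": 9, "c": 2, "d": 7}
print(sorted(greedy_matroid(weights, weights.get, lambda s: len(s) <= 2)))  # ['b', 'd']
</pre>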
<h2>
A task-scheduling problem as a matroid
</h2>
<p>
<img
src="https://upload.wikimedia.org/wikipedia/commons/thumb/0/0c/Thread_pool.svg/400px-Thread_pool.svg.png">
<br>
<br>
The surprise here is that we can simply greedily grab tasks in
order of increasing deadlines, and our algorithm will work.
</p>
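<p>
Here is a sketch of the independence test behind that claim, for
unit-time tasks: a set of tasks can all meet their deadlines exactly
when, taken in order of increasing deadline, the i-th task's deadline
is at least i. The deadlines used below are illustrative.
</p>
<pre>
def schedulable(deadlines):
    """True if unit-time tasks with these deadlines can all finish on time."""
    return all(d >= slot
               for slot, d in enumerate(sorted(deadlines), start=1))


print(schedulable([2, 1, 3]))   # True:  run them in deadline order
print(schedulable([1, 1, 3]))   # False: two tasks cannot both finish by time 1
</pre>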
<h2>
Source Code
</h2>
<p>
<a
href="https://github.com/gcallah/algorithms/blob/master/python/greedy.py">
Python
</a>
<br>
<a
href="https://github.com/gcallah/algorithms/blob/master/ruby/greedy.rb">
Ruby
</a>
</p>
<h2>
External Links
</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Greedy_algorithm">
Greedy algorithm
</a>
<li><a href="https://en.wikipedia.org/wiki/Huffman_coding">
Huffman coding
</a>
<li><a href="https://en.wikipedia.org/wiki/Matroid">
Matroid
</a>
</ul>
<h2>
Homework
</h2>
<ol>
<li id="hw1">
Our intuition might tell us that choosing the shortest
subway path from MetroTech Center to Yankee Stadium
is a simpler problem than choosing a minimum
spanning tree for all New York subway
stations. Yet the latter can be solved with a greedy algorithm,
while the former cannot. Why?
<br>
<br>
<b>Answer</b>: The spanning tree problem is a matroid,
and so deals with independent sub-problems. The
different paths between subway stops are <i>not</i>
independent: some choices foreclose other choices.
<br>
<li>Why is the run-time analysis of the rod cutting problem not
the same as that of the matrix chain problem?
<br>
<br>
<b>Answer</b>: The matrix chain order problem can nest
the sub-problems in many ways, while the rod-cutting
problem simply makes one level of cuts. It would be
more like the matrix problem if it also bundled
packages of cut rods into higher level bundles.
<br>
Or, rod cutting: ---|--|--|--|------|--
<br>
The vertical lines represent cuts.
<br>
Whereas, matrix chaining: (((A A) (A (A A))) (A A) ((A A
A)) A)
<br>
<li>For the activity selection problem, besides simply
choosing the activity that finishes first, what other
greedy choices could we make?
<br>
<br>
<b>Answer</b>: The activity that starts first, or the
activity with the smallest time requirement.
<br>
<li>On page 419, line 2 of the pseudo-code contains a loop, and the
comment describing why it loops is incorrect. What
is actually going on there? Why isn't the code just
grabbing the first element in the activity array, which
must be the one that finishes first?
<br>