<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Lab #0: Python and scikit-learn</title>
</head>
<body>
<!--<h1>Lab #0: Python and scikit-learn</h1>-->
<h2>Objectives</h2>
<p>The objectives of this lab are to:</p>
<ul>
<li>Be sure that you can log on to the computer system</li>
<li>Have the right programming environment</li>
<li>Have a hands-on introduction to Python</li>
<li>Know a few Unix tools to derive word statistics</li>
<li>Know the main functions of scikit-learn</li>
</ul>
<p>Attendance at this initial session, in the computer room or through Discord, is not compulsory.
Its goal
is to make sure that all students have the elementary Python programming skills they need for the course.
If you already know Python and scikit-learn well enough, you can skip the lab.
However, each student must run a short spelling-correction program at home and comment on it.
This last exercise and the accompanying report are compulsory. See the last section of this page.
</p>
<h2>Organization and location</h2>
<p>The initial lab session will take place on</p>
<ol>
<li>Group 1, August 31, 2021, 13:15 to 15:00, in the Beta room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 2, August 31, 2021, 13:15 to 15:00, in the Gamma room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 3, August 31, 2021, 15:15 to 17:00, in the Gamma room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 4, September 1, 2021, 13:15 to 15:00, in the Alpha room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 5, September 1, 2021, 13:15 to 15:00, in the Varg room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 6, September 1, 2021, 15:15 to 17:00, in the Alpha room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
<li>Group 7, September 1, 2021, 15:15 to 17:00, in the Varg room,
<br/>
Discord link: https://discord.gg/83wWpF7
<br/>
</li>
</ol>
<p>There can be last-minute changes. Please always check the official times here:
<a
href="https://cloud.timeedit.net/lu/web/lth1/ri1X50gQ6560YfQQ15Z5771Y0Zy7007335Y67Q565.html">
https://cloud.timeedit.net/lu/web/lth1/ri1Q5006.html
</a>
</p>
<p>In this lab, we will review Python syntax and some tools. Attendance is not compulsory,
but you must run Peter Norvig's spell checker at home to make sure
you understand Python.
</p>
<h2>Outline</h2>
<ol>
<li>We will use Python 3 and the Anaconda distribution in the labs:
<a href="https://www.anaconda.com/distribution">https://www.anaconda.com/distribution</a>.
Anaconda has most packages we need for the course.
</li>
<li>Anaconda is available on the LTH lab machines. You add it to your path by running:
<br/>
<tt>$ initcs</tt>
<br/>
If you use a personal machine, you will have to download and install it.
</li>
<li>
<tt>regex</tt>
is one of the few modules that are not in Anaconda:
<a href="https://pypi.python.org/pypi/regex/">https://pypi.python.org/pypi/regex/</a>.
<br/>
<tt>regex</tt>
can handle regular expressions and Unicode. It should already be installed on the LTH lab machines.
On your personal machine, install it with <tt>pip</tt>:
<br/>
<tt>python -m pip install --upgrade regex</tt>
</li>
<li>You will carry out the labs with Jupyter notebooks. We will write code
snippets (cells) that you will run interactively.
Start Jupyter with:
<br/>
<tt>$ jupyter lab</tt>
<br/>
or
<br/>
<tt>$ jupyter notebook</tt>
</li>
<li>Instead of Jupyter notebooks, you may prefer an integrated development environment (IDE). I recommend PyCharm:
<a href="https://www.jetbrains.com/pycharm/">https://www.jetbrains.com/pycharm/</a>.
The community edition is free.
<br/>
PyCharm should be available on the lab computers. If it is not, add the Python plugin to IntelliJ
instead.
Run:
<br/>
<tt>$ intellij-idea-community</tt>
<br/>
then select Configure and add the Python plugin.
</li>
<li>
On the LTH machines, the <tt>regex</tt> module is not available from PyCharm by default.
You first need to configure your environment. To do so, in the File menu, select
Settings..., then Project and Project Interpreter. In the Project Interpreter box,
at the top of the right pane, add the new interpreter by clicking the cog icon and then Add...
Then select Anaconda Python:
<tt>/usr/local/anaconda3/bin/python</tt>
</li>
</ol>
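<p>Once the setup steps above are done, a quick way to confirm that the interpreter you selected can see the
packages is to ask Python's import system directly. The snippet below is a minimal sketch using only the
standard library. The list of module names is an assumption: <tt>regex</tt> is named in this page,
<tt>sklearn</tt> is the import name of scikit-learn, and you can extend the list as you wish.</p>

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that the current interpreter cannot locate."""
    # find_spec returns None when a top-level module cannot be found;
    # it does not import the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of import names for this lab; adjust as needed.
REQUIRED = ["regex", "sklearn"]

print("Python", sys.version.split()[0])
missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

<p>Run it both in a Jupyter cell and in PyCharm: if the two report different results, they are probably not
using the same interpreter.</p>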
<h2>Course of the lab</h2>
<p>In the lab session, your instructors will walk you through Python, Unix, and scikit-learn. You will:</p>
<ol>
<li>run all the code in the chapter "A Tour of Python", available here:
<a href="https://github.com/pnugues/edan20/tree/master/notebooks">
https://github.com/pnugues/edan20/tree/master/notebooks
</a>
</li>
<li>count the words of a text with Unix tools. You will use the Unix command:
<pre>$ tr -cs 'A-Za-z' '\n' &lt; text_file | sort | uniq -c | sort -nr | more</pre>
Be sure to understand all the parts of this command.
</li>
<li>run the quick introduction to scikit-learn:
<a href="https://scikit-learn.org/stable/tutorial/basic/tutorial.html">
https://scikit-learn.org/stable/tutorial/basic/tutorial.html
</a>
and understand <tt>fit()</tt> and <tt>predict()</tt>.
</li>
</ol>
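<p>To check your understanding of the Unix word-counting command above, it can help to write the same steps in
Python. In the pipeline, <tt>tr -cs</tt> replaces every run of characters that are not ASCII letters with a
single newline, <tt>sort</tt> brings identical words together, <tt>uniq -c</tt> counts them, and
<tt>sort -nr</tt> orders the counts from highest to lowest. The sketch below mirrors these steps. The function
names and the sample sentence are made up for illustration, and ties are broken alphabetically here, which may
differ from the order <tt>sort -nr</tt> gives.</p>

```python
import re
from collections import Counter


def word_counts(text):
    """Count maximal runs of ASCII letters (case-sensitive, like the tr command)."""
    # Equivalent of tr -cs 'A-Za-z' followed by sort | uniq -c
    words = re.findall(r"[A-Za-z]+", text)
    return Counter(words)


def ranked(counts):
    """Return (word, count) pairs, highest count first, ties in alphabetical order."""
    # Equivalent of sort -nr, with a deterministic tie-break added
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample text; replace it with the contents of your own file.
sample = "The cat saw the other cat. The end!"
for word, count in ranked(word_counts(sample)):
    print(count, word)
```

<p>Like the Unix command, this version treats <tt>The</tt> and <tt>the</tt> as two different words and splits
words at any character outside A to Z and a to z, including accented letters.</p>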
<h2>Compulsory part</h2>
<ol>
<li>Run the spell checker here: <a href="http://norvig.com/spell-correct.html">
http://norvig.com/spell-correct.html</a>. Use Python 3 and make sure you understand all the code and
Python syntax.
</li>
<li>Write an individual description of this program, one to two pages at most, and submit it on the Canvas
site (<a href="https://canvas.education.lu.se/">https://canvas.education.lu.se</a>).
</li>
<li>To write your report, you can either
<ol>
<li>Write your text directly in Canvas, or</li>
<li>Use LaTeX and Overleaf (<a href="https://www.overleaf.com/">
www.overleaf.com</a>). This will probably help you structure your text. You will then upload a
PDF file to Canvas.
</li>
</ol>
Please check your document with a spell checker before you send it.
</li>
<li>
The deadline to hand in your report is <b>September 9, 2021</b>.
</li>
</ol>
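<p>As a warm-up before reading the spell checker, the sketch below illustrates the general idea of correcting a
word by generating every string one edit away and keeping the most frequent known candidate. It is a simplified
illustration and not a substitute for the program you have to describe: the word counts are invented, the
function names are my own, and it only looks one edit away.</p>

```python
from collections import Counter

# Toy word counts, invented for illustration; a real checker would count
# the words of a large text corpus.
COUNTS = Counter({"spelling": 5, "spewing": 1, "selling": 3})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def one_edit_away(word):
    """Return all strings reachable by one deletion, transposition, replacement, or insertion."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most frequent known word at most one edit away, else the word itself."""
    if word in COUNTS:
        return word
    candidates = [w for w in one_edit_away(word) if w in COUNTS]
    return max(candidates, key=COUNTS.get) if candidates else word


print(correct("speling"))
```

<p>When you read the full program, compare how it builds its word counts and how it handles words that are more
than one edit away.</p>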
</body>
</html>