<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="author" content="Max Kapur" />
<meta name="revised" content="2019-10-14" />
<meta lang="en" />
<meta property="og:image" content="cluster.png" />
<link rel="stylesheet" href="./style.css">
<link rel="shortcut icon" type="image/png" href="favicon.png">
<title>Using Data to Understand ELL Students</title>
</head>
<body>
<div class="container">
<main>
<header>
<h1>Using Data to Understand ELL Students</h1>
</header>
<article>
<p class="subtitle">Max Kapur, August 2019</p>
<p>I teach English at a boys’ middle school here in Naju, South Korea. August in Korea is A/C
season,
and not a class goes by that I don’t have students going back and forth asking me to turn the
A/C
up and down. Someone always starts by setting it on full blast to the coldest temperature; once the
room
nears 17°C, someone will say, “Damn, it’s cold” and turn the machine <em>all
the
way</em> off … you can see where this is going.</p>
<p>As I began my second year of teaching, I decided to settle the A/C dispute once and for all by
issuing my
students a survey about the perfect classroom temperature. Oh, and I also asked them about their
motivation for learning English, their opinions about our class and my teaching style, and what
resources they were using to study English outside of class.</p>
<p>I’ve been studying math and stats in my free time here, and this felt like a good chance to
hone my
chops on some new data. If you poke around in the files below, you can get a sense of my process,
but
basically, I did all the analysis in Python, leaning heavily on the Pandas and Seaborn libraries.
</p>
<p>Rather than write a traditional summary, <strong>I’ll just highlight the most important bits
like
this</strong> for those of you who are as tired as I am of staring at a screen all day.</p>
<p>This document was the basis of a talk given October 18, 2019 at Fulbright’s fall conference in
Gyeongju, South Korea. You can see a digital version of the talk <a
href="https://youtu.be/jgAhgY3TVEc">here</a>.</p>
<p>Contents:</p>
<navi>
<ol>
<li><a href="#files">Files</a></li>
<li><a href="#design">Survey Design</a></li>
<li><a href="#first">First Look</a></li>
<li><a href="#corr">Correlation and Causation</a></li>
<li><a href="#cluster">Cluster Analysis</a></li>
<li><a href="#app">Applications</a></li>
</ol>
</navi>
<h2 id="files">Files <a class="top" href="#">↖ back to top</a></h2>
<table class="filelist">
<tr>
<td class="fl">Survey:</td>
<td class="fr"><a
href="https://docs.google.com/document/d/1lEM__8rigiXrKoXnxgsQB6jujSDTvM7GeUcoMV9OaLg/edit?usp=sharing"
target="_blank">Google Doc</a> | <a href="survey.pdf" target="_blank">PDF</a></td>
</tr>
<tr>
<td class="fl">Raw data:</td>
<td class="fr"><a href="survey.xlsx" target="_blank">Excel</a></td>
</tr>
<tr>
<td class="fl">Python code used to generate graphs:</td>
<td class="fr"><a href="analysis.html" target="_blank">HTML</a> | <a href="analysis.ipynb"
target="_blank">Jupyter Notebook</a></td>
</tr>
<tr>
<td class="fl">Digital talk about this project:</td>
<td class="fr"><a href="https://youtu.be/jgAhgY3TVEc" target="_blank">YouTube</a></td>
</tr>
</table>
<h2 id="design">Survey Design <a class="top" href="#">↖ back to top</a></h2>
<p>I’ll refrain from embedding the document because I don’t want Google to stalk you. You
can
get it <a href="survey.pdf">here</a> (to download a PDF) or <a
href="https://docs.google.com/document/d/1lEM__8rigiXrKoXnxgsQB6jujSDTvM7GeUcoMV9OaLg/edit?usp=sharing">here</a>
(to sell your soul).</p>
<p>The survey has nineteen questions. I wanted a high response rate, so I translated all the questions
badly
into Korean and talked the students through some of the trickier ones. <strong>I asked the students
<em>not</em> to write their names</strong> on the surveys, to encourage honest responses.
</p>
<p>Many of the most helpful responses I received came from the optional fill-in-the-blank questions, but
in
this document <strong>I will focus on the quantitative data</strong>, with the hope that the
techniques
I used to analyze it will be of use to other teachers.</p>
<p>The first nine quantitative questions deal with what I would call <strong>“learning
posture”</strong>: what the students’ strategy is for navigating English class and
how
well they feel it’s working. My students know that they are supposed to <em>say</em> learning
English is important, but I was curious if they would do so on an anonymous survey (they
didn’t)
and if their motivation for learning English correlated with anything else, like how often they
asked
questions in class.</p>
<p>I asked the students a couple of questions about <strong>strictness</strong> because I know I am more
lenient than the other teachers at our school. In one question, I asked if “teachers at our
school” in general should be less strict to get a reference point, and then later asked if
<em>I</em> am too lenient.
</p>
<p>A group of five questions then asks the students <strong>what they want more or less of in our
class</strong>: videos, games, competitions, and so on.</p>
<p>The students answered those first fourteen questions on a scale of <code>strongly disagree (1)</code>
to
<code>strongly agree (5)</code>.
</p>
<p>Next I asked the students two binary questions: <strong>whether they attended an after-school English
tutoring center</strong>, and whether they received one-on-one tutoring in English. In the end,
there were only a few students getting one-on-one tutoring, so I couldn’t learn much about
that.
</p>
<p>Finally, I asked them the most important question of all:
<code>What temperature should we put the A/C at?</code>
</p>
<p>I distributed and collected the surveys, <strong>154 in total</strong>, during one class period. I
made a
note of which surveys came from which class sections, which gave me two more data points (grade and
class) for each student. When entering the surveys into Excel, I disregarded nonsense responses,
such as
when a student answered every question with <code>3</code>.</p>
<h2 id="first">First Look <a class="top" href="#">↖ back to top</a></h2>
<p>The most obvious thing to do, once we have all the data in a spreadsheet, is start taking averages
and
see what stands out. There’s a problem with just looking at averages, though: they tell us
nothing
about how the data is actually <em>distributed.</em> Compare these scenarios:</p>
<ul>
<li>Nearly every student answers with <code>agree (4)</code></li>
<li>One third of the students answer with <code>neutral (3)</code>, another third answers
<code>agree (4)</code>, and the final third chooses <code>strongly agree (5)</code>.
</li>
</ul>
<p>In both cases the average is <code>4</code>, but in the second case, the data isn’t as
consistent.
To quantify this uncertainty, I computed <strong><a
href="https://en.wikipedia.org/wiki/Student%27s_t-test#One-sample_t-test">one-sample
<em>t</em>-scores</a> for each question’s average</strong>. The further the
<em>t</em>-score gets from zero, the more confidently we can say that the “true” average
(the average if you could interview an infinite number of students like mine) is <em>not</em>
<code>3</code>. A negative <em>t</em>-score means the true answer is likely lower than
<code>3</code>;
positive, higher.
</p>
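<p>As a rough illustration (this is not the code from the notebook, and the responses below are made up), a one-sample <em>t</em>-score against the neutral answer <code>3</code> can be computed with nothing but the Python standard library. The function name <code>one_sample_t</code> is my own.</p>

```python
import math
import statistics


def one_sample_t(responses, neutral=3.0):
    """One-sample t-score of `responses` against the value `neutral`."""
    n = len(responses)
    mean = statistics.mean(responses)
    # Sample standard deviation (n - 1 in the denominator).
    # Note: this is zero, and the division fails, if every response is identical.
    sd = statistics.stdev(responses)
    return (mean - neutral) / (sd / math.sqrt(n))


# Hypothetical responses, not taken from the survey. Both lists average 4.
consistent = [4] * 28 + [3, 5]            # nearly everyone answers 4
spread = [3] * 10 + [4] * 10 + [5] * 10   # split evenly across 3, 4 and 5

t_consistent = one_sample_t(consistent)
t_spread = one_sample_t(spread)
print(round(t_consistent, 1), round(t_spread, 1))
```

<p>Both lists have the same average, but the consistent one produces a much larger <em>t</em>-score, which is the distinction that averages alone would hide.</p>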
<div class="image">
<a href="questionT.png"><img src="questionT.png" /></a>
<p class="caption">I’ve done something unusual in plotting the <em>t</em>-scores directly, instead
of
averages with error bars. I did this because there are risks inherent in interpreting error bars (or
<a href="https://xkcd.com/882/"><em>p</em>-values</a>) when you compute them for several samples.
Better to just plot the <em>t</em>-scores and see what stands out. Thanks to <a
href="https://stackoverflow.com/questions/57601156/change-axis-along-which-seaborn-applies-color-palette">people
on Stack Overflow</a> for helping me get the color map working.</p>
</div>
<p><strong>More videos and more games</strong>—I saw that coming. Knowing the sometimes-militant
style
of my colleagues, I also expected the students to tell me I wasn’t being strict enough. I
didn’t expect to hear that <strong>both I <em>and</em> the other teachers should be
stricter</strong> (which is what the negative responses to
<code>If the teachers at our school were less strict, I would learn more</code> mean, once you
puzzle
out the conditional).
</p>
<p>Before we get into the other stuff, I’d like to break down these results by class, since the
differences were pretty interesting:</p>
<div class="image">
<a href="classT.png"><img src="classT.png" /></a>
<p class="caption">These are <em>t</em>-scores again. There are two class sections in each grade, so six
columns.</p>
</div>
<p>Let me tell you about class 3-2. They might be my favorite class. They’re slightly noisy and
often
tardy, but they tend to get into a nice rhythm if I supply them with a sufficiently interesting
activity, and I never have trouble getting volunteers to speak. Moreover, they had the highest
average
scores on our speaking test last semester. That’s why <strong>I was surprised to see 3-2 agree
so
uniformly that English was one of their most challenging subjects</strong>.</p>
<p>Another interesting row is
<code>If the teachers at our school were less strict, I would learn more</code>. Here we discover
prominent differences between the classes that had been obscured when the data was aggregated
together.
Broken up like this, and knowing who the homeroom teacher is for each class, I can see that
<strong>students’ assessments of the prototypical “teacher at our school” are
informed
heavily by who <em>their</em> homeroom teacher is</strong>. I’ll, um, leave it at that.
</p>
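<p>To see how pooling can hide differences between sections, here is a small sketch with invented responses to a single statement from two sections. The data, the section labels used as dictionary keys, and the helper <code>one_sample_t</code> are illustrative assumptions, not the survey’s actual numbers or code.</p>

```python
import math
import statistics
from collections import defaultdict


def one_sample_t(responses, neutral=3.0):
    """One-sample t-score of `responses` against the value `neutral`."""
    n = len(responses)
    sd = statistics.stdev(responses)
    return (statistics.mean(responses) - neutral) / (sd / math.sqrt(n))


# Hypothetical (class section, response) pairs for one statement.
rows = [
    ("3-1", 2), ("3-1", 3), ("3-1", 2), ("3-1", 1),
    ("3-2", 4), ("3-2", 5), ("3-2", 4), ("3-2", 4),
]

# Group the responses by section.
by_class = defaultdict(list)
for section, response in rows:
    by_class[section].append(response)

# One t-score per section, plus one for everybody pooled together.
t_by_class = {section: one_sample_t(vals) for section, vals in by_class.items()}
pooled_t = one_sample_t([response for _, response in rows])
print(t_by_class, pooled_t)
```

<p>In this toy data one section leans clearly negative and the other clearly positive, yet the pooled <em>t</em>-score sits close to zero.</p>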
<h2 id="corr">Correlation and Causation <a class="top" href="#">↖ back to top</a></h2>
<p>OK, stare at a neutral surface for thirty seconds before you look at this one.</p>
<div class="image">
<a href="corr.png"><img src="corr.png" /></a>
<p class="caption">I tried to get the labels on the bottom to sit at a 45° angle, but they got all
clumpy and weird.</p>
</div>
<p>This chart shows how students’ responses to each question correlated with their responses to
the
other questions. For example, the <em>r</em>-value of <code>-0.5</code> at the intersection of
<code>English is one of my favorite subjects</code> and
<code>English is one of my most challenging subjects</code> means that students who liked English a
lot
tended to disagree that it was challenging. There are at least three ways to interpret this:
</p>
<ul>
<li>If English comes easily to a student, then he will enjoy it.</li>
<li>If a student enjoys English, then it will feel easy to him.</li>
<li>There is some outside factor (watching American dramas, for example) that leads some students to
both enjoy English and be good at it.</li>
</ul>
<p>When you hear pundits say “correlation is not causation,” this is what they are talking
about. We have to tread carefully here, all the more so because computing 180 correlation
coefficients
is a mildly crappy statistical practice; at that scale, you’ll see some pretty decent
correlations
even in <a href="corr_random.png">random data</a>.</p>
<p>On the flip side, a low correlation coefficient doesn’t necessarily mean there’s no
relationship. Virtually every student circled <code>strongly agree (5)</code> on
<code>More videos</code>; since there’s little variance in the responses, there’s no way
to
“know” how a student who chose a different number might answer the other questions, and
we
get a row of <em>r</em>-values near zero.
</p>
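<p>The first caveat, that many pairwise comparisons turn up decent-looking correlations by chance, can be checked with a quick simulation. This is a sketch under my own assumptions: answers are drawn uniformly at random from 1 to 5, and I use one column per survey question (nineteen), which may not match the exact set of columns in the chart.</p>

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_students = 154        # number of surveys collected
n_questions = 19        # assumption: one column per survey question

# Purely random answers: no column has any real relationship to another.
columns = [
    [rng.randint(1, 5) for _ in range(n_students)]
    for _ in range(n_questions)
]

# Correlate every pair of columns and find the strongest one.
r_values = [pearson_r(a, b) for a, b in itertools.combinations(columns, 2)]
max_abs_r = max(abs(r) for r in r_values)
print(len(r_values), round(max_abs_r, 2))
```

<p>Even though every column is noise, the largest of the pairwise <em>r</em>-values is noticeably far from zero, which is why a single eye-catching cell in the chart deserves some suspicion.</p>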
<p>Those caveats aside, there are a couple of things that stand out to me. One is the sixth column,
which
corresponds to <code>I believe it’s important to learn English</code>. Affirmative responses
here
are the best predictor of affirmative responses to
<code>When I have a question, I tend to ask Max</code>, suggesting that <strong>students who ask a
lot
of questions do so out of intrinsic motivation rather than fear of failure</strong> (among other
possible interpretations).
</p>
<p>Another column I like is the third-to-last column, <code>Do you attend an English hagwon?</code>,
which
students answered with a simple yes or no. For those who don’t know, many Korean students
attend
private tutoring centers (hagwons) after the school day ends. A good English hagwon usually has
smaller
class sizes than school and one or more native speakers on hand. They tend to focus on vocabulary
and
standardized tests. There are hagwons in other subjects, too; a common strategy is to attend an
English
hagwon on Monday and Wednesday, a math hagwon on Tuesday and Thursday, and something
“fun”
like piano (cue music educators, cringing in the back) on Friday.</p>
<p><strong>English hagwons seem to be working.</strong> Students who attend hagwons tended to enjoy
English
and find it less challenging than their peers. Since the hagwons my students attend don’t
generally have a competitive application process, and since it’s usually the parents, not the
students, who choose what subject of hagwon to send their kids to, I don’t <em>think</em> the
positive correlations were merely a function of prior ability or interest, but you could test this
further with a longitudinal study.</p>
<h2 id="cluster">Cluster Analysis <a class="top" href="#">↖ back to top</a></h2>
<p>Are you ready for my favorite part?</p>
<p>One of the neatest statistical techniques I’ve learned about is called cluster analysis. Here,
I’ve used a <a href="https://en.wikipedia.org/wiki/K-means_clustering"><em>k</em>-means
clustering
algorithm</a> to <strong>sort the students into three groups with similar
characteristics</strong>.
I performed the clustering only on the subset of the questions that I felt reflected the
students’
individual learning styles, omitting the responses about whether we should have more or fewer videos
and
so on.</p>
<div class="image">
<a href="cluster.png"><img src="cluster.png" /></a>
<p class="caption">This time, I <em>am</em> showing the averages instead of <em>t</em>-scores. Cluster
analysis has divided the students into similar groups, which violates the <em>t</em>-test’s
assumption of random sampling. Charting the <em>t</em>-scores would just add an unnecessary layer of
abstraction.</p>
</div>
<p>In the chart above, the clusters are split into three columns, and we can summarize the results like
this:</p>
<ol start="0">
<li>Students in cluster <code>0</code> are crushing it. They like English, they think it’s
important, and they usually understand what we’re doing in class. They are proactive in
asking
questions.</li>
<li>Students in cluster <code>1</code> are the most difficult to reach. Not only is English hard for
them, but they also don’t think it’s important and are the least likely to ask
questions. Mercifully, this is the smallest group.</li>
<li>Cluster <code>2</code> is really interesting. These are students who find English challenging
and
lean a lot on their peers to help them complete assignments. But they differ from cluster
<code>1</code> in that they are motivated to learn English (they believe it is important). This
group is the largest of the three, and its members would benefit greatly from asking more
questions.
It is the silent majority.
</li>
</ol>
<p>Obviously, these are generalizations. Many students fall in the space between the various clusters.
An
inappropriate application of cluster analysis would be to partition the classroom into groups of
students based on which cluster the algorithm assigned them to. We need to allow for the possibility
that an individual student was misclassified or that his response changes over time.</p>
<p>The approximate nature of cluster analysis becomes clear if we visualize which individual responses
were
assigned to which clusters. In this next graph, I’ve plotted the students’ responses to
just
two statements: <code>I believe it’s important to learn English</code> and
<code>English is one of my favorite subjects</code>. These two questions appeared to be a strong
source
of differentiation in the cluster report above, but here, you can see that the clusters overlap
substantially.
</p>
<div class="image">
<a href="cluster2.png"><img src="cluster2.png" /></a>
<p class="caption">When performing the cluster analysis itself, I treated the agree/disagree responses
as
quantitative. But in visualizing it here, I’ve treated the same data as categorical, which
means
every student falls into one of twenty-five bins: those who strongly disagreed with both the
importance
and motivation statements, those who strongly disagreed with the first and somewhat disagreed with
the
second, and so on. Then, inside of each of those bins, I drew a bar graph indicating that
bin’s
distribution of students in each of the three clusters. This graph, which visualizes the
distribution of
three discrete variables, took a bit of <a
href="https://stackoverflow.com/questions/58303175/plotting-three-dimensions-of-categorical-data-in-python">thinking
through</a>.</p>
</div>
<p>You can choose any number of clusters when performing <em>k</em>-means clustering; I settled on three
because expanding it to four just created another group very similar to cluster <code>0</code>,
while
reducing it to two obscured the important differences between clusters <code>1</code> and
<code>2</code>.
</p>
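<p>For readers who haven’t met <em>k</em>-means before, here is a bare-bones sketch of the algorithm (Lloyd’s algorithm with a few random restarts) in plain Python. It is not the code I ran, which is in the notebook; the three columns and all nine responses below are invented, and the helper names <code>k_means</code> and <code>inertia</code> are my own.</p>

```python
import math
import random


def k_means(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm: return (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the observations
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, labels


def inertia(points, centroids, labels):
    """Total squared distance from each point to its assigned centroid."""
    return sum(
        math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels)
    )


# Hypothetical responses to (favorite, important, challenging), scale 1 to 5.
points = [
    (5, 5, 1), (4, 5, 2), (5, 4, 1),  # likes English, finds it easy
    (1, 1, 5), (2, 1, 4), (1, 2, 5),  # unmotivated, finds it hard
    (2, 5, 5), (3, 4, 5), (2, 5, 4),  # motivated, finds it hard
]

# k-means can get stuck in a poor solution, so try several random starts
# and keep the run with the tightest clusters.
runs = [k_means(points, k=3, seed=s) for s in range(20)]
centroids, labels = min(runs, key=lambda run: inertia(points, *run))
print(labels)
```

<p>The cluster numbers themselves are arbitrary: which group ends up labeled <code>0</code>, <code>1</code>, or <code>2</code> depends on the random start. Rerunning with a different <code>k</code> is the quickest way to see how the groups merge or split.</p>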
<h2 id="app">Applications <a class="top" href="#">↖ back to top</a></h2>
<p>So, I can’t just do what the students tell me to. At the very least, I’ll need my vice
principal’s permission before I screen movies every day! (He might say yes. There is an
elective
class in film appreciation at our school.)</p>
<p>At the same time, <strong>it was important to me that the students knew that I had actually gone
through
the surveys</strong>, so the week after I collected them, I showed them these graphs and
explained
the most salient features. I also did a “write-ins that made me laugh” slide where I
showcased some of the funny answers to
<code>Is there anything else you’d like Max to know?</code>
There were some real gems:</p>
<ul>
<li>“Change your fashion.”</li>
<li>“Yes.”</li>
<li>(A detailed comparative analysis of various pizza and fried-chicken restaurants in the area)
</li>
<li>“EEEEEE~”</li>
<li>“We are crazy men.”</li>
</ul>
<p>After sharing these responses with the students, I distilled the above graphs into a few general
takeaways. I explained that we’d still have to wait until the end of the semester to watch a
full-length movie, but we can do more TED talks and music videos in regular class. I had them
brainstorm
ideas for speaking practice that don’t involve presentations and partner activities.</p>
<p><strong>I showed the students the cluster analysis and asked them to think to themselves about which
of
the groups they fall into—or if perhaps they were none of the three.</strong> A couple of
kids
proudly raised their hands and said, “I’m cluster one, all the way.” I can only
applaud
their honesty. Others, when I described cluster <code>2</code> and emphasized that it included
nearly
half the students, looked visibly relieved.</p>
<p>The greatest benefit of this analysis was to my lesson planning. Without realizing it, prior to this
survey, <strong>I’d been planning my lessons around the assumption that there are students who
are
motivated and skilled, and students who are unmotivated and unskilled</strong>. That is, I was
thinking only about clusters <code>0</code> (the nerds) and <code>1</code> (the uninterested),
evaluating each potential activity on the challenging/interesting spectrum and trying to include a
mix
of both ends. <strong>This approach neglected the largest group of students</strong>, cluster
<code>2</code>, who reported both interest in English <em>and</em> difficulty with it, defying my assumption
that the best way to reach disengaged students was with highly accessible, “fun”
content.
Instead, the data suggested that <strong>practical, minimal-frills material with lots of scaffolding
will yield positive results</strong> in classrooms like mine.
</p>
<p>But what you really wanted to know was the verdict on the A/C temperature, huh? I used my very
favorite
measure of central tendency for this one, the <a
href="https://en.wikipedia.org/wiki/Truncated_mean">10%
trimmed mean</a>.</p>
<code>20.87264150943396</code>
<p>No more and no less.</p>
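<p>For the curious, a trimmed mean sorts the answers, throws away a fixed share from each end, and averages what is left. The sketch below assumes that “10%” means 10% from each tail (conventions differ), and the answers in it are invented, not the survey’s.</p>

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the values from each end."""
    ordered = sorted(values)
    cut = int(len(ordered) * proportion)  # observations removed per tail
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# Hypothetical answers, including two joke responses at the extremes.
answers = [-40, 18, 19, 20, 20, 21, 21, 22, 23, 100]
result = trimmed_mean(answers)
print(result)  # the two joke answers are dropped before averaging
```

<p>That is the appeal for a question like this one: a handful of students asking for absurd temperatures can’t drag the result around.</p>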
</article>
<footer>
<p><a href="https://www.maxkapur.com">maxkapur.com</a></p>
<p><a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img
alt="Creative Commons License" style="border-width:0"
src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png"></a></p>
</footer>
</main>
</div>
</body>
</html>