diff --git a/PUBLICITY.md b/PUBLICITY.md new file mode 100644 index 0000000..8a15c07 --- /dev/null +++ b/PUBLICITY.md @@ -0,0 +1,55 @@ +# Facetweet announcement + +Learn how to analyze your datasets in R! [insert link here](https://youtu.be/dQw4w9WgXcQ) + +# Information for calendar + +The workshop duration is 3hrs per class. + +# Descriptions for website + +## Header + +**title** : R for Data Science + +**description** : The R for Data Science workshop series is a four part course, designed to take novices in the R language for statistical computing and produce programmers who are competent in finding, displaying, analyzing, and publishing data in R. + +## Part 1 + +**subtitle** : Basics of R + +**description** : Students will understand the motivation behind object orientation, and how that relates to computation. Students will be able to perform basic functions in R necessary to use the software on their computers and conduct basic arithmetic. Students will understand data types and data structures, and why and how they are different from each other. + +**knowledge requirements** : [Programming Fun!damentals](https://github.com/dlab-berkeley/programming-fundamentals), or equivalent prior knowledge + +**tech requirements** : Laptop required; please install R version 3.2 or greater in advance (University laptops will need to have R installed by an administrator); the RStudio IDE is recommended but not required + +## Part 2 + +**subtitle** : Clean and tidy data + +**description** : Students will be introduced to DRY principles and best practices for sanitizing and tidying data. Students will learn what missingness is, and how best to accommodate missing data in their research designs. Students will be able to read in files from disk or a database, clean the data found within them, select specific data from them, and merge them with other datasets. 
+ +**knowledge requirements** : R-for-Data-Science Part 1 or equivalent prior knowledge + +**tech requirements** : Laptop required; please install R version 3.2 or greater in advance (University laptops will need to have R installed by an administrator); the RStudio IDE is recommended but not required + +## Part 3 + +**subtitle** : Analyzing data + +**description** : Students will be introduced to the principles behind the grammar of graphics and the general linear model. Students will understand the implementation of plotting in R. Students will be able to explore, summarize, and analyze data using R's implementation of exploratory and inferential data analysis. + +**knowledge requirements** : R-for-Data-Science Part 2 or equivalent prior knowledge + +**tech requirements** : Laptop required; please install R version 3.2 or greater in advance (University laptops will need to have R installed by an administrator); the RStudio IDE is recommended but not required + +## Part 4 + +**subtitle** : Functions and packages + +**description** : Students will be introduced to the principles behind functional programming. Students will learn how to write and import functions, add looped and vectorized computation to their functions, and control the flow of data through a function. Students will understand the basics of name spaces, and how that relates to assigning values within functions. Students will see how to successfully package a function for CRAN. 
+ +**knowledge requirements** : R-for-Data-Science Part 2 or equivalent prior knowledge + +**tech requirements** : Laptop required; please install R version 3.2 or greater in advance (University laptops will need to have R installed by an administrator); the RStudio IDE is recommended but not required diff --git a/README.md b/README.md index 3c4651e..b46ecc0 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ The instructor of this workshop series will lead you through the activities for ## If you are a D-Lab instructor -You'll see accumulated teaching notes and examples for each day's topics in the instructor folder. For your convenience, these are available as .Rmd, commented .R files, PDF documents, and HTML slides. +You'll see accumulated teaching notes and examples for each day's topics in the instructor folder. For your convenience, these are available as .Rmd, commented .R files, PDF documents, and HTML slides. The meta-document for this workshop series, which explains the logic behind the structure and topics, can be viewed [at the D-Lab guides repository](https://github.com/dlab-berkeley/guides/blob/master/r.pdf) For information on contributing to this repository, see `CONTRIBUTING.md` @@ -61,17 +61,17 @@ This workshop series covers: This workshop uses the following packages: -1. Amelia -2. devtools -3. dplyr -4. foreign -5. ggplot2 -6. parallelMap -7. RCurl -8. reshape2 -9. roxygen2 -10. stringr -11. XML +* Amelia +* devtools +* dplyr +* foreign +* ggplot2 +* parallelMap +* RCurl +* roxygen2 +* stringr +* tidyr +* XML --- _D-Lab == Data Intensive Social Science, For All!_ diff --git a/data/dirty.csv b/data/dirty.csv index 92aa05e..b7d2c22 100644 --- a/data/dirty.csv +++ b/data/dirty.csv @@ -1,6 +1,6 @@ Timestamp,How tall are you?,What department are you in?,Are you currently enrolled?,What is your birth order? 
7/25/2015 10:08:41,very,Geology ,Yes,1 7/25/2015 10:10:56,70,999,Yes,1 -7/25/2015 10:11:20,5’9, geology,999,2 +7/25/2015 10:11:20,5'9, geology,999,2 7/25/2015 10:11:25,2.1,goelogy,No,"9,000" -7/25/2015 10:11:29,156,anthro,999,2 \ No newline at end of file +7/25/2015 10:11:29,156,anthro,999,2 diff --git a/instructor/day_four.html b/instructor/day_four.html index 4576a9c..d3c37e2 100644 --- a/instructor/day_four.html +++ b/instructor/day_four.html @@ -55,7 +55,7 @@

Day Four: Functional Programming

Dillon Niederhut
Shinhye Choi

-

02 May, 2016

+

24 May, 2016

@@ -83,7 +83,7 @@

Looping

} # or mat <- c(rep(NA, 6)) -for(i in 5:10){ +for(i in 5:10){ mat[i-4] <- 2^i } # by setting sequence and statement accordingly
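Assembled into one runnable chunk, the preallocate-then-fill pattern from the slide looks like this:

```r
mat <- rep(NA, 6)      # preallocate the result vector
for (i in 5:10) {
  mat[i - 4] <- 2^i    # shift the index so results land in positions 1..6
}
mat                    # 32 64 128 256 512 1024
```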

You can also loop over a non-numeric vector

@@ -93,11 +93,11 @@

Looping

for(city in c("Berkeley", "Walnut Creek", "Richmond")){ if(sum(city==city.temp$a)>0){ - print(city.temp[which(city==city.temp$a),]) + print(city.temp[which(city==city.temp$a),]) # if we have the city in our data, print its temperature and the name of the city } if(sum(city==city.temp$a)==0){ - print(paste(city, "is NOT in the data. :(", sep=" ")) + print(paste(city, "is NOT in the data. :(", sep=" ")) # if not, just print the name of the city followed by "is NOT in the data. :(" } } # Loops can be as complicated and long as needed; they are often not very efficient. @@ -114,7 +114,7 @@

Looping

system.time( for(i in 1:1000){ print(i) - if(i == 50) break + if(i == 50) break })

Next we move on to control structures, such as if statements. "If" statements are very useful when you want to assign different tasks to different subsets of data using a single for-loop. The basic syntax looks like the following: if(condition){statement} else{other statement}

@@ -123,7 +123,7 @@

Looping

x <- 7
 if(x > 10){
   print(x)
-  
+
   }else{                     # "else" should not start its own line. 
                              # Always let it be preceded by a closing brace on the same line.
   print("NOT BIG ENOUGH!!")
@@ -137,15 +137,15 @@ 

Looping

## [8] "male" "male" "male" "female" "female" "male" "female" ## [15] "male" "female" "female" "male" "male" "female" "female" ## [22] "male" "male" "male" "male" "female" "male" "female" -## [29] "female" "male" "male" "female" "female" "female" "male" -## [36] "female" "female" "female" "female" "female" "male" "male" -## [43] "male" "female" "male" "female" "female" "male" "male" +## [29] "female" "male" "male" "female" "female" "female" "male" +## [36] "female" "female" "female" "female" "female" "male" "male" +## [43] "male" "female" "male" "female" "female" "male" "male" ## [50] "male" "male" "male" "male" "female" "female" "female" -## [57] "male" "female" "male" "female" "male" "female" "male" -## [64] "female" "female" "female" "female" "male" "male" "male" +## [57] "male" "female" "male" "female" "male" "female" "male" +## [64] "female" "female" "female" "female" "male" "male" "male" ## [71] "female" "male" "female" "male" "female" "female" "female" -## [78] "female" "male" "male" "male" "male" "female" "male" -## [85] "female" "male" "female" "male" "female" "female" "male" +## [78] "female" "male" "male" "male" "male" "female" "male" +## [85] "female" "male" "female" "male" "female" "female" "male" ## [92] "male" "male" "male" "female" "female" "female" "female" ## [99] "female" "male"
gender <- ifelse(gender=="male", 1, 0)
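ifelse is the vectorized cousin of if/else: the condition is evaluated element-wise, so one call recodes a whole vector (a sketch with a toy vector):

```r
gender <- c("male", "female", "female", "male")
ifelse(gender == "male", 1, 0)   # 1 0 0 1
```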
@@ -175,7 +175,7 @@ 

every function has three parts

body(f)
## x + 1
environment(f)
-
## <environment: 0x7f81d9374c60>
+
## <environment: 0x7fe163bba308>

environments are where the function was defined

see how our function has R_GlobalEnv as its environment? that’s because we defined it in the global environment

this means that if you tell a function to look for an object, it will look in the global namespace
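That lookup rule is easy to demonstrate: a free variable inside a function is resolved in the environment where the function was defined, here the global one.

```r
y <- 10
f <- function(x) x + y   # y is not an argument, so R searches f's
                         # defining environment (the global one)
f(1)                     # 11
```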

@@ -260,13 +260,13 @@

the right way to be functional

lapply(heights, in_to_m)
## [[1]]
 ## [1] 1.7526
-## 
+##
 ## [[2]]
 ## [1] 1.3716
-## 
+##
 ## [[3]]
 ## [1] 1.8542
-## 
+##
 ## [[4]]
 ## [1] 2.0828

it’s not always smart to name functions

@@ -274,13 +274,13 @@

it’s not always smart to name
lapply(heights, FUN = function(x) x %/% 12)
## [[1]]
 ## [1] 5
-## 
+##
 ## [[2]]
 ## [1] 4
-## 
+##
 ## [[3]]
 ## [1] 6
-## 
+##
 ## [[4]]
 ## [1] 6

lapply has limits

@@ -293,10 +293,10 @@

lapply has limits

lapply(dat, mean)
## $a
 ## [1] NA
-## 
+##
 ## $b
 ## [1] NA
-## 
+##
 ## $c
 ## [1] NA

we know there are numbers there - why are the means all missing?
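The culprit is that mean() returns NA whenever its input contains a missing value; extra arguments to lapply are passed straight through to the applied function, so na.rm = TRUE repairs the summaries (a sketch with a toy data frame standing in for dat):

```r
dat <- data.frame(a = c(1, 2, NA), b = c(NA, 4, 6))
lapply(dat, mean)                 # both means are NA
lapply(dat, mean, na.rm = TRUE)   # a = 1.5, b = 5
```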

@@ -315,16 +315,16 @@

this can be parallelized

side note - previous versions of these materials imported the parallel library, which is no longer supported as of R versions >= 3.2

install.packages('parallelMap')
-
## 
+
##
 ## The downloaded binary packages are in
-##  /var/folders/rj/8gpcssqd52z9yrqw7f8xxfym0000gn/T//Rtmp2xjYZ7/downloaded_packages
+ /var/folders/rj/8gpcssqd52z9yrqw7f8xxfym0000gn/T//RtmpmP1txl/downloaded_packages
library(parallelMap)
 system.time(Map(median, dat, na.rm=TRUE))
-
##    user  system elapsed 
+
##    user  system elapsed
 ##   0.000   0.000   0.001
system.time(parallelMap(median, dat, na.rm=TRUE))
-
##    user  system elapsed 
-##   0.001   0.000   0.001
+
##    user  system elapsed
+##       0       0       0

parallel processing incurs time costs from memory management and message passing that can make small jobs take longer in parallel than in serial
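For completeness, a typical parallelMap call is bracketed by explicitly starting and stopping a worker pool; on a toy job like this the startup overhead alone usually exceeds the serial runtime (a sketch, assuming the parallelMap package is installed):

```r
library(parallelMap)
parallelStartSocket(cpus = 2)             # launch two local worker processes
res <- parallelMap(function(x) x^2, 1:4)  # a list of the squares 1, 4, 9, 16
parallelStop()                            # always release the workers
```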

@@ -372,7 +372,7 @@

adding dependencies

what if there are other packages that your package uses? like ggplot2? do

Imports: ggplot2

and if you want to list optional packages, you can do so like this:

-
Suggests: 
+
Suggests:
   reshape2 (>=1.4.1)
   plyr (>=1.8.3)
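Taken together, the dependency fields of a package's DESCRIPTION file look roughly like this (a sketch; the convertR package name is an assumption borrowed from the directory listing earlier in these materials):

```
Package: convertR
Imports:
    ggplot2
Suggests:
    reshape2 (>= 1.4.1),
    plyr (>= 1.8.3)
```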
@@ -396,19 +396,19 @@

creating man pages

library(roxygen2)

now we’re going to add specialized comments to our length.R file

#' Converts inches to centimeters
-#' 
+#'
 #' @param x A numeric
 #' @return Converted numeric
-#' @examples 
+#' @examples
 #' in_to_cm(1)
 #' in_to_cm(c(1,2,3))
 in_to_cm <- function(x) x * 2.54
 
 #' Converts inches to meters
-#' 
+#'
 #' @param x A numeric
 #' @return Converted numeric
-#' @examples 
+#' @examples
 #' in_to_m(1)
 #' in_to_m(c(1,2,3))
 in_to_m <- function(x){
@@ -423,20 +423,20 @@ 

NAMESPACE

export(in_to_m)

or you can have roxygen2 handle it for you by adding #' @export in the function blocks you want to have exported

#' Converts inches to centimeters
-#' 
+#'
 #' @param x A numeric
 #' @return Converted numeric
-#' @examples 
+#' @examples
 #' in_to_cm(1)
 #' in_to_cm(c(1,2,3))
 #' @export
 in_to_cm <- function(x) x * 2.54
 
 #' Converts inches to meters
-#' 
+#'
 #' @param x A numeric
 #' @return Converted numeric
-#' @examples 
+#' @examples
 #' in_to_m(1)
 #' in_to_m(c(1,2,3))
 #' @export
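Once the #' blocks and @export tags are in place, roxygen2 turns them into man pages and NAMESPACE entries; a minimal sketch of that step, assuming the package source sits in a convertR/ directory:

```r
library(roxygen2)
roxygenise("convertR")   # writes man/*.Rd and NAMESPACE from the #' comments
# devtools::document("convertR") is an equivalent wrapper
```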
@@ -507,31 +507,31 @@ 

Example code

## [1] "<H3>ACT I</h3>"   "<H3>ACT II</h3>"  "<H3>ACT III</h3>"
 ## [4] "<H3>ACT IV</h3>"  "<H3>ACT V</h3>"
RJ[grep("<h3>", RJ, perl=TRUE)]
-
##  [1] "<h3>PROLOGUE</h3>"                                                        
-##  [2] "<h3>SCENE I. Verona. A public place.</h3>"                                
-##  [3] "<h3>SCENE II. A street.</h3>"                                             
-##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"                           
-##  [5] "<h3>SCENE IV. A street.</h3>"                                             
-##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"                             
-##  [7] "<h3>PROLOGUE</h3>"                                                        
-##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"               
-##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [11] "<h3>SCENE IV. A street.</h3>"                                             
-## [12] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"                                
-## [14] "<h3>SCENE I. A public place.</h3>"                                        
-## [15] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"                            
-## [18] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"                                 
-## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"                              
-## [21] "<h3>SCENE III. Juliet's chamber.</h3>"                                    
-## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"                              
-## [23] "<h3>SCENE V. Juliet's chamber.</h3>"                                      
-## [24] "<h3>SCENE I. Mantua. A street.</h3>"                                      
-## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"                                
+
##  [1] "<h3>PROLOGUE</h3>"
+##  [2] "<h3>SCENE I. Verona. A public place.</h3>"
+##  [3] "<h3>SCENE II. A street.</h3>"
+##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"
+##  [5] "<h3>SCENE IV. A street.</h3>"
+##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"
+##  [7] "<h3>PROLOGUE</h3>"
+##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"
+##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [11] "<h3>SCENE IV. A street.</h3>"
+## [12] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"
+## [14] "<h3>SCENE I. A public place.</h3>"
+## [15] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"
+## [18] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"
+## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"
+## [21] "<h3>SCENE III. Juliet's chamber.</h3>"
+## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"
+## [23] "<h3>SCENE V. Juliet's chamber.</h3>"
+## [24] "<h3>SCENE I. Mantua. A street.</h3>"
+## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"
 ## [26] "<h3>SCENE III. A churchyard; in it a tomb belonging to the Capulets.</h3>"
Now that we know that the first line of each act begins with the string ``

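grep is case-sensitive by default, which is why matching the upper-case <H3> act headings and the lower-case <h3> scene headings took two separate calls; ignore.case = TRUE catches both at once (a sketch using a toy vector in place of RJ):

```r
RJ <- c("<H3>ACT I</h3>", "<h3>PROLOGUE</h3>", "Enter ROMEO")
RJ[grep("<h3>", RJ, ignore.case = TRUE)]   # both heading lines
```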
@@ -548,41 +548,41 @@

}

How do we count the number of times each name appears in each act? Create a wrapper function that counts the matches and returns the counts.

countR <- function(z){
-  return(c(length(grep("Romeo", z, perl=T)), length(grep("Juliet", z, perl=T)))) 
+  return(c(length(grep("Romeo", z, perl=T)), length(grep("Juliet", z, perl=T))))
 }
 lapply(x, countR)
## [[1]]
 ## [1] 8 4
-## 
+##
 ## [[2]]
 ## [1] 30  3
-## 
+##
 ## [[3]]
 ## [1] 54 13
-## 
+##
 ## [[4]]
 ## [1] 9 8
-## 
+##
 ## [[5]]
 ## [1] 20 19

Now count the lines in each act

# now count the lines in each act
 countL <- function(z){
-  return(length(grep("</A><br>$", z, perl=T))) 
+  return(length(grep("</A><br>$", z, perl=T)))
 }
 lapply(x, countL)
## [[1]]
 ## [1] 739
-## 
+##
 ## [[2]]
 ## [1] 685
-## 
+##
 ## [[3]]
 ## [1] 821
-## 
+##
 ## [[4]]
 ## [1] 407
-## 
+##
 ## [[5]]
 ## [1] 441
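lapply always returns a list; when each element's count is a single number, sapply gives the same answers simplified to a named vector (a sketch with a toy list standing in for x):

```r
x <- list(act1 = c("a </A><br>", "b"), act2 = c("c </A><br>", "d </A><br>"))
countL <- function(z) length(grep("</A><br>$", z, perl = TRUE))
sapply(x, countL)   # act1 = 1, act2 = 2
```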
diff --git a/instructor/day_one.html b/instructor/day_one.html index f2fd818..c80ac3d 100644 --- a/instructor/day_one.html +++ b/instructor/day_one.html @@ -54,7 +54,7 @@

Day One: R Basics

Dillon Niederhut

-

02 May, 2016

+

24 May, 2016

@@ -96,23 +96,23 @@

Object Oriented Programming

everything in R is an object

yes, even the commands, just watch

ls
-
## function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE, 
-##     pattern, sorted = TRUE) 
+
## function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE,
+##     pattern, sorted = TRUE)
 ## {
 ##     if (!missing(name)) {
 ##         pos <- tryCatch(name, error = function(e) e)
 ##         if (inherits(pos, "error")) {
 ##             name <- substitute(name)
-##             if (!is.character(name)) 
+##             if (!is.character(name))
 ##                 name <- deparse(name)
-##             warning(gettextf("%s converted to character string", 
+##             warning(gettextf("%s converted to character string",
 ##                 sQuote(name)), domain = NA)
 ##             pos <- name
 ##         }
 ##     }
 ##     all.names <- .Internal(ls(envir, all.names, sorted))
 ##     if (!missing(pattern)) {
-##         if ((ll <- length(grep("[", pattern, fixed = TRUE))) && 
+##         if ((ll <- length(grep("[", pattern, fixed = TRUE))) &&
 ##             ll != length(grep("]", pattern, fixed = TRUE))) {
 ##             if (pattern == "[") {
 ##                 pattern <- "\\["
@@ -127,7 +127,7 @@ 

everything in R is an object

## } ## else all.names ## } -## <bytecode: 0x7f81da153fb8> +## <bytecode: 0x7fe163574678> ## <environment: namespace:base>

ls, like basketball, is a specific thing with a name and stuff inside it that makes it ls and not dillon niederhut. in this particular instance, we are looking at the function that tells you what objects are in your environment

until we get to functional programming, your environment is just R plus whatever you put in R

@@ -135,19 +135,23 @@

in R, you store o

just like you need names to tell things apart, R does too

my.name <- dir
 my.name
-
## function (path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, 
-##     recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, 
-##     no.. = FALSE) 
-## .Internal(list.files(path, pattern, all.files, full.names, recursive, 
+
## function (path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE,
+##     recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE,
+##     no.. = FALSE)
+## .Internal(list.files(path, pattern, all.files, full.names, recursive,
 ##     ignore.case, include.dirs, no..))
-## <bytecode: 0x7f81dca955f0>
+## <bytecode: 0x7fe164e12268>
 ## <environment: namespace:base>

names must be unique

every time you give an object a name, it removes anything that already had that name from your environment
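A minimal illustration of that overwrite:

```r
x <- 1
x <- "one"   # rebinding the name discards the numeric 1
x            # [1] "one"
```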

my.name <- dir()
 my.name
-
## [1] "CONTRIBUTING.md" "data"            "examples"        "instructor"     
-## [5] "LICENSE"         "README.md"       "scripts"
+
##  [1] "CONTRIBUTING.md"            "convertR"
+##  [3] "convertR_0.0.0.9000.tar.gz" "data"
+##  [5] "examples"                   "instructor"
+##  [7] "LICENSE"                    "PUBLICITY.md"
+##  [9] "R-intensive.Rproj"          "README.md"
+## [11] "scripts"

you see those parentheses? that means you are calling an object (here, it’s a function evaluator) on dir.

classes in R

because it is code to be evaluated, dir belongs in a class called ‘functions’

@@ -182,19 +186,23 @@

tell R where you would like i
setwd("/Users/dillonniederhut/Dropbox/dlab/R-for-Data-Science")

find out what’s in your directory with

dir()
-
## [1] "CONTRIBUTING.md" "data"            "examples"        "instructor"     
-## [5] "LICENSE"         "README.md"       "scripts"
+
##  [1] "CONTRIBUTING.md"            "convertR"
+##  [3] "convertR_0.0.0.9000.tar.gz" "data"
+##  [5] "examples"                   "instructor"
+##  [7] "LICENSE"                    "PUBLICITY.md"
+##  [9] "R-intensive.Rproj"          "README.md"
+## [11] "scripts"

find out what’s in your environment with

in R, you are always in an environment (more on scoping in day 4)

ls()
-
##  [1] "document"     "my.character" "my.data"      "my.date"     
-##  [5] "my.factor"    "my.list"      "my.name"      "my.vector"   
+
##  [1] "document"     "my.character" "my.data"      "my.date"
+##  [5] "my.factor"    "my.list"      "my.name"      "my.vector"
 ##  [9] "test"         "your.vector"

add an object to your environment with

test <- "I have no idea what I'm doing"
 ls()
-
##  [1] "document"     "my.character" "my.data"      "my.date"     
-##  [5] "my.factor"    "my.list"      "my.name"      "my.vector"   
+
##  [1] "document"     "my.character" "my.data"      "my.date"
+##  [5] "my.factor"    "my.list"      "my.name"      "my.vector"
 ##  [9] "test"         "your.vector"

we can clean our environment with

rm(list = ls())
@@ -205,53 +213,53 @@ 

and search the help pages with ??<
??exists

you can get a quick example with

example(exists)
-
## 
+
##
 ## exists> ##  Define a substitute function if necessary:
 ## exists> if(!exists("some.fun", mode = "function"))
 ## exists+   some.fun <- function(x) { cat("some.fun(x)\n"); x }
-## 
+##
 ## exists> search()
-##  [1] ".GlobalEnv"          "package:Amelia"      "package:Rcpp"       
+##  [1] ".GlobalEnv"          "package:Amelia"      "package:Rcpp"
 ##  [4] "package:roxygen2"    "package:devtools"    "package:parallelMap"
-##  [7] "package:rmarkdown"   "package:knitr"       "package:stats"      
-## [10] "package:graphics"    "package:grDevices"   "package:utils"      
-## [13] "package:datasets"    "package:methods"     "Autoloads"          
-## [16] "package:base"       
-## 
+##  [7] "package:rmarkdown"   "package:knitr"       "package:stats"
+## [10] "package:graphics"    "package:grDevices"   "package:utils"
+## [13] "package:datasets"    "package:methods"     "Autoloads"
+## [16] "package:base"
+##
 ## exists> exists("ls", 2) # true even though ls is in pos = 3
 ## [1] TRUE
-## 
+##
 ## exists> exists("ls", 2, inherits = FALSE) # false
 ## [1] FALSE
-## 
+##
 ## exists> ## These are true (in most circumstances):
 ## exists> identical(ls,   get0("ls"))
 ## [1] TRUE
-## 
+##
 ## exists> identical(NULL, get0(".foo.bar.")) # default ifnotfound = NULL (!)
 ## [1] TRUE
-## 
-## exists> ## Don't show: 
+##
+## exists> ## Don't show:
 ## exists> stopifnot(identical(ls, get0("ls")),
 ## exists+           is.null(get0(".foo.bar.")))
-## 
+##
 ## exists> ## End(Don't show)
-## exists> 
-## exists> 
+## exists>
+## exists>
 ## exists>

when you kind of remember what you are looking for, try

apropos('lm')
-
##  [1] ".__C__anova.glm"      ".__C__anova.glm.null" ".__C__glm"           
-##  [4] ".__C__glm.null"       ".__C__lm"             ".__C__mlm"           
-##  [7] ".__C__optionalMethod" ".colMeans"            ".lm.fit"             
-## [10] "colMeans"             "confint.lm"           "contr.helmert"       
-## [13] "dummy.coef.lm"        "getAllMethods"        "glm"                 
-## [16] "glm.control"          "glm.fit"              "KalmanForecast"      
-## [19] "KalmanLike"           "KalmanRun"            "KalmanSmooth"        
-## [22] "kappa.lm"             "lm"                   "lm.fit"              
-## [25] "lm.influence"         "lm.wfit"              "model.matrix.lm"     
-## [28] "nlm"                  "nlminb"               "parallelMap"         
-## [31] "predict.glm"          "predict.lm"           "residuals.glm"       
+
##  [1] ".__C__anova.glm"      ".__C__anova.glm.null" ".__C__glm"
+##  [4] ".__C__glm.null"       ".__C__lm"             ".__C__mlm"
+##  [7] ".__C__optionalMethod" ".colMeans"            ".lm.fit"
+## [10] "colMeans"             "confint.lm"           "contr.helmert"
+## [13] "dummy.coef.lm"        "getAllMethods"        "glm"
+## [16] "glm.control"          "glm.fit"              "KalmanForecast"
+## [19] "KalmanLike"           "KalmanRun"            "KalmanSmooth"
+## [22] "kappa.lm"             "lm"                   "lm.fit"
+## [25] "lm.influence"         "lm.wfit"              "model.matrix.lm"
+## [28] "nlm"                  "nlminb"               "parallelMap"
+## [31] "predict.glm"          "predict.lm"           "residuals.glm"
 ## [34] "residuals.lm"         "summary.glm"          "summary.lm"

@@ -418,8 +426,8 @@

try giving your factor explicitly numeric levels and character labels

-
my.factor <- factor(c(1,2,3,4), 
-                    levels=c(1,2,3,4), 
+
my.factor <- factor(c(1,2,3,4),
+                    levels=c(1,2,3,4),
                     labels=c('undergraduate','graduate','professor','staff'))
 levels(my.factor)
## [1] "undergraduate" "graduate"      "professor"     "staff"
@@ -480,10 +488,10 @@

a li my.list

## [[1]]
 ## [1] TRUE
-## 
+##
 ## [[2]]
 ## [1] "two"
-## 
+##
 ## [[3]]
 ## [1] 3

you can find out the attributes of a list, and the types of data it contains, with

diff --git a/instructor/day_three.R b/instructor/day_three.R index 7a4c6db..11a84bd 100644 --- a/instructor/day_three.R +++ b/instructor/day_three.R @@ -10,20 +10,28 @@ summary(dat) table(dat$department) ## ------------------------------------------------------------------------ -dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE), - levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun') - ) -summary(dat$wday) +library(psych) +describe(dat) + +## ------------------------------------------------------------------------ +library(dplyr) +dat %>% group_by(gender) %>% summarize(n()) ## ------------------------------------------------------------------------ -library(reshape2) -dcast(dat[dat$gender == 'Female/Woman' | dat$gender == 'Male/Man',], department ~ gender) -dcast(melt(dat, measure.vars = c('course.delivered')), wday ~ 'Delivered', fun.aggregate = mean) +library(tidyr) +dat %>% filter(!is.na(gender)) %>% group_by(gender, department) %>% + summarize(n=n()) %>% spread(gender, n) ## ------------------------------------------------------------------------ install.packages('ggplot2') library(ggplot2) +## ------------------------------------------------------------------------ +dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE), + levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun') + ) +summary(dat$wday) + ## ------------------------------------------------------------------------ qplot(instructor.communicated, data = dat) qplot(wday, course.delivered, data = dat) diff --git a/instructor/day_three.Rmd b/instructor/day_three.Rmd index 2904e1a..ecaf1bd 100644 --- a/instructor/day_three.Rmd +++ b/instructor/day_three.Rmd @@ -47,21 +47,28 @@ summary(dat) table(dat$department) ``` -think back to day one - how would we make weekdays out of the date variable? 
+## the `psych` package provides trimmed means, skew, kurtosis, and missingness ```{r} -dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE), - levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun') - ) -summary(dat$wday) +library(psych) +describe(dat) +``` + +## you can use dplyr::groupby to generate summaries + +```{r} +library(dplyr) +dat %>% group_by(gender) %>% summarize(n()) ``` -## reshape provides a few more ways to aggregate things +## and you can combine dplyr with tidyr::spread to generate crosstabs + +> side note - we are filtering out missing values of gender, because `tidyr` doesn't allow `NA` as a column name ```{r} -library(reshape2) -dcast(dat[dat$gender == 'Female/Woman' | dat$gender == 'Male/Man',], department ~ gender) -dcast(melt(dat, measure.vars = c('course.delivered')), wday ~ 'Delivered', fun.aggregate = mean) +library(tidyr) +dat %>% filter(!is.na(gender)) %>% group_by(gender, department) %>% + summarize(n=n()) %>% spread(gender, n) ``` # Plotting @@ -88,6 +95,19 @@ install.packages('ggplot2') library(ggplot2) ``` +## getting weekdays + +let's imagine that we are interested in looking at differences in feedback based on the day of the week -- how would we do this in R? + +> side note - `weekdays` is locale aware, so students who have their laptop language set to something other than english will get their weekday names in the other language + +```{r} +dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE), + levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun') + ) +summary(dat$wday) +``` + ## use qplot for initial poking around it has very strong intuitions about what you want to see, and is not particularly customizable diff --git a/instructor/day_three.html b/instructor/day_three.html index 2c5f77d..8bfb8ce 100644 --- a/instructor/day_three.html +++ b/instructor/day_three.html @@ -54,7 +54,7 @@

Day Three: Data Analysis

Dillon Niederhut

-

02 May, 2016

+

24 May, 2016

@@ -94,133 +94,166 @@

let’s load in some data a

R provides two easy/simple summary functions in the base package

summary(dat)
##    timestamp          course.delivered instructor.communicated
-##  Min.   :2014-08-19   Min.   :1.000    Min.   :1.000          
-##  1st Qu.:2014-11-05   1st Qu.:6.000    1st Qu.:6.000          
-##  Median :2015-01-30   Median :7.000    Median :7.000          
-##  Mean   :2015-01-22   Mean   :6.251    Mean   :6.257          
-##  3rd Qu.:2015-04-03   3rd Qu.:7.000    3rd Qu.:7.000          
-##  Max.   :2015-06-22   Max.   :7.000    Max.   :7.000          
-##                                                               
-##                                      hear        interest  
-##  Email from the D-Lab mailing list     :340   Min.   :1.0  
-##  Found it on the D-Lab website         :278   1st Qu.:6.0  
-##  Heard about it from a friend/colleague:247   Median :7.0  
-##  Email from another mailing list       : 99   Mean   :6.6  
-##  Don't remember                        : 12   3rd Qu.:7.0  
-##  (Other)                               : 55   Max.   :7.0  
-##  NA's                                  : 31   NA's   :15   
-##                department     verbs               useful    
-##  Public Health      : 81   Length:1062        Min.   :1.00  
-##  Public Policy      : 44   Class :character   1st Qu.:5.00  
-##  Sociology          : 38   Mode  :character   Median :6.00  
-##  Political Science  : 36                      Mean   :6.02  
-##  Integrative Biology: 28                      3rd Qu.:7.00  
-##  (Other)            :288                      Max.   :7.00  
-##  NA's               :547                                    
-##                                gender     ethnicity        
-##  Female/Woman                     :579   Length:1062       
-##  Male/Man                         :332   Class :character  
-##  Genderqueer/Gender non-conforming:  1   Mode  :character  
-##  NA's                             :150                     
-##                                                            
-##                                                            
-##                                                            
-##  outside.barriers inside.barriers what.barriers     
-##  Min.   :1.000    Min.   :1.000   Length:1062       
-##  1st Qu.:1.000    1st Qu.:1.000   Class :character  
-##  Median :1.000    Median :1.000   Mode  :character  
-##  Mean   :2.073    Mean   :1.259                     
-##  3rd Qu.:3.000    3rd Qu.:1.000                     
-##  Max.   :5.000    Max.   :5.000                     
-##  NA's   :167      NA's   :175                       
-##                             position  
-##  PhD student, dissertation stage: 41  
-##  PhD student, pre-dissertation  : 33  
-##  Visiting fellow or researcher  : 24  
-##  Masters student                : 22  
-##  Undergraduate student          : 21  
-##  (Other)                        : 64  
+##  Min.   :2014-08-19   Min.   :1.000    Min.   :1.000
+##  1st Qu.:2014-11-05   1st Qu.:6.000    1st Qu.:6.000
+##  Median :2015-01-30   Median :7.000    Median :7.000
+##  Mean   :2015-01-22   Mean   :6.251    Mean   :6.257
+##  3rd Qu.:2015-04-03   3rd Qu.:7.000    3rd Qu.:7.000
+##  Max.   :2015-06-22   Max.   :7.000    Max.   :7.000
+##
+##                                      hear        interest
+##  Email from the D-Lab mailing list     :340   Min.   :1.0
+##  Found it on the D-Lab website         :278   1st Qu.:6.0
+##  Heard about it from a friend/colleague:247   Median :7.0
+##  Email from another mailing list       : 99   Mean   :6.6
+##  Don't remember                        : 12   3rd Qu.:7.0
+##  (Other)                               : 55   Max.   :7.0
+##  NA's                                  : 31   NA's   :15
+##                department     verbs               useful
+##  Public Health      : 81   Length:1062        Min.   :1.00
+##  Public Policy      : 44   Class :character   1st Qu.:5.00
+##  Sociology          : 38   Mode  :character   Median :6.00
+##  Political Science  : 36                      Mean   :6.02
+##  Integrative Biology: 28                      3rd Qu.:7.00
+##  (Other)            :288                      Max.   :7.00
+##  NA's               :547
+##                                gender     ethnicity
+##  Female/Woman                     :579   Length:1062
+##  Male/Man                         :332   Class :character
+##  Genderqueer/Gender non-conforming:  1   Mode  :character
+##  NA's                             :150
+##
+##
+##
+##  outside.barriers inside.barriers what.barriers
+##  Min.   :1.000    Min.   :1.000   Length:1062
+##  1st Qu.:1.000    1st Qu.:1.000   Class :character
+##  Median :1.000    Median :1.000   Mode  :character
+##  Mean   :2.073    Mean   :1.259
+##  3rd Qu.:3.000    3rd Qu.:1.000
+##  Max.   :5.000    Max.   :5.000
+##  NA's   :167      NA's   :175
+##                             position
+##  PhD student, dissertation stage: 41
+##  PhD student, pre-dissertation  : 33
+##  Visiting fellow or researcher  : 24
+##  Masters student                : 22
+##  Undergraduate student          : 21
+##  (Other)                        : 64
 ##  NA's                           :857
table(dat$department)
-
## 
-##  African American Studies  Ag & Resource Econ & Pol 
-##                        24                        23 
-##              Anthropology   App Sci & Tech Grad Grp 
-##                        12                        10 
-##    Biostatistics Grad Grp  City & Regional Planning 
-##                         8                        20 
-##                 Economics                 Education 
-##                        23                        26 
-##  Energy & Resources Group   Env Sci, Policy, & Mgmt 
-##                        14                        17 
-##   Ethnic Studies Grad Grp                   History 
-##                         1                        17 
-## Industrial Eng & Ops Rsch               Information 
-##                         4                         9 
-##       Integrative Biology              JSP Grad Pgm 
-##                        28                         6 
-##                       Law               Linguistics 
-##                         9                        11 
-##                     Music              Neuroscience 
-##                         3                         4 
-##         Political Science                Psychology 
-##                        36                        28 
-##             Public Health             Public Policy 
-##                        81                        44 
-##                  Rhetoric    Slavic Languages & Lit 
-##                        11                         8 
-##                 Sociology 
+
##
+##  African American Studies  Ag & Resource Econ & Pol
+##                        24                        23
+##              Anthropology   App Sci & Tech Grad Grp
+##                        12                        10
+##    Biostatistics Grad Grp  City & Regional Planning
+##                         8                        20
+##                 Economics                 Education
+##                        23                        26
+##  Energy & Resources Group   Env Sci, Policy, & Mgmt
+##                        14                        17
+##   Ethnic Studies Grad Grp                   History
+##                         1                        17
+## Industrial Eng & Ops Rsch               Information
+##                         4                         9
+##       Integrative Biology              JSP Grad Pgm
+##                        28                         6
+##                       Law               Linguistics
+##                         9                        11
+##                     Music              Neuroscience
+##                         3                         4
+##         Political Science                Psychology
+##                        36                        28
+##             Public Health             Public Policy
+##                        81                        44
+##                  Rhetoric    Slavic Languages & Lit
+##                        11                         8
+##                 Sociology
 ##                        38
-

think back to day one - how would we make weekdays out of the date variable?

-
dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE), 
-                   levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun')
-                   )
-summary(dat$wday)
-
## Mon Tue Wed Thu Fri Sat Sun 
-## 168 124 144 323 277  16  10
-

the reshape2 package provides a few more ways to aggregate things

-
library(reshape2)
-dcast(dat[dat$gender == 'Female/Woman' | dat$gender == 'Male/Man',], department ~ gender)
-
## Using wday as value column: use value.var to override.
-## Aggregation function missing: defaulting to length
-
##                   department Female/Woman Male/Man  NA
-## 1   African American Studies            8       16   0
-## 2   Ag & Resource Econ & Pol           20        3   0
-## 3               Anthropology            9        3   0
-## 4    App Sci & Tech Grad Grp            6        4   0
-## 5     Biostatistics Grad Grp            5        3   0
-## 6   City & Regional Planning           12        7   0
-## 7                  Economics           16        5   0
-## 8                  Education           20        3   0
-## 9   Energy & Resources Group           10        3   0
-## 10   Env Sci, Policy, & Mgmt           11        5   0
-## 11   Ethnic Studies Grad Grp            1        0   0
-## 12                   History            9        6   0
-## 13 Industrial Eng & Ops Rsch            2        2   0
-## 14               Information            2        7   0
-## 15       Integrative Biology           20        8   0
-## 16              JSP Grad Pgm            5        1   0
-## 17                       Law            5        4   0
-## 18               Linguistics            8        1   0
-## 19                     Music            2        0   0
-## 20              Neuroscience            0        4   0
-## 21         Political Science           17       18   0
-## 22                Psychology           20        8   0
-## 23             Public Health           55       19   0
-## 24             Public Policy           22       21   0
-## 25                  Rhetoric            0       11   0
-## 26    Slavic Languages & Lit            7        1   0
-## 27                 Sociology           23       12   0
-## 28                      <NA>          264      157 150
-
dcast(melt(dat, measure.vars = c('course.delivered')), wday ~ 'Delivered', fun.aggregate = mean)
-
##   wday Delivered
-## 1  Mon  6.309524
-## 2  Tue  6.274194
-## 3  Wed  6.159722
-## 4  Thu  6.077399
-## 5  Fri  6.444043
-## 6  Sat  6.250000
-## 7  Sun  6.600000
+

the psych package provides trimmed means, skew, kurtosis, and missingness

+
library(psych)
+describe(dat)
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
+## Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
+## Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
+## Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
+## Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
+## -Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
+## -Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
+## -Inf
+
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
+## -Inf
+
##                         vars    n  mean   sd median trimmed  mad min  max
+## timestamp*                 1 1062   NaN   NA     NA     NaN   NA Inf -Inf
+## course.delivered           2 1062  6.25 1.11      7    6.47 0.00   1    7
+## instructor.communicated    3 1062  6.26 1.08      7    6.47 0.00   1    7
+## hear*                      4 1031 23.08 6.77     24   23.10 7.41   1   51
+## interest                   5 1047  6.60 0.80      7    6.79 0.00   1    7
+## department*                6  515 15.86 8.45     18   16.29 8.90   1   27
+## verbs*                     7  383   NaN   NA     NA     NaN   NA Inf -Inf
+## useful                     8 1062  6.02 1.20      6    6.23 1.48   1    7
+## gender*                    9  912  1.37 0.48      1    1.33 0.00   1    3
+## ethnicity*                10 1062   NaN   NA     NA     NaN   NA Inf -Inf
+## outside.barriers          11  895  2.07 1.29      1    1.89 0.00   1    5
+## inside.barriers           12  887  1.26 0.68      1    1.07 0.00   1    5
+## what.barriers*            13  120   NaN   NA     NA     NaN   NA Inf -Inf
+## position*                 14  205 13.14 5.84     14   13.70 2.97   1   23
+##                         range  skew kurtosis   se
+## timestamp*               -Inf    NA       NA   NA
+## course.delivered            6 -1.92     4.20 0.03
+## instructor.communicated     6 -1.92     4.35 0.03
+## hear*                      50  0.44     0.88 0.21
+## interest                    6 -2.84    11.11 0.02
+## department*                26 -0.37    -1.34 0.37
+## verbs*                   -Inf    NA       NA   NA
+## useful                      6 -1.57     2.89 0.04
+## gender*                     2  0.58    -1.58 0.02
+## ethnicity*               -Inf    NA       NA   NA
+## outside.barriers            4  0.87    -0.53 0.04
+## inside.barriers             4  2.93     8.62 0.02
+## what.barriers*           -Inf    NA       NA   NA
+## position*                  22 -0.85    -0.27 0.41
+

you can use dplyr::group_by to generate summaries

+
library(dplyr)
+dat %>% group_by(gender) %>% summarize(n())
+
## Source: local data frame [4 x 2]
+##
+##                              gender   n()
+##                              (fctr) (int)
+## 1                      Female/Woman   579
+## 2                          Male/Man   332
+## 3 Genderqueer/Gender non-conforming     1
+## 4                                NA   150
+

and you can combine dplyr with tidyr::spread to generate crosstabs

+
+

side note - we are filtering out missing values of gender, because tidyr doesn’t allow NA as a column name

+
+
library(tidyr)
+dat %>% filter(!is.na(gender)) %>% group_by(gender, department) %>% 
+  summarize(n=n()) %>% spread(gender, n)
+
## Source: local data frame [28 x 4]
+##
+##                  department Female/Woman Male/Man
+##                      (fctr)        (int)    (int)
+## 1  African American Studies            8       16
+## 2  Ag & Resource Econ & Pol           20        3
+## 3              Anthropology            9        3
+## 4   App Sci & Tech Grad Grp            6        4
+## 5    Biostatistics Grad Grp            5        3
+## 6  City & Regional Planning           12        7
+## 7                 Economics           16        5
+## 8                 Education           20        3
+## 9  Energy & Resources Group           10        3
+## 10  Env Sci, Policy, & Mgmt           11        5
+## ..                      ...          ...      ...
+## Variables not shown: Genderqueer/Gender non-conforming (int)
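if you would rather have the missing responses show up as their own column, base R's `addNA` keeps `NA` as an explicit factor level - a minimal sketch on a made-up toy factor:

```r
# addNA() promotes NA to a real factor level, so downstream tools can
# treat "missing" as just another category
g <- factor(c("F", "M", NA, "F"))
levels(g)         # "F" "M" - the NA is invisible
levels(addNA(g))  # "F" "M" NA - now it's an explicit level
table(addNA(g))   # counts now include a cell for the missing responses
```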

Plotting

@@ -239,10 +272,21 @@

install.packages('ggplot2')

-
## 
+
##
 ## The downloaded binary packages are in
-##  /var/folders/rj/8gpcssqd52z9yrqw7f8xxfym0000gn/T//Rtmp2xjYZ7/downloaded_packages
+## /var/folders/rj/8gpcssqd52z9yrqw7f8xxfym0000gn/T//RtmpmP1txl/downloaded_packages
library(ggplot2)
+

getting weekdays

+

let’s imagine that we are interested in looking at differences in feedback based on the day of the week – how would we do this in R?

+
+

side note - `weekdays` is locale-aware, so students who have their laptop language set to something other than English will get their weekday names in that language

+
+
dat$wday <- factor(weekdays(dat$timestamp, abbreviate = TRUE),
+                   levels = c('Mon','Tue','Wed','Thu','Fri','Sat','Sun')
+                   )
+summary(dat$wday)
+
## Mon Tue Wed Thu Fri Sat Sun
+## 168 124 144 323 277  16  10
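if you want weekday labels that don't depend on the student's locale, one workaround is to go through the ISO weekday number via `format`'s `%u` code and attach your own labels (the dates below are made up for illustration):

```r
# %u gives the ISO weekday as a number (1 = Monday), independent of locale;
# we then index into our own vector of English labels
dates <- as.Date(c("2015-01-26", "2015-01-27"))   # a Monday and a Tuesday
wday_num <- as.integer(format(dates, "%u"))
wday_lab <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")[wday_num]
wday_lab  # "Mon" "Tue"
```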

use qplot for initial poking around

it has very strong intuitions about what you want to see, and is not particularly customizable

qplot(instructor.communicated, data = dat)
@@ -323,45 +367,45 @@

Mean testing

we’ll start by trying to tell whether differences between group summaries are real

t.test with two vectors (default method)

t.test(dat$inside.barriers, dat$outside.barriers)
-
## 
+
##
 ##  Welch Two Sample t-test
-## 
+##
 ## data:  dat$inside.barriers and dat$outside.barriers
 ## t = -16.638, df = 1356.8, p-value < 2.2e-16
 ## alternative hypothesis: true difference in means is not equal to 0
 ## 95 percent confidence interval:
 ##  -0.9092224 -0.7174269
 ## sample estimates:
-## mean of x mean of y 
+## mean of x mean of y
 ##  1.259301  2.072626

note that R takes care of the defaults for you - what it is really computing is `t.test(dat$inside.barriers, dat$outside.barriers, alternative = "two.sided", paired = FALSE, var.equal = FALSE, mu = 0, conf.level = 0.95)`

how would you find this out for yourself?
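one way: read the help page with `?t.test`, or inspect the default method's formal arguments directly - a sketch (note that `stats:::t.test.default` reaches into the stats namespace, since the generic itself only declares `x` and `...`):

```r
# the generic t.test() just dispatches; the defaults live on the default method
formals(stats:::t.test.default)$mu          # 0
formals(stats:::t.test.default)$paired      # FALSE
formals(stats:::t.test.default)$conf.level  # 0.95
```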

t.test with subsets of one vector (default method)

t.test(dat$outside.barriers[dat$gender == "Male/Man"], dat$outside.barriers[dat$gender == "Female/Woman"])
-
## 
+
##
 ##  Welch Two Sample t-test
-## 
+##
 ## data:  dat$outside.barriers[dat$gender == "Male/Man"] and dat$outside.barriers[dat$gender == "Female/Woman"]
 ## t = -6.9925, df = 748.19, p-value = 5.993e-12
 ## alternative hypothesis: true difference in means is not equal to 0
 ## 95 percent confidence interval:
 ##  -0.7650033 -0.4296142
 ## sample estimates:
-## mean of x mean of y 
+## mean of x mean of y
 ##  1.702875  2.300184

recall that we mentioned inconsistency on day one - here it is, and in a big way

t.test with S3 method

t.test(outside.barriers ~ gender, data = dat, subset = dat$gender %in% c("Male/Man", "Female/Woman"))
-
## 
+
##
 ##  Welch Two Sample t-test
-## 
+##
 ## data:  outside.barriers by gender
 ## t = 6.9925, df = 748.19, p-value = 5.993e-12
 ## alternative hypothesis: true difference in means is not equal to 0
 ## 95 percent confidence interval:
 ##  0.4296142 0.7650033
 ## sample estimates:
-## mean in group Female/Woman     mean in group Male/Man 
+## mean in group Female/Woman     mean in group Male/Man
 ##                   2.300184                   1.702875

aov

first, you would think anova would be called by anova, but that’s reserved for conducting F-tests on lm objects
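to see what `anova` actually does, fit an `lm` and hand it the fitted object - a minimal sketch on the built-in `mtcars` data:

```r
# anova() runs F-tests on a fitted model object, not on raw data
fit <- lm(mpg ~ factor(cyl), data = mtcars)
anova(fit)  # one row per term, with Df, Sum Sq, F value, and Pr(>F)
```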

@@ -372,12 +416,12 @@

aov

aov(outside.barriers ~ gender, data = dat)
## Call:
 ##    aov(formula = outside.barriers ~ gender, data = dat)
-## 
+##
 ## Terms:
 ##                    gender Residuals
 ## Sum of Squares    79.3444 1363.4374
 ## Deg. of Freedom         2       854
-## 
+##
 ## Residual standard error: 1.263539
 ## Estimated effects may be unbalanced
 ## 205 observations deleted due to missingness
@@ -385,9 +429,9 @@

aov

remember our old friend summary? it works on almost everything

model.1 <- aov(outside.barriers ~ gender, data = dat)
 summary(model.1)
-
##              Df Sum Sq Mean Sq F value   Pr(>F)    
+
##              Df Sum Sq Mean Sq F value   Pr(>F)
 ## gender        2   79.3   39.67   24.85 3.24e-11 ***
-## Residuals   854 1363.4    1.60                     
+## Residuals   854 1363.4    1.60
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 205 observations deleted due to missingness
@@ -395,9 +439,9 @@

aov

TukeyHSD(model.1)
##   Tukey multiple comparisons of means
 ##     95% family-wise confidence level
-## 
+##
 ## Fit: aov(formula = outside.barriers ~ gender, data = dat)
-## 
+##
 ## $gender
 ##                                                      diff        lwr
 ## Male/Man-Female/Woman                          -0.5973088 -0.8078392
@@ -418,16 +462,16 @@ 

cor.test (Pearson)

earlier, we were looking at differences between the means of two variables

but those variables were both continuous, so we can ask whether they are related

cor.test(dat$outside.barriers, dat$inside.barriers)
-
## 
+
##
 ##  Pearson's product-moment correlation
-## 
+##
 ## data:  dat$outside.barriers and dat$inside.barriers
 ## t = 15.558, df = 882, p-value < 2.2e-16
 ## alternative hypothesis: true correlation is not equal to 0
 ## 95 percent confidence interval:
 ##  0.4106679 0.5142422
 ## sample estimates:
-##       cor 
+##       cor
 ## 0.4640396

okay, so they’re related - now what?

lm

@@ -436,37 +480,37 @@

lm

the basic call is the S3 method

model.1 <- lm(inside.barriers ~ outside.barriers, data = dat)
 summary(model.1)
-
## 
+
##
 ## Call:
 ## lm(formula = inside.barriers ~ outside.barriers, data = dat)
-## 
+##
 ## Residuals:
-##      Min       1Q   Median       3Q      Max 
-## -0.98483 -0.24569  0.00069  0.00069  3.01517 
-## 
+##      Min       1Q   Median       3Q      Max
+## -0.98483 -0.24569  0.00069  0.00069  3.01517
+##
 ## Coefficients:
-##                  Estimate Std. Error t value Pr(>|t|)    
+##                  Estimate Std. Error t value Pr(>|t|)
 ## (Intercept)       0.75292    0.03842   19.60   <2e-16 ***
 ## outside.barriers  0.24638    0.01584   15.56   <2e-16 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
+##
 ## Residual standard error: 0.6041 on 882 degrees of freedom
 ##   (178 observations deleted due to missingness)
-## Multiple R-squared:  0.2153, Adjusted R-squared:  0.2144 
+## Multiple R-squared:  0.2153, Adjusted R-squared:  0.2144
 ## F-statistic:   242 on 1 and 882 DF,  p-value: < 2.2e-16

R automatically one-hot encodes your categories

model.2 <- lm(inside.barriers ~ outside.barriers + department, data = dat)
 summary(model.2)
-
## 
+
##
 ## Call:
-## lm(formula = inside.barriers ~ outside.barriers + department, 
+## lm(formula = inside.barriers ~ outside.barriers + department,
 ##     data = dat)
-## 
+##
 ## Residuals:
-##      Min       1Q   Median       3Q      Max 
-## -1.20049 -0.36011 -0.04989  0.17705  2.91702 
-## 
+##      Min       1Q   Median       3Q      Max
+## -1.20049 -0.36011 -0.04989  0.17705  2.91702
+##
 ## Coefficients:
 ##                                     Estimate Std. Error t value Pr(>|t|)
 ## (Intercept)                          0.91782    0.14467   6.344 5.57e-10
@@ -497,54 +541,54 @@ 

R automatically one-hot encodes your categories

## departmentRhetoric                    0.17521    0.24153   0.725   0.4686
## departmentSlavic Languages & Lit     -0.19495    0.26748  -0.729   0.4665
## departmentSociology                  -0.34162    0.17664  -1.934   0.0537
##
## (Intercept)                          ***
## outside.barriers                     ***
## departmentAg & Resource Econ & Pol   *
## departmentAnthropology
## departmentApp Sci & Tech Grad Grp
## departmentBiostatistics Grad Grp
## departmentCity & Regional Planning
## departmentEconomics                  .
## departmentEducation
## departmentEnergy & Resources Group   .
## departmentEnv Sci, Policy, & Mgmt
## departmentEthnic Studies Grad Grp
## departmentHistory
## departmentIndustrial Eng & Ops Rsch
## departmentInformation
## departmentIntegrative Biology        .
## departmentJSP Grad Pgm
## departmentLaw
## departmentLinguistics
## departmentMusic
## departmentNeuroscience
## departmentPolitical Science
## departmentPsychology
## departmentPublic Health              *
## departmentPublic Policy
## departmentRhetoric
## departmentSlavic Languages & Lit
## departmentSociology                  .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6462 on 440 degrees of freedom
##   (594 observations deleted due to missingness)
## Multiple R-squared:  0.2759, Adjusted R-squared:  0.2314
## F-statistic: 6.209 on 27 and 440 DF,  p-value: < 2.2e-16
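you can see the coding R will use with `contrasts` and `model.matrix` - a sketch on a made-up toy factor (with the default treatment contrasts, the first level becomes the reference):

```r
f <- factor(c("a", "b", "c", "a"))
contrasts(f)             # indicator columns for "b" and "c"; "a" is the reference
head(model.matrix(~ f))  # the design matrix lm() builds: intercept + those columns
```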

R does not assume you want the full factorial model

model.3 <- lm(inside.barriers ~ outside.barriers + department + outside.barriers*department, data = dat)
 summary(model.3)
-
## 
+
##
 ## Call:
-## lm(formula = inside.barriers ~ outside.barriers + department + 
+## lm(formula = inside.barriers ~ outside.barriers + department +
 ##     outside.barriers * department, data = dat)
-## 
+##
 ## Residuals:
-##      Min       1Q   Median       3Q      Max 
-## -1.75495 -0.25924  0.00000  0.05784  2.80608 
-## 
+##      Min       1Q   Median       3Q      Max
+## -1.75495 -0.25924  0.00000  0.05784  2.80608
+##
 ## Coefficients: (3 not defined because of singularities)
 ##                                                        Estimate Std. Error
 ## (Intercept)                                           0.3378995  0.2274560
@@ -601,71 +645,71 @@ 

R does not assume you want the full factorial model

## outside.barriers:departmentRhetoric                   2.1457382  0.4109273
## outside.barriers:departmentSlavic Languages & Lit            NA         NA
## outside.barriers:departmentSociology                 -0.4996106  0.1372998
##                                                      t value Pr(>|t|)
## (Intercept)                                            1.486 0.138151
## outside.barriers                                       5.636 3.22e-08 ***
## departmentAg & Resource Econ & Pol                     1.634 0.102964
## departmentAnthropology                                 0.234 0.814794
## departmentApp Sci & Tech Grad Grp                      0.000 0.999802
## departmentBiostatistics Grad Grp                      -1.350 0.177895
## departmentCity & Regional Planning                     0.836 0.403848
## departmentEconomics                                    2.013 0.044719 *
## departmentEducation                                    0.359 0.719428
## departmentEnergy & Resources Group                     1.609 0.108295
## departmentEnv Sci, Policy, & Mgmt                      0.379 0.704979
## departmentEthnic Studies Grad Grp                     -0.911 0.362671
## departmentHistory                                     -0.480 0.631266
## departmentIndustrial Eng & Ops Rsch                    0.405 0.685368
## departmentInformation                                  0.532 0.595352
## departmentIntegrative Biology                          1.694 0.091057 .
## departmentJSP Grad Pgm                                -0.595 0.552372
## departmentLaw                                          1.712 0.087560 .
## departmentLinguistics                                  1.932 0.054032 .
## departmentMusic                                       -1.261 0.208134
## departmentNeuroscience                                 0.717 0.473811
## departmentPolitical Science                            1.203 0.229669
## departmentPsychology                                   2.158 0.031503 *
## departmentPublic Health                                1.668 0.096018 .
## departmentPublic Policy                                1.009 0.313790
## departmentRhetoric                                    -5.518 6.03e-08 ***
## departmentSlavic Languages & Lit                       0.226 0.821167
## departmentSociology                                    2.034 0.042571 *
## outside.barriers:departmentAg & Resource Econ & Pol   -3.649 0.000297 ***
## outside.barriers:departmentAnthropology               -1.118 0.264358
## outside.barriers:departmentApp Sci & Tech Grad Grp     0.006 0.995116
## outside.barriers:departmentBiostatistics Grad Grp      1.303 0.193236
## outside.barriers:departmentCity & Regional Planning   -1.007 0.314480
## outside.barriers:departmentEconomics                  -3.395 0.000752 ***
## outside.barriers:departmentEducation                  -1.409 0.159722
## outside.barriers:departmentEnergy & Resources Group   -3.251 0.001242 **
## outside.barriers:departmentEnv Sci, Policy, & Mgmt    -0.852 0.394793
## outside.barriers:departmentEthnic Studies Grad Grp        NA       NA
## outside.barriers:departmentHistory                     1.036 0.300571
## outside.barriers:departmentIndustrial Eng & Ops Rsch  -1.034 0.301967
## outside.barriers:departmentInformation                -1.137 0.256041
## outside.barriers:departmentIntegrative Biology        -3.273 0.001154 **
## outside.barriers:departmentJSP Grad Pgm                0.826 0.409544
## outside.barriers:departmentLaw                        -3.329 0.000950 ***
## outside.barriers:departmentLinguistics                -3.011 0.002758 **
## outside.barriers:departmentMusic                          NA       NA
## outside.barriers:departmentNeuroscience               -0.882 0.378243
## outside.barriers:departmentPolitical Science          -2.181 0.029771 *
## outside.barriers:departmentPsychology                 -3.046 0.002465 **
## outside.barriers:departmentPublic Health              -3.660 0.000284 ***
## outside.barriers:departmentPublic Policy              -1.996 0.046612 *
## outside.barriers:departmentRhetoric                    5.222 2.80e-07 ***
## outside.barriers:departmentSlavic Languages & Lit         NA       NA
## outside.barriers:departmentSociology                  -3.639 0.000308 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.586 on 417 degrees of freedom
##   (594 observations deleted due to missingness)
## Multiple R-squared:  0.4357, Adjusted R-squared:  0.368
## F-statistic: 6.439 on 50 and 417 DF,  p-value: < 2.2e-16

extract model parameters with $

model.1$coefficients
-
##      (Intercept) outside.barriers 
+
##      (Intercept) outside.barriers
 ##        0.7529250        0.2463815
model.1$coefficients[[2]]
## [1] 0.2463815
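the extractor functions `coef` and `confint` do the same job and are a bit more robust than reaching in with `$` - a sketch on the built-in `cars` data:

```r
fit <- lm(dist ~ speed, data = cars)
coef(fit)["speed"]     # the slope estimate
confint(fit, "speed")  # its 95 percent confidence interval
```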
@@ -696,9 +740,9 @@

ranked variables

median testing ranks

we use the Mann-Whitney (Wilcoxon rank-sum) test to ask whether the ranks are centered the same way

wilcox.test(dat$outside.barriers, dat$inside.barriers, alternative = "two.sided", paired = FALSE, mu = 0, conf.level = 0.95)
-
## 
+
##
 ##  Wilcoxon rank sum test with continuity correction
-## 
+##
 ## data:  dat$outside.barriers and dat$inside.barriers
 ## W = 541240, p-value < 2.2e-16
 ## alternative hypothesis: true location shift is not equal to 0
@@ -708,14 +752,14 @@

correlating ranks

cor.test(dat$outside.barriers, dat$inside.barriers, method = 'spearman')
## Warning in cor.test.default(dat$outside.barriers, dat$inside.barriers,
 ## method = "spearman"): Cannot compute exact p-value with ties
-
## 
+
##
 ##  Spearman's rank correlation rho
-## 
+##
 ## data:  dat$outside.barriers and dat$inside.barriers
 ## S = 63037000, p-value < 2.2e-16
 ## alternative hypothesis: true rho is not equal to 0
 ## sample estimates:
-##       rho 
+##       rho
 ## 0.4524909

rho is pretty close to the r from above

chisq

@@ -724,9 +768,9 @@

chisq

chisq.test(dat$gender, dat$department)
## Warning in chisq.test(dat$gender, dat$department): Chi-squared
 ## approximation may be incorrect
-
## 
+
##
 ##  Pearson's Chi-squared test
-## 
+##
 ## data:  dat$gender and dat$department
 ## X-squared = 76.442, df = 26, p-value = 7.326e-07

diff --git a/instructor/day_two.html b/instructor/day_two.html index adca61f..5a8d1bb 100644 --- a/instructor/day_two.html +++ b/instructor/day_two.html @@ -55,7 +55,7 @@

Day Two: Data Cleaning

Dillon Niederhut
Shinhye Choi

-

02 May, 2016

+

24 May, 2016

Review

@@ -66,28 +66,28 @@

Inspecting objects

Inspecting variables

We should see 50 values in this division variable - but how many levels?

state.division
-
##  [1] East South Central Pacific            Mountain          
-##  [4] West South Central Pacific            Mountain          
-##  [7] New England        South Atlantic     South Atlantic    
-## [10] South Atlantic     Pacific            Mountain          
+
##  [1] East South Central Pacific            Mountain
+##  [4] West South Central Pacific            Mountain
+##  [7] New England        South Atlantic     South Atlantic
+## [10] South Atlantic     Pacific            Mountain
 ## [13] East North Central East North Central West North Central
 ## [16] West North Central East South Central West South Central
-## [19] New England        South Atlantic     New England       
+## [19] New England        South Atlantic     New England
 ## [22] East North Central West North Central East South Central
 ## [25] West North Central Mountain           West North Central
-## [28] Mountain           New England        Middle Atlantic   
-## [31] Mountain           Middle Atlantic    South Atlantic    
+## [28] Mountain           New England        Middle Atlantic
+## [31] Mountain           Middle Atlantic    South Atlantic
 ## [34] West North Central East North Central West South Central
-## [37] Pacific            Middle Atlantic    New England       
+## [37] Pacific            Middle Atlantic    New England
 ## [40] South Atlantic     West North Central East South Central
-## [43] West South Central Mountain           New England       
-## [46] South Atlantic     Pacific            South Atlantic    
-## [49] East North Central Mountain          
+## [43] West South Central Mountain           New England
+## [46] South Atlantic     Pacific            South Atlantic
+## [49] East North Central Mountain
 ## 9 Levels: New England Middle Atlantic ... Pacific
length(state.division)
## [1] 50
levels(state.division)
-
## [1] "New England"        "Middle Atlantic"    "South Atlantic"    
+
## [1] "New England"        "Middle Atlantic"    "South Atlantic"
 ## [4] "East South Central" "West South Central" "East North Central"
 ## [7] "West North Central" "Mountain"           "Pacific"

Inspecting data frames

@@ -190,7 +190,7 @@

str(dirty)

## 'data.frame':    5 obs. of  5 variables:
 ##  $ Timestamp                  : Factor w/ 5 levels "7/25/2015 10:08:41",..: 1 2 3 4 5
-##  $ How.tall.are.you.          : Factor w/ 5 levels "156","2.1","5’9",..: 5 4 3 2 1
+##  $ How.tall.are.you.          : Factor w/ 5 levels "156","2.1","5'9",..: 5 4 3 2 1
 ##  $ What.department.are.you.in.: Factor w/ 5 levels "  geology","999",..: 4 2 1 5 3
 ##  $ Are.you.currently.enrolled.: Factor w/ 3 levels "999","No","Yes": 3 3 1 2 1
 ##  $ What.is.your.birth.order.  : Factor w/ 3 levels "1","2","9,000": 1 1 2 3 2
@@ -200,16 +200,16 @@

str(dirty)

## 'data.frame':    5 obs. of  5 variables:
 ##  $ Timestamp                  : chr  "7/25/2015 10:08:41" "7/25/2015 10:10:56" "7/25/2015 10:11:20" "7/25/2015 10:11:25" ...
-##  $ How.tall.are.you.          : chr  "very" "70" "5’9" "2.1" ...
+##  $ How.tall.are.you.          : chr  "very" "70" "5'9" "2.1" ...
 ##  $ What.department.are.you.in.: chr  "Geology  " "999" "  geology" "goelogy" ...
 ##  $ Are.you.currently.enrolled.: chr  "Yes" "Yes" "999" "No" ...
 ##  $ What.is.your.birth.order.  : chr  "1" "1" "2" "9,000" ...

let’s start by removing the empty rows and columns

tail(dirty)
##            Timestamp How.tall.are.you. What.department.are.you.in.
-## 1 7/25/2015 10:08:41              very                   Geology  
+## 1 7/25/2015 10:08:41              very                   Geology
 ## 2 7/25/2015 10:10:56                70                         999
-## 3 7/25/2015 10:11:20               5’9                     geology
+## 3 7/25/2015 10:11:20               5'9                     geology
 ## 4 7/25/2015 10:11:25               2.1                     goelogy
 ## 5 7/25/2015 10:11:29               156                      anthro
 ##   Are.you.currently.enrolled. What.is.your.birth.order.
@@ -224,7 +224,7 @@ 


you can replace variable names

and you should, if they are uninformative or long

names(dirty)
-
## [1] "Timestamp"                   "How.tall.are.you."          
+
## [1] "Timestamp"                   "How.tall.are.you."
 ## [3] "What.department.are.you.in." "Are.you.currently.enrolled."
 ## [5] "What.is.your.birth.order."
names(dirty) <- c("time", "height", "dept", "enroll", "birth.order")
@@ -234,13 +234,13 @@

you should replace all of these values in your dataframe with R’s missingness signifier, NA

table(dirty$enroll)
-
## 
-## 999  No Yes 
+
##
+## 999  No Yes
 ##   2   1   2
dirty$enroll[dirty$enroll=="999"] <- NA
 table(dirty$enroll, useNA = "ifany")
-
## 
-##   No  Yes <NA> 
+
##
+##   No  Yes <NA>
 ##    1    2    2

side note - read.table() has an option to specify field values as NA as soon as you import the data, but this is a BAAAAD idea because R automatically encodes blank fields as missing too, and thus you lose the ability to distinguish between user-missing and experimenter-missing
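a quick sketch of why this bites (toy data; the column names are invented for illustration):

```r
# na.strings recodes the sentinel "999" at import time, but an empty
# numeric field is *also* read in as NA -- so user-missing (999) and
# experimenter-missing (blank) collapse into the same code
raw <- "height,dept\n70,geology\n999,anthro\n,history\n65,geology"
d <- read.csv(text = raw, na.strings = "999")
is.na(d$height)
## [1] FALSE  TRUE  TRUE FALSE
```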

@@ -310,27 +310,27 @@

remember how we tal

let’s use this large dataset as an example

large <- read.csv('data/large.csv')
 summary(large)
-
##        a                   b               c             
-##  Min.   :-33.98426   Min.   :-13.4   Min.   :-249998.64  
-##  1st Qu.: -6.71903   1st Qu.:128.6   1st Qu.:-141005.65  
-##  Median :  0.41681   Median :256.9   Median : -63498.56  
-##  Mean   :  0.00176   Mean   :252.2   Mean   : -83954.09  
-##  3rd Qu.:  7.00630   3rd Qu.:377.5   3rd Qu.: -15748.98  
-##  Max.   : 35.33306   Max.   :513.3   Max.   :     11.77  
+
##        a                   b               c
+##  Min.   :-33.98426   Min.   :-13.4   Min.   :-249998.64
+##  1st Qu.: -6.71903   1st Qu.:128.6   1st Qu.:-141005.65
+##  Median :  0.41681   Median :256.9   Median : -63498.56
+##  Mean   :  0.00176   Mean   :252.2   Mean   : -83954.09
+##  3rd Qu.:  7.00630   3rd Qu.:377.5   3rd Qu.: -15748.98
+##  Max.   : 35.33306   Max.   :513.3   Max.   :     11.77
 ##  NA's   :45          NA's   :45      NA's   :45
nrow(na.omit(large))
## [1] 871

for it to work you need low missingness and large N

a <- amelia(large,m = 1)
## -- Imputation 1 --
-## 
+##
 ##   1  2  3
print(a)
-
## 
+
##
 ## Amelia output with 1 imputed datasets.
-## Return code:  1 
-## Message:  Normal EM convergence. 
-## 
+## Return code:  1
+## Message:  Normal EM convergence.
+##
 ## Chain Lengths:
 ## --------------
 ## Imputation 1:  3
@@ -338,12 +338,12 @@

large.imputed <- a[[1]][[1]]
 summary(large.imputed)
-
##        a                   b               c          
-##  Min.   :-33.98426   Min.   :-13.4   Min.   :-249999  
-##  1st Qu.: -6.73649   1st Qu.:126.5   1st Qu.:-140641  
-##  Median :  0.30970   Median :252.0   Median : -63513  
-##  Mean   : -0.01213   Mean   :250.0   Mean   : -83156  
-##  3rd Qu.:  6.99412   3rd Qu.:373.9   3rd Qu.: -15561  
+
##        a                   b               c
+##  Min.   :-33.98426   Min.   :-13.4   Min.   :-249999
+##  1st Qu.: -6.73649   1st Qu.:126.5   1st Qu.:-140641
+##  Median :  0.30970   Median :252.0   Median : -63513
+##  Mean   : -0.01213   Mean   :250.0   Mean   : -83156
+##  3rd Qu.:  6.99412   3rd Qu.:373.9   3rd Qu.: -15561
 ##  Max.   : 35.33306   Max.   :518.7   Max.   :  69498

if you give it a tiny dataset, it will fuss at you

a <- amelia(large[990:1000,],m = 1)
@@ -352,14 +352,14 @@

## variables in the imputation model. Consider removing some variables, or
## reducing the order of time polynomials to reduce the number of parameters.

## -- Imputation 1 --
-## 
+##
 ##   1  2
print(a)
-
## 
+
##
 ## Amelia output with 1 imputed datasets.
-## Return code:  1 
-## Message:  Normal EM convergence. 
-## 
+## Return code:  1
+## Message:  Normal EM convergence.
+##
 ## Chain Lengths:
 ## --------------
 ## Imputation 1:  2
@@ -404,10 +404,10 @@

subsetting data frames

my.data$numeric == 2
## logical(0)
my.data[my.data$numeric == 2,]
-
## [1] n                                        
-## [2] c                                        
-## [3] b                                        
-## [4] d                                        
+
## [1] n
+## [2] c
+## [3] b
+## [4] d
 ## [5] really.long.and.complicated.variable.name
 ## <0 rows> (or 0-length row.names)

boolean variables can act as filters right out of the box
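a minimal sketch (invented toy data frame):

```r
# a logical column subsets rows directly -- no comparison needed
df <- data.frame(x = 1:4, keep = c(TRUE, FALSE, TRUE, FALSE))
df[df$keep, ]   # rows 1 and 3; df[df$keep == TRUE, ] is equivalent but redundant
```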

@@ -423,19 +423,19 @@

you can also select columns

you can also match elements from a vector

good.things <- c("three", "four", "five")
 my.data[my.data$character %in% good.things, ]
-
## [1] n                                        
-## [2] c                                        
-## [3] b                                        
-## [4] d                                        
+
## [1] n
+## [2] c
+## [3] b
+## [4] d
 ## [5] really.long.and.complicated.variable.name
 ## <0 rows> (or 0-length row.names)

most subsetting operations on dataframes also return a dataframe

str(my.data[!(my.data$character %in% good.things), ])
## 'data.frame':    0 obs. of  5 variables:
-##  $ n                                        : num 
-##  $ c                                        : Factor w/ 3 levels "one","three",..: 
-##  $ b                                        : logi 
-##  $ d                                        :Class 'Date'  num(0) 
+##  $ n                                        : num
+##  $ c                                        : Factor w/ 3 levels "one","three",..:
+##  $ b                                        : logi
+##  $ d                                        :Class 'Date'  num(0)
 ##  $ really.long.and.complicated.variable.name: num

subsets that are a single column return a vector

str(my.data$numeric)
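the drop-to-vector behaviour can be sketched like this (toy data frame, names invented):

```r
df <- data.frame(a = 1:3, b = c("one", "two", "three"))
class(df[, "a"])                 # "integer" -- a single column drops to a vector
class(df[, "a", drop = FALSE])   # "data.frame" -- drop = FALSE keeps the frame
class(df["a"])                   # list-style indexing also returns a data frame
```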
@@ -484,16 +484,16 @@

reshaping

side note - don’t worry about how this works yet - we’ll talk about it tomorrow

t.test(score ~ time, data=normal)
-
## 
+
##
 ##  Welch Two Sample t-test
-## 
+##
 ## data:  score by time
 ## t = 0.58132, df = 2.0278, p-value = 0.6191
 ## alternative hypothesis: true difference in means is not equal to 0
 ## 95 percent confidence interval:
 ##  -73.56101  96.89434
 ## sample estimates:
-## mean in group 1 mean in group 2 
+## mean in group 1 mean in group 2
 ##       110.00000        98.33333

it’s easy to combine tidy tables to compare different levels of information simultaneously
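for example, with base R's merge() (toy tables; the names are invented for illustration):

```r
# two tidy tables keyed on the same id column
scores <- data.frame(id = 1:3, score = c(90, 85, 100))
depts  <- data.frame(id = 1:3, dept = c("geology", "anthro", "history"))
merge(scores, depts, by = "id")   # one row per id, both tables' columns
```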

@@ -608,7 +608,7 @@

dplyr allows you to apply
group_by(normal, time)
## Source: local data frame [6 x 4]
 ## Groups: time [2]
-## 
+##
 ##     name  time score    id
 ##   (fctr) (dbl) (dbl) (int)
 ## 1  Alice     1    90     1
@@ -619,7 +619,7 @@ 

## 6    Eve     2   100     6

summarize(group_by(normal, time), mean(score))
## Source: local data frame [2 x 2]
-## 
+##
 ##    time mean(score)
 ##   (dbl)       (dbl)
 ## 1     1   110.00000
@@ -627,7 +627,7 @@ 

mutate(group_by(normal, time), diff=score-mean(score))
## Source: local data frame [6 x 5]
 ## Groups: time [2]
-## 
+##
 ##     name  time score    id       diff
 ##   (fctr) (dbl) (dbl) (int)      (dbl)
 ## 1  Alice     1    90     1 -20.000000
@@ -638,7 +638,7 @@ 

## 6    Eve     2   100     6   1.666667

ungroup(mutate(group_by(normal, time), diff=score-mean(score)))
## Source: local data frame [6 x 5]
-## 
+##
 ##     name  time score    id       diff
 ##   (fctr) (dbl) (dbl) (int)      (dbl)
 ## 1  Alice     1    90     1 -20.000000
diff --git a/instructor/overflow.html b/instructor/overflow.html
index 7941bf6..b7c300b 100644
--- a/instructor/overflow.html
+++ b/instructor/overflow.html
@@ -54,7 +54,7 @@ 

Additional Course Materials

Dillon Niederhut

-

02 May, 2016

+

24 May, 2016

@@ -70,87 +70,87 @@

R has an interface to curl

call library(XML)

you can use this to access remote data

you may just want to read text lines from a webpage

-
RJ <- readLines("http://shakespeare.mit.edu/romeo_juliet/full.html")  
+
RJ <- readLines("http://shakespeare.mit.edu/romeo_juliet/full.html")
 RJ[1:25]
-
##  [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\""              
-##  [2] " \"http://www.w3.org/TR/REC-html40/loose.dtd\">"                              
-##  [3] " <html>"                                                                      
-##  [4] " <head>"                                                                      
-##  [5] " <title>Romeo and Juliet: Entire Play"                                        
-##  [6] " </title>"                                                                    
+
##  [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\""
+##  [2] " \"http://www.w3.org/TR/REC-html40/loose.dtd\">"
+##  [3] " <html>"
+##  [4] " <head>"
+##  [5] " <title>Romeo and Juliet: Entire Play"
+##  [6] " </title>"
 ##  [7] " <meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-1\">"
-##  [8] " <LINK rel=\"stylesheet\" type=\"text/css\" media=\"screen\""                 
-##  [9] "       href=\"/shake.css\">"                                                  
-## [10] " </HEAD>"                                                                     
-## [11] " <body bgcolor=\"#ffffff\" text=\"#000000\">"                                 
-## [12] ""                                                                             
-## [13] "<table width=\"100%\" bgcolor=\"#CCF6F6\">"                                   
-## [14] "<tr><td class=\"play\" align=\"center\">Romeo and Juliet"                     
-## [15] "<tr><td class=\"nav\" align=\"center\">"                                      
-## [16] "      <a href=\"/Shakespeare\">Shakespeare homepage</A> "                     
-## [17] "    | <A href=\"/romeo_juliet/\">Romeo and Juliet</A> "                       
-## [18] "    | Entire play"                                                            
-## [19] "</table>"                                                                     
-## [20] ""                                                                             
-## [21] "<H3>ACT I</h3>"                                                               
-## [22] "<h3>PROLOGUE</h3>"                                                            
-## [23] "<blockquote>"                                                                 
-## [24] "<A NAME=1.0.1>Two households, both alike in dignity,</A><br>"                 
+##  [8] " <LINK rel=\"stylesheet\" type=\"text/css\" media=\"screen\""
+##  [9] "       href=\"/shake.css\">"
+## [10] " </HEAD>"
+## [11] " <body bgcolor=\"#ffffff\" text=\"#000000\">"
+## [12] ""
+## [13] "<table width=\"100%\" bgcolor=\"#CCF6F6\">"
+## [14] "<tr><td class=\"play\" align=\"center\">Romeo and Juliet"
+## [15] "<tr><td class=\"nav\" align=\"center\">"
+## [16] "      <a href=\"/Shakespeare\">Shakespeare homepage</A> "
+## [17] "    | <A href=\"/romeo_juliet/\">Romeo and Juliet</A> "
+## [18] "    | Entire play"
+## [19] "</table>"
+## [20] ""
+## [21] "<H3>ACT I</h3>"
+## [22] "<h3>PROLOGUE</h3>"
+## [23] "<blockquote>"
+## [24] "<A NAME=1.0.1>Two households, both alike in dignity,</A><br>"
 ## [25] "<A NAME=1.0.2>In fair Verona, where we lay our scene,</A><br>"

and use the kinds of string manipulation we learned yesterday to retrieve the first lines of an act or a scene

RJ[grep("<h3>", RJ, perl=T)]
-
##  [1] "<h3>PROLOGUE</h3>"                                                        
-##  [2] "<h3>SCENE I. Verona. A public place.</h3>"                                
-##  [3] "<h3>SCENE II. A street.</h3>"                                             
-##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"                           
-##  [5] "<h3>SCENE IV. A street.</h3>"                                             
-##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"                             
-##  [7] "<h3>PROLOGUE</h3>"                                                        
-##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"               
-##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [11] "<h3>SCENE IV. A street.</h3>"                                             
-## [12] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"                                
-## [14] "<h3>SCENE I. A public place.</h3>"                                        
-## [15] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"                            
-## [18] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"                                 
-## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"                              
-## [21] "<h3>SCENE III. Juliet's chamber.</h3>"                                    
-## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"                              
-## [23] "<h3>SCENE V. Juliet's chamber.</h3>"                                      
-## [24] "<h3>SCENE I. Mantua. A street.</h3>"                                      
-## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"                                
+
##  [1] "<h3>PROLOGUE</h3>"
+##  [2] "<h3>SCENE I. Verona. A public place.</h3>"
+##  [3] "<h3>SCENE II. A street.</h3>"
+##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"
+##  [5] "<h3>SCENE IV. A street.</h3>"
+##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"
+##  [7] "<h3>PROLOGUE</h3>"
+##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"
+##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [11] "<h3>SCENE IV. A street.</h3>"
+## [12] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"
+## [14] "<h3>SCENE I. A public place.</h3>"
+## [15] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"
+## [18] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"
+## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"
+## [21] "<h3>SCENE III. Juliet's chamber.</h3>"
+## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"
+## [23] "<h3>SCENE V. Juliet's chamber.</h3>"
+## [24] "<h3>SCENE I. Mantua. A street.</h3>"
+## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"
 ## [26] "<h3>SCENE III. A churchyard; in it a tomb belonging to the Capulets.</h3>"
RJ[grep("<h3>", RJ, perl=TRUE)]
-
##  [1] "<h3>PROLOGUE</h3>"                                                        
-##  [2] "<h3>SCENE I. Verona. A public place.</h3>"                                
-##  [3] "<h3>SCENE II. A street.</h3>"                                             
-##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"                           
-##  [5] "<h3>SCENE IV. A street.</h3>"                                             
-##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"                             
-##  [7] "<h3>PROLOGUE</h3>"                                                        
-##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"               
-##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [11] "<h3>SCENE IV. A street.</h3>"                                             
-## [12] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"                                
-## [14] "<h3>SCENE I. A public place.</h3>"                                        
-## [15] "<h3>SCENE II. Capulet's orchard.</h3>"                                    
-## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"                               
-## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"                            
-## [18] "<h3>SCENE V. Capulet's orchard.</h3>"                                     
-## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"                                 
-## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"                              
-## [21] "<h3>SCENE III. Juliet's chamber.</h3>"                                    
-## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"                              
-## [23] "<h3>SCENE V. Juliet's chamber.</h3>"                                      
-## [24] "<h3>SCENE I. Mantua. A street.</h3>"                                      
-## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"                                
+
##  [1] "<h3>PROLOGUE</h3>"
+##  [2] "<h3>SCENE I. Verona. A public place.</h3>"
+##  [3] "<h3>SCENE II. A street.</h3>"
+##  [4] "<h3>SCENE III. A room in Capulet's house.</h3>"
+##  [5] "<h3>SCENE IV. A street.</h3>"
+##  [6] "<h3>SCENE V. A hall in Capulet's house.</h3>"
+##  [7] "<h3>PROLOGUE</h3>"
+##  [8] "<h3>SCENE I. A lane by the wall of Capulet's orchard.</h3>"
+##  [9] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [10] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [11] "<h3>SCENE IV. A street.</h3>"
+## [12] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [13] "<h3>SCENE VI. Friar Laurence's cell.</h3>"
+## [14] "<h3>SCENE I. A public place.</h3>"
+## [15] "<h3>SCENE II. Capulet's orchard.</h3>"
+## [16] "<h3>SCENE III. Friar Laurence's cell.</h3>"
+## [17] "<h3>SCENE IV. A room in Capulet's house.</h3>"
+## [18] "<h3>SCENE V. Capulet's orchard.</h3>"
+## [19] "<h3>SCENE I. Friar Laurence's cell.</h3>"
+## [20] "<h3>SCENE II. Hall in Capulet's house.</h3>"
+## [21] "<h3>SCENE III. Juliet's chamber.</h3>"
+## [22] "<h3>SCENE IV. Hall in Capulet's house.</h3>"
+## [23] "<h3>SCENE V. Juliet's chamber.</h3>"
+## [24] "<h3>SCENE I. Mantua. A street.</h3>"
+## [25] "<h3>SCENE II. Friar Laurence's cell.</h3>"
 ## [26] "<h3>SCENE III. A churchyard; in it a tomb belonging to the Capulets.</h3>"

or maybe pull information out of an RSS feed

link <- "http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml"
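one way to pull the item titles out of a feed, sketched against a tiny in-memory stand-in (the toy feed below is invented; on the real feed the same two calls would run against xmlParse(link)):

```r
library(XML)
# a minimal RSS 2.0-shaped document standing in for the feed above
rss <- '<rss><channel>
          <item><title>First headline</title></item>
          <item><title>Second headline</title></item>
        </channel></rss>'
doc    <- xmlParse(rss, asText = TRUE)                # parse into a DOM tree
titles <- xpathSApply(doc, "//item/title", xmlValue)  # text of each item's <title>
titles
```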
@@ -210,20 +210,20 @@ 

Connecting to a database

install.packages("RPostgreSQL")
 library(RPostgreSQL)
 con <- dbConnect(dbDriver("PostgreSQL"),
-                 dbname="", 
+                 dbname="",
                  host="localhost",
-                 port=1234, 
-                 user="", 
+                 port=1234,
+                 user="",
                  password="")
 data <- dbReadTable(con, c("column1","column2"))
 dbDisconnect(con)

a popular non-relational database is MongoDB

install.packages("rmongodb")
 library(rmongodb)
-con <- mongo.create(host = localhost, 
-                      name = "", 
-                      username = "", 
-                      password = "", 
+con <- mongo.create(host = "localhost",
+                      name = "",
+                      username = "",
+                      password = "",
                       db = "admin")
 if(mongo.is.connected(con) == TRUE) {
   data <- mongo.find.all(con, "collection", list("city" = list( "$exists" = "true")))
@@ -260,7 +260,7 @@ 

group-wise operations/plyr

mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
# Consider the case where we want to calculate descriptive statistics across admits and not-admits
# from the dataset and return them as a data.frame
-ddata <- ddply(mydata, c("admit"), summarize, 
+ddata <- ddply(mydata, c("admit"), summarize,
                gpa.over3 = length(gpa[gpa>=3]),
                gpa.over3.5 = length(gpa[gpa>=3.5]),
                gpa.over3per = length(gpa[gpa>=3])/length(gpa),

@@ -277,7 +277,7 @@

Group-wise Operations/plyr/functions

add a column containing the average gre score of students

-
mydata <- ddply(mydata, c("admit"), transform, 
+
mydata <- ddply(mydata, c("admit"), transform,
                 gre.ave=mean(x=gre, na.rm=T),
                 gre.sd = sd(x=gre, na.rm=T))
 head(mydata)