Commit
clean pandas exercises
fabridamicelli committed Oct 16, 2024
1 parent bbf749b commit 4a19064
Showing 7 changed files with 367 additions and 44 deletions.
2 changes: 1 addition & 1 deletion chapters/051_oop.ipynb
@@ -228,7 +228,7 @@
"metadata": {},
"source": [
"Where is the _self_ argument of the functions? \n",
- "The first argument of a method will be _always passed_ to the method in the call and it is the object itself! \n",
+ "The first argument of a method will be _always passed_ to the method in the call and it is the object itself (thus the convention to call it _self_). \n",
"That's a bit meta an a bit confusing, but don't worry, you'll get a feel of it by using the classes/objects."
]
},
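
The sentence changed in this hunk says the object itself is always passed as the first argument of a method. A minimal sketch of that behaviour follows, assuming a hypothetical `Car` class (the course's actual class definition is not part of this diff, so the class body and the `radio_on` attribute here are illustrative only):

```python
class Car:
    """Hypothetical stand-in for the course's car class (an assumption)."""

    def __init__(self):
        self.radio_on = False

    def turn_on_radio(self):
        # `self` is the object the method was called on.
        self.radio_on = True
        return "Radio is on!"


car1 = Car()

# Usual call: Python passes car1 as the first argument automatically.
via_instance = car1.turn_on_radio()

# Equivalent explicit call through the class, passing the object by hand.
via_class = Car.turn_on_radio(car1)

print(via_instance)
print(via_class)
```

Both calls run the same function on the same object, which is why the first parameter is conventionally named `self`.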
212 changes: 182 additions & 30 deletions chapters/081_pandas.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/051_oop.html
@@ -452,7 +452,7 @@ <h2 data-number="9.2" class="anchored" data-anchor-id="methods"><span class="hea
</div>
</div>
<p>Where is the <em>self</em> argument of the functions?<br>
- The first argument of a method will be <em>always passed</em> to the method in the call and it is the object itself!<br>
+ The first argument of a method will be <em>always passed</em> to the method in the call and it is the object itself (thus the convention to call it <em>self</em>).<br>
That’s a bit meta an a bit confusing, but don’t worry, you’ll get a feel of it by using the classes/objects.</p>
</section>
<section id="dunder-methods" class="level2" data-number="9.3">
177 changes: 169 additions & 8 deletions docs/081_pandas.html
@@ -20,10 +20,44 @@
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>


<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.5.1/jquery.min.js" integrity="sha512-bLT0Qm9VnAYZDflyKcBaQ2gg0hSYNQrJ8RilYldYQ1FxQYoCLtUjuuRuZo+fjqhx/qtq/1itJ0C2ejDxltZVFg==" crossorigin="anonymous"></script><script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
@@ -70,6 +104,9 @@
}</script>
<!-- plausible -->
<script defer="" data-domain="fabridamicelli.github.io/python-course" src="https://plausible.io/js/script.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" integrity="sha512-c3Nl8+7g4LMSTdrm621y7kf9v3SDPnhxLNhcjFJbKECVnmZHTdo+IRO05sNLTH/D3vA6u1X32ehoLC7WFVdheg==" crossorigin="anonymous"></script>

<script type="application/javascript">define('jquery', [],function() {return window.jQuery;})</script>


</head>
@@ -341,20 +378,144 @@ <h1 class="title"><span class="chapter-number">15</span>&nbsp; <span class="chap
<p>Here’s the introductory <a href="https://wesmckinney.com/book/pandas-basics">chapter of the book</a>, by the original author of the library pandas Wes McKinney. In particular, we’ll look at subtitles:</p>
<ul>
<li><a href="https://wesmckinney.com/book/pandas-basics#pandas_construction">5.1. Introduction to pandas Data Structures</a></li>
<li><a href="https://wesmckinney.com/book/pandas-basics#pandas_series">5.2 Essential Functionality</a></li>
<li><a href="https://wesmckinney.com/book/pandas-basics#pandas_series">5.2. Essential Functionality</a>
<ul>
<li>Indexing, selection and filtering</li>
<li>Function application and mapping</li>
<li>Sorting and ranking</li>
</ul></li>
<li><a href="https://wesmckinney.com/book/data-cleaning">7.1. Handling missing data</a></li>
<li><a href="https://wesmckinney.com/book/data-cleaning#prep_replace">7.2. Replacing values</a></li>
<li><a href="https://wesmckinney.com/book/data-cleaning#text_string_manip_vectorized">7.4. String functions in pandas</a></li>
<li><a href="https://wesmckinney.com/book/data-aggregation#groupby_fundamentals">10.1. How to think about group operations</a></li>
</ul>
<p>These are useful <a href="https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf">cheatsheets</a>.</p>
<p>Check out the pandas getting-started documentation <a href="https://pandas.pydata.org/docs/getting_started/index.html#getting-started">here</a></p>
<p>Here’s short introduction by the very author of the library:</p>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/_T8LGqJtuGc" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
<div id="7617b9b8-455a-44f0-9856-8b2e248d11a0" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> seaborn <span class="im">as</span> sns</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div id="91cfdd53-1b65-4b02-b0f8-5bd227b6da4e" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>df <span class="op">=</span> sns.load_dataset(<span class="st">"fmri"</span>)</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>df.head()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display" data-execution_count="2">
<div>
<div>


<table class="dataframe table table-sm table-striped small" data-quarto-postprocess="true" data-border="1">
<thead>
<tr class="header">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">subject</th>
<th data-quarto-table-cell-role="th">timepoint</th>
<th data-quarto-table-cell-role="th">event</th>
<th data-quarto-table-cell-role="th">region</th>
<th data-quarto-table-cell-role="th">signal</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td data-quarto-table-cell-role="th">0</td>
<td>s13</td>
<td>18</td>
<td>stim</td>
<td>parietal</td>
<td>-0.017552</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">1</td>
<td>s5</td>
<td>14</td>
<td>stim</td>
<td>parietal</td>
<td>-0.080883</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">2</td>
<td>s12</td>
<td>18</td>
<td>stim</td>
<td>parietal</td>
<td>-0.081033</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">3</td>
<td>s11</td>
<td>18</td>
<td>stim</td>
<td>parietal</td>
<td>-0.046134</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">4</td>
<td>s10</td>
<td>18</td>
<td>stim</td>
<td>parietal</td>
<td>-0.037970</td>
</tr>
</tbody>
</table>

</div>
</div>
</div>
</div>
<div id="97350317-64f0-48cf-b73d-b14e2364ce87" class="cell" data-execution_count="35">
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>sns.lineplot(df, x<span class="op">=</span><span class="st">"timepoint"</span>, y<span class="op">=</span><span class="st">"signal"</span>, marker<span class="op">=</span><span class="st">"."</span>, hue<span class="op">=</span><span class="st">"region"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="081_pandas_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<section id="exercises" class="level2" data-number="15.1">
<h2 data-number="15.1" class="anchored" data-anchor-id="exercises"><span class="header-section-number">15.1</span> Exercises</h2>
<p>Series from dict Missing values</p>
<p>Series from dict Display first 7 elements Display last 7 elements Read in data, assign column names Read in only X rows Compute mean of 1 column Add two columns (numbers) Add two columns (strings) Add a column to existing dataframe Add an indicator column to existing dataframe</p>
<p>Find an interesting dataset, download and explore it.</p>
<p>Discard rows with missing values Fill missing values Drop entire rows (by index) Drop entire cols Sort elements (index, values)</p>
<p>Summary metric on specific cols Substract mean of a col from other columns</p>
<p>Select a couple of columns of DF (list of str) Select with boolean mask Select with loc Select with iloc</p>
<ul>
<li>Series from dict</li>
<li>DF from dict</li>
<li>DF from matrix (columns)</li>
<li>read_csv</li>
<li>Discard Missing values</li>
<li>Fill Missing values</li>
<li>Display first 7 elements</li>
<li>Display last 7 elements</li>
<li>Read in data, assign column names</li>
<li>Read in only X rows</li>
<li>Compute mean of 1 column</li>
<li>Add two columns (numbers)</li>
<li>Add two columns (strings)</li>
<li>Add a column to existing dataframe</li>
<li>Add an indicator column to existing dataframe</li>
<li>Find an interesting dataset, download and explore it.<br>
</li>
<li>Discard rows with missing values</li>
<li>Fill missing values</li>
<li>Drop entire rows (by index)</li>
<li>Drop entire cols</li>
<li>Sort elements (index, values)</li>
<li>Sort 2 columns<br>
</li>
<li>Summary metric on specific cols</li>
<li>Substract mean of a col from other columns</li>
<li>Substract rolling mean of a col from other columns</li>
<li>Select a couple of columns of DF (list of str)</li>
<li>Select with boolean mask</li>
<li>Select with loc</li>
<li>Select with iloc</li>
<li>Sample 20% of the rows</li>
<li>pd.to_datetime</li>
<li>fmri:
<ul>
<li>unique brain regions</li>
<li>unique subjects</li>
<li>plot subject “s13” time series of “parietal” region</li>
</ul></li>
</ul>


</section>
14 changes: 12 additions & 2 deletions docs/search.json
@@ -434,7 +434,7 @@
"href": "051_oop.html#methods",
"title": "9  Object Oriented Programming",
"section": "9.2 Methods",
- "text": "9.2 Methods\nDifferent car models have different functionalities, like turning the radio on.\nIn python it’s the same, the functions attached to an object are called methods and we call them with this dot notation:\n\ncar1.turn_on_radio() # it's just a function, so we call it with ()\n\nRadio is on!\n\n\n\ncar1.turn_off_radio()\n\nRadio is off!\n\n\nWhere is the self argument of the functions?\nThe first argument of a method will be always passed to the method in the call and it is the object itself!\nThat’s a bit meta an a bit confusing, but don’t worry, you’ll get a feel of it by using the classes/objects.",
+ "text": "9.2 Methods\nDifferent car models have different functionalities, like turning the radio on.\nIn python it’s the same, the functions attached to an object are called methods and we call them with this dot notation:\n\ncar1.turn_on_radio() # it's just a function, so we call it with ()\n\nRadio is on!\n\n\n\ncar1.turn_off_radio()\n\nRadio is off!\n\n\nWhere is the self argument of the functions?\nThe first argument of a method will be always passed to the method in the call and it is the object itself (thus the convention to call it self).\nThat’s a bit meta an a bit confusing, but don’t worry, you’ll get a feel of it by using the classes/objects.",
"crumbs": [
"<span class='chapter-number'>9</span>  <span class='chapter-title'>Object Oriented Programming</span>"
]
@@ -634,7 +634,17 @@
"href": "081_pandas.html",
"title": "15  Pandas",
"section": "",
"text": "15.1 Exercises\nSeries from dict Missing values\nSeries from dict Display first 7 elements Display last 7 elements Read in data, assign column names Read in only X rows Compute mean of 1 column Add two columns (numbers) Add two columns (strings) Add a column to existing dataframe Add an indicator column to existing dataframe\nFind an interesting dataset, download and explore it.\nDiscard rows with missing values Fill missing values Drop entire rows (by index) Drop entire cols Sort elements (index, values)\nSummary metric on specific cols Substract mean of a col from other columns\nSelect a couple of columns of DF (list of str) Select with boolean mask Select with loc Select with iloc",
"text": "15.1 Exercises",
"crumbs": [
"<span class='chapter-number'>15</span>  <span class='chapter-title'>Pandas</span>"
]
},
{
"objectID": "081_pandas.html#exercises",
"href": "081_pandas.html#exercises",
"title": "15  Pandas",
"section": "",
"text": "Series from dict\nDF from dict\nDF from matrix (columns)\nread_csv\nDiscard Missing values\nFill Missing values\nDisplay first 7 elements\nDisplay last 7 elements\nRead in data, assign column names\nRead in only X rows\nCompute mean of 1 column\nAdd two columns (numbers)\nAdd two columns (strings)\nAdd a column to existing dataframe\nAdd an indicator column to existing dataframe\nFind an interesting dataset, download and explore it.\n\nDiscard rows with missing values\nFill missing values\nDrop entire rows (by index)\nDrop entire cols\nSort elements (index, values)\nSort 2 columns\n\nSummary metric on specific cols\nSubstract mean of a col from other columns\nSubstract rolling mean of a col from other columns\nSelect a couple of columns of DF (list of str)\nSelect with boolean mask\nSelect with loc\nSelect with iloc\nSample 20% of the rows\npd.to_datetime\nfmri:\n\nunique brain regions\nunique subjects\nplot subject “s13” time series of “parietal” region",
"crumbs": [
"<span class='chapter-number'>15</span>  <span class='chapter-title'>Pandas</span>"
]
4 changes: 2 additions & 2 deletions docs/sitemap.xml
@@ -38,7 +38,7 @@
</url>
<url>
<loc>https://fabridamicelli.github.io/python-course/051_oop.html</loc>
- <lastmod>2024-10-14T15:45:37.378Z</lastmod>
+ <lastmod>2024-10-16T12:53:06.986Z</lastmod>
</url>
<url>
<loc>https://fabridamicelli.github.io/python-course/06_imports.html</loc>
@@ -62,7 +62,7 @@
</url>
<url>
<loc>https://fabridamicelli.github.io/python-course/081_pandas.html</loc>
- <lastmod>2024-10-16T11:44:09.906Z</lastmod>
+ <lastmod>2024-10-16T15:51:10.495Z</lastmod>
</url>
<url>
<loc>https://fabridamicelli.github.io/python-course/09_plotting.html</loc>