If you're taking CS 41, we still have a few weeks left in the course, which will cover the Python Standard Library and Third-Party Libraries in more detail and culminate in the final project presentations (get hype!! 🥳🚀). As far as course notes are concerned, though, we wanted to leave you with a tutorial on how to teach yourself things within the Python ecosystem.
Here, we'll walk through reading documentation from the Python standard library because most third-party documentation has a similar format.
This is the link to the Python 3 standard library table of contents: https://docs.python.org/3/library/index.html. We've reproduced some of the table of contents below to walk through how it's structured:
At the top level of the above list, you can find categories like "Data Persistence." Within those categories are Python libraries about that topic. For example, under "Data Persistence" is the library pickle, which allows you to store many Python objects on your computer's hard drive.
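As a rough illustration of what "storing a Python object on disk" looks like, here is a minimal sketch using pickle.dump and pickle.load. The dictionary contents and the file name scores.pkl are made up for the example, and the file is written to a temporary directory so nothing is left behind.

```python
import pickle
import tempfile
from pathlib import Path

# A plain Python object made of built-in types (example data, not from the course).
scores = {"alice": [90, 85], "bob": [72, 88]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "scores.pkl"  # hypothetical file name

    # Pickle data is binary, so the file is opened in "wb" / "rb" mode.
    with open(path, "wb") as f:
        pickle.dump(scores, f)

    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == scores)  # True
```

The pickle documentation warns that unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.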
Let's take a closer look at the pickle module, whose documentation is hosted at https://docs.python.org/3/library/pickle.html.
If you're reading about this module for the first time, you should start with the introduction, which usually describes the module at a high level. In this case, the introduction is:
The pickle module implements binary protocols for serializing and de-serializing a Python object structure. "Pickling" is the process whereby a Python object hierarchy is converted into a byte stream, and "unpickling" is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as "serialization", "marshalling," or "flattening"; however, to avoid confusion, the terms used here are "pickling" and "unpickling".
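To make the terms in that introduction concrete, the sketch below pickles an object into a byte stream and unpickles it again, entirely in memory. The dictionary is invented for the example; pickle.dumps and pickle.loads are the module's own functions.

```python
import pickle

# Example object hierarchy: a dict containing a string and a list.
course = {"name": "CS 41", "topics": ["functions", "classes"]}

data = pickle.dumps(course)   # "pickling": object hierarchy -> byte stream
print(type(data))             # <class 'bytes'>

clone = pickle.loads(data)    # "unpickling": byte stream -> object hierarchy
print(clone == course)        # True: same contents
print(clone is course)        # False: a new, independent object
```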
Then, you should glance at the table of contents for this module, which lives in the left-hand collapsible bar. In this case, it looks like this:
Many of these sections are unique to pickle, which is a fairly sophisticated library. Let's jump to the "Module Interface" section, which describes the exports of this module in the standard format.
This section (and most Python documentation) is ordered by indentation. The module exports are aligned to the left of the page, and indentation levels are used to nest descriptions. For example, this is the documentation for pickle.dumps:
pickle.dumps(obj, protocol=None, *, fix_imports=True, buffer_callback=None)
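The name of the function is depicted in bold, and the function signature is reproduced in detail to show which arguments are required/optional and positional/keyword. In the above example, obj is the only required argument; the other parameters have default values, and the ones after the bare * (fix_imports and buffer_callback) can only be passed by keyword.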
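One way to check your reading of a signature is to try the different call styles it allows. The sketch below does that for pickle.dumps; the list being pickled and the choice of protocol 4 are arbitrary, picked only for illustration.

```python
import pickle

obj = [1, 2, 3]

# obj is the only required argument.
default_call = pickle.dumps(obj)

# protocol comes before the bare *, so it works positionally or by keyword.
by_position = pickle.dumps(obj, 4)
by_keyword = pickle.dumps(obj, protocol=4)

# fix_imports comes after the bare *, so it must be passed by keyword.
with_flag = pickle.dumps(obj, fix_imports=False)

# Passing it positionally is rejected with a TypeError.
try:
    pickle.dumps(obj, 4, False)
except TypeError as err:
    keyword_only_error = err
    print("TypeError:", err)
```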
Then, there's an indented description of the function which describes what the function does. This description will refer to parameters in italics and should explicitly say when the function returns something and what the type of that object will be.
Finally, for objects that have their own attributes (like classes and instances), the documentation will typically display these with additional levels of indentation. As a template:
library.ClassName(...parameters...)
    A description of the class and parameters, at a high level.
    method_name(self, ...parameters...)
        A description of the method and its parameters.
If a class has an attribute that has its own attributes, the indentation can continue further.
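The pickle documentation has a real instance of this nesting: the class pickle.Pickler is documented at the top level, with its dump method indented beneath it (and likewise pickle.Unpickler with load). The sketch below maps that layout onto code; the in-memory buffer and the dictionary are choices made for this example only.

```python
import io
import pickle

buffer = io.BytesIO()  # an in-memory binary file, standing in for a real file

# Top level of the docs entry: the class and its constructor parameters.
pickler = pickle.Pickler(buffer, protocol=pickle.HIGHEST_PROTOCOL)

# One indentation level down in the docs: a method of that class.
pickler.dump({"topic": "documentation", "done": True})

# Rewind the buffer and read the object back with the matching class.
buffer.seek(0)
restored = pickle.Unpickler(buffer).load()
print(restored)
```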
Below, we've compiled a list of third-party packages that we've found useful, along with a brief description of each and a link to its documentation. Hopefully you'll find these packages helpful in your Python projects!
numpy (documentation) - numpy provides a series of numerical computing tools. It provides an n-dimensional array object, as well as (fast) linear-algebraic operations on numpy arrays, statistical operations, and random simulation.
scipy (documentation) - scipy also provides numerical computing tools; specifically, it contains function implementations for numerical integration, interpolation, optimization, linear algebra, and statistics.
matplotlib (documentation) - matplotlib provides tools for the easy creation of data plots in Python.
tensorflow (documentation) - tensorflow is Google's open-source machine learning package. It contains tools to easily create, train, and test machine learning models.
pytorch (documentation) - pytorch is another open-source machine learning package, primarily designed for deep learning. It's similar to tensorflow in that it provides tools to create, train, and test machine learning models.
scikit-learn (documentation) - scikit-learn is (yet another) machine learning package. It provides out-of-the-box implementations of classical machine learning models (KNN, SVM, random forest, etc.) as well as a multi-layer perceptron for regression and classification tasks.
keras (documentation) - keras is a deep learning package built on top of tensorflow that makes it easier to design and train deep learning models.
nltk (documentation) - nltk is a natural language processing package, which provides access to lexicons and corpora, as well as libraries for classification, tokenization, parsing, and other tasks.
cvxpy (documentation) - cvxpy is the Boyd Lab's convex optimization package, which implements standard optimization algorithms to automatically solve convex problems.
pandas (documentation) - pandas is a data manipulation library. Through its DataFrame class, it allows for the easy processing, reading/writing, and transformation of various types of data.
django (documentation) - django is an industrial-strength framework in Python for building web applications.
beautifulsoup (documentation) - BeautifulSoup is an HTML-parsing library in Python for web scraping.
pyca/cryptography (documentation) - pyca/cryptography is a package which provides an interface to cryptographic function implementations, such as symmetric ciphers, message digests, and key derivation functions.
pygame (documentation) - pygame is a collection of modules which enable developers to write video games in Python.
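Before opening a package's documentation, it can help to know whether the package is already installed. The sketch below is one possible way to check, using importlib.util.find_spec from the standard library. The helper installed_packages and the IMPORT_NAMES table are names invented for this example. Note that a few packages are imported under a different name than the one in the list above (for instance, scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# Package name from the list above -> the name you actually import.
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "tensorflow": "tensorflow",
    "pytorch": "torch",
    "scikit-learn": "sklearn",
    "keras": "keras",
    "nltk": "nltk",
    "cvxpy": "cvxpy",
    "pandas": "pandas",
    "django": "django",
    "beautifulsoup": "bs4",
    "pyca/cryptography": "cryptography",
    "pygame": "pygame",
}


def installed_packages(import_names):
    """Hypothetical helper: map each label to True if its module can be found."""
    # find_spec locates a top-level module without importing it.
    return {
        label: find_spec(module) is not None
        for label, module in import_names.items()
    }


for label, found in installed_packages(IMPORT_NAMES).items():
    print(f"{label:20} {'installed' if found else 'not installed'}")
```

What this prints depends on your environment; missing packages can usually be installed with pip.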
With love, 🦄s, and 🐘s by the CS41 Staff