PyCon 2021 Tutorial Video [YouTube] - May 12, 2021
Other Presentations:
STL Python Presentation [YouTube]
Chicago Python Users Group [YouTube]
Details
A spreadsheet is a wonderful invention and an excellent tool for certain jobs. All too often, however, spreadsheets are called upon to perform tasks that are beyond their capabilities. It’s like the old saying, 'If the only tool you have is a hammer, every problem looks like a nail.' However, some problems are better addressed with a screwdriver, with glue, or with a Swiss Army Knife.
Python is described by some in the programming world as the Swiss Army Knife of programming languages because of its unrivaled versatility and flexibility in use. This allows its users to solve complex problems relatively easily compared with other programming languages and is one of the reasons why Python has become increasingly popular over time.
In this tutorial, we’ll briefly discuss spreadsheets, signs that you might be living in “Excel Hell”, and then we’ll spend the rest of the time learning how to escape it using Python.
In the first section, we’ll build on what spreadsheet users already know about cells, rows, columns, and formulas, and map them to their Python equivalents, such as variables, lists, dictionaries, and functions. At the end of this section, we’ll do an interactive exercise and learn how to perform a simple calculation, similar to one you might do in Excel, but using Python instead.
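As a rough illustration of that mapping, here is a minimal sketch in plain Python. The order data, the variable names, and the `line_total` function are made up for this example and are not taken from the tutorial notebooks.

```python
# A single cell holding a value maps to a variable.
unit_price = 4.50

# A column of values maps to a list.
quantities = [10, 3, 7]

# A row with labeled columns maps to a dictionary.
order = {"item": "widget", "quantity": 10, "unit_price": 4.50}


# A formula such as =B2*C2 maps to a function.
def line_total(quantity, unit_price):
    return quantity * unit_price


# =SUM(B2:B4) over the column becomes sum() over the list.
total_units = sum(quantities)

# Apply the "formula" to the values stored in one row.
order_total = line_total(order["quantity"], order["unit_price"])

print(total_units, order_total)
```

Unlike a formula copied down a column, the function is defined once and reused, so a fix in one place applies everywhere it is called.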
In the second section, we’ll discuss (and attempt) more complex tasks, including web scraping, data processing, analysis, and visualization, using a few popular third-party libraries such as Requests, Pandas, Flask, and Matplotlib.
In the last section, we’ll round out our discussion with a few important concepts in data management, including the concept of tidy data, building a data pipeline, and a few strategies (and packages) to use when approaching various data problems, including a demo using Apache Airflow.
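To make the idea of tidy data concrete, here is a small sketch using pandas. The store names and sales figures are invented for illustration. It assumes a typical "wide" spreadsheet layout with one column per month and reshapes it so that each row holds a single observation.

```python
import pandas as pd

# Hypothetical "wide" sheet: one row per store, one column per month.
wide = pd.DataFrame(
    {
        "store": ["North", "South"],
        "Jan": [100, 80],
        "Feb": [120, 90],
    }
)

# Tidy form: one row per observation (store, month, sales).
tidy = wide.melt(id_vars="store", var_name="month", value_name="sales")

# With tidy data, a group-wise summary is a single expression.
sales_by_store = tidy.groupby("store")["sales"].sum()

print(tidy)
print(sales_by_store)
```

Adding a new month to the tidy table means adding rows rather than a new column, so the summary code does not need to change.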
Python Libraries & Resources [Slides]
01 basics_but_important_stuff.ipynb
02 files_lists_dictionaries.ipynb
02-real-world-example-refactored.py
- 01-basics - examples used in presentation
- 02-webscraping - program that clicks through a calendar (written in JavaScript) and exports CSV files
- 02-selenium-safari - program that logs in to a website, scrapes HTML from a JavaScript-generated page, cleans the HTML, and exports to PDF files
- 02-webscrape-celery - example that uses a message queue and Celery to download a list of URLs
- 04-other-analysis - examples of different quantitative notebooks
- 05-other-visualizations - examples of different data visualization tools
- 06-flask - different flask examples
- 07-airflow - example that uses airflow to download and store stock prices
- Anaconda = manages your Python environments
- PyCharm = code editor
Install Git - Allows you to git clone/download GitHub projects
$ git clone https://github.com/ryansmccoy/spreadsheets-to-dataframes.git
$ cd spreadsheets-to-dataframes
$ conda create -n spreadsheets-to-dataframes python=3.8 pandas scipy numpy lxml jupyter matplotlib -y
$ activate spreadsheets-to-dataframes
$ pip install -r requirements_dev.txt
$ git clone https://github.com/ryansmccoy/spreadsheets-to-dataframes.git
$ cd spreadsheets-to-dataframes
$ conda create -n spreadsheets-to-dataframes python=3.8 pandas scipy numpy lxml jupyter matplotlib -y
$ source activate spreadsheets-to-dataframes
$ pip install -r requirements_dev.txt
Navigate to the spreadsheets-to-dataframes directory/folder:
$ activate spreadsheets-to-dataframes
$ jupyter notebook
https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html
(Book) Python Crash Course, 2nd Edition
(Book) Introducing Python: Modern Computing in Simple Packages
(Book) Learning Python, 5th Edition
(Book) Automate the Boring Stuff with Python, 2nd Edition: Practical Programming for Total Beginners
(Book) Think Python: How to Think Like a Computer Scientist
(Book) The Quick Python Book
(Book) Serious Python: Black-Belt Advice on Deployment, Scalability, Testing, and More
(Github) A Whirlwind Tour of Python
(Github) Python Data Science Handbook
(Github) Introduction to Python
$ pip install cookiecutter
Resources:
https://github.com/cookiecutter/cookiecutter
https://github.com/audreyfeldroy/cookiecutter-pypackage
https://towardsdatascience.com/cookiecutter-creating-custom-reusable-project-templates-fc85c8627b07
$ pip install requests
Resources:
https://python.readthedocs.io/en/stable/library/stdtypes.html
https://realpython.com/python-requests/
Have you mastered Requests? Then you should check out multithreading, concurrency, asyncio, message queues, parallelism.
https://yasoob.me/2019/05/29/speedingw-up-python-code-using-multithreading/
https://www.toptal.com/python/beginners-guide-to-concurrency-and-parallelism-in-python
https://creativedata.stream/multi-threading-api-requests-in-python/
https://tests4geeks.com/blog/python-celery-rabbitmq-tutorial/
https://codeburst.io/automated-web-scraping-with-python-and-celery-ac02a4a9ce51
https://github.com/ryansmccoy/zmq-high-speed-subs
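As a first step toward the multithreading topics above, the following sketch uses the standard library's `concurrent.futures.ThreadPoolExecutor`. To keep it runnable offline, `fetch` is a hypothetical stand-in that sleeps instead of calling `requests.get`, and the URLs are placeholders that are never contacted.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Placeholder URLs, used only as labels here; nothing is downloaded.
urls = [f"https://example.com/page/{n}" for n in range(5)]


def fetch(url):
    """Stand-in for requests.get(url): sleep to simulate network wait."""
    time.sleep(0.1)
    return url, len(url)


# Run up to 5 fetches concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fetch, urls))

print(results)
```

Threads suit this kind of work because the time is spent waiting on the network rather than computing. In a real script, the body of `fetch` would make the HTTP request and return the response content.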
$ pip install pandas
Resources:
brandon-rhodes/pycon-pandas-tutorial
Python pandas Q&A video series
Master Data Analysis with Python
Have you mastered Pandas? Then you should check out Dask and Spark.
https://spark.apache.org/docs/latest/api/python/
$ pip install matplotlib
Resources:
https://github.com/fasouto/awesome-dataviz
https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html
https://www.toptal.com/designers/data-visualization/data-visualization-tools
https://realpython.com/pandas-plot-python/
Have you mastered Matplotlib? Then you should check out JavaScript, D3, React, and Tableau.
$ pip install flask
Resources:
https://www.fullstackpython.com/flask.html
https://blog.miguelgrinberg.com/
Have you mastered Flask? Then you should check out FastAPI, JavaScript, Node, and React.