nogibjj/arko_github_actions_matrix_build

Descriptive Statistics and Runtime Comparison using Polars and Pandas with a CI/CD Build Matrix

This project demonstrates how to perform statistical analysis with pandas and Polars, and compares the runtimes of the two approaches. For CI/CD, the tests run against several Python versions to check compatibility.

Project Function

  • A .ipynb notebook each for polars and pandas analysis
  • A .py script to calculate the runtimes of each of these notebooks
  • A lib folder with a helper.py script that hosts the helper functions.

Project Structure

  • src/: Contains the source code for the project.
  • tests/: Contains the unit tests for the project.
  • requirements.txt: Lists the Python dependencies.
  • Makefile: Defines common tasks like installing dependencies, running tests, linting, and running docker.
  • .devcontainer/: Contains Dockerfile and VS Code configuration.
  • .github/workflows/: Contains CI/CD workflows for GitHub.

Project Setup

1. Clone the Repository

Clone the repository to your local machine:

git clone https://github.com/nogibjj/arko_github_actions_matrix_build.git
cd arko_github_actions_matrix_build

2. Run notebooks (plots saved to plots subfolder)

.venv/bin/python rdu_weather_analytics_pandas.ipynb
.venv/bin/python rdu_weather_analytics_pandas.ipynb
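
An .ipynb file is JSON rather than Python source, so passing it straight to the interpreter may not execute its cells. The following is a minimal sketch of one alternative, assuming Jupyter and nbconvert are installed in the environment. The helper names `nbconvert_command` and `run_notebook` and the `_executed` output suffix are hypothetical and not part of this repository.

```python
import subprocess
from pathlib import Path
from typing import List


def nbconvert_command(notebook: str) -> List[str]:
    """Build a `jupyter nbconvert` command that executes a notebook.

    The executed copy is written next to the original under a hypothetical
    `<name>_executed` base name, so the source notebook is left untouched.
    """
    path = Path(notebook)
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",      # keep the output in notebook format
        "--execute",             # run every cell before writing the output
        str(path),
        "--output", f"{path.stem}_executed",
    ]


def run_notebook(notebook: str) -> None:
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(nbconvert_command(notebook), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_notebook(...) to actually execute it.
    print(" ".join(nbconvert_command("rdu_weather_analytics_pandas.ipynb")))
```

Calling `run_notebook("rdu_weather_analytics_pandas.ipynb")` would then execute the notebook named in the step above, provided the `jupyter` command is on the PATH.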

3. Run main.py script to see runtime results of both notebooks

As the results show, Polars tends to run faster than the equivalent pandas implementation.
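
A runtime comparison of this kind can be sketched with the standard library alone. The example below is an illustration rather than the project's actual main.py: `time_callable` and `compare_runtimes` are hypothetical helper names, and the two lambdas are placeholder workloads standing in for the pandas and Polars analyses.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_callable(func: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run `func` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def compare_runtimes(
    tasks: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Return the mean duration in seconds for each named task."""
    return {name: mean(time_callable(func, repeats)) for name, func in tasks.items()}


if __name__ == "__main__":
    # Placeholder workloads (assumptions); swap in the real analyses to compare them.
    results = compare_runtimes(
        {
            "task_a": lambda: sum(range(100_000)),
            "task_b": lambda: sorted(range(100_000), reverse=True),
        }
    )
    for name, seconds in results.items():
        print(f"{name}: {seconds:.6f} s")
```

Averaging over several repeats smooths out one-off effects such as cold caches or import time, which can otherwise dominate a single measurement.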

4. YouTube video link

https://youtu.be/TTSH6CPpNzY
