A tutorial featuring a data engineering workflow and open-source tools and technologies.
The example datasets are openly available online; their metadata is recorded in the intake
catalog.
✅ Packaging framework added
✅ Conda environment added
✅ GitHub actions configured
✅ Pre-commit hooks configured for code linting/formatting
✅ Reading data from online sources using intake
✅ Sample pipeline built using Dagster
✅ Building Dashboard using holoviews + panel
✅ Exploratory data analysis (EDA) using mito
✅ Analysing source code complexity using Wily
✅ Web UI built on Flask
✅ Web UI re-done and expanded with FastHTML
✨ [WIP]: Deployment of FastHTML application
⚙️ Managed by GitHub Action: https://github.com/jgehrcke/github-repo-stats
⏳ Configured to run daily at 23:55:00 IST
📬 Check out the daily generated reports: PDF Report
🗳️ Supplementary details about the generated stats/reports are available here
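The schedule above is stated in IST, while GitHub Actions `schedule` triggers take cron expressions evaluated in UTC. The sketch below shows one way to derive the UTC cron expression for a daily local time. It assumes that "IST" means India Standard Time (UTC+05:30); the helper name `daily_cron_in_utc` is hypothetical and is not part of the repository.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "IST" means India Standard Time, a fixed UTC+05:30 offset.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def daily_cron_in_utc(hour: int, minute: int, tz: timezone = IST) -> str:
    """Return a daily cron expression in UTC for a local wall-clock time."""
    # The date is arbitrary: with a fixed offset only the time of day matters.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


if __name__ == "__main__":
    # 23:55 IST corresponds to 18:25 UTC.
    print(daily_cron_in_utc(23, 55))  # prints "25 18 * * *"
```

Cron has minute granularity, so the seconds part of 23:55:00 has no counterpart in the expression.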
- Global coral bleaching dataset: Additional Info
van Woesik, R., Burkepile, D. (2022) Bleaching and environmental data for global coral reef sites from 1980-2020. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 2) Version Date 2022-10-14 [if applicable, indicate subset used]. doi:10.26008/1912/bco-dmo.773466.2 [access date]
Terms of Use
This dataset is licensed under Creative Commons Attribution 4.0 (https://creativecommons.org/licenses/by/4.0/)
New pre-built images are currently disabled due to limited storage.
conda env create -f environment.yml
conda activate journey
pip install -e .
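After the editable install, a quick check from Python can confirm that the package is visible to the active environment. This is only a sketch: the import name `analytics_framework` is an assumption based on the directory name used in the commands in this README, and `is_importable` is a hypothetical helper.

```python
from importlib.util import find_spec


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current Python path."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the editable install exposes a top-level package with this name.
    name = "analytics_framework"
    status = "found" if is_importable(name) else "not found"
    print(f"{name}: {status}")
```

If the package is reported as not found, the conda environment is probably not activated or the install step did not complete.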
As the name suggests, pre-commit hooks format the code according to PEP standards before each commit. More details 🗒
pre-commit install
cd analytics_framework/pipeline
dagit -f process.py
cd analytics_framework/dashboard
python simple_app.py
NOTE:
The generated dashboard is exported to HTML and saved as stock_price_dashboard.html
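Since the export is a plain HTML file, it can be viewed without a running server. A minimal sketch for opening it from Python follows. The file name comes from the note above; its location (the current working directory) and the helper name `dashboard_uri` are assumptions.

```python
import webbrowser
from pathlib import Path

# File name taken from the note above; assumed to sit in the current directory.
DASHBOARD_FILE = Path("stock_price_dashboard.html")


def dashboard_uri(path: Path = DASHBOARD_FILE) -> str:
    """Return a file:// URI for the exported dashboard."""
    # resolve() makes the path absolute, which as_uri() requires.
    return path.resolve().as_uri()


if __name__ == "__main__":
    if DASHBOARD_FILE.is_file():
        # Open the exported file in the default browser.
        webbrowser.open(dashboard_uri())
    else:
        print(f"{DASHBOARD_FILE} not found; run simple_app.py first")
```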
Before running the Jupyter notebook doc/mito_exp.ipynb, run the command below in your terminal to enable the installer. It might take some time to run.
To explore further, visit trymito.io
python -m mitoinstaller install
# Instructions specific to FastHTML app
cd intake/web_ui_fasthtml
python app.py
Link: http://localhost:5001
INFO: Will watch for changes in these directories: ['../DataJourney/analytics_framework/intake/web_ui_fasthtml']
INFO: Uvicorn running on http://0.0.0.0:5001 (Press CTRL+C to quit)
INFO: Started reloader process [20071] using WatchFiles
INFO: Started server process [20075]
INFO: Waiting for application startup.
INFO: Application startup complete.
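Once the log reports that startup is complete, the app should answer at the link given above. The sketch below is one way to check this from a script using only the standard library. The helper name `is_server_up` is hypothetical, and it treats any HTTP response, including an error status, as a sign that the server is running.

```python
import urllib.error
import urllib.request

# URL taken from the "Link" line above.
APP_URL = "http://localhost:5001"


def is_server_up(url: str = APP_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just with a 4xx/5xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return False


if __name__ == "__main__":
    state = "reachable" if is_server_up() else "not reachable"
    print(f"{APP_URL} is {state}")
```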