diff --git a/python/best_practices.md b/python/best_practices.md
index baba20c..6b93689 100644
--- a/python/best_practices.md
+++ b/python/best_practices.md
@@ -1,18 +1,19 @@
 # Coding (best) practices for Data Science
 
 > Author: Henry Webel
+> Reviewers: Pasquale Colaianni, Jakob Berg Jespersen
 
-Being asked to show some coding best practices for an internal retreat, I assembeled
-some low hanging fruits in reach for everyone and some pratices I learned to appreciate.
+Being asked to show some coding best practices for an internal retreat, I assembled
+some low hanging fruits in reach for everyone and some practices I learned to appreciate.
 
-## Use a formater
+## Use a formatter
 
-When you write code, you should at least use some sort of formatter. `black` is common choice
+When you write code, I encourage using a formatter. `black` is a common choice
 as it allows you to format code in a user-defined linelength consistently. It even can
-break too long strings into it's parts - leaving only long comments and docstrings to you
+break too long strings into its parts - leaving only long comments and docstrings to you
 for adoption.
 
-`black` or `autopep8` are also availbe next to `isort` for sorting imports in VSCode as
+`black` or `autopep8` are also available next to `isort` for sorting imports in VSCode as
 extension, so your files are formatted everytime you save these
 ([link](https://code.visualstudio.com/docs/python/formatting)).
 
@@ -31,13 +32,13 @@ profile = "black"
 
 ## Use a linter
 
-Too long lines, unpassed arguments or mutable objects as default function parameters you can
-identify using a linter like `flake8` or `ruff`. Tools like
+Using a linter like `flake8` or `ruff` can identify too long lines, unpassed arguments
+or mutable objects as default function parameters. Tools like
 [`Pylint` in VSCode](https://code.visualstudio.com/docs/python/linting) allow you to get in editor
 highlighting of Code issues and links with hints on how to fix them
 
 ![screenshot with typehints](assets/lint_pylance_in_vscode.png)
-Example: Using the linter you can for example if you did not pass an argument to a function
+Using the linter you can for example find that you did not pass an argument to a function
 as was fixed in this commit
 [18b675](https://github.com/Multiomics-Analytics-Group/acore/pull/2/commits/18b67516b25de30cf6fd4bb640422aa8e0642b08)
 in `run_umap` (you will need to unfold the first file to see the full picture).
@@ -56,7 +57,7 @@ through them from time to time and prioritze.
 ## Text based Notebook (percent format) with jupytext and papermill
 
 [`jupytext`](https://jupytext.readthedocs.io/) is a lightweight tool to keep scripts either as notebooks (`.ipynb`) or simpler text based file formats, such as markdown files (`.md`) which can be easily rendered on GitHub or python files (`.py`) which can be executed in VSCode’s interactive shell and are better for version control. Some tools still need ipynb to work, e.g. `papermill`. Therefore it is handy to keep different version of a script in sync. Otherwise one can also only use python files and render these as notebook in e.g.
-[jupyter lab](https://jupytext.readthedocs.io/en/latest/text-notebooks.html). Especially if the code is only kept for version control, but executed versions are keep in a project folder using a workflow environment (as `snakemake` or `nextflow`) this comes in handy.
+[jupyter lab](https://jupytext.readthedocs.io/en/latest/text-notebooks.html). This is useful especially if the code is only kept for version control, but executed versions are kept in a project folder using a workflow environment (as `snakemake` or `nextflow`).
 
 You can see an example of the percent notebook in the [percent_notebooks](project:percent_notebooks.py) section.
 
@@ -67,7 +68,7 @@ I showed how to sync a text based percent notebook and execute it using `papermi
 jupytext --to ipynb -k - -o - example_nb.py | papermill - path/to/executed_example.ipynb
 ```
 
-If you want to keep some formats in sync and only sync one of these and only push one type to git
+If you want to keep some formats in sync, you can specify that and only push one type to git
 - specifying e.g. a `.gitignore` the types you want to only have locally. Each folder can have
 a `.jupytext.toml` file to specify the formats you want to keep in sync in that folder e.g.:
 
@@ -81,5 +82,5 @@ formats = "ipynb,py:percent"
 
 Ghosttext, chats and inline chats are great ways to get suggestions on the code you are
 writing. You can apply for a free version as a [(PhD) student](https://github.com/education/students)
-or [instructor](https://github.com/education/teachers). Currently alternatives wit a free-tier as [codium](https://codeium.com/) are also available.
+or [instructor](https://github.com/education/teachers). Currently alternatives with a free-tier as [codium](https://codeium.com/) are also available.
 