🎨 apply suggestions and fixed typos from Pasquale and Jakob
enryH committed Sep 20, 2024
1 parent 930cc63 commit a15df42
Showing 1 changed file with 13 additions and 12 deletions.
25 changes: 13 additions & 12 deletions python/best_practices.md
@@ -1,18 +1,19 @@
# Coding (best) practices for Data Science

> Author: Henry Webel
> Reviewers: Pasquale Colaianni, Jakob Berg Jespersen
Being asked to show some coding best practices for an internal retreat, I assembeled
some low hanging fruits in reach for everyone and some pratices I learned to appreciate.
Asked to present some coding best practices at an internal retreat, I assembled
some low-hanging fruit within everyone's reach and some practices I have learned to appreciate.

## Use a formater
## Use a formatter

When you write code, you should at least use some sort of formatter. `black` is common choice
When you write code, I encourage you to use a formatter. `black` is a common choice
as it formats code consistently to a user-defined line length. It can even
break too long strings into it's parts - leaving only long comments and docstrings to you
break overly long strings into parts, leaving only long comments and docstrings for you
to adapt by hand.

`black` or `autopep8` are also availbe next to `isort` for sorting imports in VSCode as
`black` and `autopep8` are also available as VSCode extensions, alongside `isort` for sorting imports,
so your files can be formatted every time you save them
([link](https://code.visualstudio.com/docs/python/formatting)).

@@ -31,13 +32,13 @@ profile = "black"

## Use a linter

Too long lines, unpassed arguments or mutable objects as default function parameters you can
identify using a linter like `flake8` or `ruff`. Tools like
A linter like `flake8` or `ruff` can identify overly long lines, missing arguments
or mutable objects used as default function parameters. Tools like
[`Pylint` in VSCode](https://code.visualstudio.com/docs/python/linting)
give you in-editor highlighting of code issues, with links to hints on how to fix them:
![screenshot with typehints](assets/lint_pylance_in_vscode.png)

Example: Using the linter you can for example if you did not pass an argument to a function
Using the linter you can, for example, find that you did not pass an argument to a function,
as was fixed in `run_umap` in this commit [18b675](https://github.com/Multiomics-Analytics-Group/acore/pull/2/commits/18b67516b25de30cf6fd4bb640422aa8e0642b08) (you will need to unfold the first file to see the full picture).
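
As a minimal sketch of the mutable-default issue mentioned above: the function names `append_bad` and `append_good` are hypothetical and only serve as illustration. Which rule flags the first function depends on the linter and its configuration (for example `B006` from the bugbear rules in `ruff`, or `dangerous-default-value` in `Pylint`).

```python
def append_bad(item, bucket=[]):
    """Hypothetical example: the default list is created once and shared between calls."""
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    """Hypothetical fix: create a fresh list on every call unless one is passed."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)  # appends to the same list object as the first call

print(first is second)  # the shared default leaks state between calls
print(append_good(1), append_good(2))  # each call gets its own list
```

The safe variant uses `None` as a sentinel, which is the usual way to avoid the shared default.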


@@ -56,7 +57,7 @@ through them from time to time and prioritize.
## Text based Notebook (percent format) with jupytext and papermill

[`jupytext`](https://jupytext.readthedocs.io/) is a lightweight tool to keep scripts either as notebooks (`.ipynb`) or as simpler text-based file formats, such as markdown files (`.md`), which can be easily rendered on GitHub, or Python files (`.py`), which can be executed in VSCode’s interactive shell and are better for version control. Some tools, e.g. `papermill`, still need `.ipynb` files to work. Therefore it is handy to keep different versions of a script in sync. Alternatively, one can use only Python files and render these as notebooks in e.g.
[jupyter lab](https://jupytext.readthedocs.io/en/latest/text-notebooks.html). Especially if the code is only kept for version control, but executed versions are keep in a project folder using a workflow environment (as `snakemake` or `nextflow`) this comes in handy.
[JupyterLab](https://jupytext.readthedocs.io/en/latest/text-notebooks.html). This is especially useful if only the code is kept under version control, while executed versions are kept in a project folder by a workflow manager (such as `snakemake` or `nextflow`).

You can see an example of the percent notebook in the [percent_notebooks](project:percent_notebooks.py) section.
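
For orientation, here is a minimal sketch of what such a percent-format file can look like. It is not the linked example; the variable `n_samples` and the function `squares` are hypothetical. Cells start with `# %%`, markdown cells are written as comments, and a cell tagged `parameters` is the one `papermill` uses to inject overriding values.

```python
# %% [markdown]
# # Example percent notebook
# Markdown cells are comments following a `# %% [markdown]` marker.

# %% tags=["parameters"]
# papermill adds a new cell with overriding values after this tagged cell
n_samples = 5  # hypothetical default parameter


# %%
def squares(n):
    """Return the squares of 0 .. n-1."""
    return [i**2 for i in range(n)]


# %%
result = squares(n_samples)
print(result)
```

Because the file is plain Python, it also runs as a normal script, and the `# %%` markers are ordinary comments to the interpreter.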

@@ -67,7 +68,7 @@ I showed how to sync a text based percent notebook and execute it using `papermi
jupytext --to ipynb -k - -o - example_nb.py | papermill - path/to/executed_example.ipynb
```

If you want to keep some formats in sync and only sync one of these and only push one type to git
If you want to keep some formats in sync, you can specify that and push only one type to git,
listing the types you want to keep only locally in e.g. a `.gitignore` file.
Each folder can have a `.jupytext.toml` file to specify the formats you want to keep in sync
in that folder, e.g.:
@@ -81,5 +82,5 @@ formats = "ipynb,py:percent"

Ghost text, chats and inline chats are great ways to get suggestions on the code you are
writing. You can apply for a free version as a [(PhD) student](https://github.com/education/students)
or [instructor](https://github.com/education/teachers). Currently alternatives wit a free-tier as [codium](https://codeium.com/) are also available.
or [instructor](https://github.com/education/teachers). Alternatives with a free tier, such as [Codeium](https://codeium.com/), are currently also available.
