From fafd9641ffaebca18925e46d7ad25cb110cf4413 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Edouard=20Choini=C3=A8re?= <27212526+echoix@users.noreply.github.com>
Date: Wed, 23 Oct 2024 11:36:50 -0400
Subject: [PATCH] Apply suggestions from code review

Co-authored-by: Anna Petrasova
---
 .flake8                                      |  1 -
 doc/notebooks/parallelization_tutorial.ipynb | 39 --------------------
 2 files changed, 40 deletions(-)

diff --git a/.flake8 b/.flake8
index b8c498bfcb7..ead2478803a 100644
--- a/.flake8
+++ b/.flake8
@@ -86,7 +86,6 @@ per-file-ignores =
     # Files not managed by Black
     python/grass/imaging/images2gif.py: E226
     # Unused imports in init files
-    # F401 imported but unused
     # F403 star import used; unable to detect undefined names
     python/grass/*/__init__.py: F401, F403
     python/grass/*/*/__init__.py: F403
diff --git a/doc/notebooks/parallelization_tutorial.ipynb b/doc/notebooks/parallelization_tutorial.ipynb
index feece7d2823..4611f5d1c24 100644
--- a/doc/notebooks/parallelization_tutorial.ipynb
+++ b/doc/notebooks/parallelization_tutorial.ipynb
@@ -2,7 +2,6 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "7fb27b941602401d91542211134fc71a",
    "metadata": {},
    "source": [
     "# Introduction to Parallelization in GRASS GIS\n",
@@ -11,7 +10,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "acae54e37e7d407bbb7b55eff062a284",
    "metadata": {},
    "source": [
     "Let's start GRASS to run examples:"
@@ -20,7 +18,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "9a63283cbaf04dbcab1f6479b197f3a8",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -45,7 +42,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "8dd0d8092fe74a7c96281538738b07e2",
    "metadata": {},
    "source": [
     "Note: most examples assume we are already in an active GRASS session."
@@ -53,7 +49,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "72eea5119410473aa328ad9291626812",
    "metadata": {
     "tags": []
    },
@@ -80,7 +75,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "8edb47106e1a46a883d545849b8ab81b",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -91,7 +85,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "10185d26023b46108eb7d9f57d49d2b3",
    "metadata": {},
    "source": [
     "The speedup (processing time with 1 core / processing time with N cores) typically does not increase linearly with the number of cores and parallel efficiency (speedup / N cores) decreases when adding cores. See, e.g., [benchmarks for r.neighbors](https://grass.osgeo.org/grass-stable/manuals/r.neighbors.html#performance). This behavior is due to the serial parts of the code (see [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law)) and computation overhead. "
@@ -99,7 +92,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "8763a12b2bbd4a93a75aff182afb95dc",
    "metadata": {},
    "source": [
     "## Parallelization of workflows\n",
@@ -112,7 +104,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7623eae2785240b9bd12b16a66d81610",
    "metadata": {},
    "source": [
     "### Data-based parallelization\n",
@@ -122,7 +113,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7cdc8c89c7104fffa095e18ddfef8986",
    "metadata": {},
    "source": [
     "The following example shows IDW interpolation split into 4 tiles. In this case, specifying an overlap is needed to get correct results without edge artifacts. Here, the number and size of tiles is automatically derived from the number of cores, but can be specified."
@@ -131,7 +121,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "b118ea5561624da68c537baed56e602f",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -142,7 +131,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "938c804e27f84196a10c8828c723f798",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -168,7 +156,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "504fb2a444614c0babb325280ed9130a",
    "metadata": {},
    "source": [
     "The following is the same tool ran in serial:"
@@ -177,7 +164,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "59bbdb311c014d738909a11f9e486628",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -187,7 +173,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "b43b363d81ae4b689946ece5c682cd59",
    "metadata": {},
    "source": [
     "There are tools that already integrate tiling. For example, addon [r.mapcalc.tiled](https://grass.osgeo.org/grass-stable/manuals/addons/r.mapcalc.tiled.html) uses the tiling concept for raster algebra computation. More complex algebra expression will increase the speedup of this method."
@@ -196,7 +181,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "8a65eabff63a45729fe45fb5ade58bdc",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -211,7 +195,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "c3933fab20d04ec698c2621248eb3be0",
    "metadata": {},
    "source": [
     "### Task-based parallelization\n",
@@ -221,7 +204,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4dd4641cc4064e0191573fe9c69df29b",
    "metadata": {},
    "source": [
     "#### Examples in Python\n",
@@ -230,7 +212,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "8309879909854d7188b41380fd92a7c3",
    "metadata": {},
    "source": [
     "In the following example viewsheds from different coordinates are computed in parallel using `multiprocessing.Pool` class. To avoid issues when using multiprocessing from Jupyter Notebook (multiprocessing.Pool does not work with interactive interpreters), we will first write a Python script with main function and then execute it."
@@ -239,7 +220,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "3ed186c9a28b402fb0bc4494df01f08d",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -266,7 +246,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "cb1e1581032b452c9409d6c6813c49d1",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -277,7 +256,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "379cbbc1e968416e875cc15c1202d7eb",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -286,7 +264,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "277c27b1587741f2af2001be3712ef0d",
    "metadata": {},
    "source": [
     "#### Examples in Bash\n",
@@ -296,7 +273,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "db7b79bc585a40fcaf58bf750017e135",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -309,7 +285,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "916684f9a58a4a2aa5f864670399430d",
    "metadata": {
     "tags": []
    },
@@ -322,7 +297,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "1671c31a24314836a5b85d7ef7fbf015",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -336,7 +310,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "33b0902fd34d4ace834912fa1002cf8e",
    "metadata": {},
    "source": [
     "See manual pages of GNU Parallel or xargs for more advanced uses. GNU Parallel can be configured to distribute jobs across multiple machines. In that case, use `--exec` interface described below."
@@ -344,7 +317,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "f6fa52606d8c4a75a9b52967216f8f3f",
    "metadata": {},
    "source": [
     "### Safe execution of parallel tasks"
@@ -352,7 +324,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "f5a1fa73e5044315a093ec459c9be902",
    "metadata": {},
    "source": [
     "While you can execute tasks in parallel within a single mapset, it is *not safe* when your tasks:\n",
@@ -368,7 +339,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "cdf66aed5cc84ca1b48e60bad68798a8",
    "metadata": {},
    "source": [
     "#### Executing processes in separate mapsets\n",
@@ -382,7 +352,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "28d3efd5258a48a79c179ea5c6759f01",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -392,7 +361,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "3f9bc0b9dd2c44919cc8dcca39b469f8",
    "metadata": {},
    "source": [
     "One of the previous examples that was running within GRASS session in a single mapset can be rewritten so that each task runs in a newly created mapset. Note that by default newly created mapsets use default computational region for that GRASS location (you can use `g.region -s` to modify it). For raster computations, you need to change the computational region for each new mapset if the default one is not desired."
@@ -401,7 +369,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "0e382214b5f147d187d36a2058b9c724",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -418,7 +385,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "5b09d5ef5b5e4bb6ab9b829b10b6a29f",
    "metadata": {},
    "source": [
     "In some cases, only a temporary mapset or location is needed, see [examples](https://grass.osgeo.org/grass-stable/manuals/grass.html#batch-jobs-with-the-exec-interface).\n",
@@ -427,7 +393,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a50416e276a0479cbe66534ed1713a40",
    "metadata": {},
    "source": [
     "#### Safely modifying computational region in a single mapset\n",
@@ -440,7 +405,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "46a27a456b804aa2a380d5edf15a5daf",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -470,7 +434,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "1944c39560714e6e80c856f20744a8e5",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -479,7 +442,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "d6ca27006b894b04b6fc8b79396e2797",
    "metadata": {},
    "source": [
     "#### Safely modifying vectors with attributes in a single mapset"
@@ -487,7 +449,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "f61877af4e7f4313ad8234302950b331",
    "metadata": {},
    "source": [
     "By default vector maps share a single SQLite database file, however SQLite does not support concurrent write access. That poses a problem when modifying vectors with attributes in parallel. While this can be solved by running the computations in separate mapsets, it is also possible to change the default behavior to write attributes of each vector to the vector's individual SQLite file. This behavior can be activated after a new mapset is created with:\n",