diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml index a88504ae..4cd83be1 100644 --- a/.github/workflows/lint.yaml +++ b/.github/workflows/lint.yaml @@ -1,12 +1,14 @@ -name: Check Markdown and Front Matter +name: Lint markdown files on: push: + branches: + - main paths: - - '**/*.md' + - "**/*.md" pull_request: paths: - - '**/*.md' + - "**/*.md" jobs: check-markdown-yaml: @@ -19,4 +21,35 @@ jobs: - name: Run Front Matter Linting uses: alasdairwilson/front-matter-lint@main with: - directory: '.' \ No newline at end of file + directory: "." + + lint-python-codeblocks: + runs-on: ubuntu-latest + + steps: + - name: Checkout code + uses: actions/checkout@v3 + + - name: Lint Python Code Blocks in Markdown + uses: OxfordRSE/lint-md-codeblocks@main + with: + directory: "." + language: "python" + + lint-markdown: + runs-on: ubuntu-latest + + steps: + - name: Checkout repository + uses: actions/checkout@v3 + + - name: Set up Node.js + uses: actions/setup-node@v3 + with: + node-version: "20" + + - name: Install markdownlint-cli + run: npm install -g markdownlint-cli + + - name: Lint Markdown files + run: markdownlint '**/*.md' --ignore '*/*/slides/*' --ignore README.md diff --git a/.gitignore b/.gitignore index a68f4ed9..802a04d1 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,5 @@ .idea/ .DS_Store -*.code-workspace \ No newline at end of file +*.code-workspace +.vscode diff --git a/.markdownlint.yaml b/.markdownlint.yaml new file mode 100644 index 00000000..dfaa32f9 --- /dev/null +++ b/.markdownlint.yaml @@ -0,0 +1,7 @@ +default: true +MD041: false # Ignore first line heading warnings (because the title is set in the front matter) +MD013: false # Ignore line length warnings +MD033: + allowed_elements: ["kbd"] +MD024: + siblings_only: true \ No newline at end of file diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index e9fabee4..78efe956 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,6 +1,7 @@ # Contributing guidelines To contribute new material,
or update existing material please: + 1. Create an issue on this repo with a description of the proposed change 2. Fork the repo and create a new branch. Add commits to your branch in your own fork with the changes. Please provide the issue number in each commit message, e.g. a commit message for issue number 5 might be "#5 added version control material" 3. When you are ready, open a pull request to merge your new commits to the `main` branch of this repo. @@ -13,7 +14,7 @@ It is useful to see how your change is rendered into a webpage using the [Gutenb The file structure in this repo defines the structure of the generated material, there are three levels of subdirectories, each with their own `index.md` file, which contains metadata: -``` +```text - index.md - [theme.id] - index.md @@ -34,9 +35,10 @@ Each folder has an `index.md` with metadata for that theme/course/section. ### Material metadata The top level `index.md` has a yaml block containing the keys: - - `id`, a string with a unique id for this material - - `name`, a string with the material title - - `themes`, an array of folder names of the themes within this material + +- `id`, a string with a unique id for this material +- `name`, a string with the material title +- `themes`, an array of folder names of the themes within this material The theme names correspond to subfolders in this repo. The rest of the content of this file is markdown formatted content with a top-level description of the material. @@ -44,9 +46,9 @@ The theme names correspond to subfolders in this repo. 
The rest of the content o The theme level `index.md` has a yaml block containing the keys: - - `id`, a string with a unique id (unique within this material) for this theme - - `name`, a string with the theme title - - `courses`, an array of folder names of the courses within this material +- `id`, a string with a unique id (unique within this material) for this theme +- `name`, a string with the theme title +- `courses`, an array of folder names of the courses within this theme The course names correspond to subfolders in this theme folder. The rest of the content of this file is markdown formatted content with a top-level description of the theme. @@ -54,9 +56,9 @@ The course names correspond to subfolders in this theme folder. The rest of the The course level `index.md` has a yaml block containing the keys: - - `id`, a string with a unique id (unique within this theme) for this theme - - `name`, a string with the theme title - - `files`, an array of filenames of the sections within this material +- `id`, a string with a unique id (unique within this theme) for this course +- `name`, a string with the course title +- `files`, an array of filenames of the sections within this course The file names correspond to markdown files in this course folder. The rest of the content of this file is markdown formatted content with a top-level description of the course. @@ -64,8 +66,8 @@ The file names correspond to markdown files in this course folder. The rest of t Each section markdown file has a yaml block containing the keys. Note that the `id` of each section is implicitly defined from the filename, so a section filename `array.md` would have an id `array`.
- - `name`, a string with the section title - - `dependsOn`, an array of identifiers indicating the pre-requisite sections/courses/themes for this section +- `name`, a string with the section title +- `dependsOn`, an array of identifiers indicating the pre-requisite sections/courses/themes for this section Each entry in `dependsOn` indicates a course dependency using `.` or a section dependency using @@ -115,7 +117,6 @@ The id must be unique within this particular section, and the title is any strin The solution directive produces a section that is initially hidden, but which a user can click to display. It can be written using the following syntax: - ```pandoc :::solution @@ -126,7 +127,6 @@ The answer is 42. Note that solutions can be nested within challenges by matching the number of colons: - ```pandoc ::::challenge{id=big_question title="Hitchhikers question"} @@ -141,10 +141,9 @@ The answer is 42. ### Callout directive -The callout directive produces a highlighted and bordered block of markdown content. It +The callout directive produces a highlighted and bordered block of markdown content. 
It can be written using the following syntax: - ```pandoc :::callout @@ -153,7 +152,9 @@ be found on [the series Wikipedia entry](https://en.wikipedia.org/wiki/The_Hitch ::: ``` -Different variants/flavours of callout are available by using the syntax + +Different variants/flavours of callout are available by using the syntax + ```pandoc :::callout{variant="variant"} Text @@ -165,19 +166,25 @@ Variants available are "danger", "warning", "tip", "discussion", "note" and "key ![image](https://private-user-images.githubusercontent.com/60351846/301895586-343ade2a-0c4e-4f20-9559-8e3a4986a523.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTQxMjkxMjcsIm5iZiI6MTcxNDEyODgyNywicGF0aCI6Ii82MDM1MTg0Ni8zMDE4OTU1ODYtMzQzYWRlMmEtMGM0ZS00ZjIwLTk1NTktOGUzYTQ5ODZhNTIzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA0MjYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNDI2VDEwNTM0N1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPThjYjRmMGVkYTJmNWRmNTZkZWQwYWUzMzI0MTg0N2I0M2Q4ZDBkYWJlMmMwOWU3MWNmYTkzMTdiYzFmZGRhZjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.I4hgqTxWYcVN9TXzcFjDcyxotBrq_-o0TFWuVlJXW14) #### Danger + Used to inform students to be aware that there is a danger that they could break their development environment or lose data if an action is not performed correctly. #### Warning + Used to warn students that they should be careful when performing a task in case there is a risk of a breaking change or issue arising that is less serious than something in the "danger" category or precautions to take to ensure good practice. #### Tip + Used to inform students of a useful tip that may help them complete a task or help them in a wider context. 
#### Discussion + Used to introduce discussion topics as part of a training session, or as a thinking point if completing the training individually; these help students better understand a concept. #### Note + Used to highlight additional information that the student may want to bear in mind when completing a task or thinking about a topic. #### Key Points + Used to summarise the most essential or critical information in a topic. These can be takeaways or highlights that students should focus on. diff --git a/README.md b/README.md index b6eec279..87fb644f 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,5 @@ # About + This repository contains course materials that have been developed or adapted as part of the [UNIVERSE-HPC project](http://www.universe-hpc.ac.uk/). If you are interested in working through the course materials **as a learner or instructor**, please do so through the [Gutenberg platform](https://train.oxrse.uk/material). @@ -8,6 +9,7 @@ If you are interested in **contributing** material to this repository, please se If you are interested in contributing to the Gutenberg web application, please do so [here](https://github.com/OxfordRSE/gutenberg) ## License + This work is licensed under a Creative Commons Attribution 4.0 International License. This means that you are free to share and adapt the materials in this repository, as long as you provide attribution to the authors. You can read the full terms in the [LICENSE file](LICENSE). diff --git a/high_performance_computing/hpc_aws_slurm_setup/01_flight_solo_setup.md b/high_performance_computing/hpc_aws_slurm_setup/01_flight_solo_setup.md index 84126196..5bf30f79 100644 --- a/high_performance_computing/hpc_aws_slurm_setup/01_flight_solo_setup.md +++ b/high_performance_computing/hpc_aws_slurm_setup/01_flight_solo_setup.md @@ -11,44 +11,43 @@ attribution: --- :::callout{variant="note"} -This is not a requirement for the High Performance Computing theme.
It is a -tutorial for teachers & course runners to setup a training environment - if that +This is not a requirement for the High Performance Computing theme. It is a +tutorial for teachers & course runners to set up a training environment - if that is not relevant to you then please ignore! ::: -The following is a tutorial for setting up a minimal Slurm cluster, using AWS -and the Flight Solo image from OpenFightHPC, for trainees to use in the rest of -the High Performance Computing (HPC) theme. In particular this is aimed at -getting trainees an environment for running the [Intro to HPC][hpcintro] course -and should be **followed by the trainers** before the course is taught if the -trainees have no other access to a HPC environment on which to run simple -commands. - -For this task we will be using [Flight Solo][flightsolopage], an open source -image for setting up a HPC environment for research and scientific computing, -including SLURM and HPC package management using `spack` (among other things). +The following is a tutorial for setting up a minimal Slurm cluster, using AWS +and the Flight Solo image from OpenFlightHPC, for trainees to use in the rest of +the High Performance Computing (HPC) theme. In particular, this is aimed at +giving trainees an environment for running the [Intro to HPC][hpcintro] course +and should be **followed by the trainers** before the course is taught if the +trainees have no other access to an HPC environment on which to run simple +commands. + +For this task we will be using [Flight Solo][flightsolopage], an open source +image for setting up an HPC environment for research and scientific computing, +including SLURM and HPC package management using `spack` (among other things). We will be setting up a minimal training cluster consisting of: - 1 login node - 2 compute nodes -with `SLURM` and `OpenMPI` set up to run jobs on the 2 compute nodes.
As we will -not be using this for serious computation, these can all run on pretty small -machines within Amazon's EC2 System – in our case we opted for `T3.medium` as it +with `SLURM` and `OpenMPI` set up to run jobs on the 2 compute nodes. As we will +not be using this for serious computation, these can all run on pretty small +machines within Amazon's EC2 system – in our case we opted for `T3.medium` as it was the smallest available in our region. ## Spinning up the Nodes -Fortunately, Flight Solo comes with good tutorials for setting up the image/images -on AWS (and other cloud platforms) with detailed, step-by-step instructions for -getting the machines spun up. Therefore, the first advice is to just follow the -instructions [here](flightsolotutorial). +Fortunately, Flight Solo comes with good tutorials for setting up the image/images +on AWS (and other cloud platforms) with detailed, step-by-step instructions for +getting the machines spun up. Therefore, our first advice is simply to follow the +instructions [here](https://www.openflighthpc.org/latest/docs/flight-solo/cluster-build-methods/slurm-multinode-aws/). - -If that works, __great!__ You can carry on to the next step: [setting up pacakges with `spack`](#spack-and-modules). -However if, like for us, this did not work first time, you can try the following -modified steps. During the first stage, i.e. "Launch Login Node", we instead did -this via the EC2 panes (see step `f` in "Launch Compute Nodes") and added in the +If that works, **great!** You can carry on to the next step: [setting up packages with `spack`](#spack-and-modules). +However if, like for us, this did not work first time, you can try the following +modified steps. During the first stage, i.e.
"Launch Login Node", we instead did +this via the EC2 panes (see step `f` in "Launch Compute Nodes") and added in the following yaml config under `advanced-details.user-data`: ``` yaml @@ -61,25 +60,27 @@ write_files: owner: root:root ``` -This is basically just configuring Flight Solo by setting `SHAREPUBKEY` to true -in a config , which shares out the public key from the login key-pair to the -compute nodes and allows them to be found and configured by Flight. Note that we -also created the login node with additional storage space (30GB, rather than -10GB) as we were struggling with getting big packages installed otherwise. +This is basically just configuring Flight Solo by setting `SHAREPUBKEY` to true +in a config, which shares out the public key from the login key-pair to the +compute nodes and allows them to be found and configured by Flight. Note that we +also created the login node with additional storage space (30GB, rather than +10GB) as we were struggling with getting big packages installed otherwise. -At this point we should be able to ssh into the login node with the key pair -generated on the AWS interface and the public ip address of the login node EC2 -instance, i.e. -``` +At this point we should be able to ssh into the login node with the key pair +generated on the AWS interface and the public IP address of the login node EC2 +instance, i.e. + +```shell ssh -i path/to/keyfile.pem flight@$PUBLIC_IP_ADDRESS ``` -which should bring you to a login page where you can start configuring the -cluster with Flight. +which should bring you to a login page where you can start configuring the +cluster with Flight. -You can then follow the instructions the rest of the way, including leaving most -of the config set up during `flight profile configure` as default. What we did +You can then follow the instructions the rest of the way, including leaving most +of the config set up during `flight profile configure` as default.
What we did specifically: + - `Cluster type`: Openflight Slurm Multinode - `Cluster name`: slurm-multinode (you can do what you want here) - `Setup Multi User Environment with IPA?`: none @@ -88,73 +89,84 @@ specifically: - `IP or FQDN for Web Access`: ec2-13-43-90-160.eu-west-2.compute.amazonaws.com (left as default) - `IP Range of Compute Nodes`: 172.31.32.0/20 -Note that the `IP or FQDN for Web Access` was left as default as we didn't try -configuring Flight's web-interface, the IP range of the compute nodes was -calculated automatically so there was no need to change it, and the password was -left as default but changed after successfully applying profiles. +Note that the `IP or FQDN for Web Access` was left as default as we didn't try +configuring Flight's web interface, the IP range of the compute nodes was +calculated automatically so there was no need to change it, and the password was +left as default but changed after successfully applying profiles. -Profiles were then applied as specified in the [Flight Tutorial][flightsolotutorialslurm]. +Profiles were then applied as specified in the [Flight Tutorial][flightsolotutorialslurm]. ## Spack and Modules -Flight has [some environments available][flightenv] for installing system-level -packages, we opted for `spack` for no particular reason other than it is -well-regarded, and one of our requirements is for the cluster to have modules, -which is easily achievable with `spack`. First though you have to create a -global `spack` flight-environment with +Flight has [some environments available][flightenv] for installing system-level +packages; we opted for `spack` for no particular reason other than it is +well-regarded, and one of our requirements is for the cluster to have modules, +which is easily achievable with `spack`.
First, though, you have to create a +global `spack` flight-environment with + ``` bash flight env create -g spack ``` -We recommend `-g` (global) so every other user can access the installed modules -but not install their own. This will need to be done as the root user though, -which you can escalate to while logged in as the user `flight` with `sudo -s`. +We recommend `-g` (global) so every other user can access the installed modules +but not install their own. This will need to be done as the root user though, +which you can escalate to while logged in as the user `flight` with `sudo -s`. Once finished, you can then activate the `spack` environment with: + ``` bash flight env activate spack ``` -after which your regular `spack` commands should work. More info on the `spack` -flight-environment can be found in the [flight docs][flightenvspack]. +after which your regular `spack` commands should work. More info on the `spack` +flight-environment can be found in the [flight docs][flightenvspack]. -To get the module files working you can [follow the instructions][spackmodules] +To get the module files working you can [follow the instructions][spackmodules] on the `spack` docs, but to summarise what we did: 1. Enable tcl module files -``` bash -spack config add "modules:default:enable:[tcl]" -``` + + ``` bash + spack config add "modules:default:enable:[tcl]" + ``` + 2. Install `lmod` with `spack` -``` bash -spack install lmod -``` + + ``` bash + spack install lmod + ``` + 3. Make the module tool available to the current shell -``` bash -. $(spack location -i kmod)/lmod/lmod/init/bash -``` -4. Install and add a new compiler -``` bash -spack install gcc@12.3.0 -spack compiler add -``` + + ``` bash + . $(spack location -i lmod)/lmod/lmod/init/bash + ``` + +4. Install and add a new compiler + + ``` bash + spack install gcc@12.3.0 + spack compiler add + ``` + 5.
Install any new modules with this new compiler -``` bash -$ spack install {module_name} %gcc@12.3.0 -``` -The compiler part isn't strictly necessary, so you can ignore if you like, but -it does make formatting the module list a bit more straightforward so we still -recommend it. We also found that the gcc@11 that came pre-installed on -flight didn't have fortran compilers installed so a fresh compiler install was + ``` bash + spack install {module_name} %gcc@12.3.0 + ``` + +The compiler part isn't strictly necessary, so you can ignore it if you like, but +it does make formatting the module list a bit more straightforward so we still +recommend it. We also found that the gcc@11 that came pre-installed on +Flight didn't have Fortran compilers installed so a fresh compiler install was necessary for MPI, though your mileage may vary. -#### Formatting the module list +### Formatting the module list -This then gives us a large list of all of the dependencies `spack` downloaded -and installed for each of these newly installed modules, which we can leave be -if you like, or we can configure down to a nice, minimalist list. To get around -this we added a config file, following the advice of the aforementioned -[tutorial][spackmodules], to `$SPACK_ROOT/etc/spack/modules.yaml` containing the +This then gives us a large list of all of the dependencies `spack` downloaded +and installed for each of these newly installed modules, which we can leave as-is +if you like, or we can configure down to a nice, minimalist list. To get around +this we added a config file, following the advice of the aforementioned +[tutorial][spackmodules], to `$SPACK_ROOT/etc/spack/modules.yaml` containing the following: ``` yaml @@ -175,76 +187,78 @@ modules: all: '{name}/{version}' ``` -Where you'll need to replace `{PACKAGE_LIST}` with the yaml-formatted list of -packages you specifically want to include (to see the full list of packages -`spack` has installed, simply run `spack find`). After creating/editing this +Where you'll need to replace `{PACKAGE_LIST}` with the yaml-formatted list of +packages you specifically want to include (to see the full list of packages +`spack` has installed, simply run `spack find`).
After creating/editing this file you'll have to run + ``` bash spack module tcl refresh --delete-tree -y ``` -for the changes to be reflected in the list of available modules. This will only -show specific packages (and dependencies) you installed with `spack` and -specified in the include section. This can be a little limiting if you're -installing new packages frequently, so the `{list of packages}` and the -`"%gcc@12"` can be removed for all of the `spack` packages and dependencies to -be included on the module avail command. +for the changes to be reflected in the list of available modules. This will only +show specific packages (and dependencies) you installed with `spack` and +specified in the include section. This can be a little limiting if you're +installing new packages frequently, so the `{list of packages}` and the +`"%gcc@12"` can be removed for all of the `spack` packages and dependencies to +be included in the output of the `module avail` command. #### Some sysadmin -The above approach doesn't persist after leaving the shell instance, so we put -the following into `/etc/profile`: +The above approach doesn't persist after leaving the shell instance, so we put +the following into `/etc/profile`: + ``` bash flight env activate spack . $(spack location -i lmod)/lmod/lmod/init/bash flight env deactivate ``` -which leave the paths in place for any user to be able to call module commands -(e.g. `module avail`) but not install new `spack` packages. You might be able to -copy the output of `$(spack location -i lmod)` and hardcode it to avoid having -to activate the flight environment.
- -One final bit of sys-admin involved activating password authentication for sshd -so that users could login with a password and then add their own ssh file, as -per the [course][hpcintro]. This just involves uncommenting the line -`PasswordAuthentication yes` in `/etc/ssh/sshd_config`, removing any overriding -references to this option in `/etc/ssh/sshd_config.d`, and then restarting the -`sshd` service with + +which leaves the paths in place for any user to be able to call module commands +(e.g. `module avail`) but not install new `spack` packages. You might be able to +copy the output of `$(spack location -i lmod)` and hardcode it to avoid having +to activate the flight environment. + +One final bit of sys-admin involved activating password authentication for sshd +so that users could log in with a password and then add their own ssh file, as +per the [course][hpcintro]. This just involves uncommenting the line +`PasswordAuthentication yes` in `/etc/ssh/sshd_config`, removing any overriding +references to this option in `/etc/ssh/sshd_config.d`, and then restarting the +`sshd` service with + ``` bash sudo systemctl restart sshd ``` -after which you should be able to log in to the login node with just a password. +after which you should be able to log in to the login node with just a password. ## OpenMPI Installation -The training material also requires the use of `srun`, `mpirun`, and `mpiexec`, -for which some installation of MPI is required. We went for OpenMPI, and was -required to install it with +The training material also requires the use of `srun`, `mpirun`, and `mpiexec`, +for which some installation of MPI is required. We went for OpenMPI, and needed +to install it with + ``` bash spack install openmpi ~legacylaunchers schedulers=slurm ``` -Where `schedulers=slurm` is telling it to compile with slurm compatibility and -`~legacylaunchers` is telling it not to delete the `mpirun` and `mpiexec` -binaries.
There are [good reasons to delete][openmpiissue] them for a proper, -production install, but for our training purposes having them is preferable. - +Here `schedulers=slurm` tells it to compile with Slurm compatibility and +`~legacylaunchers` tells it not to delete the `mpirun` and `mpiexec` +binaries. There are [good reasons to delete][openmpiissue] them for a proper, +production install, but for our training purposes having them is preferable. ## Summary -At this point we should now have a working SLURM cluster on AWS which we can ssh -into, submit jobs on with `sbatch`, and generally treat like a proper HPC -environment. Feel free at this point to take it for a spin – we ran through the -[HPC introduction course][hpcintro] but you may wish to try something a bit more +At this point we should have a working SLURM cluster on AWS which we can ssh +into, submit jobs on with `sbatch`, and generally treat like a proper HPC +environment. Feel free at this point to take it for a spin – we ran through the +[HPC introduction course][hpcintro] but you may wish to try something a bit more
- [hpcintro]: ../hpc_intro [flightsolopage]: https://www.openflighthpc.org/latest/solo/ -[flightsolotutorial]: https://www.openflighthpc.org/latest/docs/flight-solo/cluster-build-methods/slurm-multinode-aws/ [flightsolotutorialslurm]: https://www.openflighthpc.org/latest/docs/flight-solo/cluster-build-methods/slurm-multinode-aws#slurm-multinode-configuration [flightenv]: https://www.openflighthpc.org/latest/docs/flight-environment/use-flight/flight-user-suite/flight-env/usage/ [flightenvspack]: https://www.openflighthpc.org/latest/docs/flight-environment/use-flight/flight-user-suite/flight-env/ecosystems/spack/ [spackmodules]: https://spack-tutorial.readthedocs.io/en/latest/tutorial_modules.html -[openmpiissue]: https://github.com/spack/spack/pull/10340#issuecomment-454355612 \ No newline at end of file +[openmpiissue]: https://github.com/spack/spack/pull/10340#issuecomment-454355612 diff --git a/high_performance_computing/hpc_intro/01_hpc_intro.md b/high_performance_computing/hpc_intro/01_hpc_intro.md index 25150612..b39eee83 100644 --- a/high_performance_computing/hpc_intro/01_hpc_intro.md +++ b/high_performance_computing/hpc_intro/01_hpc_intro.md @@ -2,6 +2,9 @@ name: Why Use a Cluster? dependsOn: [] tags: [] +learningOutcomes: + - Describe what an HPC system is. + - Identify how an HPC system could benefit you. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -44,7 +47,6 @@ In all these cases, access to more (and larger) computers is needed. Those computers should be usable at the same time, __solving many researchers' problems in parallel__. - ::::challenge{id=never-used-server title="I've Never Used a Server, Have I?"} Take a minute and think about which of your daily interactions with a @@ -52,6 +54,7 @@ computer may require a remote server or even cluster to provide you with results. 
:::solution + ## Some Ideas * Checking email: your computer (possibly in your pocket) contacts a remote diff --git a/high_performance_computing/hpc_intro/02_connecting.md b/high_performance_computing/hpc_intro/02_connecting.md index 9a8404b7..062442c4 100644 --- a/high_performance_computing/hpc_intro/02_connecting.md +++ b/high_performance_computing/hpc_intro/02_connecting.md @@ -4,6 +4,9 @@ dependsOn: [ high_performance_computing.hpc_intro.01_hpc_intro ] tags: [ssh] +learningOutcomes: + - Configure secure access to a remote HPC system. + - Connect to a remote HPC system. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -34,6 +37,7 @@ SSH keys are an alternative method for authentication to obtain access to remote * a public key which can be placed on any remote system you will access. :::callout + ## Private keys are your secure digital passport A private key that is visible to anyone but you should be considered compromised, and must be destroyed. This includes having improper permissions on the directory it (or a copy) is stored in, traversing any network that is not secure (encrypted), attachment on unencrypted email, and even displaying the key on your terminal window. @@ -44,16 +48,19 @@ Protect this key as if it unlocks your front door. In many ways, it does. Regardless of the software or operating system you use, _please_ choose a strong password or passphrase to act as another layer of protection for your private SSH key. :::callout + ## Considerations for SSH Key Passwords + When prompted, enter a strong password that you will remember. There are two common approaches to this: 1. Create a memorable passphrase with some punctuation and number-for-letter substitutions, 32 characters or longer. Street addresses work well; just be careful of social engineering or public records attacks. 2. Use a password manager and its built-in password generator with all character classes, 25 characters or longer. 
[KeePass][keepass] and [BitWarden][bitwarden] are two good options. 3. Nothing is _less_ secure than a private key with no password. If you skipped password entry by accident, go back and generate a new key pair _with_ a strong password. + ::: -#### SSH Keys on Linux, Mac, MobaXterm, and Windows Subsystem for Linux +### SSH Keys on Linux, Mac, MobaXterm, and Windows Subsystem for Linux Once you have opened a terminal, check for existing SSH keys and filenames since existing SSH keys are overwritten. @@ -80,8 +87,10 @@ Take a look in `~/.ssh` (use `ls ~/.ssh`). You should see two new files: * your private key (`~/.ssh/id_ed25519`): _do not share with anyone!_ * the shareable public key (`~/.ssh/id_ed25519.pub`): if a system administrator asks for a key, this is the one to send. It is also safe to upload to websites such as GitHub: it is meant to be seen. -:::callout +:::callout{variant="tip"} + ## Use RSA for Older Systems + If key generation failed because ed25519 is not available, try using the older (but still strong and trustworthy) [RSA][wiki-rsa] cryptosystem. Again, first check for an existing key: ```bash @@ -103,9 +112,10 @@ Take a look in `~/.ssh` (use `ls ~/.ssh`). You should see two new files: * your private key (`~/.ssh/id_rsa`): _do not share with anyone!_ * the shareable public key (`~/.ssh/id_rsa.pub`): if a system administrator asks for a key, this is the one to send. It is also safe to upload to websites such as GitHub: it is meant to be seen. + ::: -#### SSH Keys on PuTTY +### SSH Keys on PuTTY If you are using PuTTY on Windows, download and use `puttygen` to generate the key pair. See the [PuTTY documentation][putty-gen] for details. @@ -139,29 +149,36 @@ ssh-add -l ``` If you get a list, or a message like: -``` + +```text The agent has no identities ``` Then everything is fine! 
If you get an error like the ones below:

-```
+
+```text
Error connecting to agent: No such file or directory
# or
Could not open a connection to your authentication agent.
```
+
... then your SSH agent isn't running and you need to start it as:
+
```bash
eval $(ssh-agent)
```

:::callout
+
## What's In A `$(...)`?
+
The syntax of this SSH Agent command is unusual, based on what we've seen in the UNIX Shell lesson.
This is because the `ssh-agent` command creates a connection that only you have access to, and prints a series of shell commands that can be used to reach it -- but _does not execute them!_

```bash
ssh-agent
```
-```
+
+```text
SSH_AUTH_SOCK=/tmp/ssh-Zvvga2Y8kQZN/agent.131521; export SSH_AUTH_SOCK;
SSH_AGENT_PID=131522;
@@ -174,14 +191,13 @@ The `eval` command interprets this text output as commands and allows you to acc
You could run each line of the `ssh-agent` output yourself, and achieve the same result.
Using `eval` just makes this easier.
:::
-
Add your key to the agent, with session expiration after 8 hours:

```bash
ssh-add -t 8h ~/.ssh/id_ed25519
```

-```
+```text
Enter passphrase for .ssh/id_ed25519:
Identity added: .ssh/id_ed25519
Lifetime set to 28800 seconds
@@ -189,7 +205,7 @@ Lifetime set to 28800 seconds

For the duration (8 hours), whenever you use that key, the SSH Agent will provide the key on your behalf without you having to type a single keystroke.

-#### SSH Agent on PuTTY
+### SSH Agent on PuTTY

If you are using PuTTY on Windows, download and use `pageant` as the SSH agent. See the [PuTTY documentation][putty-agent].

@@ -227,7 +243,7 @@ Very often, many users are tempted to think of a high-performance computing inst
remote$ hostname
```

-```
+```text
cluster.name
```

@@ -237,7 +253,7 @@ So, we're definitely on the remote machine. Next, let's find out where we are by
remote$ pwd
```

-```
+```text
/home/user
```

@@ -247,7 +263,7 @@ Great, we know where we are!
Let's see what's in our current directory: remote$ ls ``` -``` +```text id_ed25519.pub ``` @@ -257,7 +273,7 @@ The system administrators may have configured your home directory with some help remote$ ls -a ``` -``` +```text . .bashrc id_ed25519.pub .. .ssh ``` @@ -267,6 +283,7 @@ In the first column, `.` is a reference to the current directory and `..` a refe ### Install Your SSH Key :::callout + ## There May Be a Better Way Policies and practices for handling SSH keys vary between HPC clusters: follow any guidance provided by the cluster administrators or documentation. In particular, if there is an online portal for managing SSH keys, use that instead of the directions outlined here. @@ -297,12 +314,10 @@ local$ ssh user@cluster.name ``` [bitwarden]: https://bitwarden.com -[fshs]: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard [gh-ssh]: https://docs.github.com/en/authentication/connecting-to-github-with-ssh [keepass]: https://keepass.info [putty-gen]: https://tartarus.org/~simon/putty-prerel-snapshots/htmldoc/Chapter8.html#pubkey-puttygen [putty-agent]: https://tartarus.org/~simon/putty-prerel-snapshots/htmldoc/Chapter9.html#pageant [ssh-agent]: https://www.ssh.com/academy/ssh/agent -[ssh-flags]: https://stribika.github.io/2015/01/04/secure-secure-shell.html [wiki-rsa]: https://en.wikipedia.org/wiki/RSA_(cryptosystem) [wiki-dsa]: https://en.wikipedia.org/wiki/EdDSA diff --git a/high_performance_computing/hpc_intro/03_cluster.md b/high_performance_computing/hpc_intro/03_cluster.md index e2c4567d..6dcf63e6 100644 --- a/high_performance_computing/hpc_intro/03_cluster.md +++ b/high_performance_computing/hpc_intro/03_cluster.md @@ -4,6 +4,12 @@ dependsOn: [ high_performance_computing.hpc_intro.02_connecting ] tags: [ARC] +learningOutcomes: + - Survey system resources using nproc, free, and the queuing system. + - Compare & contrast resources on the local machine, login node, and worker nodes. 
+ - Learn about the various filesystems on the cluster using df. + - Find out who else is logged in. + - Assess the number of idle and occupied nodes. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -39,7 +45,7 @@ You would likely see something more like this: local$ ls ``` -``` +```text Applications Documents Library Music Public Desktop Downloads Movies Pictures ``` @@ -58,7 +64,7 @@ devices are anchored to the "root" directory, which is `/`: remote$ ls / ``` -``` +```text bin etc lib64 proc sbin sys var boot home mnt root scratch tmp working dev lib opt run srv usr @@ -69,6 +75,7 @@ folders on a UNIX OS contain system files and change as you install new software upgrade your OS. :::callout + ## Using HPC filesystems On HPC systems, you have a number of places where you can store your files. @@ -87,6 +94,7 @@ are backed up. Scratch; it may not be backed up. It differs from Scratch space in that files in a work file system are not automatically deleted for you: you must manage the space yourself. + ::: ## Nodes @@ -115,6 +123,7 @@ This may show only your user ID, but there are likely several other people (including fellow learners) connected right now. :::callout + ## Dedicated Transfer Nodes If you want to transfer larger amounts of data to or from the cluster, some @@ -143,7 +152,7 @@ For example, we can view all of the compute nodes by running the command remote$ sinfo ``` -``` +```text PARTITION AVAIL TIMELIMIT NODES STATE NODELIST compute* up 7-00:00:00 1 drain* gra259 compute* up 7-00:00:00 11 down* gra[8,99,211,268,376,635,647,803,85... @@ -201,19 +210,19 @@ can be found on the command line. 
In a Linux environment, you can use the following commands: * Run system utilities + ```bash local$ nproc --all local$ free -m ``` * Read from `/proc` + ```bash local$ cat /proc/cpuinfo local$ cat /proc/meminfo ``` - - ``` ::: In a macOS environment, you can use the following to get the number of cpus and @@ -231,6 +240,7 @@ or install `htop` using `apt` in Ubuntu or `brew` in macOS. local$ top local$ htop ``` + :::: ::::challenge{id=explore-remote title="Explore the Login Node"} @@ -277,7 +287,6 @@ This is an important point to remember: files saved on one node ::: :::: - ::::challenge{id=compare-local-remote title="Compare Your Computer, the Login Node and the Compute Node"} Compare your laptop's number of processors and memory with the numbers you @@ -285,7 +294,6 @@ see on the cluster login node and compute node. What implications do you think the differences might have on running your research work on the different systems and nodes? - :::solution Compute nodes are usually built with processors that have _higher core-counts_ than the login node or personal computers in order to support @@ -297,6 +305,7 @@ more, faster memory is key for large or _complex numerical tasks_. :::: :::callout + ## Differences Between Nodes Many HPC clusters have a variety of nodes optimized for particular workloads. diff --git a/high_performance_computing/hpc_intro/04_scheduler.md b/high_performance_computing/hpc_intro/04_scheduler.md index ec725e36..33c7d790 100644 --- a/high_performance_computing/hpc_intro/04_scheduler.md +++ b/high_performance_computing/hpc_intro/04_scheduler.md @@ -4,6 +4,11 @@ dependsOn: [ high_performance_computing.hpc_intro.03_cluster ] tags: [slurm, ARC] +learningOutcomes: + - Submit a simple script to the cluster. + - Monitor the execution of jobs using command line tools. + - Inspect the output and error files of your jobs. + - Find the right place to put large datasets on the cluster. 
attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -59,7 +64,8 @@ manner. Our shell script will have three parts: ```bash remote$ nano example-job.sh ``` -``` + +```text #!/bin/bash echo -n "This script is running on " @@ -76,9 +82,10 @@ Run the script. Does it execute on the cluster or just our login node? remote$ bash example-job.sh ``` -``` +```text This script is running on <> ``` + ::: :::: @@ -96,7 +103,7 @@ available to perform the work. remote$ sbatch example-job.sh ``` -``` +```text Submitted batch job 36855 ``` @@ -109,15 +116,16 @@ status, we check the queue using the command `squeue -u yourUsername` remote$ squeue -u yourUsername ``` -``` +```text JOBID USER ACCOUNT NAME ST REASON START_TIME T... 36856 yourUsername yourAccount example-job.sh R None 2017-07-01T16:47:02 ... ``` -We can see all the details of our job, most importantly that it is in the R or RUNNING +We can see all the details of our job, most importantly that it is in the R or RUNNING state. Sometimes our jobs might need to wait in a queue (PENDING) or have an error (E). :::callout + ## Where's the Output? On the login node, this script printed output to the terminal - but @@ -154,7 +162,7 @@ name of a job. Add an option to the script: remote$ cat example-job.sh ``` -``` +```text #!/usr/bin/env bash #SBATCH -J hello-world @@ -169,7 +177,7 @@ remote$ sbatch example-job.sh remote$ squeue -u yourUsername ``` -``` +```text JOBID USER ACCOUNT NAME ST REASON START_TIME TIME TIME_LEF... 38191 yourUsername yourAccount hello-wo PD Priority N/A 0:00 1:00:00 ... ``` @@ -187,16 +195,16 @@ with your site's default resources, which is probably not what you want. The following are several key resource requests: -- `--ntasks=` or `-n `: How many CPU cores does your job need, in total? +* `--ntasks=` or `-n `: How many CPU cores does your job need, in total? 
-- `--time ` or `-t `: How much
+* `--time ` or `-t `: How much
  real-world time (walltime) will your job take to run? The ``
  part can be omitted.
-- `--mem=`: How much memory on a node does your job need in megabytes? You can
+* `--mem=`: How much memory on a node does your job need in megabytes? You can
  also specify gigabytes by adding a little “g” afterwards (example: `--mem=5g`)
-- `--nodes=` or `-N `: How many separate machines does your job need to run
-  on? Note that if you set ntasks to a number greater than what one machine can offer,
+* `--nodes=` or `-N `: How many separate machines does your job need to run
+  on? Note that if you set ntasks to a number greater than what one machine can offer,
  Slurm will set this value automatically.

Note that just _requesting_ these resources does not make your job run faster,
@@ -220,7 +228,7 @@ for it on the cluster.

remote$ cat example-job.sh
```

-```
+```text
#!/usr/bin/env bash

#SBATCH -t 00:01 # timeout in HH:MM

@@ -239,10 +247,10 @@ Why are the Slurm runtime and `sleep` time not identical?

::::challenge{id=env-var title="Job environment variables"}

-When Slurm runs a job, it sets a number of environment variables for the job. One of
-these will let us check what directory our job script was submitted from. The
-`SLURM_SUBMIT_DIR` variable is set to the directory from which our job was submitted.
-Using the `SLURM_SUBMIT_DIR` variable, modify your job so that it prints out the
+When Slurm runs a job, it sets a number of environment variables for the job. One of
+these will let us check what directory our job script was submitted from. The
+`SLURM_SUBMIT_DIR` variable is set to the directory from which our job was submitted.
+Using the `SLURM_SUBMIT_DIR` variable, modify your job so that it prints out the location from which the job was submitted :::solution @@ -252,7 +260,7 @@ remote$ nano example-job.sh remote$ cat example-job.sh ``` -``` +```text #!/usr/bin/env bash #SBATCH -t 00:00:30 @@ -262,6 +270,7 @@ hostname echo "This job was launched in the following directory:" echo ${SLURM_SUBMIT_DIR} ``` + ::: :::: @@ -273,7 +282,7 @@ wall time, and attempt to run a job for two minutes. remote$ cat example-job.sh ``` -``` +```text #!/usr/bin/env bash #SBATCH -J long_job #SBATCH -t 00:01 # timeout in HH:MM @@ -295,7 +304,7 @@ remote$ squeue -u yourUsername remote$ cat slurm-38193.out ``` -``` +```text This job is running on: gra533 slurmstepd: error: *** JOB 38193 ON gra533 CANCELLED AT 2017-07-02T16:35:48 @@ -325,7 +334,7 @@ remote$ sbatch example-job.sh remote$ squeue -u yourUsername ``` -``` +```text Submitted batch job 38759 JOBID USER ACCOUNT NAME ST REASON TIME TIME_LEFT NOD... @@ -342,14 +351,14 @@ remote$ scancel 38759 remote$ squeue -u yourUsername ``` -``` +```text JOBID USER ACCOUNT NAME ST REASON START_TIME TIME TIME_LEFT NODES CPUS ``` ::::challenge{id=cancel-multiple title="Cancelling multiple jobs"} -We can also cancel all of our jobs at once using the -u option. This will delete all -jobs for a specific user (in this case, yourself). Note that you can only delete your +We can also cancel all of our jobs at once using the -u option. This will delete all +jobs for a specific user (in this case, yourself). Note that you can only delete your own jobs. Try submitting multiple jobs and then cancelling them all. @@ -383,26 +392,26 @@ too much for a login node to handle. A good example of this might be building a genome index for alignment with a tool like [HISAT2][hisat]. Fortunately, we can run these types of tasks as a one-off with `srun`. -`srun` runs a single command on the cluster and then exits. 
Let’s demonstrate this by +`srun` runs a single command on the cluster and then exits. Let’s demonstrate this by running the hostname command with `srun`. (We can cancel an `srun` job with `Ctrl-c`.) ```bash remote$ srun hostname ``` -``` +```text gra752 ``` -`srun` accepts all of the same options as `sbatch`. However, instead of specifying these in -a script, these options are specified on the command-line when starting a job. To submit +`srun` accepts all of the same options as `sbatch`. However, instead of specifying these in +a script, these options are specified on the command-line when starting a job. To submit a job that uses 2 CPUs for instance, we could use the following command: ```bash remote$ srun -n 2 echo "This job will use 2 CPUs." ``` -``` +```text This job will use 2 CPUs. This job will use 2 CPUs. ``` @@ -411,32 +420,33 @@ Typically, the resulting shell environment will be the same as that for `sbatch` ## Interactive jobs -Sometimes, you will need a lot of resource for interactive use. Perhaps it’s our first -time running an analysis or we are attempting to debug something that went wrong with a +Sometimes, you will need a lot of resource for interactive use. Perhaps it’s our first +time running an analysis or we are attempting to debug something that went wrong with a previous job. Fortunately, Slurm makes it easy to start an interactive job with `srun`: ```bash remote$ srun --pty bash ``` -You should be presented with a bash prompt. Note that the prompt will likely change to -reflect your new location, in this case the compute node we are logged on. You can also +You should be presented with a bash prompt. Note that the prompt will likely change to +reflect your new location, in this case the compute node we are logged on. You can also verify this with hostname. :::callout + ### Creating remote graphics -To see graphical output inside your jobs, you need to use X11 forwarding. 
To connect -with this feature enabled, use the -Y option when you login with the ssh command, e.g., +To see graphical output inside your jobs, you need to use X11 forwarding. To connect +with this feature enabled, use the -Y option when you login with the ssh command, e.g., `ssh -Y user@cluster.name`. -To demonstrate what happens when you create a graphics window on the remote node, use -the `xeyes` command. A relatively adorable pair of eyes should pop up (press `Ctrl-C` to -stop). If you are using a Mac, you must have installed XQuartz (and restarted your +To demonstrate what happens when you create a graphics window on the remote node, use +the `xeyes` command. A relatively adorable pair of eyes should pop up (press `Ctrl-C` to +stop). If you are using a Mac, you must have installed XQuartz (and restarted your computer) for this to work. -If your cluster has the `slurm-spank-x11` plugin installed, you can ensure X11 -forwarding within interactive jobs by using the `--x11` option for `srun` with the command +If your cluster has the `slurm-spank-x11` plugin installed, you can ensure X11 +forwarding within interactive jobs by using the `--x11` option for `srun` with the command `srun --x11 --pty bash`. ::: @@ -444,9 +454,8 @@ When you are done with the interactive job, type `exit` to quit your session. ## Key Points -- The scheduler handles how compute resources are shared between users. -- A job is just a shell script. -- Request slightly more resources than you will need. +* The scheduler handles how compute resources are shared between users. +* A job is just a shell script. +* Request slightly more resources than you will need. 
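Pulling these key points together, a complete submission script that names the job and requests explicit resources might look like the sketch below. The job name and limits are arbitrary illustrations, not site defaults; because Slurm directives live in comments, the script also runs locally under plain `bash`, just like the earlier examples.

```shell
# Write a recap job script; the name and limits below are arbitrary examples
cat > recap-job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH -J recap-job       # job name shown by squeue
#SBATCH -N 1               # one node
#SBATCH -n 1               # one task (CPU core) in total
#SBATCH -t 00:05:00        # five minutes of walltime
#SBATCH --mem=100          # 100 megabytes of memory

echo -n "This script is running on "
hostname
EOF

# On the cluster, submit with: sbatch recap-job.sh
# bash treats the #SBATCH lines as comments, so this also runs on a laptop:
bash recap-job.sh
```

Running it with `bash` simply ignores the resource requests; only `sbatch` reads the `#SBATCH` directives and hands the script to the scheduler.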
-[fshs]: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard [hisat]: https://ccb.jhu.edu/software/hisat2/index.shtml diff --git a/high_performance_computing/hpc_intro/05_modules.md b/high_performance_computing/hpc_intro/05_modules.md index 63dc5857..bf651927 100644 --- a/high_performance_computing/hpc_intro/05_modules.md +++ b/high_performance_computing/hpc_intro/05_modules.md @@ -4,6 +4,9 @@ dependsOn: [ high_performance_computing.hpc_intro.04_scheduler ] tags: [ARC] +learningOutcomes: + - Load and use a software package. + - Explain how the shell environment changes when the module mechanism loads or unloads packages. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -73,7 +76,7 @@ To see available software modules, use `module avail`: remote$ module avail ``` -``` +```text ---------------- MPI-dependent avx2 modules ----------------- abinit/8.2.2 (chem) ncl/6.4.0 abyss/1.9.0 (bio) ncview/2.1.7 (vis) @@ -109,7 +112,7 @@ message telling you so remote$ module list ``` -``` +```text No Modulefiles Currently Loaded. ``` @@ -126,7 +129,7 @@ it to tell us where a particular piece of software is stored. remote$ which python3 ``` -``` +```text python3 not found ``` @@ -137,7 +140,7 @@ remote$ module load python remote$ which python3 ``` -``` +```text /path/to/python/python3 ``` @@ -156,21 +159,20 @@ variables we can print it out using `echo`. remote$ echo $PATH ``` -``` +```text /path/to/python:/another/path:/some/other/path:/yet/another/path ``` -You'll notice a similarity to the output of the `which` command. -In this case, there's only one difference: the different directory at the beginning. -When we ran the `module load` command, it added a directory to the beginning of our `$PATH`. +You'll notice a similarity to the output of the `which` command. +In this case, there's only one difference: the different directory at the beginning. 
+When we ran the `module load` command, it added a directory to the beginning of our `$PATH`. Let's examine what's there: - ```bash remote$ ls /path/to/python ``` -``` +```text 2to3 idle3.5 pydoc3.5 python3.5m virtualenv 2to3-3.5 pip python python3.5m-config wheel easy_install pip3 python3 python3-config @@ -179,8 +181,8 @@ idle3 pydoc3 python3.5-config pyvenv-3.5 ``` Taking this to its conclusion, `module load` will add software to your `$PATH`. -It "loads" software. -A special note on this - depending on which version of the `module` program that is installed at your site, +It "loads" software. +A special note on this - depending on which version of the `module` program that is installed at your site, `module load` will also load required software dependencies. To demonstrate, let’s use `module list`. `module list` shows all loaded software modules. @@ -189,7 +191,7 @@ To demonstrate, let’s use `module list`. `module list` shows all loaded softwa remote$ module list ``` -``` +```text Currently Loaded Modules: 1) nixpkgs/.16.09 (H,S) 5) intel/2016.4 (t) 2) icc/.2016.4.258 (H) 6) openmpi/2.1.1 (m) @@ -203,14 +205,15 @@ Currently Loaded Modules: H: Hidden Module ``` -The list of modules available will vary widely by HPC system. +The list of modules available will vary widely by HPC system. If your system has the `beast` module available, then loading `beast` module (a bioinformatics software package) will do something like this: ```bash remote$ module load beast remote$ module list ``` -``` + +```text Currently Loaded Modules: 1) nixpkgs/.16.09 (H,S) 5) intel/2016.4 (t) 9) java/1.8.0_121 (t) 2) icc/.2016.4.258 (H) 6) openmpi/2.1.1 (m) 10) beagle-lib/2.1.2 (bio) @@ -226,14 +229,15 @@ Currently Loaded Modules: H: Hidden Module ``` -So in this case, `beast` also loaded `java/1.8.0_121` and `beagle-lib/2.1.2` as well. +So in this case, `beast` also loaded `java/1.8.0_121` and `beagle-lib/2.1.2` as well. Let’s try unloading the `beast` package. 
```bash remote$ module unload beast remote$ module list ``` -``` + +```text Currently Loaded Modules: 1) nixpkgs/.16.09 (H,S) 5) intel/2016.4 (t) 2) icc/.2016.4.258 (H) 6) openmpi/2.1.1 (m) @@ -247,7 +251,7 @@ Currently Loaded Modules: H: Hidden Module ``` -So using `module unload` “un-loads” a module along with its dependencies. +So using `module unload` “un-loads” a module along with its dependencies. If we wanted to unload everything at once, we could run `module purge` (unloads everything). ```bash @@ -264,24 +268,23 @@ The following modules were not unloaded: 4) gcccore/.5.4.0 8) openmpi/2.1.1 ``` -Note that `module purge` is informative. +Note that `module purge` is informative. It lets us know that all but a default set of packages have been unloaded (and how to actually unload these if we truly so desired). Note that this module loading process happens principally through the manipulation of environment variables like $PATH. There is usually little or no data transfer involved. -The module loading process manipulates other special environment variables as well, -including variables that influence where the system looks for software libraries, +The module loading process manipulates other special environment variables as well, +including variables that influence where the system looks for software libraries, and sometimes variables which tell commercial software packages where to find license servers. The module command also restores these shell environment variables to their previous state when a module is unloaded. - ## Software Versioning -So far, we've learned how to load and unload software packages. -This is very useful. -However, we have not yet addressed the issue of software versioning. -At some point or other, you will run into issues where only one particular version of some software will be suitable. +So far, we've learned how to load and unload software packages. +This is very useful. 
+However, we have not yet addressed the issue of software versioning. +At some point or other, you will run into issues where only one particular version of some software will be suitable. Perhaps a key bugfix only happened in a certain version, or version X broke compatibility with a file format you use. In either of these example cases, it helps to be very specific about what software is loaded. @@ -291,7 +294,7 @@ Let's examine the output of `module avail` more closely. remote$ module avail ``` -``` +```text ---------------- MPI-dependent avx2 modules ----------------- abinit/8.2.2 (chem) ncl/6.4.0 abyss/1.9.0 (bio) ncview/2.1.7 (vis) @@ -317,24 +320,28 @@ Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys". ``` -Let’s take a closer look at the gcc module. -GCC is an extremely widely used C/C++/Fortran compiler. -Tons of software is dependent on the GCC version, and might not compile or run if the wrong version is loaded. -In this case, there are two different versions: `gcc/4.8.5` and `gcc/5.4.0`. +Let’s take a closer look at the gcc module. +GCC is an extremely widely used C/C++/Fortran compiler. +Tons of software is dependent on the GCC version, and might not compile or run if the wrong version is loaded. +In this case, there are two different versions: `gcc/4.8.5` and `gcc/5.4.0`. How do we load each copy, and which copy is the default? -In this case, `gcc/5.4.0` has a `(D)` next to it. This indicates that it is the default +In this case, `gcc/5.4.0` has a `(D)` next to it. This indicates that it is the default — if we type `module load gcc`, this is the copy that will be loaded. ::::callout{variant="tip"} + ## Filtering Lists + A lot of HPC systems will have so many modules available that looking through the whole of `module avail` is just not practical. You can use `module avail gcc` to check out the versions of GCC available on yours. 
An example output from a different system might be: + ```bash remote$ module avail gcc ``` -``` + +```text ------------------------------------------------- /local/modules/apps -------------------------------------------------- [removed for clarity] @@ -351,6 +358,7 @@ remote$ module avail gcc Where: D: Default Module ``` + :::: ```bash @@ -358,7 +366,7 @@ remote$ module load gcc remote$ gcc --version ``` -``` +```text Lmod is automatically replacing "intel/2016.4" with "gcc/5.4.0". Due to MODULEPATH changes, the following have been reloaded: @@ -370,22 +378,22 @@ This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` -Note that three things happened: the default copy of GCC was loaded (version 5.4.0), the -Intel compilers (which conflict with GCC) were unloaded, and software that is dependent -on compiler (OpenMPI) was reloaded. The module system turned what might be a +Note that three things happened: the default copy of GCC was loaded (version 5.4.0), the +Intel compilers (which conflict with GCC) were unloaded, and software that is dependent +on compiler (OpenMPI) was reloaded. The module system turned what might be a super-complex operation into a single command. -So how do we load the non-default copy of a software package? In this case, the only -change we need to make is be more specific about the module we are loading. There are -two GCC modules: `gcc/5.4.0` and `gcc/4.8.5`. To load a non-default module, the only -change we need to make to our module load command is to leave in the version number +So how do we load the non-default copy of a software package? In this case, the only +change we need to make is be more specific about the module we are loading. There are +two GCC modules: `gcc/5.4.0` and `gcc/4.8.5`. To load a non-default module, the only +change we need to make to our module load command is to leave in the version number after the `/`. 
```bash remote$ module load gcc/4.8.5 ``` -``` +```text Inactive Modules: 1) openmpi @@ -398,17 +406,17 @@ This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ``` -We now have successfully switched from GCC 5.4.0 to GCC 4.8.5. It is also important to -note that there was no compatible OpenMPI module available for GCC 4.8.5. Because of -this, the module program has “inactivated” the module. All this means for us is that if -we re-load GCC 5.4.0, module will remember OpenMPI used to be loaded and load that +We now have successfully switched from GCC 5.4.0 to GCC 4.8.5. It is also important to +note that there was no compatible OpenMPI module available for GCC 4.8.5. Because of +this, the module program has “inactivated” the module. All this means for us is that if +we re-load GCC 5.4.0, module will remember OpenMPI used to be loaded and load that module as well. ```bash remote$ module load gcc/5.4.0 ``` -``` +```text Activating Modules: 1) openmpi/2.1.1 @@ -416,7 +424,6 @@ The following have been reloaded with a version change: 1) gcc/4.8.5 => gcc/5.4.0 ``` - ::::challenge{id=module-script title="Using Software Modules in Scripts"} Create a job that is able to run `python3 --version`. 
Remember, no software @@ -431,7 +438,7 @@ remote$ nano python-module.sh remote$ cat python-module.sh ``` -``` +```text #!/usr/bin/env bash module load python3 @@ -442,5 +449,6 @@ python3 --version ```bash remote$ sbatch python-module.sh ``` + ::: :::: diff --git a/high_performance_computing/hpc_intro/06_transferring_files.md b/high_performance_computing/hpc_intro/06_transferring_files.md index 7ef29976..d641c681 100644 --- a/high_performance_computing/hpc_intro/06_transferring_files.md +++ b/high_performance_computing/hpc_intro/06_transferring_files.md @@ -4,6 +4,8 @@ dependsOn: [ high_performance_computing.hpc_intro.05_modules ] tags: [] +learningOutcomes: + - Transfer files to and from a computing cluster. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -52,6 +54,7 @@ in this case, `main`. Use one of the above commands to save the tarball as `amdahl.tar.gz`. :::solution + ## `wget` and `curl` Commands ```bash @@ -59,6 +62,7 @@ local$ wget -O amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/mai # or local$ curl -o amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/main ``` + ::: :::: @@ -140,7 +144,7 @@ directory named "amdahl" using `tar`. local$ tar -xvzf amdahl.tar.gz ``` -``` +```text hpc-carpentry-amdahl-46c9b4b/ hpc-carpentry-amdahl-46c9b4b/.github/ hpc-carpentry-amdahl-46c9b4b/.github/workflows/ @@ -191,7 +195,7 @@ then provide a directory to compress: local$ tar -cvzf compressed_code.tar.gz amdahl ``` -``` +```text amdahl/ amdahl/.github/ amdahl/.github/workflows/ @@ -214,6 +218,7 @@ That would mean adding the new `amdahl` folder to the _existing_ folder archive! :::callout + ## Working with Windows When you transfer text files from a Windows system to a Unix system (Mac, @@ -282,6 +287,7 @@ Try downloading the file directly. Note that it may well fail, and that's OK! 
:::solution + ## Commands ```bash @@ -290,6 +296,7 @@ remote$ wget -O amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/ma # or remote$ curl -o amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/main ``` + ::: Did it work? If not, what does the terminal output tell you about what @@ -308,6 +315,7 @@ local$ scp -r amdahl user@cluster.name: ``` :::callout + ## Caution For a large directory either in size or number of files - @@ -339,6 +347,7 @@ With `scp`, a trailing slash on the target directory is optional, and has no effect. It is important for other commands, like `rsync`. :::callout + ## A Note on `rsync` As you gain experience with transferring files, you may find the `scp` @@ -379,6 +388,7 @@ To download a file, we simply change the source and destination: ```bash local$ rsync -avP user@cluster.name:amdahl ./ ``` + ::: File transfers using both `scp` and `rsync` use SSH to encrypt data sent through diff --git a/high_performance_computing/hpc_intro/07_parallel.md b/high_performance_computing/hpc_intro/07_parallel.md index 45d99f6a..47e893f4 100644 --- a/high_performance_computing/hpc_intro/07_parallel.md +++ b/high_performance_computing/hpc_intro/07_parallel.md @@ -4,6 +4,12 @@ dependsOn: [ high_performance_computing.hpc_intro.06_transferring_files ] tags: [slurm] +learningOutcomes: + - Install a Python package using pip. + - Prepare a job submission script for the parallel executable. + - Launch jobs with parallel execution. + - Record and summarize the timing and accuracy of jobs. + - Describe the relationship between job parallelism and performance. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -12,7 +18,7 @@ attribution: license: CC-BY-4.0 --- -We now have the tools we need to run a multi-processor job. +We now have the tools we need to run a multi-processor job. 
This is a very important aspect of HPC systems, as parallelism is one of the primary tools we have to improve the performance of computational tasks. If you disconnected, log back in to the cluster. @@ -21,12 +27,12 @@ If you disconnected, log back in to the cluster. local$ ssh user@cluster.name ``` - ## Install the Amdahl Program With the Amdahl source code on the cluster, we can install it, which will provide access to the `amdahl` executable. ::::callout + ## Amdahl is Python Code The Amdahl program is written in Python, and installing or using it requires locating the `python3` executable on the login node. @@ -35,6 +41,7 @@ The Amdahl program is written in Python, and installing or using it requires loc Move into the extracted directory, then use the Package Installer for Python, or `pip`, to install it in your ("user") home directory: To do this, we'll need to make sure we have our required packages installed: + ```bash remote$ module load python/3.11 remote$ module load openmpi/2.1.1 @@ -44,8 +51,11 @@ remote$ module load openmpi/2.1.1 remote$ cd amdahl remote$ python3 -m pip install --user . ``` + :::::callout{variant="warning"} + ## Dependencies and Clusters + As they're shared computing environments, and can be a target for hacking, many high performance clusters block the kind of automatic dependency resolution you're used to on regular machines. You can see this if you call `python3 -m pip install .` and it hangs for a while, before reporting it can't find `setuptools` (or another package). You may have to rely on their 'walled garden' of resources, or install any extras manually - to reduce the risk of malware sneaking onto the cluster. @@ -53,40 +63,50 @@ You may have to rely on their 'walled garden' of resources, or install any extra If `python3 -m pip install --user .` fails, you have a few possibilities to manage the dependencies: ::::callout + ### Use a module -The easiest fix is to look for a `mpi4py` module already available on your cluster. 
+The easiest fix is to look for a `mpi4py` module already available on your cluster. Try `module avail mpi4py`, and load it if you can find one. :::: ::::callout + ### Use Anaconda Some clusters prefer you to use Anaconda, a heavier-weight package and environment manager for Python that has a vetted list of packages. Your system might have it down as `conda`, `miniconda` or `anaconda` - try `module avail conda` to search for it: + ```bash remote$ module avail conda ``` -``` + +```text ------------------------------------------------- /local/modules/apps -------------------------------------------------- anaconda/py3.10 conda/py2-latest conda/py3-latest (D) ``` + Conda requires you to make some modifications to your `.bashrc` file, then re-load it to allow you to work with environments. We'll unload `python` and load `conda` instead: + ```bash remote$ module unload python remote$ module load conda remote$ conda init remote$ source ~/.bashrc ``` + Then, once you've done that, you can create an `amdahl` environment, and install the prerequisites for it using `conda` instead of `pip`. + ```bash remote$ conda create amdahl remote$ conda activate amdahl remote$ conda install --yes --file requirements.txt ``` + :::: ::::callout + ### Install manually You can try and install `mpi4py` from its source code, just like we're doing with `amdahl`. @@ -118,20 +138,20 @@ This happens if the 'development version' of Python isn't installed - you might **Finally, install `amdahl`.** Once your dependencies are sorted, you can then install `amdahl` without looking on the Python Package Index (PyPI): -``` + +```text python3 -m pip install --user --no-index . ``` ::::: - - ::::callout{variant="warning"} + ## Binaries and `PATH`s `pip` may warn that your user package binaries are not in your `PATH`. -``` +```text WARNING: The script amdahl is installed in "${HOME}/.local/bin" which is not on PATH. 
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. ``` @@ -143,7 +163,7 @@ To check whether this warning is a problem, use `which` to search for the remote$ which amdahl ``` -If the command returns no output, displaying a new prompt, it means the file `amdahl` has not been found. +If the command returns no output, displaying a new prompt, it means the file `amdahl` has not been found. You must update the environment variable named `PATH` to include the missing folder. Edit your shell configuration file as follows, then log off the cluster and back on again so it takes effect. @@ -151,7 +171,8 @@ Edit your shell configuration file as follows, then log off the cluster and back remote$ nano ~/.bashrc remote$ tail ~/.bashrc ``` -``` + +```text export PATH=${PATH}:${HOME}/.local/bin ``` @@ -159,9 +180,7 @@ After logging back in to the cluster, `which` should be able to find `amdahl` wi If you had to load a Python module, load it again. :::: - - -## Help! +## Help Many command-line programs include a "help" message. Try it with `amdahl`: @@ -181,7 +200,7 @@ optional arguments: Random jitter: a float between -1 and +1 ``` -This message doesn't tell us much about what the program _does_, +This message doesn't tell us much about what the program _does_, but it does tell us the important flags we might want to use when launching it. ## Running the Job on a Compute Node @@ -192,6 +211,7 @@ Create a submission file, requesting one task on a single node, then launch it. 
remote$ nano serial-job.sh remote$ cat serial-job.sh ``` + ```bash #!/bin/bash #SBATCH -J solo-job @@ -205,27 +225,36 @@ module load python # Execute the task amdahl ``` + :::::callout{variant="warning"} + ## Dependency Variants + If you weren't able to just use `pip install` to handle all of `amdahl`'s dependencies, you'll need to do something different in the submission script: ::::callout + ### Using a Module + ```bash # Load the computing environment we need module load python/3.11 module load mpi4py ``` + :::: ::::callout + ### Using Anaconda + ```bash # Load the computing environment we need module load conda conda init conda activate amdahl ``` + :::: ::::: @@ -240,23 +269,26 @@ remote$ squeue -u yourUsername ``` :::::challenge{id=read-output, title="Read the Job Output"} -Use `ls` to locate the output file. The `-t` flag sorts in reverse-chronological order: newest first. +Use `ls` to locate the output file. The `-t` flag sorts in reverse-chronological order: newest first. What was the output? ::::solution -The cluster output should be written to a file in the folder you launched the job from. +The cluster output should be written to a file in the folder you launched the job from. For example, ```bash remote$ ls -t ``` -``` + +```text slurm-347087.out serial-job.sh amdahl README.md LICENSE.txt ``` + ```bash remote$ cat slurm-347087.out ``` -``` + +```text Doing 30.000 seconds of 'work' on 1 processor, which should take 30.000 seconds with 0.850 parallel proportion of the workload. @@ -265,16 +297,17 @@ which should take 30.000 seconds with 0.850 parallel proportion of the workload. Total execution time (according to rank 0): 30.033 seconds ``` + :::: ::::: As we saw before, two of the `amdahl` program flags set the amount of work and the proportion of that work that is parallel in nature. -Based on the output, we can see that the code uses a default of 30 seconds of work that is 85% parallel. 
-The program ran for just over 30 seconds in total, and if we run the numbers, +Based on the output, we can see that the code uses a default of 30 seconds of work that is 85% parallel. +The program ran for just over 30 seconds in total, and if we run the numbers, it is true that 15% of it was marked 'serial' and 85% was 'parallel'. -Since we only gave the job one CPU, this job wasn't really parallel: -the same processor performed the 'serial' work for 4.5 seconds, then the 'parallel' part for 25.5 seconds, and no time was saved. +Since we only gave the job one CPU, this job wasn't really parallel: +the same processor performed the 'serial' work for 4.5 seconds, then the 'parallel' part for 25.5 seconds, and no time was saved. The cluster can do better, if we ask. ## Running the Parallel Job @@ -282,6 +315,7 @@ The cluster can do better, if we ask. The `amdahl` program uses the Message Passing Interface (MPI) for parallelism -- this is a common tool on HPC systems. ::::callout{variant="note"} + ## What is MPI? The Message Passing Interface is a set of tools which allow multiple tasks @@ -304,6 +338,7 @@ you need to use), which will ensure that the appropriate run-time support for parallelism is included. ::::callout{variant="note"} + ## MPI Runtime Arguments On their own, commands such as `mpiexec` can take many arguments specifying @@ -322,6 +357,7 @@ remote$ cp serial-job.sh parallel-job.sh remote$ nano parallel-job.sh remote$ cat parallel-job.sh ``` + ```bash #!/bin/bash #SBATCH -J parallel-job @@ -334,20 +370,25 @@ module load python/3.11 # Execute the task mpiexec amdahl ``` + :::::callout{variant="warning"} + ## Dependency Variants: Using Anaconda + If you had to install your code using `conda`, as part of installing `mpi4py` it'll have built its own version of `mpiexec`. That will be in your `conda` environment directory. Try `which mpiexec` to see what your default version of `mpiexec` is. 
If the path it shows doesn't include `.conda/envs`, you'll have to call the correct version explicitly in your script: + ```bash #Execute the task ~/.conda/envs/amdahl/bin/mpiexec amdahl ``` + ::::: -Then submit your job. Note that the submission command has not really changed from how we submitted the serial job: +Then submit your job. Note that the submission command has not really changed from how we submitted the serial job: all the parallel settings are in the batch file rather than the command line. ```bash @@ -359,13 +400,16 @@ As before, use the status commands to check when your job runs. ```bash remote$ ls -t ``` -``` + +```text slurm-347178.out parallel-job.sh slurm-347087.out serial-job.sh amdahl README.md LICENSE.txt ``` + ```bash remote$ cat slurm-347178.out ``` -``` + +```text Doing 30.000 seconds of 'work' on 4 processors, which should take 10.875 seconds with 0.850 parallel proportion of the workload. @@ -397,7 +441,7 @@ parallel work. This sets a lower limit on the amount of time this job will take, no matter how many cores you throw at it. This is the basic principle behind [Amdahl's Law][amdahl], which is one way -of predicting improvements in execution time for a __fixed__ workload that +of predicting improvements in execution time for a **fixed** workload that can be subdivided and run in parallel to some extent. ::: :::: @@ -433,6 +477,7 @@ code gets. remote$ nano parallel-job.sh remote$ cat parallel-job.sh ``` + ```bash #!/bin/bash #SBATCH -J parallel-job @@ -447,11 +492,13 @@ mpiexec amdahl ``` ::::callout{variant="warning"} -## Dependency Variants + +### Dependency Variants + As before, you'll need to modify this script if you used a module or Anaconda. :::: -Then submit your job. +Then submit your job. Note that the submission command has not really changed from how we submitted the serial job: all the parallel settings are in the batch file rather than the command line. 
@@ -464,13 +511,16 @@ As before, use the status commands to check when your job runs. ```bash remote$ ls -t ``` -``` + +```text slurm-347271.out parallel-job.sh slurm-347178.out slurm-347087.out serial-job.sh amdahl README.md LICENSE.txt ``` + ```bash remote$ cat slurm-347271.out ``` -``` + +```text which should take 7.688 seconds with 0.850 parallel proportion of the workload. Hello, World! I am process 4 of 8 on <>. I will do parallel 'work' for 3.188 seconds. @@ -487,8 +537,10 @@ Total execution time (according to rank 0): 7.697 seconds ``` :::callout{variant="note"} + ## Non-Linear Output -When we ran the job with 4 parallel workers, the serial job wrote its output first, + +When we ran the job with 4 parallel workers, the serial job wrote its output first, then the parallel processes wrote their output, with process 0 coming in first and last. With 8 workers, this is not the case: since the parallel workers take less time than the serial work,
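The claim that 8× speedup is impossible here can be checked with a one-liner. A sketch, again assuming the 0.850 parallel proportion: the speedup on `n` tasks is `S(n) = 1 / ((1 - p) + p / n)`, and it can never exceed `1 / (1 - p)`, the ceiling set by the serial fraction:

```bash
# Speedup under Amdahl's Law for p = 0.85: S(n) = 1 / ((1 - p) + p / n).
# As n grows, S(n) approaches 1 / (1 - p), the hard ceiling set by the serial part.
awk 'BEGIN {
  p = 0.85
  for (n = 8; n <= 512; n *= 4)
    printf "%4d tasks: %.2fx speedup\n", n, 1 / ((1 - p) + p / n)
  printf "   limit: %.2fx speedup\n", 1 / (1 - p)
}'
```

The ceiling works out to about 6.67×, so no number of processors will ever deliver an 8× speedup for this workload.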
[amdahl]: https://en.wikipedia.org/wiki/Amdahl's_law -[cmd-line]: https://swcarpentry.github.io/python-novice-inflammation/12-cmdline/index.html -[inflammation]: https://swcarpentry.github.io/python-novice-inflammation/ -[np-dtype]: https://numpy.org/doc/stable/reference/generated/numpy.dtype.html -[parallel-novice]: http://www.hpc-carpentry.org/hpc-parallel-novice/ -[python-func]: https://swcarpentry.github.io/python-novice-inflammation/08-func/index.html -[units]: https://en.wikipedia.org/wiki/Byte#Multiple-byte_units diff --git a/high_performance_computing/hpc_intro/08_resources.md b/high_performance_computing/hpc_intro/08_resources.md index 03b2d829..2d5d24d2 100644 --- a/high_performance_computing/hpc_intro/08_resources.md +++ b/high_performance_computing/hpc_intro/08_resources.md @@ -4,6 +4,9 @@ dependsOn: [ high_performance_computing.hpc_intro.07_parallel ] tags: [slurm] +learningOutcomes: + - Look up job statistics. + - Make more accurate resource requests in job scripts based on data describing past performance. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -26,6 +29,7 @@ documentation or user testimonials provide some idea, we won't know how much memory or compute time a program will need. ::::callout + ## Read the Documentation Most HPC facilities maintain documentation as a wiki, a website, or a @@ -54,7 +58,8 @@ use `sacct -u yourUsername` to get statistics about `parallel-job.sh`. ```bash remote$ sacct -u yourUsername ``` -``` + +```text JobID JobName Partition Account AllocCPUS State ExitCode ------------ ---------- ---------- ---------- ---------- ---------- -------- 7 file.sh cpubase_b+ def-spons+ 1 COMPLETED 0:0 @@ -87,6 +92,7 @@ remote$ sacct -u yourUsername -l -j 347087 | less -S ``` ::::callout + ## Discussion This view can help compare the amount of time requested and actually @@ -109,7 +115,6 @@ actually finish. 
Specifying the expected runtime in the submission script more accurately will help alleviate cluster congestion and may get your job dispatched earlier. - ::::challenge{id=time-estimate, title="Narrow the Time Estimate"} Edit `parallel_job.sh` to set a better time estimate. How close can you get? @@ -121,5 +126,6 @@ The following line tells Slurm that our job should finish within 2 minutes: ```bash #SBATCH -t 00:02:00 ``` + ::: :::: diff --git a/high_performance_computing/hpc_intro/09_responsibility.md b/high_performance_computing/hpc_intro/09_responsibility.md index 35d8373f..8413834d 100644 --- a/high_performance_computing/hpc_intro/09_responsibility.md +++ b/high_performance_computing/hpc_intro/09_responsibility.md @@ -4,6 +4,12 @@ dependsOn: [ high_performance_computing.hpc_intro.08_resources ] tags: [slurm] +learningOutcomes: + - Describe how the actions of a single user can affect the experience of others on a shared system. + - Discuss the behaviour of a considerate shared system citizen. + - Explain the importance of backing up critical data. + - Describe the challenges with transferring large amounts of data off HPC systems. + - Convert many files to a single archive file using tar. attribution: - citation: > "Introduction to High-Performance Computing" course by the HPC-carpentries @@ -40,6 +46,7 @@ run on the head node is a quick and reliable way to discover and fix these issues. ::::callout + ## Login Nodes Are a Shared Resource Remember, the login node is shared with all other users and your actions @@ -55,6 +62,7 @@ You can always use the commands `top` and `ps ux` to list the processes that are running on the login node along with the amount of CPU and memory they are using. If this check reveals that the login node is somewhat idle, you can safely use it for your non-routine processing task. If something goes wrong + - the process takes too long, or doesn't respond - you can use the `kill` command along with the _PID_ to terminate the process. 
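That clean-up workflow can be sketched end to end; the `sleep 300` below is just a hypothetical stand-in for a task that has run away on the login node:

```bash
# Start a dummy long-running task in the background (stand-in for a stuck process)
sleep 300 &
pid=$!                        # the shell records the PID of the last background task

# Check it is running; the bracketed pattern stops grep matching its own process
ps ux | grep "[s]leep 300"

# Terminate it by PID; add -9 only if it ignores the default TERM signal
kill "$pid"
```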
@@ -103,6 +111,7 @@ Most systems provide dedicated resources for testing that have short wait times to help you avoid this issue. ::::callout + ## Test Job Submission Scripts That Use Large Amounts of Resources Before submitting a large run of jobs, submit one as a test first to make @@ -149,6 +158,7 @@ provide useful guidance on your options for data transfer for the volumes of data you will be using. ::::callout + ## Your Data Is Your Responsibility Make sure you understand what the backup policy is on the file systems on the @@ -182,12 +192,12 @@ files, rather than the converse. Some of the key components and their associated issues are: -* __Disk speed__: File systems on HPC systems are often highly parallel, +- __Disk speed__: File systems on HPC systems are often highly parallel, consisting of a very large number of high performance disk drives. This allows them to support a very high data bandwidth. Unless the remote system has a similar parallel file system you may find your transfer speed limited by disk performance at that end. -* __Meta-data performance__: _Meta-data operations_ such as opening and closing +- __Meta-data performance__: _Meta-data operations_ such as opening and closing files or listing the owner or size of a file are much less parallel than read/write operations. If your data consists of a very large number of small files you may find your transfer speed is limited by meta-data operations. @@ -195,11 +205,11 @@ Some of the key components and their associated issues are: strongly with those you perform so reducing the number of such operations you use (by combining multiple files into a single file) may reduce variability in your transfer rates and increase transfer speeds. -* __Network speed__: Data transfer performance can be limited by network speed. +- __Network speed__: Data transfer performance can be limited by network speed. More importantly it is limited by the slowest section of the network between source and destination. 
If you are transferring to your laptop/workstation, this is likely to be its connection (either via LAN or WiFi). -* __Firewall speed__: Most modern networks are protected by some form of +- __Firewall speed__: Most modern networks are protected by some form of firewall that filters out malicious traffic. This filtering has some overhead and can result in a reduction in data transfer performance. The needs of a general purpose network that hosts email/web-servers and desktop machines are @@ -217,11 +227,11 @@ be created using tools like `tar` and `zip`. We have already met `tar` when we talked about data transfer earlier. ![Schematic of network bandwidth](fig/responsibility-bandwidth.svg) -*Schematic diagram of bandwidth and latency for disk and network +_Schematic diagram of bandwidth and latency for disk and network I/O. Each of the components on the figure is connected by a blue line of width proportional to the interface bandwidth. The small mazes at the link points illustrate the latency of the link, with more tortuous -mazes indicating higher latency.* +mazes indicating higher latency._ :::::challenge{id=transfer-method, title="Consider the Best Way to Transfer Data"} @@ -237,22 +247,27 @@ best way to transfer them to `cluster.name`? 1. ```bash local$ scp -r data user@cluster.name:~/ ``` + 2. ```bash local$ rsync -ra data user@cluster.name:~/ ``` + 3. ```bash local$ rsync -raz data user@cluster.name:~/ ``` + 4. ```bash local$ tar -cvf data.tar data local$ rsync -raz data.tar user@cluster.name:~/ ``` + 5. ```bash local$ tar -cvzf data.tar.gz data local$ rsync -ra data.tar.gz user@cluster.name:~/ ``` ::::solution + 1. `scp` will recursively copy the directory. This works, but without compression. 2. `rsync -ra` works like `scp -r`, but preserves file information like @@ -268,5 +283,6 @@ best way to transfer them to `cluster.name`? transfer it. 
This may perform similarly to #4, but in most cases (for large datasets), it's the best combination of high throughput and low latency (making the most of your time and network connection). + :::: -::::: \ No newline at end of file +::::: diff --git a/high_performance_computing/hpc_mpi/02_mpi_api.md b/high_performance_computing/hpc_mpi/02_mpi_api.md index c599f16f..d8fdfd3a 100644 --- a/high_performance_computing/hpc_mpi/02_mpi_api.md +++ b/high_performance_computing/hpc_mpi/02_mpi_api.md @@ -24,16 +24,19 @@ MPI stands for ***Message Passing Interface*** and was developed in the early 19 To address this challenge, researchers and computer scientists from leading vendors and organizations, including Intel, IBM, and Argonne National Laboratory, collaborated to develop MPI. Their collective efforts resulted in the release of the first version of the MPI standard, MPI-1, in 1994. This standardisation initiative aimed to provide a unified communication protocol and library for parallel computing. ::::callout + ## MPI versions + Since its inception, MPI has undergone several revisions, each introducing new features and improvements: + - **MPI-1 (1994):** The initial release of the MPI standard provided a common set of functions, datatypes, and communication semantics. It formed the foundation for parallel programming using MPI. -- **MPI-2 (1997):** This version expanded upon MPI-1 by introducing additional features such as dynamic process management, one-sided communication, paralell I/O, C++ and Fortran 90 bindings. - MPI-2 improved the flexibility and capabilities of MPI programs. -- **MPI-3 (2012):** MPI-3 brought significant enhancements to the MPI standard, including support for non-blocking collectives, improved multithreading, and performance optimizations. - It also addressed limitations from previous versions and introduced fully compliant Fortran 2008 bindings. 
+- **MPI-2 (1997):** This version expanded upon MPI-1 by introducing additional features such as dynamic process management, one-sided communication, parallel I/O, C++ and Fortran 90 bindings. + MPI-2 improved the flexibility and capabilities of MPI programs. +- **MPI-3 (2012):** MPI-3 brought significant enhancements to the MPI standard, including support for non-blocking collectives, improved multithreading, and performance optimizations. + It also addressed limitations from previous versions and introduced fully compliant Fortran 2008 bindings. Moreover, MPI-3 completely removed the deprecated C++ bindings, which were initially marked as deprecated in MPI-2.2. -- **MPI-4.0 (2021):** On June 9, 2021, the MPI Forum approved MPI-4.0, the latest major release of the MPI standard. +- **MPI-4.0 (2021):** On June 9, 2021, the MPI Forum approved MPI-4.0, the latest major release of the MPI standard. MPI-4.0 brings significant updates and new features, including enhanced support for asynchronous progress, improved support for dynamic and adaptive applications, and better integration with external libraries and programming models. These revisions, along with subsequent updates and errata, have refined the MPI standard, making it more robust, versatile, and efficient. @@ -46,6 +49,7 @@ The key concept in MPI is **message passing**, which involves the explicit excha Processes can send messages to specific destinations, broadcast messages to all processes, or perform collective operations where all processes participate. This message passing and coordination among parallel processes are facilitated through a set of fundamental functions provided by the MPI standard. Typically, their names start with `MPI_` and followed by a specific function or datatype identifier. Here are some examples: + - **MPI_Init:** Initializes the MPI execution environment. - **MPI_Finalize:** Finalises the MPI execution environment.
- **MPI_Comm_rank:** Retrieves the rank of the calling process within a communicator. @@ -66,51 +70,58 @@ In general, an MPI program follows a basic outline that includes the following s ## Getting Started with MPI: MPI on HPC -As MPI codes allow you to run a code on multiple cores, we typically develop them to run on large systems like HPC clusters. +As MPI codes allow you to run a code on multiple cores, we typically develop them to run on large systems like HPC clusters. These are usually configured with versions of OpenMPI that have been optimised for the specific hardware involved, for maximum performance. For this episode, log into whichever HPC system you have access to - this could be a group server, or university- or national-level cluster (e.g. Iridis or DiRAC). HPC clusters typically have **more than one version** of MPI available, so you may need to tell it which one you want to use before it will give you access to it. -First check the available MPI implementations/modules on the cluster using the command below: +First check the available MPI implementations/modules on the cluster using the command below: ```bash module avail ``` This will display a list of available modules, including MPI implementations. -As for the next step, you should choose the appropriate MPI implementation/module from the list based on your requirements and load it using `module load `. +As for the next step, you should choose the appropriate MPI implementation/module from the list based on your requirements and load it using `module load `. For example, if you want to load OpenMPI version 4.0.5, you can use: ```bash module load openmpi/4.0.5 ``` -This sets up the necessary environment variables and paths for the MPI implementation and will give you access to the MPI library. -If you are not sure which implementation/version of MPI you should use on a particular cluster, ask a helper or consult your HPC facility's documentation. 
+This sets up the necessary environment variables and paths for the MPI implementation and will give you access to the MPI library. +If you are not sure which implementation/version of MPI you should use on a particular cluster, ask a helper or consult your HPC facility's documentation. ::::callout + ## MPI Elsewhere + This episode assumes you will be using a HPC cluster, but you can also install OpenMPI on a desktop or laptop: -* **Linux:** Most distributions have OpenMPI available in their package manager: +- **Linux:** Most distributions have OpenMPI available in their package manager: + ```bash - $ sudo apt install openmpi-bin openmpi-dev + sudo apt install openmpi-bin openmpi-dev ``` -* **Mac:** The MacPorts and Homebrew package managers both have OpenMPI available: + +- **Mac:** The MacPorts and Homebrew package managers both have OpenMPI available: + ```bash $ brew install openmpi # or $ port install openmpi ``` -* **Windows:** Whilst you *can* build OpenMPI yourself on Windows, it's generally easier to use the **Windows Subsystem for Linux**. + +- **Windows:** Whilst you *can* build OpenMPI yourself on Windows, it's generally easier to use the **Windows Subsystem for Linux**. This can be useful for when you're writing code or testing it on a smaller scale, but you will need to check that you're installing a version of OpenMPI that's also available on whichever HPC cluster you're likely to scale up to. :::: - ::::callout + ## Developing on a cluster + HPC clusters don't usually have GUI-based IDEs installed on them. We can write code locally, and copy it across using `scp` or `rsync`, but most IDEs have the ability to open folders on a remote machine, or to automatically synchronise a local folder with a remote one. @@ -120,7 +131,6 @@ Some older Linux systems don't support it - in that case, try the [SSH FS](https Other IDEs like **CLion** also support [a variety of remote development methods](https://www.jetbrains.com/help/clion/remote-development.html). 
:::: - ## Running a code with MPI Let's start with a simple C code that prints "Hello World!" to the console. @@ -142,30 +152,33 @@ Therefore the below command generates an executable file named **hello_world** . ```bash mpicc -o hello_world hello_world.c ``` - + Now let's try the following command: + ```bash mpiexec -n 4 ./hello_world ``` ::::callout + ## What if `mpiexec` doesn't exist? + If `mpiexec` is not found, try `mpirun` instead. This is another common name for the command. -When launching MPI applications and managing parallel processes, we often rely on commands like `mpiexec` or `mpirun`. -Both commands act as wrappers or launchers for MPI applications, allowing us to initiate and manage the execution of multiple parallel processes across nodes in a cluster. +When launching MPI applications and managing parallel processes, we often rely on commands like `mpiexec` or `mpirun`. +Both commands act as wrappers or launchers for MPI applications, allowing us to initiate and manage the execution of multiple parallel processes across nodes in a cluster. While the behavior and features of `mpiexec` and `mpirun` may vary depending on the MPI implementation being used (such as OpenMPI, MPICH, MS MPI, etc.), they are commonly used interchangeably and provide similar functionality. -It is important to note that `mpiexec` is defined as part of the MPI standard, whereas `mpirun` is not. -While some MPI implementations may use one name or the other, or even provide both as aliases for the same functionality, `mpiexec` is generally considered the preferred command. -Although the MPI standard does not explicitly require MPI implementations to include `mpiexec`, it does provide guidelines for its implementation. -In contrast, the availability and options of `mpirun` can vary between different MPI implementations. +It is important to note that `mpiexec` is defined as part of the MPI standard, whereas `mpirun` is not. 
+While some MPI implementations may use one name or the other, or even provide both as aliases for the same functionality, `mpiexec` is generally considered the preferred command. +Although the MPI standard does not explicitly require MPI implementations to include `mpiexec`, it does provide guidelines for its implementation. +In contrast, the availability and options of `mpirun` can vary between different MPI implementations. To ensure compatibility and adherence to the MPI standard, it is recommended to primarily use `mpiexec` as the command for launching MPI applications and managing parallel execution. :::: The expected output would be as follows: -```` +````text Hello World! Hello World! Hello World! @@ -173,8 +186,10 @@ Hello World! ```` ::::callout + ## What did `mpiexec` do? -Just running a program with `mpiexec` creates several instances of our application. + +Just running a program with `mpiexec` creates several instances of our application. The number of instances is determined by the `-n` parameter, which specifies the desired number of processes. These instances are independent and execute different parts of the program simultaneously. Behind the scenes, `mpiexec` undertakes several critical tasks. It sets up communication between the processes, enabling them to exchange data and synchronize their actions. @@ -190,12 +205,12 @@ As we've just learned, running a program with `mpiexec` or `mpirun` results in t mpirun -n 4 ./hello_world ``` -However, in the example above, the program does not know it was started by `mpirun`, and each copy just works as if they were the only one. -For the copies to work together, they need to know about their role in the computation, in order to properly take advantage of parallelisation. +However, in the example above, the program does not know it was started by `mpirun`, and each copy just works as if they were the only one. 
+For the copies to work together, they need to know about their role in the computation, in order to properly take advantage of parallelisation. This usually also requires knowing the total number of tasks running at the same time. - The program needs to call the `MPI_Init` function. -- `MPI_Init` sets up the environment for MPI, and assigns a number (called the _rank_) to each process. +- `MPI_Init` sets up the environment for MPI, and assigns a number (called the *rank*) to each process. - At the end, each process should also cleanup by calling `MPI_Finalize`. ```c @@ -215,7 +230,7 @@ int MPI_Comm_size(MPI_COMM_WORLD, &num_ranks); int MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); ``` -Here, `MPI_COMM_WORLD` is a **communicator**, which is a collection of ranks that are able to exchange data between one another. +Here, `MPI_COMM_WORLD` is a **communicator**, which is a collection of ranks that are able to exchange data between one another. We'll look into these in a bit more detail in the next episode, but essentially we use `MPI_COMM_WORLD` which is the default communicator which refers to all ranks. Here's a more complete example: @@ -242,10 +257,11 @@ int main(int argc, char** argv) { ``` :::::challenge{id=compile-run, title="Compile and Run"} -Compile the above C code with `mpicc`, then run the code with `mpirun`. +Compile the above C code with `mpicc`, then run the code with `mpirun`. You may find the output for each rank is returned out of order. Why is this? ::::solution + ```bash mpicc mpi_rank.c -o mpi_rank mpirun -n 4 mpi_rank @@ -253,14 +269,14 @@ mpirun -n 4 mpi_rank You should see something like (although the ordering may be different): -``` +```text My rank number is 1 My rank number is 2 My rank number is 0 My rank number is 3 ``` -The reason why the results are not returned in order is because the order in which the processes run is arbitrary. 
+The results are not returned in order because the order in which the processes run is arbitrary. As we'll see later, there are ways to synchronise processes to obtain a desired ordering!
::::
:::::
@@ -277,7 +293,7 @@ rank `my_rank`, and the total workload (i.e. 100,000 iterations), and using the
evenly across our ranks.
Therefore, given 4 CPUs, for each rank the work would be divided into 25,000 iterations per CPU, as:

-```
+```text
Rank 1: 1 - 25,000
Rank 2: 25,001 - 50,000
Rank 3: 50,001 - 75,000
@@ -285,6 +301,7 @@ Rank 4: 75,001 - 100,000
```

We can work out the iterations to undertake for a given rank number as follows:
+
- Work out the number of iterations per rank by dividing the total number of iterations we want to calculate by `num_ranks`
- Determine the start of the work iterations for this rank by multiplying our rank number by the iterations per rank
- Determine the end of the work iterations for this rank by working out the hypothetical start of the next rank and deducting 1
@@ -298,7 +315,7 @@ int rank_start = my_rank * iterations_per_rank;
int rank_end = ((my_rank + 1) * iterations_per_rank) - 1;
```

-We also need to cater for the case where work may not be distributed evenly across ranks, where the total work isn't directly divisible by the number of CPUs.
+We also need to cater for the case where work may not be distributed evenly across ranks, where the total work isn't directly divisible by the number of CPUs.
In that case, we adjust the last rank's end of iterations to be the total number of iterations.
This ensures the entire desired workload is calculated:
@@ -341,12 +358,14 @@ mpicc -o count_primes count_primes.c
mpiexec -n 2 count_primes
```

-Of course, this solution only goes so far.
-We can add the resulting counts from each rank together to get our final number of primes between 0 and 100,000, but what would be useful would be to have our code somehow retrieve the results from each rank and add them together, and output that overall result.
-More generally, ranks may need results from other ranks to complete their own computations.
+Of course, this solution only goes so far.
+We can add the resulting counts from each rank together to get our final number of primes between 0 and 100,000, but it would be more useful if our code could retrieve the results from each rank, add them together, and output the overall result.
+More generally, ranks may need results from other ranks to complete their own computations.
For this we would need ways for ranks to communicate - the primary benefit of MPI - which we'll look at in subsequent episodes.

::::callout
+
## What About Python?
+
In [MPI for Python (mpi4py)](https://mpi4py.readthedocs.io/en/stable/), the initialization and finalization of MPI are handled by the library, and the user can perform MPI calls after ``from mpi4py import MPI``.
::::
diff --git a/high_performance_computing/hpc_mpi/03_communicating_data.md b/high_performance_computing/hpc_mpi/03_communicating_data.md
index c09294b8..3643f7a6 100644
--- a/high_performance_computing/hpc_mpi/03_communicating_data.md
+++ b/high_performance_computing/hpc_mpi/03_communicating_data.md
@@ -18,12 +18,11 @@ learningOutcomes:
 - List the basic MPI data types.
 ---

-In previous episodes we've seen that when we run an MPI application, multiple *independent* processes are created which do their own work, on their own data, in their own private memory space.
-At some point in our program, one rank will probably need to know about the data another rank has, such as when combining a problem back together which was split across ranks.
-Since each rank's data is private to itself, we can't just access another rank's memory and get what we need from there.
-Since each rank's data is private to itself, we can't just access another rank's memory and get what we need from there. +In previous episodes we've seen that when we run an MPI application, multiple *independent* processes are created which do their own work, on their own data, in their own private memory space. +At some point in our program, one rank will probably need to know about the data another rank has, such as when combining a problem back together which was split across ranks. +Since each rank's data is private to itself, we can't just access another rank's memory and get what we need from there. We have to instead explicitly *communicate* data between ranks. Sending and receiving data between ranks form some of the most basic building blocks in any MPI application, and the success of your parallelisation often relies on how you communicate data. - ## Communicating data using messages MPI is a standardised framework for passing data and other messages between independently running processes. @@ -42,10 +41,11 @@ Often we won't notice this overhead, as it is quite small. But if we communicate large amounts data or too often, those small overheads can rapidly add up into a noticeable performance hit. ::::callout + ## Common mistakes -A common mistake for new MPI users is to write code using point-to-point communication which emulates what the collective communication functions are designed to do. -This is an inefficient way to share data. +A common mistake for new MPI users is to write code using point-to-point communication which emulates what the collective communication functions are designed to do. +This is an inefficient way to share data. The collective routines in MPI have multiple tricks and optimizations up their sleeves, resulting in communication overheads much lower than the equivalent point-to-point approach. One other advantage is that collective communication often requires less code to achieve the same thing, which is always a win. 
It is therefore almost always better to use collective operations where you can.
@@ -58,7 +58,7 @@ To receive the data, rank B must call a data receiving function which will liste
When the message has been successfully routed and the data transfer is complete, rank B sends an acknowledgement back to rank A to say that the transfer has finished, similarly to how read receipts work in e-mails and instant messages.

:::::challenge{id=check-understanding, title="Check Your Understanding"}
-In an imaginary simulation, each rank is responsible for calculating the physical properties for a subset of cells on a larger simulation grid.
+In an imaginary simulation, each rank is responsible for calculating the physical properties for a subset of cells on a larger simulation grid.
Another calculation, however, needs to know the average of, for example, the temperature for the subset of cells for each rank.
What approaches could you use to share this data?

::::solution
@@ -67,7 +67,6 @@ You can, of course, also use a point-to-point pattern, but it would be less effi
::::
:::::

-
### Communication modes

There are multiple "modes" on how data is sent in MPI: standard, buffered, synchronous and ready.
@@ -92,29 +91,30 @@ In contrast to the four modes for sending data, receiving data only has one mode
| - | - | - |
| Receive | Returns control when data has been received successfully | `MPI_Recv()` |

-
### Blocking vs. non-blocking communication

Communication can also be done in two additional ways: blocking and non-blocking.

In blocking mode, communication functions will only return once the send buffer is ready to be re-used, meaning that the message has been both sent and received.
-In terms of a blocking synchronous send, control will not be passed back to the program until the message sent by rank A has reached rank B, and rank B has sent an acknowledgement back.
+In terms of a blocking synchronous send, control will not be passed back to the program until the message sent by rank A has reached rank B, and rank B has sent an acknowledgement back.
If rank B is never listening for messages, rank A will become *deadlocked*.

A deadlock happens when your program hangs indefinitely because the send (or receive) is unable to complete.
Deadlocks can occur for countless reasons.
-For example, we may forget to write the corresponding receive function when sending data.
+For example, we may forget to write the corresponding receive function when sending data.
Alternatively, a function may return earlier due to an error which isn't handled properly, or a while condition may never be met, creating an infinite loop.
Furthermore, ranks can sometimes crash silently, making communication with them impossible, but this doesn't stop any attempts to send data to a crashed rank.

::::callout
+
## Avoiding communication deadlocks
+
A common piece of advice in C is that when allocating memory using `malloc()`, always write the accompanying call to `free()` to help avoid memory leaks by forgetting to deallocate the memory later.
You can apply the same mantra to communication in MPI.
When you send data, always write the code to receive the data as you may forget to later and accidentally cause a deadlock.
::::

-Blocking communication works best when the work is balanced across ranks, so that each rank has an equal amount of things to do.
+Blocking communication works best when the work is balanced across ranks, so that each rank has an equal amount of things to do.
A common pattern in scientific computing is to split a calculation across a grid and then to share the results between all ranks before moving onto the next calculation.
If the workload is well balanced, each rank will finish at roughly the same time and be ready to transfer data at the same time.
+But, as shown in the diagram below, if the workload is unbalanced, some ranks will finish their calculations earlier and begin to send their data to the other ranks before they are ready to receive data.
This means some ranks will be sitting around doing nothing whilst they wait for the other ranks to become ready to receive data, wasting computation time.

![Blocking communication](fig/blocking-wait.png)
@@ -122,19 +122,21 @@ This means some ranks will be sitting around doing nothing whilst they wait for
If most of the ranks are waiting around, or one rank is very heavily loaded in comparison, this could massively impact the performance of your program.
Instead of doing calculations, a rank will be waiting for other ranks to complete their work.

-Non-blocking communication hands back control, immediately, before the communication has finished.
-Instead of your program being *blocked* by communication, ranks will immediately go back to the heavy work and instead periodically check if there is data to receive (which you must remember to program) instead of waiting around.
+Non-blocking communication hands back control immediately, before the communication has finished.
+Instead of your program being *blocked* by communication, ranks will immediately go back to the heavy work, periodically checking if there is data to receive (which you must remember to program) rather than waiting around.
The advantage of this communication pattern is illustrated in the diagram below, where less time is spent communicating.

![Non-blocking communication](fig/non-blocking-wait.png)

-This is a common pattern where communication and calculations are interwoven with one another, decreasing the amount of "dead time" where ranks are waiting for other ranks to communicate data.
+This is a common pattern where communication and calculations are interwoven with one another, decreasing the amount of "dead time" where ranks are waiting for other ranks to communicate data.
Unfortunately, non-blocking communication is often more difficult to successfully implement and isn't appropriate for every algorithm.
In most cases, blocking communication is usually easier to implement and to conceptually understand, and is somewhat "safer" in the sense that the program cannot continue if data is missing.
However, the potential performance improvements of overlapping communication and calculation are often worth the more difficult implementation and harder to read/more complex code.

::::callout
+
## Should I use blocking or non-blocking communication?
+
When you are first implementing communication into your program, it's advisable to use blocking synchronous sends to start with, as this is arguably the easiest to use pattern.
Once you are happy that the correct data is being communicated successfully, but you are unhappy with performance, then it would be time to start experimenting with the different communication modes and blocking vs. non-blocking patterns to balance performance with ease of use and code readability and maintainability.
::::
@@ -155,13 +157,12 @@ We can periodically check our e-mail for the response, and either keep doing oth
::::
:::::

-
## Communicators

Communication in MPI happens in something known as a *communicator*.
We can think of a communicator as fundamentally being a collection of ranks which are able to exchange data with one another.
What this means is that every communication between two (or more) ranks is linked to a specific communicator in the program.
-When we run an MPI application, the ranks will belong to the default communicator known as `MPI_COMM_WORLD`.
+When we run an MPI application, the ranks will belong to the default communicator known as `MPI_COMM_WORLD`.
We've seen this in earlier episodes when, for example, we've used functions like `MPI_Comm_rank()` to get the rank number, ```c @@ -172,13 +173,12 @@ MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); /* MPI_COMM_WORLD is the communicator In addition to `MPI_COMM_WORLD`, we can make sub-communicators and distribute ranks into them. Messages can only be sent and received to and from the same communicator, effectively isolating messages to a communicator. For most applications, we usually don't need anything other than `MPI_COMM_WORLD`. -But organising ranks into communicators can be helpful in some circumstances, as you can create small "work units" of multiple ranks to dynamically schedule the workload, or to help compartmentalise the problem into smaller chunks by using a [virtual cartesian topology](https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node192.htm#Node192). +But organising ranks into communicators can be helpful in some circumstances, as you can create small "work units" of multiple ranks to dynamically schedule the workload, or to help compartmentalise the problem into smaller chunks by using a [virtual cartesian topology](https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node192.htm#Node192). Throughout this lesson, we will stick to using `MPI_COMM_WORLD`. - ## Basic MPI data types -To send a message, we need to know the size of it. +To send a message, we need to know the size of it. The size is not the number of bytes of data that is being sent but is instead expressed as the number of elements of a particular data type you want to send. So when we send a message, we have to tell MPI how many elements of "something" we are sending and what type of data it is. If we don't do this correctly, we'll either end up telling MPI to send only *some* of the data or try to send more data than we want! 
@@ -214,11 +214,13 @@ These constants don't expand out to actual date types, so we can't use them in v
MPI_INT my_int;
```

-is not valid code because under the hood, these constants are actually special structs used internally.
+is not valid code because under the hood, these constants are actually special structs used internally.
Therefore we can only use these expressions as arguments in MPI functions.

::::callout
+
## Don't forget to update your types
+
At some point during development, you might change an `int` to a `long` or a `float` to a `double`, or something to something else.
Once you've gone through your codebase and updated the types for, e.g., variable declarations and function signatures, you must also do the same for MPI functions.
If you don't, you'll end up running into communication errors.
@@ -231,6 +233,7 @@ It may be helpful to define compile-time constants for the data types and use th

/* use them as you would normally */
AGE_TYPE my_age = 25;
```
+
::::

Derived data types are data structures which you define, built using the basic MPI data types.
@@ -246,8 +249,10 @@ For the following pieces of data, what MPI data types should you use?
3. `a[] = "Hello, world!";`

::::solution
+
1. `MPI_INT`
2. `MPI_DOUBLE` - `MPI_FLOAT` would not be correct as `float`s contain 32 bits of data whereas `double`s are 64-bit.
3. `MPI_BYTE` or `MPI_CHAR` - you may want to use [strlen](https://man7.org/linux/man-pages/man3/strlen.3.html) to calculate how many elements of `MPI_CHAR` are being sent
+
::::
:::::
diff --git a/high_performance_computing/hpc_mpi/04_point_to_point_communication.md b/high_performance_computing/hpc_mpi/04_point_to_point_communication.md
index a76f8290..9177ee28 100644
--- a/high_performance_computing/hpc_mpi/04_point_to_point_communication.md
+++ b/high_performance_computing/hpc_mpi/04_point_to_point_communication.md
@@ -17,7 +17,7 @@ learningOutcomes:
---

In the previous episode we introduced the various types of communication in MPI.
-In this section we will use the MPI library functions `MPI_Send` and `MPI_Recv`, which employ point-to-point communication, to send data from one rank to another. +In this section we will use the MPI library functions `MPI_Send` and `MPI_Recv`, which employ point-to-point communication, to send data from one rank to another. ![Sending data from one rank to another using MPI_SSend and MPI_Recv](fig/send-recv.png) @@ -26,10 +26,10 @@ Let's look at how `MPI_Send` and `MPI_Recv`are typically used: - Rank A decides to send data to rank B. It first packs the data to send into a buffer, from which it will be taken. - Rank A then calls `MPI_Send` to create a message for rank B. The underlying MPI communication is then given the responsibility of routing the message to the correct destination. -- Rank B must know that it is about to receive a message and acknowledge this by calling `MPI_Recv`. +- Rank B must know that it is about to receive a message and acknowledge this by calling `MPI_Recv`. This sets up a buffer for writing the incoming data when it arrives and instructs the communication device to listen for the message. -As mentioned in the previous episode, `MPI_Send` and `MPI_Recv` are *synchronous* operations, +As mentioned in the previous episode, `MPI_Send` and `MPI_Recv` are *synchronous* operations, and will not return until the communication on both sides is complete. ## Sending a Message: MPI_Send @@ -68,21 +68,22 @@ So we are sending 14 elements of `MPI_CHAR` one time, and specified `0` for our This call is synchronous, and will block until the corresponding `MPI_Recv` operation receives and acknowledges receipt of the message. ::::callout + ## MPI_Ssend: an Alternative to MPI_Send + `MPI_Send` represents the "standard mode" of sending messages to other ranks, but some aspects of its behaviour are dependent on both the implementation of MPI being used, and the circumstances of its use. There are three scenarios to consider: 1. 
The message is directly passed to the receive buffer, in which case the communication has completed 2. The send message is buffered within some internal MPI buffer but hasn't yet been received 3. The function call waits for a corresponding receiving process - -In scenarios 1 & 2, the call is able to return immediately, but with 3 it may block until the recipient is ready to receive. + +In scenarios 1 & 2, the call is able to return immediately, but with 3 it may block until the recipient is ready to receive. It is dependent on the MPI implementation as to what scenario is selected, based on performance, memory, and other considerations. - + A very similar alternative to `MPI_Send` is to use `MPI_Ssend` - synchronous send - which ensures the communication is both synchronous and blocking. This function guarantees that when it returns, the destination has categorically started receiving the message. :::: - ## Receiving a Message: MPI_Recv Conversely, the `MPI_Recv` function looks like the following: @@ -164,16 +165,19 @@ int main(int argc, char** argv) { Compile and run the above code. Does it behave as you expect? ::::solution + ```bash mpicc mpi_hello_world.c -o mpi_hello_world mpirun -n 2 mpi_hello_world ``` + Note above that we specified only 2 ranks, since that's what the program requires (see line 12). You should see: -``` +```text Hello, world! ``` + :::: ::::: @@ -181,14 +185,16 @@ Hello, world! Try modifying, compiling, and re-running the code to see what happens if you... 1. Change the tag integer of the sent message. How could you resolve this where the message is received? -2. Modify the element count of the received message to be smaller than that of the sent message. +2. Modify the element count of the received message to be smaller than that of the sent message. How could you resolve this in how the message is sent? - + ::::solution -1. 
The program will hang since it's waiting for a message with a tag that will never be sent (press `Ctrl-C` to kill the hanging process).
To resolve this, make the tag in `MPI_Recv` match the tag you specified in `MPI_Send`.
2. You will likely see a message like the following:
-
-   ```
+
+   ```text
[...:220695] *** An error occurred in MPI_Recv
[...:220695] *** reported by process [2456485889,1]
[...:220695] *** on communicator MPI_COMM_WORLD
@@ -196,6 +202,7 @@ Try modifying, compiling, and re-running the code to see what happens if you...
[...:220695] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[...:220695] *** and potentially your MPI job)
```
+
You could resolve this by sending a message of equal size, truncating the message. A related question is whether this fix makes any sense!
::::
:::::
@@ -205,6 +212,7 @@ Change the above example so that it works with any number of ranks.
Pair even ranks with odd ranks and have each even rank send a message to the corresponding odd rank.

::::solution
+
```c
#include <stdio.h>
#include <mpi.h>
@@ -248,13 +256,14 @@ int main(int argc, char** argv) {
  return MPI_Finalize();
}
```
+
::::
:::::

:::::challenge{id=hello-again, title="Hello Again, World!"}
Modify the Hello World code below so that each rank sends its message to rank 0.
Have rank 0 print each message.
-
+
```c
#include <stdio.h>
#include <mpi.h>
@@ -274,7 +283,9 @@ int main(int argc, char** argv) {
  return MPI_Finalize();
}
```
+
::::solution
+
```c
#include <stdio.h>
#include <mpi.h>
@@ -311,14 +322,14 @@ int main(int argc, char** argv) {
  return MPI_Finalize();
}
```
+
::::
:::::

-
:::::challenge{id=blocking, title="Blocking"}
-Try the code below and see what happens. How would you change the code to fix the problem?
+Try the code below and see what happens. How would you change the code to fix the problem?
-_Note: If you are using the MPICH library, this example might automagically work. With OpenMPI it shouldn't!)_
+*Note: If you are using the MPICH library, this example might automagically work. With OpenMPI it shouldn't!*

```c
#include <stdio.h>
@@ -435,20 +446,22 @@ int main(int argc, char** argv) {
  return MPI_Finalize();
}
```
+
::::
:::::

:::::challenge{id=ping-pong, title="Ping Pong"}
Write a simplified simulation of Ping Pong according to the following rules:
-
-* Ranks 0 and 1 participate
-* Rank 0 starts with the ball
-* The rank with the ball sends it to the other rank
-* Both ranks count the number of times they get the ball
-* After counting to 1 million, the rank is bored and gives up
-* There are no misses or points
+
+- Ranks 0 and 1 participate
+- Rank 0 starts with the ball
+- The rank with the ball sends it to the other rank
+- Both ranks count the number of times they get the ball
+- After counting to 1 million, the rank is bored and gives up
+- There are no misses or points

::::solution
+
```c
#include <stdio.h>
#include <mpi.h>
@@ -502,5 +515,6 @@ int main(int argc, char** argv) {
  return MPI_Finalize();
}
```
+
::::
:::::
diff --git a/high_performance_computing/hpc_mpi/05_collective_communication.md b/high_performance_computing/hpc_mpi/05_collective_communication.md
index a6f12aaf..8531b9f7 100644
--- a/high_performance_computing/hpc_mpi/05_collective_communication.md
+++ b/high_performance_computing/hpc_mpi/05_collective_communication.md
@@ -16,9 +16,9 @@ learningOutcomes:
---

-The previous episode showed how to send data from one rank to another using point-to-point communication.
-If we wanted to send data from multiple ranks to a single rank to, for example, add up the value of a variable across multiple ranks, we have to manually loop over each rank to communicatethe data.
-This type of communication, where multiple ranks talk to one another known as called *collective communication*.
+The previous episode showed how to send data from one rank to another using point-to-point communication.
+If we wanted to send data from multiple ranks to a single rank to, for example, add up the value of a variable across multiple ranks, we would have to manually loop over each rank to communicate the data.
+This type of communication, where multiple ranks talk to one another, is known as *collective communication*.
In the code example below, point-to-point communication is used to calculate the sum of the rank numbers,

```c
@@ -49,12 +49,12 @@ if (my_rank == 0) {
printf("Rank %d has a sum of %d\n", my_rank, sum);
```

-For it's use case, the code above works perfectly fine.
-However, it isn't very efficient when you need to communicate large amounts of data, have lots of ranks, or when the workload is uneven (due to the blocking communication).
-It's also a lot of code to do not much, which makes it easy to introduce mistakes in our code.
+For its use case, the code above works perfectly fine.
+However, it isn't very efficient when you need to communicate large amounts of data, have lots of ranks, or when the workload is uneven (due to the blocking communication).
+It's also a lot of code that does very little, which makes it easy to introduce mistakes into our code.
A common mistake in this example would be to start the loop over ranks from 0, which would cause a deadlock!

-We don't need to write code like this (unless we want *complete* control over the data communication), because MPI has access to collective communication functions to abstract all of this code for us.
+We don't need to write code like this (unless we want *complete* control over the data communication), because MPI provides collective communication functions that abstract all of this code for us.
The above code can be replaced by a single collective communication function.
Collective operations are also implemented far more efficiently in the MPI library than we could ever write using point-to-point communications.
@@ -71,8 +71,8 @@ There are several collective operations that are implemented in the MPI standard

### Barrier

-The most simple form of collective communication is a barrier.
-Barriers are used to synchronise ranks by adding a point in a program where ranks *must* wait until all ranks have reached the same point.
+The simplest form of collective communication is a barrier.
+Barriers are used to synchronise ranks by adding a point in a program where ranks *must* wait until all ranks have reached the same point.
A barrier is a collective operation because all ranks need to communicate with one another to know when they can leave the barrier.

To create a barrier, we use the `MPI_Barrier()` function,

```c
int MPI_Barrier(
@@ -83,13 +83,13 @@ When a rank reaches a barrier, it will pause and wait for all the other ranks to catch up and reach the barrier as well.
-As ranks waiting at a barrier aren't doing anything, barriers should be used sparingly to avoid large synchronisation overheads, which affects the scalability of our program.
+As ranks waiting at a barrier aren't doing anything, barriers should be used sparingly to avoid large synchronisation overheads, which affects the scalability of our program.
We should also avoid using barriers in parts of our program that have complicated branches, as we may introduce a deadlock by having a barrier in only one branch.

In practice, there are not that many practical use cases for a barrier in an MPI application.
In a shared-memory environment, synchronisation is important to ensure consistent and controlled access to shared data.
-But in MPI, where each rank has its own private memory space and often resources, it's rare that we need to care about ranks becoming out-of-sync.
-However, one usecase is when multiple ranks need to write *sequentially* to the same file.
+But in MPI, where each rank has its own private memory space and often its own resources, it's rare that we need to care about ranks becoming out-of-sync.
+However, one use case is when multiple ranks need to write *sequentially* to the same file.
The code example below shows how you may handle this by using a barrier.

```c
@@ -105,8 +105,8 @@ for (int i = 0; i < num_ranks; ++i) {

### Broadcast

-We'll often find that we need to data from one rank to all the other ranks.
-One approach, which is not very efficient, is to use `MPI_Send()` in a loop to send the data from rank to rank one by one.
+We'll often find that we need to send data from one rank to all the other ranks.
+One approach, which is not very efficient, is to use `MPI_Send()` in a loop to send the data from rank to rank one by one.
A far more efficient approach is to use the collective function `MPI_Bcast()` to *broadcast* the data from a root rank to every other rank.

The `MPI_Bcast()` function has the following arguments,
@@ -127,7 +127,7 @@ The main functional difference is that `MPI_Bcast()` sends the data to all ranks

There are lots of use cases for broadcasting data.
One common case is when data is sent back to a "root" rank to process, which then broadcasts the results back out to all the other ranks.
-Another example, shown in the code exerpt below, is to read data in on the root rank and to broadcast it out.
+Another example, shown in the code excerpt below, is to read data in on the root rank and to broadcast it out.
This is a useful pattern on some systems where there are not enough resources (filesystem bandwidth, limited concurrent I/O operations) for every rank to read the file at once.

```c
@@ -148,6 +148,7 @@ MPI_Bcast(data_from_file, NUM_POINTS, MPI_INT, 0, MPI_COMM_WORLD);
Send a message from rank 0 saying "Hello from rank 0" to all ranks using `MPI_Bcast()`.
::::solution
+
```c
#include <stdio.h>
#include <mpi.h>
@@ -174,10 +175,10 @@ int main(int argc, char **argv) {
  return MPI_Finalize();
}
```
+
::::
:::::

-
### Scatter

One way to parallelise the processing of a large amount of data is to have ranks process a subset of the data.
@@ -202,7 +203,7 @@ int MPI_Scatter(
);
```

-The data to be *scattered* is split into even chunks of size `sendcount`.
+The data to be *scattered* is split into even chunks of size `sendcount`.
If `sendcount` is 2 and `sendtype` is `MPI_INT`, then each rank will receive two integers.
The values for `recvcount` and `recvtype` are the same as `sendcount` and `sendtype`.
If the total amount of data is not evenly divisible by the number of processes, `MPI_Scatter()` will not work.
@@ -247,7 +248,7 @@ int MPI_Gather(
);
```

-The receive buffer needs to be large enough to hold data data from all of the ranks. For example, if there are 4 ranks sending 10 integers, then `recvbuffer` needs to be able to store *at least* 40 integers.
+The receive buffer needs to be large enough to hold the data from all of the ranks. For example, if there are 4 ranks sending 10 integers, then `recvbuffer` needs to be able to store *at least* 40 integers.
We can think of `MPI_Gather()` as being the inverse of `MPI_Scatter()`.
This is shown in the diagram below, where data from each rank on the left is sent to the root rank (rank 0) on the right.
@@ -277,10 +278,11 @@ MPI_Gather(rank_data, NUM_DATA_POINTS, MPI_INT, gathered_data, NUM_DATA_POINTS, 

:::::challenge{id=gathering-greetings, title="Gathering Greetings"}
In the previous episode, we used point-to-point communication to send a greeting message to rank 0 from every other rank.
-Instead of using point-to-point communication functions, re-implement your solution using `MPI_Gather()` instead.
+Re-implement your solution using `MPI_Gather()` instead of the point-to-point communication functions.
You can use [this code](code/solutions/05-hello-gather-skeleton.c) as your starting point.
::::solution + ```c #include <stdio.h> #include <mpi.h> @@ -311,10 +313,10 @@ int main(int argc, char **argv) { return MPI_Finalize(); } ``` + :::: ::::: - ### Reduce A reduction operation is one which takes values from across the ranks and combines them into a single value. @@ -382,7 +384,7 @@ int MPI_Allreduce( ); ![Each rank sending a piece of data to root rank](fig/allreduce.png) `MPI_Allreduce()` performs the same operations as `MPI_Reduce()`, but the result is sent to all ranks rather than only being available on the root rank. -This means we can remove the `MPI_Bcast()` in the previous code example and remove almost all of the code in the reduction example using point-to-point communication at the beginning of the episode. +This means we can remove the `MPI_Bcast()` in the previous code example and remove almost all of the code in the reduction example using point-to-point communication at the beginning of the episode. This is shown in the following code example: ```c @@ -395,6 +397,7 @@ MPI_Allreduce(&my_rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD); ``` ::::callout + ## In-Place Operations In MPI, we can use in-place operations to eliminate the need for separate send and receive buffers in some collective operations. @@ -409,7 +412,7 @@ Not all collective operations support in-place operations, and the usage of `MPI :::: :::::challenge{id=reductions, title="Reductions"} -The following program creates an array called `vector` that contains a list of `n_numbers` on each rank. +The following program creates an array called `vector` that contains a list of `n_numbers` on each rank. The first rank contains the numbers from 1 to n_numbers, the second rank from n_numbers + 1 to 2 * n_numbers, and so on. It then calls the `find_max` and `find_sum` functions that should calculate the sum and maximum of the vector.
@@ -476,6 +479,7 @@ int main(int argc, char** argv) { ``` ::::solution + ```c // Calculate the sum of numbers in a vector double find_sum( double * vector, int N ){ @@ -510,11 +514,12 @@ double find_maximum( double * vector, int N ){ return global_max; } ``` + :::: ::::: - ::::callout + ## More collective operations are available The collective functions introduced in this episode do not represent an exhaustive list of *all* collective operations in MPI. diff --git a/high_performance_computing/hpc_mpi/06_non_blocking_communication.md b/high_performance_computing/hpc_mpi/06_non_blocking_communication.md index de0f18e6..bf8efdc4 100644 --- a/high_performance_computing/hpc_mpi/06_non_blocking_communication.md +++ b/high_performance_computing/hpc_mpi/06_non_blocking_communication.md @@ -28,7 +28,7 @@ Non-blocking communication is a communication mode, which allows ranks to contin When we use blocking communication, such as `MPI_Send()`, `MPI_Recv()`, `MPI_Reduce()`, etc., execution is passed from our program to MPI and is not passed back until the communication has finished. With non-blocking communication, the communication begins and control is passed back immediately. Whilst the data is transferred in the background, our application is free to do other work. -This ability to *overlap* computation and communication is absolutely critical for good performance for many HPC applications. +This ability to *overlap* computation and communication is absolutely critical for good performance in many HPC applications. The CPU is used very little when communicating data, so we are effectively wasting resources by not using them when we can. With good use of non-blocking communication, we can continue to use the CPU whilst communication happens and, at the same time, hide/reduce some of the communication overhead by overlapping communication and computation.
@@ -48,6 +48,7 @@ For example, if one rank depends on the data of another rank and there is no oth ![Non-blocking communication with data dependency](fig/non-blocking-wait-data.png) :::::challenge{id=advantages-and-disadvantages, title="Advantages and Disadvantages"} + ## Advantages and disadvantages What are the main advantages of using non-blocking communication, compared to blocking? What about any disadvantages? @@ -55,23 +56,23 @@ What are the main advantages of using non-blocking communication, compared to bl ::::solution Some of the advantages of non-blocking communication over blocking communication include: -- Non-blocking communication gives us the ability to interleave communication with computation. +- Non-blocking communication gives us the ability to interleave communication with computation. By being able to use the CPU whilst the network is transmitting data, we create algorithms with more efficient hardware usage. - Non-blocking communication also improves the scalability of our program, due to the smaller communication overhead and latencies associated with communicating between a large number of ranks. -- Non-blocking communication is more flexible, which allows for more sophisticated parallel and communication algorithms. +- Non-blocking communication is more flexible, which allows for more sophisticated parallel and communication algorithms. On the other hand, some disadvantages are: - It is more difficult to use non-blocking communication. Not only does it result in more, and more complex, lines of code, we also have to worry about rank synchronisation and data dependency. - Whilst typically using non-blocking communication, where appropriate, improves performance, it's not always clear cut or predictable if non-blocking will result in sufficient performance gains to justify the increased complexity. + :::: ::::: - ## Point-to-point communication -For each blocking communication function we've seen, a non-blocking variant exists.
+For each blocking communication function we've seen, a non-blocking variant exists. For example, if we take `MPI_Send()`, the non-blocking variant is `MPI_Isend()` which has the arguments: ```c @@ -90,9 +91,10 @@ The arguments are identical to `MPI_Send()`, other than the addition of the `*re This argument is known as a *handle* (because it "handles" a communication request) which is used to track the progress of a (non-blocking) communication. ::::callout + ## Naming conventions -Non-blocking functions have the same name as their blocking counterpart, but prefixed with "I". +Non-blocking functions have the same name as their blocking counterpart, but prefixed with "I". The "I" stands for "immediate", indicating that the function returns immediately and does not block the program whilst data is being communicated in the background. The table below shows some examples of blocking functions and their non-blocking counterparts. | Blocking | Non-blocking | @@ -100,9 +102,10 @@ The "I" stands for "immediate", indicating that the function returns immediately | `MPI_Bsend()` | `MPI_Ibsend()` | | `MPI_Barrier()` | `MPI_Ibarrier()` | | `MPI_Reduce()` | `MPI_Ireduce()` | + :::: -When we use non-blocking communication, we have to follow it up with `MPI_Wait()` to synchronise the program and make sure `*buf` is ready to be re-used. +When we use non-blocking communication, we have to follow it up with `MPI_Wait()` to synchronise the program and make sure `*buf` is ready to be re-used. This is incredibly important to do. Suppose we are sending an array of integers, @@ -123,7 +126,7 @@ int MPI_Wait( ); ``` -Once we have used `MPI_Wait()` and the communication has finished, we can safely modify `some_ints` again. +Once we have used `MPI_Wait()` and the communication has finished, we can safely modify `some_ints` again.
To receive the data sent using a non-blocking send, we can use either the blocking `MPI_Recv()` or its non-blocking variant: ```c @@ -143,7 +146,7 @@ Is the following statement true or false? Non-blocking communication guarantees immediate completion of data transfer. ::::solution -**False**. Just because the communication function has returned, does not mean the communication has finished and the communication buffer is ready to be re-used or read from. +**False**. Just because the communication function has returned does not mean the communication has finished and the communication buffer is ready to be re-used or read from. Before we access, or edit, any data which has been used in non-blocking communication, we always have to test/wait for the communication to finish using `MPI_Wait()` before it is safe to use it. :::: ::::: @@ -240,7 +243,7 @@ So even though ranks 0 and 1 both send, meaning there is no corresponding rec Thus a deadlock cannot happen. However, it is still possible to create a deadlock using `MPI_Wait()`. -If `MPI_Wait()` is waiting to for `MPI_Irecv()` to get some data, but there is no matching send operation (so no data has been sent), then `MPI_Wait()` can never return resulting in a deadlock. +If `MPI_Wait()` is waiting for `MPI_Irecv()` to get some data, but there is no matching send operation (so no data has been sent), then `MPI_Wait()` can never return, resulting in a deadlock. In the example code below, rank 0 becomes deadlocked. ```c @@ -257,6 +260,7 @@ if (my_rank == 0) { MPI_Wait(&send_req, &status); /* Wait for the send request to complete */ MPI_Wait(&recv_req, &status); /* Wait for the receive request to complete */ ``` + :::: ::::: @@ -307,13 +311,15 @@ if (!comm_completed) { ``` ::::callout + ## Dynamic task scheduling -Dynamic task schedulers are a class of algorithms designed to optimise the work load across ranks. +Dynamic task schedulers are a class of algorithms designed to optimise the workload across ranks.
The most efficient (and, really, only practical) implementations use non-blocking communication to periodically check the work balance and *asynchronously* assign and send additional work to a rank, in the background, as it continues to work on its current queue of work. :::: ::::callout + ## An interesting aside: communication timeouts Non-blocking communication gives us a lot of flexibility, letting us write complex communication algorithms to experiment and find the right solution. One example of that flexibility is using `MPI_Test()` to create a communication timeout algorithm. @@ -388,7 +394,7 @@ int main(int argc, char **argv) The output from your program should look something like this: -``` +```text Rank 0: message received -- Hello from rank 3! Rank 1: message received -- Hello from rank 0! Rank 2: message received -- Hello from rank 1! @@ -449,13 +455,14 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: ## Collective communication Since the release of MPI 3.0, the collective operations have non-blocking versions. -Using these non-blocking collective operations is as easy as we've seen for point-to-point communication in the last section. +Using these non-blocking collective operations is as easy as we've seen for point-to-point communication in the last section. If we want to do a non-blocking reduction, we'd use `MPI_Ireduce()`: ```c @@ -471,7 +478,7 @@ int MPI_Ireduce( ); ``` -As with `MPI_Send()` vs. `MPI_Isend()` the only change in using the non-blocking variant of `MPI_Reduce()` is the addition of the `*request` argument, which returns a request handle. +As with `MPI_Send()` vs. `MPI_Isend()`, the only change in using the non-blocking variant of `MPI_Reduce()` is the addition of the `*request` argument, which returns a request handle. This is the request handle we'll use with either `MPI_Wait()` or `MPI_Test()` to ensure that the communication has finished and been successful.
The code example below shows a non-blocking reduction: @@ -490,12 +497,12 @@ MPI_Wait(&request, &status); ``` :::::challenge{id=whats-ibarrier, title="What's `MPI_Ibarrier()` all about?"} -In the previous episode, we learnt that `MPI_Barrier()` is a collective operation we can use to bring all the ranks back into synchronisation with one another. -How do you think the non-blocking variant, `MPI_Ibarrier()`, is used and how might you use this in your program? +In the previous episode, we learnt that `MPI_Barrier()` is a collective operation we can use to bring all the ranks back into synchronisation with one another. +How do you think the non-blocking variant, `MPI_Ibarrier()`, is used and how might you use this in your program? You may want to read the relevant [documentation](https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node127.htm) first. ::::solution -When a rank reaches a non-blocking barrier, `MPI_Ibarrier()` will return immediately whether other ranks have reached the barrier or not. The behaviour of the barrier we would expect is enforced at the next `MPI_Wait()` (or `MPI_Test()`) operation. +When a rank reaches a non-blocking barrier, `MPI_Ibarrier()` will return immediately whether other ranks have reached the barrier or not. The behaviour of the barrier we would expect is enforced at the next `MPI_Wait()` (or `MPI_Test()`) operation. `MPI_Wait()` will return once all the ranks have reached the barrier. Non-blocking barriers can be used to help hide/reduce synchronisation overhead. We may want to add a synchronisation point in our program so the ranks start some work all at the same time.
@@ -544,7 +551,7 @@ int main(int argc, char **argv) For two ranks, the output should be: -``` +```text Start : Rank 0: my_num = 1 sum = 0 Start : Rank 1: my_num = 2 sum = 0 End : Rank 0: my_num = 1 sum = 3 @@ -589,5 +596,6 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: diff --git a/high_performance_computing/hpc_mpi/07_advanced_communication.md b/high_performance_computing/hpc_mpi/07_advanced_communication.md index 8cfc61c9..263da0a9 100644 --- a/high_performance_computing/hpc_mpi/07_advanced_communication.md +++ b/high_performance_computing/hpc_mpi/07_advanced_communication.md @@ -16,7 +16,7 @@ learningOutcomes: --- -We've so far seen the basic building blocks for splitting work and communicating data between ranks, meaning we're now dangerous enough to write a simple and successful MPI application. +We've so far seen the basic building blocks for splitting work and communicating data between ranks, meaning we're now dangerous enough to write a simple and successful MPI application. We've worked, so far, with simple data structures, such as single variables or small 1D arrays. In reality, any useful software we write will use more complex data structures, such as structures, n-dimensional arrays and other complex types. Working with these in MPI requires a bit more work to communicate them correctly and efficiently. A derived type acts as a way to describe complex data structures to MPI for communication. ::::callout + ## Size limitations for messages All throughout MPI, the argument which says how many elements of data are being communicated is an integer: `int count`. @@ -64,14 +65,14 @@ Arrays are row-major in C and column-major in Fortran. In a row-major array, the elements in each row are contiguous, so element `x[i][j]` is preceded by `x[i][j - 1]` and is followed by `x[i][j + 1]`. In Fortran, arrays are column-major so `x(i, j)` is followed by `x(i + 1, j)` and so on.
-The diagram below shows how a 4 x 4 matrix is mapped onto a linear memory space, for a row-major array. +The diagram below shows how a 4 x 4 matrix is mapped onto a linear memory space, for a row-major array. At the top of the diagram is the representation of the linear memory space, where each number is the ID of the element in memory. Below that are two representations of the array in 2D: the left shows the coordinate of each element and the right shows the ID of the element. ![Column memory layout in C](fig/c_column_memory_layout.png) The purple elements (5, 6, 7, 8) which map to the coordinates `[1][0]`, `[1][1]`, `[1][2]` and `[1][3]` are contiguous in linear memory. -The same applies for the orange boxes for the elements in row 2 (elements 9, 10, 11 and 12). +The same applies for the orange boxes for the elements in row 2 (elements 9, 10, 11 and 12). Rows in row-major arrays are contiguous. The next diagram instead shows how elements in adjacent rows are mapped in memory. @@ -87,7 +88,7 @@ Do you think memory contiguity could impact the performance of our software, in ::::solution Yes, memory contiguity can affect how fast our programs run. -When data is stored in a neat and organized way, the computer can find and use it quickly. +When data is stored in a neat and organized way, the computer can find and use it quickly. But if the data is scattered around randomly (fragmented), it takes more time to locate and use it, which decreases performance. Keeping our data and data access patterns organized can make our programs faster. But we probably won't notice the difference for small arrays and data structures. ::::: ::::callout + ## What about if I use `malloc()`? -More often than not, we will see `malloc()` being used to allocate memory for arrays. +More often than not, we will see `malloc()` being used to allocate memory for arrays.
Especially if the code is using an older standard, such as C90, which does not support [variable length arrays](https://en.wikipedia.org/wiki/Variable-length_array). -When we use `malloc()`, we get a contiguous array of elements. +When we use `malloc()`, we get a contiguous array of elements. To create a 2D array using `malloc()`, we have to first create an array of pointers (which are contiguous) and allocate memory for each pointer: ```c @@ -122,7 +124,7 @@ When `malloc()` requests memory, the operating system will assign whatever memor This is not always next to the block of memory from the previous allocation. This makes life tricky, since data *has* to be contiguous for MPI communication. But there are workarounds. -One is to only use 1D arrays (with the same number of elements as the higher dimension array) and to map the n-dimensional coordinates into a linear coordinate system. +One is to only use 1D arrays (with the same number of elements as the higher dimensional array) and to map the n-dimensional coordinates into a linear coordinate system. For example, the element `[2][4]` in a 3 x 5 matrix would be accessed as: ```c @@ -183,7 +185,7 @@ int MPI_Type_commit( ``` When a datatype is committed, resources which store information on how to handle it are internally allocated. -This contains data structures such as memory buffers as well as data used for bookkeeping. +This contains data structures such as memory buffers as well as data used for bookkeeping. Failing to free those resources after finishing with the vector leads to memory leaks, just like when we don't free memory created using `malloc()`. To free up the resources, we use `MPI_Type_free()`, @@ -237,7 +239,7 @@ In the above example, the intention is to only send the second and fourth rows, s If we used `matrix`, the first and third rows would be sent instead.
The other thing to notice, which is not immediately clear why it's done this way, is that the receive datatype is `MPI_INT` and the count is `num_elements = count * blocklength` instead of a single element of `rows_type`. -This is because when a rank receives data, the data is contiguous array. +This is because when a rank receives data, the data is a contiguous array. We don't need to use a vector to describe the layout of contiguous memory. We are just receiving a contiguous array of `num_elements = count * blocklength` integers. :::::challenge{id=sending-columns, title="Sending columns from an array"} You may want to use [this code](code/solutions/skeleton-example.c) as your starting point. ::::solution If your solution is correct you should see 2 and 5 printed to the screen. In the solution below, to send a 2 x 1 column of the matrix, we created a vector with `count = 2`, `blocklength = 1` and `stride = 3`. -To send the correct column our send buffer was `&matrix[0][1]` which is the address of the first element in column 1. +To send the correct column, our send buffer was `&matrix[0][1]`, which is the address of the first element in column 1. To see why the stride is 3, take a look at the diagram below: ![Stride example for question](fig/stride_example_2x3.png) @@ -310,6 +312,7 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: @@ -328,7 +331,7 @@ int matrix[4][4] = { You can re-use most of your code from the previous exercise as your starting point, replacing the 2 x 3 matrix with the 4 x 4 matrix above and modifying the vector type and communication functions as required. ::::solution -The receiving rank(s) should receive the numbers 6, 7, 10 and 11 if your solution is correct. +The receiving rank(s) should receive the numbers 6, 7, 10 and 11 if your solution is correct. In the solution below, we have created a vector with a count and block length of 2 and with a stride of 4.
The first two arguments mean two vectors of block length 2 will be sent. The stride of 4 results from the fact that there are 4 elements between the start of each distinct block, as shown in the image below: @@ -387,10 +390,10 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: - ## Structures in MPI Structures, commonly known as structs, are custom datatypes which contain multiple variables of (usually) different types. @@ -414,8 +417,8 @@ The main difference between vector and struct derived types is that the argument Most of these arguments are straightforward, given what we've just seen for defining vectors. But `array_of_displacements` is new and unique. -When a struct is created, it occupies a single contiguous block of memory. But there is a catch. -For performance reasons, compilers insert arbitrary "padding" between each member. +When a struct is created, it occupies a single contiguous block of memory. But there is a catch. +For performance reasons, compilers insert arbitrary "padding" between each member. This padding, known as [data structure alignment](https://en.wikipedia.org/wiki/Data_structure_alignment), optimises both the layout of the memory and the access of it. As a result, the memory layout of a struct may look like this instead: @@ -426,7 +429,7 @@ Although the memory used for padding and the struct's data exists in a contiguou This is why we need the `array_of_displacements` argument, which specifies the distance, in bytes, between each struct member relative to the start of the struct. In practice, it serves a similar purpose to the stride in vectors. -To calculate the byte displacement for each member, we need to know where in memory each member of a struct exists. +To calculate the byte displacement for each member, we need to know where in memory each member of a struct exists.
To do this, we can use the function `MPI_Get_address()`: ```c @@ -567,6 +570,7 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: @@ -581,7 +585,7 @@ struct Grid { grid.position = malloc(3 * sizeof(double)); ``` -If we use `malloc()` to allocate memory for `position`, how would we send data in the struct and the memory we allocated one rank to another? +If we use `malloc()` to allocate memory for `position`, how would we send the data in the struct and the memory we allocated from one rank to another? If you are unsure, try writing a short program to create a derived type for the struct. ::::solution @@ -595,11 +599,11 @@ The memory we allocated for `*position` is somewhere else in memory, as shown in :::: ::::: - ::::callout + ## A different way to calculate displacements -There are other ways to calculate the displacement, other than using what MPI provides for us. +There are other ways to calculate the displacement than using what MPI provides for us. Another common way is to use the `offsetof()` macro, part of `<stddef.h>`. `offsetof()` accepts two arguments, the first being the struct type and the second being the member to calculate the offset for. ```c @@ -615,11 +619,10 @@ Some people prefer the "safety" of using `MPI_Get_address()` whilst others prefe Of course, if you're a Fortran programmer then you can't use the macro! :::: - ## Dealing with other non-contiguous data The previous two sections covered how to communicate complex but structured data between ranks using derived datatypes. -However, there are *always* some edge cases which don't fit into a derived types. +However, there are *always* some edge cases which don't fit into a derived type. For example, just in the last exercise we've seen that pointers and derived types don't mix well.
Furthermore, we can sometimes also reach performance bottlenecks when working with heterogeneous data which doesn't fit, or doesn't make sense to be, in a derived type, as each data type needs to be communicated in separate communication calls. This can be especially bad if blocking communication is used! @@ -689,8 +692,8 @@ int MPI_Pack_size( ); ``` -`MPI_Pack_size()` is a helper function to calculate the *upper bound* of memory required. -It is, in general, preferable to calculate the buffer size using this function, as it takes into account any implementation specific MPI detail and thus is more portable between implementations and systems. +`MPI_Pack_size()` is a helper function to calculate the *upper bound* of the memory required. +It is, in general, preferable to calculate the buffer size using this function, as it takes into account any implementation-specific MPI details and is thus more portable between implementations and systems. If we wanted to calculate the memory required for three elements of some derived struct type and a `double` array, we would do the following: ```c @@ -770,6 +773,7 @@ for (int i = 0; i < num_rows; ++i) { ``` ::::callout + ## Blocking or non-blocking? The process of packing data into a contiguous buffer does not happen asynchronously. @@ -779,9 +783,10 @@ It works just as well to communicate the buffer using non-blocking methods, as i :::: ::::callout - ## What if the other rank doesn't know the size of the buffer? +## What if the other rank doesn't know the size of the buffer? + +In some cases, the receiving rank may not know the size of the buffer used in `MPI_Pack()`. This could happen if a message is sent and received in different functions, if some ranks have different branches through the program or if communication happens in a dynamic or non-sequential way.
In these situations, we can use `MPI_Probe()` and `MPI_Get_count()` to detect a message being sent and to get the number of elements in the message. @@ -796,6 +801,7 @@ MPI_Get_count(&status, MPI_PACKED, &buffer_size); /* MPI_PACKED represents an element of a "byte stream." So, buffer_size is the size of the buffer to allocate */ char *buffer = malloc(buffer_size); ``` + :::: :::::challenge{id=heterogeneous-data, title="Sending Heterogeneous Data in a Single Communication"} @@ -819,7 +825,7 @@ for (int i = 0; i < float_data_count; ++i) { } ``` -Since the arrays are dynamically allocated, in rank 0, you should also pack the number of elements in each array. +Since the arrays are dynamically allocated, in rank 0 you should also pack the number of elements in each array. Rank 1 may also not know the size of the buffer. How would you deal with that? You can use this [skeleton code](code/solutions/08-pack-skeleton.c) to begin with. @@ -922,5 +928,6 @@ int main(int argc, char **argv) return MPI_Finalize(); } ``` + :::: ::::: diff --git a/high_performance_computing/hpc_mpi/08_communication_patterns.md b/high_performance_computing/hpc_mpi/08_communication_patterns.md index bf852abe..38abef19 100644 --- a/high_performance_computing/hpc_mpi/08_communication_patterns.md +++ b/high_performance_computing/hpc_mpi/08_communication_patterns.md @@ -144,8 +144,9 @@ Here is a (non-exhaustive) list of examples where reduction operations are usefu At the end of each time step, a reduction can be used to update the global state or combine together pieces of data (similar to a gather operation). 3. Large statistical models: in a large statistical model, the large amounts of data can be processed by splitting it across ranks and calculating statistical values for the sub-set of data. The final values are then calculated by using a reduction operation and re-normalizing the values appropriately. -4.
Numerical integration: each rank will compute the area under the curve for its portion of the curve. +4. Numerical integration: each rank will compute the area under the curve for its portion of the curve. The value of the integral for the entire curve is then calculated using a reduction operation. + :::: ::::: @@ -205,9 +206,10 @@ scatter_sub_arrays_to_other_ranks(image, rank_image, sub_array_t, rank_dims, my_ ``` ::::callout + ## Extra: Scattering the image to other ranks -As mentioned in the previous code example, distributing the 2D sub-domains across ranks doesn't play well with collective functions. +As mentioned in the previous code example, distributing the 2D sub-domains across ranks doesn't play well with collective functions. Therefore, we have to transfer the data manually using point-to-point communication. An example of how this can be done is shown below. ```c @@ -249,9 +251,10 @@ void scatter_sub_arrays_to_other_ranks( } } ``` + :::: -The function [`MPI_Dims_create()`](https://www.open-mpi.org/doc/v4.1/man3/MPI_Dims_create.3.php) is a useful utility function in MPI which is used to determine the dimensions of a Cartesian grid of ranks. +The function [`MPI_Dims_create()`](https://www.open-mpi.org/doc/v4.1/man3/MPI_Dims_create.3.php) is a useful utility function in MPI which is used to determine the dimensions of a Cartesian grid of ranks. In the above example, it's used to determine the number of rows and columns in each sub-array, given the number of ranks in the row and column directions of the grid of ranks from `MPI_Dims_create()`. In addition to the code above, you may also want to create a [*virtual Cartesian communicator topology*](https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node187.htm#Node187) to reflect the decomposed geometry in the communicator as well, as this gives access to a number of other utility functions which make communicating data easier.
@@ -278,6 +281,7 @@ The image has been decomposed into *strips*, with each rank working on a sub-im In the example, [`MPI_Sendrecv()`](https://www.open-mpi.org/doc/v4.1/man3/MPI_Sendrecv_replace.3.php) is used to send and receive data between neighbouring ranks. ::::callout + ## Chain communication with `MPI_Sendrecv()` `MPI_Sendrecv()` combines both sending and receiving data in a single call. @@ -304,6 +308,7 @@ int MPI_Sendrecv( MPI_Status *status /* The status for the receive operation */ ); ``` + :::: ```c @@ -333,7 +338,7 @@ MPI_Sendrecv(&rank_image[index_into_2d(num_rows - 2, 1, num_cols)], num_rows, MP ``` :::::challenge{id=halo-exchange-2d, title="Halo Exchange in Two Dimensions"} -The previous code example shows one implementation of halo exchange in one dimension. +The previous code example shows one implementation of halo exchange in one dimension. Following on from the code example showing domain decomposition in two dimensions, write down the steps (or some pseudocode) for the implementation of domain decomposition and halo exchange in two dimensions. ::::solution The image below, which we've already seen, shows a depiction of halo exchange in two dimensions. To communicate the halos, we need to: 1. Create a derived type to send a column of data for the correct number of rows of pixels. The top and bottom rows can be communicated without using a derived type, because the elements in a row are contiguous. 2. For each sub-domain, we need to determine the neighbouring ranks, so we know which rank to send data to and which ranks to receive data from. 3. Using the derived types and neighbouring ranks, communicate the top row of the sub-domain to the bottom halo row of the neighbouring top domain.
diff --git a/high_performance_computing/hpc_mpi/09_porting_serial_to_mpi.md b/high_performance_computing/hpc_mpi/09_porting_serial_to_mpi.md index 3c67f7d4..c64d3b94 100644 --- a/high_performance_computing/hpc_mpi/09_porting_serial_to_mpi.md +++ b/high_performance_computing/hpc_mpi/09_porting_serial_to_mpi.md @@ -27,7 +27,6 @@ In this case, the equation is used in a simplified form to describe how heat dif In the simulation the stick is split into a given number of slices, each with a constant temperature. - ![Stick divided into separate slices with touching boundaries at each end](fig/poisson_stick.png) The temperature of the stick itself across each slice is initially set to zero, whilst at one boundary of the stick the amount of heat is set to 10. @@ -90,7 +89,7 @@ The next step is to initialise the initial conditions of the simulation: u[0] = 10.0; ``` -`residual` here refers to the threshold of temperature equilibrium along the stick we wish to achieve. Once it's within this threshold, the simulation will end. +`residual` here refers to the threshold of temperature equilibrium along the stick we wish to achieve. Once it's within this threshold, the simulation will end. Note that initially, `u` is set entirely to zero, representing a temperature of zero along the length of the stick. As noted, `rho` is set to zero here for simplicity. @@ -121,7 +120,7 @@ Finally, just for show, the code outputs a representation of the result - the en ### The Iterative Function - `poisson_step()` -The `poisson_step()` progresses the simulation by a single step. +The `poisson_step()` function progresses the simulation by a single step.
After it accepts its arguments, for each slice in the stick it calculates a new value based on the temperatures of its neighbours: ```c @@ -166,7 +165,7 @@ gcc poisson.c -o poisson And should see the following: -``` +```text Final result: 9-8-7-6-6-5-4-3-3-2-1-0- Run completed in 182 iterations with residue 9.60328e-06 @@ -176,15 +175,18 @@ Here, we can see a basic representation of the temperature of each slice of the Ordinarily, we might output the full sequence to a file, but we've simplified it for convenience here. ::::callout{variant="warning"} + ## Missing Links + Depending on your system, you might get an error along the lines of `undefined reference to symbol 'sqrt'`. This error was generated when the compiler attempted to link together the compiled versions of your code and the libraries it depends on to produce the final executable. The `sqrt` function is present in `math.h`, but on some systems the compiled `math` library isn't linked by default. You can explicitly include it using the `-lm` flag: + ```bash gcc poisson.c -o poisson -lm ``` -:::: +:::: ## Approaching Parallelism @@ -204,18 +206,19 @@ Looking at the code, which parts would benefit most from parallelisation, and ar ::::solution Potentially, the following regions could be executed in parallel: - -* The setup, when initialising the fields -* The calculation of each time step, `unew` - this is the most computationally intensive of the loops -* Calculation of the cumulative temperature difference, `unorm` -* Overwriting the field `u` with the result of the new calculation + +- The setup, when initialising the fields +- The calculation of each time step, `unew` - this is the most computationally intensive of the loops +- Calculation of the cumulative temperature difference, `unorm` +- Overwriting the field `u` with the result of the new calculation As `GRIDSIZE` is increased, these will take proportionally more time to complete, so may benefit from parallelisation.
However, there are a few regions in the code that will require exchange of data across the parallel executions to work correctly: -* Calculation of `unorm` is a sum that requires difference data from all sections of the stick, so we'd need to somehow communicate these difference values to a single rank that computes and receives the overall sum -* Each section of the stick does not compute a single step in isolation, it needs boundary data from neighbouring sections of the stick to arrive at its computed temperature value for that step, so we'd need to communicate temperature values between neighbours (i.e. using a nearest neighbours communication pattern) +- Calculation of `unorm` is a sum that requires difference data from all sections of the stick, so we'd need to somehow communicate these difference values to a single rank that computes and receives the overall sum +- Each section of the stick does not compute a single step in isolation, it needs boundary data from neighbouring sections of the stick to arrive at its computed temperature value for that step, so we'd need to communicate temperature values between neighbours (i.e. using a nearest neighbours communication pattern) + :::: ::::: @@ -229,16 +232,15 @@ Examine the code and try to identify any serial regions that can't (or shouldn't ::::solution There aren't any large or time consuming serial regions, which is good from a parallelism perspective. However, there are a couple of small regions that are not amenable to running in parallel: - -* Setting the `10.0` initial temperature condition at the stick 'starting' boundary. We only need to set this once at the beginning of the stick, and not at the boundary of every section of the stick -* Printing a representation of the final result, since this only needs to be done once to represent the whole stick, and not for every section. - + +- Setting the `10.0` initial temperature condition at the stick 'starting' boundary. 
We only need to set this once at the beginning of the stick, and not at the boundary of every section of the stick +- Printing a representation of the final result, since this only needs to be done once to represent the whole stick, and not for every section. + So we'd need to ensure only one rank deals with these, which in MPI is typically the zeroth rank. This also makes sense in terms of our parallelism approach, since the zeroth rank would be the beginning of the stick, where we'd set the initial boundary temperature. :::: ::::: - ## Parallelising our Code So now let's apply what we've learned about MPI together with our consideration of the code. @@ -381,7 +383,9 @@ Insert the following into the `poisson_step()` function, putting the declaration ```c double unorm, global_unorm; ``` + Then add `MPI_Allreduce()` after the calculation of `unorm`: + ```c MPI_Allreduce(&unorm, &global_unorm, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); ``` @@ -483,7 +487,7 @@ mpicc poisson_mpi.c -o poisson_mpi mpirun -n 2 poisson_mpi ``` -``` +```text Final result: 9-8-7-6-6-5-4-3-3-2-1-0- Run completed in 182 iterations with residue 9.60328e-06 @@ -492,18 +496,17 @@ Note that as it stands, the implementation assumes that `GRIDSIZE` is divisible by `n_ranks`. So to guarantee correct output, we should use only factors of 12 for our `n_ranks`. - ### Testing our Parallel Code We should always ensure that, as our parallel version is developed, it behaves the same as our serial version. -This may not be possible initially, particularly as large parts of the code need converting to use MPI, but where possible, we should continue to test. +This may not be possible initially, particularly as large parts of the code need converting to use MPI, but where possible, we should continue to test.
So we should test once we have an initial MPI version, and as our code develops, perhaps with new optimisations to improve performance, we should test then too. :::::challenge{id=an-initial-test, title="An Initial Test"} Test the MPI version of your code against the serial version, using 1, 2, 3, and 4 ranks with the MPI version. Are the results as you would expect? What happens if you test with 5 ranks, and why? Write a simple test into the code that would catch the error using the `assert(condition)` function from the `assert.h` library, which will terminate the program if `condition` evaluates to `false`. - + ::::solution Using these ranks, the MPI results should be the same as our serial version. Using 5 ranks, our MPI version yields `9-8-7-6-5-4-3-2-1-0-0-0-` which is incorrect. @@ -514,20 +517,25 @@ This doesn't fill `resultbuf` with results representing an expected `GRIDSIZE` o This highlights another aspect of complexity we need to take into account when writing such parallel implementations, where we must ensure a problem space is correctly subdivided. We especially want to prevent situations where the code *appears* to run without a crash or error, but still gives a completely wrong answer. We can catch the error using an assertion by importing the `assert.h` library at the top of the file: + ```c #include <assert.h> ``` + +Then we can add the check itself just after calculating `rank_gridsize`, where we test to see if the gridsize calculation would have left a non-zero remainder. This means there are cells that can't be evenly distributed across the ranks.
If we don't add any conditions, we'll get one error message per rank, so we want to condition it to only run on a single one: + ```c // Test that the grid can be subdivided between the ranks properly if (rank == 0) { assert(GRIDSIZE % n_ranks == 0); } ``` + This should give us a helpful error when we try to run the code for an invalid number of ranks, instead of simply giving us the wrong answer at the end: -``` + +```text poisson_mpi: poisson_mpi.c:105: main: Assertion `GRIDSIZE % n_ranks == 0' failed. ``` @@ -548,5 +556,6 @@ At initialisation, instead of setting it to zero we could do: ```c rho[i] = rho_coefficients[(rank * rank_gridsize) + i] ``` + :::: ::::: diff --git a/high_performance_computing/hpc_mpi/10_optimising_mpi.md b/high_performance_computing/hpc_mpi/10_optimising_mpi.md index 1cdf9c5c..521814ad 100644 --- a/high_performance_computing/hpc_mpi/10_optimising_mpi.md +++ b/high_performance_computing/hpc_mpi/10_optimising_mpi.md @@ -28,7 +28,9 @@ Also, we may want to consider how best to optimise the code to make more efficie Therefore, it's really helpful to understand how well our code *scales* in performance terms as we increase the resources available to it. ::::callout{variant="note"} + ## Prerequisite: [Intro to High Performance Computing](../hpc_intro/01_hpc_intro) + Whilst the previous episodes can be done on a laptop or desktop, this episode covers how to profile your code using tools that are only available on a HPC cluster. :::: @@ -71,10 +73,11 @@ Amdahl’s law states that, for a fixed problem, the upper limit of speedup is d ![A figure showing strong scaling](fig/scaling_amdahl.png) ::::callout + ## Amdahl's Law in Practice -Consider a program that takes 20 hours to run using one core. 
-If a particular part of the rogram, which takes one hour to execute, cannot be parallelized (s = 1/20 = 0.05), and if the code that takes up the remaining 19 hours of execution time can be parallelized (p = 1 − s = 0.95), then regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour. +Consider a program that takes 20 hours to run using one core. +If a particular part of the program, which takes one hour to execute, cannot be parallelized (s = 1/20 = 0.05), and if the code that takes up the remaining 19 hours of execution time can be parallelized (p = 1 − s = 0.95), then regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour. Hence, the theoretical speedup is limited to at most 20 times (when N = ∞, speedup = 1/s = 20). :::: @@ -83,11 +86,12 @@ Linear **strong** scaling if the speedup (work units completed per unit time) is It's harder to achieve good strong-scaling at larger process counts since communication overhead typically increases with the number of processes used. ::::callout + ## Testing Code Performance on SLURM - + We also need a way to test our code on our HPC infrastructure of choice. This will likely vary from system to system depending on your infrastructure configuration, but may look something like this (replacing ``, ``, and `` as appropriate): - + ```bash #!/usr/bin/env bash #SBATCH --account= @@ -105,7 +109,7 @@ module load openmpi/4.1.4 time mpirun -n 1 poisson_mpi ``` - + So here, after loading the required compiler and OpenMPI modules, we use the `time` command to output how long the process took to run for a given number of processors, and ensure we specify `ntasks` correctly as the required number of cores we wish to use. We can then submit this using `sbatch`, e.g.
`sbatch poisson-mpi.sh`, with the output captured by default in a `slurm-....out` file which will include the time taken to run the program. @@ -144,7 +148,7 @@ Gustafson’s law is based on the approximations that the parallel part scales l $$ \mathrm{scaled\ speedup} = s + p \times N $$ -where $$s$$, $$p$$ and $$N$$ have the same meaning as in Amdahl's law. +where $$s$$, $$p$$ and $$N$$ have the same meaning as in Amdahl's law. With Gustafson's law the scaled speedup increases linearly with respect to the number of processors (with a slope smaller than one), and there is no upper limit for the scaled speedup. This is called **weak scaling**, where the scaled speedup is calculated based on the amount of work done for a scaled problem size (in contrast to Amdahl’s law which focuses on a fixed problem size). @@ -195,7 +199,7 @@ But if we keep increasing the number of ranks, the time spent in communication g In a parallel algorithm, the data which is handled by a core can be considered in two parts: the part the CPU needs that other cores control, and a part that the core controls itself and can compute. The whole data which a CPU or a core computes is the sum of the two. The data under the control of the other cores is called "surface" and the whole data is called "volume". -The surface data requires communications. +The surface data requires communications. The more surface there is, the more communications among CPUs/cores is needed, and the longer the program will take to finish. Due to Amdahl's law, you want to minimize the number of communications for the same surface since each communication takes a finite amount of time to prepare (latency). @@ -204,7 +208,7 @@ Of course, sequential consistency should be obeyed when the surface data is exch ## Profiling our Code -Now we have a better understanding of how our code scales with resources and problem size, we may want to consider how to optimise the code to perform better.
+Now that we have a better understanding of how our code scales with resources and problem size, we may want to consider how to optimise the code to perform better. But we should be careful! > "We should forget about small efficiencies, say about 97% of the time: @@ -214,10 +218,11 @@ Essentially, before attempting to optimize your own code, you should profile it. Typically, most of the runtime is spent in a few functions/subroutines, so you should focus your optimization efforts on those parts of the code. The good news is that there are helpful tools known as *profilers* that can help us. -Profilers help you find out where a program is spending its time and pinpoint places where optimising it makes sense. +Profilers help you find out where a program is spending its time and pinpoint places where optimising it makes sense. Many different types of profiling tools exist, but for MPI applications we need **parallel profilers**. Some examples of parallel profilers are: + - [Scalasca](http://scalasca.org) - a free and open source parallel profiler developed by three German research centers. - [TAU](https://www.cs.uoregon.edu/research/tau/home.php) - [VAMPIR](https://vampir.eu/) @@ -226,7 +231,7 @@ In this lesson we will use a simple tool called ARM Performance Reports which gives us an overview of how much time is spent in compute, MPI calls and I/O. Performance Reports is part of the ARM Forge (formerly Allinea Forge) suite of tools for parallel applications and is developed by the semiconductor and software design company ARM. -The suite also comes with a debugger (ARM DDT) and a profiler (ARM MAP). ARM MAP is a more advanced tool which allows the user to see how much time each individual line of code takes, and why. +The suite also comes with a debugger (ARM DDT) and a profiler (ARM MAP). ARM MAP is a more advanced tool which allows the user to see how much time each individual line of code takes, and why.
ARM DDT supports a wide range of parallel architectures and models, including MPI, UPC, CUDA and OpenMP. Version 19 and higher of ARM Forge supports Python, in addition to Fortran and C/C++. @@ -239,7 +244,9 @@ module avail allinea For more information on ARM Forge see the [product website](https://www.arm.com/products/development-tools/server-and-hpc/forge). ::::callout{variant="note"} + ## Software Availability + The ARM Forge suite of tools are licensed, and so may or may not be available on your HPC cluster (and certainly won't be on your laptop or desktop unless you buy a license and build them yourself!). If you don't have access to the ARM Forge tools, your local HPC cluster should have an alternative installed with similar functionality. @@ -284,7 +291,8 @@ terminal and one `.html` file which can be opened in a browser ```bash cat poisson_mpi_4p_1n_2024-01-30_15-38.txt ``` -``` + +```text Command: mpirun -n 4 poisson_mpi Resources: 1 node (28 physical, 56 logical cores per node) Memory: 503 GiB per node @@ -318,11 +326,11 @@ spent in the actual compute sections of the code. :::::challenge{id=profile-poisson, title="Profile Your Poisson Code"} Compile, run and analyse your own MPI version of the poisson code. - + How closely does it match the performance above? What are the main differences? Try reducing the number of processes used, rerun and investigate the profile. -Is it still MPI-bound? - +Is it still MPI-bound? + Increase the problem size, recompile, rerun and investigate the profile. What has changed now? ::::: @@ -332,6 +340,7 @@ In the Poisson code, try changing the location of the calls to `MPI_Send`. How d ::::: ::::callout{variant="tip"} + ## A General Optimisation Workflow? 
A general workflow for optimising a code, whether parallel or serial, is as follows: diff --git a/high_performance_computing/hpc_openmp/02_intro_openmp.md b/high_performance_computing/hpc_openmp/02_intro_openmp.md index ed9ef219..b7a2eb0f 100644 --- a/high_performance_computing/hpc_openmp/02_intro_openmp.md +++ b/high_performance_computing/hpc_openmp/02_intro_openmp.md @@ -16,12 +16,12 @@ OpenMP is an industry-standard API specifically designed for parallel programmin ::::challenge{title="An OpenMP Timeline"} If you're interested, there's a [timeline of how OpenMP developed](https://www.openmp.org/uncategorized/openmp-timeline/). -It provides an overview of OpenMP's evolution until 2014, with significant advancements -occurring thereafter. Notably, OpenMP 5.0 marked a significant step in 2018, followed by the latest +It provides an overview of OpenMP's evolution until 2014, with significant advancements +occurring thereafter. Notably, OpenMP 5.0 marked a significant step in 2018, followed by the latest iteration, OpenMP 5.2, which was released in November 2021. :::: -## How does it work? +## How does it work? OpenMP allows programmers to identify and parallelize sections of code, enabling multiple threads to execute them concurrently. This concurrency is achieved using a shared-memory model, where all threads can access a common memory space and communicate through shared variables. @@ -87,7 +87,7 @@ When you execute the OpenMP program, it will display 'Hello World!' multiple times according to the value we entered in `OMP_NUM_THREADS`, with each thread in the parallel region executing the `printf` statement concurrently: -~~~ +~~~text Hello World! Hello World! Hello World! @@ -95,6 +95,7 @@ Hello World! ~~~ ::::callout + ## How to Use in Microsoft VSCode? 
If you're looking to develop OpenMP programs in VSCode, here are three configuration hints which can help: @@ -107,4 +108,3 @@ You may need to adapt the `tasks.json` and `launch.json` depending on your platf Once you've compiled `hello_world_omp.c` the first time, then, by selecting VSCode's `Run and Debug` tab on the left, the `C++ OpenMP: current file` configuration should appear in the top left which will set `OMP_NUM_THREADS` before running it. :::: - diff --git a/high_performance_computing/hpc_openmp/03_parallel_api.md b/high_performance_computing/hpc_openmp/03_parallel_api.md index 495d33f3..937f1ed0 100644 --- a/high_performance_computing/hpc_openmp/03_parallel_api.md +++ b/high_performance_computing/hpc_openmp/03_parallel_api.md @@ -16,20 +16,20 @@ learningOutcomes: ## Using OpenMP in a Program As we introduced in the last episode, -OpenMP directives are special comments indicated by `#pragma omp` statements that guide the compiler in creating parallel code. +OpenMP directives are special comments indicated by `#pragma omp` statements that guide the compiler in creating parallel code. They mark sections of code to be executed concurrently by multiple threads. At a high level, the C/C++ syntax for pragma directives is as follows: -~~~c +```c #pragma omp <directive> [<clause> ...] -~~~ +``` Following a directive are multiple optional clauses, which are themselves C expressions and may contain other clauses, with any arguments to both directives and clauses enclosed in parentheses and separated by commas.
For example: -~~~c +```c #pragma omp a-directive a-clause(argument1, argument2) -~~~ +``` OpenMP offers a number of directives for parallelisation, although the two we'll focus on in this episode are: @@ -43,7 +43,7 @@ in the following we specify a specific block of code to run parallel threads, using the OpenMP runtime routine `omp_get_thread_num()` to return the unique identifier of the calling thread: -~~~c +```c #include <stdio.h> #include <omp.h> int main() { @@ -52,23 +52,22 @@ int main() { printf("Hello from thread %d\n", omp_get_thread_num()); } } -~~~ +``` So assuming you've specified `OMP_NUM_THREADS` as `4`: -~~~ +```text Hello from thread 0 Hello from thread 1 Hello from thread 3 Hello from thread 2 -~~~ +``` Although the output may not be in the same order, since the order and manner in which these threads (and their `printf` statements) run is not guaranteed. So in summary, simply by adding this directive we have accomplished a basic form of parallelisation. - ### What about Variables? So how do we make use of variables across, and within, our parallel threads? @@ -82,28 +81,28 @@ Essentially, OpenMP provided two ways to do this for variables: For example, what if we wanted to hold the thread ID and the total number of threads within variables in the code block? Let's start by amending the parallel code block to the following: -~~~c +```c ... int num_threads = omp_get_num_threads(); int thread_id = omp_get_thread_num(); printf("Hello from thread %d out of %d\n", thread_id, num_threads); -~~~ +``` Here, `omp_get_num_threads()` returns the total number of available threads. If we recompile and re-run we should see: -~~~ +```text Hello from thread 0 out of 4 Hello from thread 1 out of 4 Hello from thread 3 out of 4 Hello from thread 2 out of 4 -~~~ +``` ::::challenge{title='OpenMP and C Scoping'} Try printing out `num_threads` at the end of the program, after the `#pragma` code block, and recompile. What happens? Is this what you expect?
:::solution -Since the variable is scoped only to the code block within the curly braces, +Since the variable is scoped only to the code block within the curly braces, as with any C code block, `num_threads` is no longer in scope and cannot be read. ::: :::: @@ -111,7 +110,7 @@ Now by default, variables declared within parallel regions are private to each thread. But what about declarations outside of this block? For example: -~~~c +```c ... int num_threads, thread_id; @@ -121,7 +120,7 @@ But what about declarations outside of this block? For example: thread_id = omp_get_thread_num(); printf("Hello from thread %d out of %d\n", thread_id, num_threads); } -~~~ +``` This may seem on the surface to be correct. However, this illustrates a critical point about why we need to be careful. @@ -136,19 +135,22 @@ and lead to incorrect results. This is known as a *race condition*, and we'll look into them in more detail in the next episode. ::::callout -## Observing the Race Condition + +## Observing the Race Condition + We can observe the race condition occurring by adding a sleep command between the `thread_id` assignment and use. Add `#include <unistd.h>` to the top of your program, and after `thread_id`'s assignment, add `sleep(2);` which will force the code to wait for 2 seconds before the variable is accessed, providing more opportunity for the race condition to occur. Hopefully you'll then see the unwanted behaviour emerge, for example: -~~~ +```text Hello from thread 2 out of 4 Hello from thread 2 out of 4 Hello from thread 2 out of 4 Hello from thread 2 out of 4 -~~~ +``` + :::: But with our code, this makes variables potentially *unsafe*, since within a single thread, @@ -157,34 +159,34 @@ One approach to ensuring we don't do this accidentally is to specify that there classification.
We can do this by changing our directive to: -~~~c +```c #pragma omp parallel default(none) -~~~ +``` Now if we recompile, we'll get an error mentioning that these variables aren't specified for use within the parallel region: -~~~ +```text hello_world_omp.c: In function 'main': hello_world_omp.c:10:21: error: 'num_threads' not specified in enclosing 'parallel' 10 | num_threads = omp_get_num_threads(); - | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ + | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ hello_world_omp.c:8:13: note: enclosing 'parallel' 8 | #pragma omp parallel default(none) | ^~~ hello_world_omp.c:11:19: error: 'thread_id' not specified in enclosing 'parallel' 11 | thread_id = omp_get_thread_num(); - | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ + | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ hello_world_omp.c:8:13: note: enclosing 'parallel' 8 | #pragma omp parallel default(none) | ^~~ -~~~ +``` So we now need to be explicit in every case for which variables are accessible within the block, and whether they're private or shared: -~~~c +```c #pragma omp parallel default(none) private(num_threads, thread_id) -~~~ +``` So here, we ensure that each thread has its own private copy of these variables, which is now thread safe. @@ -195,7 +197,7 @@ A typical program uses `for` loops to perform many iterations of the same task, and fortunately OpenMP gives us a straightforward way to parallelise them, which builds on the use of directives we've learned so far. -~~~c +```c ... int num_threads, thread_id; @@ -211,7 +213,7 @@ which builds on the use of directives we've learned so far. printf("%d",i); } -~~~ +``` So essentially, very similar format to before, but here we use `for` in the pragma preceding a loop definition, which will then assign 10 separate loop iterations across the 4 available threads. @@ -219,12 +221,13 @@ Later in this episode we'll explore the different ways in which OpenMP is able t and how to specify different scheduling behaviours.
:::callout + ## A Shortcut for Convenience The `#pragma omp parallel for` is actually equivalent to using two separate directives. For example: -~~~c +```c #pragma omp parallel { #pragma omp for for (int i = 1; i <= 10; i++) { ... } } -~~~ +``` ...is equivalent to: -~~~c +```c #pragma omp parallel for for (int i = 1; i <= 10; i++) { ... } -~~~ +``` -In the first case, `#pragma omp parallel` spawns a group of threads, whilst `#pragma omp for` divides the loop iterations between them. +In the first case, `#pragma omp parallel` spawns a group of threads, whilst `#pragma omp for` divides the loop iterations between them. But if you only need to do parallelisation within a single loop, the second case has you covered for convenience. ::: @@ -255,7 +258,7 @@ Use of this function will override any value set in `OMP_NUM_THREADS`. You should see something (but perhaps not exactly) like: -~~~ +```text Hello from iteration 1 from thread 0 out of 4 Hello from iteration 2 from thread 0 out of 4 Hello from iteration 3 from thread 0 out of 4 @@ -266,16 +269,16 @@ Hello from iteration 9 from thread 3 out of 4 Hello from iteration 10 from thread 3 out of 4 Hello from iteration 7 from thread 2 out of 4 Hello from iteration 8 from thread 2 out of 4 -~~~ +``` So with careful attention to variable scoping, using OpenMP to parallelise an existing loop is often quite straightforward. -However, particularly with more complex programs, there are some aspects and potential pitfalls with OpenMP parallelisation +However, particularly with more complex programs, there are some aspects and potential pitfalls with OpenMP parallelisation we need to be aware of - such as race conditions - which we'll explore in the next episode.
::::challenge{title="Calling Thread Numbering Functions Elsewhere?"} -Write, compile and run a simple OpenMP program that calls both `omp_get_num_threads()` and `omp_get_thread_num()` outside of a parallel region, +Write, compile and run a simple OpenMP program that calls both `omp_get_num_threads()` and `omp_get_thread_num()` outside of a parallel region, and prints the values received. What happens? :::solution @@ -295,7 +298,6 @@ In most OpenMP implementations, the default behaviour is to split the iterations int CHUNK_SIZE = NUM_ITERATIONS / omp_get_num_threads(); ``` - If the amount of time it takes to compute each iteration is the same, or nearly the same, then this is a perfectly efficient way to parallelise the work. Each thread will finish its chunk at roughly the same time as the other threads. But if the work is imbalanced, even if just one thread takes longer per iteration, then the threads become out of @@ -307,12 +309,12 @@ Fortunately, we can use other types of "scheduling" to control how work is divid scheduler is an algorithm which decides how to assign chunks of work to the threads. We can control the scheduler we want to use with the `schedule` directive: -~~~c +```c #pragma omp parallel for schedule(SCHEDULER_NAME, OPTIONAL_ARGUMENT) for (int i = 0; i < NUM_ITERATIONS; ++i) { ... } -~~~ +``` `schedule` takes two arguments: the name of the scheduler and an optional argument. @@ -324,7 +326,6 @@ for (int i = 0; i < NUM_ITERATIONS; ++i) { | **auto** | The best choice of scheduling is chosen at run time. | - | Useful in all cases, but can introduce additional overheads whilst it decides which scheduler to use. | | **runtime** | Determined at runtime by the `OMP_SCHEDULE` environment variable or `omp_schedule` pragma.
| - | - | - ::::challenge{title="Try Out Different Schedulers"} Try each of the static and dynamic schedulers on the code below, @@ -332,7 +333,7 @@ which uses `sleep` to mimic processing iterations that take increasing amounts o `static` is already specified, so replace this next with `dynamic`. Which scheduler is fastest? -~~~c +```c #include <stdio.h> #include <unistd.h> #include <omp.h> @@ -354,7 +355,7 @@ int main ( ) { double end = omp_get_wtime(); printf("Total time for %d reps = %f\n", NUM_ITERATIONS, end - start); } -~~~ +``` Try out the different schedulers and see how long it takes to finish the loop. Which scheduler was best? @@ -362,11 +363,11 @@ Which scheduler was best? :::solution You should see something like: - -~~~ +```text Static: Total time for 8 reps = 13.003299 Dynamic: Total time for 8 reps = 10.007052 -~~~ +``` Here we can see that `dynamic` is the fastest, which is better with iterations taking differing amounts of time. @@ -381,9 +382,9 @@ With a dynamic scheduler, the default chunk size is 1. What happens if you specify a i.e. `schedule(dynamic, 2)`? :::solution -~~~ +```text Dynamic: Total time for 16 reps = 13.004029 -~~~ +``` So here, we now see approximately the same results. By increasing the chunk size, the dynamic scheduler behaves more like the static one, @@ -393,6 +394,7 @@ since the workload for static would have the same chunk size calculated to be 2 :::: :::callout + ## A Matter of Convenience We've seen that we can amend our code directly to use different schedulers, @@ -403,13 +405,14 @@ as well as the chunk size, so we don't need to recompile. Edit your code to specify `runtime` as the scheduler, i.e. `schedule(runtime)`, recompile, then set the environment variable in turn to each scheduler, e.g. -~~~bash +```bash export OMP_SCHEDULE=dynamic -~~~ +``` Then rerun.
Try it with different chunk sizes too, e.g.: -~~~bash +```bash export OMP_SCHEDULE=static,1 -~~~ +``` + ::: diff --git a/high_performance_computing/hpc_openmp/04_synchronisation.md b/high_performance_computing/hpc_openmp/04_synchronisation.md index f5b6ee54..fa13d7ca 100644 --- a/high_performance_computing/hpc_openmp/04_synchronisation.md +++ b/high_performance_computing/hpc_openmp/04_synchronisation.md @@ -19,6 +19,7 @@ and race conditions. In the context of parallel computing, thread or rank (in t crucial role in guaranteeing the *correctness* of our program, particularly in regard to data consistency and integrity. ::::callout + ## What is code correctness? Code correctness in parallel programming is the guarantee that a program operates as expected in multi-threaded @@ -53,6 +54,7 @@ correct value of 2. This illustrates why it's called a race condition, because t modify variables before another thread can! :::callout + ## Analogy: Editing a document Imagine two people trying to update the same document at the same time. If they don't communicate what they're doing, @@ -86,6 +88,7 @@ int main(void) { return EXIT_SUCCESS; } ``` + :::solution What you will notice is that when you run the program, the final value changes each time. The correct final value is 10,000 but you will often get a value that is lower than this. This is caused by a race condition, as explained in @@ -175,6 +178,7 @@ a `nowait` clause is used with a parallel for. next_function(); } ``` + ::: ### Synchronisation regions @@ -310,7 +314,6 @@ threads and explored how be used to prevent race conditions in the previous exer will look at the other mechanisms which can prevent race conditions, namely by setting locks or by using atomic operations. 
- ### Locks Critical regions provide a convenient and straightforward way to synchronise threads and guard data access to prevent @@ -412,8 +415,8 @@ often less expensive than critical regions or locks, so they should be preferred still important to not be over-zealous with using atomic operations as they can still introduce synchronisation overheads which can damage the parallel performance. - :::callout + ### When should I prefer to use a critical region? Or an atomic operation, or a lock? There are three mechanisms we can use to prevent race conditions: critical regions, locks and atomic operations. The @@ -472,7 +475,7 @@ int main(int argc, char **argv) { When we run the program multiple times, we expect the output `sum` to have the value of `0.000000`. However, due to an existing race condition, the program can sometimes produce wrong output in different runs, as shown below: -``` +```text 1. Sum: 1.000000 2. Sum: -1.000000 3. Sum: 2.000000 @@ -532,5 +535,6 @@ for (int i = 0; i < ARRAY_SIZE; ++i) { sum += array[i]; } ``` + ::: -:::: \ No newline at end of file +:::: diff --git a/high_performance_computing/hpc_openmp/05_hybrid_parallelism.md b/high_performance_computing/hpc_openmp/05_hybrid_parallelism.md index 40411721..b592a873 100644 --- a/high_performance_computing/hpc_openmp/05_hybrid_parallelism.md +++ b/high_performance_computing/hpc_openmp/05_hybrid_parallelism.md @@ -17,14 +17,14 @@ At this point in the lesson, we've introduced the basics you need to get out the OpenMP. There is one thing still worth being brought to your attention, and that is *hybrid parallelism*. :::callout + ## The Message Passing Interface (MPI) -In this episode, we will assume you have some knowledge about the Message Passing Interface (MPI) and that you have a +In this episode, we will assume you have some knowledge about the Message Passing Interface (MPI) and that you have a basic understanding of how to parallelise code using MPI.
If you're not sure, you can think of MPI as being like an OpenMP program where everything is in a `pragma omp parallel` directive. ::: - ## What is hybrid parallelism? When we talk about hybrid parallelism, what we're really talking about is writing parallel code using more than one @@ -34,6 +34,7 @@ research is *MPI+X*. What this means is that an application is *mostly* parallel (MPI), which has been extended using some +X other paradigm. A common +X is OpenMP, creating MPI+OpenMP. :::callout + ## Heterogeneous Computing An MPI+OpenMP scheme is known as homogeneous computing, meaning all the processing units involved are of the same type. @@ -264,6 +265,7 @@ for (int i = rank_lower_limit; i < rank_upper_limit; ++i) { ``` :::callout + ## Still not sure about MPI? If you're still a bit unsure of how MPI is working, you can basically think of it as wrapping large parts of your @@ -287,6 +289,7 @@ struct input_par_t input_parameters[total_work]; } } ``` + ::: In the above example, we have only included the parallel region of code. It is unfortunately not as simple as this, @@ -418,6 +421,7 @@ Total time = 5.377609 seconds ``` :::callout + ## How many ranks and threads should I use? How many ranks and threads you should use depends on lots of parameters, such as the size of your problem (e.g. do you @@ -445,4 +449,4 @@ was, rather naturally, when either $N_{\mathrm{ranks}} = 1$, $N_{\mathrm{threads}$N_{\mathrm{threads}} = 1$ with the former being slightly faster. Otherwise, we found the best balance was $N_{\mathrm{ranks}} = 2$, $N_{\mathrm{threads}} = 3$.
::: -:::: \ No newline at end of file +:::: diff --git a/high_performance_computing/hpc_parallel_intro/01_introduction.md b/high_performance_computing/hpc_parallel_intro/01_introduction.md index 559e4664..1b104c33 100644 --- a/high_performance_computing/hpc_parallel_intro/01_introduction.md +++ b/high_performance_computing/hpc_parallel_intro/01_introduction.md @@ -39,7 +39,9 @@ This can allow us to do much more at once, and therefore get results more quickl | ![Serial Computing](fig/serial2_prog.png) | ![Parallel Computing](fig/parallel_prog.png) | ::::callout + ## Analogy + The basic concept of parallel computing is simple to understand: we divide our job into tasks that can be executed at the same time so that we finish the job in a fraction of the time that it would have taken if the tasks were executed one by one. Suppose that we want to paint the four walls in a room. This is our **problem**. We can divide our **problem** into 4 different **tasks**: paint each of the walls. @@ -53,7 +55,9 @@ If we have 2 or more painters for the job, then the tasks can be performed in ** :::: ::::callout + ## Key idea + In our analogy, the painters represent CPU cores in the computers. The number of CPU cores available determines the maximum number of tasks that can be performed in parallel. The number of concurrent tasks that can be started at the same time, however, is unlimited. @@ -85,7 +89,9 @@ These frameworks provide tools, libraries, and methodologies to handle memory ma Now, let's take a brief look at these fundamental concepts and explore the differences between MPI and OpenMP, setting the stage for a deeper understanding of MPI in the upcoming episodes. ::::callout + ## Processes + A process refers to an individual running instance of a software program. Each process operates independently and possesses its own set of resources, such as memory space and open files. As a result, data within one process remains isolated and cannot be directly accessed by other processes.
@@ -97,6 +103,7 @@ MPI provides a comprehensive set of libraries, tools, and methodologies that ena ![Processes](fig/multiprocess.svg) ## Threads + A thread is an execution unit that is part of a process. It operates within the context of a process and shares the process's resources. Unlike processes, multiple threads within a process can access and share the same data, enabling more efficient and faster parallel programming. @@ -107,24 +114,28 @@ Threads can improve application performance by utilizing parallelism and allowin One advantage of using threads is that they can be easier to work with compared to processes when it comes to parallel programming. When incorporating threads, especially with frameworks like OpenMP, modifying a program becomes simpler. This ease of use stems from the fact that threads operate within the same process and can directly access shared data, eliminating the need for complex inter-process communication mechanisms required by MPI. -However, it's important to note that threads within a process are limited to a single computer. +However, it's important to note that threads within a process are limited to a single computer. While they provide an effective means of utilizing multiple CPU cores on a single machine, they cannot extend beyond the boundaries of that computer. ![Threads](fig/multithreading.svg) :::: ::::callout -## Analogy + +### Analogy + Let's go back to our painting 4 walls analogy. Our example painters have two arms, and could potentially paint with both arms at the same time. Technically, the work being done by each arm is the work of a single painter. -In this example, each painter would be a ***“process”*** (an individual instance of a program). -The painters’ arms represent a ***“thread”*** of a program. +In this example, each painter would be a _**“process”**_ (an individual instance of a program). +The painters’ arms represent a _**“thread”**_ of a program. 
Threads are separate points of execution within a single program, and can be executed either synchronously or asynchronously. :::: ::::callout + ## Shared vs Distributed Memory + Shared memory refers to a memory model where multiple processors can directly access and modify the same memory space. Changes made by one processor are immediately visible to all other processors. @@ -146,10 +157,13 @@ Distributed memory programming models, such as MPI, facilitate communication and - **Scalability:** Shared memory systems are typically limited to a single computer or node, whereas distributed memory systems can scale to larger configurations with multiple computers and nodes. - **Programming Complexity:** Shared memory programming models offer simpler constructs and require less explicit communication compared to distributed memory models. Distributed memory programming involves explicit data communication and synchronization, adding complexity to the programming process. + :::: ::::callout - -## Analogy + +### Analogy + Imagine that all workers have to obtain their paint from a central dispenser located in the middle of the room. If each worker is using a different colour, then they can work asynchronously. However, if they use the same colour, and two of them run out of paint at the same time, then they have to synchronise to use the dispenser — one should wait while the other is being serviced. @@ -161,13 +175,16 @@ We need, however, a communication system in place. Suppose that worker A, for some reason, needs a colour that is only available in the dispenser of worker B, they must then synchronise: worker A must request the paint of worker B and worker B must respond by sending the required colour. ## Key Idea + In our analogy, the paint dispenser represents access to the memory in your computer. Depending on how a program is written, access to data in memory can be synchronous or asynchronous.
For the case where each worker has their own dispenser, however, think of the memory as distributed across each node/computer of a cluster. :::: ::::callout + ## MPI vs OpenMP: What is the difference? + | MPI | OpenMP | |---------|-----------| |Defines an API, vendors provide an optimized (usually binary) library implementation that is linked using your choice of compiler.|OpenMP is integrated into the compiler (e.g., gcc) and does not offer much flexibility in terms of changing compilers or operating systems unless there is an OpenMP compiler available for the specific platform.| @@ -175,16 +192,17 @@ For the different dispensers case for your workers, however, think of the memory |Suitable for both distributed memory and shared memory (e.g., SMP) systems, allowing for parallelization across multiple nodes.|Designed for shared memory systems and cannot be used for parallelization across multiple computers.| |Enables parallelism through both processes and threads, providing flexibility for different parallel programming approaches.|Focuses solely on thread-based parallelism, limiting its scope to shared memory environments.| |Creation of process/thread instances and communication can result in higher costs and overhead.|Offers lower overhead, as inter-process communication is handled through shared memory, reducing the need for expensive process/thread creation.| + :::: -## Parallel Paradigms +## Parallel Paradigms Thinking back to shared vs distributed memory models, how to achieve a parallel computation is divided roughly into **two paradigms**. Let's set both of these in context: -1. In a shared memory model, a ***data parallelism*** paradigm is typically used, as employed by OpenMP: the same operations are performed simultaneously on data that is _shared_ across each parallel operation. +1.
In a shared memory model, a _**data parallelism**_ paradigm is typically used, as employed by OpenMP: the same operations are performed simultaneously on data that is _shared_ across each parallel operation. Parallelism is achieved by how much of the data a single operation can act on. -2. In a distributed memory model, a ***message passing*** paradigm is used, as employed by MPI: each CPU (or core) runs an independent program. +2. In a distributed memory model, a _**message passing**_ paradigm is used, as employed by MPI: each CPU (or core) runs an independent program. Parallelism is achieved by _receiving data_ which it doesn't have, conducting some operations on this data, and _sending data_ which it has. This division is mainly due to historical development of parallel architectures: the first one follows from shared memory architecture like SMP (Shared Memory Processors) and the second from distributed computer architecture. @@ -200,11 +218,13 @@ for(i=0; i *Speedup = T1 / Tn* +> *Speedup = T~1~ / T~n~* -Where *T1* denotes the time taken to run the code with only 1 core, and *Tn* denotes the time taken to run the code with `n` cores. +Where *T~1~* denotes the time taken to run the code with only 1 core, and *T~n~* denotes the time taken to run the code with `n` cores. The speedup efficiency, which measures how efficiently the additional resources are being used, is, -> *Efficiencyn = Speedupn / n*, +> *Efficiency~n~ = Speedup~n~ / n*, Which could be as high as 1, but probably will never reach that in practice. :::::challenge{id=calculate-speedup-1, title="Calculate using your Own Results I"} Submit your Pi job again, as you did in the previous episode, e.g. with a job script called `mpi-pi.sh`: - + ```bash remote$ sbatch mpi-pi.sh ``` @@ -66,37 +71,37 @@ If we use *n* processors, we might expect *n* times speedup. But as we've mentio We can think of a program as being operations which *can* and *can't* be parallelised, i.e.
the part of the code we can and can't be speeded up. The time taken for a program to finish executing is the sum of the fractions of time spent in the serial and parallel portion of the code, -> *Time to Complete (T) = Fraction of time taken in Serial Portion (FS) + Fraction of time taken in Parallel Portion (FP)* +> *Time to Complete (T) = Fraction of time taken in Serial Portion (F~S~) + Fraction of time taken in Parallel Portion (F~P~)* > -> *T = FS + FP* +> *T = F~S~ + F~P~* When a program executes in parallel, the parallel portion of the code is split between the available cores. But since the serial portion is not split in this way, the time to complete is therefore, -> *Tn = FS + FP / n* +> *T~n~ = F~S~ + F~P~ / n* We can see that as the number of cores in use increases, then the time to complete decreases until it approaches that of the serial portion. The speedup from using more cores is, -> *Speedup = T1 / Tn = ( FS + FP ) / ( FS + FP / n )* +> *Speedup = T~1~ / T~n~ = ( F~S~ + F~P~ ) / ( F~S~ + F~P~ / n )* -To simplify the above, we will define the single core execution time as a single unit of time, such that *FS + FP = 1*. +To simplify the above, we will define the single core execution time as a single unit of time, such that *F~S~ + F~P~ = 1*. -> *Speedup = 1 / ( FS + FP / n )* +> *Speedup = 1 / ( F~S~ + F~P~ / n )* Again this shows us that as the number of cores increases, the serial portion of the code will dominate the run time as when *n = ∞*, -> *Max speedup = 1 / FS* +> *Max speedup = 1 / F~S~* ## What's the Maximum Speedup? From the previous section, we know that the maximum speedup achievable is limited to how long a program takes to execute in serial. If we know the portion of time spent in the serial and parallel code, we will theoretically know by how much we can accelerate our program. However, it's not always simple to know the exact value of these fractions.
But from Amdahl's law, if we can measure the speedup as a function of number of cores, we can estimate the maximum speedup. -We can rearrange Amdahl's law to estimate the parallel portion *FP*, +We can rearrange Amdahl's law to estimate the parallel portion *F~P~*, -> *FP = n / ( n - 1 ) ( ( T1 - Tn ) / T1 )* +> *F~P~ = n / ( n - 1 ) ( ( T~1~ - T~n~ ) / T~1~ )* Using the above formula on our example code we get the following results: -| Cores (n) | Tn | Fp | Fs = 1 - Fp | +| Cores (n) | T~n~ | F~p~ | F~s~ = 1 - F~p~ | |-----------|-----------------|---------------|-----------------------------------| | 1 | 3.99667 | - | - | | 2 | 2.064242 | 0.967019 | 0.0329809 | @@ -110,10 +115,11 @@ Using the above formula on our example code we get the following results: We now have an estimated percentage for our serial and parallel portion of our code. As you can see, as the number of cores we use increases, the time spent in the serial portion of the code increases. :::::challenge{id=calculate-speedup-2, title="Calculate using your Own Results II"} -Looking back at your own results from the previous *Calculate using your Own Results I* exercise, create new columns for Fp and Fs and calculate the results for each, using the formula above. Finally, calculate the average for each of these as in the table above. +Looking back at your own results from the previous *Calculate using your Own Results I* exercise, create new columns for F~p~ and F~s~ and calculate the results for each, using the formula above. Finally, calculate the average for each of these as in the table above. ::::: ::::callout + ## Differences in Serial Timings Similarly, in this instance we see that serial run times may vary depending on the run. There are several factors that are impacting our code. Firstly, as we've discussed, these were run on a working system with other users, so runtime will be affected depending on the load of the system.
Throughout DiRAC, it is normal when you run your code to have exclusive access, so this will be less of an issue. But if, for example, your code accesses bulk storage then there may be an impact since these are shared resources. As we are using the MPI library in our code, it would be expected that the serial portion will actually increase slightly with the number of cores due to additional MPI overheads. This will have a noticeable impact if you try scaling your code into the thousands of cores. @@ -121,11 +127,11 @@ Similarly, in this instance we see that serial run times may vary depending on t If we have several values, we can take the average to estimate an upper bound on how much benefit we will get from adding more processors. In our case then, the maximum speedup we can expect is, -> *Max speedup = 1 / FS = 1 / ( 1 - FP ) = 1 / ( 1 - 0.965375 ) = 29* +> *Max speedup = 1 / F~S~ = 1 / ( 1 - F~P~ ) = 1 / ( 1 - 0.965375 ) = 29* -Using this formula we can calculate a table of the expected maximum speedup for a given FP: +Using this formula we can calculate a table of the expected maximum speedup for a given F~P~: -| FP | Max Speedup | +| F~P~ | Max Speedup | |---------------|-------------| | 0.0 | 1.00 | | 0.1 | 1.11 | @@ -140,16 +146,16 @@ Using this formula we can calculate a table of the expected maximum speedup for | 0.95 | 20.00 | | 0.99 | 100.00 | |---------------|-------------| - + :::::challenge{id=cores-vs-speedup, title="Number of Cores vs Expected Speedup"} -Using what we've learned about Amdahl's Law and the average percentages of serial and parallel proportions of our example code we calculated earlier in the *Calculate using your Own Results II* exercise, fill in or create a table estimating the expected total speedup and change in speedup when doubling the number of cores, in a table like the following (with the number of cores doubling each time until a total of 4096). 
Substitute the initial T1 `???????` value with the initial Tn value from your own run. +Using what we've learned about Amdahl's Law and the average percentages of serial and parallel proportions of our example code we calculated earlier in the *Calculate using your Own Results II* exercise, fill in or create a table estimating the expected total speedup and change in speedup when doubling the number of cores, in a table like the following (with the number of cores doubling each time until a total of 4096). Substitute the initial T~1~ `???????` value with the initial T~n~ value from your own run. Hints: use the following formula: - -> *Tn = Fs + ( Fp / n )*
-> *Speedup = T1 / Tn* - -| Cores (n) | Tn | Speedup | Change in Speedup | + +> *T~n~ = F~s~ + ( F~p~ / n )* +> *Speedup = T~1~ / T~n~* + +| Cores (n) | T~n~ | Speedup | Change in Speedup | |-----------|---------------|---------|-------------------| | 1 | ??????? | | | | 2 | | | | @@ -164,9 +170,9 @@ How closely do these estimations correlate with your actual results to 16 cores? Hopefully from your results you will find that we can get close to the maximum speedup calculated earlier, but it requires ever more resources. From our own trial runs, we expect the change in speedup to drop below 1% at 4096 cores, but it is expected that we would never run this code at these core counts as it would be a waste of resources. -Using the `3.9967` T1 starting value, we get the following estimations: +Using the `3.99667` T~1~ starting value, we get the following estimations: -| Cores (n) | Tn | Speedup | Change in Speedup | +| Cores (n) | T~n~ | Speedup | Change in Speedup | |-----------|---------------|-----------|-------------------| | 1 | 3.99667 | 1 | 0 | | 2 | 2.067504 | 1.933089 | 0.933089 | @@ -181,10 +187,10 @@ Using the `3.9967` T1 starting value, we get the following estimation | 1024 | 0.142149 | 28.115998 | 0.726001 | | 2048 | 0.140265 | 28.493625 | 0.377627 | | 4096 | 0.139323 | 28.686268 | 0.192643 | + :::: ::::: - :::::challenge{id=how-many-cores, title="How Many Cores Should we Use?"} From the data you have just calculated, what do you think is the maximum number of cores we should use with our code to balance reduced execution time against efficient usage of compute resources? @@ -193,7 +199,6 @@ From the data you have just calculated, what do you think the maximum number of :::: ::::: - ## Calculating a Weak Scaling Profile Not all codes are suited to strong scaling. As seen in the previous example, even codes with as much as 96% parallelizable code will hit limits. Can we do something to enable moderately parallelizable codes to access the power of HPC systems?
The answer is yes, and is demonstrated through *weak scaling*. @@ -216,7 +221,7 @@ When presenting your weak scaling it is common to show how well it scales, this ![Weak Scaling - Cores vs Time](fig/scalability-weak-scaling-time.png){: width="650px"} -We can also plot the scaling factor. This is the percentage increase in run time compared to base run time for a normal run. In this case we are just using *T1*: +We can also plot the scaling factor. This is the percentage increase in run time compared to base run time for a normal run. In this case we are just using *T~1~*: ![Weak Scaling - Cores vs Scaling Factor](fig/scalability-weak-scaling-factor.png){: width="650px"} @@ -224,6 +229,7 @@ The above plot shows that the code is highly scalable. We do have an anomaly wit :::::challenge{id=calculate-speedup-3, title="Calculate using your Own Results III"} You can reproduce this weak scaling profile with the Pi code by submitting a job which executes the following instead, in `mpi-pi.sh`: + ```bash ... ./run.sh Weak diff --git a/introductory_courses/python/01_running_python.md b/introductory_courses/python/01_running_python.md index 462309b5..285e91e9 100644 --- a/introductory_courses/python/01_running_python.md +++ b/introductory_courses/python/01_running_python.md @@ -1,28 +1,27 @@ --- name: Running Python -dependsOn: [ -] +dependsOn: [] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- Before starting this course, you should install [Visual Studio Code](https://code.visualstudio.com/) and [Python](https://www.python.org/).
Python is a programming language, a collection of syntax rules and keywords that can be used to specify operations to be executed by a computer. But how do we go from these instructions to actual operations carried out by a computer? -This translation and execution is the job of the *Python runtime*, a piece of software that, given some instructions in Python, translates them into machine code and runs them. +This translation and execution is the job of the _Python runtime_, a piece of software that, given some instructions in Python, translates them into machine code and runs them. The Python runtime is, most of the time, called Python, confusing the software that interprets the language with the language itself. -This language shortcut is harmless most of the time, but it's good to know that this it *is* a shortcut. +This language shortcut is harmless most of the time, but it's good to know that it _is_ a shortcut. ## Starting Python -You can start Python (the *Python runtime*) through the command line or through an application also called +You can start Python (the _Python runtime_) through the command line or through an application also called `Python`. ### macOS - Command Line @@ -31,13 +30,13 @@ To start Python you will need to access the command line through the Terminal. There are two ways to open Terminal on Mac. 1. In your Applications folder, open Utilities and double-click on Terminal -2. Press ***Command*** + ***spacebar*** to launch Spotlight. Type `Terminal` and then double-click the search result or hit ***Enter*** +2. Press **_Command_** + **_spacebar_** to launch Spotlight.
Type `Terminal` and then double-click the search result or hit **_Enter_** After you have launched Terminal, type the command to start Python -~~~ bash -$ python -~~~ +```bash +python +``` ### Windows Users - Command Line @@ -47,9 +46,9 @@ Press **_Windows Logo Key_** and search for `Terminal`, click the result or pres After you have launched the Terminal, type the command: -~~~ bash -$ python -~~~ +```bash +python +``` ### GNU/Linux Users - Command Line @@ -58,9 +57,9 @@ You can usually find it under "Accessories". After you have launched the terminal emulator, type the command: -~~~ bash -$ python -~~~ +```bash +python +``` ### Using the Python application @@ -76,34 +75,34 @@ Type the lines preceded by `>>>` or `...` and hit ENTER between each one. Try to guess what these little snippets of Python do, but don't try to understand the details of them yet - it will be clear to you by the end of this course. -~~~ python +```python nolint >>> 1 + 6 7 -~~~ +``` -~~~ python +```python nolint >>> a = 2 >>> b = 3 >>> a + b 5 -~~~ +``` -~~~ python +```python nolint >>> print("Just printing this on the screen") Just printing this on the screen -~~~ +``` -~~~ python +```python nolint >>> word = "Hello" >>> len(word) 5 -~~~ +``` -~~~ python +```python nolint >>> for word in ["Leeds", "Munich", "Marseille"]: ... print("City name has", len(word), "letters in it.") ... @@ -111,31 +110,31 @@ Just printing this on the screen City name has 5 letters in it. City name has 6 letters in it. City name has 9 letters in it. -~~~ +``` -~~~ python +```python nolint >>> for word in ["London", 3, "Marseille"]: ... print("City name has", len(word), "letters in it.") -... +... City name has 6 letters in it. Traceback (most recent call last): File "", line 2, in TypeError: object of type 'int' has no len() -~~~ +``` ## Quitting You can quit Python by typing: -~~~ python +```python nolint >>> quit() -~~~ +``` then ENTER. 
## The REPL -Running Python interactively from the command line, one command after the other, is commonly referred to as using the *Read-Eval-Print-Loop* (REPL, pronounced "repel"). +Running Python interactively from the command line, one command after the other, is commonly referred to as using the _Read-Eval-Print-Loop_ (REPL, pronounced "repel"). Indeed, when doing so, Python **reads** the command, **evaluates** it, **prints** the result and **loops** (goes back to waiting for the next command). The REPL allows for very quick feedback while drafting a Python program or exploring data. diff --git a/introductory_courses/python/02_variables_and_types.md b/introductory_courses/python/02_variables_and_types.md index e5a1233d..5b460ae2 100644 --- a/introductory_courses/python/02_variables_and_types.md +++ b/introductory_courses/python/02_variables_and_types.md @@ -1,15 +1,13 @@ --- name: Variables and Types -dependsOn: [ - introductory_courses.python.01_running_python -] +dependsOn: [introductory_courses.python.01_running_python] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- ## Using variables to store values @@ -18,16 +16,16 @@ attribution: In Python, the `=` symbol assigns the value on the right to the name on the left. The variable is created when a value is assigned to it. Here, Python assigns an age to a variable `age` and a name in quotes to a variable `first_name`. 
-~~~ python +```python age = 42 first_name = 'Ahmed' -~~~ +``` -* Variable names in Python - * can **only** contain letters, digits, and underscore `_` (typically used to separate words in long variable names) - * cannot start with a digit - * are **case sensitive** (`age`, `Age` and `AGE` are three different variables) -* Variable names starting with underscores like `__alistairs_real_age` have special meaning so we won't do that until we understand the convention. +- Variable names in Python + - can **only** contain letters, digits, and underscore `_` (typically used to separate words in long variable names) + - cannot start with a digit + - are **case sensitive** (`age`, `Age` and `AGE` are three different variables) +- Variable names starting with underscores like `__alistairs_real_age` have special meaning so we won't do that until we understand the convention. ## Use `print` to display values @@ -36,13 +34,13 @@ Like any function, we can call print (i.e., tell Python to run it) by using its To add a string to the printout, wrap the string in single or double quotes The values passed to the function are called **arguments** (or args for short). -~~~ python +```python print(first_name, 'is', age, 'years old') -~~~ +``` -~~~ text +```text Ahmed is 42 years old -~~~ +``` `print` automatically puts a single space between items to separate them and wraps around to a new line at the end. @@ -51,73 +49,72 @@ and wraps around to a new line at the end. 
If a variable doesn't exist yet, or if the name has been mis-spelled, Python reports an error (unlike some languages, which "guess" a default value) -~~~ python +```python nolint print(last_name) -~~~ +``` -~~~ error +```error Traceback (most recent call last): File "", line 1, in NameError: name 'last_name' is not defined -~~~ +``` -* The last line of an error message is usually the most informative -* We will look at error messages in detail [later](12_errors_and_exceptions) +- The last line of an error message is usually the most informative +- We will look at error messages in detail [later](12_errors_and_exceptions) ## Variables can be used in calculations -* We can use variables in calculations just as if they were values. - * Remember, we assigned the value `42` to `age` a few lines ago. +- We can use variables in calculations just as if they were values. + - Remember, we assigned the value `42` to `age` a few lines ago. -~~~ python +```python print('Age in three years:', age + 3) -~~~ +``` -~~~ text +```text Age in three years: 45 -~~~ +``` ## Variables can be replaced -* We can replace the value associated with a variable by assigning it a new one. -* Replacing a variable permanently deletes the old value. +- We can replace the value associated with a variable by assigning it a new one. +- Replacing a variable permanently deletes the old value. -~~~ python +```python age = age + 3 print('Age in three years:', age) -~~~ +``` -~~~ text +```text Age in three years: 45 -~~~ +``` ## Use an index to get a single character from a string -* An item in a list is called an element. +- An item in a list is called an element. Whenever we treat a string as if it were a list, the string's elements are the individual characters. -* The characters (individual letters, numbers, and so on) in a string are ordered. +- The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string `'AB'` is not the same as `'BA'`. 
Because of this ordering, we can treat the string as a list of characters. - * We will look at [lists in python more generally later](08_lists) -* Each position in the string (first, second, etc.) is given a number. + - We will look at [lists in python more generally later](08_lists) +- Each position in the string (first, second, etc.) is given a number. This number is called an **index** or sometimes a subscript. -* Indices are numbered from 0. -* Use the position's index in square brackets to get the character at that position. +- Indices are numbered from 0. +- Use the position's index in square brackets to get the character at that position. ![an illustration of indexing](fig/02_indexing.svg) -~~~ python +```python atom_name = 'helium' print(atom_name[0]) -~~~ +``` -~~~ text +```text h -~~~ +``` ## Use a slice to get a substring - A slice is subpart of a string (or, more generally, any list-like thing). When a slice is taken of a string, this is called a **substring**. These substrings can be as short as a single character. @@ -128,24 +125,24 @@ The difference between `stop` and `start` is the slice's length. Taking a slice does not change the contents of the original string, instead, the slice is a copy of part of the original string. -~~~ python +```python atom_name = 'sodium' print(atom_name[0:3]) -~~~ +``` -~~~ text +```text sod -~~~ +``` ## Use the built-in function `len` to find the length of a string -~~~ python +```python print(len('helium')) -~~~ +``` -~~~ text +```text 6 -~~~ +``` Here we have nested two function calls, `len` and `print`, nested functions are evaluated from the inside out, like in mathematics. @@ -154,7 +151,7 @@ nested functions are evaluated from the inside out, like in mathematics. Python treats upper- and lower-case characters as distinct: -* `Name` and `name` are different variables. +- `Name` and `name` are different variables. 
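To see case sensitivity in action, here is a minimal sketch using the three variables mentioned earlier in the lesson:

```python
# Three distinct variables that differ only in case
age = 42
Age = 43
AGE = 44
print(age, Age, AGE)  # each name keeps its own value
```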
There are conventions for when to use upper-case letters at the start of variable names; we will stick to strictly lower-case characters for now.

@@ -162,39 +159,39 @@ There are conventions for when to use upper-case letters at the start of variabl

Python doesn't care what you call variables as long as they obey the rules
(alphanumeric characters and the underscore).

-~~~ python
+```python
flabadab = 42
ewr_422_yY = 'Ahmed'
print(ewr_422_yY, 'is', flabadab, 'years old')
-~~~
+```

However, code is supposed to be read by other humans:

-* Use meaningful variable names to help other people understand what the program does.
-* The most important "other person" is your future self.
+- Use meaningful variable names to help other people understand what the program does.
+- The most important "other person" is your future self.

-::::challenge{id="swapping_values" title="Swapping Values"}
-Fill the table showing the values of the variables in this program *after* each statement is executed.
+::::challenge{id="swapping_values" title="Swapping Values"}
+Fill the table showing the values of the variables in this program _after_ each statement is executed.

-~~~ python
+```text
# Command # Value of x # Value of y # Value of swap #
x = 1.0 # # # #
y = 3.0 # # # #
swap = x # # # #
x = y # # # #
y = swap # # # #
-~~~
+```

:::solution

-~~~ text
+```text
# Command # Value of x # Value of y # Value of swap #
x = 1.0 # 1.0 # not defined # not defined #
y = 3.0 # 1.0 # 3.0 # not defined #
swap = x # 1.0 # 3.0 # 1.0 #
x = y # 3.0 # 3.0 # 1.0 #
y = swap # 3.0 # 1.0 # 1.0 #
-~~~
+```

These three lines exchange the values in `x` and `y` using the `swap` variable for temporary storage. This is a fairly common programming idiom.
:::
@@ -203,16 +200,16 @@ These three lines exchange the values in `x` and `y` using the `swap` variable f
::::challenge{id="slicing_practice" title="Slicing practice"}
What does the following program print?
-~~~ python +```python atom_name = 'carbon' print('atom_name[1:3] is:', atom_name[1:3]) -~~~ +``` :::solution -~~~ text +```text atom_name[1:3] is: ar -~~~ +``` ::: :::: @@ -234,6 +231,7 @@ atom_name[1:3] is: ar 4. `thing[:]` returns all of `thing` 5. `thing[number:some-negative-number]` returns a slice from `number` to `some-negative-number` values from the end of `thing` 6. If a part of the slice is out of range, the operation does not fail. `atom_name[0:15]` gives the same result as `atom_name[0:]`. + ::: :::: @@ -242,32 +240,32 @@ atom_name[1:3] is: ar In programming a "type" is a method of categorising like data which share characteristics, representations, common operations, etc. Every value in a program has a specific type: -* Integer (`int`): represents positive or negative whole numbers like 3 or -512. -* Floating point number (`float`): represents real numbers like 3.14159 or -2.5. -* Character string (usually called "string", `str`): text. - * Written in either single quotes or double quotes (as long as they match). - * The quote marks aren't printed when the string is displayed. +- Integer (`int`): represents positive or negative whole numbers like 3 or -512. +- Floating point number (`float`): represents real numbers like 3.14159 or -2.5. +- Character string (usually called "string", `str`): text. + - Written in either single quotes or double quotes (as long as they match). + - The quote marks aren't printed when the string is displayed. -## Use the built-in function `type` to find the type of a value. +## Use the built-in function `type` to find the type of a value We can use the built-in function `type` to find out what type a value has, this works on variables as well. -* Remember: it is the *value* which has a type --- the *variable* name is just a label. +- Remember: it is the _value_ which has a type --- the _variable_ name is just a label. 
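A brief sketch of this distinction: rebinding the same name to a new value changes what `type` reports, because the type travels with the value, not with the label (the variable names here are invented for illustration):

```python
thing = 52
type_before = type(thing)  # the value 52 is an int
thing = 'fifty-two'
type_after = type(thing)   # the same label now points at a str
print(type_before, type_after)
```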
-``` python
+```python
print(type(52))
```

-``` python
+```text
<class 'int'>
```

-``` python
+```python
fitness = 'average'
print(type(fitness))
```

-``` python
+```text
<class 'str'>
```

@@ -275,19 +273,19 @@ print(type(fitness))

A value's type determines what the program can do to it.

-``` python
+```python
print(5 - 3)
```

-``` python
+```text
2
```

-``` python
+```python
print('hello' - 'h')
```

-``` python
+```text
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
 in ()
@@ -296,49 +294,49 @@ TypeError Traceback (most recent call last)
TypeError: unsupported operand type(s) for -: 'str' and 'str'
```

-## You can use the "+" and "*" operators on strings.
+## You can use the "+" and "\*" operators on strings

"Adding" character strings concatenates them.

-``` python
+```python
full_name = 'Ahmed' + ' ' + 'Walsh'
print(full_name)
```

-``` python
+```text
Ahmed Walsh
```

-Multiplying a character string by an integer _N_ creates a new string that consists of that character string repeated _N_ times. Since multiplication is repeated addition.
+Multiplying a character string by an integer _N_ creates a new string that consists of that character string repeated _N_ times, since multiplication is repeated addition.

-``` python
+```python
separator = '=' * 10
print(separator)
```

-``` python
+```text
==========
```

-## Strings have a length (but numbers don't).
+## Strings have a length (but numbers don't)

The built-in function `len` counts the number of characters in a string.

-``` python
+```python
print(len(full_name))
```

-``` python
+```text
11
```

But numbers don't have a length (not even zero).
-``` python +```python print(len(52)) ``` -``` python +```text --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () @@ -347,15 +345,15 @@ TypeError Traceback (most recent call last) TypeError: object of type 'int' has no len() ``` -## Must convert numbers to strings or vice versa when operating on them. +## Must convert numbers to strings or vice versa when operating on them Cannot add numbers and strings. -``` python +```python print(1 + '2') ``` -``` python +```text --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () @@ -366,12 +364,12 @@ TypeError: unsupported operand type(s) for +: 'int' and 'str' Is not allowed in python because it is ambiguous: should `1 + '2'` be `3` or `'12'`? Some types can be converted to other types by using the type name as a function. -``` python +```python print(1 + int('2')) print(str(1) + '2') ``` -```python +```text 3 12 ``` @@ -380,12 +378,12 @@ print(str(1) + '2') Integers and floating-point numbers can be mixed in arithmetic. Python 3 automatically converts integers to floats as needed. -``` python +```python print('half is', 1 / 2.0) print('three squared is', 3.0 ** 2) ``` -``` python +```text half is 0.5 three squared is 9.0 ``` @@ -394,14 +392,14 @@ three squared is 9.0 If we make one cell in a spreadsheet depend on another, and update the latter, the former updates automatically. This does **not** happen in programming languages. -``` python +```python first = 1 second = 5 * first first = 2 print('first is', first, 'and second is', second) ``` -``` text +```text first is 2 and second is 5 ``` @@ -413,12 +411,12 @@ What type of value is 3.25 + 4? :::solution It is a float: integers are automatically converted to floats as necessary. 
-``` python
+```python
result = 3.25 + 4
print(result, 'is', type(result))
```

-``` python
+```text
7.25 is <class 'float'>
```

:::
::::

::::challenge{id="choose_a_type" title="Choose a Type"}
-What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in # 1, when would counting days with a floating point variable make more sense than using an integer?
+What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in #1, when would counting days with a floating point variable make more sense than using an integer?

1. Number of days since the start of the year.
2. Time elapsed from the start of the year until now in days.
@@ -444,25 +442,26 @@ The answers to the questions are:
4. This will vary! How do you define a specimen's age? Whole days since collection (integer)? Date and time (string)?
5. Choose floating point to represent population as large aggregates (e.g. millions), or integer to represent population in units of individuals.
6. Floating point number, since an average is likely to have a fractional part.
+ ::: :::: ::::challenge{id="division_types" title="Division Types"} In Python 3, the `//` operator performs integer (whole-number) floor division, the `/` operator performs floating-point -division, and the `%` (or *modulo*) operator calculates and returns the remainder from integer division: +division, and the `%` (or _modulo_) operator calculates and returns the remainder from integer division: -~~~ python +```python print('5 // 3:', 5 // 3) print('5 / 3:', 5 / 3) print('5 % 3:', 5 % 3) -~~~ +``` -~~~ python +```text 5 // 3: 1 5 / 3: 1.6666666666666667 5 % 3: 2 -~~~ +``` If `num_subjects` is the number of subjects taking part in a study, and `num_per_survey` is the number that can take part in a single survey, write an expression that calculates the number of surveys needed to reach everyone once. @@ -471,17 +470,17 @@ We want the minimum number of surveys that reaches everyone once, which is the r This is equivalent to performing a floor division with `//` and adding 1. Before the division we need to subtract 1 from the number of subjects to deal with the case where `num_subjects` is evenly divisible by `num_per_survey`. 
-~~~ python +```python num_subjects = 600 num_per_survey = 42 num_surveys = (num_subjects - 1) // num_per_survey + 1 print(num_subjects, 'subjects,', num_per_survey, 'per survey:', num_surveys) -~~~ +``` -~~~ text +```text 600 subjects, 42 per survey: 15 -~~~ +``` ::: :::: diff --git a/introductory_courses/python/03_writing_and_running_ide.md b/introductory_courses/python/03_writing_and_running_ide.md index 10d744e7..0287e35f 100644 --- a/introductory_courses/python/03_writing_and_running_ide.md +++ b/introductory_courses/python/03_writing_and_running_ide.md @@ -1,15 +1,13 @@ --- name: Writing and running Python from an IDE -dependsOn: [ - introductory_courses.python.02_variables_and_types -] +dependsOn: [introductory_courses.python.02_variables_and_types] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- ## Python scripts @@ -17,11 +15,11 @@ attribution: So far we've worked in the REPL which does not allow us to save our programs. Instead of the interactive mode, Python can read a file that contains Python instructions. -This file is commonly referred to as a Python *script*. +This file is commonly referred to as a Python _script_. Python scripts are plain text files. -To create a plain text file, you need to use a *text editor*. -When we do programming, typically we will use a special text editor called an IDE, or *integrated development environment*, that contains lots of tools to help us. +To create a plain text file, you need to use a _text editor_. 
+When we do programming, typically we will use a special text editor called an IDE, or _integrated development environment_, that contains lots of tools to help us.
We will use Visual Studio Code (vscode), which is a free IDE available for Windows, Mac, and Linux.

:::callout
@@ -41,21 +39,21 @@ Follow these steps:

Now, change your script to contain the following:

-~~~ python
+```python
print("hello world")
varint = 1
print("Variable 'varint' is a", type(varint))
varstr = "astring"
print("Variable 'varstr' is a", type(varstr))
-~~~
+```

Press the triangular `Play` button in the top right, and you should see the following output:

-~~~ python
+```text
hello world
Variable 'varint' is a <class 'int'>
Variable 'varstr' is a <class 'str'>
-~~~
+```

Can you guess what happened?
Python read the file, executing each line one after the other.
diff --git a/introductory_courses/python/04_built_in_functions_and_help.md b/introductory_courses/python/04_built_in_functions_and_help.md
index 1c9d9fec..2f150688 100644
--- a/introductory_courses/python/04_built_in_functions_and_help.md
+++ b/introductory_courses/python/04_built_in_functions_and_help.md
@@ -1,25 +1,23 @@
---
name: Built-in Functions and Help
-dependsOn: [
-  introductory_courses.python.03_writing_and_running_ide
-]
+dependsOn: [introductory_courses.python.03_writing_and_running_ide]
tags: [python]
-attribution: 
-    - citation: >
-        "Programming with Python" course by the Carpentries
-      url: https://swcarpentry.github.io/python-novice-inflammation/
-      image: https://carpentries.org/assets/img/TheCarpentries.svg
-      license: CC-BY-4.0
+attribution:
+  - citation: >
+      "Programming with Python" course by the Carpentries
+      url: https://swcarpentry.github.io/python-novice-inflammation/
+      image: https://carpentries.org/assets/img/TheCarpentries.svg
+      license: CC-BY-4.0
---

## Use comments to add documentation to programs

Any text preceded by the hash mark (`#`) is ignored by Python; this is called a "comment".
-~~~ python +```python # This sentence isn't executed by Python. adjustment = 0.5 # Neither is this - anything after '#' is ignored. -~~~ +``` Comments make programs easier for humans to understand, good code is written for humans to read, not just for computers to execute. @@ -27,63 +25,63 @@ Comments make programs easier for humans to understand, good code is written for We have used functions already --- now let's take a closer look. -An *argument* is a value passed into a function. +An _argument_ is a value passed into a function. Functions can take any number of arguments, including zero. -* `len` takes exactly one. -* `int`, `str`, and `float` take one (and return one). -* `print` takes zero or more. - * `print` with no arguments prints a blank line. +- `len` takes exactly one. +- `int`, `str`, and `float` take one (and return one). +- `print` takes zero or more. + - `print` with no arguments prints a blank line. Function calls must always include the parentheses, even if they're empty, so that Python knows a function is being called. -~~~ python +```python print('before') print() print('after') -~~~ +``` produces this output: -~~~ text +```text before after -~~~ +``` ## Function returns Every function call produces some result but not every function returns something. -* If the function doesn't have a useful result to return, it usually returns the special value `None`. +- If the function doesn't have a useful result to return, it usually returns the special value `None`. `None` is a Python object that stands in anytime there is no value. -~~~ python +```python result = print('example') print('result of print is', result) -~~~ +``` -~~~ text +```text example result of print is None -~~~ +``` ## Commonly-used built-in functions -* Use `max` to find the largest value of one or more values. -* Use `min` to find the smallest. -* Both work on character strings as well as numbers. - * "Larger" and "smaller" use (0-9, A-Z, a-z) to compare letters. 
+- Use `max` to find the largest value of one or more values. +- Use `min` to find the smallest. +- Both work on character strings as well as numbers. + - "Larger" and "smaller" use (0-9, A-Z, a-z) to compare letters. -~~~ python +```python print(max(1, 2, 3)) print(min('a', 'A', '0')) -~~~ +``` -~~~ text +```text 3 0 -~~~ +``` ## Functions may only work for certain (combinations of) arguments @@ -91,17 +89,17 @@ print(min('a', 'A', '0')) They must be given things that can meaningfully be compared, -~~~ python +```python print(max(1, 'a')) -~~~ +``` -~~~ text +```text TypeError Traceback (most recent call last) in ----> 1 print(max(1, 'a')) TypeError: '>' not supported between instances of 'str' and 'int' -~~~ +``` ## Functions may have default values @@ -109,34 +107,34 @@ Some functions have optional values which are used when the user does not provid `round` will round off a floating-point number, by default, it rounds to zero decimal places. -~~~ python +```python round(3.712) -~~~ +``` -~~~ text +```text 4 -~~~ +``` We can specify the number of decimal places we want, -~~~ python +```python round(3.712, 1) -~~~ +``` -~~~ text +```text 3.7 -~~~ +``` ## Methods -Functions can be tied to a particular object, these are called *methods*, we will make extensive use of these when learning about [pandas](07_pandas_dataframes). +Functions can be tied to a particular object, these are called _methods_, we will make extensive use of these when learning about [pandas](07_pandas_dataframes). Methods behave in the same way as functions, and are called with parentheses like functions. They can be accessed from their base object with a dot (`obj.method()`). -* Some methods are used for internal Python operations, these are marked with double underlines. +- Some methods are used for internal Python operations, these are marked with double underlines. -~~~ python -my_string = 'Hello world!' # creation of a string object +```python +my_string = 'Hello world!' 
# creation of a string object

print(len(my_string)) # the len function takes a string as an argument and returns the length of the string

@@ -144,97 +142,96 @@ print(my_string.swapcase()) # calling the swapcase method on the my_string objec

print(my_string.__len__()) # calling the internal __len__ method on the my_string object, used by len(my_string)

-~~~
+```

-~~~ text
+```text
12
hELLO WORLD!
12
-~~~
+```

-* You might even see them chained together, in which case they operate left to right.
+- You might even see them chained together, in which case they operate left to right.

-~~~ python
+```python
print(my_string.isupper()) # Not all the letters are uppercase
print(my_string.upper()) # This capitalizes all the letters
print(my_string.upper().isupper()) # Now all the letters are uppercase
-~~~
+```

-~~~ text
+```text
False
HELLO WORLD!
True
-~~~
+```

## Use the built-in function `help`

Every built-in function has online documentation; you can access this with the `help` function.

-~~~ python
+```python
help(round)
-~~~
+```

-~~~ text
+```text
Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.
-    
+
    The return value is an integer if ndigits is omitted or None. Otherwise
    the return value has the same type as the number. ndigits may be
    negative.
-~~~
+```

## Syntax errors

When you execute Python code, mistakes in your syntax will be reported immediately.
-This is called a *syntax error*.
+This is called a _syntax error_.
Here, syntax refers to the structure of a program and the rules about that structure, essentially the grammar of programming.

-* Python won't even try to run the program if it can't be parsed.
+- Python won't even try to run the program if it can't be parsed.

-~~~ python
+```python nolint
# Forgot to close the quote marks around the string.
name = 'Feng -~~~ +``` -~~~ +```text File "", line 2 name = 'Feng ^ SyntaxError: EOL while scanning string literal -~~~ +``` -~~~ python +```python nolint # An extra '=' in the assignment. age = = 52 -~~~ +``` -~~~ +```text File "", line 2 age = = 52 ^ SyntaxError: invalid syntax -~~~ +``` -* Look more closely at the error message: +- Look more closely at the error message: -~~~ python +```python nolint print("hello world" -~~~ +``` -~~~ +```text File "", line 1 print ("hello world" ^ SyntaxError: unexpected EOF while parsing -~~~ - +``` -* The message indicates a problem on first line of the input ("line 1"). - * In this case the "ipython-input" section of the file name tells us that we are working with input into IPython, the Python interpreter used by the Jupyter Notebook. -* The `-6-` part of the filename indicates that the error occurred in cell 6 of our Notebook. -* Next is the problematic line of code, the problem is indicated with a `^` pointer. +- The message indicates a problem on first line of the input ("line 1"). + - In this case the "ipython-input" section of the file name tells us that we are working with input into IPython, the Python interpreter used by the Jupyter Notebook. +- The `-6-` part of the filename indicates that the error occurred in cell 6 of our Notebook. +- Next is the problematic line of code, the problem is indicated with a `^` pointer. Syntax errors can be caught before executing the program, and can be easier to fix. We can make use of "linting" tools (such as those built in to our IDEs) which are effectively spellcheckers for programming to help us find, and correct, syntax errors. @@ -244,29 +241,31 @@ We can make use of "linting" tools (such as those built in to our IDEs) which ar Python reports a `runtime error` when something goes wrong while a program is executing. 
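The claim that Python won't run a program it can't parse can be checked directly. A sketch using the built-in `compile` function (not otherwise covered in this lesson) to parse code without executing it:

```python
# The unterminated string from the first example above
source = "name = 'Feng"
try:
    compile(source, '<example>', 'exec')  # parse only, do not run
    parsed = True
except SyntaxError:
    parsed = False
print('parsed:', parsed)  # parsing fails, so the code is never executed
```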
A runtime error is also called an `exception` because it usually indicates that something exceptional (and bad), outside the bounds of the program's normal operations, has happened.

-~~~ python
+```python nolint
age = 53
remaining = 100 - aege # mis-spelled 'age'
-~~~
+```

-~~~ text
+```text
NameError Traceback (most recent call last)
 in
1 age = 53
----> 2 remaining = 100 - aege # mis-spelled 'age'

NameError: name 'aege' is not defined
-~~~
+```

As before, the error message indicates where the problem occurred.
It is up to you to diagnose and fix the problem, though there are tools which make this easier.
We will see more about [errors and exceptions](12_errors_and_exceptions) later.

-* Fix syntax errors by reading the source and runtime errors by tracing execution.
+- Fix syntax errors by reading the source and runtime errors by tracing execution.

:::callout
-## Explore the Python docs!
+
+## Explore the Python docs
+
The [official Python documentation](https://docs.python.org/3/) is arguably the most complete source of information about the language. It is available in different languages and contains a lot of useful resources.
The [Built-in Functions page](https://docs.python.org/3/library/functions.html) contains a catalogue of all of these functions, including the ones that we've covered in this lesson.
diff --git a/introductory_courses/python/05_libraries.md b/introductory_courses/python/05_libraries.md index 41451530..f8ba1b00 100644 --- a/introductory_courses/python/05_libraries.md +++ b/introductory_courses/python/05_libraries.md @@ -1,40 +1,39 @@ --- name: Libraries -dependsOn: [ - introductory_courses.python.04_built_in_functions_and_help -] +dependsOn: [introductory_courses.python.04_built_in_functions_and_help] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- ## The power of libraries Most of the power of a programming language lies in its libraries. -A *library*, or a *package* is a collection of files (each called a *module*) that contains functions, types etc. for use in other programs. +A _library_, or a _package_ is a collection of files (each called a _module_) that contains functions, types etc. for use in other programs. -* May also contain data values (e.g., numerical constants) and other things. -* A good libraries content will be related, but there's no way to enforce that. +- May also contain data values (e.g., numerical constants) and other things. +- A good libraries content will be related, but there's no way to enforce that. The [Python standard library](https://docs.python.org/3/library/) is an extensive suite of modules that comes with Python itself. -* Many additional libraries are available from [PyPI](https://pypi.python.org/pypi/) (the Python Package Index) or elsewhere. -* We will later see how to write our own libraries. 
+- Many additional libraries are available from [PyPI](https://pypi.python.org/pypi/) (the Python Package Index) or elsewhere. +- We will later see how to write our own libraries. Using libraries brings several benefits: -* We don't have to write everything ourselves, often the task you are trying to accomplish has been done before. -* Good libraries may also be documented, tested, optimised and continually maintained such that they are more reliable than our own code. -* Our code is easier to understand to others, since they may be familiar with the library we are using. -* We can learn good programming practices from libraries. - * How did they accomplish something? +- We don't have to write everything ourselves, often the task you are trying to accomplish has been done before. +- Good libraries may also be documented, tested, optimised and continually maintained such that they are more reliable than our own code. +- Our code is easier to understand to others, since they may be familiar with the library we are using. +- We can learn good programming practices from libraries. + - How did they accomplish something? :::callout + ## Libraries and modules A library is a collection of modules, but the terms are often used @@ -42,43 +41,42 @@ interchangeably, especially since many libraries only consist of a single module, so don't worry if you mix the terms. ::: - ## Importing modules A module must be imported before it can be used. -* Use an `import` statement to load a library module into a program's memory. -* Once imported, we refer to "things" from the module as `module_name.thing_name`. - * Python uses `.` to mean "part of". - * Modules can be nested within one another e.g. `module_name.submodule.function`. +- Use an `import` statement to load a library module into a program's memory. +- Once imported, we refer to "things" from the module as `module_name.thing_name`. + - Python uses `.` to mean "part of". + - Modules can be nested within one another e.g. 
`module_name.submodule.function`. Using `math`, one of the modules in the standard library: -~~~python +```python import math print('pi is', math.pi) print('cos(pi) is', math.cos(math.pi)) -~~~ +``` -~~~ text +```text pi is 3.141592653589793 cos(pi) is -1.0 -~~~ +``` We always have to refer to each item with the module's name. -* `math.cos(pi)` won't work: the reference to `pi` doesn't somehow "inherit" the function's reference to `math`. +- `math.cos(pi)` won't work: the reference to `pi` doesn't somehow "inherit" the function's reference to `math`. ## Use `help` on modules We can find more out about a module with `help`, this works just like with a function. -~~~ python +```python help(math) -~~~ +``` -~~~ text +```text Help on module math: NAME @@ -101,51 +99,51 @@ FUNCTIONS acos(x, /) Return the arc cosine (measured in radians) of x. ⋮ ⋮ ⋮ -~~~ +``` ## Importing specific items We can simplify and speed up our programs by importing only what we need. -* Use the form `from ... import ...` to load only specific items from a library module. -* Then refer to them directly without library name as prefix. +- Use the form `from ... import ...` to load only specific items from a library module. +- Then refer to them directly without library name as prefix. -~~~ python +```python from math import cos, pi print('cos(pi) is', cos(pi)) -~~~ +``` -~~~ text +```text cos(pi) is -1.0 -~~~ +``` ## Creating aliases We can create an alias for a library module when importing it to make our programs clearer and shorter. -* Use the form `import ... as ...` to give a library a short *alias* while importing it. -* Then refer to items in the library using that shortened name. +- Use the form `import ... as ...` to give a library a short _alias_ while importing it. +- Then refer to items in the library using that shortened name. 
-~~~ python
+```python
import math as m

print('cos(pi) is', m.cos(m.pi))
-~~~
+```

-~~~ text
+```text
cos(pi) is -1.0
-~~~
+```

-* Commonly used for libraries that are frequently used or have long names.
-  * E.g., the `matplotlib` plotting library is often aliased as `mpl`.

-Overusing aliasing can make programs harder to understand, since readers must learn your program's aliases.
+- Commonly used for libraries that are frequently used or have long names.
+  - E.g., the `matplotlib` plotting library is often aliased as `mpl`.
+
+Overusing aliasing can make programs harder to understand, since readers must learn your program's aliases.

We can, of course, also combine both selective importing and aliasing
using `from ... import ... as ...` to do both things at once.

-~~~ python
+```python
from matplotlib import pyplot as plt
-~~~
+```

This will alias just the `matplotlib.pyplot` module to `plt`.

@@ -153,11 +151,11 @@ Will alias just the matplotlib.pyplot module into `plt`.

You want to select a random character from a string:

-~~~ python
+```python
bases = 'ACTTGCTTGAC'
-~~~
+```

-1. Which [standard library][stdlib] module could help you?
+1. Which [standard library](https://docs.python.org/3/library/) module could help you?
2. Which function would you select from that module? Are there alternatives?
3. Try to write a program that uses the function.

@@ -170,28 +168,28 @@ You could use either `random.randrange` or `random.randint` functions
to get a random integer between 0 and 10, and then pick out the character at that
position:

-~~~ python
+```python
from random import randrange

random_index = randrange(len(bases))
print(bases[random_index])
-~~~
+```

or more compactly:

-~~~ python
+```python
from random import randrange

print(bases[randrange(len(bases))])
-~~~
+```

Perhaps you found the `random.sample` function?
It allows for slightly less typing:

-~~~ python
+```python
from random import sample

print(sample(bases, 1)[0])
-~~~
+```

Note that this function returns a list of values.
We will learn about [lists](08_lists) later. @@ -204,9 +202,9 @@ There's also other functions you could use, but with more convoluted code as a r When a colleague of yours types `help(math)`, Python reports an error: -~~~ +```text NameError: name 'math' is not defined -~~~ +``` What has your colleague forgotten to do? @@ -219,30 +217,30 @@ Importing the math module (`import math`) ::::challenge{id="importing_with_aliases" title="Importing With Aliases"} 1. Fill in the blanks so that the program below prints `90.0`. -2. Rewrite the program so that it uses `import` *without* `as`. +2. Rewrite the program so that it uses `import` _without_ `as`. 3. Which form do you find easier to read? -~~~ python +```python nolint import math as m angle = ____.degrees(____.pi / 2) print(____) -~~~ +``` :::solution -~~~ python +```python import math as m angle = m.degrees(m.pi / 2) print(angle) -~~~ +``` can be written as -~~~ python +```python import math angle = math.degrees(math.pi / 2) print(angle) -~~~ +``` Since you just wrote the code and are familiar with it, you might actually find the first version easier to read. But when trying to read a huge piece of code written by someone else, or when getting back to your own huge piece of code after several months, non-abbreviated names are often easier, except where there are clear abbreviation conventions. @@ -279,7 +277,7 @@ Library calls: 3. Library call 2. Here `sin` and `pi` are referred to with the regular library name `math`, so the regular `import ...` call suffices. -__Note:__ although library call 4 works, importing all names from a module using a wildcard +**Note:** although library call 4 works, importing all names from a module using a wildcard import is [not recommended](https://pep8.org/#imports) as it makes it unclear which names from the module are used in the code. In general it is best to make your imports as specific as possible and to only import what your code uses. 
In library call 1, the `import` statement explicitly tells us
@@ -292,21 +290,21 @@ convey this information.

1. Fill in the blanks so that the program below prints `90.0`.
2. Do you find this version easier to read than preceding ones?
-3. Why *wouldn't* programmers always use this form of `import`?
+3. Why _wouldn't_ programmers always use this form of `import`?

-~~~ python
+```python nolint
____ math import ____, ____
angle = degrees(pi / 2)
print(angle)
-~~~
+```

:::solution

-~~~python
+```python
from math import degrees, pi
angle = degrees(pi / 2)
print(angle)
-~~~
+```

Most likely you find this version easier to read since it's less dense.
The main reason not to use this form of import is to avoid name clashes.
@@ -320,14 +318,14 @@ Or if you were to also import a function named `degrees` from another library.

1. Read the code below and try to identify what the errors are without running it.
2. Run the code, and read the error message. What type of error is it?

-~~~python
+```python
from math import log
log(0)
-~~~
+```

:::solution

-~~~ text
+```text
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

----> 2 log(0)

ValueError: math domain error
-~~~
+```

1. The logarithm of `x` is only defined for `x > 0`, so 0 is outside the domain of the function.
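As an aside, a program can also anticipate this kind of domain error and handle it with `try`/`except` rather than crashing. The sketch below is illustrative and not part of the original lesson; the list of values is made up:

```python
import math

# math.log raises ValueError for arguments outside its domain (x <= 0),
# so we can catch the exception instead of letting the program stop.
for x in [10, 1, 0]:
    try:
        print('log of', x, 'is', math.log(x))
    except ValueError as error:
        print('cannot take log of', x, '-', error)
```

Catching `ValueError` narrowly (rather than a bare `except`) keeps unrelated bugs visible while handling the one failure we expect.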
@@ -348,7 +346,7 @@ ValueError: math domain error ## Useful links -* [pypi](https://pypi.python.org/pypi/) -* [stdlib](https://docs.python.org/3/library/) -* [randommod](https://docs.python.org/3/library/random.html) -* [pep8-imports](https://pep8.org/#imports) +- [pypi](https://pypi.python.org/pypi/) +- [stdlib](https://docs.python.org/3/library/) +- [randommod](https://docs.python.org/3/library/random.html) +- [pep8-imports](https://pep8.org/#imports) diff --git a/introductory_courses/python/06_analyzing_and_visualizing_data.md b/introductory_courses/python/06_analyzing_and_visualizing_data.md index c65bf7ec..3c15f05b 100644 --- a/introductory_courses/python/06_analyzing_and_visualizing_data.md +++ b/introductory_courses/python/06_analyzing_and_visualizing_data.md @@ -1,15 +1,13 @@ --- name: Analyzing and Visualizing Data -dependsOn: [ - introductory_courses.python.05_libraries -] +dependsOn: [introductory_courses.python.05_libraries] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- :::callout @@ -26,9 +24,9 @@ One way we can accomplish this is using a library called [NumPy](http://docs.sci In general, you should be using this library whenever you want to do fancy things with lots of numbers, especially if you have matrices or arrays. To tell Python that we'd like to start using NumPy, we need to [import it](05_libraries): -~~~ python +```python import numpy -~~~ +``` Importing a library is like getting a piece of lab equipment out of a storage locker and setting it up on the bench. 
Libraries provide additional functionality to the basic Python package, much like
@@ -38,19 +36,19 @@ need for each program.

Once we've imported the library, we can ask the library to read our data file for us:

-~~~ python
+```python
numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
-~~~
+```

-~~~ python
-array([[ 0., 0., 1., ..., 3., 0., 0.],
+```text
+numpy.array([[ 0., 0., 1., ..., 3., 0., 0.],
[ 0., 1., 2., ..., 1., 0., 1.],
[ 0., 1., 1., ..., 2., 1., 1.],
...,
[ 0., 1., 1., ..., 1., 1., 1.],
[ 0., 0., 0., ..., 0., 2., 0.],
[ 0., 0., 1., ..., 1., 1., 0.]])
-~~~
+```

The expression `numpy.loadtxt(...)` is a function call that asks Python to run the function `loadtxt` which belongs to the `numpy` library.
This dotted notation is used everywhere in Python: the thing that appears before the dot contains the thing that appears after.
@@ -74,19 +72,19 @@ In a similar manner to how we assign a single value to a variable, we can also assign an array of
values to a variable using the same syntax. Let's re-run `numpy.loadtxt` and save the returned data:

-~~~ python
+```python
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
-~~~
+```

This statement doesn't produce any output because we've assigned the output to the variable `data`.
If we want to check that the data have been loaded, we can print the variable's value:

-~~~ python
+```python
print(data)
-~~~
+```

-~~~ python
+```text
[[ 0. 0. 1. ..., 3. 0. 0.]
[ 0. 1. 2. ..., 1. 0. 1.]
[ 0. 1. 1. ..., 2. 1. 1.]
@@ -94,18 +92,18 @@ print(data)
[ 0. 1. 1. ..., 1. 1. 1.]
[ 0. 0. 0. ..., 0. 2. 0.]
[ 0. 0. 1. ..., 1. 1. 0.]]
-~~~
+```

Now that the data are in memory, we can manipulate them.
First, let's ask what type of thing `data` refers to:

-~~~ python
+```python
print(type(data))
-~~~
+```

-~~~ python
+```text
<class 'numpy.ndarray'>
-~~~
+```

The output tells us that `data` currently refers to an N-dimensional array, the functionality for which is provided by the NumPy library.
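If `inflammation-01.csv` isn't to hand, the same `loadtxt` call can be tried on an in-memory stand-in for a file. The sketch below is illustrative only: its 3×3 values are made up, not taken from the inflammation dataset:

```python
import io
import numpy

# io.StringIO wraps a string so it behaves like an open file,
# which numpy.loadtxt accepts in place of a filename.
fake_file = io.StringIO('0,0,1\n0,1,2\n0,1,1\n')
small_data = numpy.loadtxt(fname=fake_file, delimiter=',')

print(type(small_data))   # the same N-dimensional array type as above
print(small_data.shape)
```

This gives a `numpy.ndarray` exactly as loading from a real file would, just with a 3-row, 3-column shape.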
@@ -121,36 +119,36 @@ A Numpy array contains one or more elements of the same type. The `type` function will only tell you that a variable is a NumPy array but won't tell you the type of thing inside the array.
We can find out the type
of the data contained in the NumPy array.

-~~~ python
+```python
print(data.dtype)
-~~~
+```

-~~~ text
+```text
float64
-~~~
+```

Likewise, we can use type to find out the type of a single value; this can be useful when dealing with other data structures.

-~~~ python
+```python
type(data[0,0])
-~~~
+```

This tells us that the NumPy array's elements are floating point numbers.
:::

With the following command, we can see the array's shape:

-~~~ python
+```python
print(data.shape)
-~~~
+```

-~~~ python
+```text
(60, 40)
-~~~
+```

The output tells us that the `data` array variable contains 60 rows and 40 columns. When we
created the variable `data` to store our arthritis data, we did not only create the array; we also
-created information about the array, called *members* or
+created information about the array, called _members_ or
attributes. This extra information describes `data` in the same way an adjective describes a noun.
`data.shape` is an attribute of `data` which describes the dimensions of `data`. We use the same
dotted notation for the attributes of variables that we use for the functions in libraries because
@@ -160,21 +158,21 @@ If we want to get a single number from the array, we must provide an index in sq
just as we do in math when referring to an element of a matrix. Our inflammation data has two
dimensions, so we will need to use two indices to refer to one specific value:

-~~~ python
+```python
print('first value in data:', data[0, 0])
-~~~
+```

-~~~ text
+```text
first value in data: 0.0
-~~~
+```

-~~~ python
+```python
print('middle value in data:', data[30, 20])
-~~~
+```

-~~~ text
+```text
middle value in data: 13.0
-~~~
+```

The expression `data[30, 20]` accesses the element at row 30, column 20.
While this expression may not surprise you, `data[0, 0]` might. @@ -208,16 +206,16 @@ An index like `[30, 20]` selects a single element of an array, but we can select For example, we can select the first ten days (columns) of values for the first four patients (rows) like this: -~~~ python +```python print(data[0:4, 0:10]) -~~~ +``` -~~~ python +```text [[ 0. 0. 1. 3. 1. 2. 4. 7. 8. 3.] [ 0. 1. 2. 1. 2. 1. 3. 2. 2. 6.] [ 0. 1. 1. 3. 3. 2. 6. 2. 5. 9.] [ 0. 0. 2. 0. 4. 2. 2. 1. 6. 7.]] -~~~ +``` The slice `0:4` means, "Start at index 0 and go up to, but not including, index 4". Similar to "arrays start at 0", the up-to-but-not-including takes a bit of getting used to, @@ -225,52 +223,52 @@ but the rule is that the difference between the upper and lower bounds is the nu We can start slices at any index we need, -~~~ python +```python print(data[5:10, 0:10]) -~~~ +``` -~~~ python +```text [[ 0. 0. 1. 2. 2. 4. 2. 1. 6. 4.] [ 0. 0. 2. 2. 4. 2. 2. 5. 5. 8.] [ 0. 0. 1. 2. 3. 1. 2. 3. 5. 3.] [ 0. 0. 0. 3. 1. 5. 6. 5. 5. 8.] [ 0. 1. 1. 2. 1. 3. 5. 3. 5. 8.]] -~~~ +``` -We also don't have to include the upper and lower bound on the slice. If we don't include the lower +We also don't have to include the upper and lower bound on the slice. If we don't include the lower bound, Python uses 0 by default; if we don't include the upper, the slice runs to the end of the axis, and if we don't include either (i.e., if we use ':' on its own), the slice includes everything: -~~~ python +```python small = data[:3, 36:] print('small is:') print(small) -~~~ +``` The above example selects rows 0 through 2 and columns 36 through to the end of the array. -~~~ text +```text small is: [[ 2. 3. 0. 0.] [ 1. 1. 0. 1.] [ 2. 2. 1. 1.]] -~~~ +``` ## Selecting every nth item We can also select only every nth element in an array by using double colons in our slices, -~~~ python +```python print(data[::3, ::2]) -~~~ +``` selects every 3rd row and every 2nd column. 
We can also combine this with a start and stop index, with the form `[start:stop:step]`, -~~~ python +```python print(data[0:10:3, 0:10:2]) -~~~ +``` will again select every 3rd row and every 2nd column, but only up to the 10th row and 10th column. @@ -279,13 +277,13 @@ will again select every 3rd row and every 2nd column, but only up to the 10th ro NumPy has several useful functions that take an array as input to perform operations on its values. If we want to find the average inflammation for all patients on all days, for example, we can ask NumPy to compute `data`'s mean value: -~~~ python +```python print(numpy.mean(data)) -~~~ +``` -~~~ python +```python 6.14875 -~~~ +``` `mean` is a function that takes an array as an argument. @@ -298,14 +296,14 @@ However, some functions produce outputs without needing any input. For example, checking the current time doesn't require any input. -~~~ python +```python import time print(time.ctime()) -~~~ +``` -~~~ text +```text Sat Mar 26 13:07:33 2016 -~~~ +``` For functions that don't take in any arguments, we still need parentheses (`()`) to tell Python to go and do something for us. ::: @@ -313,22 +311,22 @@ For functions that don't take in any arguments, we still need parentheses (`()`) Let's use three other NumPy functions to get some informative values about the dataset. We'll also use multiple assignment, a convenient Python feature that will enable us to do this all in one line. -~~~ python +```python maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data) print('maximum inflammation:', maxval) print('minimum inflammation:', minval) print('standard deviation:', stdval) -~~~ +``` Here we've assigned the return value from `numpy.max(data)` to the variable `maxval`, the value from `numpy.min(data)` to `minval`, and so on. 
-~~~ text +```text maximum inflammation: 20.0 minimum inflammation: 0.0 standard deviation: 4.61383319712 -~~~ +``` :::callout @@ -354,14 +352,14 @@ such as the maximum inflammation per patient or the average inflammation per day One way to do this is to create a new temporary array of the data we want, then ask it to do the calculation: -~~~ python +```python patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns) print('maximum inflammation for patient 0:', numpy.max(patient_0)) -~~~ +``` -~~~ text +```text maximum inflammation for patient 0: 18.0 -~~~ +``` Everything in a line of Python code following the '#' symbol is a comment that is ignored by Python. Comments allow programmers to leave explanatory notes for other programmers or their future selves. @@ -369,13 +367,13 @@ Comments allow programmers to leave explanatory notes for other programmers or t We don't actually need to store the row in a variable of its own. Instead, we can combine the selection and the function call: -~~~ python +```python print('maximum inflammation for patient 2:', numpy.max(data[2, :])) -~~~ +``` -~~~ text +```text maximum inflammation for patient 2: 19.0 -~~~ +``` What if we need the maximum inflammation for each patient over all days (as in the next diagram on the left) or the average for each day (as in the diagram on the right)? @@ -386,11 +384,11 @@ As the diagram below shows, we want to perform the operation across an axis: To support this functionality, most array functions allow us to specify the axis we want to work on. If we ask for the average across axis 0 (rows in our 2D example), we get: -~~~ python +```python print(numpy.mean(data, axis=0)) -~~~ +``` -~~~ python +```text [ 0. 
0.45 1.11666667 1.75 2.43333333 3.15 3.8 3.88333333 5.23333333 5.51666667 5.95 5.9 8.35 7.73333333 8.36666667 9.5 9.58333333 @@ -399,33 +397,33 @@ print(numpy.mean(data, axis=0)) 7.33333333 6.58333333 6.06666667 5.95 5.11666667 3.6 3.3 3.56666667 2.48333333 1.5 1.13333333 0.56666667] -~~~ +``` As a quick check, we can ask this array what its shape is: -~~~ python +```python print(numpy.mean(data, axis=0).shape) -~~~ +``` -~~~ text +```text (40,) -~~~ +``` The expression `(40,)` tells us we have an N×1 vector, so this is the average inflammation per day for all patients. If we average across axis 1 (columns in our 2D example), we get: -~~~ python +```python print(numpy.mean(data, axis=1)) -~~~ +``` -~~~ +```text [ 5.45 5.425 6.1 5.9 5.55 6.225 5.975 6.65 6.625 6.525 6.775 5.8 6.225 5.75 5.225 6.3 6.55 5.7 5.85 6.55 5.775 5.825 6.175 6.1 5.8 6.425 6.05 6.025 6.175 6.55 6.175 6.35 6.725 6.125 7.075 5.725 5.925 6.15 6.075 5.75 5.975 5.725 6.3 5.9 6.75 5.925 7.225 6.15 5.95 6.275 5.7 6.1 6.825 5.975 6.725 5.7 6.25 6.4 7.05 5.9 ] -~~~ +``` which is the average inflammation per patient across all days. @@ -433,16 +431,16 @@ which is the average inflammation per patient across all days. A section of an array is called a slice. We can take slices of character strings as well: -~~~ python +```python element = 'oxygen' print('first three characters:', element[0:3]) print('last three characters:', element[3:6]) -~~~ +``` -~~~ text +```text first three characters: oxy last three characters: gen -~~~ +``` What is the value of `element[:4]`? What about `element[4:]`? @@ -450,11 +448,11 @@ Or `element[:]`? :::solution -~~~ text +```text oxyg en oxygen -~~~ +``` ::: @@ -463,10 +461,10 @@ What is `element[-2]`? :::solution -~~~ text +```text n e -~~~ +``` ::: @@ -483,7 +481,7 @@ Test your solution with the following strings: `carpentry`, `clone`, `hi`. 
:::solution -~~~ python +```python element = 'oxygen' print('last three characters:', element[-3:]) element = 'carpentry' @@ -492,14 +490,14 @@ element = 'clone' print('last three characters:', element[-3:]) element = 'hi' print('last three characters:', element[-3:]) -~~~ +``` -~~~ text +```text last three characters: gen last three characters: try last three characters: one last three characters: hi -~~~ +``` ::: :::: @@ -511,10 +509,10 @@ If `data` holds our array of patient data, what does `data[3:3, 4:4]` produce? W :::solution -~~~ python -array([], shape=(0, 0), dtype=float64) -array([], shape=(0, 40), dtype=float64) -~~~ +```python +numpy.array([], shape=(0, 0), dtype=numpy.float64) +numpy.array([], shape=(0, 40), dtype=numpy.float64) +``` ::: :::: @@ -524,23 +522,23 @@ array([], shape=(0, 40), dtype=float64) Arrays can be concatenated and stacked on top of one another. We can use NumPy's `vstack` and `hstack` functions for vertical and horizontal stacking respectively. -~~~ python +```python import numpy -> + A = numpy.array([[1,2,3], [4,5,6], [7, 8, 9]]) print('A = ') print(A) -> + B = numpy.hstack([A, A]) print('B = ') print(B) -> + C = numpy.vstack([A, A]) print('C = ') print(C) -~~~ +``` -~~~ python +```text A = [[1 2 3] [4 5 6] @@ -556,7 +554,7 @@ C = [1 2 3] [4 5 6] [7 8 9]] -~~~ +``` Write some additional code that slices the first and last columns of `A`, and stacks them into a 3x2 array. Make sure to `print` the results to verify your solution. @@ -568,18 +566,18 @@ That means `A[:, 0]` is a one dimensional array, which won't stack as desired. To preserve singleton dimensions, the index itself can be a slice or array. For example, `A[:, :1]` returns a two dimensional array with one singleton dimension (i.e. a column vector). 
-~~~ python +```python D = numpy.hstack((A[:, :1], A[:, -1:])) print('D = ') print(D) -~~~ +``` -~~~ python +```text D = [[1 3] [4 6] [7 9]] -~~~ +``` ::: @@ -587,18 +585,19 @@ D = An alternative way to achieve the same result is to use Numpy's delete function to remove the second column of A. -~~~ python +```python D = numpy.delete(A, 1, 1) print('D = ') print(D) -~~~ +``` -~~~ python +```text D = [[1 3] [4 6] [7 9]] -~~~ +``` + ::: :::: @@ -611,30 +610,30 @@ Let's find out how to calculate changes in the data contained in an array with N The `numpy.diff()` function takes an array and returns the differences between two successive values. Let's use it to examine the changes each day across the first week of patient 3 from our inflammation dataset. -~~~ python +```python patient3_week1 = data[3, :7] print(patient3_week1) -~~~ +``` -~~~ python +```text [0. 0. 2. 0. 4. 2. 2.] -~~~ +``` Calling `numpy.diff(patient3_week1)` would do the following calculations: -~~~ python +```python [ 0 - 0, 2 - 0, 0 - 2, 4 - 0, 2 - 4, 2 - 2 ] -~~~ +``` and return the 6 difference values in a new array. -~~~ python +```python numpy.diff(patient3_week1) -~~~ +``` -~~~ python -array([ 0., 2., -2., 4., -2., 0.]) -~~~ +```python +numpy.array([ 0., 2., -2., 4., -2., 0.]) +``` Note that the array of differences is shorter by one element (length 6). @@ -646,9 +645,9 @@ When applying `numpy.diff` to our 2D inflammation array `data`, which axis would Since the row axis (0) is patients, it does not make sense to get the difference between two arbitrary patients. The column axis (1) is in days, so the difference is the change in inflammation -- a meaningful concept. -~~~ python +```python numpy.diff(data, axis=1) -~~~ +``` ::: @@ -669,36 +668,36 @@ Does it matter if the change in inflammation is an increase or a decrease? By using the `numpy.max()` function after you apply the `numpy.diff()` function, you will get the largest difference between days. 
-~~~ python +```python numpy.max(numpy.diff(data, axis=1), axis=1) -~~~ +``` -~~~ python -array([ 7., 12., 11., 10., 11., 13., 10., 8., 10., 10., 7., +```python +numpy.array([ 7., 12., 11., 10., 11., 13., 10., 8., 10., 10., 7., 7., 13., 7., 10., 10., 8., 10., 9., 10., 13., 7., 12., 9., 12., 11., 10., 10., 7., 10., 11., 10., 8., 11., 12., 10., 9., 10., 13., 10., 7., 7., 10., 13., 12., 8., 8., 10., 10., 9., 8., 13., 10., 7., 10., 8., 12., 10., 7., 12.]) -~~~ +``` -If inflammation values *decrease* along an axis, then the difference from one element to the next will be negative. +If inflammation values _decrease_ along an axis, then the difference from one element to the next will be negative. If you are interested in the **magnitude** of the change and not the direction, the `numpy.absolute()` function will provide that. Notice the difference if you get the largest _absolute_ difference between readings. -~~~ python +```python numpy.max(numpy.absolute(numpy.diff(data, axis=1)), axis=1) -~~~ +``` -~~~ python -array([ 12., 14., 11., 13., 11., 13., 10., 12., 10., 10., 10., +```python +numpy.array([ 12., 14., 11., 13., 11., 13., 10., 12., 10., 10., 10., 12., 13., 10., 11., 10., 12., 13., 9., 10., 13., 9., 12., 9., 12., 11., 10., 13., 9., 13., 11., 11., 8., 11., 12., 13., 9., 10., 13., 11., 11., 13., 11., 13., 13., 10., 9., 10., 10., 9., 9., 13., 10., 9., 10., 11., 13., 10., 10., 12.]) -~~~ +``` ::: :::: @@ -710,11 +709,11 @@ Visualization deserves an entire lecture of its own, but we can explore a few fe While there is no official plotting library, `matplotlib` is the _de facto_ standard. First, we will import the `pyplot` module from `matplotlib` and use two of its functions to create and display a heat map of our data: -~~~ python +```python import matplotlib.pyplot image = matplotlib.pyplot.imshow(data) matplotlib.pyplot.show() -~~~ +``` ![Heat map representing the `data` variable. 
Each cell is colored by value along a color gradient from blue to yellow.](fig/inflammation-01-imshow.png) @@ -722,11 +721,11 @@ Blue pixels in this heat map represent low values, while yellow pixels represent As we can see, inflammation rises and falls over a 40-day period. Let's take a look at the average inflammation over time: -~~~ python +```python ave_inflammation = numpy.mean(data, axis=0) ave_plot = matplotlib.pyplot.plot(ave_inflammation) matplotlib.pyplot.show() -~~~ +``` ![A line graph showing the average inflammation across all patients over a 40-day period.](fig/inflammation-01-average.png) @@ -737,17 +736,17 @@ we might instead expect a sharper rise and slower fall. Let's have a look at two other statistics: -~~~ python +```python max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0)) matplotlib.pyplot.show() -~~~ +``` ![A line graph showing the maximum inflammation across all patients over a 40-day period.](fig/inflammation-01-maximum.png) -~~~ python +```python min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0)) matplotlib.pyplot.show() -~~~ +``` ![A line graph showing the minimum inflammation across all patients over a 40-day period.](fig/inflammation-01-minimum.png) @@ -760,22 +759,22 @@ This insight would have been difficult to reach by examining the numbers themsel You can group similar plots in a single figure using subplots. This script below uses a number of new commands. -* The function `matplotlib.pyplot.figure()` creates a space into which we will place all of our plots. - * This function returns a `Figure` object. - * This is sometimes refered to as a "handle" to the figure, which can be used to change the figure's properties. -* The parameter `figsize` tells Python how big to make this space. -* Each subplot is placed into the figure using its `add_subplot` method. - * Like "figure", this method returns an `Axes` object, allowing us to interact with axes. -* The `add_subplot` method takes 3 parameters. 
-  * The first denotes how many total rows of subplots there are
-  * The second parameter refers to the total number of subplot columns,
-  * The final parameter denotes which subplot your variable is referencing (left-to-right, top-to-bottom).
-* The "handle" for each subplot is stored in a different variable (`axes1`, `axes2`, `axes3`).
-* Once a subplot is created, the axes can be titled using the `set_xlabel()` command (or `set_ylabel()`).
+- The function `matplotlib.pyplot.figure()` creates a space into which we will place all of our plots.
+  - This function returns a `Figure` object.
+  - This is sometimes referred to as a "handle" to the figure, which can be used to change the figure's properties.
+- The parameter `figsize` tells Python how big to make this space.
+- Each subplot is placed into the figure using its `add_subplot` method.
+  - Like "figure", this method returns an `Axes` object, allowing us to interact with axes.
+- The `add_subplot` method takes 3 parameters.
+  - The first denotes how many total rows of subplots there are
+  - The second parameter refers to the total number of subplot columns,
+  - The final parameter denotes which subplot your variable is referencing (left-to-right, top-to-bottom).
+- The "handle" for each subplot is stored in a different variable (`axes1`, `axes2`, `axes3`).
+- Once a subplot is created, the axes can be titled using the `set_xlabel()` command (or `set_ylabel()`).
Here are our three plots side by side:

-~~~ python
+```python
import numpy
import matplotlib.pyplot

@@ -800,7 +799,7 @@ fig.tight_layout()

matplotlib.pyplot.savefig('inflammation.png')
matplotlib.pyplot.show()
-~~~
+```

![Three line graphs showing the daily average, maximum and minimum inflammation over a 40-day period.](fig/inflammation-01-group-plot.png)

@@ -848,33 +847,33 @@ Because matplotlib normally sets x and y axes limits to the min and max of our d

If we want to change this, we can use the `set_ylim(min, max)` method of each 'axes',
for example:

-~~~ python
+```python
axes3.set_ylim(0,6)
-~~~
+```

Update your plotting code to automatically set a more appropriate scale.
(Hint: you can make use of the `max` and `min` methods to help.)

:::solution

-~~~ python
+```python
# One method
axes3.set_ylabel('min')
axes3.plot(numpy.min(data, axis=0))
axes3.set_ylim(0,6)
-~~~
+```

:::

:::solution

-~~~ python
+```python
# A more automated approach
min_data = numpy.min(data, axis=0)
axes3.set_ylabel('min')
axes3.plot(min_data)
axes3.set_ylim(numpy.min(min_data), numpy.max(min_data) * 1.1)
-~~~
+```

:::
::::

@@ -889,7 +888,7 @@ Why is this?

Because matplotlib interpolates (draws a straight line) between the points.
One way to avoid this is to use the Matplotlib `drawstyle` option:

-~~~ python
+```python
import numpy
import matplotlib.pyplot

@@ -913,7 +912,7 @@ axes3.plot(numpy.min(data, axis=0), drawstyle='steps-mid')

fig.tight_layout()

matplotlib.pyplot.show()
-~~~
+```

![Three line graphs, with step lines connecting the points, showing the daily average, maximum and minimum inflammation over a 40-day period.](fig/inflammation-01-line-styles.png)
:::

@@ -925,10 +924,10 @@ Create a plot showing the standard deviation (`numpy.std`) of the inflammation data for each day

:::solution

-~~~ python
+```python
std_plot = matplotlib.pyplot.plot(numpy.std(data, axis=0))
matplotlib.pyplot.show()
-~~~
+```

:::
::::

@@ -940,7 +939,7 @@ instead of side by side.
:::solution -~~~ python +```python import numpy import matplotlib.pyplot @@ -966,7 +965,7 @@ axes3.plot(numpy.min(data, axis=0)) fig.tight_layout() matplotlib.pyplot.show() -~~~ +``` ::: :::: diff --git a/introductory_courses/python/07_pandas_dataframes.md b/introductory_courses/python/07_pandas_dataframes.md index 6e5c67f1..aadadc85 100644 --- a/introductory_courses/python/07_pandas_dataframes.md +++ b/introductory_courses/python/07_pandas_dataframes.md @@ -1,40 +1,38 @@ --- name: Pandas Dataframes -dependsOn: [ - introductory_courses.python.06_analyzing_and_visualizing_data -] +dependsOn: [introductory_courses.python.06_analyzing_and_visualizing_data] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- :::callout For this lesson, you will need to download and unzip [uniqplus_python_data.zip](uniqplus_python_data.zip). ::: -## Use the Pandas library to do statistics on tabular data. +## Use the Pandas library to do statistics on tabular data -* Pandas is a widely-used Python library for statistics, particularly on tabular data. -* Borrows many features from R's dataframes. - * A 2-dimensional table whose columns have names - and potentially have different data types. -* Load it with `import pandas as pd`. The alias pd is commonly used for Pandas. -* Read a Comma Separated Values (CSV) data file with `pd.read_csv`. - * Argument is the name of the file to be read. - * Assign result to a variable to store the data that was read. +- Pandas is a widely-used Python library for statistics, particularly on tabular data. 
+- Borrows many features from R's dataframes. + - A 2-dimensional table whose columns have names + and potentially have different data types. +- Load it with `import pandas as pd`. The alias pd is commonly used for Pandas. +- Read a Comma Separated Values (CSV) data file with `pd.read_csv`. + - Argument is the name of the file to be read. + - Assign result to a variable to store the data that was read. -~~~ python +```python import pandas as pd data = pd.read_csv('data/gapminder_gdp_oceania.csv') print(data) -~~~ +``` -~~~ +```text country gdpPercap_1952 gdpPercap_1957 gdpPercap_1962 \ 0 Australia 10039.59564 10949.64959 12217.22686 1 New Zealand 10556.57566 12247.39532 13175.67800 @@ -50,12 +48,13 @@ print(data) gdpPercap_2007 0 34435.36744 1 25185.00911 -~~~ +``` -* The columns in a dataframe are the observed variables, and the rows are the observations. -* Pandas uses backslash `\` to show wrapped lines when output is too wide to fit the screen. +- The columns in a dataframe are the observed variables, and the rows are the observations. +- Pandas uses backslash `\` to show wrapped lines when output is too wide to fit the screen. :::callout + ## File Not Found Our lessons store their data files in a `data` sub-directory, @@ -65,24 +64,24 @@ or if you include it but your copy of the file is somewhere else, you will get a runtime error that ends with a line like this: -~~~ +```text FileNotFoundError: [Errno 2] No such file or directory: 'data/gapminder_gdp_oceania.csv' -~~~ +``` ::: -## Use `index_col` to specify that a column's values should be used as row headings. +## Use `index_col` to specify that a column's values should be used as row headings -* Row headings are numbers (0 and 1 in this case). -* Really want to index by country. -* Pass the name of the column to `read_csv` as its `index_col` parameter to do this. +- Row headings are numbers (0 and 1 in this case). +- Really want to index by country. 
+- Pass the name of the column to `read_csv` as its `index_col` parameter to do this. -~~~ python +```python data = pd.read_csv('data/gapminder_gdp_oceania.csv', index_col='country') print(data) -~~~ +``` -~~~ +```text gdpPercap_1952 gdpPercap_1957 gdpPercap_1962 gdpPercap_1967 \ country Australia 10039.59564 10949.64959 12217.22686 14526.12465 @@ -97,15 +96,15 @@ New Zealand 16046.03728 16233.71770 17632.41040 19007.19129 country Australia 23424.76683 26997.93657 30687.75473 34435.36744 New Zealand 18363.32494 21050.41377 23189.80135 25185.00911 -~~~ +``` -## Use the `DataFrame.info()` method to find out more about a dataframe. +## Use the `DataFrame.info()` method to find out more about a dataframe -~~~ python +```python data.info() -~~~ +``` -~~~ +```text Index: 2 entries, Australia to New Zealand Data columns (total 12 columns): @@ -123,44 +122,43 @@ gdpPercap_2002 2 non-null float64 gdpPercap_2007 2 non-null float64 dtypes: float64(12) memory usage: 208.0+ bytes -~~~ - +``` -* This is a `DataFrame` -* Two rows named `'Australia'` and `'New Zealand'` -* Twelve columns, each of which has two actual 64-bit floating point values. - * We will talk later about null values, which are used to represent missing observations. -* Uses 208 bytes of memory. +- This is a `DataFrame` +- Two rows named `'Australia'` and `'New Zealand'` +- Twelve columns, each of which has two actual 64-bit floating point values. + - We will talk later about null values, which are used to represent missing observations. +- Uses 208 bytes of memory. -## The `DataFrame.columns` variable stores information about the dataframe's columns. +## The `DataFrame.columns` variable stores information about the dataframe's columns -* Note that this is data, *not* a method. (It doesn't have parentheses.) - * Like `math.pi`. - * So do not use `()` to try to call it. -* Called a *member variable*, or just *member*. +- Note that this is data, _not_ a method. (It doesn't have parentheses.) + - Like `math.pi`. 
+ - So do not use `()` to try to call it. +- Called a _member variable_, or just _member_. -~~~ python +```python print(data.columns) -~~~ +``` -~~~ +```text Index(['gdpPercap_1952', 'gdpPercap_1957', 'gdpPercap_1962', 'gdpPercap_1967', 'gdpPercap_1972', 'gdpPercap_1977', 'gdpPercap_1982', 'gdpPercap_1987', 'gdpPercap_1992', 'gdpPercap_1997', 'gdpPercap_2002', 'gdpPercap_2007'], dtype='object') -~~~ +``` -## Use `DataFrame.T` to transpose a dataframe. +## Use `DataFrame.T` to transpose a dataframe -* Sometimes want to treat columns as rows and vice versa. -* Transpose (written `.T`) doesn't copy the data, just changes the program's view of it. -* Like `columns`, it is a member variable. +- Sometimes want to treat columns as rows and vice versa. +- Transpose (written `.T`) doesn't copy the data, just changes the program's view of it. +- Like `columns`, it is a member variable. -~~~ python +```python print(data.T) -~~~ +``` -~~~ +```text country Australia New Zealand gdpPercap_1952 10039.59564 10556.57566 gdpPercap_1957 10949.64959 12247.39532 @@ -174,18 +172,18 @@ gdpPercap_1992 23424.76683 18363.32494 gdpPercap_1997 26997.93657 21050.41377 gdpPercap_2002 30687.75473 23189.80135 gdpPercap_2007 34435.36744 25185.00911 -~~~ +``` -## Use `DataFrame.describe()` to get summary statistics about data. +## Use `DataFrame.describe()` to get summary statistics about data -`DataFrame.describe()` gets the summary statistics of only the columns that have numerical data. +`DataFrame.describe()` gets the summary statistics of only the columns that have numerical data. All other columns are ignored, unless you use the argument `include='all'`. 
-~~~ python +```python print(data.describe()) -~~~ +``` -~~~ +```text gdpPercap_1952 gdpPercap_1957 gdpPercap_1962 gdpPercap_1967 \ count 2.000000 2.000000 2.000000 2.000000 mean 10298.085650 11598.522455 12696.452430 14495.021790 @@ -215,11 +213,10 @@ min 18363.324940 21050.413770 23189.801350 25185.009110 50% 20894.045885 24024.175170 26938.778040 29810.188275 75% 22159.406358 25511.055870 28813.266385 32122.777857 max 23424.766830 26997.936570 30687.754730 34435.367440 -~~~ - +``` -* Not particularly useful with just two records, - but very helpful when there are thousands. +- Not particularly useful with just two records, + but very helpful when there are thousands. ::::challenge{id="reading_other_data" title="Reading Other Data"} @@ -232,10 +229,11 @@ and display its summary statistics. To read in a CSV, we use `pd.read_csv` and pass the filename `'data/gapminder_gdp_americas.csv'` to it. We also once again pass the column name `'country'` to the parameter `index_col` in order to index by country. The summary statistics can be displayed with the `DataFrame.describe()` method. -~~~ python + +```python americas = pd.read_csv('data/gapminder_gdp_americas.csv', index_col='country') americas.describe() -~~~ +``` ::: :::: @@ -246,21 +244,22 @@ After reading the data for the Americas, use `help(americas.head)` and `help(americas.tail)` to find out what `DataFrame.head` and `DataFrame.tail` do. -1. What method call will display the first three rows of this data? -2. What method call will display the last three columns of this data? - (Hint: you may need to change your view of the data.) +1. What method call will display the first three rows of this data? +2. What method call will display the last three columns of this data? + (Hint: you may need to change your view of the data.) :::solution + 1. We can check out the first five rows of `americas` by executing `americas.head()` (allowing us to view the head of the DataFrame). 
We can specify the number of rows we wish to see by specifying the parameter `n` in our call to `americas.head()`. To view the first three rows, execute: - ~~~ python + ```python americas.head(n=3) - ~~~ + ``` - ~~~ + ```text continent gdpPercap_1952 gdpPercap_1957 gdpPercap_1962 \ country Argentina Americas 5911.315053 6856.856212 7133.166023 @@ -284,24 +283,25 @@ to find out what `DataFrame.head` and `DataFrame.tail` do. Argentina 12779.379640 Bolivia 3822.137084 Brazil 9065.800825 - ~~~ + ``` 2. To check out the last three rows of `americas`, we would use the command, `americas.tail(n=3)`, analogous to `head()` used above. However, here we want to look at the last three columns so we need to change our view and then use `tail()`. To do so, we - create a new DataFrame in which rows and columns are switched: + create a new DataFrame in which rows and columns are switched: - ~~~ python + ```python americas_flipped = americas.T - ~~~ + ``` We can then view the last three columns of `americas` by viewing the last three rows of `americas_flipped`: - ~~~ python + + ```python americas_flipped.tail(n=3) - ~~~ - - ~~~ + ``` + + ```text country Argentina Bolivia Brazil Canada Chile Colombia \ gdpPercap_1997 10967.3 3326.14 7957.98 28954.9 10118.1 6117.36 gdpPercap_2002 8797.64 3413.26 8131.21 33329 10778.8 5755.26 @@ -321,24 +321,24 @@ to find out what `DataFrame.head` and `DataFrame.tail` do. 
gdpPercap_1997 8792.57 35767.4 9230.24 10165.5 gdpPercap_2002 11460.6 39097.1 7727 8605.05 gdpPercap_2007 18008.5 42951.7 10611.5 11415.8 - ~~~ - - + ``` + This shows the data that we want, but we may prefer to display three columns instead of three rows, so we can flip it back: - ~~~ python - americas_flipped.tail(n=3).T - ~~~ - - __Note:__ we could have done the above in a single line of code by 'chaining' the commands: - ~~~ python + + ```python + americas_flipped.tail(n=3).T + ``` + + **Note:** we could have done the above in a single line of code by 'chaining' the commands: + + ```python americas.T.tail(n=3).T - ~~~ - + ``` + ::: :::: - ::::challenge{id="reading_files_in_other_directories" title="Reading Files in Other Directories"} Imagine the data for your current project is stored in a file called `microbes.csv`, @@ -346,13 +346,13 @@ which is located in a folder called `field_data`. You are doing analysis in a notebook called `analysis.ipynb` in a sibling folder called `thesis`: -~~~ +```text your_home_directory +-- field_data/ | +-- microbes.csv +-- thesis/ +-- analysis.ipynb -~~~ +``` What value(s) should you pass to `read_csv` to read `microbes.csv` in `analysis.ipynb`? @@ -360,9 +360,10 @@ What value(s) should you pass to `read_csv` to read `microbes.csv` in `analysis. We need to specify the path to the file of interest in the call to `pd.read_csv`. We first need to 'jump' out of the folder `thesis` using '../' and then into the folder `field_data` using 'field_data/'. Then we can specify the filename `microbes.csv. The result is as follows: -~~~ python + +```python data_microbes = pd.read_csv('../field_data/microbes.csv') -~~~ +``` ::: :::: @@ -376,23 +377,25 @@ write one of your dataframes to a file called `processed.csv`. You can use `help` to get information on how to use `to_csv`. 
:::solution
In order to write the DataFrame `americas` to a file called `processed.csv`, execute the following command:
-~~~ python
+
+```python
americas.to_csv('processed.csv')
-~~~
+```

For help on `to_csv`, you could execute, for example:
-~~~ python
+
+```python
help(americas.to_csv)
-~~~
+```

-Note that `help(to_csv)` throws an error! This is a subtlety and is due to the fact that `to_csv` is NOT a function in
-and of itself and the actual call is `americas.to_csv`.
+Note that `help(to_csv)` throws an error! This is a subtlety and is due to the fact that `to_csv` is NOT a function in
+and of itself and the actual call is `americas.to_csv`.
:::
::::

## Note about Pandas DataFrames/Series

-A [DataFrame][[pandas-dataframe]](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) is a collection of [Series](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html);
+A pandas [DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) is a collection of [Series](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html);
The DataFrame is the way Pandas represents a table, and Series is the data-structure
Pandas uses to represent a column.

@@ -407,46 +410,45 @@ between DataFrames.

To access a value at the position `[i,j]` of a DataFrame, we have two options, depending on
what is the meaning of `i` in use.
-Remember that a DataFrame provides an *index* as a way to identify the rows of the table;
-a row, then, has a *position* inside the table as well as a *label*, which
-uniquely identifies its *entry* in the DataFrame.
+Remember that a DataFrame provides an _index_ as a way to identify the rows of the table;
+a row, then, has a _position_ inside the table as well as a _label_, which
+uniquely identifies its _entry_ in the DataFrame.
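The DataFrame-of-Series relationship described above can be seen directly: selecting a single column of a DataFrame returns a Series. A minimal sketch, using a small hand-made frame in place of the gapminder files so it is self-contained (the values are illustrative only):

```python
import pandas as pd

# Small stand-in frame; the gapminder data loaded earlier behaves the same way.
data = pd.DataFrame(
    {"gdpPercap_1952": [10039.6, 10556.6], "gdpPercap_1957": [10949.6, 12247.4]},
    index=pd.Index(["Australia", "New Zealand"], name="country"),
)

column = data["gdpPercap_1952"]  # selecting one column yields a Series
print(type(data).__name__)       # DataFrame
print(type(column).__name__)     # Series
print(column.name)               # gdpPercap_1952
```

Each column keeps its header as the Series `name`, which is why column-wise operations on a DataFrame preserve the labels.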
## Use `DataFrame.iloc[..., ...]` to select values by their (entry) position -* Can specify location by numerical index analogously to 2D version of character selection in strings. +- Can specify location by numerical index analogously to 2D version of character selection in strings. -~~~ python +```python import pandas as pd data = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country') print(data.iloc[0, 0]) -~~~ +``` -~~~ +```text 1601.056136 -~~~ - +``` -## Use `DataFrame.loc[..., ...]` to select values by their (entry) label. +## Use `DataFrame.loc[..., ...]` to select values by their (entry) label -* Can specify location by row name analogously to 2D version of dictionary keys. +- Can specify location by row name analogously to 2D version of dictionary keys. -~~~ python +```python print(data.loc["Albania", "gdpPercap_1952"]) -~~~ +``` -~~~ +```text 1601.056136 -~~~ +``` -## Use `:` on its own to mean all columns or all rows. +## Use `:` on its own to mean all columns or all rows -* Just like Python's usual slicing notation. +- Just like Python's usual slicing notation. -~~~ python +```python print(data.loc["Albania", :]) -~~~ +``` -~~~ +```text gdpPercap_1952 1601.056136 gdpPercap_1957 1942.284244 gdpPercap_1962 2312.888958 @@ -460,15 +462,15 @@ gdpPercap_1997 3193.054604 gdpPercap_2002 4604.211737 gdpPercap_2007 5937.029526 Name: Albania, dtype: float64 -~~~ +``` -* Would get the same result printing `data.loc["Albania"]` (without a second index). +- Would get the same result printing `data.loc["Albania"]` (without a second index). 
-~~~ python +```python print(data.loc[:, "gdpPercap_1952"]) -~~~ +``` -~~~ +```text country Albania 1601.056136 Austria 6137.076492 @@ -478,18 +480,18 @@ Switzerland 14734.232750 Turkey 1969.100980 United Kingdom 9979.508487 Name: gdpPercap_1952, dtype: float64 -~~~ +``` -* Would get the same result printing `data["gdpPercap_1952"]` -* Also get the same result printing `data.gdpPercap_1952` (not recommended, because easily confused with `.` notation for methods) +- Would get the same result printing `data["gdpPercap_1952"]` +- Also get the same result printing `data.gdpPercap_1952` (not recommended, because easily confused with `.` notation for methods) -## Select multiple columns or rows using `DataFrame.loc` and a named slice. +## Select multiple columns or rows using `DataFrame.loc` and a named slice -~~~ python +```python print(data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972']) -~~~ +``` -~~~ +```text gdpPercap_1962 gdpPercap_1967 gdpPercap_1972 country Italy 8243.582340 10022.401310 12269.273780 @@ -497,57 +499,56 @@ Montenegro 4649.593785 5907.850937 7778.414017 Netherlands 12790.849560 15363.251360 18794.745670 Norway 13450.401510 16361.876470 18965.055510 Poland 5338.752143 6557.152776 8006.506993 -~~~ +``` In the above code, we discover that **slicing using `loc` is inclusive at both ends**, which differs from **slicing using `iloc`**, where slicing indicates -everything up to but not including the final index. - +everything up to but not including the final index. -## Result of slicing can be used in further operations. +## Result of slicing can be used in further operations -* Usually don't just print a slice. -* All the statistical operators that work on entire dataframes - work the same way on slices. -* E.g., calculate max of a slice. +- Usually don't just print a slice. +- All the statistical operators that work on entire dataframes + work the same way on slices. +- E.g., calculate max of a slice. 
-~~~ python +```python print(data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972'].max()) -~~~ +``` -~~~ +```text gdpPercap_1962 13450.40151 gdpPercap_1967 16361.87647 gdpPercap_1972 18965.05551 dtype: float64 -~~~ +``` -~~~ python +```python print(data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972'].min()) -~~~ +``` -~~~ +```text gdpPercap_1962 4649.593785 gdpPercap_1967 5907.850937 gdpPercap_1972 7778.414017 dtype: float64 -~~~ +``` -## Use comparisons to select data based on value. +## Use comparisons to select data based on value -* Comparison is applied element by element. -* Returns a similarly-shaped dataframe of `True` and `False`. +- Comparison is applied element by element. +- Returns a similarly-shaped dataframe of `True` and `False`. -~~~ python +```python # Use a subset of data to keep output readable. subset = data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972'] print('Subset of data:\n', subset) # Which values were greater than 10000 ? print('\nWhere are values large?\n', subset > 10000) -~~~ +``` -~~~ +```text Subset of data: gdpPercap_1962 gdpPercap_1967 gdpPercap_1972 country @@ -565,18 +566,18 @@ Montenegro False False False Netherlands True True True Norway True True True Poland False False False -~~~ +``` -## Select values or NaN using a Boolean mask. +## Select values or NaN using a Boolean mask -* A frame full of Booleans is sometimes called a *mask* because of how it can be used. +- A frame full of Booleans is sometimes called a _mask_ because of how it can be used. -~~~ python +```python mask = subset > 10000 print(subset[mask]) -~~~ +``` -~~~ +```text gdpPercap_1962 gdpPercap_1967 gdpPercap_1972 country Italy NaN 10022.40131 12269.27378 @@ -584,16 +585,16 @@ Montenegro NaN NaN NaN Netherlands 12790.84956 15363.25136 18794.74567 Norway 13450.40151 16361.87647 18965.05551 Poland NaN NaN NaN -~~~ +``` -* Get the value where the mask is true, and NaN (Not a Number) where it is false. 
-* Useful because NaNs are ignored by operations like max, min, average, etc. +- Get the value where the mask is true, and NaN (Not a Number) where it is false. +- Useful because NaNs are ignored by operations like max, min, average, etc. -~~~ python +```python print(subset[subset > 10000].describe()) -~~~ +``` -~~~ +```text gdpPercap_1962 gdpPercap_1967 gdpPercap_1972 count 2.000000 3.000000 3.000000 mean 13120.625535 13915.843047 16676.358320 @@ -603,28 +604,28 @@ min 12790.849560 10022.401310 12269.273780 50% 13120.625535 15363.251360 18794.745670 75% 13285.513523 15862.563915 18879.900590 max 13450.401510 16361.876470 18965.055510 -~~~ +``` ## Group By: split-apply-combine -Pandas vectorizing methods and grouping operations are features that provide users +Pandas vectorizing methods and grouping operations are features that provide users much flexibility to analyse their data. -For instance, let's say we want to have a clearer view on how the European countries +For instance, let's say we want to have a clearer view on how the European countries split themselves according to their GDP. -1. We may have a glance by splitting the countries in two groups during the years surveyed, - those who presented a GDP *higher* than the European average and those with a *lower* GDP. -2. We then estimate a *wealthy score* based on the historical (from 1962 to 2007) values, - where we account how many times a country has participated in the groups of *lower* or *higher* GDP +1. We may have a glance by splitting the countries in two groups during the years surveyed, + those who presented a GDP _higher_ than the European average and those with a _lower_ GDP. +2. 
We then estimate a _wealthy score_ based on the historical (from 1962 to 2007) values, + where we account how many times a country has participated in the groups of _lower_ or _higher_ GDP -~~~ python -mask_higher = data data.mean() +```python +mask_higher = data > data.mean() wealth_score = mask_higher.aggregate('sum', axis=1) / len(data.columns) wealth_score -~~~ +``` -~~~ +```text country Albania 0.000000 Austria 1.000000 @@ -657,95 +658,98 @@ Switzerland 1.000000 Turkey 0.000000 United Kingdom 1.000000 dtype: float64 -~~~ +``` Finally, for each group in the `wealth_score` table, we sum their (financial) contribution across the years surveyed using chained methods: -~~~ python +```python data.groupby(wealth_score).sum() -~~~ +``` -~~~ +```text gdpPercap_1952 gdpPercap_1957 gdpPercap_1962 gdpPercap_1967 \ -0.000000 36916.854200 46110.918793 56850.065437 71324.848786 -0.333333 16790.046878 20942.456800 25744.935321 33567.667670 -0.500000 11807.544405 14505.000150 18380.449470 21421.846200 -1.000000 104317.277560 127332.008735 149989.154201 178000.350040 +0.000000 36916.854200 46110.918793 56850.065437 71324.848786 +0.333333 16790.046878 20942.456800 25744.935321 33567.667670 +0.500000 11807.544405 14505.000150 18380.449470 21421.846200 +1.000000 104317.277560 127332.008735 149989.154201 178000.350040 gdpPercap_1972 gdpPercap_1977 gdpPercap_1982 gdpPercap_1987 \ -0.000000 88569.346898 104459.358438 113553.768507 119649.599409 -0.333333 45277.839976 53860.456750 59679.634020 64436.912960 -0.500000 25377.727380 29056.145370 31914.712050 35517.678220 -1.000000 215162.343140 241143.412730 263388.781960 296825.131210 - - gdpPercap_1992 gdpPercap_1997 gdpPercap_2002 gdpPercap_2007 -0.000000 92380.047256 103772.937598 118590.929863 149577.357928 -0.333333 67918.093220 80876.051580 102086.795210 122803.729520 -0.500000 36310.666080 40723.538700 45564.308390 51403.028210 +0.000000 88569.346898 104459.358438 113553.768507 119649.599409 +0.333333 45277.839976 53860.456750 
59679.634020 64436.912960 +0.500000 25377.727380 29056.145370 31914.712050 35517.678220 +1.000000 215162.343140 241143.412730 263388.781960 296825.131210 + + gdpPercap_1992 gdpPercap_1997 gdpPercap_2002 gdpPercap_2007 +0.000000 92380.047256 103772.937598 118590.929863 149577.357928 +0.333333 67918.093220 80876.051580 102086.795210 122803.729520 +0.500000 36310.666080 40723.538700 45564.308390 51403.028210 1.000000 315238.235970 346930.926170 385109.939210 427850.333420 -~~~ +``` ::::challenge{id="selection_of_individual_values" title="Selection of Individual Values"} Assume Pandas has been imported into your notebook and the Gapminder GDP data for Europe has been loaded: -~~~ python +```python import pandas as pd df = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country') -~~~ - +``` Write an expression to find the Per Capita GDP of Serbia in 2007. :::solution The selection can be done by using the labels for both the row ("Serbia") and the column ("gdpPercap_2007"): -~~~ python + +```python print(df.loc['Serbia', 'gdpPercap_2007']) -~~~ +``` The output is -~~~ + +```text 9786.534714 -~~~ ->{: .output} +``` + ::: :::: ::::challenge{id="extent_of_slicing" title="Extent of Slicing"} -1. Do the two statements below produce the same output? -2. Based on this, - what rule governs what is included (or not) in numerical slices and named slices in Pandas? +1. Do the two statements below produce the same output? +2. Based on this, + what rule governs what is included (or not) in numerical slices and named slices in Pandas? -~~~ python +```python print(df.iloc[0:2, 0:2]) print(df.loc['Albania':'Belgium', 'gdpPercap_1952':'gdpPercap_1962']) -~~~ +``` :::solution No, they do not produce the same output! 
The output of the first statement is:
+
+```text
       gdpPercap_1952  gdpPercap_1957
-country
+country
Albania       1601.056136     1942.284244
Austria       6137.076492     8842.598030
-~~~
->{: .output}
+```
+
The second statement gives:
-~~~
+
+```text
       gdpPercap_1952  gdpPercap_1957  gdpPercap_1962
-country
+country
Albania       1601.056136     1942.284244     2312.888958
Austria       6137.076492     8842.598030    10750.721110
Belgium       8343.105127     9714.960623    10991.206760
-~~~
->{: .output}
+```
+
Clearly, the second statement produces an additional column and an additional row compared to the first statement.
-What conclusion can we draw? We see that a numerical slice, 0:2, *omits* the final index (i.e. index 2)
+What conclusion can we draw? We see that a numerical slice, 0:2, _omits_ the final index (i.e. index 2)
in the range provided,
-while a named slice, 'gdpPercap_1952':'gdpPercap_1962', *includes* the final element.
+while a named slice, 'gdpPercap_1952':'gdpPercap_1962', _includes_ the final element.
:::
::::
@@ -754,51 +758,56 @@ while a named slice, 'gdpPercap_1952':'gdpPercap_1962', *includes* the final ele

Explain what each line in the following short program does:
what is in `first`, `second`, etc.?

-~~~ python
+```python
first = pd.read_csv('data/gapminder_all.csv', index_col='country')
second = first[first['continent'] == 'Americas']
third = second.drop('Puerto Rico')
-fourth = third.drop('continent', axis = 1)
+fourth = third.drop('continent', axis=1)
fourth.to_csv('result.csv')
-~~~
+```

:::solution
Let's go through this piece of code line by line.
-~~~ python
+
+```python
first = pd.read_csv('data/gapminder_all.csv', index_col='country')
-~~~
+```

-This line loads the dataset containing the GDP data from all countries into a dataframe called
-`first`. The `index_col='country'` parameter selects which column to use as the
-row labels in the dataframe.
+This line loads the dataset containing the GDP data from all countries into a dataframe called
+`first`.
The `index_col='country'` parameter selects which column to use as the +row labels in the dataframe. + +```python second = first[first['continent'] == 'Americas'] -~~~ +``` -This line makes a selection: only those rows of `first` for which the 'continent' column matches -'Americas' are extracted. Notice how the Boolean expression inside the brackets, -`first['continent'] == 'Americas'`, is used to select only those rows where the expression is true. -Try printing this expression! Can you print also its individual True/False elements? +This line makes a selection: only those rows of `first` for which the 'continent' column matches +'Americas' are extracted. Notice how the Boolean expression inside the brackets, +`first['continent'] == 'Americas'`, is used to select only those rows where the expression is true. +Try printing this expression! Can you print also its individual True/False elements? (hint: first assign the expression to a variable) -~~~ python + +```python third = second.drop('Puerto Rico') -~~~ +``` -As the syntax suggests, this line drops the row from `second` where the label is 'Puerto Rico'. The +As the syntax suggests, this line drops the row from `second` where the label is 'Puerto Rico'. The resulting dataframe `third` has one row less than the original dataframe `second`. -~~~ python -fourth = third.drop('continent', axis = 1) -~~~ -Again we apply the drop function, but in this case we are dropping not a row but a whole column. -To accomplish this, we need to specify also the `axis` parameter (we want to drop the second column +```python +fourth = third.drop('continent', axis=1) +``` + +Again we apply the drop function, but in this case we are dropping not a row but a whole column. +To accomplish this, we need to specify also the `axis` parameter (we want to drop the second column which has index 1). -~~~ python + +```python fourth.to_csv('result.csv') -~~~ +``` -The final step is to write the data that we have been working on to a csv file. 
Pandas makes this easy -with the `to_csv()` function. The only required argument to the function is the filename. Note that the +The final step is to write the data that we have been working on to a csv file. Pandas makes this easy +with the `to_csv()` function. The only required argument to the function is the filename. Note that the file will be written in the directory from which you started the Jupyter or Python session. ::: :::: @@ -808,12 +817,11 @@ file will be written in the directory from which you started the Jupyter or Pyth Explain in simple terms what `idxmin` and `idxmax` do in the short program below. When would you use these methods? -~~~ python +```python data = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country') print(data.idxmin()) print(data.idxmax()) -~~~ - +``` :::solution For each column in `data`, `idxmin` will return the index value corresponding to each column's minimum; @@ -828,54 +836,54 @@ You can use these functions whenever you want to get the row index of the minimu Assume Pandas has been imported and the Gapminder GDP data for Europe has been loaded. Write an expression to select each of the following: -1. GDP per capita for all countries in 1982. -2. GDP per capita for Denmark for all years. -3. GDP per capita for all countries for years *after* 1985. -4. GDP per capita for each country in 2007 as a multiple of - GDP per capita for that country in 1952. +1. GDP per capita for all countries in 1982. +2. GDP per capita for Denmark for all years. +3. GDP per capita for all countries for years _after_ 1985. +4. GDP per capita for each country in 2007 as a multiple of + GDP per capita for that country in 1952. 
:::solution 1: -~~~ python -data['gdpPercap_1982'] -~~~ +```python +data['gdpPercap_1982'] +``` 2: -~~~ python -data.loc['Denmark',:] -~~~ +```python +data.loc['Denmark',:] +``` 3: -~~~ python + +```python data.loc[:,'gdpPercap_1985':] -~~~ +``` Pandas is smart enough to recognize the number at the end of the column label and does not give you an error, although no column named `gdpPercap_1985` actually exists. This is useful if new columns are added to the CSV file later. 4: -~~~ python + +```python data['gdpPercap_2007']/data['gdpPercap_1952'] -~~~ +``` ::: :::: - ::::challenge{id="exploring_avilable_methods" title="Exploring available methods using the `dir()` function"} -Python includes a `dir()` function that can be used to display all of the available methods (functions) that are built into a data object. In Episode 4, we used some methods with a string. But we can see many more are available by using `dir()`: - -~~~ python -my_string = 'Hello world!' # creation of a string object -dir(myString) -~~~ +Python includes a `dir()` function that can be used to display all of the available methods (functions) that are built into a data object. In Episode 4, we used some methods with a string. But we can see many more are available by using `dir()`: +```python +my_string = 'Hello world!' # creation of a string object +dir(my_string) +``` This command returns: -~~~ python +```text ['__add__', ... '__subclasshook__', @@ -885,24 +893,23 @@ This command returns: ... 
'upper', 'zfill'] -~~~ - +``` You can use `help()` or Shift+Tab - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- In Python, a list is a way to store multiple values together. @@ -20,29 +18,29 @@ In this episode, we will learn how to store multiple values in a list as well as Unlike NumPy arrays, lists are built into the language so we do not have to load a library to use them. We create a list by putting values inside square brackets and separating the values with commas: -~~~ python +```python odds = [1, 3, 5, 7] print('odds are:', odds) -~~~ +``` -~~~ +```text odds are: [1, 3, 5, 7] -~~~ +``` We can access elements of a list using indices -- numbered positions of elements in the list. These positions are numbered starting at 0, so the first element has an index of 0. -~~~ python +```python print('first element:', odds[0]) print('last element:', odds[3]) print('"-1" element:', odds[-1]) -~~~ +``` -~~~ +```text first element: 1 last element: 7 "-1" element: 7 -~~~ +``` Yes, we can use negative numbers as indices in Python. When we do so, the index `-1` gives us the last element in the list, `-2` the second to last, and so on. @@ -53,29 +51,26 @@ we can change the values in a list, but we cannot change individual characters in a string. 
For example:

-~~~ python
+```python
names = ['Curie', 'Darwing', 'Turing']  # typo in Darwin's name
print('names is originally:', names)
names[1] = 'Darwin'  # correct the name
print('final value of names:', names)
-~~~
+```

-
-~~~
+```text
names is originally: ['Curie', 'Darwing', 'Turing']
final value of names: ['Curie', 'Darwin', 'Turing']
-~~~
-
+```

works, but:

-~~~ python
+```python
name = 'Darwin'
name[0] = 'd'
-~~~
-
+```

-~~~
+```text
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
()
----> 2 name[0] = 'd'

TypeError: 'str' object does not support item assignment
-~~~
+```

does not.

:::callout
+
## Ch-Ch-Ch-Ch-Changes

Data which can be modified in place is called [mutable]({{ page.root }}/reference.html#mutable),
@@ -105,33 +101,30 @@ in-place or a function that returns a modified copy and leaves the original unch

Be careful when modifying data in-place. If two variables refer to the same list, and you
modify the list value, it will change for both variables!

-~~~ python
+```python
salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']
my_salsa = salsa        # <-- my_salsa and salsa point to the *same* list data in memory
salsa[0] = 'hot peppers'
print('Ingredients in my salsa:', my_salsa)
-~~~
+```

-~~~
+```text
Ingredients in my salsa: ['hot peppers', 'onions', 'cilantro', 'tomatoes']
-~~~
-
+```

If you want variables with mutable values to be independent, you
must make a copy of the value when you assign it.

-~~~ python
+```python
salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']
my_salsa = list(salsa)        # <-- makes a *copy* of the list
salsa[0] = 'hot peppers'
print('Ingredients in my salsa:', my_salsa)
-~~~
-
+```

-~~~
+```text
Ingredients in my salsa: ['peppers', 'onions', 'cilantro', 'tomatoes']
-~~~
-
+```

Because of pitfalls like this, code which modifies data in place can be more difficult to understand.
However, it is often far more efficient to modify a large data structure in place @@ -140,105 +133,97 @@ when writing your code. ::: :::callout + ## Nested Lists + Since a list can contain any Python variables, it can even contain other lists. For example, we could represent the products in the shelves of a small grocery shop: -~~~ python +```python x = [['pepper', 'zucchini', 'onion'], ['cabbage', 'lettuce', 'garlic'], ['apple', 'pear', 'banana']] -~~~ - +``` Here is a visual example of how indexing a list of lists `x` works: -[![x is represented as a pepper shaker containing several packets of pepper. [x[0]] is represented as a pepper shaker containing a single packet of pepper. x[0] is represented as a single packet of pepper. x[0][0] is represented as single grain of pepper. Adapted from @hadleywickham.](fig/indexing_lists_python.png)](https://twitter.com/hadleywickham/status/643381054758363136) +![x is represented as a pepper shaker containing several packets of pepper. [x[0]] is represented as a pepper shaker containing a single packet of pepper. x[0] is represented as a single packet of pepper. x\[0]\[0] is represented as a single grain of pepper. Adapted from @hadleywickham.](fig/indexing_lists_python.png) Using the previously declared list `x`, these would be the results of the index operations shown in the image: -~~~ python +```python print([x[0]]) -~~~ - +``` -~~~ +```text [['pepper', 'zucchini', 'onion']] -~~~ +``` - -~~~ python +```python print(x[0]) -~~~ - +``` -~~~ +```text ['pepper', 'zucchini', 'onion'] -~~~ - +``` -~~~ python +```python print(x[0][0]) -~~~ +``` - -~~~ +```text 'pepper' -~~~ - +``` Thanks to [Hadley Wickham](https://twitter.com/hadleywickham/status/643381054758363136) for the image above. ::: :::callout + ## Heterogeneous Lists + Lists in Python can contain elements of different types.
Example: -~~~ python + +```python sample_ages = [10, 12.5, 'Unknown'] -~~~ +``` ::: There are many ways to change the contents of lists besides assigning new values to individual elements: -~~~ python +```python odds.append(11) print('odds after adding a value:', odds) -~~~ +``` - -~~~ +```text odds after adding a value: [1, 3, 5, 7, 11] -~~~ - +``` -~~~ python +```python removed_element = odds.pop(0) print('odds after removing the first element:', odds) print('removed_element:', removed_element) -~~~ +``` - -~~~ +```text odds after removing the first element: [3, 5, 7, 11] removed_element: 1 -~~~ - +``` -~~~ python +```python odds.reverse() print('odds after reversing:', odds) -~~~ - +``` -~~~ +```text odds after reversing: [11, 7, 5, 3] -~~~ - +``` When modifying lists in place, it is useful to remember that Python treats lists in a slightly counter-intuitive way. @@ -247,45 +232,41 @@ As we saw earlier with the `salsa` list, if we modify a list in-place without making a copy of it first, we can cause all sorts of trouble. This also applies to modifying the list using the above functions: -~~~ python +```python odds = [1, 3, 5, 7] primes = odds primes.append(2) print('primes:', primes) print('odds:', odds) -~~~ - +``` -~~~ +```text primes: [1, 3, 5, 7, 2] odds: [1, 3, 5, 7, 2] -~~~ - +``` This is because Python stores a list in memory, and then can use multiple names to refer to the same list. If all we want to do is copy a (simple) list, we can again use the `list` function, so we do not modify a list we did not mean to: -~~~ python +```python odds = [1, 3, 5, 7] primes = list(odds) primes.append(2) print('primes:', primes) print('odds:', odds) -~~~ - +``` -~~~ +```text primes: [1, 3, 5, 7, 2] odds: [1, 3, 5, 7] -~~~ - +``` Subsets of lists and strings can be accessed by specifying ranges of values in brackets, similar to how we accessed ranges of positions in a NumPy array. This is commonly referred to as "slicing" the list/string.
-~~~ python +```python binomial_name = 'Drosophila melanogaster' group = binomial_name[0:10] print('group:', group) @@ -299,29 +280,26 @@ print('autosomes:', autosomes) last = chromosomes[-1] print('last:', last) -~~~ +``` - -~~~ +```text group: Drosophila species: melanogaster autosomes: ['2', '3', '4'] last: 4 -~~~ - +``` You can also select non-consecutive elements from a list by slicing with a step size, for example -~~~ python + +```python numbers = [1, 2, 3, 4, 5, 6, 7, 8] odd_numbers = numbers[0:6:2] print('odd numbers:', odd_numbers) -~~~ - +``` -~~~ +```text odd numbers: [1, 3, 5] -~~~ - +``` In the above example, `numbers[0:6:2]` tells Python to slice the list `numbers` from element `0` (`1`) to element `6` (`7`), excluded, with a @@ -331,20 +309,19 @@ step of `2` elements. Use slicing to access only the last four characters of a string or entries of a list. -~~~ python +```python string_for_slicing = 'Observation date: 02-Feb-2013' list_for_slicing = [['fluorine', 'F'], ['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']] -~~~ +``` - -~~~ +```text '2013' [['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']] -~~~ +``` Would your solution work regardless of whether you knew beforehand the length of the string or list @@ -356,10 +333,10 @@ Hint: Remember that indices can be negative as well as positive :::solution Use negative indices to count elements from the end of a container (such as list or string): -~~~ python +```python string_for_slicing[-4:] list_for_slicing[-4:] -~~~ +``` ::: :::: @@ -367,25 +344,23 @@ list_for_slicing[-4:] If you want to take a slice from the beginning of a sequence, you can omit the first index in the range: -~~~ python +```python date = 'Monday 4 January 2016' day = date[0:6] print('Using 0 to begin range:', day) day = date[:6] print('Omitting beginning index:', day) -~~~ - +``` -~~~ +```text Using 0 to begin range: Monday Omitting beginning index: Monday -~~~ - +``` And similarly, you
can omit the ending index in the range to take a slice to the very end of the sequence: -~~~ python +```python months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'] sond = months[8:12] print('With known last position:', sond) @@ -393,14 +368,13 @@ sond = months[8:len(months)] print('Using len() to get last entry:', sond) sond = months[8:] print('Omitting ending index:', sond) -~~~ - +``` -~~~ +```text With known last position: ['sep', 'oct', 'nov', 'dec'] Using len() to get last entry: ['sep', 'oct', 'nov', 'dec'] Omitting ending index: ['sep', 'oct', 'nov', 'dec'] -~~~ +``` ::::challenge{id="overloading" title="Overloading"} @@ -408,19 +382,18 @@ Omitting ending index: ['sep', 'oct', 'nov', 'dec'] Given that, what do you think the multiplication operator `*` does on lists? In particular, what will be the output of the following code? -~~~ python +```python counts = [2, 4, 6, 8, 10] repeats = counts * 2 print(repeats) -~~~ - +``` -1. `[2, 4, 6, 8, 10, 2, 4, 6, 8, 10]` -2. `[4, 8, 12, 16, 20]` -3. `[[2, 4, 6, 8, 10],[2, 4, 6, 8, 10]]` -4. `[2, 4, 6, 8, 10, 4, 8, 12, 16, 20]` +1. `[2, 4, 6, 8, 10, 2, 4, 6, 8, 10]` +2. `[4, 8, 12, 16, 20]` +3. `[[2, 4, 6, 8, 10],[2, 4, 6, 8, 10]]` +4. `[2, 4, 6, 8, 10, 4, 8, 12, 16, 20]` -The technical term for this is *operator overloading*: +The technical term for this is _operator overloading_: a single operator, like `+` or `*`, can do different things depending on what it's applied to. @@ -429,17 +402,15 @@ can do different things depending on what it's applied to. 
The multiplication operator `*` used on a list replicates elements of the list and concatenates them together: -~~~ +```text [2, 4, 6, 8, 10, 2, 4, 6, 8, 10] -~~~ - +``` It's equivalent to: -~~~ python +```python counts + counts -~~~ +``` ::: :::: - diff --git a/introductory_courses/python/09_for_loops.md b/introductory_courses/python/09_for_loops.md index 91c82fce..aa3101fd 100644 --- a/introductory_courses/python/09_for_loops.md +++ b/introductory_courses/python/09_for_loops.md @@ -1,174 +1,163 @@ --- name: For Loops -dependsOn: [ - introductory_courses.python.08_lists -] +dependsOn: [introductory_courses.python.08_lists] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- -## A *for loop* executes commands once for each value in a collection. +## A _for loop_ executes commands once for each value in a collection -* Doing calculations on the values in a list one by one - is as painful as working with `pressure_001`, `pressure_002`, etc. -* A *for loop* tells Python to execute some statements once for each value in a list, - a character string, - or some other collection. -* "for each thing in this group, do these operations" +- Doing calculations on the values in a list one by one + is as painful as working with `pressure_001`, `pressure_002`, etc. +- A _for loop_ tells Python to execute some statements once for each value in a list, + a character string, + or some other collection. 
+- "for each thing in this group, do these operations" -~~~ python +```python for number in [2, 3, 5]: print(number) -~~~ +``` -* This `for` loop is equivalent to: +- This `for` loop is equivalent to: -~~~ python +```python print(2) print(3) print(5) -~~~ +``` +- And the `for` loop's output is: -* And the `for` loop's output is: - -~~~ +```text 2 3 5 -~~~ - +``` -## A `for` loop is made up of a collection, a loop variable, and a body. +## A `for` loop is made up of a collection, a loop variable, and a body -~~~ python +```python for number in [2, 3, 5]: print(number) -~~~ - +``` -* The collection, `[2, 3, 5]`, is what the loop is being run on. -* The body, `print(number)`, specifies what to do for each value in the collection. -* The loop variable, `number`, is what changes for each *iteration* of the loop. - * The "current thing". +- The collection, `[2, 3, 5]`, is what the loop is being run on. +- The body, `print(number)`, specifies what to do for each value in the collection. +- The loop variable, `number`, is what changes for each _iteration_ of the loop. + - The "current thing". -## The first line of the `for` loop must end with a colon, and the body must be indented. +## The first line of the `for` loop must end with a colon, and the body must be indented -* The colon at the end of the first line signals the start of a *block* of statements. -* Python uses indentation rather than `{}` or `begin`/`end` to show *nesting*. - * Any consistent indentation is legal, but almost everyone uses four spaces. +- The colon at the end of the first line signals the start of a _block_ of statements. +- Python uses indentation rather than `{}` or `begin`/`end` to show _nesting_. + - Any consistent indentation is legal, but almost everyone uses four spaces. -~~~ python +```python nolint for number in [2, 3, 5]: print(number) -~~~ +``` -~~~ +```text IndentationError: expected an indented block -~~~ +``` +- Indentation is always meaningful in Python. 
-* Indentation is always meaningful in Python. - -~~~ python +```python nolint firstName = "Jon" lastName = "Smith" -~~~ +``` -~~~ +```text File "", line 2 lastName = "Smith" ^ IndentationError: unexpected indent -~~~ - +``` -* This error can be fixed by removing the extra spaces - at the beginning of the second line. +- This error can be fixed by removing the extra spaces + at the beginning of the second line. -## Loop variables can be called anything. +## Loop variables can be called anything -* As with all variables, loop variables are: - * Created on demand. - * Meaningless: their names can be anything at all. +- As with all variables, loop variables are: + - Created on demand. + - Meaningless: their names can be anything at all. -~~~ python +```python for kitten in [2, 3, 5]: print(kitten) -~~~ +``` +## The body of a loop can contain many statements -## The body of a loop can contain many statements. +- But no loop should be more than a few lines long. +- Hard for human beings to keep larger chunks of code in mind. -* But no loop should be more than a few lines long. -* Hard for human beings to keep larger chunks of code in mind. - -~~~ python +```python primes = [2, 3, 5] for p in primes: squared = p ** 2 cubed = p ** 3 print(p, squared, cubed) -~~~ +``` -~~~ +```text 2 4 8 3 9 27 5 25 125 -~~~ - +``` -## Use `range` to iterate over a sequence of numbers. +## Use `range` to iterate over a sequence of numbers -* The built-in function [`range`](https://docs.python.org/3/library/stdtypes.html#range) produces a sequence of numbers. - * *Not* a list: the numbers are produced on demand - to make looping over large ranges more efficient. -* `range(N)` is the numbers 0..N-1 - * Exactly the legal indices of a list or character string of length N +- The built-in function [`range`](https://docs.python.org/3/library/stdtypes.html#range) produces a sequence of numbers. + - _Not_ a list: the numbers are produced on demand + to make looping over large ranges more efficient. 
+- `range(N)` is the numbers 0..N-1 + - Exactly the legal indices of a list or character string of length N -~~~ python +```python print('a range is not a list: range(0, 3)') for number in range(0, 3): print(number) -~~~ +``` -~~~ +```text a range is not a list: range(0, 3) 0 1 2 -~~~ - +``` -## The Accumulator pattern turns many values into one. +## The Accumulator pattern turns many values into one -* A common pattern in programs is to: - 1. Initialize an *accumulator* variable to zero, the empty string, or the empty list. - 2. Update the variable with values from a collection. +- A common pattern in programs is to: + 1. Initialize an _accumulator_ variable to zero, the empty string, or the empty list. + 2. Update the variable with values from a collection. -~~~ python +```python # Sum the first 10 integers. total = 0 for number in range(10): - total = total + (number + 1) + total = total + (number + 1) print(total) -~~~ +``` -~~~ +```text 55 -~~~ +``` - -* Read `total = total + (number + 1)` as: - * Add 1 to the current value of the loop variable `number`. - * Add that to the current value of the accumulator variable `total`. - * Assign that to `total`, replacing the current value. -* We have to add `number + 1` because `range` produces 0..9, not 1..10. +- Read `total = total + (number + 1)` as: + - Add 1 to the current value of the loop variable `number`. + - Add that to the current value of the accumulator variable `total`. + - Assign that to `total`, replacing the current value. +- We have to add `number + 1` because `range` produces 0..9, not 1..10. ::::challenge{id="classifying_errors" title="Classifying Errors"} @@ -184,16 +173,16 @@ A program with a runtime error will start but an error will be thrown under cert Create a table showing the numbers of the lines that are executed when this program runs, and the values of the variables after each line is executed. 
-~~~ python +```python total = 0 for char in "tin": total = total + 1 -~~~ +``` :::solution | Line no | Variables | -|---------|----------------------| +| ------- | -------------------- | | 1 | total = 0 | | 2 | total = 0 char = 't' | | 3 | total = 1 char = 't' | @@ -201,119 +190,122 @@ for char in "tin": | 3 | total = 2 char = 'i' | | 2 | total = 2 char = 'n' | | 3 | total = 3 char = 'n' | + ::: :::: ::::challenge{id="reversing_a_string" title="Reversing a String"} -## -Fill in the blanks in the program below so that it prints "nit" -(the reverse of the original character string "tin"). +Fill in the blanks in the program below so that it prints "nit" (the reverse of the original character string "tin"). -~~~ python +```python nolint original = "tin" result = ____ for char in original: result = ____ print(result) -~~~ +``` :::solution -~~~ python + +```python original = "tin" result = "" for char in original: result = char + result print(result) -~~~ +``` ::: :::: ::::challenge{id="practice_accumulating" title="Practice Accumulating"} -Fill in the blanks in each of the programs below -to produce the indicated result. +Fill in the blanks in each of the programs below to produce the indicated result. 
-~~~ python +```python nolint # Total length of the strings in the list: ["red", "green", "blue"] =12 total = 0 for word in ["red", "green", "blue"]: ____ = ____ + len(word) print(total) -~~~ +``` :::solution -~~~ python + +```python total = 0 for word in ["red", "green", "blue"]: total = total + len(word) print(total) -~~~ +``` ::: -~~~ python +```python nolint # List of word lengths: ["red", "green", "blue"] =[3, 5, 4] lengths = ____ for word in ["red", "green", "blue"]: lengths.____(____) print(lengths) -~~~ +``` :::solution -~~~ python + +```python lengths = [] for word in ["red", "green", "blue"]: lengths.append(len(word)) print(lengths) -~~~ +``` ::: -~~~ python +```python nolint # Concatenate all words: ["red", "green", "blue"] ="redgreenblue" words = ["red", "green", "blue"] result = ____ for ____ in ____: ____ print(result) -~~~ +``` :::solution -~~~ python + +```python words = ["red", "green", "blue"] result = "" for word in words: result = result + word print(result) -~~~ +``` ::: -__Create an acronym:__ Starting from the list `["red", "green", "blue"]`, create the acronym `"RGB"` using -a for loop. +**Create an acronym:** Starting from the list `["red", "green", "blue"]`, create the acronym `"RGB"` using a for loop. -__Hint:__ You may need to use a string method to properly format the acronym. +**Hint:** You may need to use a string method to properly format the acronym. :::solution -~~~ python + +```python acronym = "" for word in ["red", "green", "blue"]: acronym = acronym + word[0].upper() print(acronym) -~~~ +``` ::: :::: ::::challenge{id="cumulative_sum" title="Cumulative Sum"} + ## Cumulative Sum Reorder and properly indent the lines of code below so that they print a list with the cumulative sum of data. The result should be `[1, 3, 5, 10]`. 
-~~~ python +```python nolint cumulative.append(total) for number in data: cumulative = [] @@ -321,10 +313,11 @@ total += number total = 0 print(cumulative) data = [1,2,2,5] -~~~ +``` :::solution -~~~ python + +```python total = 0 data = [1,2,2,5] cumulative = [] @@ -332,7 +325,7 @@ for number in data: total += number cumulative.append(total) print(cumulative) -~~~ +``` ::: :::: @@ -340,7 +333,7 @@ print(cumulative) ::::challenge{id="identifying_variable_name_errors" title="Identifying Variable Name Errors"} 1. Read the code below and try to identify what the errors are - *without* running it. + _without_ running it. 2. Run the code and read the error message. What type of `NameError` do you think this is? Is it a string with no quotes, a misspelled variable, or a @@ -348,7 +341,7 @@ print(cumulative) 3. Fix the error. 4. Repeat steps 2 and 3, until you have fixed all the errors. -~~~ python +```python nolint for number in range(10): # use a if the number is a multiple of 3, otherwise use b if (Number % 3) == 0: @@ -356,14 +349,15 @@ for number in range(10): else: message = message + "b" print(message) -~~~ +``` :::solution + - Python variable names are case sensitive: `number` and `Number` refer to different variables. - The variable `message` needs to be initialized as an empty string. - We want to add the string `"a"` to `message`, not the undefined variable `a`. -~~~ python +```python message = "" for number in range(10): # use a if the number is a multiple of 3, otherwise use b @@ -372,7 +366,7 @@ for number in range(10): else: message = message + "b" print(message) -~~~ +``` ::: :::: @@ -380,21 +374,22 @@ print(message) ::::challenge{id="identifying_item_errors" title="Identifying Item Errors"} 1. Read the code below and try to identify what the errors are - *without* running it. + _without_ running it. 2. Run the code, and read the error message. What type of error is it? 3. Fix the error. 
-~~~ python +```python seasons = ['Spring', 'Summer', 'Fall', 'Winter'] print('My favorite season is ', seasons[4]) -~~~ +``` :::solution This list has 4 elements and the index to access the last element in the list is `3`. -~~~ python + +```python seasons = ['Spring', 'Summer', 'Fall', 'Winter'] print('My favorite season is ', seasons[3]) -~~~ +``` ::: -:::: \ No newline at end of file +:::: diff --git a/introductory_courses/python/10_conditionals.md b/introductory_courses/python/10_conditionals.md index ab2ac66e..2e113833 100644 --- a/introductory_courses/python/10_conditionals.md +++ b/introductory_courses/python/10_conditionals.md @@ -1,114 +1,108 @@ --- name: Conditionals -dependsOn: [ - introductory_courses.python.09_for_loops -] +dependsOn: [introductory_courses.python.09_for_loops] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- -## Use `if` statements to control whether or not a block of code is executed. +## Use `if` statements to control whether or not a block of code is executed -* An `if` statement (more properly called a *conditional* statement) - controls whether some block of code is executed or not. -* Structure is similar to a `for` statement: - * First line opens with `if` and ends with a colon - * Body containing one or more statements is indented (usually by 4 spaces) +- An `if` statement (more properly called a _conditional_ statement) + controls whether some block of code is executed or not. 
+- Structure is similar to a `for` statement: + - First line opens with `if` and ends with a colon + - Body containing one or more statements is indented (usually by 4 spaces) -~~~ python +```python mass = 3.54 -if mass 3.0: +if mass > 3.0: print(mass, 'is large') mass = 2.07 -if mass 3.0: +if mass > 3.0: print(mass, 'is large') -~~~ +``` -~~~ +```text 3.54 is large -~~~ +``` +## Conditionals are often used inside loops -## Conditionals are often used inside loops. +- Not much point using a conditional when we know the value (as above). +- But useful when we have a collection to process. -* Not much point using a conditional when we know the value (as above). -* But useful when we have a collection to process. - -~~~ python +```python masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: - if m 3.0: + if m > 3.0: print(m, 'is large') -~~~ +``` -~~~ +```text 3.54 is large 9.22 is large -~~~ - +``` -## Use `else` to execute a block of code when an `if` condition is *not* true. +## Use `else` to execute a block of code when an `if` condition is _not_ true -* `else` can be used following an `if`. -* Allows us to specify an alternative to execute when the `if` *branch* isn't taken. +- `else` can be used following an `if`. +- Allows us to specify an alternative to execute when the `if` _branch_ isn't taken. -~~~ python +```python masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: - if m 3.0: + if m > 3.0: print(m, 'is large') else: print(m, 'is small') -~~~ +``` -~~~ +```text 3.54 is large 2.07 is small 9.22 is large 1.86 is small 1.71 is small -~~~ - +``` -## Use `elif` to specify additional tests. +## Use `elif` to specify additional tests -* May want to provide several alternative choices, each with its own test. -* Use `elif` (short for "else if") and a condition to specify these. -* Always associated with an `if`. -* Must come before the `else` (which is the "catch all"). +- May want to provide several alternative choices, each with its own test. 
+- Use `elif` (short for "else if") and a condition to specify these. +- Always associated with an `if`. +- Must come before the `else` (which is the "catch all"). -~~~ python +```python masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: - if m 9.0: + if m > 9.0: print(m, 'is HUGE') - elif m 3.0: + elif m > 3.0: print(m, 'is large') else: print(m, 'is small') -~~~ +``` -~~~ +```text 3.54 is large 2.07 is small 9.22 is HUGE 1.86 is small 1.71 is small -~~~ +``` +## Conditions are tested once, in order -## Conditions are tested once, in order. +- Python steps through the branches of the conditional in order, testing each in turn. +- So ordering matters. -* Python steps through the branches of the conditional in order, testing each in turn. -* So ordering matters. - -~~~ python +```python grade = 85 if grade >= 70: print('grade is C') @@ -116,45 +110,43 @@ elif grade >= 80: print('grade is B') elif grade >= 90: print('grade is A') -~~~ +``` -~~~ +```text grade is C -~~~ - +``` -* Does *not* automatically go back and re-evaluate if values change. +- Does _not_ automatically go back and re-evaluate if values change. -~~~ python +```python velocity = 10.0 -if velocity 20.0: +if velocity > 20.0: print('moving too fast') else: print('adjusting velocity') velocity = 50.0 -~~~ +``` -~~~ +```text adjusting velocity -~~~ +``` +- Often use conditionals in a loop to "evolve" the values of variables. -* Often use conditionals in a loop to "evolve" the values of variables. 
- -~~~ python +```python velocity = 10.0 for i in range(5): # execute the loop 5 times print(i, ':', velocity) - if velocity 20.0: + if velocity > 20.0: print('moving too fast') velocity = velocity - 5.0 else: print('moving too slow') velocity = velocity + 10.0 print('final velocity:', velocity) -~~~ +``` -~~~ +```text 0 : 10.0 moving too slow 1 : 20.0 @@ -166,71 +158,68 @@ moving too fast 4 : 20.0 moving too slow final velocity: 30.0 -~~~ +``` ## Compound Relations Using `and`, `or`, and Parentheses -Often, you want some combination of things to be true. You can combine -relations within a conditional using `and` and `or`. Continuing the example +Often, you want some combination of things to be true. You can combine +relations within a conditional using `and` and `or`. Continuing the example above, suppose you have -~~~ python -mass = [ 3.54, 2.07, 9.22, 1.86, 1.71] +```python +mass = [ 3.54, 2.07, 9.22, 1.86, 1.71] velocity = [10.00, 20.00, 30.00, 25.00, 20.00] i = 0 for i in range(5): - if mass[i] 5 and velocity[i] 20: + if mass[i] > 5 and velocity[i] > 20: print("Fast heavy object. Duck!") - elif mass[i] 2 and mass[i] <= 5 and velocity[i] <= 20: + elif mass[i] > 2 and mass[i] <= 5 and velocity[i] <= 20: print("Normal traffic") elif mass[i] <= 2 and velocity[i] <= 20: print("Slow light object. Ignore it") else: print("Whoa! Something is up with the data. Check it") -~~~ - +``` Just like with arithmetic, you can and should use parentheses whenever there -is possible ambiguity. A good general rule is to *always* use parentheses -when mixing `and` and `or` in the same condition. That is, instead of: - -~~~ python -if mass[i] <= 2 or mass[i] >= 5 and velocity[i] 20: -~~~ +is possible ambiguity. A good general rule is to _always_ use parentheses +when mixing `and` and `or` in the same condition. 
That is, instead of: +```python nolint +if mass[i] <= 2 or mass[i] >= 5 and velocity[i] > 20: +``` write one of these: -~~~ python -if (mass[i] <= 2 or mass[i] >= 5) and velocity[i] 20: -if mass[i] <= 2 or (mass[i] >= 5 and velocity[i] 20): -~~~ +```python nolint +if (mass[i] <= 2 or mass[i] >= 5) and velocity[i] > 20: +``` +```python nolint +if mass[i] <= 2 or (mass[i] >= 5 and velocity[i] > 20): +``` so it is perfectly clear to a reader (and to Python) what you really mean. -{: .callout} - ::::challenge{id="tracing_execution" title="Tracing Execution"} What does this program print? -~~~ python +```python pressure = 71.9 -if pressure 50.0: +if pressure > 50.0: pressure = 25.0 elif pressure <= 50.0: pressure = 0.0 print(pressure) -~~~ - +``` :::solution -~~~ +```text 25.0 -~~~ +``` ::: :::: @@ -241,7 +230,7 @@ Fill in the blanks so that this program creates a new list containing zeroes where the original list's values were negative and ones where the original list's values were positive. -~~~ python +```python nolint original = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4] result = ____ for value in original: @@ -250,16 +239,15 @@ for value in original: else: ____ print(result) -~~~ - +``` -~~~ +```text [0, 1, 1, 1, 0, 1] -~~~ +``` :::solution -~~~ python +```python original = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4] result = [] for value in original: @@ -268,7 +256,7 @@ for value in original: else: result.append(1) print(result) -~~~ +``` ::: :::: @@ -277,25 +265,25 @@ print(result) Modify this program so that it only processes files with fewer than 50 records. 
-~~~ python +```python nolint import glob import pandas as pd for filename in glob.glob('data/*.csv'): contents = pd.read_csv(filename) ____: print(filename, len(contents)) -~~~ +``` :::solution -~~~ python +```python import glob import pandas as pd for filename in glob.glob('data/*.csv'): contents = pd.read_csv(filename) if len(contents) < 50: print(filename, len(contents)) -~~~ +``` ::: :::: @@ -305,7 +293,7 @@ for filename in glob.glob('data/*.csv'): Modify this program so that it finds the largest and smallest values in the list no matter what the range of values originally is. -~~~ python +```python nolint values = [...some test data...] smallest, largest = None, None for v in values: @@ -315,91 +303,87 @@ for v in values: smallest = min(____, v) largest = max(____, v) print(smallest, largest) -~~~ - +``` What are the advantages and disadvantages of using this method to find the range of the data? :::solution -~~~ python +```python values = [-2,1,65,78,-54,-24,100] smallest, largest = None, None for v in values: - if smallest == None and largest == None: + if smallest is None and largest is None: smallest, largest = v, v else: smallest = min(smallest, v) largest = max(largest, v) print(smallest, largest) -~~~ +``` It can be argued that an advantage of using this method would be to make the code more readable. However, readability is in the eye of the beholder, so another reader may prefer this approach: -~~~ python +```python values = [-2,1,65,78,-54,-24,100] smallest, largest = None, None for v in values: - if smallest == None or v < smallest: + if smallest is None or v < smallest: smallest = v - if largest == None or v largest: + if largest is None or v > largest: largest = v print(smallest, largest) -~~~ +``` ::: :::: - :::callout + ## Using Functions With Conditionals in Pandas -Functions will often contain conditionals. Here is a short example that +Functions will often contain conditionals. 
Here is a short example that will indicate which quartile the argument is in based on hand-coded values for the quartile cut points. -~~~ python +```python def calculate_life_quartile(exp): if exp < 58.41: # This observation is in the first quartile return 1 elif exp >= 58.41 and exp < 67.05: # This observation is in the second quartile - return 2 + return 2 elif exp >= 67.05 and exp < 71.70: # This observation is in the third quartile - return 3 + return 3 elif exp >= 71.70: # This observation is in the fourth quartile - return 4 + return 4 else: # This observation has bad data - return None + return None calculate_life_quartile(62.5) -~~~ +``` - -~~~ +```text 2 -~~~ - +``` That function would typically be used within a `for` loop, but Pandas has a different, more efficient way of doing the same thing, and that is by -*applying* a function to a dataframe or a portion of a dataframe. Here +_applying_ a function to a dataframe or a portion of a dataframe. Here is an example, using the definition above. -~~~ python +```python data = pd.read_csv('data/gapminder_all.csv') data['life_qrtl'] = data['lifeExp_1952'].apply(calculate_life_quartile) -~~~ - +``` There is a lot in that second line, so let's take it piece by piece. On the right side of the `=` we start with `data['lifeExp']`, which is the -column in the dataframe called `data` labeled `lifExp`. We use the +column in the dataframe called `data` labeled `lifExp`. We use the `apply()` to do what it says, apply the `calculate_life_quartile` to the value of this column for every row in the dataframe. 
-::: \ No newline at end of file +::: diff --git a/introductory_courses/python/11_looping_over_data_sets.md b/introductory_courses/python/11_looping_over_data_sets.md index c0524e6d..825f7c5d 100644 --- a/introductory_courses/python/11_looping_over_data_sets.md +++ b/introductory_courses/python/11_looping_over_data_sets.md @@ -1,30 +1,28 @@ --- name: Looping Over Data Sets -dependsOn: [ - introductory_courses.python.10_conditionals -] +dependsOn: [introductory_courses.python.10_conditionals] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- -## Use a `for` loop to process files given a list of their names. +## Use a `for` loop to process files given a list of their names -* A filename is a character string. -* And lists can contain character strings. +- A filename is a character string. +- And lists can contain character strings. -~~~ python +```python import pandas as pd for filename in ['data/gapminder_gdp_africa.csv', 'data/gapminder_gdp_asia.csv']: data = pd.read_csv(filename, index_col='country') print(filename, data.min()) -~~~ +``` -~~~ +```text data/gapminder_gdp_africa.csv gdpPercap_1952 298.846212 gdpPercap_1957 335.997115 gdpPercap_1962 355.203227 @@ -43,70 +41,67 @@ gdpPercap_1997 415 gdpPercap_2002 611 gdpPercap_2007 944 dtype: float64 -~~~ +``` +## Use [`glob.glob`](https://docs.python.org/3/library/glob.html#glob.glob) to find sets of files whose names match a pattern -## Use [`glob.glob`](https://docs.python.org/3/library/glob.html#glob.glob) to find sets of files whose names match a pattern. 
+- In Unix, the term "globbing" means "matching a set of files with a pattern". +- The most common patterns are: + - `*` meaning "match zero or more characters" + - `?` meaning "match exactly one character" +- Python's standard library contains the [`glob`](https://docs.python.org/3/library/glob.html) module to provide pattern matching functionality +- The [`glob`](https://docs.python.org/3/library/glob.html) module contains a function also called `glob` to match file patterns +- E.g., `glob.glob('*.txt')` matches all files in the current directory + whose names end with `.txt`. +- Result is a (possibly empty) list of character strings. -* In Unix, the term "globbing" means "matching a set of files with a pattern". -* The most common patterns are: - * `*` meaning "match zero or more characters" - * `?` meaning "match exactly one character" -* Python's standard library contains the [`glob`](https://docs.python.org/3/library/glob.html) module to provide pattern matching functionality -* The [`glob`](https://docs.python.org/3/library/glob.html) module contains a function also called `glob` to match file patterns -* E.g., `glob.glob('*.txt')` matches all files in the current directory - whose names end with `.txt`. -* Result is a (possibly empty) list of character strings. - -~~~ python +```python import glob print('all csv files in data directory:', glob.glob('data/*.csv')) -~~~ +``` -~~~ +```text all csv files in data directory: ['data/gapminder_all.csv', 'data/gapminder_gdp_africa.csv', \ 'data/gapminder_gdp_americas.csv', 'data/gapminder_gdp_asia.csv', 'data/gapminder_gdp_europe.csv', \ 'data/gapminder_gdp_oceania.csv'] -~~~ +``` -~~~ python +```python print('all PDB files:', glob.glob('*.pdb')) -~~~ +``` -~~~ +```text all PDB files: [] -~~~ - +``` -## Use `glob` and `for` to process batches of files. 
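The difference between `*` and `?` is easy to demonstrate against a scratch directory; the sketch below builds one with `tempfile`, and the file names in it are invented for illustration:

```python
import glob
import os
import tempfile

# Create a scratch directory holding a few empty files to match against
tmp = tempfile.mkdtemp()
for name in ['a1.csv', 'a2.csv', 'b10.csv', 'notes.txt']:
    open(os.path.join(tmp, name), 'w').close()

# '*' matches any run of characters (including none)...
print(sorted(glob.glob(os.path.join(tmp, '*.csv'))))
# ...while '?' matches exactly one, so 'b10.csv' is excluded here
print(sorted(glob.glob(os.path.join(tmp, 'a?.csv'))))
```

Note the `sorted` calls: `glob.glob` does not promise any particular ordering of its results, so sorting makes the output reproducible.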
+## Use `glob` and `for` to process batches of files -* Helps a lot if the files are named and stored systematically and consistently - so that simple patterns will find the right data. +- Helps a lot if the files are named and stored systematically and consistently + so that simple patterns will find the right data. -~~~ python +```python for filename in glob.glob('data/gapminder_*.csv'): data = pd.read_csv(filename) print(filename, data['gdpPercap_1952'].min()) -~~~ +``` -~~~ +```text data/gapminder_all.csv 298.8462121 data/gapminder_gdp_africa.csv 298.8462121 data/gapminder_gdp_americas.csv 1397.717137 data/gapminder_gdp_asia.csv 331.0 data/gapminder_gdp_europe.csv 973.5331948 data/gapminder_gdp_oceania.csv 10039.59564 -~~~ +``` - -* This includes all data, as well as per-region data. -* Use a more specific pattern in the exercises to exclude the whole data set. -* But note that the minimum of the entire data set is also the minimum of one of the data sets, - which is a nice check on correctness. +- This includes all data, as well as per-region data. +- Use a more specific pattern in the exercises to exclude the whole data set. +- But note that the minimum of the entire data set is also the minimum of one of the data sets, + which is a nice check on correctness. ::::challenge{id="determining_matches" title="Determining Matches"} -Which of these files is *not* matched by the expression `glob.glob('data/*as*.csv')`? +Which of these files is _not_ matched by the expression `glob.glob('data/*as*.csv')`? 1. `data/gapminder_gdp_africa.csv` 2. `data/gapminder_gdp_americas.csv` @@ -123,7 +118,7 @@ Which of these files is *not* matched by the expression `glob.glob('data/*as*.cs Modify this program so that it prints the number of records in the file that has the fewest records. 
-~~~ python +```python nolint import glob import pandas as pd fewest = ____ @@ -131,13 +126,14 @@ for filename in glob.glob('data/*.csv'): dataframe = pd.____(filename) fewest = min(____, dataframe.shape[0]) print('smallest file has', fewest, 'records') -~~~ +``` Note that the [`DataFrame.shape()` method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shape.html) returns a tuple with the number of rows and columns of the data frame. :::solution -~~~ python + +```python import glob import pandas as pd fewest = float('Inf') @@ -145,7 +141,7 @@ for filename in glob.glob('data/*.csv'): dataframe = pd.read_csv(filename) fewest = min(fewest, dataframe.shape[0]) print('smallest file has', fewest, 'records') -~~~ +``` ::: :::: @@ -158,7 +154,8 @@ in a single chart. :::solution This solution builds a useful legend by using the [string `split` method](https://docs.python.org/3/library/stdtypes.html#str.split) to extract the `region` from the path 'data/gapminder_gdp_a_specific_region.csv'. -~~~ python + +```python import glob import pandas as pd import matplotlib.pyplot as plt @@ -167,35 +164,38 @@ for filename in glob.glob('data/gapminder_gdp*.csv'): dataframe = pd.read_csv(filename) # extract .csv'. # we will split the string using the split method and `_` as our separator, - # retrieve the last string in the list that split returns (`.csv`), + # retrieve the last string in the list that split returns (`.csv`), # and then remove the `.csv` extension from that string. - region = filename.split('_')[-1][:-4] + region = filename.split('_')[-1][:-4] dataframe.mean().plot(ax=ax, label=region) plt.legend() plt.show() -~~~ +``` ::: :::: :::callout + ## Dealing with File Paths + The [`pathlib` module](https://docs.python.org/3/library/pathlib.html) provides useful abstractions for file and path manipulation like returning the name of a file without the file extension. This is very useful when looping over files and directories. 
In the example below, we create a `Path` object and inspect its attributes. -~~~ python + +```python from pathlib import Path p = Path("data/gapminder_gdp_africa.csv") print(p.parent), print(p.stem), print(p.suffix) -~~~ +``` -~~~ +```text data gapminder_gdp_africa .csv -~~~ +``` -__Hint:__ It is possible to check all available attributes and methods on the `Path` object with the `dir()` +**Hint:** It is possible to check all available attributes and methods on the `Path` object with the `dir()` function! ::: diff --git a/introductory_courses/python/12_errors_and_exceptions.md b/introductory_courses/python/12_errors_and_exceptions.md index d21e9825..55dcbe6e 100644 --- a/introductory_courses/python/12_errors_and_exceptions.md +++ b/introductory_courses/python/12_errors_and_exceptions.md @@ -1,15 +1,13 @@ --- name: Errors and Exceptions -dependsOn: [ - introductory_courses.python.11_looping_over_data_sets -] +dependsOn: [introductory_courses.python.11_looping_over_data_sets] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- Every programmer encounters errors, @@ -20,14 +18,14 @@ and can make coding feel like a hopeless endeavour. However, understanding what the different types of errors are and when you are likely to encounter them can help a lot. -Once you know *why* you get certain types of errors, +Once you know _why_ you get certain types of errors, they become much easier to fix. Errors in Python have a very specific form, -called a *traceback*. +called a _traceback_. 
Let's examine one: -~~~ python +```python # This code has an intentional error. You can type it directly or # use it for reference to understand the error message below. def favorite_ice_cream(): @@ -39,10 +37,9 @@ def favorite_ice_cream(): print(ice_creams[3]) favorite_ice_cream() -~~~ +``` - -~~~ +```text --------------------------------------------------------------------------- IndexError Traceback (most recent call last) () @@ -58,27 +55,27 @@ IndexError Traceback (most recent call last) 11 favorite_ice_cream() IndexError: list index out of range -~~~ - +``` This particular traceback has two levels. You can determine the number of levels by looking for the number of arrows on the left hand side. In this case: -1. The first shows code from the cell above, - with an arrow pointing to Line 11 (which is `favorite_ice_cream()`). +1. The first shows code from the cell above, + with an arrow pointing to Line 11 (which is `favorite_ice_cream()`). -2. The second shows some code in the function `favorite_ice_cream`, - with an arrow pointing to Line 9 (which is `print(ice_creams[3])`). +2. The second shows some code in the function `favorite_ice_cream`, + with an arrow pointing to Line 9 (which is `print(ice_creams[3])`). The last level is the actual place where the error occurred. The other level(s) show what function the program executed to get to the next level down. So, in this case, the program first performed a -*function call* to the function `favorite_ice_cream`. +_function call_ to the function `favorite_ice_cream`. Inside this function, the program encountered an error on Line 6, when it tried to run the code `print(ice_creams[3])`. :::callout + ## Long Tracebacks Sometimes, you might see a traceback that is very long @@ -102,7 +99,7 @@ if you fix the error, but encounter a new one, you can tell that the error changed. 
Additionally, -sometimes knowing *where* the error occurred is enough to fix it, +sometimes knowing _where_ the error occurred is enough to fix it, even if you don't entirely understand the message. If you do encounter an error you don't recognize, @@ -119,7 +116,7 @@ hopefully the custom error message is informative enough to help you figure out When you forget a colon at the end of a line, accidentally add one space too many when indenting under an `if` statement, or forget a parenthesis, -you will encounter a *syntax error*. +you will encounter a _syntax error_. This means that Python couldn't figure out how to read your program. This is similar to forgetting punctuation in English: for example, @@ -134,52 +131,49 @@ If Python doesn't know how to read the program, it will give up and inform you with an error. For example: -~~~ python +```python nolint def some_function() msg = 'hello, world!' print(msg) return msg -~~~ - +``` -~~~ +```text File "", line 1 def some_function() ^ SyntaxError: invalid syntax -~~~ - +``` Here, Python tells us that there is a `SyntaxError` on line 1, and even puts a little arrow in the place where there is an issue. In this case the problem is that the function definition is missing a colon at the end. -Actually, the function above has *two* issues with syntax. +Actually, the function above has _two_ issues with syntax. If we fix the problem with the colon, -we see that there is *also* an `IndentationError`, +we see that there is _also_ an `IndentationError`, which means that the lines in the function definition do not all have the same indentation: -~~~ python +```python nolint def some_function(): msg = 'hello, world!' 
print(msg) return msg -~~~ - +``` -~~~ +```text File "", line 4 return msg ^ IndentationError: unexpected indent -~~~ - +``` Both `SyntaxError` and `IndentationError` indicate a problem with the syntax of your program, but an `IndentationError` is more specific: -it *always* means that there is a problem with how your code is indented. +it _always_ means that there is a problem with how your code is indented. :::callout + ## Tabs and Spaces Some indentation errors are harder to spot than others. @@ -191,23 +185,23 @@ If you're working in a Jupyter notebook, be sure to copy and paste this example rather than trying to type it in manually because Jupyter automatically replaces tabs with spaces. -~~~ python +```python nolint def some_function(): - msg = 'hello, world!' - print(msg) + msg = 'hello, world!' + print(msg) return msg -~~~ - +``` Visually it is impossible to spot the error. Fortunately, Python does not allow you to mix tabs and spaces. -~~~ +```text File "", line 4 return msg ^ TabError: inconsistent use of tabs and spaces in indentation -~~~ +``` + ::: ## Variable Name Errors @@ -216,20 +210,18 @@ Another very common type of error is called a `NameError`, and occurs when you try to use a variable that does not exist. For example: -~~~ python +```python nolint print(a) -~~~ +``` - -~~~ +```text --------------------------------------------------------------------------- NameError Traceback (most recent call last) () ----1 print(a) NameError: name 'a' is not defined -~~~ - +``` Variable name errors come with some of the most informative error messages, which are usually of the form "name 'the_variable_name' is not defined". @@ -242,33 +234,30 @@ there are a few very common reasons why you might have an undefined variable. 
The first is that you meant to use a string, but forgot to put quotes around it: -~~~ python +```python nolint print(hello) -~~~ +``` - -~~~ +```text --------------------------------------------------------------------------- NameError Traceback (most recent call last) () ----1 print(hello) NameError: name 'hello' is not defined -~~~ - +``` The second reason is that you might be trying to use a variable that does not yet exist. In the following example, `count` should have been defined (e.g., with `count = 0`) before the for loop: -~~~ python +```python nolint for number in range(10): count = count + number print('The count is:', count) -~~~ +``` - -~~~ +```text --------------------------------------------------------------------------- NameError Traceback (most recent call last) () @@ -277,8 +266,7 @@ NameError Traceback (most recent call last) 3 print('The count is:', count) NameError: name 'count' is not defined -~~~ - +``` Finally, the third possibility is that you made a typo when you were writing your code. Let's say we fixed the error above by adding the line `Count = 0` before the for loop. @@ -287,15 +275,14 @@ Remember that variables are case-sensitive, so the variable `count` is different from `Count`. We still get the same error, because we still have not defined `count`: -~~~ python +```python nolint Count = 0 for number in range(10): count = count + number print('The count is:', count) -~~~ +``` - -~~~ +```text --------------------------------------------------------------------------- NameError Traceback (most recent call last) () @@ -305,8 +292,7 @@ NameError Traceback (most recent call last) 4 print('The count is:', count) NameError: name 'count' is not defined -~~~ - +``` ## Index Errors @@ -319,23 +305,21 @@ and they answered "caturday", you might be a bit annoyed. 
Python gets similarly annoyed if you try to ask it for an item that doesn't exist: -~~~ python +```python letters = ['a', 'b', 'c'] print('Letter #1 is', letters[0]) print('Letter #2 is', letters[1]) print('Letter #3 is', letters[2]) print('Letter #4 is', letters[3]) -~~~ +``` - -~~~ +```text Letter #1 is a Letter #2 is b Letter #3 is c -~~~ - +``` -~~~ +```text --------------------------------------------------------------------------- IndexError Traceback (most recent call last) () @@ -344,8 +328,7 @@ IndexError Traceback (most recent call last) ----5 print('Letter #4 is', letters[3]) IndexError: list index out of range -~~~ - +``` Here, Python is telling us that there is an `IndexError` in our code, @@ -362,20 +345,18 @@ returns an `UnsupportedOperationError`. More generally, problems with input and output manifest as `IOError`s or `OSError`s, depending on the version of Python you use. -~~~ python +```python file_handle = open('myfile.txt', 'r') -~~~ - +``` -~~~ +```text --------------------------------------------------------------------------- FileNotFoundError Traceback (most recent call last) () ----1 file_handle = open('myfile.txt', 'r') FileNotFoundError: [Errno 2] No such file or directory: 'myfile.txt' -~~~ - +``` One reason for receiving this error is that you specified an incorrect path to the file. For example, @@ -396,13 +377,12 @@ and then try to read from it, you will get an `UnsupportedOperation` error telling you that the file was not opened for reading: -~~~ python +```python file_handle = open('myfile.txt', 'w') file_handle.read() -~~~ - +``` -~~~ +```text --------------------------------------------------------------------------- UnsupportedOperation Traceback (most recent call last) () @@ -410,8 +390,7 @@ UnsupportedOperation Traceback (most recent call last) ----2 file_handle.read() UnsupportedOperation: not readable -~~~ - +``` These are the most common errors with files, though many others exist. 
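When you anticipate a file error, Python's `try`/`except` lets a program recover instead of stopping with a traceback. A minimal sketch (the file name here is made up and assumed not to exist):

```python
def read_if_present(path):
    """Return a file's contents, or None if the file does not exist."""
    try:
        with open(path, 'r') as file_handle:
            return file_handle.read()
    except FileNotFoundError:
        # Catch the specific error rather than using a bare `except`
        return None

print(read_if_present('definitely_not_here.txt'))  # None if the file is absent
```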
@@ -423,13 +402,13 @@ often reveals common reasons why you might get that error. Read the Python code and the resulting traceback below, and answer the following questions: -1. How many levels does the traceback have? -2. What is the function name where the error occurred? -3. On which line number in this function did the error occur? -4. What is the type of error? -5. What is the error message? +1. How many levels does the traceback have? +2. What is the function name where the error occurred? +3. On which line number in this function did the error occur? +4. What is the type of error? +5. What is the error message? -~~~ python +```python # This code has an intentional error. Do not type it directly; # use it for reference to understand the error message below. def print_message(day): @@ -448,10 +427,9 @@ def print_friday_message(): print_message('Friday') print_friday_message() -~~~ - +``` -~~~ +```text --------------------------------------------------------------------------- KeyError Traceback (most recent call last) () @@ -474,16 +452,17 @@ KeyError Traceback (most recent call last) 13 def print_friday_message(): KeyError: 'Friday' -~~~ - +``` :::solution + 1. 3 levels 2. `print_message` 3. 11 4. `KeyError` 5. There isn't really a message; you're supposed -to infer that `Friday` is not a key in `messages`. + to infer that `Friday` is not a key in `messages`. + ::: :::: @@ -494,25 +473,24 @@ to infer that `Friday` is not a key in `messages`. 3. Fix the error. 4. Repeat steps 2 and 3, until you have fixed all the errors. -~~~ python +```python nolint def another_function print('Syntax errors are annoying.') print('But at least Python tells us about them!') print('So they are usually not too hard to fix.') -~~~ - +``` :::solution `SyntaxError` for missing `():` at end of first line, `IndentationError` for mismatch between second and third lines. 
A fixed version is: -~~~ python +```python def another_function(): print('Syntax errors are annoying.') print('But at least Python tells us about them!') print('So they are usually not too hard to fix.') -~~~ +``` ::: :::: @@ -528,7 +506,7 @@ def another_function(): 3. Fix the error. 4. Repeat steps 2 and 3, until you have fixed all the errors. -~~~ python +```python nolint for number in range(10): # use a if the number is a multiple of 3, otherwise use b if (Number % 3) == 0: @@ -536,8 +514,7 @@ for number in range(10): else: message = message + 'b' print(message) -~~~ - +``` :::solution 3 `NameError`s for `number` being misspelled, for `message` not defined, @@ -545,7 +522,7 @@ and for `a` not being in quotes. Fixed version: -~~~ python +```python message = '' for number in range(10): # use a if the number is a multiple of 3, otherwise use b @@ -554,7 +531,7 @@ for number in range(10): else: message = message + 'b' print(message) -~~~ +``` ::: :::: @@ -565,20 +542,19 @@ print(message) 2. Run the code, and read the error message. What type of error is it? 3. Fix the error. -~~~ python +```python seasons = ['Spring', 'Summer', 'Fall', 'Winter'] print('My favorite season is ', seasons[4]) -~~~ - +``` :::solution `IndexError`; the last entry is `seasons[3]`, so `seasons[4]` doesn't make sense. 
A fixed version is: -~~~ python +```python seasons = ['Spring', 'Summer', 'Fall', 'Winter'] print('My favorite season is ', seasons[-1]) -~~~ +``` ::: :::: diff --git a/introductory_courses/python/13_writing_functions.md b/introductory_courses/python/13_writing_functions.md index cb92bb4d..4b3c3969 100644 --- a/introductory_courses/python/13_writing_functions.md +++ b/introductory_courses/python/13_writing_functions.md @@ -1,171 +1,161 @@ --- name: Writing Functions -dependsOn: [ - introductory_courses.python.12_errors_and_exceptions -] +dependsOn: [introductory_courses.python.12_errors_and_exceptions] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- -## Break programs down into functions to make them easier to understand. - -* Human beings can only keep a few items in working memory at a time. -* Understand larger/more complicated ideas by understanding and combining pieces. - * Components in a machine. - * Lemmas when proving theorems. -* Functions serve the same purpose in programs. - * *Encapsulate* complexity so that we can treat it as a single "thing". -* Also enables *re-use*. - * Write one time, use many times. - -## Define a function using `def` with a name, parameters, and a block of code. - -* Begin the definition of a new function with `def`. -* Followed by the name of the function. - * Must obey the same rules as variable names. -* Then *parameters* in parentheses. - * Empty parentheses if the function doesn't take any inputs. - * We will discuss this in detail in a moment. -* Then a colon. -* Then an indented block of code. 
- -~~~ python +## Break programs down into functions to make them easier to understand + +- Human beings can only keep a few items in working memory at a time. +- Understand larger/more complicated ideas by understanding and combining pieces. + - Components in a machine. + - Lemmas when proving theorems. +- Functions serve the same purpose in programs. + - _Encapsulate_ complexity so that we can treat it as a single "thing". +- Also enables _re-use_. + - Write one time, use many times. + +## Define a function using `def` with a name, parameters, and a block of code + +- Begin the definition of a new function with `def`. +- Followed by the name of the function. + - Must obey the same rules as variable names. +- Then _parameters_ in parentheses. + - Empty parentheses if the function doesn't take any inputs. + - We will discuss this in detail in a moment. +- Then a colon. +- Then an indented block of code. + +```python def print_greeting(): print('Hello!') -~~~ +``` +## Defining a function does not run it -## Defining a function does not run it. +- Defining a function does not run it. + - Like assigning a value to a variable. +- Must call the function to execute the code it contains. -* Defining a function does not run it. - * Like assigning a value to a variable. -* Must call the function to execute the code it contains. - -~~~ python +```python print_greeting() -~~~ +``` -~~~ +```text Hello! -~~~ - +``` -## Arguments in call are matched to parameters in definition. +## Arguments in call are matched to parameters in definition -* Functions are most useful when they can operate on different data. -* Specify *parameters* when defining a function. - * These become variables when the function is executed. - * Are assigned the arguments in the call (i.e., the values passed to the function). - * If you don't name the arguments when using them in the call, the arguments will be matched to -parameters in the order the parameters are defined in the function. 
+- Functions are most useful when they can operate on different data. +- Specify _parameters_ when defining a function. + - These become variables when the function is executed. + - Are assigned the arguments in the call (i.e., the values passed to the function). + - If you don't name the arguments when using them in the call, the arguments will be matched to + parameters in the order the parameters are defined in the function. -~~~ python +```python def print_date(year, month, day): joined = str(year) + '/' + str(month) + '/' + str(day) print(joined) print_date(1871, 3, 19) -~~~ +``` -~~~ +```text 1871/3/19 -~~~ - +``` Or, we can name the arguments when we call the function, which allows us to specify them in any order: -~~~ python +```python print_date(month=3, day=19, year=1871) -~~~ +``` -~~~ +```text 1871/3/19 -~~~ +``` +- Via [Twitter](https://twitter.com/minisciencegirl/status/693486088963272705): + `()` contains the ingredients for the function + while the body contains the recipe. -* Via [Twitter](https://twitter.com/minisciencegirl/status/693486088963272705): - `()` contains the ingredients for the function - while the body contains the recipe. +## Functions may return a result to their caller using `return` -## Functions may return a result to their caller using `return`. +- Use `return ...` to give a value back to the caller. +- May occur anywhere in the function. +- But functions are easier to understand if `return` occurs: + - At the start to handle special cases. + - At the very end, with a final result. -* Use `return ...` to give a value back to the caller. -* May occur anywhere in the function. -* But functions are easier to understand if `return` occurs: - * At the start to handle special cases. - * At the very end, with a final result. 
- -~~~ python +```python def average(values): if len(values) == 0: return None return sum(values) / len(values) -~~~ - +``` -~~~ python +```python a = average([1, 3, 4]) print('average of actual values:', a) -~~~ +``` -~~~ +```text average of actual values: 2.6666666666666665 -~~~ +``` - -~~~ python +```python print('average of empty list:', average([])) -~~~ +``` -~~~ +```text average of empty list: None -~~~ - +``` -* Remember: [every function returns something]({{ page.root }}/04-built-in/). -* A function that doesn't explicitly `return` a value automatically returns `None`. +- Remember: [every function returns something]({{ page.root }}/04-built-in/). +- A function that doesn't explicitly `return` a value automatically returns `None`. -~~~ python +```python result = print_date(1871, 3, 19) print('result of call is:', result) -~~~ +``` -~~~ +```text 1871/3/19 result of call is: None -~~~ +``` ::::challenge{id="identifying_syntax_errors" title="Identifying Syntax Errors"} 1. Read the code below and try to identify what the errors are - *without* running it. + _without_ running it. 2. Run the code and read the error message. Is it a `SyntaxError` or an `IndentationError`? 3. Fix the error. 4. Repeat steps 2 and 3 until you have fixed all the errors. -~~~ python +```python nolint def another_function print("Syntax errors are annoying.") print("But at least python tells us about them!") print("So they are usually not too hard to fix.") -~~~ - +``` :::solution -~~~ python +```python def another_function(): - print("Syntax errors are annoying.") - print("But at least Python tells us about them!") - print("So they are usually not too hard to fix.") -~~~ + print("Syntax errors are annoying.") + print("But at least Python tells us about them!") + print("So they are usually not too hard to fix.") +``` ::: :::: @@ -174,30 +164,30 @@ def another_function(): What does the following program print? 
-~~~ python +```python def report(pressure): print('pressure is', pressure) print('calling', report, 22.5) -~~~ +``` :::solution -~~~ +```text calling = 85: - print("jumbo") + print("jumbo") elif mass >= 70: - print("large") + print("large") elif mass < 70 and mass >= 55: - print("medium") + print("medium") else: - print("small") -~~~ + print("small") +``` +> The simplified program follows. What function definition will make it functional? -> -The simplified program follows. What function definition will make it functional? - -~~~ python +```python nolint # revised version import random for i in range(10): @@ -383,20 +369,20 @@ for i in range(10): # the (random) mass will be 70 +/- 20 grams mass = 70 + 20.0 * (2.0 * random.random() - 1.0) - print(mass, print_egg_label(mass)) - -~~~ + print(mass, print_egg_label(mass)) +``` > -1. Create a function definition for `print_egg_label()` that will work with the revised program above. Note, the function's return value will be significant. Sample output might be `71.23 large`. -2. A dirty egg might have a mass of more than 90 grams, and a spoiled or broken egg will probably have a mass that's less than 50 grams. Modify your `print_egg_label()` function to account for these error conditions. Sample output could be `25 too light, probably spoiled`. + +1. Create a function definition for `print_egg_label()` that will work with the revised program above. Note, the function's return value will be significant. Sample output might be `71.23 large`. +2. A dirty egg might have a mass of more than 90 grams, and a spoiled or broken egg will probably have a mass that's less than 50 grams. Modify your `print_egg_label()` function to account for these error conditions. Sample output could be `25 too light, probably spoiled`. 
:::solution -~~~ python +```python def print_egg_label(mass): - #egg sizing machinery prints a label + # egg sizing machinery prints a label if mass >= 90: return "warning: egg might be dirty" elif mass >= 85: @@ -409,7 +395,7 @@ def print_egg_label(mass): return "too light, probably spoiled" else: return "small" -~~~ +``` ::: :::: @@ -418,35 +404,32 @@ def print_egg_label(mass): Assume that the following code has been executed: -~~~ python +```python import pandas as pd df = pd.read_csv('data/gapminder_gdp_asia.csv', index_col=0) japan = df.loc['Japan'] -~~~ - +``` 1. Complete the statements below to obtain the average GDP for Japan across the years reported for the 1980s. - ~~~ python - year = 1983 - gdp_decade = 'gdpPercap_' + str(year // ____) - avg = (japan.loc[gdp_decade + ___] + japan.loc[gdp_decade + ___]) / 2 - ~~~ - + ```python nolint + year = 1983 + gdp_decade = 'gdpPercap_' + str(year // ____) + avg = (japan.loc[gdp_decade + ___] + japan.loc[gdp_decade + ___]) / 2 + ``` 2. Abstract the code above into a single function. - ~~~ python - def avg_gdp_in_decade(country, continent, year): - df = pd.read_csv('data/gapminder_gdp_'+___+'.csv',delimiter=',',index_col=0) - ____ - ____ - ____ - return avg - ~~~ - + ```python nolint + def avg_gdp_in_decade(country, continent, year): + df = pd.read_csv('data/gapminder_gdp_'+___+'.csv',delimiter=',',index_col=0) + ____ + ____ + ____ + return avg + ``` 3. How would you generalize this function if you did not know beforehand which specific years occurred as columns in the data? @@ -458,52 +441,48 @@ japan = df.loc['Japan'] 1. The average GDP for Japan across the years reported for the 1980s is computed with: - ~~~ python - year = 1983 - gdp_decade = 'gdpPercap_' + str(year // 10) - avg = (japan.loc[gdp_decade + '2'] + japan.loc[gdp_decade + '7']) / 2 - ~~~ - + ```python + year = 1983 + gdp_decade = 'gdpPercap_' + str(year // 10) + avg = (japan.loc[gdp_decade + '2'] + japan.loc[gdp_decade + '7']) / 2 + ``` 2. 
That code as a function is: - ~~~ python - def avg_gdp_in_decade(country, continent, year): - df = pd.read_csv('data/gapminder_gdp_' + continent + '.csv', index_col=0) - c = df.loc[country] - gdp_decade = 'gdpPercap_' + str(year // 10) - avg = (c.loc[gdp_decade + '2'] + c.loc[gdp_decade + '7'])/2 - return avg - ~~~ - + ```python test + def avg_gdp_in_decade(country, continent, year): + df = pd.read_csv('data/gapminder_gdp_' + continent + '.csv', index_col=0) + c = df.loc[country] + gdp_decade = 'gdpPercap_' + str(year // 10) + avg = (c.loc[gdp_decade + '2'] + c.loc[gdp_decade + '7'])/2 + return avg + ``` 3. To obtain the average for the relevant years, we need to loop over them: - ~~~ python + ```python def avg_gdp_in_decade(country, continent, year): - df = pd.read_csv('data/gapminder_gdp_' + continent + '.csv', index_col=0) - c = df.loc[country] - gdp_decade = 'gdpPercap_' + str(year // 10) - total = 0.0 - num_years = 0 - for yr_header in c.index: # c's index contains reported years - if yr_header.startswith(gdp_decade): - total = total + c.loc[yr_header] - num_years = num_years + 1 - return total/num_years - ~~~ - + df = pd.read_csv('data/gapminder_gdp_' + continent + '.csv', index_col=0) + c = df.loc[country] + gdp_decade = 'gdpPercap_' + str(year // 10) + total = 0.0 + num_years = 0 + for yr_header in c.index: # c's index contains reported years + if yr_header.startswith(gdp_decade): + total = total + c.loc[yr_header] + num_years = num_years + 1 + return total/num_years + ``` The function can now be called by: -~~~ python +```python avg_gdp_in_decade('Japan','asia',1983) -~~~ - +``` -~~~ +```text 20880.023800000003 -~~~ +``` ::: :::: @@ -513,11 +492,11 @@ avg_gdp_in_decade('Japan','asia',1983) In mathematics, a [dynamical system](https://en.wikipedia.org/wiki/Dynamical_system) is a system in which a function describes the time dependence of a point in a geometrical space. 
A canonical example of a dynamical system is the [logistic map](https://en.wikipedia.org/wiki/Logistic_map), -a growth model that computes a new population density (between 0 and 1) based on the current +a growth model that computes a new population density (between 0 and 1) based on the current density. In the model, time takes discrete values 0, 1, 2, ... 1. Define a function called `logistic_map` that takes two inputs: `x`, representing the current - population (at time `t`), and a parameter `r = 1`. This function should return a value + population (at time `t`), and a parameter `r = 1`. This function should return a value representing the state of the system (population) at time `t + 1`, using the mapping function: `f(t+1) = r * f(t) * [1 - f(t)]` @@ -536,23 +515,21 @@ density. In the model, time takes discrete values 0, 1, 2, ... :::solution -1. ~~~ python +1. ```python def logistic_map(x, r): return r * x * (1 - x) - ~~~ + ``` - -2. ~~~ python +2. ```python initial_population = 0.5 t_final = 10 r = 1.0 population = [initial_population] for t in range(1, t_final): population.append( logistic_map(population[t-1], r) ) - ~~~ - + ``` -3. ~~~ python +3. ```python def iterate(initial_population, t_final, r): population = [initial_population] for t in range(1, t_final): @@ -562,14 +539,15 @@ density. In the model, time takes discrete values 0, 1, 2, ... for period in (10, 100, 1000): population = iterate(0.5, period, 1) print(population[-1]) - ~~~ + ``` - ~~~ + ```text 0.07508929631879595 0.009485759503982033 0.0009923756709128578 - ~~~ - + ``` + The population seems to be approaching zero. 
+ ::: :::: diff --git a/introductory_courses/python/14_variable_scope.md b/introductory_courses/python/14_variable_scope.md index 419864bc..ad54ba5c 100644 --- a/introductory_courses/python/14_variable_scope.md +++ b/introductory_courses/python/14_variable_scope.md @@ -1,66 +1,63 @@ --- name: Variable Scope -dependsOn: [ - introductory_courses.python.13_writing_functions -] +dependsOn: [introductory_courses.python.13_writing_functions] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- -## The scope of a variable is the part of a program that can 'see' that variable. +## The scope of a variable is the part of a program that can 'see' that variable -* There are only so many sensible names for variables. -* People using functions shouldn't have to worry about - what variable names the author of the function used. -* People writing functions shouldn't have to worry about - what variable names the function's caller uses. -* The part of a program in which a variable is visible is called its *scope*. +- There are only so many sensible names for variables. +- People using functions shouldn't have to worry about + what variable names the author of the function used. +- People writing functions shouldn't have to worry about + what variable names the function's caller uses. +- The part of a program in which a variable is visible is called its _scope_. -~~~ python +```python pressure = 103.9 def adjust(t): temperature = t * 1.43 / pressure return temperature -~~~ - - -* `pressure` is a *global variable*. - * Defined outside any particular function. 
- * Visible everywhere. -* `t` and `temperature` are *local variables* in `adjust`. - * Defined in the function. - * Not visible in the main program. - * Remember: a function parameter is a variable - that is automatically assigned a value when the function is called. - -~~~ python +``` + +- `pressure` is a _global variable_. + - Defined outside any particular function. + - Visible everywhere. +- `t` and `temperature` are _local variables_ in `adjust`. + - Defined in the function. + - Not visible in the main program. + - Remember: a function parameter is a variable + that is automatically assigned a value when the function is called. + +```python nolint print('adjusted:', adjust(0.9)) print('temperature after call:', temperature) -~~~ +``` -~~~ +```text adjusted: 0.01238691049085659 -~~~ +``` -~~~ +```text Traceback (most recent call last): File "/Users/swcarpentry/foo.py", line 8, in print('temperature after call:', temperature) NameError: name 'temperature' is not defined -~~~ +``` ::::challenge{id="local_and_global_variable_use" title="Local and Global Variable Use"} Trace the values of all variables in this program as it is executed. (Use '---' as the value of variables before and after they exist.) -~~~ python +```python limit = 100 def clip(value): @@ -68,7 +65,7 @@ def clip(value): value = -22.5 print(clip(value)) -~~~ +``` :::: @@ -77,13 +74,13 @@ print(clip(value)) Read the traceback below, and identify the following: 1. How many levels does the traceback have? -2. What is the file name where the error occurred? -3. What is the function name where the error occurred? -4. On which line number in this function did the error occur? -5. What is the type of error? -6. What is the error message? +1. What is the file name where the error occurred? +1. What is the function name where the error occurred? +1. On which line number in this function did the error occur? +1. What is the type of error? +1. What is the error message? 
-~~~ +```text --------------------------------------------------------------------------- KeyError Traceback (most recent call last) () @@ -103,16 +100,18 @@ KeyError Traceback (most recent call last) 13 KeyError: 'Friday' -~~~ +``` :::solution + 1. Three levels. -2. `errors_02.py` -3. `print_message` -4. Line 11 -5. `KeyError`. These errors occur when we are trying to look up a key that does not exist (usually in a data -structure such as a dictionary). We can find more information about the `KeyError` and other built-in exceptions -in the [Python docs](https://docs.python.org/3/library/exceptions.html#KeyError). -6. `KeyError: 'Friday'` +1. `errors_02.py` +1. `print_message` +1. Line 11 +1. `KeyError`. These errors occur when we are trying to look up a key that does not exist (usually in a data + structure such as a dictionary). We can find more information about the `KeyError` and other built-in exceptions + in the [Python docs](https://docs.python.org/3/library/exceptions.html#KeyError). +1. 
`KeyError: 'Friday'` + ::: :::: diff --git a/introductory_courses/python/15_snake_game.md b/introductory_courses/python/15_snake_game.md index c4dfac79..c3b4f42d 100644 --- a/introductory_courses/python/15_snake_game.md +++ b/introductory_courses/python/15_snake_game.md @@ -1,20 +1,18 @@ --- name: "Project: Snake Game" -dependsOn: [ - introductory_courses.python.14_variable_scope -] +dependsOn: [introductory_courses.python.14_variable_scope] tags: [python] -attribution: - - citation: > - "Programming with Python" course by the Carpentries - url: https://swcarpentry.github.io/python-novice-inflammation/ - image: https://carpentries.org/assets/img/TheCarpentries.svg - license: CC-BY-4.0 +attribution: + - citation: > + "Programming with Python" course by the Carpentries + url: https://swcarpentry.github.io/python-novice-inflammation/ + image: https://carpentries.org/assets/img/TheCarpentries.svg + license: CC-BY-4.0 --- ## Snake -[Snake](https://en.wikipedia.org/wiki/Snake_(video_game_genre)) is a common name for a game in which a player maneuvers a line which grows in length, where the main obstacle is the line itself. +[Snake](https://en.wikipedia.org/wiki/Snake_%28video_game_genre%29) is a common name for a game in which a player maneuvers a line which grows in length, where the main obstacle is the line itself. Although variants existed before, it became popular in the late 1990s when it was preloaded onto Nokia phones. ## The project @@ -33,4 +31,3 @@ The instructions for the game are self-contained at the following link: 3. Follow the five walkthrough vignettes to gradually build up your game of Snake :::: - diff --git a/introductory_courses/python/index.md b/introductory_courses/python/index.md index 585846b8..9384585a 100644 --- a/introductory_courses/python/index.md +++ b/introductory_courses/python/index.md @@ -32,12 +32,12 @@ summary: | We will also learn how to use Python for data analysis and visualization.
--- -**Python: A High-Level Overview** +## Python: A High-Level Overview Python is a high-level, interpreted programming language that was developed in the late 1980s by Guido van Rossum. With an emphasis on code readability and simplicity, Python is designed to be easy to understand and write, making it a popular choice for beginners and experts alike. -**Key Features of Python** +## Key Features of Python 1. **Simplicity:** Python has a clean, easy-to-understand syntax that emphasizes readability and reduces the cost of program maintenance. Python code is designed to be readable and straightforward, making it an ideal language for novice programmers. @@ -49,7 +49,7 @@ With an emphasis on code readability and simplicity, Python is designed to be ea 5. **Extensive Libraries:** Python's standard library is large and versatile, offering a range of modules and functions for tasks like regular expressions, documentation-generation, unit-testing, threading, databases, web browsers, CGI, FTP, email, XML, XML-RPC, HTML, WAV files, cryptography, GUI, and more. -**Python Use Cases** +## Python Use Cases Python is used in a wide variety of fields: @@ -67,7 +67,7 @@ Python is used in a wide variety of fields: 7. **Cybersecurity:** Python is often used for creating intrusion detection systems, performing malware analysis, and testing vulnerabilities. -**Python's Philosophy** +## Python's Philosophy The philosophy of Python is outlined in a document called "The Zen of Python", which includes guiding principles like "Readability counts", "Explicit is better than implicit", and "Simple is better than complex". These principles help to guide Python's design and use. 
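Since the section cites "The Zen of Python", it may be worth noting that the text is available directly from the interpreter. The sketch below uses the standard-library Easter-egg module `this`; the `codecs` decoding step is only a convenience for inspecting the text programmatically, since the module stores it ROT13-encoded.

```python
# "The Zen of Python" ships with the standard library as an Easter egg:
# importing the `this` module prints the full list of principles.
import codecs
import this

# the module stores the text ROT13-encoded in the attribute `this.s`
zen = codecs.decode(this.s, "rot13")
print("Readability counts." in zen)  # → True
```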
diff --git a/libraries/pybamm-developer/01_ode.md b/libraries/pybamm-developer/01_ode.md index f6086741..9455e4d2 100644 --- a/libraries/pybamm-developer/01_ode.md +++ b/libraries/pybamm-developer/01_ode.md @@ -1,15 +1,13 @@ --- name: ODE models in PyBaMM -dependsOn: [ - libraries.pybamm -] +dependsOn: [libraries.pybamm] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # A simple ODE battery model @@ -32,7 +30,7 @@ where $x_n$ and $x_p$ are the dimensionless stochiometries of the negative and p The reservoir model has a number of output variables, which are either the state variables $x_n$ and $x_p$ that are explicitly solved for, or derived variables -such as the voltage $V(t)$. +such as the voltage $V(t)$. In PyBaMM a state variable can be defined using the [`pybamm.Variable`](https://docs.pybamm.org/en/stable/source/api/expression_tree/variable.html#variable) @@ -40,6 +38,7 @@ class. For example, if you wanted to define a state variable with name "x", you would write ```python +import pybamm x = pybamm.Variable("x") ``` @@ -49,10 +48,12 @@ Define the state variables for the reservoir model, including the stochiometries $x_n$ and $x_p$. :::solution + ```python x_n = pybamm.Variable("Negative electrode stochiometry") x_p = pybamm.Variable("Positive electrode stochiometry") ``` + ::: :::: @@ -62,7 +63,7 @@ later once we have defined the expressions for the ODEs. ## PyBaMM parameters The reservoir model has a number of parameters that need to be defined. 
In PyBaMM a parameter can be defined using the [`pybamm.Parameter`](https://docs.pybamm.org/en/stable/source/api/expression_tree/parameter.html#parameter) class. For example, if you wanted to define a parameter with name "a", you would write - + ```python a = pybamm.Parameter("a") ``` @@ -74,6 +75,7 @@ You can also define a parameter that is defined as a function using the [`pybamm ```python P = pybamm.FunctionParameter("Your parameter name here", {"Time [s]": pybamm.t}) ``` + where the first argument is a string with the name of your parameter (which is used when passing the parameter values) and the second argument is a dictionary of `name: symbol` of all the variables on which the function parameter depends. In particular, `pybamm.t` is a special variable that represents time. ::::challenge{id="ode-parameters" title="Define the parameters for the reservoir model"} @@ -81,6 +83,7 @@ where the first argument is a string with the name of your parameter (which is u Define the parameters for the reservoir model, including the current $I(t)$, the OCV functions $U_p(x_p)$ and $U_n(x_n)$, the capacities $Q_n$ and $Q_p$, and the resistance $R$. :::solution + ```python i = pybamm.FunctionParameter("Current function [A]", {"Time [s]": pybamm.t}) x_n_0 = pybamm.Parameter("Initial negative electrode stochiometry") @@ -91,6 +94,7 @@ Q_n = pybamm.Parameter("Negative electrode capacity [A.h]") Q_p = pybamm.Parameter("Positive electrode capacity [A.h]") R = pybamm.Parameter("Electrode resistance [Ohm]") ``` + ::: :::: @@ -109,6 +113,7 @@ model = pybamm.BaseModel("my model") By construction, PyBaMM expects the equations to be written in a very specific way, with time derivatives playing a central role: ODEs must be written in explicit form, that is $\frac{\mathrm{d} u}{\mathrm{d} t} = f(u, t)$. Then, we only need to define the $f(u,t)$ term (called RHS for right hand side) for a given variable $u$, as the left hand side will be assumed to be $\frac{\mathrm{d} u}{\mathrm{d} t}$.
PyBaMM can also have equations with no time derivatives, which are called algebraic equations. Going back to the PyBaMM model, the class has four useful attributes for defining a model, which are: + 1. `rhs` - a python dictionary of the right-hand-side equations with the form `variable: rhs`. 2. `algebraic` - a python dictionary of the algebraic equations (we won't need this for our ODE model). Should be passed as a dictionary of the form `variable: algebraic`. Note that the variable is only for indexing purposes, and this imposes `algebraic = 0` not `variable = algebraic`. 3. `initial_conditions` - a python dictionary of the initial conditions of the form `variable: ic`, which imposes `variable = ic` at the initial time. @@ -119,6 +124,7 @@ As an example, lets define a simple model for exponential decay with a single st $$\frac{\mathrm d x}{\mathrm d t} = - a x, \qquad x(0) = 1.$$ We can write this as a PyBaMM model by writing: + ```python x = pybamm.Variable("x") a = pybamm.Parameter("a") @@ -135,6 +141,7 @@ Now we have all the pieces we need to define the reservoir model. Define the model using the parameters and variables you defined earlier. :::solution + ```python model = pybamm.BaseModel("reservoir model") model.rhs[x_n] = -i / Q_n @@ -142,7 +149,7 @@ model.initial_conditions[x_n] = x_n_0 model.rhs[x_p] = i / Q_p model.initial_conditions[x_p] = x_p_0 -model.variables["Voltage [V]"] = U_p - U_n - i * R +model.variables["Voltage [V]"] = U_p - U_n - i * R model.variables["Negative electrode stochiometry"] = x_n model.variables["Positive electrode stochiometry"] = x_p ``` @@ -153,7 +160,6 @@ dictionaries using Python. 
::: :::: - ## PyBaMM expressions It is worth pausing here and discussing the concept of an "expression" in @@ -189,7 +195,7 @@ You can also print the expression as a string using the `print` method: print(model.rhs[x_n]) ``` -``` +```text -Current function [A] / Negative electrode capacity [A.h] ``` @@ -198,13 +204,17 @@ mathematical equation, which can then be later on used by the PyBaMM solvers to solve the model equations over time. The variable `children` returns a list of the children nodes of a given parent node. For example, to access the `Negative electrode capacity` parameter we could type + ```python model.rhs[x_n].children[1] ``` + as it is the second child of the division node (remember python starts indexing at 0). The command can be used recursively to navigate across the expression tree. For example, if we want to access the time variable in the expression tree above, we can type + ```python model.rhs[x_n].children[0].children[0].children[0] ``` + This is extremely useful to debug the expression tree as it allows you to access the relevant nodes. ## PyBaMM events @@ -237,6 +247,7 @@ Define four events that ensure that the stochiometries $x_n$ and $x_p$ are between 0 and 1. The simulation should stop when either reaches 0 or 1.
:::solution + ```python model.events = [ pybamm.Event("Minimum negative stochiometry", x_n - 0), @@ -245,6 +256,7 @@ model.events = [ pybamm.Event("Maximum positive stochiometry", 1 - x_p), ] ``` + ::: :::: @@ -293,31 +305,32 @@ following values: - The OCV functions are the LGM50 OCP from the Chen2020 model, which are given by the functions: ```python +import numpy as np def graphite_LGM50_ocp_Chen2020(sto): - u_eq = ( - 1.9793 * np.exp(-39.3631 * sto) - + 0.2482 - - 0.0909 * np.tanh(29.8538 * (sto - 0.1234)) - - 0.04478 * np.tanh(14.9159 * (sto - 0.2769)) - - 0.0205 * np.tanh(30.4444 * (sto - 0.6103)) - ) + u_eq = ( + 1.9793 * np.exp(-39.3631 * sto) + + 0.2482 + - 0.0909 * np.tanh(29.8538 * (sto - 0.1234)) + - 0.04478 * np.tanh(14.9159 * (sto - 0.2769)) + - 0.0205 * np.tanh(30.4444 * (sto - 0.6103)) + ) - return u_eq + return u_eq def nmc_LGM50_ocp_Chen2020(sto): - u_eq = ( - -0.8090 * sto - + 4.4875 - - 0.0428 * np.tanh(18.5138 * (sto - 0.5542)) - - 17.7326 * np.tanh(15.7890 * (sto - 0.3117)) - + 17.5842 * np.tanh(15.9308 * (sto - 0.3120)) - ) - - return u_eq + u_eq = ( + -0.8090 * sto + + 4.4875 + - 0.0428 * np.tanh(18.5138 * (sto - 0.5542)) + - 17.7326 * np.tanh(15.7890 * (sto - 0.3117)) + + 17.5842 * np.tanh(15.9308 * (sto - 0.3120)) + ) + + return u_eq ``` - :::solution + ```python param = pybamm.ParameterValues({ "Current function [A]": lambda t: 1 + 0.5 * pybamm.sin(100*t), @@ -330,12 +343,12 @@ param = pybamm.ParameterValues({ "Negative electrode OCV": graphite_LGM50_ocp_Chen2020, }) ``` + ::: :::: ## Solving the model - Now that we have defined the reservoir model, we can solve it using the PyBaMM simulation class and plot the results like so: ```python @@ -351,4 +364,3 @@ plot the results. Vary the paramters and see how the solution changes to assure yourself that the model is working as expected. 
:::: - diff --git a/libraries/pybamm-developer/02_pde.md b/libraries/pybamm-developer/02_pde.md index 65c428c8..6cfdae18 100644 --- a/libraries/pybamm-developer/02_pde.md +++ b/libraries/pybamm-developer/02_pde.md @@ -1,15 +1,13 @@ --- name: PDE models in PyBaMM -dependsOn: [ - libraries.pybamm-developer.01_ode -] +dependsOn: [libraries.pybamm-developer.01_ode] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # Creating a simple PDE model @@ -28,7 +26,6 @@ $$ \left.\frac{\partial c}{\partial r}\right\vert_{r=0} = 0, \quad \left.\frac{\partial c}{\partial r}\right\vert_{r=1} = 2, \quad \left.c\right\vert_{t=0} = 1. $$ - ## Setting up the model As in the previous example, we start with a @@ -41,6 +38,7 @@ This argument is a string and we will later on define the geometry of the domain. ```python +import pybamm model = pybamm.BaseModel() c = pybamm.Variable("Concentration", domain="negative particle") @@ -88,6 +86,7 @@ model.variables = {"Concentration": c, "Flux": N} ``` ## Using the model + Now that the model is completely defined, all that remains is to discretise and solve. Since this model is a PDE we need to define the geometry on which it will be solved, and choose how to mesh the geometry and discretise in space.
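To make the upcoming discretisation steps less abstract, here is a rough, hand-rolled sketch of the "discretise then solve" idea for a 1-D Cartesian analogue of the problem above (zero flux on the left, fixed flux of 2 on the right, `c = 1` initially, unit diffusivity). It is a toy method-of-lines scheme with explicit Euler stepping, not what `pybamm.Mesh` and `pybamm.Discretisation` actually produce.

```python
# Toy finite-difference discretisation of c_t = c_xx on [0, 1]:
# zero flux at x=0, fixed flux of 2 at x=1, c(x, 0) = 1 everywhere.
n = 20
dx = 1.0 / n
dt = 0.4 * dx * dx          # explicit Euler, stable since dt/dx^2 <= 0.5
c = [1.0] * n               # initial condition

for _ in range(2000):       # integrate up to t = 2000 * dt = 2.0
    new = c[:]
    for i in range(n):
        left = c[i - 1] if i > 0 else c[i]                 # mirror ghost: zero flux
        right = c[i + 1] if i < n - 1 else c[i] + 2 * dx   # ghost enforcing dc/dx = 2
        new[i] = c[i] + dt * (left - 2 * c[i] + right) / (dx * dx)
    c = new

# the influx at the right boundary raises the mean concentration from 1.0
print(round(sum(c) / n, 2))  # → 5.0
```

The right-boundary (surface) value grows fastest, mirroring how a particle-surface concentration responds to an applied flux.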
### Defining a geometry @@ -101,6 +100,7 @@ r = pybamm.SpatialVariable( "r", domain=["negative particle"], coord_sys="spherical polar" ) ``` + As with the concentration variable, we give it the spatial variable an informative name `"r"` but the variable is represented by `r` in the code (in this case it is a bit more confusing as the names are more similar). The domain needs to match the domain we have defined in our concentration variable, and the coordinate system can be chosen from `"cartesian"`, `"cylindrical polar"` and `"spherical polar"`. The geometry on which we wish to solve the model is defined using a nested @@ -129,7 +129,7 @@ mesh = pybamm.Mesh(geometry, submesh_types, var_pts) Here we have used the [`pybamm.Uniform1DSubMesh`](https://docs.pybamm.org/en/stable/source/api/meshes/one_dimensional_submeshes.html#pybamm.Uniform1DSubMesh) class to create a uniform mesh. This class does not require any parameters, and so we can pass it directly to the `submesh_types` dictionary. However, many other submesh types can take -additional parameters. Example of meshes that do require parameters include the +additional parameters. Example of meshes that do require parameters include the [`pybamm.Exponential1DSubMesh`](https://docs.pybamm.org/en/stable/source/api/meshes/one_dimensional_submeshes.html#pybamm.Exponential1DSubMesh) which clusters points close to one or both boundaries using an exponential rule. It takes a parameter which sets how closely the points are clustered together, and also lets the users select the @@ -137,7 +137,7 @@ side on which more points should be clustered. For example, to create a mesh with more nodes clustered to the right (i.e. 
the surface in the particle problem), using a stretch factor of 2, we pass an instance of the exponential submesh class and a dictionary of parameters into the `MeshGenerator` class as -follows: +follows: ```python exp_mesh = pybamm.MeshGenerator(pybamm.Exponential1DSubMesh, submesh_params={"side": "right", "stretch": 2}) @@ -157,15 +157,19 @@ into matrix-vector multiplications. ```python spatial_methods = {"negative particle": pybamm.FiniteVolume()} disc = pybamm.Discretisation(mesh, spatial_methods) -disc.process_model(model); +disc.process_model(model) ``` -Now that the model has been discretised we are ready to solve. +Now that the model has been discretised we are ready to solve. ## Solving the model + As before, we choose a solver and times at which we want the solution returned. We then solve, extract the variables we are interested in, and plot the result. ```python +import matplotlib.pyplot as plt +import numpy as np +import pybamm # solve solver = pybamm.ScipySolver() t = np.linspace(0, 1, 100) @@ -205,17 +209,17 @@ $$ where $c$ is the concentration, $r$ the radial coordinate, $t$ time, $R$ the particle radius, $D$ the diffusion coefficient, $j$ the interfacial current -density, $F$ Faraday's constant, and $c_0$ the initial concentration. +density, $F$ Faraday's constant, and $c_0$ the initial concentration. 
We use the following parameters: -| Symbol | Units | Value | -|:-------|:-------------------|:-----------------------------------------------| -| $R$ | m | $10 \times 10^{-6}$ | -| $D$ | m${^2}$ s$^{-1}$ | $3.9 \times 10^{-14}$ | -| $j$ | A m$^{-2}$ | $1.4$ | -| $F$ | C mol$^{-1}$ | $96485$ | -| $c_0$ | mol m$^{-3}$ | $2.5 \times 10^{4}$ | +| Symbol | Units | Value | +| :----- | :--------------- | :-------------------- | +| $R$ | m | $10 \times 10^{-6}$ | +| $D$ | m${^2}$ s$^{-1}$ | $3.9 \times 10^{-14}$ | +| $j$ | A m$^{-2}$ | $1.4$ | +| $F$ | C mol$^{-1}$ | $96485$ | +| $c_0$ | mol m$^{-3}$ | $2.5 \times 10^{4}$ | Create a model for this problem, discretise it and solve it. Use a uniform mesh with 20 points, and discretise the domain using the Finite Volume Method. Solve @@ -244,14 +248,14 @@ c = pybamm.Variable("Concentration [mol.m-3]", domain="negative particle") # governing equations N = -D * pybamm.grad(c) # flux dcdt = -pybamm.div(N) -model.rhs = {c: dcdt} +model.rhs = {c: dcdt} -# boundary conditions +# boundary conditions lbc = pybamm.Scalar(0) rbc = -j / F / D model.boundary_conditions = {c: {"left": (lbc, "Neumann"), "right": (rbc, "Neumann")}} -# initial conditions +# initial conditions model.initial_conditions = {c: c0} model.variables = { @@ -311,6 +315,6 @@ ax2.legend() plt.tight_layout() plt.show() ``` + ::: :::: - diff --git a/libraries/pybamm-developer/03_spm.md b/libraries/pybamm-developer/03_spm.md index 442a8921..0cfa7020 100644 --- a/libraries/pybamm-developer/03_spm.md +++ b/libraries/pybamm-developer/03_spm.md @@ -1,22 +1,20 @@ --- name: Single Particle Model -dependsOn: [ - libraries.pybamm-developer.02_pde -] +dependsOn: [libraries.pybamm-developer.02_pde] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM 
documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # The Single Particle Model Now we will extend our PDE model to the full single particle model. The single particle model is a system of PDEs that describes the behaviour of a lithium-ion -battery electrode particle. +battery electrode particle. ## The Single Particle Model state equations @@ -50,7 +48,6 @@ j_p &= \frac{-I}{a_p \delta_p F \mathcal{A}}, \end{align*} $$ - where $a_i = 3 \epsilon_i / R_i$ is the specific surface area of the electrode, $\epsilon_i$ is the volume fraction of active material, $\delta_i$ is the thickness of the electrode, $F$ is the Faraday constant, and @@ -63,6 +60,7 @@ other parameters we have seen so far, it is a function of time. We can define this using `pybamm.FunctionParameter`: ```python +import pybamm I = pybamm.FunctionParameter("Current function [A]", {"Time [s]": pybamm.t}) ``` @@ -89,7 +87,6 @@ PyBaMM, we can specify the domain using the `domain` keyword argument: c_n = pybamm.Variable("Negative particle concentration [mol.m-3]", domain="negative particle") ``` - ::::challenge{id="spm-state-equations" title="SPM governing equations"} Copy your PDE model from the previous challenge to a new file, and modify it to @@ -100,6 +97,7 @@ of new parameters. Define the applied current $I$ as a input parameter that is a function of time using `pybamm.FunctionParameter`. :::solution + ```python import pybamm @@ -137,6 +135,7 @@ model.boundary_conditions = {c_i[i]: {"left": (lbc, "Neumann"), "right": (rbc[i] # initial conditions model.initial_conditions = {c_i[i]: c0_i[i] for i in [0, 1]} ``` + ::: :::: @@ -154,8 +153,6 @@ where $U_i$ is the open circuit potential (OCP) of the electrode, $x_i^s = c_i(r=R_i) / c_i^{max}$ is the surface stoichiometry, and $\eta_i$ is the overpotential. 
- - Assuming Butler-Volmer kinetics and $\alpha_i = 0.5$, the overpotential is given by: $$ @@ -183,7 +180,7 @@ $c_i^{max}$ is the maximum concentration of lithium ions in the electrode, and is a parameter of the model. However, $c_i(r=R_i)$ is the concentration of lithium ions at the surface of the electrode particle. How can we express this in PyBaMM, given that we only have the concentration $c_i$ defined on the whole -domain? +domain? To get the surface concentration, we can use the `pybamm.boundary_value` or `pybamm.surf` functions. The `pybamm.boundary_value` function returns the value @@ -206,10 +203,10 @@ c_n_surf = pybamm.boundary_value(c_n, "right") The OCPs $U_i$ are functions of the surface stoichiometries $x_i^s$, and we can define them using `pybamm.FunctionParameter` in a similar way to the applied current $I$. For example, to define the OCP of the positive electrode as a -function of the surface stoichiometry $x_p^s$: +function of the surface stoichiometry $x_p^s$: ```python -U_p = pybamm.FunctionParameter("Positive electrode OCP [V]", {"stoichiometry": x_p_s}) +U_p = pybamm.FunctionParameter("Positive electrode OCP [V]", {"stoichiometry": x_p_s}) ``` ### PyBaMM's built-in functions @@ -239,10 +236,12 @@ $x_p^s$. You can use `pybamm.FunctionParameter` to define the OCPs as functions of the surface stoichiometries.
Define the following output variables for the model + - Terminal voltage $V$ - Surface concentration in negative particle $c_n^s$ :::solution + ```python # call universal constants (PyBaMM has them built in) R = pybamm.constants.R @@ -266,10 +265,11 @@ U_i = [pybamm.FunctionParameter(f"{e.capitalize()} electrode OCP [V]", {"stoichi [U_n_plus_eta, U_p_plus_eta] = [U_i[i] + eta_i[i] for i in [0, 1]] V = U_p_plus_eta - U_n_plus_eta model.variables = { - "Voltage [V]": V, - "Negative particle surface concentration [mol.m-3]": c_i_s[0], + "Voltage [V]": V, + "Negative particle surface concentration [mol.m-3]": c_i_s[0], } ``` + ::: :::: @@ -285,7 +285,7 @@ spatial domains. Discretise and solve the SPM model using the same methods as in the previous section. The following parameter values object copies the parameters from the PyBaMM Chen2020 model, feel free to use this to define the parameters for the SPM model. - + ```python param = pybamm.ParameterValues("Chen2020") # PyBaMM parameters provide the exchange current density directly, rather than the reaction rate, so define here @@ -296,6 +296,7 @@ param.update({ ``` :::solution + ```python import numpy as np @@ -345,10 +346,6 @@ ax2.set_ylabel("Negative particle surface concentration [mol.m-3]") plt.tight_layout() plt.show() ``` + ::: :::: - - - - - diff --git a/libraries/pybamm-developer/04_spm_class.md b/libraries/pybamm-developer/04_spm_class.md index e7c71ffa..88b83327 100644 --- a/libraries/pybamm-developer/04_spm_class.md +++ b/libraries/pybamm-developer/04_spm_class.md @@ -1,37 +1,41 @@ --- name: Single Particle Model (now as a class) -dependsOn: [ - libraries.pybamm-developer.03_spm -] +dependsOn: [libraries.pybamm-developer.03_spm] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM 
documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # The Single Particle Model (now as a class) -In the previous lesson we built the Single Particle Model. We implemented it in a script/notebook, which is a good way to build the model for the first time, but not so much if we want to use the model often or share it with other people. In these cases, it is better to wrap the model into a class. + +In the previous lesson we built the Single Particle Model. We implemented it in a script/notebook, which is a good way to build the model for the first time, but not so much if we want to use the model often or share it with other people. In these cases, it is better to wrap the model into a class. ## What is a class? + A [python class](https://www.w3schools.com/python/python_classes.asp) is a "blueprint" to create an object. A python object is a special variable that contain data (variables) and methods (functions) that can manipulate this data. In our case, the object will be a model, which we can then solve. The goal of this lesson will be to convert the SPM model we wrote in the previous lesson into a class. If you are familiar with classes feel free to start from scratch, otherwise please use [this template](./spm_template.py) as a starting point. Let's first explain the first few lines of the template. After the header and importing PyBaMM we see the declaration of the class: -```python + +```python nolint class WorkshopSPM(pybamm.BaseModel) ``` + This tells us that we are creating a class called `WorkshopSPM` (but you could call it whatever you want), and that this new class inherits from `pybamm.BaseModel`. This simply means that this new class will have all the variables and functions from the `BaseModel` class, so we can build on it. Next, we see that the class has an `__init__` method. 
All classes have an `__init__` method, which gets called when an object is created from a class. In this case, the `__init__` method takes one optional variable (ignore `self`) called `name`, which is the name of the model (this is used when plotting, amongst others). The definition of the model needs to be included in this `__init__` method. The next line -```python + +```python nolint super().__init__(name=name) ``` + looks a bit obscure, but basically calls the `__init__` method of the class we are inheriting from (i.e. `BaseModel`). This does some useful initialisation of the class, but we do not need to delve in the details. ::::challenge{id="spm-class" title="Wrap SPM as a class"} @@ -45,9 +49,11 @@ The solution should be included in the directory `pybamm/models/full_battery_mod ```python from .spm_workshop import WorkshopSPM ``` + ::: Next we need to add the equations in the `__init__` method of our class. There are multiple ways things can be defined, but it helps to follow this structure: + 1. Variables 2. Parameters 3. Governing equations (in this case the particle models) @@ -56,6 +62,7 @@ Next we need to add the equations in the `__init__` method of our class. There a We have broken this into various challenges to guide you step-by-step. ### Variables + First we need to define the variables of our model. These are any variables we need to solve for (e.g. concentrations), but not any derived quantities we will compute from them (e.g. voltage), as these will be defined in the output variables section. If you want to integrate your model into PyBaMM you should stick to the PyBaMM name convention (see [Tutorial 3](https://docs.pybamm.org/en/latest/source/examples/notebooks/getting_started/tutorial-3-basic-plotting.html) for a list of all variables in the DFN model). 
@@ -64,19 +71,22 @@ If you want to integrate your model into PyBaMM you should stick to the PyBaMM n

In this case we only need to define two variables: the concentrations in the positive and negative particle, respectively.

```python
+import pybamm
electrodes = ["negative", "positive"]
c_i = [pybamm.Variable(f"{e.capitalize()} particle concentration [mol.m-3]", domain=f"{e} particle") for e in electrodes]
```

-:::
+:::

### Parameters
+
Next we need to define the parameters used in our model. Similarly to the variables, we can use any name convention we want, but it is useful to stick to the one PyBaMM uses so you can use the built-in parameter sets. Check [Tutorial 4](https://docs.pybamm.org/en/latest/source/examples/notebooks/getting_started/tutorial-4-setting-parameter-values.html) to see the standard names for the parameters.

:::solution
+
```python
# define parameters
-I = pybamm.FunctionParameter("Current function [A]", {"Time [s]": pybamm.t})
+I = pybamm.FunctionParameter("Current function [A]", {"Time [s]": pybamm.t})
D_i = [pybamm.Parameter(f"{e.capitalize()} electrode diffusivity [m2.s-1]") for e in electrodes]
R_i = [pybamm.Parameter(f"{e.capitalize()} particle radius [m]") for e in electrodes]
c0_i = [pybamm.Parameter(f"Initial concentration in {e} electrode [mol.m-3]") for e in electrodes]
@@ -97,13 +107,16 @@ R = pybamm.constants.R
a_i = [3 * epsilon_i[i] / R_i[i] for i in [0, 1]]
j_i = [I / a_i[0] / delta_i[0] / F / A, -I / a_i[1] / delta_i[1] / F / A]
```
+
:::

### Particle model
+
Once we have defined the variables and the parameters we can write the governing equations. In this case, we only need to write the model for each particle.
:::solution -```python + +```python nolint # governing equations dcdt_i = [pybamm.div(D_i[i] * pybamm.grad(c_i[i])) for i in [0, 1]] self.rhs = {c_i[i]: dcdt_i[i] for i in [0, 1]} @@ -116,13 +129,16 @@ self.boundary_conditions = {c_i[i]: {"left": (lbc, "Neumann"), "right": (rbc[i], # initial conditions self.initial_conditions = {c_i[i]: c0_i[i] for i in [0, 1]} ``` + ::: ### Output variables + Finally, we can define the output variables. Here we want to define all the variables that we might want to plot, save or analyse after solving the model. :::solution -```python + +```python nolint # define intermediate variables and OCP function parameters c_i_s = [pybamm.surf(c_i[i]) for i in [0, 1]] x_i_s = [c_i_s[i] / c_i_max[i] for i in [0, 1]] @@ -134,22 +150,24 @@ eta_i = [2 * R * T / F * pybamm.arcsinh(j_i[i] * F / (2 * i_0_i[i])) for i in [0 [U_n_plus_eta, U_p_plus_eta] = [pybamm.surf(U_i[i]) + eta_i[i] for i in [0, 1]] V = U_p_plus_eta - U_n_plus_eta self.variables = { - "Time [s]": pybamm.t, - "Voltage [V]": V, - "Current [A]": I, - "Negative particle concentration [mol.m-3]": c_i[0], - "Positive particle concentration [mol.m-3]": c_i[1], - "Negative particle surface concentration [mol.m-3]": c_i_s[0], - "Positive particle surface concentration [mol.m-3]": c_i_s[1], + "Time [s]": pybamm.t, + "Voltage [V]": V, + "Current [A]": I, + "Negative particle concentration [mol.m-3]": c_i[0], + "Positive particle concentration [mol.m-3]": c_i[1], + "Negative particle surface concentration [mol.m-3]": c_i_s[0], + "Positive particle surface concentration [mol.m-3]": c_i_s[1], } ``` -::: +::: ### Default attributes (optional) + What we have done so far is enough to run the model, but there are some optional extra steps we can take that will make our lives easier going forward. When we process and solve a model, we need to define several additional objects (e.g. geometry, solver...). We can define what should be the default ones for our model. 
This means that when we process and solve the model we do not need to define them again, but can simply call the default ones (e.g. `default_geometry`, `default_solver`...) or, even better, use a PyBaMM simulation that will use the default ones automatically.

:::solution
+
```python
@property
def default_geometry(self):
@@ -168,7 +186,7 @@ def default_submesh_types(self):

@property
def default_var_pts(self):
    domains = ["negative particle", "positive particle"]
-    r_i = [pybamm.SpatialVariable("r", domain=[d], coord_sys="spherical polar") for d in domains]
+    r_i = [pybamm.SpatialVariable("r", domain=[d], coord_sys="spherical polar") for d in domains]
    return {r: 20 for r in r_i}

@property
@@ -199,6 +217,7 @@ Note that we could significantly simplify the code by defining `self.domains`, `
::::

## Adding the model to PyBaMM
+
We have now written the model into a class, so the next milestone will be to open a PR to add it to PyBaMM. Before we do that, though, there are a few other things we need to do.

::::challenge{id="spm-example" title="Write an example for the model"}
@@ -213,35 +232,44 @@ Tests are bits of code that check that the rest of the code performs as expected

The goal for this task is to write some tests for your model (at least unit tests, ideally unit and integration tests as well). You can draw some inspiration from the following existing tests for models ([unit](https://github.com/pybamm-team/PyBaMM/blob/develop/tests/unit/test_models/test_full_battery_models/test_lithium_ion/test_basic_models.py) and [integration](https://github.com/pybamm-team/PyBaMM/blob/develop/tests/integration/test_models/test_full_battery_models/test_lithium_ion/test_compare_basic_models.py)).

To run the tests locally you can simply type
+
```bash
nox -s unit
```
-for the unit tests and
+
+for the unit tests and
+
```bash
nox -s tests
```
+
to run both unit and integration tests.

Once your tests run locally you can open a pull request (PR) to the PyBaMM main repository.
This will run a whole suite of tests on the cloud (they will take a few minutes to run) that might unearth some more issues with the code. It will also generate a coverage report that will tell you if any parts of your code are not tested.

## Troubleshooting
+
Writing the equations is not hard; what is hard is getting the model to actually work. Here is a list of the most common errors when writing PyBaMM models, with some tips on how to fix them. Remember that using the debugging mode in your code editor is also very useful.

### Domain error
+
```bash
pybamm.expression_tree.exceptions.DomainError: children must have same or empty domains, not ['positive particle'] and ['negative particle']
```

This means that at some point in your expression tree an operator takes two nodes (i.e. children) that have incompatible domains. The error message will tell you which line in your model triggered the error, so a good approach is to set a breakpoint there and analyse the domains of the various symbols in the expression by running
+
```python
-symbol.domains
+symbol.domains  # here `symbol` is the expression tree node you are inspecting
```

Sometimes the issue is further down the expression tree; remember you can visualise the expression tree (see [ODE models in PyBaMM](./01_ode.md)). To access the list of children of a node, you can call the `children` command. You can then access the relevant element in the list and call the `children` command again to navigate down the tree.

### Missing parameters
+
```bash
KeyError: "'Applied current [A]' not found. Best matches are ['Current function [A]']"
```

-This error means that the model requires a parameter that has not been passed when processing the model. PyBaMM will try to provide a guess of the best matches. In this case, the model has a parameter called `Applied current [A]` but the parameters when solving the model do not include it and the best match is `Current function [A]`.
This is a common error if the naming convention of your model does not match PyBaMM's. \ No newline at end of file +This error means that the model requires a parameter that has not been passed when processing the model. PyBaMM will try to provide a guess of the best matches. In this case, the model has a parameter called `Applied current [A]` but the parameters when solving the model do not include it and the best match is `Current function [A]`. This is a common error if the naming convention of your model does not match PyBaMM's. diff --git a/libraries/pybamm-developer/05_spm_acid.md b/libraries/pybamm-developer/05_spm_acid.md index 1194f6be..c68230a6 100644 --- a/libraries/pybamm-developer/05_spm_acid.md +++ b/libraries/pybamm-developer/05_spm_acid.md @@ -1,27 +1,29 @@ --- name: SPM with Acid Dissolution -dependsOn: [ - libraries.pybamm-developer.04_spm_class -] +dependsOn: [libraries.pybamm-developer.04_spm_class] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # The Single Particle Model with Acid Dissolution -The next step will be to extend the Single Particle Model to include acid dissolution. We will use the model introduced by [Kindermann et al (2017)](https://iopscience.iop.org/article/10.1149/2.0321712jes), in particular equations [8]-[10]. Rewritten to match our notation, we have + +The next step will be to extend the Single Particle Model to include acid dissolution. We will use the model introduced by [Kindermann et al (2017)](https://iopscience.iop.org/article/10.1149/2.0321712jes), in particular equations [8]-[10]. 
Rewritten to match our notation, we have $$ \frac{\mathrm{d} \epsilon_p}{\mathrm{d} t} = - \frac{i_{0,\mathrm{diss}} \exp \left( \frac{F \eta_\mathrm{diss}}{R T} \right)}{c_p^{max} \delta_p F} $$ + with + $$ \eta_\mathrm{diss} = \phi_p - U_\mathrm{diss}, $$ + where $i_{0,\mathrm{diss}}$ is the dissolution exchange current density and $U_\mathrm{diss}$ is the dissolution open-circuit potential. The positive electrode potential is given by $\phi_p = U_p + \eta_p$. ::::challenge{id="spm-acid" title="Write SPM with acid dissolution"} @@ -29,7 +31,8 @@ The challenge for this lesson is to write a new class for SPM with acid dissolut :::solution To extend the SPM model to account for acid dissolution we need to add some additional lines in various parts of the code. Below we include these lines with a hint on which part of the code they should be. -```python + +```python nolint # variables epsilon_s_p = pybamm.Variable("Positive electrode active material volume fraction") @@ -44,6 +47,7 @@ self.rhs[epsilon_s_p] = depsdt self.initial_conditions[epsilon_s_p] = epsilon_s_p_0 ``` + ::: :::: diff --git a/libraries/pybamm-developer/06_submodels.md b/libraries/pybamm-developer/06_submodels.md index f393f72a..9483b27e 100644 --- a/libraries/pybamm-developer/06_submodels.md +++ b/libraries/pybamm-developer/06_submodels.md @@ -1,19 +1,18 @@ --- name: Submodels -dependsOn: [ - libraries.pybamm-developer.05_spm_acid -] +dependsOn: [libraries.pybamm-developer.05_spm_acid] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- # Bonus: submodels -So far we have seen how to build a 
standalone model, but most battery models within PyBaMM are built up using submodels. Submodels are subsets of a model that describe a specfic type of physics (e.g. conservation of lithium in the particles, SEI growth...). Then, multiple submodels can be combined to create a model (e.g. SPM, DFN...). This is very powerful as it allows to mix and match physics and reduces the amount of code one needs to write, but comes at a cost: any new submodel needs to be written in a very specific way to fit into the submodel structure.
+
+So far we have seen how to build a standalone model, but most battery models within PyBaMM are built up using submodels. Submodels are subsets of a model that describe a specific type of physics (e.g. conservation of lithium in the particles, SEI growth...). Then, multiple submodels can be combined to create a model (e.g. SPM, DFN...). This is very powerful as it allows you to mix and match physics and reduces the amount of code one needs to write, but comes at a cost: any new submodel needs to be written in a very specific way to fit into the submodel structure.

Before heading to the task (implement the acid dissolution submodel), let's see how a submodel is structured. Submodels are structured in four steps, defining the:
@@ -25,26 +24,29 @@ Not all submodels will include all four steps. For example, if the submodel is an ODE, it will not have to define algebraic equations or boundary conditions. If a submodel is analytical (e.g. the potentials in SPM), then it does not have equations or initial/boundary conditions. Let's take the loss of active material (LAM) submodels (see `loss_active_material.py`) and explain each of the steps.

## Fundamental variables
-First we need to define the fundamental variables, which are the variables that we need to solve for in our submodel. This is done by the `get_fundamental_variables` method.
+
+First we need to define the fundamental variables, which are the variables that we need to solve for in our submodel. This is done by the `get_fundamental_variables` method.

For the LAM submodel, we observe that first there is a check to see if `self.x_average` is `True` or `False`. Submodels need to work both in Single Particle Models (which average quantities across the electrodes) and Doyle-Fuller-Newman type models (which allow for spatial variations across the electrodes). The `x_average` boolean tells the submodel if it should average quantities across the electrode or not, and it is used to determine whether the fundamental variable we use is the x-averaged version or not. Once the variable is defined, it is passed to the internal method `_get_standard_active_material_variables` which computes any additional related variables (e.g. the averaged version).

## Coupled variables
+
Next we need to define the coupled variables, which are any other variables involved in our model (but not fundamental). This could include, for example, fundamental variables from other submodels (that's why they are called coupled variables) or variables that can be derived from fundamental variables. All this is done by the `get_coupled_variables` method.

For the LAM submodel, we see that it has a series of if statements, based on the submodel's options, to determine which LAM mechanisms need to be included. Looking at the `"stress"` case, we see that the method imports some variables that would come from the mechanical submodels and then defines some new quantities, like the rate of change of active material `deps_solid_dt`. The other cases follow similarly. At the end, some additional processing is done by the internal method `_get_standard_active_material_change_variables`.

## Equations
+
Once the variables have been defined, we can write the equations.
Note that, as mentioned earlier, if our submodel is analytic, then we would skip this step as there is nothing to solve for, and all the relevant quantities would be computed in `get_coupled_variables`.

The definition of the equations is split between the differential (handled by `set_rhs`) and the algebraic (handled by `set_algebraic`) parts. For LAM, we only have differential equations, so the `set_algebraic` method is not defined. The `set_rhs` method updates the `self.rhs` variable with the relevant equations. Note that, in order to write the equations, we need access to the variables we previously defined, but they will be available through the `variables` dictionary.

## Initial and boundary conditions
+
Finally, we can define initial and boundary conditions if our submodel requires them. This is done via the `set_initial_conditions` and `set_boundary_conditions` methods, respectively. These methods update the `initial_conditions` and `boundary_conditions` dictionaries, and again may use some of the already defined variables through the `variables` dictionary.

There are some other more advanced features of the submodels that we have not covered here. Before implementing any submodels, it is useful to check any existing submodels that are similar and how they have been implemented, and to open an issue to discuss ideas with the other developers.

-
::::challenge{id="acid-submodel" title="Implement the acid dissolution submodel"}
The task for this lesson is to implement the acid dissolution as a submodel. For a refresher on submodels it might be useful to check the [submodels notebook](https://docs.pybamm.org/en/stable/source/examples/notebooks/models/using-submodels.html). The acid dissolution is a specific type of loss of active material submodel, so it needs to be implemented in `loss_active_material.py` as an additional case in the if statement.
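
The four-step submodel structure described above (fundamental variables, coupled variables, equations, initial/boundary conditions) can be sketched in plain Python. This is a schematic mock to show the shape of the lifecycle only: the method names mirror the ones discussed, but the class, its attributes, and the string stand-ins for symbols are all hypothetical, not PyBaMM's actual `BaseSubModel` API.

```python
class ToySubmodel:
    """Schematic four-step submodel lifecycle; a plain-Python mock, not PyBaMM's API."""

    def __init__(self, x_average=False):
        self.x_average = x_average
        self.rhs = {}
        self.initial_conditions = {}

    def get_fundamental_variables(self):
        # Step 1: the variables this submodel solves for; the x_average flag
        # picks the electrode-averaged name, as discussed for the LAM submodel
        if self.x_average:
            name = "X-averaged active material volume fraction"
        else:
            name = "Active material volume fraction"
        return {name: "eps"}

    def get_coupled_variables(self, variables):
        # Step 2: quantities derived from, or shared with, other submodels
        variables["Active material change rate"] = "deps_dt"
        return variables

    def set_rhs(self, variables):
        # Step 3: differential equations (an analytic submodel would skip this)
        self.rhs["eps"] = variables["Active material change rate"]

    def set_initial_conditions(self, variables):
        # Step 4: one initial condition per rhs variable
        self.initial_conditions["eps"] = 1.0


# The model-building loop would call the four steps in order:
submodel = ToySubmodel()
variables = submodel.get_fundamental_variables()
variables = submodel.get_coupled_variables(variables)
submodel.set_rhs(variables)
submodel.set_initial_conditions(variables)
```

Note how each step only communicates through the shared `variables` dictionary, which is what lets independently written submodels couple to each other.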
diff --git a/libraries/pybamm-developer/index.md b/libraries/pybamm-developer/index.md index 146f44c3..9e15b97d 100644 --- a/libraries/pybamm-developer/index.md +++ b/libraries/pybamm-developer/index.md @@ -1,35 +1,27 @@ --- id: pybamm_developer name: PyBaMM Model Development -dependsOn: [ - libraries.pybamm -] -files: [ - 01_ode.md, - 02_pde.md, - 03_spm.md, - 04_spm_class.md, - 05_spm_acid.md, - 06_submodels.md, -] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +dependsOn: [libraries.pybamm] +files: [01_ode.md, 02_pde.md, 03_spm.md, 04_spm_class.md, 05_spm_acid.md, 06_submodels.md] +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 summary: | - This course covers battery model development in PyBaMM (Python Battery Mathematical Modelling). - We will learn how to develop new models, submodels & parameter sets, and how to use them in PyBaMM. + This course covers battery model development in PyBaMM (Python Battery Mathematical Modelling). + We will learn how to develop new models, submodels & parameter sets, and how to use them in PyBaMM. --- -**PyBaMM: A High-Level Overview** +## PyBaMM: A High-Level Overview PyBaMM (Python Battery Mathematical Modelling) is an open-source battery simulation package written in Python. Our mission is to accelerate battery modelling research by providing open-source tools for multi-institutional, interdisciplinary collaboration. Broadly, PyBaMM consists of + 1. a framework for writing and solving systems of differential equations, 2. a library of battery models and parameters, and 3. specialized tools for simulating battery-specific experiments and visualizing the results. 
Together, these enable flexible model definitions and fast battery simulations, allowing users to explore the effect of different battery designs and modeling assumptions under a variety of operating scenarios. -This course assumes a good understanding of how to use PyBaMM for battery simulations. We will learn how to develop new models, submodels & parameter sets, and how to use them in PyBaMM. The slides that provide the introduction to this course are available [here](https://docs.google.com/presentation/d/1ObBeONPWmxDpbh7AwPRz1MOcjwoPMKX-/edit?usp=sharing&ouid=106913304926798358978&rtpof=true&sd=true). \ No newline at end of file +This course assumes a good understanding of how to use PyBaMM for battery simulations. We will learn how to develop new models, submodels & parameter sets, and how to use them in PyBaMM. The slides that provide the introduction to this course are available [here](https://docs.google.com/presentation/d/1ObBeONPWmxDpbh7AwPRz1MOcjwoPMKX-/edit?usp=sharing&ouid=106913304926798358978&rtpof=true&sd=true). diff --git a/libraries/pybamm/01_running_pybamm.md b/libraries/pybamm/01_running_pybamm.md index 930fc05a..d7c288af 100644 --- a/libraries/pybamm/01_running_pybamm.md +++ b/libraries/pybamm/01_running_pybamm.md @@ -1,14 +1,13 @@ --- name: Running PyBaMM -dependsOn: [ -] +dependsOn: [] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- Before starting this course, you should be comfortable with the basics of Python and have PyBaMM installed in your machine. 
If you need a refresher about Python you can check the [Intro to Python](/material/introductory_courses/python) course. The instructions on how to install PyBaMM are available in the [PyBaMM docs](https://docs.pybamm.org/en/latest/). @@ -50,7 +49,7 @@ simulation.solve([0, 3600]) Now that the simulation has been solved, we can simply call the `plot` method to generate an interactive plot of the key variables: ```python -sim.plot() +simulation.plot() ``` ## Comparing multiple models @@ -59,9 +58,9 @@ We have seen how to run a simulation of the DFN model, but PyBaMM includes many ```python models = [ - pybamm.lithium_ion.SPM(), - pybamm.lithium_ion.SPMe(), - pybamm.lithium_ion.DFN(), + pybamm.lithium_ion.SPM(), + pybamm.lithium_ion.SPMe(), + pybamm.lithium_ion.DFN(), ] ``` @@ -70,9 +69,9 @@ so we can now loop over the list, creating and solving the simulations as we go. ```python simulations = [] for model in models: - simulation = pybamm.Simulation(model) - simulation.solve([0, 3600]) - simulations.append(simulation) + simulation = pybamm.Simulation(model) + simulation.solve([0, 3600]) + simulations.append(simulation) ``` We can now plot the results. 
Because we want to plot all the simulations in the same plot we cannot use the same syntax as before, instead we can use the `pybamm.dynamic_plot` method, which takes the list of simulations as an input: @@ -94,8 +93,8 @@ the argument should be a list of strings with the names of the variables to plot ```python output_variables = [ - "Voltage [V]", - ["Electrode current density [A.m-2]", "Electrolyte current density [A.m-2]"] + "Voltage [V]", + ["Electrode current density [A.m-2]", "Electrolyte current density [A.m-2]"] ] simulation.plot(output_variables=output_variables) ``` @@ -137,6 +136,7 @@ simulation = pybamm.Simulation(model, experiment="Discharge at 3C until 3.3 V") ``` ## Printing citations + We aim to recognize all contributions by automatically generating citations to the relevant papers on which different parts of the code are built. These will change depending on what models and solvers you use. Adding the command @@ -144,4 +144,4 @@ These will change depending on what models and solvers you use. Adding the comma pybamm.print_citations() ``` -to the end of a script or notebook will print all citations that were used by that piece of code. This will print BibTeX information to the terminal; passing a filename to `print_citations` will print the BibTeX information to the specified file instead. \ No newline at end of file +to the end of a script or notebook will print all citations that were used by that piece of code. This will print BibTeX information to the terminal; passing a filename to `print_citations` will print the BibTeX information to the specified file instead. 
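
The build–solve–collect loop used above to compare models is a pattern worth isolating. Here it is sketched with stand-in classes (hypothetical mocks, not the PyBaMM API) so the shape of the loop is visible independently of PyBaMM:

```python
# Mock stand-ins for pybamm.lithium_ion.SPM()/SPMe()/DFN() and
# pybamm.Simulation; purely illustrative, not the real classes.
class MockModel:
    def __init__(self, name):
        self.name = name


class MockSimulation:
    def __init__(self, model):
        self.model = model
        self.solution = None

    def solve(self, t_eval):
        # stand-in for the real solver call over the time window t_eval
        self.solution = {"t": t_eval, "model": self.model.name}


models = [MockModel("SPM"), MockModel("SPMe"), MockModel("DFN")]

# build, solve, and collect each simulation in turn
simulations = []
for model in models:
    simulation = MockSimulation(model)
    simulation.solve([0, 3600])
    simulations.append(simulation)
```

With the real classes the collected list is exactly what `pybamm.dynamic_plot` consumes to overlay all models on one set of axes.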
diff --git a/libraries/pybamm/02_experiments.md b/libraries/pybamm/02_experiments.md
index 9b0ba2bc..24bed21f 100644
--- a/libraries/pybamm/02_experiments.md
+++ b/libraries/pybamm/02_experiments.md
@@ -1,21 +1,19 @@
---
name: The Experiment class
id: experiments
-dependsOn: [
-  libraries.pybamm.01_running_pybamm,
-]
+dependsOn: [libraries.pybamm.01_running_pybamm]
tags: [pybamm]
-attribution:
-  - citation: >
-      PyBaMM documentation by the PyBaMM Team
-    url: https://docs.pybamm.org
-    image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg
-    license: BSD-3
+attribution:
+  - citation: >
+      PyBaMM documentation by the PyBaMM Team
+    url: https://docs.pybamm.org
+    image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg
+    license: BSD-3
---

We already saw in [lesson 1](./01_running_pybamm) a very basic use of experiments in which we changed the discharge rate. With the `Experiment` class, however, you can do so much more. The `Experiment` class works by converting text strings into instructions that PyBaMM can use to create a `Simulation` object. Here are some examples:

-```
+```text
"Discharge at 1C for 0.5 hours",
"Discharge at C/20 for 0.5 hours",
"Charge at 0.5 C for 45 minutes",
@@ -25,7 +23,7 @@ "Charge at 200 mW for 45 minutes",
"Rest for 10 minutes",
"Hold at 1 V for 20 seconds",
-"Charge at 1 C until 4.1 V",
+"Charge at 1 C until 4.1 V",
"Hold at 4.1 V until 50 mA",
"Hold at 3V until C/50",
"Discharge at C/3 for 2 hours or until 2.5 V",
@@ -34,12 +32,15 @@ The input argument for the `Experiment` class is a string, or a list [square brackets] of strings like these.
The output is an `Experiment` object that can then be used as an optional argument for the `Simulation` class: ```python +import pybamm experiment = pybamm.Experiment([ "Discharge at 1C until 3.3 V", "Charge at 0.3C until 4.0 V", "Hold at 4.0 V until C/100", ]) +model = pybamm.lithium_ion.DFN() simulation = pybamm.Simulation(model, experiment=experiment) +solution = simulation.solve() ``` If you solve the resulting simulation, the solution will have different cycles, one for each string in the list used to create the experiment. You can access them via the `cycles` attribute of the solution, and plot them as usual @@ -69,6 +70,7 @@ simulation2 = pybamm.Simulation(model, experiment=experiment2) ``` You can access a given step by accessing the `steps` attribute of the `cycles` (i.e. `solution.cycles[i].steps[j]`), and plot as usual + ```python solution.cycles[0].steps[1].plot() ``` @@ -81,25 +83,25 @@ experiment3 = pybamm.Experiment( [ "Hold at 4.2 V until C/100", "Rest for 4 hours", - ] + + ] # Capacity check - [( + + [( "Discharge at C/10 until 2.5 V", "Charge at C/10 until 4.2 V", "Hold at 4.2 V until C/100" - )] + + )] # Ageing cycles - [( + + [( "Discharge at 1C until 2.5 V", "Charge at 0.3C until 4.2 V", "Hold at 4.2 V until C/100", - )] * 10 + + )] * 10 # Capacity check - [( + + [( "Discharge at C/10 until 2.5 V", "Charge at C/10 until 4.2 V", "Hold at 4.2 V until C/100" - )] + )] ) ``` @@ -115,4 +117,4 @@ There are 14 cycles in total. Each cycle has three steps, except for the first t :::: -Don't try to run `experiment3` yet. We'll be doing that in the next session. \ No newline at end of file +Don't try to run `experiment3` yet. We'll be doing that in the next session. 
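
To build intuition for how instruction strings like `"Discharge at 1C for 0.5 hours"` break down into an action, a rate, and a stopping condition, here is a hypothetical tokenizer. It is illustrative only: the regex, the field names, and `parse_step` are assumptions for this sketch, not PyBaMM's actual parsing code.

```python
import re

# Hypothetical decomposition of a step string into named fields;
# real PyBaMM step strings are more varied than this sketch covers.
STEP_RE = re.compile(
    r"(?P<action>Discharge|Charge|Rest|Hold)"
    r"(?: at (?P<rate>[\w./ ]+?))?"       # e.g. "1C", "C/20", "4.1 V", "200 mW"
    r"(?: (?P<kind>for|until) (?P<value>.+))?$"  # duration or cut-off
)


def parse_step(step):
    """Split one instruction string into action/rate/kind/value fields."""
    match = STEP_RE.match(step)
    if match is None:
        raise ValueError(f"Unrecognised step: {step!r}")
    return match.groupdict()
```

For example, `parse_step("Hold at 4.1 V until 50 mA")` separates the set-point `"4.1 V"` from the cut-off `"50 mA"`, which is essentially the information PyBaMM needs to turn each string into a simulation step.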
diff --git a/libraries/pybamm/03_parameter_values.md b/libraries/pybamm/03_parameter_values.md
index 6bfa059a..197d3968 100644
--- a/libraries/pybamm/03_parameter_values.md
+++ b/libraries/pybamm/03_parameter_values.md
@@ -1,15 +1,13 @@
---
name: Parameter sets
-dependsOn: [
-  libraries.pybamm.02_experiments,
-]
+dependsOn: [libraries.pybamm.02_experiments]
tags: [pybamm]
-attribution:
-  - citation: >
-      PyBaMM documentation by the PyBaMM Team
-    url: https://docs.pybamm.org
-    image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg
-    license: BSD-3
+attribution:
+  - citation: >
+      PyBaMM documentation by the PyBaMM Team
+    url: https://docs.pybamm.org
+    image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg
+    license: BSD-3
---

## Parameter Sets
@@ -19,6 +17,7 @@ PyBaMM comes with 12 ready-made parameter sets for lithium-ion batteries. To sel
```python
import pybamm
parameter_values = pybamm.ParameterValues("Marquis2019")
+model = pybamm.lithium_ion.DFN()
```

There are over 100 parameter values! Fortunately, `ParameterValues` objects are dictionaries so you can look up the parameters you're interested in:
@@ -36,40 +35,42 @@ experiment3 = pybamm.Experiment(
 [
 "Hold at 4.2 V until C/100",
 "Rest for 4 hours",
-        ] +
+        ]
 # Capacity check
-        [(
+        + [(
 "Discharge at C/10 until 2.5 V",
 "Charge at C/10 until 4.2 V",
 "Hold at 4.2 V until C/100"
-        )] +
+        )]
 # Ageing cycles
-        [(
+        + [(
 "Discharge at 1C until 2.5 V",
 "Charge at 0.3C until 4.2 V",
 "Hold at 4.2 V until C/100",
-        )] * 10 +
+        )] * 10
 # Capacity check
-        [(
+        + [(
 "Discharge at C/10 until 2.5 V",
 "Charge at C/10 until 4.2 V",
 "Hold at 4.2 V until C/100"
-        )]
+        )]
 )
```

The above `experiment3` will not work with the default parameters, because it was designed for a different cell with different voltage limits. Marquis et al.
studied the Kokam SLPB78205130H 16 Ah prismatic cell, whereas `experiment3` is designed for the LG M50 5 Ah cylindrical cell. Four of PyBaMM's built-in parameter sets correspond to the LG M50: -* `Chen2020` comes from the first study, published in 2020. -* `Chen2020_composite` is an upgrade of `Chen2020` designed to work with PyBaMM's composite electrode model -* `OKane2022` is a superset of `Chen2020` designed to work with PyBaMM's various degradation models -* `ORegan2022` comes from a follow-up paper to `Chen2020` that addressed some questions the first paper didn't answer + +- `Chen2020` comes from the first study, published in 2020. +- `Chen2020_composite` is an upgrade of `Chen2020` designed to work with PyBaMM's composite electrode model +- `OKane2022` is a superset of `Chen2020` designed to work with PyBaMM's various degradation models +- `ORegan2022` comes from a follow-up paper to `Chen2020` that addressed some questions the first paper didn't answer Like `Experiment` objects, `ParameterValues` objects are an optional argument to the `Simulation` class: ```python simulation3 = pybamm.Simulation( - model, - experiment=experiment3, + model, + experiment=experiment3, parameter_values=parameter_values ) ``` @@ -79,8 +80,8 @@ If you use suitable parameter values, the simulation will run, but the results w ```python parameter_values.update({ "Outer SEI solvent diffusivity [m2.s-1]": 1.25e-20, - "Lithium plating kinetic rate constant [m.s-1]: 1e-8, - "Dead lithium decay constant [s-1]: 4e-6, + "Lithium plating kinetic rate constant [m.s-1]": 1e-8, + "Dead lithium decay constant [s-1]": 4e-6, "Negative electrode cracking rate": 1.95e-18, "Negative electrode LAM constant proportional term [s-1]": 5.5556e-6, "Positive electrode LAM constant proportional term [s-1]": 5.5556e-6, @@ -120,7 +121,7 @@ You would then need to create a new `Simulation` object with the updated paramet ```python simulation = pybamm.Simulation( - model, + model, 
parameter_values=parameter_values ) ``` @@ -143,7 +144,7 @@ When the model is solved, you can provide a value for the input parameter ```python simulation = pybamm.Simulation( - model, + model, parameter_values=parameter_values ) solution = simulation.solve([0, 3600], inputs={"Current function [A]": 2}) @@ -170,7 +171,7 @@ parameter_values.update({ "Current function [A]": "[input]", }) simulation = pybamm.Simulation( - model, + model, parameter_values=parameter_values ) diff --git a/libraries/pybamm/04_outputs.md b/libraries/pybamm/04_outputs.md index 0ab2eb93..3864f0aa 100644 --- a/libraries/pybamm/04_outputs.md +++ b/libraries/pybamm/04_outputs.md @@ -1,15 +1,13 @@ --- name: Making the most of PyBaMM outputs -dependsOn: [ - libraries.pybamm.03_parameter_values, -] +dependsOn: [libraries.pybamm.03_parameter_values] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- There is a large overlap between this exercise and [PyBaMM Tutorial notebook 6](https://docs.pybamm.org/en/latest/source/examples/notebooks/getting_started/tutorial-6-managing-simulation-outputs.html), so we recommended you read both. 
@@ -19,6 +17,8 @@ There is a large overlap between this exercise and [PyBaMM Tutorial notebook 6]( After solving a simulation, there are two ways of accessing the solution: ```python +import pybamm +sim = pybamm.Simulation(pybamm.lithium_ion.SPM()) sol = sim.solve() ``` @@ -95,7 +95,7 @@ experiment = pybamm.Experiment([ "Hold at 4.0 V until C/100", ]) simulation = pybamm.Simulation( - model, + model, parameter_values=parameter_values, experiment=experiment, ) @@ -130,8 +130,9 @@ sol2 = pybamm.load(path + "my_pybamm_solution.pkl") ``` PyBaMM has a lot of variables, so these `.pkl` files are huge! So why bother? -* You can run another PyBaMM model, with the final results of the saved solution as the initial conditions for the next, by using `model.set_initial_conditions_from(sol2)`, as shown in [this example](https://docs.pybamm.org/en/latest/source/examples/notebooks/initialize-model-with-solution.html) -* You can do the same post-processing on a solution loaded from disk as you can on a "fresh" solution. + +- You can run another PyBaMM model, with the final results of the saved solution as the initial conditions for the next, by using `model.set_initial_conditions_from(sol2)`, as shown in [this example](https://docs.pybamm.org/en/latest/source/examples/notebooks/initialize-model-with-solution.html) +- You can do the same post-processing on a solution loaded from disk as you can on a "fresh" solution. If saving the entire solution would take up too much space, you can use `save_data` to only save the variables you need: @@ -153,14 +154,16 @@ Can you think of three situations where you would save the entire solution, and There is no right answer to this question, but some examples are the following. When to save entire solution? -* If you might want to do additional post-processing later. -* If you're likely to need the solution as an initial condition for anoher simulation. -* If you're submitting the data to an archive. + +- If you might want to do additional post-processing later.
+- If you're likely to need the solution as an initial condition for another simulation. +- If you're submitting the data to an archive. When to save selected data? -* If the full `.pkl` file would take up too much space or take too long to upload. -* To feed the data to another software package. -* To share the data with non-PyBaMM users. + +- If the full `.pkl` file would take up too much space or take too long to upload. +- To feed the data to another software package. +- To share the data with non-PyBaMM users. ::: @@ -184,7 +187,7 @@ path = "/mnt/c/Users/sokane/pybamm_data/" sol.save_data( path + "tIVQ.csv", ["Time [s]", "Current [A]", "Voltage [V]", "Discharge capacity [A.h]"], - to_format="csv + to_format="csv" ) ``` @@ -194,10 +197,8 @@ What does the above code do? What do you think the intended application was? :::solution -This code simulates a GITT experiment. By exporting the parameters into a `.csv` file, you can use the simulated data to parameterize an equivalent circuit network in the same way as experimental GITT data. +This code simulates a GITT experiment. By exporting the parameters into a `.csv` file, you can use the simulated data to parameterize an equivalent circuit network in the same way as experimental GITT data.
::: :::: - -## \ No newline at end of file diff --git a/libraries/pybamm/05_using_submodels.md b/libraries/pybamm/05_using_submodels.md index a217b96f..30c5e63b 100644 --- a/libraries/pybamm/05_using_submodels.md +++ b/libraries/pybamm/05_using_submodels.md @@ -1,51 +1,50 @@ --- name: Using submodels -dependsOn: [ - libraries.pybamm.04_outputs, -] +dependsOn: [libraries.pybamm.04_outputs] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- -One of the main features of PyBaMM is its modular structure that allows for plug and play models. At the core, all models in PyBaMM are built as a collection of submodels, where a submodel determines a specific subset of the physics. For example, the particle submodel would specify how lithium is transported in the particles. +One of the main features of PyBaMM is its modular structure that allows for plug and play models. At the core, all models in PyBaMM are built as a collection of submodels, where a submodel determines a specific subset of the physics. For example, the particle submodel would specify how lithium is transported in the particles. The full list of submodels can be found in the [PyBaMM docs](https://docs.pybamm.org/en/latest/source/api/models/submodels/index.html). 
You can check which submodels a given model uses by calling -```python +```python nolint model.submodels ``` In this lesson we will focus only on the subset of models to include additional physics, in particular: -* Thermal -* SEI growth -* Lithium plating -* Mechanics -* Active material +- Thermal +- SEI growth +- Lithium plating +- Mechanics +- Active material ## Thermal models We start with the thermal models. These models account for the changes in the temperature caused by the operation of the battery. The thermal models available in PyBaMM are: -* **Isothermal**: temperature stays constant. -* **Lumped**: the temperature is taken to be homogeneous in the battery, so only the average temperature is computed. -* **X-lumped**: the temperature is taken to be homogeneous across the thickness of the cell, but can vary in the directions parallel to the current collectors. Need to be used in conjunction with a current collector model. -* **X-full**: the temperature is allowed to vary across the thickness of the cell. +- **Isothermal**: temperature stays constant. +- **Lumped**: the temperature is taken to be homogeneous in the battery, so only the average temperature is computed. +- **X-lumped**: the temperature is taken to be homogeneous across the thickness of the cell, but can vary in the directions parallel to the current collectors. It needs to be used in conjunction with a current collector model. +- **X-full**: the temperature is allowed to vary across the thickness of the cell. More information on the thermal models can be found in [the documentation](https://docs.pybamm.org/en/latest/source/examples/notebooks/models/thermal-models.html). Thermal models add extra physics on top of the electrochemical models, so we need to choose a base electrochemical model to start with. Then, the extra physics can be specified via the model options.
For example, if we want to use the DFN model with a lumped thermal model we do ```python +import pybamm model = pybamm.lithium_ion.DFN(options={"thermal": "lumped"}) ``` -and then we can solve the model as usual. +and then we can solve the model as usual. ::::challenge{id=thermal title="Comparing thermal models"} @@ -59,9 +58,9 @@ thermal_options = ["isothermal", "x-full"] solutions = [] for option in thermal_options: - model = pybamm.lithium_ion.DFN(name=option, options={"thermal": option}) - simulation = pybamm.Simulation(model) - solutions.append(simulation.solve([0, 3600])) + model = pybamm.lithium_ion.DFN(name=option, options={"thermal": option}) + simulation = pybamm.Simulation(model) + solutions.append(simulation.solve([0, 3600])) pybamm.dynamic_plot( solutions, @@ -76,7 +75,7 @@ pybamm.dynamic_plot( "Voltage [V]", "Cell temperature [K]", ], - ) +) ``` @@ -90,13 +89,13 @@ We observe that the temperature in the isothermal model remains constant, while Let's now focus our attention to SEI growth models. These models capture the growth of the solid-electrolyte interphase, which is cause by a side reaction between the electrolyte and lithium. There are multiple ways of simulating SEI growth, and PyBaMM has the following options: -* **None**: no SEI included. -* **Constant**: includes an SEI layer which does not grow. -* **Reaction limited**: assumes reaction is the limiting phenomenon, see Section 5.5.3 of Marquis (2020). It can also be specified to be asymmetric. -* **Solvent-diffusion limited**: assumes that solvent diffusion is the limiting phenomenon, see Section 5.5.4 of Marquis (2020). -* **Electron-migration limited**: assumes that migration of electrons is the limiting phenomenon, see Section 5.5.5 of Marquis (2020). -* **Interstitial-diffusion limited**: assumes that diffusion of lithium-ion intestitials is the limiting phenomenon, see Section 5.4 of Marquis (2020). 
-* **EC reaction limited**: assumes the model is limited by both reaction and dissuions, see Yang et al (2017). +- **None**: no SEI included. +- **Constant**: includes an SEI layer which does not grow. +- **Reaction limited**: assumes reaction is the limiting phenomenon, see Section 5.5.3 of Marquis (2020). It can also be specified to be asymmetric. +- **Solvent-diffusion limited**: assumes that solvent diffusion is the limiting phenomenon, see Section 5.5.4 of Marquis (2020). +- **Electron-migration limited**: assumes that migration of electrons is the limiting phenomenon, see Section 5.5.5 of Marquis (2020). +- **Interstitial-diffusion limited**: assumes that diffusion of lithium-ion interstitials is the limiting phenomenon, see Section 5.4 of Marquis (2020). +- **EC reaction limited**: assumes the model is limited by both reaction and diffusion, see Yang et al (2017). See all the available options in [the docs](https://docs.pybamm.org/en/latest/source/api/models/base_models/base_battery_model.html#pybamm.BatteryModelOptions). For more information on these models, please see [Marquis (2020)](https://ora.ox.ac.uk/objects/uuid:8afdcc34-cc42-48ba-b316-96a6d0f33a45) and [Yang et al (2017)](https://www.sciencedirect.com/science/article/pii/S0378775317307619).
@@ -111,19 +110,19 @@ A possible implementation is ```python SEI_options = [ - "reaction limited", - "solvent-diffusion limited", + "reaction limited", + "solvent-diffusion limited", "interstitial-diffusion limited" ] solutions = [] for option in SEI_options: - model = pybamm.lithium_ion.DFN( - name=option, + model = pybamm.lithium_ion.DFN( + name=option, options={"SEI": option, "SEI porosity change": "true"} ) - simulation = pybamm.Simulation(model) - solutions.append(simulation.solve([0, 3600])) + simulation = pybamm.Simulation(model) + solutions.append(simulation.solve([0, 3600])) pybamm.dynamic_plot( solutions, @@ -135,7 +134,7 @@ pybamm.dynamic_plot( "Negative electrode porosity", "X-averaged negative electrode porosity", ], - ) +) ``` We observe that the voltage response is the same for all models, as the SEI contribution is very small. The SEI layer grows fairly homogeneously, and the fastest growing model is the reaction limited one, even though this is likely to be due to the choice of parameters. A possible way to extend this exercise would be to simulate many cycles. @@ -145,18 +144,21 @@ We observe that the voltage response is the same for all models, as the SEI cont :::: ## Particle mechanics + Finally, we consider the models for particle mechanics. These models account for the deformation and cracking on the particles. The models available in PyBaMM are -* None: no mechanical effects included. -* Swelling only: accounts for the deformation of the particles in the lithiation-delithiation cycle. -* Swelling and cracking: accounts for the swelling and also the crack formation on the particle surface. +- None: no mechanical effects included. +- Swelling only: accounts for the deformation of the particles in the lithiation-delithiation cycle. +- Swelling and cracking: accounts for the swelling and also the crack formation on the particle surface. The mechanical models can be set differently for each electrode by passing a tuple as the option. 
For example, the following option + ```python model = pybamm.lithium_ion.DFN( options={"particle mechanics": ("swelling only", "none")} ) ``` + will include swelling model for the negative electrode and no mechanical effects for the positive electrode. ::::challenge{id=mechanics title="Particle mechanics models"} @@ -180,7 +182,7 @@ solution.plot([ "Negative particle surface radial stress [Pa]", "Negative particle surface tangential stress [Pa]", "Negative particle surface displacement [m]", - "Negative particle crack length [m]", + "Negative particle crack length [m]", "Positive particle surface radial stress [Pa]", "Positive particle surface tangential stress [Pa]", "Positive particle surface displacement [m]", @@ -188,8 +190,8 @@ solution.plot([ ]) ``` -A few key observations are that the surface radial stress is always zero. As expected, there is no cracking in the negative electrode (we did not enable that option) but there is cracking in the positive one. +A few key observations are that the surface radial stress is always zero. As expected, there is no cracking in the negative electrode (we did not enable that option) but there is cracking in the positive one. 
::: -:::: \ No newline at end of file +:::: diff --git a/libraries/pybamm/06_final_exercises.md b/libraries/pybamm/06_final_exercises.md index 11dd1849..cf04aee5 100644 --- a/libraries/pybamm/06_final_exercises.md +++ b/libraries/pybamm/06_final_exercises.md @@ -1,20 +1,19 @@ --- name: Final exercises -dependsOn: [ - libraries.pybamm.05_using_submodels, -] +dependsOn: [libraries.pybamm.05_using_submodels] tags: [pybamm] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 --- -To complete the course, here is a list of open-ended exercises to review the content of the course and go into detail into various features. By open-ended we mean that the exercises will often deliberately not specify every single aspect of the model. Some of this additional detail has not been covered in the course, so more information can be found in the [PyBaMM documentation](https://docs.pybamm.org). The exercises are independent from each other so they can be tackled in any order. +To complete the course, here is a list of open-ended exercises to review the content of the course and go into detail into various features. By open-ended we mean that the exercises will often deliberately not specify every single aspect of the model. Some of this additional detail has not been covered in the course, so more information can be found in the [PyBaMM documentation](https://docs.pybamm.org). The exercises are independent from each other so they can be tackled in any order. ## Exercise 1: Coupled degradation mechanisms + We saw in [Lesson 5](./05_using_submodels) how to include additional physics to the models.
The goal of this exercise is to further explore the degradation submodels, by coupling several of them together. [O'Kane et al (2022)](https://pubs.rsc.org/en/content/articlehtml/2022/cp/d2cp00417h) is a very good reference to read more about coupled degradation mechanisms. 1. Run an electrochemical model of your choice with the SEI growth submodel of your choice too. @@ -24,6 +23,7 @@ We saw in [Lesson 5](./05_using_submodels) how to include additional physics to 5. Use the `plot_summary_variables` function and compare the summary degradation variables for the various degradation models. What do you observe? Tip: check the docs for more information about this function. ## Exercise 2: Long experiments + In [lesson 2](./02_experiments) we saw how to run experiments, that is how to change the cycling conditions between constant current, constant voltage, drive cycles... The goal of this exercise is to explore how to run very long experiments (i.e. experiments with hundreds or even thousands of cycles). 1. Run an experiment consisting of 10 cycles of 1C discharge, rest, a C/3 4.2 V CCCV charge and another rest. @@ -34,6 +34,7 @@ In [lesson 2](./02_experiments) we saw how to run experiments, that is how to ch 6. Plot both the standard and summary variables for this new solution. What do you observe? ## Exercise 3: Half-cell models + A lithium counter-electrode can be used instead of a porous electrode, either to make a lithium-metal battery or to test the properties of an electrode in the laboratory. PyBaMM models can be defined for half-cells, by passing the option `"working electrode": "positive"`. 1. Simulate the DFN model for a half-cell. What differences do you observe with the standard DFN model? @@ -43,7 +44,9 @@ A lithium counter-electrode can be used instead of a porous electrode, either to 5. Using the parameter values found in the previous question, modify the `Chen2020` parameter set to work with the half-cell models. 
## Exercise 4: Batch study and sensitivity analysis + One of the first examples we saw in [lesson 1](./01_running_pybamm) was how to compare various PyBaMM models. The goal of this exercise is to explore how to compare various simulations and perform sensitivity analysis. + 1. Compare the SPM, SPMe and DFN models as explained in [lesson 1](./01_running_pybamm). 2. PyBaMM has the `BatchStudy` function that allows to streamline comparisons. Repeat the comparison above but using `BatchStudy`. Tip: check the docs for more information about this function. 3. Compare the three models for two parameter sets of your choice. Tip: you may want to check what the `permutations` option does. @@ -55,4 +58,4 @@ One of the first examples we saw in [lesson 1](./01_running_pybamm) was how to c parameter has the most effect on the solution. (Hint: you can access the sensitivities of variable `var` with respect to parameter `param` by calling `sol[var].sensitivities[param]`, which gives an array rather than a - `pybamm.ProcessedVariable`). \ No newline at end of file + `pybamm.ProcessedVariable`). 
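For the sensitivity analysis in Exercise 4, it may help to recall what a sensitivity is: the derivative of a model output with respect to a parameter. The sketch below illustrates the idea with a central finite difference on a toy exponential-relaxation output (the function and numbers here are invented for illustration; PyBaMM computes sensitivities internally rather than by finite differences):

```python
import math

def output(t, r):
    # Toy stand-in for a model output, e.g. a voltage relaxing at rate r
    return 4.2 * math.exp(-r * t)

def sensitivity(t, r, h=1e-6):
    # Central finite difference approximating d(output)/dr
    return (output(t, r + h) - output(t, r - h)) / (2 * h)

t, r = 10.0, 0.05
approx = sensitivity(t, r)
exact = -4.2 * t * math.exp(-r * t)  # analytic derivative of the toy model
assert abs(approx - exact) < 1e-6
```

The PyBaMM solver returns the analogous quantity as an array, which is why the hint notes that `sol[var].sensitivities[param]` is not a `pybamm.ProcessedVariable`.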
diff --git a/libraries/pybamm/index.md b/libraries/pybamm/index.md index 48335afd..eec59c2b 100644 --- a/libraries/pybamm/index.md +++ b/libraries/pybamm/index.md @@ -1,29 +1,34 @@ --- id: intro_to_pybamm name: Intro to PyBaMM -dependsOn: [ - introductory_courses.python -] -files: [ - 01_running_pybamm.md, 02_experiments.md, 03_parameter_values.md, 04_outputs.md, 05_using_submodels.md, 06_final_exercises.md -] -attribution: - - citation: > - PyBaMM documentation by the PyBaMM Team - url: https://docs.pybamm.org - image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg - license: BSD-3 +dependsOn: [introductory_courses.python] +files: + [ + 01_running_pybamm.md, + 02_experiments.md, + 03_parameter_values.md, + 04_outputs.md, + 05_using_submodels.md, + 06_final_exercises.md, + ] +attribution: + - citation: > + PyBaMM documentation by the PyBaMM Team + url: https://docs.pybamm.org + image: https://raw.githubusercontent.com/pybamm-team/pybamm.org/main/static/images/pybamm_logo.svg + license: BSD-3 summary: | - This course introduces the basics of PyBaMM (Python Battery Mathematical Modelling), an open-source battery simulation package written in Python. - We will learn how to run PyBaMM models for various parameters and operating conditions. - We will also learn how to process and visualise the outputs of the models. + This course introduces the basics of PyBaMM (Python Battery Mathematical Modelling), an open-source battery simulation package written in Python. + We will learn how to run PyBaMM models for various parameters and operating conditions. + We will also learn how to process and visualise the outputs of the models. --- -**PyBaMM: A High-Level Overview** +## PyBaMM: A High-Level Overview PyBaMM (Python Battery Mathematical Modelling) is an open-source battery simulation package written in Python. 
Our mission is to accelerate battery modelling research by providing open-source tools for multi-institutional, interdisciplinary collaboration. Broadly, PyBaMM consists of + 1. a framework for writing and solving systems of differential equations, 2. a library of battery models and parameters, and 3. specialized tools for simulating battery-specific experiments and visualizing the results. -Together, these enable flexible model definitions and fast battery simulations, allowing users to explore the effect of different battery designs and modeling assumptions under a variety of operating scenarios. \ No newline at end of file +Together, these enable flexible model definitions and fast battery simulations, allowing users to explore the effect of different battery designs and modeling assumptions under a variety of operating scenarios. diff --git a/libraries/pybamm/slides/next_steps.md b/libraries/pybamm/slides/next_steps.md index 817dd9ae..5d3fda83 100644 --- a/libraries/pybamm/slides/next_steps.md +++ b/libraries/pybamm/slides/next_steps.md @@ -10,20 +10,20 @@ title: Next steps with PyBaMM - [PyBaMM API documentation](https://docs.pybamm.org/en/latest/source/api/index.html) - Community - [PyBaMM github](https://github.com/pybamm-team/PyBaMM/) - - For questions, use the + - For questions, use the [discussions](https://github.com/pybamm-team/PyBaMM/discussions) - - For Bugs, Feature Requests, etc. look at the + - For Bugs, Feature Requests, etc. 
look at the [issues](https://github.com/pybamm-team/PyBaMM/issues) - [PyBaMM Slack](https://pybamm.slack.com/) - - Monthly public development meetings (see #general on Slack for invite + agenda + - Monthly public development meetings (see #general on Slack for invite + agenda links) ## Contributing to PyBaMM - PyBaMM is an open-source project, and we welcome contributions from the community -- Many ways to contribute: reporting bugs, improving the documentation, submitting +- Many ways to contribute: reporting bugs, improving the documentation, submitting feature requests, or writing code. -- If you are interested in contributing code, please see the [contributing +- If you are interested in contributing code, please see the [contributing guide](https://github.com/pybamm-team/PyBaMM/blob/develop/CONTRIBUTING.md) ## PyBaMM development @@ -47,9 +47,9 @@ How code is added to PyBaMM: - Performance improvements and solver enhancements - Supporting standard formats like BPX data - Integration of PyBaMM with: - - inference software (e.g. - [pybamm-param](https://github.com/paramm-team/pybamm-param) and + - inference software (e.g. + [pybamm-param](https://github.com/paramm-team/pybamm-param) and [PyBOP](https://github.com/pybop-team/PyBOP)) - data sources (e.g. [galv](https://github.com/Battery-Intelligence-Lab/galv)) - -## Questions? \ No newline at end of file + +## Questions? 
diff --git a/libraries/pybamm/slides/parameters_and_ouputs.md b/libraries/pybamm/slides/parameters_and_ouputs.md index a98da2e5..b1c5319f 100644 --- a/libraries/pybamm/slides/parameters_and_ouputs.md +++ b/libraries/pybamm/slides/parameters_and_ouputs.md @@ -8,7 +8,7 @@ title: Using Parameters and Outputs - Using Parameters Sets - Changing Parameters - Input Parameters -- Outputs: +- Outputs: - Variables - Processed Variables - Plots @@ -16,7 +16,7 @@ title: Using Parameters and Outputs ## Parameter Sets -- PyBaMM comes with a number of parameter sets that have been published in the +- PyBaMM comes with a number of parameter sets that have been published in the literature. - To see the available parameter sets and more information on each one: @@ -53,17 +53,16 @@ parameter_values["Negative electrode thickness [m]"] = 1e-5 ## Background - The PyBaMM Pipeline -- To understand how parameters are used in PyBaMM, we need to understand the PyBaMM - pipeline. -- Most of these steps are done automatically by the `pybamm.Simulation` class, but it is +- To understand how parameters are used in PyBaMM, we need to understand the PyBaMM + pipeline. +- Most of these steps are done automatically by the `pybamm.Simulation` class, but it is useful to know what is going on under the hood. - ### The PyBaMM Pipeline (cont.) The pipeline is a series of steps that PyBaMM goes through to solve a model: -1. Define the model. The equations are continuous (i.e. not discretised in space or +1. Define the model. The equations are continuous (i.e. not discretised in space or time) and the parameters are symbolic (i.e. not yet given a value) @@ -78,8 +77,7 @@ parameter_values = pybamm.ParameterValues("Marquis2019") param.process_model(model) param.process_geometry(geometry) ~~~ - - + ### The PyBaMM Pipeline (cont.) 3. 
Discretise the equations in space and time @@ -97,19 +95,19 @@ solver = pybamm.CasadiSolver(mode="fast") solution = solver.solve(model, t_eval) ~~~ -For more detail see the +For more detail see the [docs](https://docs.pybamm.org/en/latest/source/examples/notebooks/expression_tree/expression-tree.html) ## Updating Parameters in the Pipeline -- To update the parameters in a model, we need to repeat steps 2-4 of the pipeline (i.e. +- To update the parameters in a model, we need to repeat steps 2-4 of the pipeline (i.e. nearly all of it!). - This overhead can be significant (e.g. paramter sweeps, optimisation). - To avoid this, PyBaMM has the concept of an **input parameter**. ## Input Parameters -- An input parameter is a parameter that does not yet have a value, but one will be +- An input parameter is a parameter that does not yet have a value, but one will be provided when the model is solved. - An input parameter can be created by setting its value to the string `[input]`, like so: @@ -121,11 +119,11 @@ parameter_values.update({ }) ~~~ -- This keeps the parameter symbolic through the pipeline, until the solve step, when the +- This keeps the parameter symbolic through the pipeline, until the solve step, when the user provides a value: ~~~python solution = simulation.solve([0, 3600], inputs={"Current function [A]": 2}) ~~~ - -## Questions? \ No newline at end of file + +## Questions? diff --git a/scientific_computing/bayesian_inference/01-AM.md b/scientific_computing/bayesian_inference/01-AM.md index ea792f02..2d005a73 100644 --- a/scientific_computing/bayesian_inference/01-AM.md +++ b/scientific_computing/bayesian_inference/01-AM.md @@ -1,18 +1,16 @@ --- name: Applied Bayesian Inference - AM exercise -dependsOn: [ -] +dependsOn: [] tags: [] -attribution: -- citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- **You can download the slides for the AM lecture [here](slides/applied_ode_modelling.pdf)** @@ -36,7 +34,7 @@ Using Scipy's integrator, write a function which numerically solves the logistic ```python import numpy as np from scipy.integrate import odeint -from plotnine import * +from plotnine import geom_path, geom_line, geom_point, geom_hline, geom_vline, ggplot, aes import pandas as pd def logistic_rhs(y, t, r, k): @@ -48,8 +46,7 @@ def logistic_solve(times, r, k, y0): return df df = logistic_solve(np.linspace(0, 40, 100), 0.2, 20, 5) -(ggplot(df, aes(x='t', y='y')) + - geom_line()) +(ggplot(df, aes(x='t', y='y')) + geom_line()) ``` ::: @@ -79,10 +76,11 @@ df['type'] = 'simulated' df_both = pd.concat([df, df_short]) -(ggplot(df_both[df_both['type']=='simulated'], aes(x='t', y='y', colour='type')) + -
geom_line() + - geom_point(df_both[df_both['type']=='actual'])) +(ggplot(df_both[df_both['type']=='simulated'], + aes(x='t', y='y', colour='type')) + + geom_line() + geom_point(df_both[df_both['type']=='actual'])) ``` + ::: :::: @@ -112,13 +110,12 @@ def logistic_log_likelihood(times, y_tilde, r, k, y0, sigma): ::: :::: - ::::challenge{id=plot-likelihood title="Plot likelihood"} Using the simulated data you previously generated, plot the log-likelihood as a function of $r$ holding all other parameters at their true values. - :::solution + ```python r = np.linspace(0, 1, 100) log_like = np.zeros(len(r)) @@ -126,14 +123,12 @@ for i in range(len(r)): log_like[i] = logistic_log_likelihood(times, df_short['y'], r[i], 20, 5, 2) df_log_like = pd.DataFrame({'r': r, 'log_like': log_like}) -(ggplot(df_log_like, aes(x='r', y='log_like')) + - geom_line()) +(ggplot(df_log_like, aes(x='r', y='log_like')) + geom_line()) ``` ::: :::: - ::::challenge{id=plot-likelihood-contour title="Plot likelihood contour"} Plot the likelihood (not the log-likelihood as this is harder to distinguish) contour surface for $(r,k)$ holding all other parameters at their true values. @@ -163,10 +158,10 @@ plt.show() ::::challenge{id=prior-predictive title="Prior predictive"} -Assume we have priors on parameters: $r\sim \mathcal{N}(0.2, 0.02), \kappa\sim -\mathcal{N}(20, 4)$, and we fix $y(0)=5$. Generate 100 prior predictive simulations of -the ODE solution $y$ and plot these. Remember, a single prior predictive simulation is -obtained by sampling each parameter from its prior and (in this case) solving the ODE +Assume we have priors on parameters: $r\sim \mathcal{N}(0.2, 0.02), \kappa\sim +\mathcal{N}(20, 4)$, and we fix $y(0)=5$. Generate 100 prior predictive simulations of +the ODE solution $y$ and plot these. Remember, a single prior predictive simulation is +obtained by sampling each parameter from its prior and (in this case) solving the ODE for this parameter set. 
:::solution @@ -183,8 +178,8 @@ for i in range(n): else: df_big = pd.concat([df_big, temp]) -(ggplot(df_big, aes(x='t', y='y', group='simulation')) + - geom_line(alpha=0.5)) +(ggplot(df_big, aes(x='t', y='y', group='simulation')) + + geom_line(alpha=0.5)) ``` ::: @@ -192,7 +187,7 @@ for i in range(n): ::::challenge{id=initial-condition title="Initial condition"} -Now also allow $y(0)\sim \mathcal{N}(5, 1)$. How does the prior predictive distribution change? +Now also allow $y(0)\sim \mathcal{N}(5, 1)$. How does the prior predictive distribution change? :::solution @@ -208,14 +203,13 @@ for i in range(n): else: df_big = pd.concat([df_big, temp]) -(ggplot(df_big, aes(x='t', y='y', group='simulation')) + - geom_line(alpha=0.5)) +(ggplot(df_big, aes(x='t', y='y', group='simulation')) + + geom_line(alpha=0.5)) ``` ::: :::: - ::::challenge{id=proposal title="Metropolis Sampler: Proposal"} We are now going to write a random walk Metropolis sampler. The first step is to write a method which takes as input $(r,\kappa, \sigma)$ and proposes new values using univariate normal distributions centered at the current values. So, for example, @@ -246,9 +240,9 @@ Assume priors: $r\sim \mathcal{N}(0.2, 0.02), \kappa\sim \mathcal{N}(20, 4), \si ```python def prior(r, k, sigma): - lp = (scipy.stats.norm.logpdf(r, 0.2, 0.02) + - scipy.stats.norm.logpdf(k, 20, 4) + - scipy.stats.norm.logpdf(sigma, 2, 0.2)) + lp = (scipy.stats.norm.logpdf(r, 0.2, 0.02) + + scipy.stats.norm.logpdf(k, 20, 4) + + scipy.stats.norm.logpdf(sigma, 2, 0.2)) return lp ``` @@ -257,7 +251,7 @@ def prior(r, k, sigma): ::::challenge{id=posterior title="Metropolis Sampler: Posterior"} - Write a function which calculates the unnormalised log-posterior (i.e. the sum of the log-prior and log-likelihood), $\text{log-}p(r,\kappa,\sigma|\text{data})$, for a given parameter set. +Write a function which calculates the unnormalised log-posterior (i.e. 
the sum of the log-prior and log-likelihood), $\text{log-}p(r,\kappa,\sigma|\text{data})$, for a given parameter set. :::solution @@ -268,7 +262,6 @@ def posterior(r, k, sigma, y0, times, y_tilde): return ll + lp ``` - ```python posterior(0.4, 20, 2, 5, times, df_short['y']) ``` @@ -286,7 +279,6 @@ $$ Then generates $u\sim U(0,1)$ and does the following: if $t \geq u$, return $(r',\kappa',\sigma')$; else return $(r,\kappa,\sigma)$. - :::solution ```python @@ -302,16 +294,13 @@ def step_accept(r, k, sigma, y0, times, y_tilde, tau_r, tau_k, tau_sigma): return r, k, sigma ``` - ```python step_accept(0.2, 18, 1.5, 5, times, df_short['y'], 0.01, 1, 0.1) ``` - ::: :::: - ::::challenge{id=mcmc title="Metropolis Sampler: MCMC"} Write a function which iterates 'step_accept' generating a chain of MCMC samples of $(r,\kappa,\sigma)$. Initialise $(r,\kappa,\sigma)$ using samples from the prior. @@ -338,7 +327,7 @@ def MCMC(numsamples, r, k, sigma, y0, times, y_tilde, tau_r, tau_k, tau_sigma): ::::challenge{id=plot-mcmc title="Metropolis Sampler: Plot sampled values"} -Assuming step sizes of $(\tau_r=0.01, \tau_k=1, \tau\_{\sigma} = 0.1) $, generate an +Assuming step sizes of $(\tau_r=0.01, \tau_k=1, \tau_{\sigma} = 0.1)$, generate an MCMC sample of 1000 draws. Visualise the sampled values of $r$ over time. :::solution @@ -347,7 +336,6 @@ df = MCMC(1000, 0.2, 18, 1.5, 5, times, df_short['y'], 0.01, 1, 0.1) ``` - ```python plt.plot(df['r']) plt.show() @@ -363,16 +351,13 @@ Plot the pairwise distribution of $(r,\kappa)$. 
How do the sampled values compar :::solution ```python -(ggplot(df, aes(x='r', y='k')) + - geom_point(colour='blue') + - geom_vline(xintercept=0.2) + - geom_hline(yintercept=20)) +(ggplot(df, aes(x='r', y='k')) + geom_point(colour='blue') + + geom_vline(xintercept=0.2) + geom_hline(yintercept=20)) ``` + ::: :::: - - ::::challenge{id=convergence title="Metropolis Sampler: Convergence"} Modify your MCMC routine to use the following half-normal distributions to sample initial points for your chains. Run four independent chains for 1000 iterations each and plot the $\kappa$ samples over time. How long does it appear your chains be run until they reach a stationary distribution? @@ -385,8 +370,7 @@ $$ \kappa \sim \text{half-N}(0, 20, 10) $$ - -$$ +$$ \sigma \sim \text{half-N}(2, 1) $$ @@ -404,10 +388,8 @@ def truncated_normal(mu, sd): return r[0] ``` - :::solution - ```python def truncated_normal(mu, sd): myclip_a = 0 @@ -420,7 +402,6 @@ def truncated_normal(mu, sd): return r[0] ``` - ```python def MCMC(numsamples, r, k, sigma, y0, times, y_tilde, tau_r, tau_k, tau_sigma): r = truncated_normal(0, 0.2, 0.1) @@ -447,10 +428,8 @@ df3['chain'] = '4' df_all = pd.concat([df, df1, df2, df3]) ``` - ```python -(ggplot(df_all, aes(x='iter', y='k', colour='chain')) + - geom_line()) +(ggplot(df_all, aes(x='iter', y='k', colour='chain')) + geom_line()) ``` ::: @@ -460,7 +439,6 @@ df_all = pd.concat([df, df1, df2, df3]) Using a random subset of 100 samples from all four chains taken from after they appear to have reached the stationary distribution, draw from the posterior predictive distribution of the logistic equation solution and overlay the observations. 
- :::solution ```python @@ -479,15 +457,13 @@ for i in range(nsamples): else: df_big = pd.concat([df_big, temp]) - + df_big['type'] = 'simulation' df_short df_todo = pd.concat([df_big, df_short]) -(ggplot(df_todo.query('type == "simulation"'), aes(x='t', y='y', group='simulation')) + - geom_line(alpha=0.5) + - geom_point(df_todo.query('type != "simulation"'), colour="orange")) +(ggplot(df_todo.query('type == "simulation"'), aes(x='t', y='y', group='simulation')) ++ geom_line(alpha=0.5) + geom_point(df_todo.query('type != "simulation"'), colour="orange")) ``` ::: :::: - diff --git a/scientific_computing/bayesian_inference/02-PM.md b/scientific_computing/bayesian_inference/02-PM.md index 55ca8887..95133e42 100644 --- a/scientific_computing/bayesian_inference/02-PM.md +++ b/scientific_computing/bayesian_inference/02-PM.md @@ -1,19 +1,16 @@ --- name: Applied Bayesian Inference - PM exercise -dependsOn: [ - scientific_computing.bayesian_inference.01-AM, -] +dependsOn: [scientific_computing.bayesian_inference.01-AM] tags: [] -attribution: -- citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- **You can download the slides for the PM lecture [here](slides/practical_ode_modelling.pdf)** @@ -31,9 +28,9 @@ $$ where at time $t=0$, $u(0) = u_0$ and $v(0)=v_0$. Here, $\alpha\geq 0$ yields the prey population growth rate in the absence of predators; $\beta \geq 0$ is the rate of decline in the prey as a result of predation; $\gamma \geq 0$ denotes the rate of decline of predators in absence of prey; $\delta > 0$ is the rate of predator population growth due to predation. -## Preliminary question! +## Preliminary question -Have you got PINTS installed on your machine? If not, clone the PINTS repo on: https://github.com/pints-team/pints and install it by executing: +Have you got PINTS installed on your machine? 
If not, clone the PINTS repo on: <https://github.com/pints-team/pints> and install it by executing: `pip install -e .[dev,docs]` @@ -56,11 +53,11 @@ def lotka_volterra_rhs(y, t, alpha, beta, gamma, delta): return dydt ``` - ```python from scipy.integrate import odeint -from plotnine import * +from plotnine import aes, geom_point, geom_line, geom_path, ggplot import pandas as pd +import scipy def lotka_volterra_solve(times, alpha, beta, gamma, delta, u_0, v_0): sol = odeint(lotka_volterra_rhs, [u_0, v_0], times, args=(alpha, beta, gamma, delta)) @@ -72,22 +69,18 @@ times = np.linspace(0, 25, 1001) df = lotka_volterra_solve(times, 0.55, 0.028, 0.84, 0.026, 33, 6) df_long = pd.melt(df, id_vars='t') -(ggplot(df_long, aes(x='t', y='value', colour='variable')) + - geom_line()) +(ggplot(df_long, aes(x='t', y='value', colour='variable')) + geom_line()) ``` An alternative way to view this is as orbits. - ```python -(ggplot(df, aes(x='u', y='v')) + - geom_path()) +(ggplot(df, aes(x='u', y='v')) + geom_path()) ``` ::: :::: - ::::challenge{id=different-parameters title="Different parameters"} Run the model with $\alpha=0.79, \beta=0.04, \gamma=1.3, \delta=0.04, u_0=33, v_0=6$. How do the dynamics compare with the previous run? @@ -100,8 +93,7 @@ df1['simulation'] = '2' df['simulation'] = '1' df_both = pd.concat([df, df1]) -(ggplot(df_both, aes(x='u', y='v', colour='simulation')) + - geom_path()) +(ggplot(df_both, aes(x='u', y='v', colour='simulation')) + geom_path()) ``` ::: @@ -127,12 +119,11 @@ $$ \end{aligned} $$ - ::::challenge{id=generate-observations title="Generate observations"} Using these observation models, generate annual data for 25 years assuming $\sigma_u=\sigma_v=0.25$ and using the parameter sets from parts 1 and 2. Graph these observations. 
-Note, this may be a helpful resource: https://ben18785.shinyapps.io/distribution-zoo/ +Note, this may be a helpful resource: <https://ben18785.shinyapps.io/distribution-zoo/> :::solution @@ -157,10 +148,9 @@ df1['simulation'] = '2' df_both = pd.concat([df, df1]) -(ggplot(df_both, aes(x='u_tilde', y='v_tilde', colour='simulation')) + - geom_path() + - geom_point()) +(ggplot(df_both, aes(x='u_tilde', y='v_tilde', colour='simulation')) + geom_path() + geom_point()) ``` + ::: :::: @@ -177,7 +167,7 @@ with a similar expression for the predator compartment. Here $u(t|\alpha, \beta, :::solution ```python -def lotka_volterra_u_loglikelihood(df_obs, alpha, beta, gamma, delta, u_0, v_0, sigma_u, sigma_v): +def lotka_volterra_u_loglikelihood(df_obs, alpha, beta, gamma, delta, u_0, v_0, sigma_u, sigma_v): df = lotka_volterra_solve(df_obs['t'], alpha, beta, gamma, delta, u_0, v_0) log_p = 0.0 for i in range(len(df_obs['t'])): @@ -207,7 +197,7 @@ plt.show() ::::challenge{id=intro-to-pints title="Intro to PINTS"} -PINTS is a Python package designed in the Department of Computer Science that provides access to all sorts of inference routines, which is especially good for ODEs. Following the approach given here: https://github.com/pints-team/pints/blob/master/examples/stats/custom-model.ipynb wrap wrap a pints.LogPDF class around the log-likelihood function we just created to allow us to access these methods. We're going to hold a number of parameters constant to make inference a little more manageable for this practical, so set up your method with $(u_0=33, v_0=6, \sigma_u=0.25, \sigma_v=0.25)$. +PINTS is a Python package designed in the Department of Computer Science that provides access to all sorts of inference routines, which is especially good for ODEs. Following the approach given here: <https://github.com/pints-team/pints/blob/master/examples/stats/custom-model.ipynb> wrap a pints.LogPDF class around the log-likelihood function we just created to allow us to access these methods. 
We're going to hold a number of parameters constant to make inference a little more manageable for this practical, so set up your method with $(u_0=33, v_0=6, \sigma_u=0.25, \sigma_v=0.25)$. :::solution @@ -256,6 +246,7 @@ Note, initialising chains all at a single point is not good practice but allows To run the MCMC, we first instantiate the pints.LogPDF object you created in the previous question assuming observational data generated in question 4. Then we instantiate an MCMC Controller object using: ```python +model = LotkaVolterraLogPDF(df) mcmc = pints.MCMCController(model, nchains, xs, method=pints.HaarioBardenetACMC) ``` @@ -276,7 +267,6 @@ to run the MCMC. Note that, at the moment, since we have not specified priors, PINTS implicitly assumes that the priors are uniform (and, in this case, improper). - :::solution ```python @@ -296,7 +286,6 @@ chains = mcmc.run() ``` ::: :::: - ::::challenge{id=mcmc-summary title="MCMC summary"} PINTS has an in-built MCMC summary object that can be called and printed using: @@ -314,12 +303,13 @@ Have your MCMC chains yet converged? If Rhat > 1.1 (probably 1.01 for publicatio results = pints.MCMCSummary(chains=chains, time=mcmc.time()) print(results) ``` + ::: :::: ::::challenge{id=plot-trace title="Plot trace"} -Using PINTS' plotting tools (see: https://github.com/pints-team/pints/blob/master/examples/sampling/adaptive-covariance-haario-bardenet.ipynb) plot pairwise samples from the parameters. Before we do that, let's discard the first half of the chains as these are likely before convergence occurred. +Using PINTS' plotting tools (see: <https://github.com/pints-team/pints/blob/master/examples/sampling/adaptive-covariance-haario-bardenet.ipynb>) plot pairwise samples from the parameters. Before we do that, let's discard the first half of the chains as these are likely before convergence occurred. :::solution @@ -351,7 +341,6 @@ we see that if $\alpha\uparrow$ then $\frac{d u}{dt}\uparrow$. 
So, if $\beta\upa ::: :::: - ::::challenge{id=posterior-predictive title="Posterior predictive"} Using a randomly drawn subset of 100 posterior draws, generate posterior predictive draws of the mean predator and prey numbers of time. Then plot these, overlaying the data. Why do the posterior predictive simulations not encompass all the variation seen in the data? @@ -380,20 +369,19 @@ for i in range(nsamples): Combine datasets. - ```python df['type'] = 'observations' big_df['type'] = 'simulations' big_df1 = pd.concat([df, big_df]) ``` - ```python -(ggplot(big_df1.query('type=="simulations"'), aes(x='t', y='u')) + - geom_line(aes(group='replicate'), alpha=0.2) + - geom_point(big_df1.query('type=="observations"'), - aes(y='u_tilde'), colour='orange')) +(ggplot(big_df1.query('type=="simulations"'), aes(x='t',y='u')) + + geom_line(aes(group='replicate'), alpha=0.2) + + geom_point(big_df1.query('type=="observations"'), + aes(y='u_tilde'), colour='orange')) ``` + ::: :::: diff --git a/scientific_computing/bayesian_inference/index.md b/scientific_computing/bayesian_inference/index.md index 5e89523b..243b6586 100644 --- a/scientific_computing/bayesian_inference/index.md +++ b/scientific_computing/bayesian_inference/index.md @@ -1,22 +1,15 @@ --- id: bayesian_inference name: Bayesian Inference -dependsOn: [ - scientific_computing.ode_solvers, -] -files: [ - 01-AM.md, - 02-PM.md, -] -attribution: -- citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - ---- \ No newline at end of file +dependsOn: [scientific_computing.ode_solvers] +files: [01-AM.md, 02-PM.md] +attribution: + - citation: This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 +--- diff --git a/scientific_computing/essential_maths/01_graphs.md b/scientific_computing/essential_maths/01_graphs.md index 1d587c29..c46aa007 100644 --- a/scientific_computing/essential_maths/01_graphs.md +++ b/scientific_computing/essential_maths/01_graphs.md @@ -1,21 +1,19 @@ --- name: Graphs -dependsOn: [ -] +dependsOn: [] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -30,11 +28,11 @@ The material is still very similar: Terminology: -- $Y$ or $y$ is the *dependent* variable, sometimes called the *ordinate* +- $Y$ or $y$ is the _dependent_ variable, sometimes called the _ordinate_ marked on the vertical axis -- $X$ or $x$ is the *independent* variable, sometimes called the *abscissa* +- $X$ or $x$ is the _independent_ variable, sometimes called the _abscissa_ marked on the horizontal axis -- The dependent variable is said to be graphed *against* the independent +- The dependent variable is said to be graphed _against_ the independent variable Essential Features: @@ -55,18 +53,16 @@ Interpretation: - The intercept of this line on the $y$ axis is given by $y=c$, since at $x=0$, $y = c$ - - The gradient of this line 
(also called its "slope") is given by $$m = {y_2-y_1\over x_2 - x_1}$$ ("change in $y$ divided by change in $x$") - - The intercept of this line on the $x$ axis is given by $x = -{c \over m}$, since at $y=0$ we must have $mx=-c$ ## Graphs of Polynomials -An expression involving higher powers of $x$ is called a *polynomial* in $x$. +An expression involving higher powers of $x$ is called a _polynomial_ in $x$. ### Example @@ -84,11 +80,12 @@ The graph of a polynomial of degree $n$ has at most $n-1$ bends in it. If we wish to test visually whether some data fit a particular relationship, we can transform the data to plot something which should be linear if the relationship holds. -### e.g. Test for *parabolic* shape for data in $(x,y)$: i.e. $y = x^2$ +### e.g. Test for _parabolic_ shape for data in $(x,y)$: i.e. $y = x^2$ - We can plot $Y$ against $X$ where we let $Y=y$ and $X=x^2$. #### First plot the original data + There's a definite curve, and we may suspect the trend is quadratic ![Graph of data with nonlinear trend](fig/01_03_nonlinear_trend.svg) @@ -107,10 +104,8 @@ We next add a trendline through these points which we can use to determine the g - We find $(X,Y)$ lie along a straight line with slope 5 and Y-intercept 87. - - This means that $Y=5X+87$ - - So, $y$ and $x$ can be modelled by the polynomial equation $y=5x^2+87$. ## Example from biosciences @@ -118,29 +113,25 @@ We next add a trendline through these points which we can use to determine the g The rate at which a given enzyme can catalyse a reaction can be dependent upon the substrate concentration: $${1\over V} = {m\over S} + c$$ - where $V$ is the rate of the reaction, $S$ is the substrate concentration and $m$ and $c$ are constants. 
- - We can derive a straight line graph from the above formula by plotting $Y=1/V$ against $X=1/S$ - - It will have gradient $m$ and ordinate intercept $c$ - First, plot the original data which is observations of $V$ given varying $S$: ![Graph of original data](fig/01_06_original_data.svg) -#### Now plot the data nonlinearly +### Now plot the data nonlinearly If the hypothesised relationship holds, plotting $Y=1/V$ against $X=1/S$ should result in a straight line. ![Graph of linear trend](fig/01_07_linear_trend.svg) -#### Calculate the gradient and the intercept +### Calculate the gradient and the intercept We next add a trendline through these points which we can use to determine the gradient and intercept. @@ -152,8 +143,7 @@ We next add a trendline through these points which we can use to determine the g - So, $V$ and $S$ can be modelled by the equation $1/V=3/S+5$. - -### Introductory problems +## Introductory problems ::::challenge{id="01_intro_01" title="Introductory problems 1"} Sketch the following graphs. @@ -178,6 +168,7 @@ y = 3 * x + 5 plt.plot(x, y) ``` + :::: ::::challenge{id="01_intro_02" title="Introductory problems 2"} @@ -187,8 +178,8 @@ For which values of $x$ are the following functions positive? Negative? Zero? 1. $\displaystyle\sin(x)$ 1. $\displaystyle\sin(3x)$ 1. $\displaystyle\frac{2}{x} - \frac{1}{x^2}$ -:::: +:::: ### Main problems @@ -206,6 +197,7 @@ where $A$ and $B$ are positive constants, $V(R)$ is the potential energy, measur 1. What is the physical interpretation of the sign of $V(R)$, and of its slope? 1. What are the dimensions ([Length], [Mass], [Time]) and units of the constants $A$ and $B$? 1. Use Python to plot the graph of $V$ versus $R$ for $A=0.06$ and $B = 0.03$. Remember to add relevant axis labels. Plot on the same graph the line of $V = 0$, so you can verify your answers in 1. and b). 
+ :::: ::::challenge{id="01_main_02" title="Main problems 2"} @@ -215,17 +207,18 @@ Write down expressions for the gradient, $X$-intercept and $Y$-intercept of each 1. $\displaystyle y = \frac{a}{x}$ 1. $\displaystyle y = b - a \sqrt{x}$ 1. $\displaystyle y = \frac{b}{1+ax}$ + :::: ::::challenge{id="01_main_03" title="Main problems 3"} -The *osmotic pressure* of a solution of a protein is related to the concentration of that protein by the equation: +The _osmotic pressure_ of a solution of a protein is related to the concentration of that protein by the equation: $$Z = R\;T\;b$$ where $Z$ is the osmotic pressure in kPa, $T$ is the temperature in Kelvin, $R$ is the gas constant ($R=8.314\;{\rm kPa}\cdot{\rm dm}^3\cdot{\rm mol}^{-1}\cdot{\rm K}^{-1}$) and $b$ is the molarity of the protein (mol. solute per dm$^3$ solution). Plot a suitable graph to determine, as accurately as possible, the molecular mass (take care with units!) of the protein given the following data taken at room temperature (usually taken as 21$^{\circ}$C): | | | | | | | -| ---------------------------------------|:-----:|:-----:|:-----:|:-----:|:-----:| -| Protein Concentration (in g dm$^{-3})$ | 7.3 | 18.4 | 27.6 | 42.1 | 57.4 | +| -------------------------------------- | :---: | :---: | :---: | :---: | :---: | +| Protein Concentration (in g dm$^{-3})$ | 7.3 | 18.4 | 27.6 | 42.1 | 57.4 | | Osmotic Pressure (in kPa) | 0.211 | 0.533 | 0.804 | 1.236 | 1.701 | Hint: compare the function with the equation of a straight line, $y=mx+c$, and think about the relationship between concentration, molar concentration and molecular weight). @@ -233,13 +226,12 @@ Hint: compare the function with the equation of a straight line, $y=mx+c$, and t Use Python to plot the graph and confirm your pen & paper solution. 
:::: - ### Extension problems ::::challenge{id="01_ext_01" title="Extension problems 1"} The rate at which a given enzyme catalyses a reaction is dependent upon the substrate concentration: $$V = \frac{S}{m+cS}$$ -where $V$ is the rate of the reaction, $S$ is the substrate concentration and $m$ and $c$ are *unknown* constants. +where $V$ is the rate of the reaction, $S$ is the substrate concentration and $m$ and $c$ are _unknown_ constants. How can we transform $V$ and $S$ to derive a straight line graph relating them? What will be the gradient and the ordinate intercepts? :::: diff --git a/scientific_computing/essential_maths/02_indices_and_logs.md b/scientific_computing/essential_maths/02_indices_and_logs.md index 69289e46..1be16a8c 100644 --- a/scientific_computing/essential_maths/02_indices_and_logs.md +++ b/scientific_computing/essential_maths/02_indices_and_logs.md @@ -1,22 +1,19 @@ --- name: Indices, logs and exponentials -dependsOn: [ - scientific_computing.essential_maths.01_graphs -] +dependsOn: [scientific_computing.essential_maths.01_graphs] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -113,6 +110,7 @@ where $n$ is a positive integer. ### (5) Zero index Similarly, + > $$\frac{a^3}{a^3} = \frac{a \times a \times a}{a \times a \times a} = 1$$, and by using the division rule, the power should be $3-3=0$. @@ -125,21 +123,18 @@ which is undefined: ### (6) Fractional (rational) indices > $$a^1 = a^{1/2} \times a^{1/2} {\rm ~~~~so~~~~}a^{1/2} = \sqrt{a} $$ - > $$a^{1\over n} = \sqrt[n]{a} $$ - > $$a^{n\over m} = \left(\sqrt[m]{a}\right)^n ~~~~~{\rm or}~~~\sqrt[m]{a^n}$$ - > $$a^{-{n\over m}} = {1\over {\left(\sqrt[m]{a}\right)^n}}~~~{\rm or}~~~~{1\over {\sqrt[m]{a^n}}}$$ ### (7) Different bases, same index > $$(a\cdot b)^x = a^x\cdot b^x\qquad {\rm and}\qquad\Bigl( {a\over b}\Bigr)^x={{\,a^x}\over{\,b^x}}$$ -### (8) Two Cautionary remarks: +### (8) Two Cautionary remarks -1. Powers of sums are **not** pretty: $(a+b)^x \ne a^x + b^x$ -2. Powers of differences are **not** pretty: $(a-b)^x \ne a^x - b^x$ +1. Powers of sums are **not** pretty: $(a+b)^x \ne a^x + b^x$ +2. Powers of differences are **not** pretty: $(a-b)^x \ne a^x - b^x$ A useful formula for products of sums: $(p+q)\cdot(s+t) = ps + pt+qs + qt$ @@ -161,7 +156,7 @@ In general we write $\quad x = \log_a y \quad \Leftrightarrow \quad y=a^x$. The base of a logarithm may be any number. Commonly, logarithms either have **base** 10 or **base** $e$. It is almost always a good idea to explicitly state the base, e.g. 
$\;\log_3 9=2\;$ implies $\;3^2=9\;$. -## Getting a feel for logarithms. +## Getting a feel for logarithms Here's a graph of $y=\log_{10}x$: @@ -178,60 +173,58 @@ Some physical phenomena use log metrics due to their huge dynamic range: ## The laws of logarithms -### (1) Using the same base $a$ for both operations: +### (1) Using the same base $a$ for both operations -- *Taking the logarithm* undoes *raising to a power*: +- _Taking the logarithm_ undoes _raising to a power_: > $$\log_a\,a^r=r$$ -- *Raising to a power* undoes *taking the logarithm*: +- _Raising to a power_ undoes _taking the logarithm_: > $$a^{\log_a\,b}=b$$ -## The laws of logarithms - -### (2) Multiplication. +### (2) Multiplication > $$\log_a (bc) = \log_a b + \log_a c ~~~~~(Add) $$ -### (3) Division. In a similar way to multiplication, +### (3) Division. In a similar way to multiplication > $$\log_a \left({b \over c}\right) = \log_a b - \log_a c ~~~~~(Subtract)$$ -### (4) Powers. +### (4) Powers > $$\log_a b^n = n \log_a b ~~~~~~ (Multiply)$$ -### (5) Changing the base of a logarithm: +### (5) Changing the base of a logarithm > $$\log_a c = {\log_b c\over \log_b a}$$ ### (6) Special case: if $b$ and $c$ are the same, (5) reduces to - > $$ \log_a b ={\log_b b\over \log_b a}={1\over \log_b a}$$ -### (7) The log of any number to the base itself is 1: +### (7) The log of any number to the base itself is 1 > $$\log_a a =1 $$ -### (8) The log of 1 is 0 (unless a=0, in which case it is undefined): +### (8) The log of 1 is 0 (unless a=0, in which case it is undefined) > $$\log_a 1 = 0 \quad{\rm since~~~}\quad a^0=1$$ - -### (9) Inverse operation: +### (9) Inverse operation > $$\log_a a^x = x$$ -### (10) Or, +### (10) Or > $$a^{\log_a x}=x$$ -### (11) Negative logs. +### (11) Negative logs + > $$\log_a {1\over x} = \log_a1-\log_ax= 0 - \log_a x =-\log_ax$$ -### (12) Two cautionary remarks: +### (12) Two cautionary remarks + 1. 
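The manipulation hinted at in Example 3 (rewriting $3$ as $3\log(10) = \log(10^3)$, then combining terms with the multiplication, power and division laws) can be sanity-checked numerically. A minimal sketch, assuming base-10 logarithms; the values of $p$, $q$ and $k$ are arbitrary illustrative choices, not taken from the notes:

```python
import numpy as np

# Arbitrary positive test values (illustrative only)
p, q, k = 7.0, 3.0, 2.0

# log(x) = log(p) + 2 log(q) - log(k) - 3, where 3 = log10(10**3) = log10(1000)
log_x = np.log10(p) + 2 * np.log10(q) - np.log10(k) - 3

# Combining the terms with the log laws gives x = p * q**2 / (1000 * k)
x = p * q**2 / (1000 * k)

print(np.isclose(log_x, np.log10(x)))  # prints True
```

Any other positive choices for $p$, $q$ and $k$ should give the same agreement, since the identity holds term by term.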
$\;\log_a (x + y)\;$ and $\;\log_a (x-y)\;$ cannot be simplified any further, and should be left as they are. 2. Neither can $\;\log_a\,x \cdot \log_a\,y\;$ or $\;\displaystyle{{\log_a\,x}\over {\log_a\,y}}.\;$ Leave them as they are. @@ -240,12 +233,12 @@ Some physical phenomena use log metrics due to their huge dynamic range: Can the data below be fitted to the form: $y=Ax^n$? -| x | y | -|-----|-----| -| 4.0 | 6.0 | -| 16.0| 12.0| -| 25.0| 15.0| -| 64.0| 24.0| +| x | y | +| ---- | ---- | +| 4.0 | 6.0 | +| 16.0 | 12.0 | +| 25.0 | 15.0 | +| 64.0 | 24.0 | ![Graph of data table](fig/02_02_data.svg) @@ -261,40 +254,37 @@ Intercept = $0.48 = \log_{10}A$ so $A=3.0$ Data fit curve of the form: -> $$y=3.0\times x^{1/2}$$ +> $$y=3.0\times x^{1/2}$$ ## Example 2: pH 1. What is the pH of a 0.011M solution of HCl? -> $$pH = -\log_{10}[H^+]$$ + > $$pH = -\log_{10}[H^+]$$ -```python -pH = -np.log10(0.011) -print('pH =',pH) -``` + ```python + import numpy as np + pH = -np.log10(0.011) + print('pH =',pH) + ``` -``` -pH = 1.958607314841775 -``` + ```text + pH = 1.958607314841775 + ``` 2. What is the H$^+$ concentration of a solution of HCl with a pH of 3? -$$pH = -\log [H^+] = 3~~~~{\rm~so~~~}$$ + $$pH = -\log [H^+] = 3~~~~{\rm~so~~~}$$ -``` -[H+] = 0.001 M -``` + ```text + [H+] = 0.001 M + ``` ## Example 3: Simplifying logs Write an expression for $x$ without using logarithms: -> $$\log(x) = \log(p) + 2 \log(q) - \log(k) -3$$ - -> $$\log(x) =~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$$ - -> $$x=~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$$ +> $$\log(x) = \log(p) + 2 \log(q) - \log(k) -3$$ > $$\log(x) =~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$$ > $$x=~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$$ 1. Use the laws of logarithms (above) to put all terms on the right hand side **within** the logarithm. This means we have to rewrite $\;3\;$ as $\;3\log(10)\;$. 
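The simplification in this example can be sanity-checked numerically. In the sketch below, the values of `p`, `q` and `k` are arbitrary (chosen only to exercise the algebra), and the closed form `x = p*q**2 / (1000*k)` is the one implied by combining the log laws with $3 = 3\log(10)$:

```python
import math

# Hypothetical values for p, q and k, purely to test the algebra numerically
p, q, k = 2.0, 3.0, 5.0

# Right-hand side as given: log(p) + 2 log(q) - log(k) - 3  (base 10)
lhs = math.log10(p) + 2 * math.log10(q) - math.log10(k) - 3

# Closed form implied by the log laws: x = p q^2 / (10^3 k)
x = p * q**2 / (1000 * k)

print(math.isclose(lhs, math.log10(x)))  # True
```

Swapping in any other positive values for `p`, `q` and `k` should still print `True`.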
@@ -302,7 +292,7 @@ Write an expression for $x$ without using logarithms: ## The exponential function and the real number e -### Definition of the Real Number $e$: +### Definition of the Real Number $e$ The real number $e$, also known as Euler's number, is that base for which the graph $y=e^x$ passes through the point $\;(0, 1)\;$ with gradient exactly equal to $\;1$. @@ -322,10 +312,10 @@ $e$ has been found to arise in many branches of mathematics. It is also used as Logs to base $e$ are called **natural logarithms**. -## Definition of the natural logarithm: +## Definition of the natural logarithm The **natural logarithm** of a number is the logarithm of that number taken using the base $\;e\;$. -We usually write $\;\ln(x)\;$ for $\log_e(x)$. Here are some examples: +We usually write $\;\ln(x)\;$ for $\log_e(x)$. Here are some examples: - $\ln(e)=\log_e(e)=1$ - $\ln(10)=\log_e(10)$ = "The power I need to raise $e$ to in order to get 10" @@ -334,9 +324,8 @@ We usually write $\;\ln(x)\;$ for $\log_e(x)$. Here are some examples: - $\exp(\ln(b))=e^{\ln(b)}=e^{\log_e b}=b$ Note that examples (d) and (e) confirm the property that -the functions "$\exp$" and "$\ln$" are *functional -inverses* of one another. - +the functions "$\exp$" and "$\ln$" are _functional +inverses_ of one another. ### Introductory problems @@ -347,6 +336,7 @@ Simplify: 1. $\displaystyle \frac{\left(\sqrt {x}\right)^8}{x^4}$ 1. $\displaystyle \frac{y^{1\over 4}}{y^{-{2 \over 4}}}$ 1. $\displaystyle \frac{10^{-2/3} \times 10^7 \times 10^{-16} \times x^{1/2} \times y^4 \times z^{-1/3}}{10^{-19}\times 10^{43} \times 10^{2/3} \times z^{-1/3} \times y^{1/4} \times x^{5/2}}$ + :::: ::::challenge{id="02_intro_02" title="Introductory problems 2"} @@ -356,6 +346,7 @@ Evaluate the following expressions without using a calculator: 1. $\displaystyle 36^{1\over 2}+64^{2\over 3}$ 1. $\displaystyle \left( {1 \over 3}\right)^{-2}$ 1. 
$\displaystyle \left({81 \over 9}\right)^{3 \over 2}$ + :::: ::::challenge{id="02_intro_03" title="Introductory problems 3"} @@ -364,6 +355,7 @@ Express the following in logarithmic form: 1. $\displaystyle 5^3 = 125$ 1. $\displaystyle 8^{-{1\over 3}} = {1 \over 2}$ 1. $\displaystyle x^y = 4$ + :::: ::::challenge{id="02_intro_04" title="Introductory problems 4"} @@ -373,6 +365,7 @@ Evaluate the following expressions without using a calculator: 1. $\displaystyle \log_{\pi}\,(1)$ 1. $\displaystyle \log_{b}\, (b^a)$ 1. $\displaystyle 6^{\log_6\,({\pi})}$ + :::: ::::challenge{id="02_intro_05" title="Introductory problems 5"} @@ -391,6 +384,7 @@ Simplify: 1. $\displaystyle \ln \left(1\over{2e}\right)$ 1. $\displaystyle e^{\ln x^4}$ 1. $\displaystyle e^{3+\ln x}$ + :::: ### Main problems @@ -400,13 +394,14 @@ In an experiment, the mass, $m$ grams, of a reaction product is measured at vari $$m=At^n$$ The results are shown in the table below: -| time (min) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 12 | -|-----------:|:---:|:----:|:----:|:---:|:----:|:---:|:----:|:----:|:---:| +| time (min) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 12 | +| ---------: | :-: | :--: | :--: | :-: | :--: | :-: | :--: | :--: | :-: | | mass (g) | 0.5 | 0.63 | 0.72 | 0.8 | 0.85 | 0.9 | 0.95 | 0.99 | 1.0 | 1. Confirm this postulate graphically. There is one result that does not conform to this law. Which? 1. Find appropriate values of $A$ and $n$, and in this context, explain the meaning of $n$. What are the units of $A$? 1. Explain, with reasons, whether you would use these results to predict the mass when $t=15$. + :::: ::::challenge{id="02_main_02" title="Main problems 2"} @@ -414,6 +409,7 @@ These problems deal with pH: 1. What is the pH of 130ml of a solution containing 4.7mg HCl, assuming that HCl is completely ionised in solution, and its molecular mass is 36.46? 1. What would be the pH if the concentration of HCl were tripled? 
+ :::: ::::challenge{id="02_main_03" title="Main problems 3"} @@ -422,6 +418,7 @@ Express in terms of $\log (a)$, $\log (b)$, $\log (c)$ and $\log (d)$: 1. $\displaystyle \log\left({{b}\over ac}\right)$ 1. $\displaystyle \log (a^2 b c^3 d^4)$ 1. $\displaystyle \log \left(\sqrt {cd \over ab} \right)$ + :::: ::::challenge{id="02_main_04" title="Main problems 4"} @@ -431,6 +428,7 @@ Simplify: 1. $\displaystyle \log\left(x^2-1\right) - \log\left(x^2+1\right)$ 1. $\displaystyle 3\log_a(4) + \log_a(5) - 2\log_a(9)$ 1. $\displaystyle \log\left(x^9\right) - \log\left(x^6\right)$ + :::: ::::challenge{id="02_main_05" title="Main problems 5"} @@ -440,6 +438,7 @@ Consider the equation $\displaystyle\,\log_3(x) + 4\log_x(3) = 5$: 1. Verify that $x=3$ satisfies this equation 1. There is one other value of $x$ that also satisfies this equation. Find it. + :::: ::::challenge{id="02_main_06" title="Main problems 6"} @@ -449,6 +448,7 @@ Solve the following equations for $x$: 1. $\displaystyle 3^{2x+1} - 28\left(3^x\right) + 9 = 0$ 1. $\displaystyle 16 = \log_2 (x)$ 1. $\displaystyle \left(2 \sqrt 3 \log(x)\right)^2 - 7 \log(x^2) + 2 = 0$ + :::: ::::challenge{id="02_main_07" title="Main problems 7"} @@ -456,6 +456,7 @@ Write an expression for $x$ or $y$ without using logarithms: 1. $\displaystyle \log(x) = \log(3r) - 5 \log(s) + 3\log(t) - 3$ 1. $\displaystyle \log(2y) = 5 + 5\log\left(4^3\right) -15\log\left({\frac{2}{x}}\right) - 6\log(y)$ + :::: ::::challenge{id="02_main_08" title="Main problems 8"} @@ -463,12 +464,14 @@ Write $x$ in terms of $y$ for each of the following: 1. $\displaystyle y=2e^{4x}$ 1. $\displaystyle \ln y = 3 + 2\ln x$ + :::: ::::challenge{id="02_main_09" title="Main problems 9"} Express as a sum or difference of logarithms: 1. $\displaystyle \ln\sqrt{\left({x-1\over x+1}\right)}$ + :::: ::::challenge{id="02_main_10" title="Main problems 10"} @@ -476,5 +479,5 @@ Express as a single logarithm: 1. $\displaystyle 1 - \ln 4x$ 1. 
$\displaystyle 3\ln x - {1\over 2} \ln\left(5-x^2\right)$ -:::: +:::: diff --git a/scientific_computing/essential_maths/03_differentiation_1.md b/scientific_computing/essential_maths/03_differentiation_1.md index 2ed34e99..c6668cab 100644 --- a/scientific_computing/essential_maths/03_differentiation_1.md +++ b/scientific_computing/essential_maths/03_differentiation_1.md @@ -1,22 +1,19 @@ --- name: Differentiation 1 -dependsOn: [ - scientific_computing.essential_maths.02_indices_and_logs -] +dependsOn: [scientific_computing.essential_maths.02_indices_and_logs] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -29,8 +26,9 @@ The material is still very similar: ## Gradients -We often want to know about the *rate* at which one quantity changes over time. +We often want to know about the _rate_ at which one quantity changes over time. Examples: + 1. The rate of disappearance of substrate with time in an enzyme reaction. 1. The rate of decay of a radioactive substance (how long will it have activity above a certain level?) 1. The rate of bacterial cell growth over time. @@ -38,10 +36,10 @@ Examples: ### Defining the gradient -* The **gradient of a curve** at a point $P$ is the slope of the tangent of the curve at that point. -* The **tangent** is the line that "just touches" (but doesn't cross) the curve. -* The gradient is also known as the **rate of change** or **derivative**, and the process of finding the gradient is called **differentiation**. -* The gradient of the curve $\;y = f(x)\;$ is denoted in a few different ways, the three most common are: +- The **gradient of a curve** at a point $P$ is the slope of the tangent of the curve at that point. +- The **tangent** is the line that "just touches" (but doesn't cross) the curve. +- The gradient is also known as the **rate of change** or **derivative**, and the process of finding the gradient is called **differentiation**. +- The gradient of the curve $\;y = f(x)\;$ is denoted in a few different ways, the three most common are: $$ y', \quad f'(x), \quad \frac{dy}{dx}. 
$$ @@ -63,13 +61,10 @@ For this function, the gradient is always sloping up to the right, but gets shal ### Algebraic example - If we want to find $y'(x)$ for $y = x^3 + 2$: - $$ \text{Gradient} = \frac{y_2 - y_1}{x_2-x_1} = \frac{\Delta y}{\Delta x}$$ - Try with $x_1 = 1.5,\;1.9,\;1.99,\;\ldots$ @@ -79,20 +74,26 @@ $x_2 = 2.5,\;2.1,\;2.01,\;\ldots$ We can use Python as a calculator to evaluate these differences: ```python -x_1 = 1.5; x_2 = 2.5 -y_1 = x_1**3 + 2; y_2 = x_2**3 + 2 +x_1 = 1.5 +x_2 = 2.5 +y_1 = x_1**3 + 2 +y_2 = x_2**3 + 2 print((y_2-y_1)/(x_2-x_1)) -x_1 = 1.9; x_2 = 2.1 -y_1 = x_1**3 + 2; y_2 = x_2**3 + 2 +x_1 = 1.9 +x_2 = 2.1 +y_1 = x_1**3 + 2 +y_2 = x_2**3 + 2 print((y_2-y_1)/(x_2-x_1)) -x_1 = 1.99; x_2 = 2.01 -y_1 = x_1**3 + 2; y_2 = x_2**3 + 2 +x_1 = 1.99 +x_2 = 2.01 +y_1 = x_1**3 + 2 +y_2 = x_2**3 + 2 print((y_2-y_1)/(x_2-x_1)) ``` -``` +```text 12.25 12.010000000000003 12.00009999999997 @@ -111,7 +112,7 @@ When using the approximation, we denote the changes as $\frac{\Delta y}{\Delta x In this way, $\frac{d}{dx}$ is an operator, acting on $y$. -Note, the $d$s cannot be cancelled out, as they aren't variables, they denote an infinitely small change. +Note, the $d$s cannot be cancelled out, as they aren't variables, they denote an infinitely small change. Notice how, as the finite difference gets smaller and smaller, the approximation to the gradient (the green line) gets closer and closer to the true gradient (the orange line): @@ -123,32 +124,27 @@ Notice how, as the finite difference gets smaller and smaller, the approximation Find the gradient of $y = f(x) = x^3 + 2$. 
-> $\frac{dy}{dx} = \frac{f(x+h) - f(x)}{h}$
-
-> $\frac{dy}{dx} = \frac{(x+h)^3 + 2 - (x^3 + 2)}{h}$
-
-> $\frac{dy}{dx} = \frac{x^3 + 3x^2 h + 3xh^2 + h^3 + 2 - x^3 - 2}{h}$
-
-> $\frac{dy}{dx} = \frac{3x^2h + 3xh^2 + h^3}{h}$
-
-> $\frac{dy}{dx} = 3x^2 + 3xh + h^3$
+> $\frac{dy}{dx} = \frac{f(x+h) - f(x)}{h}$ > $\frac{dy}{dx} = \frac{(x+h)^3 + 2 - (x^3 + 2)}{h}$ > $\frac{dy}{dx} = \frac{x^3 + 3x^2 h + 3xh^2 + h^3 + 2 - x^3 - 2}{h}$ > $\frac{dy}{dx} = \frac{3x^2h + 3xh^2 + h^3}{h}$ > $\frac{dy}{dx} = 3x^2 + 3xh + h^2$

Now this is only exactly right when $h \rightarrow 0$. So letting that happen, we have $\frac{dy}{dx} = 3x^2$

## Derivative of polynomial functions
+
Using techniques like the one above (which is called differentiation from first principles), one can generalise the connection between powers of $x$ and their derivatives:

If $y = a x^n$, then its **derivative** is $\frac{dy}{dx} = y'(x) = a n x^{n-1}$

### Examples to try
+
1. $y = x^4$
2. $y = 7x^5$
3. $y = x^{-2} = \frac{1}{x^2}$
4. $y = \sqrt{1/x} = (1/x)^{1/2} = x^{-1/2}$

## Summing and multiplying derivatives
+
### Summing

> $(f(x) \pm g(x))' = f'(x) \pm g'(x)$

e.g.
@@ -158,6 +154,7 @@

> $y = x^2 + x^3, \quad y' = 2x + 3x^2$

### Multiplying (by a scalar)
+
> $ (a f(x))' = a f'(x)$

e.g.
@@ -173,17 +170,14 @@ e.g.

> $y = x\cdot x = x^2, \quad y' \neq 1$

## Higher-order derivatives
+
You can take a derivative of a function multiple times in a row. This is usually denoted either $y''(x),\;\;f''(x)\;$ or $\;\frac{d^2 y}{dx^2}\;$ for second-order derivatives (differentiating twice), and similar for higher orders.

e.g.
-> $y = x^3$ - -> $y' = 3x^2$ +> $y = x^3$ > $y' = 3x^2$ > $y'' = \frac{d^2 y}{dx^2} = 6 x$ -> $y'' = \frac{d^2 y}{dx^2} = 6 x$ - -## Interpreting derivatives: +## Interpreting derivatives The sign of the first derivative $\;f'(x)\;$ tells us how $\;f(x)\;$ is growing @@ -203,17 +197,11 @@ The sign of the first derivative $\;f'(x)\;$ tells us how $\;f(x)\;$ is growing To do this, we need to know both $\;y'(x)\;$ and $\;y''(x)\;$. -> $y'(x) = 6x^2 - 10x - 4$ - -> $y''(x) = 12x - 10$ +> $y'(x) = 6x^2 - 10x - 4$ > $y''(x) = 12x - 10$ Stationary points occur when $\;y'(x) = 0\;$ -> $6x^2 - 10x - 4 = 0$ - -> $(3x + 1)(2x - 4) = 0$ - -> $x = -1/3,\;2$ +> $6x^2 - 10x - 4 = 0$ > $(3x + 1)(2x - 4) = 0$ > $x = -1/3,\;2$ At $x = -1/3$: @@ -229,9 +217,7 @@ So this point is a **mimimum**. Inflection points occur whenever $y''(x) = 0$ -> $y''(x) = 12x - 10 = 0$ - -> $x = \frac{10}{12} = \frac{5}{6}$ +> $y''(x) = 12x - 10 = 0$ > $x = \frac{10}{12} = \frac{5}{6}$ This is an **inflection point**. @@ -243,16 +229,14 @@ Points of inflection are important in biology as they define conditions where a ## Reminder on curve sketching - - Aim to evaluate and identify key values of the function (i.e. turning points, points of inflection) - - Look at the limit behaviour as $\;x \to \pm \infty\;$ and as $\;x\;$ approaches any points where the function is undefined (e.g. $\;x \to 0\;$ for $\;y = 1/x\;$). - -- Determine the first and second order derivatives to find turning points and points of inflection. +- Determine the first and second order derivatives to find turning points and points of inflection. ## Real life example + The number $n$ (in thousands) of bacteria on an agar plate at time $t$ (in days) is given by the expression: $n = 15.42 + 6t - t^2$ @@ -270,19 +254,20 @@ To do this we must find the turning points of the function. 
- $n'(t) = 6 - 2t$ - $n'(t) = 0 \quad\implies\quad6-2t=0\quad\implies t=3$ - To show this is a maximum, we need to check $n''(t)$ + To show this is a maximum, we need to check $n''(t)$ - $n''(t) = -2$ + $n''(t) = -2$ - Therefore, $n''(t)<0$, for $t = 3$. This means that a maximum occurs at $t = 3$ days. + Therefore, $n''(t)<0$, for $t = 3$. This means that a maximum occurs at $t = 3$ days. 1. Find the number of bacteria on the plate at this time - $n(3) = 15.42 + 6 \times 3 - 3^2 = 24.42$ + $n(3) = 15.42 + 6 \times 3 - 3^2 = 24.42$ - The greatest number of bacteria on the plate is **24,420**. + The greatest number of bacteria on the plate is **24,420**. ## Real life example 2 + The growth rate $R$ of a cell colony with $N$ cells at time $t$ can be represented by the equation $R = \frac{d N}{d t} = kN - bN^2$ @@ -297,40 +282,39 @@ For this example take the constants $k$ and $b$ as $k = 3.8$/hr, and $b = 0.01$/ 1. The equilibrium will occur when the population stops changing, i.e. when $R = 0$. Meaning: - $R = 3.8 N - 0.01 N^2 = 0$ - - $N (3.8 - 0.01 N) = 0$ + $R = 3.8 N - 0.01 N^2 = 0$ + + $N (3.8 - 0.01 N) = 0$ - We can disregard the $N = 0$ solution, as it represents population extinction. This means that - - $N = \frac{3.8}{0.01} = 380$. + We can disregard the $N = 0$ solution, as it represents population extinction. This means that + + $N = \frac{3.8}{0.01} = 380$. 1. To find the largest growth rate, we want the maximal value of $R(N)$. This means we need to find $R'(N) = 0$. - $R(N) = 3.8 N - 0.01 N^2$ - - $R'(N) = 3.8 - 0.02 N$ - - If $R'(N) = 0$ - - $3.8 - 0.02N = 0$ - - $N = 190$ - - Since $R''(N) = -0.02 < 0$, we can be sure that this is a maximum. + $R(N) = 3.8 N - 0.01 N^2$ + + $R'(N) = 3.8 - 0.02 N$ + + If $R'(N) = 0$ + $3.8 - 0.02N = 0$ + $N = 190$ + + Since $R''(N) = -0.02 < 0$, we can be sure that this is a maximum. 
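The numbers in the growth-rate example can be double-checked with a short Python sketch, reusing the analytic derivative $R'(N) = k - 2bN$ from the working above and the constants $k$ and $b$ given in the question:

```python
k, b = 3.8, 0.01  # constants from the example, in /hr and /hr respectively

# Equilibrium: R(N) = N(k - bN) = 0; discarding N = 0 leaves N = k/b
N_eq = k / b
print(N_eq)  # 380.0

# Largest growth rate: R'(N) = k - 2bN = 0  =>  N = k/(2b)
N_max = k / (2 * b)
print(N_max)  # 190.0

# R''(N) = -2b < 0 for all N, confirming N_max is a maximum
print(-2 * b < 0)  # True
```

Both values match the worked answers above (380 cells at equilibrium, maximal growth rate at 190 cells).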
### Introductory problems

::::challenge{id="03_intro_01" title="Introductory problems 1"}
Use the formula $\displaystyle \frac{{\rm d}y}{{\rm d}x}=\lim_{h\rightarrow 0}\left({f(x+h)-f(x)\over h}\right)$ to calculate the derivatives of the functions below.
Check your answers by using the standard rules for differentiation:

1. $\displaystyle y = 3x + 3$
1. $\displaystyle y = 4x^2 - 3x + 2$
1. $\displaystyle y = 2x^3-5$
1. $\displaystyle y=\frac{1}{x^2}\qquad$ (harder)
+
::::

::::challenge{id="03_intro_02" title="Introductory problems 2"}
@@ -338,6 +322,7 @@ Find the gradient at the given points of the following curves:

1. $\displaystyle y = x^3 - 4 \qquad\rm{where}\qquad x = 1$
1. $\displaystyle y = 3x^3 + 4x - 3 \qquad\rm{where}\qquad x = -2$
+
::::

::::challenge{id="03_intro_03" title="Introductory problems 3"}
@@ -347,11 +332,11 @@ Find the $x$ and $y$ coordinates of the points on the given curves at which the

1. $\displaystyle y = x^3 - 4x^2 + 2x - 2$
1. $\displaystyle y = \frac{4x+1}{x}$
1. $\displaystyle y = 16 - 2x^3$
+
::::

### Main problems

-
::::challenge{id="03_main_01" title="Main problems 1"}
One hour after taking $x\,\rm{mg}$ of a drug, the body temperature, $T$, in $^\circ$C of a patient is given by:
$$T=T_0-0.00625~x^2(18-x),$$
@@ -359,6 +344,7 @@ where $T_0$ is the initial body temperature.

1. Determine the value of $x$ that produces the greatest drop in body temperature, and the magnitude of that temperature change.
1. Sketch $T$ as a function of the concentration.
+
::::

::::challenge{id="03_main_02" title="Main problems 2"}
The formula for the Lennard Jones potential between two non polar atoms is given by:
$$V(R)={A\over R^{12}} - {B \over R^6}$$

1.
Use this formula to calculate $\displaystyle \frac{{\rm d}V}{{\rm d}R}$ as a function of $R$. -1. Show that the potential where the gradient is zero is $\displaystyle V(R)=\frac{-B^2}{4A}$. +1. Show that the potential where the gradient is zero is $\displaystyle V(R)=\frac{-B^2}{4A}$. 1. Find mathematically whether this point is a maximum, minimum or point of inflexion. + :::: ::::challenge{id="03_main_03" title="Main problems 3"} @@ -378,6 +365,7 @@ $$n = 21.35 + 1.34t - t^2$$ 1. Calculate the time at which the greatest number of bacteria are present on the plate and show that this must be a maximum number. 1. By finding the roots of the equation for $n$, find the two times at which the value of $n$ is zero and say why only one of these times is physically reasonable. Mark and label the maximum point on your graph together with the point at which the number of bacteria is zero. 1. Find the rates at which the bacteria are growing when $t=0.8$ and $t=3.5$ days. + :::: ::::challenge{id="03_main_04" title="Main problems 4"} @@ -387,14 +375,15 @@ where for rainbow trout $k=300\,{\rm cm}\,{\rm s}^{-1.6}$, $b=1.60$, and for gre 1. Compare the distance travelled, and the instantaneous velocity and acceleration for these species at $t=0.1\,$s. 1. Compare the velocities in fish lengths per second, given that the lengths of trout and sunfish are $14.4\,$cm and $8.0\,$cm respectively, commenting on your answer. 
+
::::

::::challenge{id="03_main_05" title="Main problems 5"}
A researcher measured the concentration $c$ of a protein _in vitro_ and obtained the readings below:

-| time (min) | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
-|-----------:|:-----:|:----:|:----:|:----:|:----:|:----:|:----:|
-| c (mM) | 11.91 | 7.06 | 4.40 | 2.57 | 1.81 | 1.03 | 0.72 |
+| time (min) | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
+| ---------: | :---: | :--: | :--: | :--: | :--: | :--: | :--: |
+| c (mM) | 11.91 | 7.06 | 4.40 | 2.57 | 1.81 | 1.03 | 0.72 |

She surmised that the protein was being degraded according to the reaction scheme
@@ -405,6 +394,7 @@ under mass action kinetics.

1. Use a suitable transformation to draw a straight-line graph of the data, in order to test her hypothesis.
1. Find the maximum rate of decay and the time at which this occurs.
1. Find the concentration of protein remaining after 10 minutes.
+
::::

### Extension problems
@@ -418,8 +408,8 @@ Find the values of $x$ for which the following functions have stationary values

Use Python to check your results.
::::

::::challenge{id="03_ext_02" title="Extension problems 2"}
The graph shows the rate of CO$_{2}$ emissions per year since 1800, with three fitted lines in red.

![CO2 graph](fig/03_07_c02problem.png)

@@ -429,6 +419,7 @@ $\displaystyle y = Ax^{2} + Bx + C$

1. By using the points $(10,3), (106,14), (166,42)$ taken from the best fit lines, evaluate the coefficients $A$, $B$, and $C$ in this model.
1. Find the minima of this quadratic curve. Use this to assess the suitability of the quadratic model as a fit for the data.
+ :::: ::::challenge{id="03_ext_03" title="Extension problems 3"} @@ -441,7 +432,6 @@ where $p$ is the protein concentration, $P_0$ is the initial concentration, $k$ Find the rate for this reaction and deduce a plausible reaction schema. Hint: you may find it useful to express the derivative in terms of $p(t)$. :::: - ::::challenge{id="03_ext_04" title="Extension problems 4"} From first principles (as in the first question on this sheet) prove the formulas for differentiation of sums, differences and scalar multiples. diff --git a/scientific_computing/essential_maths/04_differentiation_2.md b/scientific_computing/essential_maths/04_differentiation_2.md index ae33b505..a9d91b5e 100644 --- a/scientific_computing/essential_maths/04_differentiation_2.md +++ b/scientific_computing/essential_maths/04_differentiation_2.md @@ -1,22 +1,19 @@ --- name: Differentiation 2 -dependsOn: [ - scientific_computing.essential_maths.03_differentiation_1 -] +dependsOn: [scientific_computing.essential_maths.03_differentiation_1] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

----
+---

## YouTube lecture recording from October 2020

@@ -39,41 +36,29 @@ This means that for small $h$, this expression approximates the derivative. By r

> $f(x + h) \approx f(x) + h f'(x)$

-In other words, $f(x+h)$ can be approximated by starting at $f(x)$ and moving a distance $h$ along the tangent $f'(x)$.
-
+In other words, $f(x+h)$ can be approximated by starting at $f(x)$ and moving a distance $h$ along the tangent $f'(x)$.

### Example
+
Estimate $\sqrt{5218}$

To do this, we first need an $f(x)$ that is easy to calculate, and close to 5218. For this we can take that $70^2 = 4900$.

-
To calculate the approximation, we need $\;f'(x)\;$, where $\;f(x) = \sqrt{x}\;$.

-> $$f'(x) = \frac{1}{2 \sqrt{x}}$$
-
-> $$f'(x) = \frac{1}{2 \sqrt{x}}$$
+> $$f'(x) = \frac{1}{2 \sqrt{x}}$$

Since 5218 = 4900 + 318, we can set $x = 4900$ and $h = 318$. Using the approximation, we have:

-> $f(5218) \approx f(4900) + 318 \times f'(4900)$
-
-> $f(5218) \approx 70 + 318 \times \frac{1}{140} \approx 72.27$
-
-> $\sqrt{5218} = 72.2357252$ - not a bad appriximation!
+> $f(5218) \approx f(4900) + 318 \times f'(4900)$ > $f(5218) \approx 70 + 318 \times \frac{1}{140} \approx 72.27$ > $\sqrt{5218} = 72.2357252$ - not a bad approximation!

## Standard derivatives

-It's useful to know the derivatives of all the standard functions, and some basic rules.
-> $\frac{d}{dx} (x^n) = n x^{n-1}$ - -> $\frac{d}{dx} (\sin x) = \cos x$ - -> $\frac{d}{dx} (\cos x) = -\sin x$ +It's useful to know the derivatives of all the standard functions, and some basic rules. -> $\frac{d}{dx} (e^x) = e^x$ +> $\frac{d}{dx} (x^n) = n x^{n-1}$ > $\frac{d}{dx} (\sin x) = \cos x$ > $\frac{d}{dx} (\cos x) = -\sin x$ > $\frac{d}{dx} (e^x) = e^x$ To understand the derivative of sin and cos, consider their graphs, and when they are changing positively (increasing), negatively (decreasing) or not at all (no rate of change). @@ -81,21 +66,19 @@ To understand the derivative of sin and cos, consider their graphs, and when the ## Other Differentiation Rules -## Differentiation of sums, and scalar multiples: +## Differentiation of sums, and scalar multiples > $(f(x) \pm g(x))' = f'(x) \pm g'(x)$ - > $(a f(x))' = a f'(x) $ ## Differentiation of products While differentiating sums, and scalar multiples is straightforward, differentiating products is more complex -> $(f(x) g(x) )' \neq f'(x) g'(x)$ - -> $(f(x) g(x) )' = f'(x) g(x) + g'(x) f(x)$ +> $(f(x) g(x) )' \neq f'(x) g'(x)$ > $(f(x) g(x) )' = f'(x) g(x) + g'(x) f(x)$ ### Example + To illustrate that this works, consider $y = (2x^3 - 1)(3x^3 + 2x)$ If we expand this out, we have that $y = 6x^6 + 4x^4 - 3x^3 - 2x$ @@ -104,31 +87,21 @@ From this, clearly, $y' = 36 x^5 + 16x^3 - 9 x^2 - 2$ To use the product rule, instead we say $y = f \times g$, where $f = 2x^3 - 1$, and $g = 3x^3 + 2x$. Therefore -> $f'(x) = 6x^2$ - -> $g'(x) = 9x^2 + 2$ - -> $y' = f'g + g'f = 6x^2 (3x^3 + 2x) + (9x^2 + 2)(2x^3 - 1)$ - -> $y' = 18x^5 + 12x^3 + 18x^5 + 4x^3 - 9x^2 - 2 = 36x^5 + 16x^3 - 9x^2 - 2$ - -So both rules produce the same result. While for simple examples the product rule requires more work, as functions get more complex it saves a lot of time. 
+> $f'(x) = 6x^2$ > $g'(x) = 9x^2 + 2$ > $y' = f'g + g'f = 6x^2 (3x^3 + 2x) + (9x^2 + 2)(2x^3 - 1)$ > $y' = 18x^5 + 12x^3 + 18x^5 + 4x^3 - 9x^2 - 2 = 36x^5 + 16x^3 - 9x^2 - 2$ +So both rules produce the same result. While for simple examples the product rule requires more work, as functions get more complex it saves a lot of time. ## Differentiating a function of a function - The Chain Rule One of the most useful rules is differentiating a function that has another function inside it $y = f(g(x))$. For this we use the chain rule: -> $y = f(g(x))$ - -> $y'(x) = f'(g(x))\; g'(x) = \frac{df}{dg} \frac{dg}{dx}$ +> $y = f(g(x))$ > $y'(x) = f'(g(x))\; g'(x) = \frac{df}{dg} \frac{dg}{dx}$ ### Example 1: $y = (5x^2 + 2)^4$ -We can write this as $y = g^4$, where $g = 5x^2 + 2$. Given this, we have that -> $\frac{dy}{dg} = 4g^3 = 4(5x^2 + 2)^3$ +We can write this as $y = g^4$, where $g = 5x^2 + 2$. Given this, we have that -> $\frac{dg}{dx} = 10x$ +> $\frac{dy}{dg} = 4g^3 = 4(5x^2 + 2)^3$ > $\frac{dg}{dx} = 10x$ This means that @@ -138,22 +111,19 @@ This extends infinitely to nested functions, meaning $\frac{d}{dx}(a(b(c)) = \frac{d a}{d b} \frac{d}{dx} (b(c)) = \frac{d a}{db} \frac{d b}{dc}\frac{dc}{dx}$ ## Differentiating the ratio of two functions - The Quotient Rule + If $y(x) = \frac{f(x)}{g(x)}$, then by using the product rule, and setting $h(x) = (g(x))^{-1}$, we can show that > $y'(x) = \frac{f'g - g'f}{g^2}$ ### Example -$y = \frac{3x-1}{4x + 2}$ -> $f = 3x - 1, \rightarrow f' = 3$ - -> $g = 4x + 2, \rightarrow g' = 4$ - -> $y' = \frac{f'g - g'f}{g^2} = \frac{3(4x+2) - 4(3x-1)}{(4x+2)^2}$ +$y = \frac{3x-1}{4x + 2}$ -> $y' = \frac{12x + 6 - 12 x + 4}{(4x+2)^2} = \frac{10}{(4x+2)^2}$ +> $f = 3x - 1, \rightarrow f' = 3$ > $g = 4x + 2, \rightarrow g' = 4$ > $y' = \frac{f'g - g'f}{g^2} = \frac{3(4x+2) - 4(3x-1)}{(4x+2)^2}$ > $y' = \frac{12x + 6 - 12 x + 4}{(4x+2)^2} = \frac{10}{(4x+2)^2}$ ## Differentiating inverses - implicit differentiation + For any function $y = 
f(x)$, with a well defined inverse $f^{-1}(x)$ (not to be confused with $(f(x))^{-1})$), we have by definition that > $x = f^{-1}(f(x)) = f^{-1}(y)$. @@ -166,7 +136,8 @@ But since $\frac{d}{dx}(x) = 1$ > $\frac{d}{dy}(f^{-1}(y)) = \frac{1}{\frac{dy}{dx}}$ -### Example: $y = ln(x)$ +### Example: $y = ln(x)$ + If $y = ln(x)$, this means that $f^{-1}(y) = e^y = x$ By definition ($f^{-1}(y))' = e^y$, as $e^y$ doesn't change under differentiation. This means that @@ -177,50 +148,54 @@ But since $y = ln(x)$: > $\frac{d}{dx}(ln(x)) = \frac{1}{e^{ln(x)}} = \frac{1}{x}$ - -### Example - Differentiating using sympy. +### Example - Differentiating using sympy In Python, there is a special package for calculating derivatives symbolically, called sympy. -This can quickly and easily calculate derivatives (as well as do all sorts of other analytic calculations). +This can quickly and easily calculate derivatives (as well as do all sorts of other analytic calculations). ```python import sympy as sp - -x = sp.symbols('x') #This creates a variable x, which is symbolically represented as the string x. + +x = sp.symbols('x') # This creates a variable x, which is symbolically represented as the string x. 
# Calculate the derivative of x^2 sp.diff(x**2, x) ``` + $\displaystyle 2 x$ ```python sp.diff(sp.cos(x), x) ``` + $\displaystyle - \sin{\left(x \right)}$ ```python f = (x+1)**3 * sp.cos(x**2 - 5) -sp.diff(f,x) +sp.diff(f,x) ``` + $\displaystyle - 2 x \left(x + 1\right)^{3} \sin{\left(x^{2} - 5 \right)} + 3 \left(x + 1\right)^{2} \cos{\left(x^{2} - 5 \right)}$ ```python f = (x+1)**3 * (x-2)**2 * (x**2 + 4*x + 1)**4 sp.diff(f, x) ``` + $\displaystyle \left(x - 2\right)^{2} \left(x + 1\right)^{3} \cdot \left(8 x + 16\right) \left(x^{2} + 4 x + 1\right)^{3} + 3 \left(x - 2\right)^{2} \left(x + 1\right)^{2} \left(x^{2} + 4 x + 1\right)^{4} + \left(x + 1\right)^{3} \cdot \left(2 x - 4\right) \left(x^{2} + 4 x + 1\right)^{4}$ ```python sp.expand(sp.diff(f, x)) # expand out in polynomial form ``` + $\displaystyle 13 x^{12} + 180 x^{11} + 869 x^{10} + 1250 x^{9} - 2934 x^{8} - 11504 x^{7} - 9142 x^{6} + 10092 x^{5} + 23185 x^{4} + 17068 x^{3} + 6081 x^{2} + 1058 x + 72$ ### Sympy documentation You can look at the documentation for Sympy to see many other possibilities (e.g. we will use Sympy to do symbolic integration later on in this course) -- https://docs.sympy.org/latest/index.html +- Try out Sympy to verify your pen & paper answers to the problem sheets. @@ -237,6 +212,7 @@ Differentiate the following functions, using the stated rules where indicated: 1. Chain rule: $\displaystyle y=({5x^3+10})^{1/2}$ 1. Any rules: $\displaystyle y=\sqrt{{1\over{1-3x}}}$ 1. Any rules: $\displaystyle y=(3x+1)\sqrt{5x^2-x}$ + :::: ::::challenge{id="04_intro_02" title="Introductory problems 2"} @@ -246,21 +222,22 @@ Differentiate the following trigonometric functions with respect to $x$: 1. $\displaystyle \cos(3-2x)$ 1. $\displaystyle \sqrt{\sin(x)}$ 1. $\displaystyle \frac{\cos(x)}{4x}$ + :::: ### Main problems - ::::challenge{id="04_main_01" title="Main problems 1"} Let $y=x^2$. 1. Find the **exact** value of $y$ when $x=2.1$. 1. 
Now **estimate** the value of $y$ when $x=2.1$ by using the linear approximation formula
-$$f(x_1+h)=f(x_1)+hf'(x_1)$$
-and letting $x_1=2.0$ and $h=0.1$.
+ $$f(x_1+h)=f(x_1)+hf'(x_1)$$
+ and letting $x_1=2.0$ and $h=0.1$.
1. Compare your estimate to the true value. Which is bigger? What is there about the shape of the graph of $y=x^2$ that accounts for this?
1. Repeat parts 1 and 2 for $x=2.01$.
1. Calculate the absolute error in each estimate. How does this error change with the value of $h$?
+
::::

::::challenge{id="04_main_02" title="Main problems 2"}
@@ -274,6 +251,7 @@ where $h$ is Planck's constant ($h=6.63\times 10^{-34}\,\rm{Js}$) and $c$ is the
1. Sketch a graph to show how $E$ varies with $\lambda$.
1. Derive an expression for $\displaystyle \frac{{\rm d}E}{{\rm d}\lambda}$, and calculate the slope of your graph when $\lambda=500\,\rm{nm}$.
1. Hence, or otherwise, estimate the difference in energy between a photon of wavelength $500\,\rm{nm}$ and one of wavelength $505\,\rm{nm}$.
+
::::

::::challenge{id="04_main_03" title="Main problems 3"}
@@ -286,6 +264,7 @@ where $t$ is the number of hours since midnight and $y$ is measured in feet.
1. Sketch $y$.
1. Find $\displaystyle \frac{{\rm d}y}{{\rm d}t}$. What does it represent, in terms of water level?
1. For $0\le t \le 24$, when is $\displaystyle \frac{{\rm d}y}{{\rm d}t}$ zero? Explain what it means for $\displaystyle \frac{{\rm d}y}{{\rm d}t}$ to be zero.
+ :::: ::::challenge{id="04_main_04" title="Main problems 4"} @@ -314,17 +293,13 @@ Prove that the rate of increase of salinity with distance from the inlet is give The sine and cosine functions can be written in the form of the following infinite series: -> $$\sin x = x - {x^3\over3!} + {x^5\over5!} - {x^7\over7!} + \ldots$$ - -> $$\cos x = 1 - {x^2\over2!} + {x^4\over4!} - {x^6\over6!} + \ldots$$ +> $$\sin x = x - {x^3\over3!} + {x^5\over5!} - {x^7\over7!} + \ldots$$ > $$\cos x = 1 - {x^2\over2!} + {x^4\over4!} - {x^6\over6!} + \ldots$$ Differentiate these series term by term to verify the standard expressions for $\displaystyle\frac{{\rm d}}{{\rm d}x}(\sin x)$ and $\displaystyle\frac{{\rm d}}{{\rm d}x}(\cos x)$. :::: - ### Extension problems - ::::challenge{id="04_ext_01" title="Extension problems 1"} The focal length of the lens of the eye, $f(t)$, can be controlled so that an object at distance $u(t)$ in front of the eye can be brought to perfect focus on the retina at a constant $v = 1.8\,\rm{cm}$ behind the lens. @@ -350,6 +325,7 @@ where $N(t)$ is the number of bacteria in $1\,\rm{ml}$ of water. 1. State with a reason whether the count of bacteria increases or decreases during the period $0\leq t \leq 10$. 1. Find the minimum value of $\displaystyle \frac{{\rm d}N}{{\rm d}t}$ during this period. + :::: ::::challenge{id="04_ext_03" title="Extension problems 3"} @@ -363,5 +339,5 @@ where $a$, $b$, $c$ and $d$ are positive constants. 1. What values of $\displaystyle \frac{{\rm d}Z}{{\rm d}t}$ and $\displaystyle \frac{{\rm d}L}{{\rm d}t}$ correspond to stable populations? 1. How would the statement 'zebra go extinct' be represented mathematically? 1. For parameters $a=0.05$, $b=0.001$, $c=0.05$, and $d=0.00001$, find all population pairs $(Z,L)$ that yield stable populations. Is extinction inevitable? 
-:::: +:::: diff --git a/scientific_computing/essential_maths/05_differentiation_3.md b/scientific_computing/essential_maths/05_differentiation_3.md index 120b4313..4aac6736 100644 --- a/scientific_computing/essential_maths/05_differentiation_3.md +++ b/scientific_computing/essential_maths/05_differentiation_3.md @@ -1,22 +1,19 @@ --- name: Differentiation 3 -dependsOn: [ - scientific_computing.essential_maths.04_differentiation_2 -] +dependsOn: [scientific_computing.essential_maths.04_differentiation_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk
+ image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+ license: CC-BY-4.0
+ - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+ url: https://www.universe-hpc.ac.uk
+ image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+ license: CC-BY-4.0
---
----
+---

## YouTube lecture recording from October 2020
@@ -29,9 +26,10 @@ The material is still very similar:
## Exponentials and Partial Differentiation
-## Examples of applying chain rule to the exponential function.
+## Examples of applying chain rule to the exponential function

1. $\displaystyle y=e^{-ax}$
+
- Let $$\displaystyle u=-ax\Rightarrow\frac{{\rm d}u}{{\rm d}x}=-a$$.
- Thus $\displaystyle y=e^u$ and
- $$\displaystyle \frac{{\rm d}y}{{\rm d}u}=e^u~~\Rightarrow~~\frac{{\rm d}y}{{\rm d}x}=\frac{{\rm d}y}{{\rm d}u}\times\frac{{\rm d}u}{{\rm d}x}=e^u\times(-a)=-ae^{-ax}.$$
@@ -44,8 +42,7 @@ So an important generalization is:
> $$\displaystyle \frac{{\rm d}}{{\rm d}x}e^{f(x)}=e^{f(x)}f'(x)$$ for any function $f(x)$
-
-## Example with the natural logarithm.
+## Example with the natural logarithm

1. $\displaystyle y=\ln(a-x)^2=2\ln(a-x)=2\ln u$.
- Let $\displaystyle u=(a-x)$:
@@ -55,8 +52,7 @@ This also generalises:
> $$\displaystyle \frac{{\rm d}}{{\rm d}x}\ln(f(x)) = {f'(x)\over f(x)}$$
-
-## The Derivative of $a^x$:
+## The Derivative of $a^x$

By the properties of logarithms and indices we have
@@ -70,19 +66,20 @@ Similarly, in general:
> $$\displaystyle \frac{{\rm d}}{{\rm d}x}a^{f(x)} = a^{f(x)}\cdot \ln a\cdot f'(x)$$
-## Sympy Example
+### Sympy Example

Let's try and use Sympy to demonstrate this:

```python
+import sympy as sp
x, a = sp.symbols('x a') # declare the variables x and a
f = sp.Function('f') # declare a function dependent on another variable
sp.diff(a**f(x),x) # write the expression we wish to evaluate
```
-> $\displaystyle a^{f{\left(x \right)}} \log{\left(a \right)} \frac{d}{d x} f{\left(x \right)}$
+> $\displaystyle a^{f{\left(x \right)}} \log{\left(a \right)} \frac{d}{d x} f{\left(x \right)}$
-## The Derivative of $\displaystyle \log_a x\,\,$:
+## The Derivative of $\displaystyle \log_a x\,\,$

Recall the conversion formula $\displaystyle \log_a x = {{\ln x}\over {\ln a}}$ and note that $\ln a$ is a constant.
@@ -94,30 +91,34 @@ In general:
> $$\displaystyle \frac{{\rm d}}{{\rm d}x}\log_a f(x) = {{f'(x)} \over {f(x){(\ln a)}}}$$
-## Sympy Example
+### Sympy Example

Again, let's use SymPy to demonstrate this:

```python
x, a = sp.symbols('x a') # declare the variables x and a
f = sp.Function('f') # declare a function dependent on another variable
-sp.diff(sp.log(f(x),1.,x) # write the expression we wish to evaluate
+sp.diff(sp.log(f(x),a),x) # write the expression we wish to evaluate
```
+
> $$\displaystyle \frac{\frac{d}{d x} f{\left(x \right)}}{f{\left(x \right)} \log{\left(a \right)}}$$
+## Further examples
-## Further examples:
+1. Product Rule: Let $\displaystyle y = x^2\,e^x$. Then:
-1. Product Rule: Let $\displaystyle y = x^2\,e^x$. Then:
> $$\displaystyle {{dy\over dx}}={d\over dx}x^2e^x={d\over dx}x^2\cdot e^x+x^2\cdot{d\over dx}e^x = (2x + x^2)e^x$$
Quotient Rule: Let $\displaystyle y = {{e^x}\over x}$. Then: +1. Quotient Rule: Let $\displaystyle y = {{e^x}\over x}$. Then: + > $$\displaystyle {{dy\over dx}}={{{{d\over dx}e^x}\cdot x - e^x\cdot {d\over dx}x}\over {x^2}}={{e^x\cdot x - e^x\cdot 1\over {x^2}}}={{x - 1}\over x^2}e^x$$ -1. Chain Rule: $\displaystyle y = e^{x^2}$. Then, letting $\displaystyle f(x) = x^2$: +1. Chain Rule: $\displaystyle y = e^{x^2}$. Then, letting $\displaystyle f(x) = x^2$: + > $$\displaystyle \frac{{\rm d}}{{\rm d}x}e^{x^2} = e^{f(x)}f'(x) = e^{x^2}\cdot 2x$$ -1. $\displaystyle y=\ln (x^2 + 1)$. Then, letting $f(x) = x^2+1$: +1. $\displaystyle y=\ln (x^2 + 1)$. Then, letting $f(x) = x^2+1$: + > $$\displaystyle \frac{{\rm d}}{{\rm d}x}\ln(x^2+1) = {f'(x)\over f(x)} = {2x\over {x^2+1}}$$ 1. $\displaystyle {{\rm d}\over {\rm d}x}2^{x^3}=2^{x^3}\cdot\ln 2\cdot 3x^2$ @@ -130,7 +131,7 @@ sp.diff(sp.log(f(x),1.,x) # write the expression we wish to evaluate ## Functions of several variables: Partial Differentiation -**Definition:** Given a function $z=f(x,y)$ of two variables $x$ and $y$, the **partial derivative of $z$ with respect to $x$** is the function obtained by differentiating $f(x,y)$ with respect to $x$, holding $y$ constant. +**Definition:** Given a function $z=f(x,y)$ of two variables $x$ and $y$, the **partial derivative of $z$ with respect to $x$** is the function obtained by differentiating $f(x,y)$ with respect to $x$, holding $y$ constant. 
We denote this using $\partial$ (the "curly" delta, sometimes pronounced "del") as shown below: @@ -138,30 +139,19 @@ We denote this using $\partial$ (the "curly" delta, sometimes pronounced "del") ## Example 1 -> $\displaystyle f(x,y)=z=x^2-2y^2$ - -> $$\displaystyle f_x={\partial z\over \partial x}=2x\qquad\rm{and}\qquad f_y={\partial z\over \partial y}=-4y$$ +> $\displaystyle f(x,y)=z=x^2-2y^2$ > $$\displaystyle f_x={\partial z\over \partial x}=2x\qquad\rm{and}\qquad f_y={\partial z\over \partial y}=-4y$$ ## Example 2 Let $\displaystyle z=3x^2y+5xy^2$. Then the partial derivative of $z$ with respect to $x$, holding $y$ fixed, is: -> $$\displaystyle \frac{\partial z}{\partial x}=\frac{\partial}{\partial x}\,\left(3x^2y+5xy^2\right)$$ - -> $$\displaystyle \qquad =3y\cdot 2x + 5y^2\cdot 1$$ - -> $$\displaystyle \qquad =6xy+5y^2$$ - +> $$\displaystyle \frac{\partial z}{\partial x}=\frac{\partial}{\partial x}\,\left(3x^2y+5xy^2\right)$$ > $$\displaystyle \qquad =3y\cdot 2x + 5y^2\cdot 1$$ > $$\displaystyle \qquad =6xy+5y^2$$ while the partial of $z$ with respect to $y$ holding $x$ fixed is: +> $$\displaystyle \frac{\partial z}{\partial y}=\frac{\partial}{\partial y}\,\left(3x^2y+5xy^2\right)\,$$ > $$\displaystyle \qquad =3x^2\cdot 1 + 5x\cdot 2y = 3x^2+10xy$$ -> $$\displaystyle \frac{\partial z}{\partial y}=\frac{\partial}{\partial y}\,\left(3x^2y+5xy^2\right)\,$$ - -> $$\displaystyle \qquad =3x^2\cdot 1 + 5x\cdot 2y = 3x^2+10xy$$ - - -## Sympy example +### Sympy example In the previous slide we had: @@ -173,9 +163,10 @@ Let's redo this in Sympy: x, y = sp.symbols('x y') sp.diff(3*x**2*y + 5*x*y**2,x) ``` + $\displaystyle 6 x y + 5 y^{2}$ -## Higher-Order Partial Derivatives: +## Higher-Order Partial Derivatives Given $z = f(x,y)$ there are now four distinct possibilities for the second-order partial derivatives. @@ -196,8 +187,7 @@ second-order partial derivatives. 
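The four second-order partials listed above can be checked directly in SymPy. A short sketch, using the function $z=3x^2y+5xy^2$ from Example 2 (the variable names here are my own choice):

```python
import sympy as sp

x, y = sp.symbols('x y')
z = 3*x**2*y + 5*x*y**2          # the function from Example 2 above

z_xx = sp.diff(z, x, x)           # differentiate twice with respect to x
z_yy = sp.diff(z, y, y)           # differentiate twice with respect to y
z_xy = sp.diff(sp.diff(z, x), y)  # first x, then y
z_yx = sp.diff(sp.diff(z, y), x)  # first y, then x

print(z_xx, z_yy, z_xy, z_yx)     # 6*y 10*x 6*x + 10*y 6*x + 10*y
```

Note that the two mixed partials come out equal, which the note on mixed partials later in this lesson explains.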
$$\displaystyle \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) =\frac{\partial^2z}{\partial x\partial y} =z_{yx}$$
-
-## Example: LaPlace's equation for equilibrium temperature distribution on a copper plate.
+## Example: Laplace's equation for equilibrium temperature distribution on a copper plate

Let $\displaystyle T(x,y)$ give the temperature at the point $\displaystyle (x,y)$.
@@ -221,19 +211,15 @@ Finally:
which proves the result.
-The function $\displaystyle z=x^2y - xy^2$ does *not* satisfy LaPlace's equation (and so cannot be a model for thermal equilibrium).
+The function $\displaystyle z=x^2y - xy^2$ does _not_ satisfy Laplace's equation (and so cannot be a model for thermal equilibrium).

First note that
-> $$\displaystyle z_x = 2xy - y^2$$
-
-> $$\displaystyle z_{xx}=2y$$
+> $$\displaystyle z_x = 2xy - y^2$$ > $$\displaystyle z_{xx}=2y$$

and that
-> $$\displaystyle z_y = x^2 - 2xy$$
-
-> $$\displaystyle z_{yy} =-2x$$
+> $$\displaystyle z_y = x^2 - 2xy$$ > $$\displaystyle z_{yy} =-2x$$

Therefore:
@@ -245,6 +231,7 @@ We can also verify this in Sympy like so:
T1 = y**2 - x**2
sp.diff(T1, x, x) + sp.diff(T1, y, y)
```
+
$\displaystyle 0$

and for the second function:
@@ -253,15 +240,16 @@ and for the second function:
T2 = x**2*y - x*y**2
sp.diff(T2, x, x) + sp.diff(T2, y, y)
```
+
$\displaystyle - 2 x + 2 y$
-## A Note on the Mixed Partials $f_{xy}$ and $f_{yx}$:
+## A Note on the Mixed Partials $f_{xy}$ and $f_{yx}$

If all of the partials of $\displaystyle f(x,y)$ exist and are continuous, then $\displaystyle f_{xy}=f_{yx}$ for all $\displaystyle (x,y)$.
-### Example:
+### Example
-Let $\displaystyle z = x^2y^3+3x^2-2y^4$. Then $\displaystyle z_x=2xy^3+6x$ and $\displaystyle z_y = 3x^2y^2-8y^3$.
+Let $\displaystyle z = x^2y^3+3x^2-2y^4$. Then $\displaystyle z_x=2xy^3+6x$ and $\displaystyle z_y = 3x^2y^2-8y^3$.
Taking the partial of $\displaystyle z_x$ with respect to $\displaystyle y$ we get @@ -275,7 +263,6 @@ So the operators $\displaystyle {\partial \over \partial x}$ and $\displaystyle > $$\displaystyle {\rm~i.e.~~~~}~{\partial\over \partial x}\biggr({\partial z\over \partial y}\biggl)~~~~={\partial\over \partial y}\biggr({\partial z\over \partial x}\biggl)$$ - ### Introductory problems ::::challenge{id="05_intro_01" title="Introductory problems 1"} @@ -289,6 +276,7 @@ Differentiate the following functions with respect to $x$, using the stated rule 1. Any rules: $\displaystyle \frac{\ln x}{5x-7}$ 1. Any rules: $\displaystyle \frac{e^x}{2x^3-1}$ 1. Any rules: $\displaystyle \log_2\left(x\cos(x)\right)$ + :::: ::::challenge{id="05_intro_02" title="Introductory problems 2"} @@ -320,7 +308,6 @@ where $t$ represents the number of days since infection and $r$ is the percentag Determine an expression for the rate of recovery. :::: - ::::challenge{id="05_main_03" title="Main problems 3"} An experiment called 'the reptilian drag race' looks at how agamid lizards accelerate from a standing start. @@ -332,6 +319,7 @@ where $v_{\rm max}$ is the maximum velocity, and $k$ is a rate constant. 1. Find expressions for the velocity $v$ and acceleration $a$ as functions of time. 1. For $v_{\rm max} = 3\,$m$\,$s$^{-1}$ and $k=10\,$s$^{-1}$, sketch $x$, $v$ and $a/10$ on the same axes for $0\leq t\leq 1\,$s. + :::: ::::challenge{id="05_main_04" title="Main problems 4"} @@ -344,8 +332,8 @@ where $k$ is a positive constant and $0\leq t\leq 1\,$s. 1. Sketch $x$ over time for $k={1\over 2}$ and $k=3$. 1. Calculate an expression for the organism's velocity as a function of time. 1. What is the largest value of $k$ such that the organism never starts moving back towards where it started? -:::: +:::: ::::challenge{id="05_main_05" title="Main problems 5"} The function @@ -364,30 +352,30 @@ Let $\displaystyle \quad z = {2\over 3}x^3 - {3\over 4}x^2y + {2\over 5}y^3$. 1. 
Find $\displaystyle z_x$ and $\displaystyle z_y$
1. Find $\displaystyle z_{xx}$ and $\displaystyle z_{yy}$
1. Show that $\displaystyle z_{xy} = z_{yx}$
-::::
+::::
-::::challenge{id="05_ext_02" title="Extension problems 2"}
-Show that $f_{xy}=f_{yx}$ for the following functions:
+::::challenge{id="05_ext_02" title="Extension problems 2"}
+Show that $f_{xy}=f_{yx}$ for the following functions:

1. $\displaystyle f(x,y) = x^2 - xy + y^3$
1. $\displaystyle f(x,y) = e^y\,\ln(2x-y)$
1. $\displaystyle f(x,y) = 2\,x\,y\,e^{2xy}$
1. $\displaystyle f(x,y) = x\,\sin(y)$
-::::
+::::

::::challenge{id="05_ext_03" title="Extension problems 3"}
The body mass index, $B$, is used as a parameter to classify people as underweight, normal, overweight and obese.
-It is defined as their weight in kg, $w$, divided by the square of their height in meters, $h$.
+It is defined as their weight in kg, $w$, divided by the square of their height in meters, $h$.

1. Sketch a graph of $B$ against $w$ for a person who is 1.7m tall.
1. Find the rate of change of $B$ with weight of this person.
1. Sketch a graph of $B$ against $h$ for a child whose weight is constant at 35 kg.
1. Find the rate of change of $B$ with height $h$ of this child.
1. Show that $\displaystyle \left({\partial^2 B\over \partial h \partial w}\right)=\left({\partial^2 B\over \partial w \partial h}\right)$.
-::::
+::::

::::challenge{id="05_ext_04" title="Extension problems 4"}
A light wave or a sound wave propagated through time and space can be represented in a simplified form by:
@@ -404,4 +392,5 @@ An understanding of this function is essential for many problems such as sound,
1. At what values of $x$ and $t$ does the function repeat itself?
1. Find the rate at which $y$ changes at an arbitrary fixed position.
1. Show that $\displaystyle y_{xt} = y_{tx}$.
+ :::: diff --git a/scientific_computing/essential_maths/06_integration_1.md b/scientific_computing/essential_maths/06_integration_1.md index 867be200..555c1de0 100644 --- a/scientific_computing/essential_maths/06_integration_1.md +++ b/scientific_computing/essential_maths/06_integration_1.md @@ -1,22 +1,19 @@ --- name: Integration 1 -dependsOn: [ - scientific_computing.essential_maths.05_differentiation_3 -] +dependsOn: [scientific_computing.essential_maths.05_differentiation_3] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -29,10 +26,9 @@ The material is still very similar: ## Introduction and Integration by Substitution - ## Integration -**Calculating the area under a curve** +**Calculating the area under a curve**: If we want to find the area under a curve, we can divide the area into strips, calculate the area of each strip, and sum these areas together. @@ -61,12 +57,10 @@ But, as the size of each rectangle reduces, we converge the true area under the ![Fine approximation to the integral](fig/06_03_fine.svg) - $\displaystyle \int$ is the old English `S` and stands for the phrase "Sum Over". This process is called **integration**. - ## Calculating the integral Let us 'invent' a function $F(x)$ that gives the area from $0$ to $x$. @@ -78,10 +72,7 @@ With this (imagined) function, we can find the area of one of our tiny steps **e Remember we approximated this as $\displaystyle y\,\delta x = f(x)\,\delta x$, so: -> $$\displaystyle F(x + \delta x) - F(x) \approx f(x)\delta x$$ - -> $$\displaystyle f(x) \approx {F(x+\delta x)-F(x)\over\delta x}$$ - +> $$\displaystyle F(x + \delta x) - F(x) \approx f(x)\delta x$$ > $$\displaystyle f(x) \approx {F(x+\delta x)-F(x)\over\delta x}$$ The error in this approximation tends to $0$ as $\displaystyle \delta x\to 0$, so @@ -95,11 +86,10 @@ In other words, for our example, **Integration reverses the process of differentiation**. 
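This inverse relationship is easy to check with SymPy: integrate a function, then differentiate the result. A minimal sketch (the integrand here is an arbitrary choice of mine):

```python
import sympy as sp

x = sp.symbols('x')
f = 3*x**2 + sp.cos(x)     # an arbitrary example integrand

F = sp.integrate(f, x)     # an antiderivative F(x); SymPy omits the constant
assert sp.diff(F, x) == f  # differentiating F recovers f
```

SymPy returns one particular antiderivative and drops the arbitrary constant, which in any case disappears again under differentiation.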
- - ## Calculating integrals We know that: + > $$\displaystyle {{\rm d} \left(x^2\right)\over {\rm d}x}=~2 x~~~~~\Rightarrow~~~~~\biggr[x^2\biggl]_{x_1}^{x_2} = \int_{x_1}^{x_2} 2 x~{\rm d}x $$ Likewise: @@ -110,15 +100,16 @@ Likewise: > $$\displaystyle {{\rm d}\left(a\thinspace x^{n+1}\right)\over {\rm d}x}=(n+1)a\thinspace~x^n~~~~~\Rightarrow ~~~~\int_{x_1}^{x_2} a\thinspace x^n~{\rm d}x = \biggr[{a\over(n+1)} \thinspace x^{(n+1)}\biggl]_{x_1}^{x_2}$$ - ## SymPy examples $$\displaystyle \int_{x_1}^{x_2} x^6~{\rm d}x =$$ ```python +import sympy as sp x, x1, x2 = sp.symbols('x x_1 x_2') sp.integrate(x**6,(x,x1,x2)) ``` + > $\displaystyle - \frac{x_{1}^{7}}{7} + \frac{x_{2}^{7}}{7}$ --- @@ -128,6 +119,7 @@ $$\displaystyle \int_{x_1}^{x_2} x^{3\over 2}~{\rm d}x =$$ ```python sp.integrate(x**(sp.sympify(3)/2),(x,x1,x2)) ``` + > $\displaystyle - \frac{2 x_{1}^{\frac{5}{2}}}{5} + \frac{2 x_{2}^{\frac{5}{2}}}{5}$ --- @@ -137,21 +129,21 @@ $$\displaystyle \int_{x_1}^{x_2} x^{-{1\over 2}}~{\rm d}x = $$ ```python sp.integrate(x**(-sp.sympify(1)/2),(x,x1,x2)) ``` -> $\displaystyle - 2 \sqrt{x_{1}} + 2 \sqrt{x_{2}}$ +> $\displaystyle - 2 \sqrt{x_{1}} + 2 \sqrt{x_{2}}$ ## Indefinite integrals Consider now the functions: -> $$\displaystyle y=x^2+7 ~~~~~~~~~~{\rm (3)}$$ - -> $$\displaystyle y=x^2-100 ~~~~~~~~~{\rm (4)}$$ +> $$\displaystyle y=x^2+7 ~~~~~~~~~~{\rm (3)}$$ > $$\displaystyle y=x^2-100 ~~~~~~~~~{\rm (4)}$$ Differentiating (3): + > $$\displaystyle {{\rm d}y\over {\rm d}x}=2x$$ Differentiating (4): + > $$\displaystyle {{\rm d}y\over {\rm d}x}=2x$$ This implies that the integral: @@ -160,12 +152,13 @@ This implies that the integral: An integral without limits is called an **indefinite integral**. 
-## Indefinite integrals in SymPy: +## Indefinite integrals in SymPy ```python x = sp.symbols('x') sp.integrate(2*x,x) ``` + > $\displaystyle x^{2}$ ## Other integrals @@ -174,7 +167,6 @@ Recall that: > $$\displaystyle {{\rm d}\over {\rm d}x}\ln x={1\over x}$$ - And, that: > $$\displaystyle {{\rm d}\over {\rm d}x}\ln x={1\over x}~~~~~~~~\Rightarrow~~\int{1\over x}~{\rm d}x=\ln x + \kappa$$ @@ -194,37 +186,14 @@ Recall that: > $$\displaystyle {{\rm d}\over {\rm d}x}\biggr(\sin x\biggl)= \cos x~~~~~~~~~~~~~{\rm and}~~~~~~~~~~~ {{\rm d}\over {\rm d}x}\biggr(\cos x\biggl)= -\sin x$$ Example (i): -> $$\displaystyle \int_{0}^{\pi/2} \cos x \thinspace dx =$$ - -```python -sp.integrate(sp.cos(x),(x,0,sp.pi/2)) -``` -> $\displaystyle 1$ - - -Example (ii): -> $$\displaystyle \int_{0}^{\pi/2} \sin x \thinspace {\rm d}x =$$ - -```python -sp.integrate(sp.sin(x),(x,0,sp.pi/2)) -``` -> $\displaystyle 1$ - - -## Integrating trigonometric functions - -Recall that: -> $$\displaystyle {{\rm d}\over {\rm d}x}\biggr(\sin x\biggl)= \cos x~~~~~~~~~~~~~{\rm and}~~~~~~~~~~~ {{\rm d}\over {\rm d}x}\biggr(\cos x\biggl)= -\sin x$$ - -Example (i): > $$\displaystyle \int_{0}^{\pi/2} \cos x \thinspace {\rm d}x = \biggr[\sin x \biggl]_0^{\pi/2}=1-0=1$$ Example (ii): -> $$\displaystyle \int_{0}^{\pi/2} \sin x \thinspace {\rm d}x =\biggr[-\cos x \biggl]_0^{\pi/2}=0-(-1)=1$$ +> $$\displaystyle \int_{0}^{\pi/2} \sin x \thinspace {\rm d}x =\biggr[-\cos x \biggl]_0^{\pi/2}=0-(-1)=1$$ -## Summary of integration formulae: +## Summary of integration formulae $\displaystyle \int a\,{\rm d}x=ax+C$ @@ -240,8 +209,7 @@ $\displaystyle \int ax^n\,{\rm d}x={1\over{n+1}}ax^{n+1}+C\qquad{\rm for~all}n{\ $\displaystyle \int x^{-1}\,{\rm d}x=\int {1\over x}\,dx = \ln \vert x\vert +C$ - -## Application: +## Application Recall that $f'(t)$ gives the rate at which $f(t)$ changes at time $t$. 
@@ -251,7 +219,7 @@ Integrating the derivative $f'(t)$, we see: Therefore, the definite integral from $a$ to $b$ of $f'(t)$ with respect to $t$ will always give the **net** change that $f(t)$ has undergone as the parameter $t$ moves from $a$ to $b$. -## Example: +## Example A chemical process produces NaCl at the rate of $3\sqrt{t}$ grams per minute. We ask three questions: @@ -260,9 +228,9 @@ We ask three questions: 2. What is the quantity of NaCl produced over the next three minutes? 3. What is the mean rate of NaCl production over this interval? -## Solution: +## Solution -Let $f(t)$ denote the grams of NaCl produced after $t$ minutes. Then $f'(t)= 3\sqrt{t}$. +Let $f(t)$ denote the grams of NaCl produced after $t$ minutes. Then $f'(t)= 3\sqrt{t}$. 1. The rate of production one minute into the process is $f'(1) = 3\sqrt{1} = 3$ grams per minute. @@ -274,17 +242,18 @@ Let $f(t)$ denote the grams of NaCl produced after $t$ minutes. Then $f'(t)= 3\ t = sp.Symbol('t') sp.integrate(3*sp.sqrt(t),(t,1,4)) ``` + > $\displaystyle 14$ 3. The mean rate is the constant rate which would give the same overall effect: - + > $$\displaystyle {1\over b-a}\int_a^b\,f'(t)\,{\rm d}t = {1\over 4-1}\int_1^4\,3\sqrt{t}\,{\rm d}t = {14\over 3}$$ ```python sp.integrate(3*sp.sqrt(t),(t,1,4))/(4-1) ``` - $\displaystyle \frac{14}{3}$ +$\displaystyle \frac{14}{3}$ ## Substitution Method @@ -298,7 +267,7 @@ In general, for $g(x)=u$, we can write: This can be thought of as being like the integral version of the chain rule. -### Examples: +### Examples Consider: @@ -307,16 +276,17 @@ Consider: Multiplying this out and then integrating it would be very tedious. 
Try a substitution instead: -> $$\displaystyle u=2x+3~~~\Longleftrightarrow x={1\over 2}(u-3)={u\over 2}-{3\over 2}~~~~ \Longleftrightarrow {{\rm d}x\over {\rm d}u}={1\over 2}$$ +> $$\displaystyle u=2x+3~~~\Longleftrightarrow x={1\over 2}(u-3)={u\over 2}-{3\over 2}~~~~ \Longleftrightarrow {{\rm d}x\over {\rm d}u}={1\over 2}$$ Substitute into Equation (7) above: -> $$\displaystyle \int_{u(1.}^{u(b)} u^4 \times {1\over 2}~{\rm d}u ={1\over 2}\int_{u(1.}^{u(b)} u^4~du = {1\over 2}\biggr[{1\over 5}u^5\biggl]_{u(1.}^{u(b)}= {1\over 2}\biggr[{1\over 5}(2x+3)^5\biggl]_a^b$$ +> $$\displaystyle \int_{u(1.}^{u(b)} u^4 \times {1\over 2}~{\rm d}u ={1\over 2}\int_{u(1.}^{u(b)} u^4~du = {1\over 2}\biggr[{1\over 5}u^5\biggl]_{u(1.}^{u(b)}= {1\over 2}\biggr[{1\over 5}(2x+3)^5\biggl]_a^b$$ ```python a, b = sp.symbols('a b') sp.integrate((2*x + 3)**4,(x,a,b)) ``` + $\displaystyle - \frac{16 a^{5}}{5} - 24 a^{4} - 72 a^{3} - 108 a^{2} - 81 a + \frac{16 b^{5}}{5} + 24 b^{4} + 72 b^{3} + 108 b^{2} + 81 b$ --- @@ -327,21 +297,17 @@ Let's look at another example: Let: -> $$\displaystyle u=3-4x~~~\Longleftrightarrow x={1\over 4}(3-u)={3\over 4}-{u\over 4}~~~~\Longleftrightarrow {{\rm d}x\over {\rm d}u}=-{1\over 4}$$ +> $$\displaystyle u=3-4x~~~\Longleftrightarrow x={1\over 4}(3-u)={3\over 4}-{u\over 4}~~~~\Longleftrightarrow {{\rm d}x\over {\rm d}u}=-{1\over 4}$$ Substitute into Equation (8) above: - -> $$\displaystyle \int_{u(1.}^{u(b)} u^{-5} \times {-1\over 4}~{\rm d}u = -{1\over 4}\int_{u(1.}^{u(b)} u^{-5}~{\rm d}u = -{1\over 4}\biggr[{1\over-4}u^{-4}\biggl]_{u(1.}^{u(b)}$$ -> $$\displaystyle \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad ~~~ = {1\over 16}\biggr[(3-4x)^{-4}\biggl]_a^b$$ - +> $$\displaystyle \int_{u(1.}^{u(b)} u^{-5} \times {-1\over 4}~{\rm d}u = -{1\over 4}\int_{u(1.}^{u(b)} u^{-5}~{\rm d}u = -{1\over 4}\biggr[{1\over-4}u^{-4}\biggl]_{u(1.}^{u(b)}$$ > $$\displaystyle \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad ~~~ = {1\over 
16}\biggr[(3-4x)^{-4}\biggl]_a^b$$ ```python sp.integrate((3 - 4*x)**(-5),(x,a,b)) ``` -> $\displaystyle \frac{1}{4096 b^{4} - 12288 b^{3} + 13824 b^{2} - 6912 b + 1296} - \frac{1}{4096 a^{4} - 12288 a^{3} + 13824 a^{2} - 6912 a + 1296}$ - +> $\displaystyle \frac{1}{4096 b^{4} - 12288 b^{3} + 13824 b^{2} - 6912 b + 1296} - \frac{1}{4096 a^{4} - 12288 a^{3} + 13824 a^{2} - 6912 a + 1296}$ ### Introductory problems @@ -351,8 +317,8 @@ Integrate the following functions with respect to $x$. Remember that you can che 1. $\displaystyle x^3-{1\over{x^4}}+x^2$ 1. $\displaystyle \sqrt[3]{x}+\frac{1}{3\sqrt[4]{x}}$ 1. $\displaystyle \frac{1}{x^2} + \frac{1}{\sqrt[3]{x}} - 7$ -:::: +:::: ::::challenge{id="06_intro_02" title="Introductory problems 2"} Evaluate the following definite integrals: @@ -361,6 +327,7 @@ Evaluate the following definite integrals: 1. $\displaystyle \int_2^3 x^{-2/3}~{\rm d}x$ 1. $\displaystyle \int_0^{\ln(2)}e^{3x}~{\rm d}x$ 1. $\displaystyle \int_0^2(x+1)^{1/5}~{\rm d}x$ + :::: ::::challenge{id="06_intro_03" title="Introductory problems 3"} @@ -372,6 +339,7 @@ Find the integrals below by making the substitution suggested: 1. $\displaystyle \int x^3\sqrt{15-3x^4}~{\rm d}x \qquad\rm{using}\qquad u = 15-3x^4$ 1. $\displaystyle \int_0^4 \sqrt{x^3}\,\sqrt{4+x^{5/2}}~{\rm d}x \qquad\rm{using}\qquad u = 4+x^{5/2}$ 1. $\displaystyle \int_0^1 x^{n-1}(1-x^{n})^2~{\rm d}x \qquad\rm{using}\qquad u = 1-x^{n}$ + :::: ::::challenge{id="06_intro_04" title="Introductory problems 4"} @@ -381,6 +349,7 @@ By making suitable substitutions, find the indefinite integrals of: 1. $\displaystyle \frac{7x}{x^2-2}$ 1. $\displaystyle \frac{3}{\sqrt{5-x}}$ 1. $\displaystyle \frac{1}{x-a}$ + :::: ### Main problems @@ -400,21 +369,23 @@ Because the charges repel, work must be done on the system to bring $q_2$ from i The force $F$ must be applied to overcome the repulsion. 1. 
Explain why the total work done is given by + > $\displaystyle W=-\int_\infty^x F(x')~{\rm d}{x'} = -{q_1 q_2\over 4\pi \epsilon_0} \int_\infty^x {{\rm d}{x'}\over {x'}^2}$. 1. Calculate the work, $W$, in Joules when - $x =5.3 \times 10^{-11}\,\rm{m}$, - $q_1=q_2=1.6 \times 10^{-19}\,\rm{C}$, - $\epsilon_0=8.85 \times 10^{-12}\,\rm{Fm}^{-1}$. + :::: - ::::challenge{id="06_main_02" title="Main problems 2"} The rate at which the world's oil is being consumed is continuously increasing. Suppose the rate (in billions of barrels per year) is given by the function $r=f(t)$, where $t$ is measured in years and $t=0$ is the start of 1990. 1. Write down a definite integral which represents the total quantity of oil used between the start of 1990 and the end of 2020. 1. Calculate this integral using the function $\displaystyle r(t)=32 e^{0.05t}$. + :::: ::::challenge{id="06_main_03" title="Main problems 3"} @@ -427,8 +398,8 @@ where $k$ and $p$ are positive constants. 1. What are the units of $k$ and $p$? 1. Given that, at the start of 1850, the global emission rate was 0.2 Gigatonnes per year, and that at the start of 2010 the global emission rate was 32 Gigatonnes per year, calculate $k$ and $p$. 1. Calculate the total quantity of carbon emitted since the start of 1850. -:::: +:::: ::::challenge{id="06_main_04" title="Main problems 4"} The velocity $v$ of blood in a cylindrical vessel of radius $R$ and length $l$ is given by @@ -440,8 +411,6 @@ where $\eta$ and $P$ are constants, and $r$ is the radial distance from the cyli Find the average velocity of blood along the radius of the cylinder (i.e. for $0\leq r\leq R$), and compare this with the maximum velocity. :::: - - ::::challenge{id="06_main_05" title="Main problems 5"} Consider the function $\displaystyle y = 4x^3 +2x^2-8x + 2$. @@ -450,9 +419,8 @@ Consider the function $\displaystyle y = 4x^3 +2x^2-8x + 2$. 1. On your graph, shade in the region under the curve between $x=-2$ and $x=2$ and _estimate_ its area. 1. 
Integrate the function between $x=-2$ and $x=2$. Is this an accurate calculation of the area of the shaded region in part 3? 1. Identify and explain any differences you find between your estimate in part 3 and your calculation in part 4. -:::: - +:::: ### Extension problems @@ -462,32 +430,31 @@ Evaluate the following definite integrals: 1. $\displaystyle \int_{-\pi/2}^{\pi/2} 3\cos (x)\,{\rm d}x$ 1. $\displaystyle \int_{\pi\over 2}^{\pi}\cos (x)\,\sin(x)\,{\rm d}x$ 1. $\displaystyle \int_0^\pi(\cos^2(x)+\sin^2(x))\,{\rm d}x$ -:::: - +:::: ::::challenge{id="06_ext_02" title="Extension problems 2"} Let $u$ and $v$ be functions of $x$. 1. Given that $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \quad\dd{}{x}(uv) = u\dd{v}{x} + v\dd{u}{x},\quad$ show that $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \quad\int v~\dd{u}{x}~{\rm d}x = uv-\int u~\dd{v}{x}~{\rm d}x.$ 1. By using this 'integration by parts' formula, and substituting $z=x^2$ or otherwise, show that -$$\int_0^{\infty} x^n~e^{-x^2}~{\rm d}x={1\over 2}(n-1)\int_0^{\infty} x^{n-2}e ^{-x^2}~dx\qquad \rm{for}\;n>1.$$ + $$\int_0^{\infty} x^n~e^{-x^2}~{\rm d}x={1\over 2}(n-1)\int_0^{\infty} x^{n-2}e^{-x^2}~dx\qquad \rm{for}\;n>1.$$ 1. Hence evaluate $\displaystyle \int_0^{\infty} x^5~e^{-x^2}~{\rm d}x.$ -::: +:::: +::::challenge{id="06_ext_03" title="Extension problems 3"} +A country wishes to achieve net-zero CO$_{2}$ emissions in 50 years. At the start of the program, their emissions ($E$) are 800MtCO$_{2}$year$^{-1}$. They decide that they will be able to reduce their emissions at a stable rate, so that each year they emit 12MtCO$_{2}$year$^{-1}$ less than the previous year. -::::challenge{id="06_ext_03" title="Extension problems 3"} -A country wishes to achieve net-zero CO$_{2}$ emissions in 50 years. At the start of the program, their emissions ($E$) are 800MtCO$_{2}$year$^{-1}$. 
They decide that they will be able to reduce their emissions at a stable rate, so that each year they emit 12MtCO$_{2}$year$^{-1}$ less than the previous year. - -1. Write down the rate of change of the countries emissions ($E$), each year ($t$), $\def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{E}{t}$. Use this to calculate the total emissions that the country had produced over the 50 years. +- Write down the rate of change of the country's emissions ($E$), each year ($t$), $\def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{E}{t}$. Use this to calculate the total emissions that the country had produced over the 50 years. After 10 years of these emissions, the country starts a CO$_{2}$ removal program, whereby a certain amount of CO$_{2}$ is captured from the atmosphere and sequestered underground each year. This CO$_{2}$ follows the curve $$R = 0.1t^{2} - t$$ where $R$ is the amount of CO$_{2}$ removal in MtCO$_{2}$year$^{-1}$. -2. Determine whether the country achieves their 50 year net-zero emissions goal by finding the year in which the emissions produced are equal to the emissions removed. +- Determine whether the country achieves their 50-year net-zero emissions goal by finding the year in which the emissions produced are equal to the emissions removed. After the 50-year program, the country's emission rate stabilises, and they emit the same amount of CO$_{2}$ each year after that. The CO$_{2}$ absorption rate per year follows the same trend as before. The country wishes not to have contributed to global warming at all since the start of the program. This means their net total CO$_{2}$ emissions over the entire program would have to be zero. -3. Show that is takes approximately 109 years for the country to have a net-zero effect on global warming since the start of the program. +- Show that it takes approximately 109 years for the country to have a net-zero effect on global warming since the start of the program. 
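As a quick numerical sanity check of the stated answers (a sketch, not part of the problem set: it assumes $t$ in $R = 0.1t^{2} - t$ is measured from the start of the program, so removal begins at $t=10$ where $R=0$, and that emissions hold constant at $800 - 12\times50 = 200\,$MtCO$_{2}$year$^{-1}$ after year 50):

```python
# Hedged numerical check of the ~109-year claim using SciPy's brentq root finder.
from scipy.optimize import brentq

def net_total(T):
    # Cumulative emissions: integral of (800 - 12 t) over [0, 50], then 200 Mt/yr
    emitted = 800 * 50 - 6 * 50**2 + 200 * (T - 50)
    # Cumulative removal: integral of (0.1 t^2 - t) over [10, T]
    removed = (T**3 / 30 - T**2 / 2) - (10**3 / 30 - 10**2 / 2)
    return emitted - removed

T = brentq(net_total, 50, 200)
print(T)  # roughly 108.5, i.e. approximately 109 years
```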
+ :::: diff --git a/scientific_computing/essential_maths/07_integration_2.md b/scientific_computing/essential_maths/07_integration_2.md index 7bc7d396..a93025af 100644 --- a/scientific_computing/essential_maths/07_integration_2.md +++ b/scientific_computing/essential_maths/07_integration_2.md @@ -1,22 +1,19 @@ --- name: Integration 2 -dependsOn: [ - scientific_computing.essential_maths.06_integration_1 -] +dependsOn: [scientific_computing.essential_maths.06_integration_1] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -31,7 +28,7 @@ The material is still very similar: ## Integration reverses the process of differentiation -### In general: +### In general > $$\displaystyle {d(a\thinspace x^{n+1})\over dx}=(n+1)a\thinspace~x^n~~~~~\Rightarrow ~~~~\int_{x_1}^{x_2} a\thinspace x^n~dx = \biggr[{a\over(n+1)} \thinspace x^{(n+1)}\biggl]_{x_1}^{x_2}$$ @@ -40,9 +37,7 @@ The material is still very similar: This method can be thought of as an integral version of the chain rule. Suppose we wish to integrate: -> $$\displaystyle I=\int f(g(x))~dx= \int f(u)~dx$$ - -> $$\displaystyle I =\int f(u){dx\over du}~du$$ +> $$\displaystyle I=\int f(g(x))~dx= \int f(u)~dx$$ > $$\displaystyle I =\int f(u){dx\over du}~du$$ ## Integration Method 2: Integration by Parts @@ -68,7 +63,6 @@ with $f(x) \equiv u$ and $g(x) \equiv v$. > $$\displaystyle \int\limits_a^b x~\sqrt{(x+1)}~dx=\int\limits_a^b x~(x+1)^{1/2}~dx$$ - Choose > $$\displaystyle u=x\qquad{\rm and}\qquad v'=\sqrt{(x+1)}$$ @@ -77,7 +71,6 @@ so that: > $$\displaystyle u'=1\qquad{\rm and}\qquad v={2 \over 3}(x+1)^{3/2}$$. 
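(An aside mirroring the SymPy checks used elsewhere in this lesson: the by-parts result for $\int x\sqrt{x+1}\,dx$ can be confirmed symbolically. SymPy may present the antiderivative in a different but equivalent form.)

```python
# Cross-check of the worked example: differentiate SymPy's antiderivative
# and confirm it recovers the integrand.
import sympy as sp

x = sp.symbols('x', positive=True)
F = sp.integrate(x * sp.sqrt(x + 1), x)

# The derivative of the antiderivative should equal x*sqrt(x + 1)
assert sp.simplify(sp.diff(F, x) - x * sp.sqrt(x + 1)) == 0
print(F)
```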
- Choose > $$\displaystyle u=x\qquad{\rm and}\qquad v'=\sqrt{(x+1)}$$ @@ -86,13 +79,9 @@ so that: > $$\displaystyle u'=1\qquad{\rm and}\qquad v={2 \over 3}(x+1)^{3/2}$$ - Then: -> $$\displaystyle \int\limits_a^b x~\sqrt{(x+1)}~dx =\biggr[x~{2\over 3}(x+1)^{3/2}\biggl]_a^b - \int\limits_a^b 1 \times {2\over 3}(x+1)^{3/2}~dx$$ - -> $$\displaystyle =\biggr[{2\over 3}~x~(x+1)^{3/2}\biggl]_a^b- \biggr[{2\over 3}\cdot {2\over 5}(x+1)^{5/2}\biggl]_a^b$$ - +> $$\displaystyle \int\limits_a^b x~\sqrt{(x+1)}~dx =\biggr[x~{2\over 3}(x+1)^{3/2}\biggl]_a^b - \int\limits_a^b 1 \times {2\over 3}(x+1)^{3/2}~dx$$ > $$\displaystyle =\biggr[{2\over 3}~x~(x+1)^{3/2}\biggl]_a^b- \biggr[{2\over 3}\cdot {2\over 5}(x+1)^{5/2}\biggl]_a^b$$ If we had chosen the other option for $u$ and $v'$ we would have got: @@ -108,24 +97,23 @@ Let us calculate the indefinite integral, > $$\displaystyle \int \ln x~dx$$ -We can do a bit of a trick. Let: +We can do a bit of a trick. Let: > $$\displaystyle u=\ln x \quad\Longrightarrow\quad u'={1\over x}\qquad{\rm and}\qquad v'=1\Longrightarrow v=x$$ Putting these into our equation: -> $$\displaystyle \int \ln x~dx\quad=\quad\ln x~\cdot x - \int x~\frac{1}{x}dx$$ - -> $$\displaystyle = \quad x \ln x - \int 1 dx\quad=\quad x \ln x - x + C$$ +> $$\displaystyle \int \ln x~dx\quad=\quad\ln x~\cdot x - \int x~\frac{1}{x}dx$$ > $$\displaystyle = \quad x \ln x - \int 1 dx\quad=\quad x \ln x - x + C$$ We can also check this calculation using SymPy: ```python +import sympy as sp x = sp.symbols('x') sp.integrate(sp.log(x),x) ``` -> $\displaystyle x \log{\left(x \right)} - x$ +> $\displaystyle x \log{\left(x \right)} - x$ ### Integration by parts: Example 3 @@ -147,7 +135,6 @@ Then: > $$\displaystyle \int\limits^{\infty}_0 x^3~e^{-x}~dx=-\biggr[x^3~e^{-x}\biggl]^{\infty}_0 +\int\limits^{\infty}_0 3x^2~e^{-x}~dx$$ - Now apply integration by parts to the right-hand side: Choose @@ -160,8 +147,7 @@ so that: Then: -> $$\displaystyle \int\limits^{\infty}_0 
3x^2~e^{-x}~dx=-\biggr[3x^2~e^{-x}\biggl]^{\infty}_0 +\int\limits^{\infty}_0 6x~e^{-x}~dx$$ - +> $$\displaystyle \int\limits^{\infty}_0 3x^2~e^{-x}~dx=-\biggr[3x^2~e^{-x}\biggl]^{\infty}_0 +\int\limits^{\infty}_0 6x~e^{-x}~dx$$ And, once more: @@ -175,17 +161,13 @@ so that: Then: -> $$\displaystyle \int\limits^{\infty}_0 6x~e^{-x}~dx =-\biggr[6x~e^{-x}\biggl]^{\infty}_0 +\int\limits^{\infty}_0 6~e^{-x}~dx$$ - -> $$\displaystyle \Longrightarrow \int\limits^{\infty}_0 6~e^{-x}~dx=-\biggr[6~e^{-x}\biggl]^{\infty}_0=-6e^{-\infty}+6e^0=6$$ +> $$\displaystyle \int\limits^{\infty}_0 6x~e^{-x}~dx =-\biggr[6x~e^{-x}\biggl]^{\infty}_0 +\int\limits^{\infty}_0 6~e^{-x}~dx$$ > $$\displaystyle \Longrightarrow \int\limits^{\infty}_0 6~e^{-x}~dx=-\biggr[6~e^{-x}\biggl]^{\infty}_0=-6e^{-\infty}+6e^0=6$$ (Since $e^{-\infty}=0$ and $e^0=1$) The other terms all go to zero: -> $$\displaystyle -\bigr[x^3~e^{-x}\bigl]^{\infty}_0 =-{\infty}^3~e^{-\infty} + 0 =0$$ - -> $$\displaystyle -\bigr[3x^2~e^{-x}\bigl]^{\infty}_0 =-3{\infty}^2~e^{-\infty} + 0 =0$$ +> $$\displaystyle -\bigr[x^3~e^{-x}\bigl]^{\infty}_0 =-{\infty}^3~e^{-\infty} + 0 =0$$ > $$\displaystyle -\bigr[3x^2~e^{-x}\bigl]^{\infty}_0 =-3{\infty}^2~e^{-\infty} + 0 =0$$ So, to answer our original question: @@ -196,6 +178,7 @@ Let's also check it with SymPy: ```python sp.integrate(x**3 * sp.exp(-x),(x,0,sp.oo)) ``` + > $\displaystyle 6$ This result actually generalises: @@ -206,9 +189,7 @@ This result actually generalises: Recall that: -> $$\displaystyle {d\over dx}(\sin x)=\cos x$$ - -> $$\displaystyle {d\over dx}(\cos x)=-\sin x$$ +> $$\displaystyle {d\over dx}(\sin x)=\cos x$$ > $$\displaystyle {d\over dx}(\cos x)=-\sin x$$ Let's try and calculate the following integral: @@ -224,9 +205,7 @@ so that: Then: -> $$\displaystyle I =\biggr[-\cos x~\; e^{-x}\biggl]^b_a -\int\limits^b_a ~(-)\sin x~(-)e^{-x}~dx$$ - -> $$\displaystyle I =\biggr[-\cos x~\; e^{-x}\biggl]^b_a~~-~~\int\limits^b_a ~(-)\sin x~(-)e^{-x}~dx$$ +> $$\displaystyle I 
=\biggr[-\cos x~\; e^{-x}\biggl]^b_a -\int\limits^b_a ~(-)\sin x~(-)e^{-x}~dx$$ > $$\displaystyle I =\biggr[-\cos x~\; e^{-x}\biggl]^b_a~~-~~\int\limits^b_a ~(-)\sin x~(-)e^{-x}~dx$$ Next, choose @@ -244,7 +223,6 @@ The last term is the integral we started with: > $$\displaystyle \Longrightarrow~~~2~\int\limits^b_a ~\cos x~e^{-x}~dx~ =~\biggr[\sin x~\; e^{-x}\biggl]^b_a~~ -~~\biggr[\cos x~\; e^{-x}\biggl]^b_a$$ - ## Integration Method 3: Partial Fractions If we want to calculate the integral below, none of the previous rules allow us to make much progress. @@ -283,38 +261,38 @@ And on the second fraction: so $\displaystyle \frac{dw}{dx}=1$ and $\displaystyle \frac{dx}{dw}=1$ -> $$\displaystyle \int {2 du \over 2 \times 11 \times u} + \int {dw\over 11w}= -{\ln u\over 11} + {\ln w \over 11}$$ - -> $$\displaystyle =-{\ln |2x+1|\over 11} + {\ln |x-5| \over 11}$$ +> $$\displaystyle \int {2 du \over 2 \times 11 \times u} + \int {dw\over 11w}= -{\ln u\over 11} + {\ln w \over 11}$$ > $$\displaystyle =-{\ln |2x+1|\over 11} + {\ln |x-5| \over 11}$$ SymPy can also solve integrals requiring partial fractions: ```python sp.integrate(1/((2*x + 1)*(x - 5)),x) ``` + > $\displaystyle \frac{\log{\left(x - 5 \right)}}{11} - \frac{\log{\left(x + \frac{1}{2} \right)}}{11}$ This answer seems different because of the arbitrary constant of integration. - ### Introductory problems ::::challenge{id="07_intro_01" title="Introductory problems 1"} By using suitable substitutions, evaluate the following integrals: -1. $\displaystyle \def\d#1{{\rm d}#1} \int x^2(x^3+4)^2~~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int e^{-x}(5-4e^{-x})~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int (1+x)\sqrt{(4x^2+8x+3)}~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int 3x e^{(x^2+1)}~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int x^2(x^3+4)^2~~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int e^{-x}(5-4e^{-x})~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int (1+x)\sqrt{(4x^2+8x+3)}~\d{x}$ +1. 
$\displaystyle \def\d#1{{\rm d}#1} \int 3x e^{(x^2+1)}~\d{x}$ + :::: ::::challenge{id="07_intro_02" title="Introductory problems 2"} Find the indefinite integrals, with respect to $x$, of the following functions: -1. $\displaystyle x\,e^{3bx}$ -1. $\displaystyle x^3\,e^{-3x}$ -1. $\displaystyle x \cos (x)$ -1. $\displaystyle e^{bx} \sin(x)$ +1. $\displaystyle x\,e^{3bx}$ +1. $\displaystyle x^3\,e^{-3x}$ +1. $\displaystyle x \cos (x)$ +1. $\displaystyle e^{bx} \sin(x)$ + :::: ::::challenge{id="07_intro_03" title="Introductory problems 3"} @@ -326,10 +304,11 @@ Sketch the curve $y=(x-2)(x-5)$ and calculate by integration the area under the ::::challenge{id="07_main_01" title="Main problems 1"} Evaluate the following indefinite and definite integrals: -1. $\displaystyle \def\d#1{{\rm d}#1} \int \frac{6}{(7-x)^3}~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int 13x^3(9-x^4)^5~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int_2^5 5\log(x)~\d{x}$ -1. $\displaystyle \def\d#1{{\rm d}#1} \int x^x\,(1 + \log(x))~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int \frac{6}{(7-x)^3}~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int 13x^3(9-x^4)^5~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int_2^5 5\log(x)~\d{x}$ +1. $\displaystyle \def\d#1{{\rm d}#1} \int x^x\,(1 + \log(x))~\d{x}$ + :::: ::::challenge{id="07_main_02" title="Main problems 2"} @@ -348,11 +327,11 @@ If it burns fuel at a constant rate $\rho\,{\rm kg/s}$, and if the exhaust veloc The rocket starts burning fuel at $t=0\,{\rm s}$ with total mass of $m_0\,{\rm kg}$, and runs out of fuel at a later time $t=t_f\,{\rm s}$, with a final mass of $m_f\,{\rm kg}$. -1. Newton's second law tells us that the instantaneous acceleration $a$ of the rocket at time $t$ is equal to the force propelling it at that time, divided by its mass at that time. -Write down an expression for $a$ as a function of $t$. -1. 
By integrating this expression, show that the rocket's total change in velocity is given by $\displaystyle v_e \ln\left({m_0\over m_f}\right).$ -:::: +1. Newton's second law tells us that the instantaneous acceleration $a$ of the rocket at time $t$ is equal to the force propelling it at that time, divided by its mass at that time. + Write down an expression for $a$ as a function of $t$. +1. By integrating this expression, show that the rocket's total change in velocity is given by $\displaystyle v_e \ln\left({m_0\over m_f}\right).$ +:::: ::::challenge{id="07_main_03" title="Main problems 3"} The flow of water pumped upwards through the xylem of a tree, $F$, is given by: @@ -362,24 +341,21 @@ The flow of water pumped upwards through the xylem of a tree, $F$, is given by: where $t$ is the tree's age in days, $p$ and $q$ are positive constants, and $M_0p^{3/4}$ is the mass of the tree when planted (i.e.\ at $t=0$). Determine the total volume of water pumped up the tree in its tenth year (ignoring leap years) if: - - $p=10$, - - $q=0.01\,$day$^{-1}$, and - - $M_0=0.92\,$l$\,$day$^{-1}$. -:::: +- $p=10$, +- $q=0.01\,$day$^{-1}$, and +- $M_0=0.92\,$l$\,$day$^{-1}$. -### Extension problems +:::: +### Extension problems ::::challenge{id="07_ext_01" title="Extension problems 1"} Express $\displaystyle \frac{1}{x(x^2-16)}\quad{\rm in~the~form}\quad\frac{A}{x} + \frac{B}{(x+4)} + \frac{C}{(x-4)}$. - Hence calculate $\displaystyle \def\d#1{{\rm d}#1} \int\frac{1}{x(x^2-16)}\,\d{x}.$ :::: - - ::::challenge{id="07_ext_02" title="Extension problems 2"} The probability that a molecule of mass $m$ in a gas at temperature $T$ has speed $v$ is given by the Maxwell-Boltzmann distribution: @@ -391,26 +367,24 @@ where $k$ is Boltzmann's constant. Find the average speed: :::: - - ::::challenge{id="07_ext_03" title="Extension problems 3"} Baranov developed expressions for commercial yields of fish in terms of lengths, $L$, of the fish. 
His formula gave the total number of fish of length $L$ as $\displaystyle k\,e^{-cL}$, where $c$ and $k$ are constants ($k$ is positive). -1. Give a sketch of the graph $\displaystyle f(L)=k\,e^{-cL}$. -(Something decreasing, concave upward and asymptotic to horizontal axis will do.) -On your sketch, introduce marks on the horizontal axis that represent lengths $L=1, L=2, L=3, L=4 {\rm and } L=5$. -Now draw a rectangle on your sketch that represents the number of fish whose lengths are between $L=3$ and $L=4$. -1. Explain how we can represent the total number of fish $N$ as an area. -Show that this number equals $k/c$. -1. Only fish longer than $L_0$ count as commercial. Hence, assuming that the fish are all similar in shape (i.e. their width and breadth scales with their length) and of equal density $\rho$, show that the weight, $W$, of the commercial fish population is - - > $$\displaystyle \def\d#1{{\rm d}#1} W= \int_{L_0}^{+\infty} a\, k \rho\,L^3 e^{-cL}\,\d{L},$$ - - and hence that - - > $$\displaystyle W={N\, a\, \rho\, e^{-cL_0}\over c^3} \left((cL_0)^3 +3(cL_0)^2+ 6cL_0 +6\right),$$ - - where $a$ is a constant. -:::: +1. Give a sketch of the graph $\displaystyle f(L)=k\,e^{-cL}$. + (Something decreasing, concave upward and asymptotic to horizontal axis will do.) + On your sketch, introduce marks on the horizontal axis that represent lengths $L=1$, $L=2$, $L=3$, $L=4$ and $L=5$. + Now draw a rectangle on your sketch that represents the number of fish whose lengths are between $L=3$ and $L=4$. +1. Explain how we can represent the total number of fish $N$ as an area. + Show that this number equals $k/c$. +1. Only fish longer than $L_0$ count as commercial. Hence, assuming that the fish are all similar in shape (i.e. 
their width and breadth scale with their length) and of equal density $\rho$, show that the weight, $W$, of the commercial fish population is + > $$\displaystyle \def\d#1{{\rm d}#1} W= \int_{L_0}^{+\infty} a\, k \rho\,L^3 e^{-cL}\,\d{L},$$ + + and hence that + + > $$\displaystyle W={N\, a\, \rho\, e^{-cL_0}\over c^3} \left((cL_0)^3 +3(cL_0)^2+ 6cL_0 +6\right),$$ + + where $a$ is a constant. + +:::: diff --git a/scientific_computing/essential_maths/08_complex_numbers.md b/scientific_computing/essential_maths/08_complex_numbers.md index 951063f2..2cac8246 100644 --- a/scientific_computing/essential_maths/08_complex_numbers.md +++ b/scientific_computing/essential_maths/08_complex_numbers.md @@ -1,22 +1,19 @@ --- name: Complex numbers -dependsOn: [ - scientific_computing.essential_maths.07_integration_2 -] +dependsOn: [scientific_computing.essential_maths.07_integration_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -33,7 +30,7 @@ The material is still very similar: **The Imaginary Number $i$:** -- The polynomial $~~x^2-1=0~~$ has two real roots: $~~1~~$ and $~~-1~~$ +- The polynomial $~~x^2-1=0~~$ has two real roots: $~~1~~$ and $~~-1~~$ - The polynomial $~~x^2+1=0~~$ has **no** real roots. Consider solving: @@ -42,21 +39,20 @@ $$x^2 = 1\qquad{\rm and}\qquad x^2 = -1$$ We introduce an "imaginary" number $~~i~~$ so that there are two solutions to $~~x^2 = -1~~$: $~~i~~$ and $~~-i~~$. That is, $~~i~~$ is a number with the property that $~~i^2=-1$. - ## Complex numbers -The complex numbers are the set of all expressions of the form $a + bi$ where $i^2=-1$ and $a$ and $b$ are real numbers: +The complex numbers are the set of all expressions of the form $a + bi$ where $i^2=-1$ and $a$ and $b$ are real numbers: > $$\mathbb{C}=\left\{a + bi~~\vert~~a,~b~\in\mathbb{R}\right\}$$ -For $z=a+bi\in\mathbb{C}$ we define the *real* and *imaginary* +For $z=a+bi\in\mathbb{C}$ we define the _real_ and _imaginary_ parts of $z$ to be ${\Re}(z) = a$ and $\Im(z)=b$. The imaginary number $i$ has no home on the real number line. Instead, we locate it on the **complex plane** at the point $(0,1)$. -- we can represent any complex number $z=a+bi$ as the point $(a,b)$ in the complex plane. -- The coordinates $(a,b)$ are usually called the *cartesian* coordinates for $z$. 
+- we can represent any complex number $z=a+bi$ as the point $(a,b)$ in the complex plane. +- The coordinates $(a,b)$ are usually called the _cartesian_ coordinates for $z$. (Named after the mathematician and philosopher Rene Descartes). - In this plane, the real numbers lie on the horizontal axis. We usually refer to the horizontal axis of $\mathbb{C}$ as the **real axis**. @@ -65,7 +61,6 @@ The following plot shows a complex number with a real component of 18 and an ima ![The complex plane](fig/08_01_complex_number.svg) - ## Complex conjugates - The complex plane $\mathbb{C}$ contains **all** of the roots of **every** polynomial. @@ -84,7 +79,7 @@ which we can solve using the quadratic formula: > $$={{8\pm\sqrt{64-100}}\over2}={{8\pm\sqrt{-36}}\over2}={{8\pm6i}\over2}=4\pm3i$$ - Note that these two roots are reflections of one another through the - real axis. They are *conjugates* of one another. + real axis. They are _conjugates_ of one another. - In general, let $z=a + bi$. The **conjugate** of $z$ is the complex number $\bar{z}=a-bi$. @@ -92,15 +87,16 @@ which we can solve using the quadratic formula: We can also use Sympy's `solve` method to solve polynomials: ```python +import sympy as sp x = sp.symbols('x') sp.solve(x**2 - 8*x + 25) ``` -> $\displaystyle \left[ 4 - 3 i, \ 4 + 3 i\right]$ +> $\displaystyle \left[ 4 - 3 i, \ 4 + 3 i\right]$ ## Modulus (size) of a complex number -The distance to a point on the complex plane from 0 is called its **modulus**, and we find this by calculating the hypotenuse of the triangle with base ${\Re}(z)$ and height $\Im(z)$: +The distance to a point on the complex plane from 0 is called its **modulus**, and we find this by calculating the hypotenuse of the triangle with base ${\Re}(z)$ and height $\Im(z)$: E.g. The modulus of $4\pm3i$ is $\sqrt{3^2+4^2}=\sqrt{9+16}=\sqrt{25}=5$ @@ -108,7 +104,7 @@ E.g. The modulus of $4\pm3i$ is $\sqrt{3^2+4^2}=\sqrt{9+16}=\sqrt{25}=5$ $|z|=\sqrt{a^2+b^2}$. 
- The **modulus** is connected to the **conjugate** by means of the formula - $z\cdot \bar{z}=|z|^2$. Indeed: + $z\cdot \bar{z}=|z|^2$. Indeed: $$ \begin{align} @@ -125,7 +121,8 @@ Note that, in Python, because `i` is often used as an index variable in loops, t x = 1 + 2j print(f'x = {x} Re(x) = {x.real} Im(x) = {x.imag} |x| = {abs(x)}') ``` -``` + +```text x = (1+2j) Re(x) = 1.0 Im(x) = 2.0 |x| = 2.23606797749979 ``` @@ -135,11 +132,12 @@ In Sympy, the imaginary unit is `sp.I`: x = 1 + 2 * sp.I print(f'x = {x} Re(x) = {sp.re(x)} Im(x) = {sp.im(x)} |x| = {sp.Abs(x)}') ``` -``` + +```text x = 1 + 2*I Re(x) = 1 Im(x) = 2 |x| = sqrt(5) ``` -## Addition and Subtraction: +## Addition and Subtraction Addition and subtraction of complex numbers work as you would expect: @@ -154,7 +152,8 @@ Try adding: $(5+6i)+(1-i)$: ```python print((5 + 6j) + (1 - 1j)) ``` -``` + +```text (6+5j) ``` @@ -164,11 +163,11 @@ Try subtracting: $(5+6i)-(1-i)$: print((5 + 6j) - (1 - 1j)) ``` -``` +```text (4+7j) ``` -## Multiplication: +## Multiplication Multiplication is not quite so convenient in cartesian coordinates: @@ -184,11 +183,12 @@ Try multiplying: $(5+6i)(1-i)$: ```python print((5 + 6j) * (1 - 1j)) ``` -``` + +```text (11+1j) ``` -## Division: +## Division Division is even more awkward in cartesian coordinates: we have to multiply the numerator and the denominator by the complex conjugate of the denominator. @@ -205,15 +205,16 @@ Try dividing: $${(-4+7i)\over (2+3i)}$$: ```python print((-4 + 7*sp.I) / (2 + 3*sp.I)) ``` -``` + +```text (-4 + 7*I)*(2 - 3*I)/13 ``` ## Polar Coordinates -It is often convenient to represent the complex number $z = a + bi$ in terms of its polar coordinates $\langle r,\theta\rangle$. +It is often convenient to represent the complex number $z = a + bi$ in terms of its polar coordinates $\langle r,\theta\rangle$. -- The angle $\theta$ is called the *argument* of $z$. +- The angle $\theta$ is called the _argument_ of $z$. 
- The real number $r=|z|$ is sometimes denoted mod$(z)$. @@ -224,11 +225,7 @@ It is often convenient to represent the complex number $z = a + bi$ in terms of Let $z=x+iy$. If we are given the polar coordinates of $z$ and want to express the cartesian coordinates use -> $$x=r\cos\theta$$ - -> $$y=r\sin\theta$$ - -> $$z=r\cos\theta + ri\sin\theta=r(\cos\theta + i\sin\theta)$$ +> $$x=r\cos\theta$$ > $$y=r\sin\theta$$ > $$z=r\cos\theta + ri\sin\theta=r(\cos\theta + i\sin\theta)$$ If we are given the cartesian coordinates and want to find the polar coordinates, use: @@ -253,68 +250,62 @@ Beware of the sign of this tangent: it depends on which quadrant you are in. The positive $x$ axis is defined as having $\theta=0$ and positive $\theta$ goes in an anticlockwise sense around the $x-y$ plane. -## Some examples: +## Some examples -### 1. +### 1 Find the cartesian coordinates for the complex number $z$ with polar coordinates $r=2$ and $\theta=\pi/6$. -> $$\Re(z)=x=r\cos\theta=2\cos(\pi/6)=2\left({{\sqrt{3}\over2}}\right)=\sqrt{3}$$ - -> $$\Im(z)=y=r\sin\theta=2\sin(\pi/6)=2\left({{1\over2}}\right)=1$$ +> $$\Re(z)=x=r\cos\theta=2\cos(\pi/6)=2\left({{\sqrt{3}\over2}}\right)=\sqrt{3}$$ > $$\Im(z)=y=r\sin\theta=2\sin(\pi/6)=2\left({{1\over2}}\right)=1$$ Therefore, $z = \sqrt{3} + i$. ```python +import cmath +import numpy as np print(cmath.rect(2, np.pi/6)) ``` -``` + +```text (1.7320508075688774+0.9999999999999999j) ``` -### 2. +### 2 Find the polar coordinates for the complex number $z= -3+4i$. > $$|z|=r = $$ - > $$\sqrt{(-3)^2+4^2}=\sqrt{25}=5$$ - > $${\rm arg}(z)=\theta=\arctan\left({{y}\over{x}}\right)=$$ - > $$-0.93+\pi{\rm ~radians}\approx 127^\circ$$ ```python print(cmath.polar(-3 + 4j)) ``` -``` + +```text (5.0, 2.214297435588181) ``` -### 3. +### 3 Find the polar coordinates for the complex number $z= -2i$. 
-> $${\rm mod}(z)=r = |z|=2$$ - -> $${\rm arg}(z)=\theta=-{{\pi}\over2}$$ +> $${\rm mod}(z)=r = |z|=2$$ > $${\rm arg}(z)=\theta=-{{\pi}\over2}$$ ```python print(cmath.polar(-2j)) ``` -``` + +```text (2.0, -1.5707963267948966) ``` -## Multiplication in Polar Coordinates: +## Multiplication in Polar Coordinates First a reminder of three useful and important identities: -> $$\cos^2\theta + \sin^2\theta = 1$$ - -> $$\cos(\theta_1+\theta_2)=\cos\theta_1\cos \theta_2 - \sin\theta_1\sin\theta_2$$ - -> $$\sin(\theta_1+\theta_2)=\sin\theta_1\cos \theta_2 + \sin\theta_2\cos\theta_1$$ +> $$\cos^2\theta + \sin^2\theta = 1$$ > $$\cos(\theta_1+\theta_2)=\cos\theta_1\cos \theta_2 - \sin\theta_1\sin\theta_2$$ > $$\sin(\theta_1+\theta_2)=\sin\theta_1\cos \theta_2 + \sin\theta_2\cos\theta_1$$ Now, let $z_1=r_1\cos\theta_1+ir_1\sin\theta_1$ and $z_2=r_2\cos\theta_2+ir_2\sin\theta_2$. @@ -342,7 +333,7 @@ r_1\cos\theta_1\cr \end{align*} $$ -For the imaginary part too, the moduli multiply while the arguments add. +For the imaginary part too, the moduli multiply while the arguments add. This gives a relatively compact and highly geometric result for the product: @@ -350,14 +341,11 @@ This gives a relatively compact and highly geometric result for the product: It is **multiplicative** in the modulus and **additive** in the argument: -> $$|z_1z_2|= |z_1\cdot |z_2|$$ - -> $$\arg(z_1z_2)=\arg (z_1)+ \arg( z_2)$$ - -This means that when we multiply by $z$, we are **rotating** through the angle $\arg(z)$ and **radially stretching** by a factor of $|z|$. +> $$|z_1z_2|= |z_1|\cdot |z_2|$$ > $$\arg(z_1z_2)=\arg (z_1)+ \arg( z_2)$$ +This means that when we multiply by $z$, we are **rotating** through the angle $\arg(z)$ and **radially stretching** by a factor of $|z|$. 
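This multiplicative/additive behaviour is easy to verify numerically; a short sketch (not in the original notes) using the standard-library `cmath` module:

```python
# Multiplying complex numbers: the moduli multiply and the arguments add.
import cmath

z1 = 2 * cmath.exp(1j * 0.5)   # modulus 2, argument 0.5
z2 = 3 * cmath.exp(1j * 1.2)   # modulus 3, argument 1.2

r, theta = cmath.polar(z1 * z2)
print(r, theta)  # modulus 2*3 = 6, argument 0.5 + 1.2 = 1.7
```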
-## A Remarkable Connection with $e^{i\theta}$: +## A Remarkable Connection with $e^{i\theta}$ First, think of $z=\cos\theta + i\sin\theta$ as a function of $\theta$ and differentiate with respect to $\theta$: @@ -385,7 +373,7 @@ Similarly, we can show that: > $$z=\cos\theta - i \sin\theta=e^{-i\theta}~~~~~~~~~~~~\rm (4)$$ -Adding (3) and (4), and subtracting (3) and (4) gives: +Adding (3) and (4), and subtracting (3) and (4) gives: > $$\cos\theta ={e^{i\theta}+ e^{-i\theta}\over 2}~~~~~~~~~~~~~~~~~~~~~~~~~\sin\theta ={ e^{i\theta}-e^{-i\theta}\over 2i}$$ @@ -393,8 +381,7 @@ This demonstrates that **any complex number** can be written: > $$z=x+iy=r(\cos\theta + i\sin\theta)=r~e^{i\theta}$$ - -### Several important consequences: +### Several important consequences 1. Any complex number can be written in the polar form $z = re^{i\theta}$ where $r=|z|$ and $\theta=\arg(z)$. @@ -404,15 +391,13 @@ This demonstrates that **any complex number** can be written: 3. Multiplication on the unit circle $r=1$ can be carried out by adding the angles: - > $$e^{i\theta_1}\cdot e^{i\theta_2} = e^{i(\theta_1+\theta_2)}$$ - - > $$z=x+iy=r(\cos\theta + i\sin\theta)=r~e^{i\theta}$$ + > $$e^{i\theta_1}\cdot e^{i\theta_2} = e^{i(\theta_1+\theta_2)}$$ > $$z=x+iy=r(\cos\theta + i\sin\theta)=r~e^{i\theta}$$ 4. Exponentiation on the unit circle $r=1$ can be done by multiplying the angle by the index: > $$\left(e^{i\theta}\right)^n = e^{i\theta n}=e^{i(n\theta)}$$ -5. This result is known as **DeMoivre's Theorem**. It is usually stated in its cartesian form: +5. This result is known as **DeMoivre's Theorem**. It is usually stated in its cartesian form: > $$(\cos\theta + i\sin\theta)^n=\cos(n\theta) + i\sin(n\theta)$$ @@ -420,8 +405,6 @@ This demonstrates that **any complex number** can be written: > $$e^{\pi i}+1=0$$ - - ### Introductory problems ::::challenge{id="08_intro_01" title="Introductory problems 1"} @@ -435,6 +418,7 @@ Simplify: 1. $\displaystyle (1+2i)/(1-3i)$ 1. 
$\displaystyle (2-i)^{-2} +(2+i)^{-2}$ 1. $\displaystyle (5-i)^{-2} -(5+i)^{-2}$ + :::: ::::challenge{id="08_intro_02" title="Introductory problems 2"} @@ -449,6 +433,7 @@ Solve the following equations for $z$: 1. $\displaystyle (7 + i)z - 3i = 6$ 1. $\displaystyle {(z-i)\over (z+i)}={2\over 3}$ 1. $\displaystyle z^2 + (1+4i)z + (15 + 27 i) = 0$ + :::: ::::challenge{id="08_main_02" title="Main problems 2"} @@ -456,6 +441,7 @@ Represent the complex numbers $\displaystyle z_1= 5-2i$ and $\displaystyle z_2=- 1. Write down $z_1$ and $z_2$ in polar form (either give $r$ and $\theta$ for each, or write them as exponentials). 1. What is the product of $z_1$ and $z_2$? + :::: ::::challenge{id="08_main_03" title="Main problems 3"} @@ -479,7 +465,6 @@ Find the complex numbers represented by the vertices of a square if one vertex r ### Extension problems - ::::challenge{id="08_ext_01" title="Extension problems 1"} -Experiment with using Python to solve the problems and confirm your pen & paper solutions. +Experiment with using Python to solve the problems and confirm your pen & paper solutions. :::: diff --git a/scientific_computing/essential_maths/09_differential_equations_1.md b/scientific_computing/essential_maths/09_differential_equations_1.md index 40f757ff..ec75f4dd 100644 --- a/scientific_computing/essential_maths/09_differential_equations_1.md +++ b/scientific_computing/essential_maths/09_differential_equations_1.md @@ -1,22 +1,19 @@ --- name: Differential equations 1 -dependsOn: [ - scientific_computing.essential_maths.07_integration_2 -] +dependsOn: [scientific_computing.essential_maths.07_integration_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -47,12 +44,10 @@ Since all the terms in $y$ (just the derivative in this case) on the left, and a > $$\displaystyle {\rm d}y = (x^3 + 5x^2 + 1) ~{\rm d}x$$ - Then we integrate both sides > $$\displaystyle \int {\rm d}y = \int (x^3 + 5x^2 + 1) ~{\rm d}x$$ - The general solution to this equation is > $$\displaystyle y + A = \int x^3 + 5x^2 + 1\,{\rm d}x = {1\over 4}x^4 + {5\over 3}x^3 + x + B$$ @@ -73,13 +68,10 @@ This approach can be generalised into what is known as the **Separation of Varia The equation is already separated, with all terms in $y$ (just the derivative in this case) on the left, and terms involving $x$ on the right. 
-
We then integrate both sides with respect to $x$:

> $$\displaystyle \int \frac{{\rm d}y}{{\rm d}x}\,{\rm d}x = \int (2x + c)\, {\rm d}x $$

-
> $$\displaystyle y + k_1 = x^2 + cx + k_2$$

-
> $$\displaystyle y = x^2 + cx + A\qquad{\rm with}\qquad A=k_2-k_1$$

#### Initial conditions

@@ -94,13 +86,13 @@ We have that

so:

-> $$\displaystyle c = 1^2 + c\cdot 1 + A$$
-
-> $$\displaystyle A = -1$$
+> $$\displaystyle c = 1^2 + c\cdot 1 + A$$ > $$\displaystyle A = -1$$

Is this right? Here's an example of how we can use SciPy's `odeint` method to numerically solve differential equations:

```python
+import scipy.integrate
+import numpy as np
c = 1.0
A = -1.0
x0 = 1.0
@@ -115,7 +107,6 @@ y_numerical = scipy.integrate.odeint(dydx, c, x)

![Plot of the numerical vs exact solution](fig/09_01_solution.svg)

-
### Separation of variables: example 2

> $$\displaystyle {{\rm d}y\over {\rm d}x} = 5 x^{{3\over 2}} y $$

@@ -124,8 +115,6 @@ Separate the variables so that all the terms in $x$ are on one side of the equat

> $$\displaystyle {1\over y}\frac{{\rm d}y}{{\rm d}x} = 5 x^{{3\over 2}} $$

-> $$\displaystyle {1\over y}\frac{{\rm d}y}{{\rm d}x} = 5 x^{{3\over 2}} $$
-
Integrating both sides with respect to $x$:

> $$\displaystyle \int {1\over y}\frac{{\rm d}y}{{\rm d}x}\, {\rm d}x = \int 5x^{{3\over 2}}\,{\rm d}x \quad \Rightarrow\quad \int {1\over y}\,{\rm d}y = \int 5x^{{3\over 2}}\, {\rm d}x$$

@@ -163,12 +152,10 @@ y &= \pm\sqrt{-2 e^{-x} (1+x) + c_3}
\end{align*}
$$

-
Let's check this answer: $\displaystyle y = \pm\sqrt{-2 e^{-x} (1+x) + c_3}$.

Substituting the initial condition $y(0) = 0$ gives $c_3 = 2$.

-
```python
c3 = 2.0
x0 = 0.0
@@ -185,8 +172,6 @@ y_numerical = scipy.integrate.odeint(dydx, 1e-8, x) # why 1e-8 instead of zero?
![Plot of the numerical vs exact solution](fig/09_02_solution.svg) - - ### Real-world example 1: biochemistry The Michaelis-Menten equation relates the rate of an enzyme reaction to its @@ -229,11 +214,7 @@ When $s\ll K$, $\;s\;$ in the differential equation's denominator can be neglect The rate of change in concentration depends on the concentration, so this is a **first order process**. -> $$\displaystyle \int {1\over s}\,{\rm d}s = -\int {V\over K}\,{\rm d}t$$ - -> $$\displaystyle \Rightarrow\quad\ln s = -{V\over K}t + D\qquad{\rm D~is~a~constant}$$ - -> $$\displaystyle \Rightarrow\quad s = Be^{-V t\over K}\qquad{\rm B~is~a~constant}$$ +> $$\displaystyle \int {1\over s}\,{\rm d}s = -\int {V\over K}\,{\rm d}t$$ > $$\displaystyle \Rightarrow\quad\ln s = -{V\over K}t + D\qquad{\rm D~is~a~constant}$$ > $$\displaystyle \Rightarrow\quad s = Be^{-V t\over K}\qquad{\rm B~is~a~constant}$$ On a graph $\;s\;$ crosses the vertical axis at $\;s=B\;$ and decreases with time exponentially. @@ -251,7 +232,6 @@ K=1.0: ![s with K equal to 1.0](fig/09_04_k10.svg) - ### Real-world example 2: bacterial growth Suppose the growth rate of a bacterial colony is proportional to its size, @@ -260,7 +240,6 @@ How long will it take to reach 11 times its original size? Let $\;N\;$ represent the number of bacteria, and suppose there are $\;N_0\;$ at time $\;t=0$. - We are given that when $\;t=10\;$ hours, $\;N=3N_0$, and need to find the time at which $\;N=11N_0\;$. The equation of growth is: @@ -278,14 +257,13 @@ To find $t$ when $N=11N_0$: > $$\displaystyle \ln{11N_0\over N_0} = 0.11 t \qquad\Rightarrow\qquad t={\ln 11\over 0.11}=21.8{\rm~hours.}$$ - ### Real-world example 3: radioactive decay The rate at which a sample decays is proportional to the amount left, i.e. > $$\displaystyle \frac{{\rm d}N}{{\rm d}t} = -\lambda N$$ -where $\;N\;$ is the mass of radioactive atoms at time $\;t\;$ and $\;\lambda\;$ is called the *decay constant*. 
+where $\;N\;$ is the mass of radioactive atoms at time $\;t\;$ and $\;\lambda\;$ is called the _decay constant_. The element radium (atomic mass=226) has a decay constant of $\;13.6 \times 10^{-12}\;$s$^{-1}$. @@ -305,15 +283,12 @@ The **half-life**, $\;t_{1\over 2},\;$ is the time taken for $\;N\;$ to reduce b Putting $\;N=N_0/2\;$ and $\;t=t_0+t_{1\over 2}\;$ in (2) we get -> $$\displaystyle \ln {N_0\over2N_0} = -\lambda t_{1\over 2}$$ - -> $$\displaystyle t_{1\over 2} = {\ln 2\over\lambda} \approx {0.693\over\lambda}$$ +> $$\displaystyle \ln {N_0\over2N_0} = -\lambda t_{1\over 2}$$ > $$\displaystyle t_{1\over 2} = {\ln 2\over\lambda} \approx {0.693\over\lambda}$$ Note that this time is **independent** of the initial value $N_0$, The half-life for radium is thus $\;t_{1\over 2}={\ln2\over13.6\times10^{-12}}=5.10\times10^{10}\,s,\;$ or about 1600 years. - ### Real-world example 4: more biochemistry The power to which the concentration of a species is raised in a rate law @@ -329,46 +304,35 @@ components. This rate law is thus second-order overall. Note that this is **different** from the order of an ODE, which is given by the highest derivative. -Both zeroth and first order *processes* are modelled below by *first order -differential equations*. +Both zeroth and first order _processes_ are modelled below by _first order +differential equations_. -#### (A) Zeroth order processes: +#### (A) Zeroth order processes - rate of change is **independent** of concentration, i.e. 
the rate of change is proportional to concentration raised to power zero

-> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = k \quad \text{(growth)}$$
-
-> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = -k\quad \text{(decay)}$$
+> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = k \quad \text{(growth)}$$ > $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = -k\quad \text{(decay)}$$

General solutions:

-> $$\displaystyle A = A_0 + k(t-t_0)$$
+> $$\displaystyle A = A_0 + k(t-t_0)$$ > $$\displaystyle A = A_0 - k(t-t_0)$$

-> $$\displaystyle A = A_0 - k(t-t_0)$$
-
-#### (B) First order processes:
+#### (B) First order processes

The rate of change depends on the concentration of one species, i.e. the rate of change is proportional to concentration raised to first power.

Half-life is a constant, i.e. it is independent of the amount there at the beginning.

-> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = kA\quad\text{growth}$$
-
-> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = -kA\quad\text{decay}$$
+> $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = kA\quad\text{growth}$$ > $$\displaystyle \frac{{\rm d}A}{{\rm d}t} = -kA\quad\text{decay}$$

General solutions:

-> $$\displaystyle A = A_0 e^{k(t-t_0)}$$
-
-> $$\displaystyle A = A_0 e^{-k(t-t_0)}$$
+> $$\displaystyle A = A_0 e^{k(t-t_0)}$$ > $$\displaystyle A = A_0 e^{-k(t-t_0)}$$

When $k=2.0$ the two different solutions look like this:

![Growth and decay solutions with k equal to 2.0](fig/09_05_k2.svg)

-
-
-
### Introductory problems

::::challenge{id="09_intro_01" title="Introductory problems 1"}
@@ -414,8 +378,8 @@ y = odeint(dydx, y0, x)

# plot the numerical solution and your hand-calculated
# solution, and check that they agree
```
-:::: +:::: ::::challenge{id="09_main_02" title="Main problems 2"} Solve: @@ -436,6 +400,7 @@ Solve: 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} y^2\dd{y}{x} = \frac{2}{3}x \quad{\rm with}\quad y(\sqrt{2})=1$ 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{y}{x} = \frac{\beta}{x} \quad{\rm with}\quad y(1) = 0$. Find $\beta$ such that $\displaystyle y(e^3)=1.$ 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{y}{x} = a + bx + cx^2 + dx^3 + ex^4 \quad{\rm with}\quad y(0) = \pi$ + :::: ::::challenge{id="09_main_03" title="Main problems 3"} @@ -446,6 +411,7 @@ Given that the original mass of $\displaystyle A$ is 130g, and that 50g has been 1. Form and solve the differential equation relating $\displaystyle m_t$ to $\displaystyle t$. 1. Find the mass of $\displaystyle A$ transformed over a 300s period. 1. Sketch a graph of $\displaystyle m_t$ versus $\displaystyle t$. + :::: ::::challenge{id="09_main_04" title="Main problems 4"} @@ -461,6 +427,7 @@ If $\displaystyle T_0$ is the initial temperature of a body, $\displaystyle T_s$ At what time can the glycerol be added to the protein? 1. Using a choice of axes that will allow you easily to predict the temperature of the glycerol, sketch a graph of the anticipated variation of the glycerol temperature with time. 1. Once the glycerol has been added to the protein, will the rate of cooling be described by the same constant $\displaystyle k$? Give reasons for your answer. 
+
::::

::::challenge{id="09_main_05" title="Main problems 5"}
@@ -480,23 +447,22 @@ Assuming that the charcoal was formed during the building of the site, use this

### Extension problems

-::::challenge{id="09_ext_01" title="Extension problems 1"}
-The _absorbance_ $A$ of a solution is given by the equation:
-$$A=\log_{10}\left(\frac{I_o}{I}\right)$$
+::::challenge{id="09_ext_01" title="Extension problems 1"}
+The _absorbance_ $A$ of a solution is given by the equation:
+$$A=\log_{10}\left(\frac{I_o}{I}\right)$$
where $I_o$ is the intensity of the light impinging on the solution (incident light) and $I$ is the intensity of the light emerging from it (transmitted light).
The Beer-Lambert law states that
$$A=\epsilon\cdot c\cdot l$$
where $\epsilon$ is the absorbance of the solute, $c$ is the concentration of the solute and $l$ is the distance that the light has travelled through the solution.

-
1. The _transmittance_ $\displaystyle T$ is defined as the fraction of incident light transmitted through the solution ($\displaystyle T={I\over I_o}$).
-Derive an expression relating the transmittance, $T$, of the solution to $\epsilon$, $c$ and $l$.
+   Derive an expression relating the transmittance, $T$, of the solution to $\epsilon$, $c$ and $l$.
1. The _attenuation_ $Q$ of the light beam is defined as the difference between the intensities of the incident and the transmitted light ($Q=I_o-I$).
-Derive an expression for the attenuation of the light beam when a beam of light intensity $I_o$ traverses a distance $l$ through a solution of fixed concentration $c$.
-Sketch a graph showing the dependence of $Q$ on $l$ in a solution of fixed concentration.
+   Derive an expression for the attenuation of the light beam when a beam of light intensity $I_o$ traverses a distance $l$ through a solution of fixed concentration $c$.
+   Sketch a graph showing the dependence of $Q$ on $l$ in a solution of fixed concentration.
1. 
ATP has a molar absorption of $\displaystyle 15.7\times 10^3\,{\rm M}^{-1}{\rm cm}^{-1}$.
-Calculate the initial rate (in watts/cm) at which light intensity is attenuated when a light beam of intensity 200 watts enters a $\displaystyle 10\mu\,{\rm M}$ solution of ATP.
-What would happen to this rate if
+   Calculate the initial rate (in watts/cm) at which light intensity is attenuated when a light beam of intensity 200 watts enters a $\displaystyle 10\mu\,{\rm M}$ solution of ATP.
+   What would happen to this rate if
   i. the concentration of ATP is doubled;
   i. the intensity of the incident light is doubled;
   i. the length of the cell holding the solution is doubled?
diff --git a/scientific_computing/essential_maths/10_differential_equations_2.md b/scientific_computing/essential_maths/10_differential_equations_2.md
index ea7a1cca..3a04a981 100644
--- a/scientific_computing/essential_maths/10_differential_equations_2.md
+++ b/scientific_computing/essential_maths/10_differential_equations_2.md
@@ -1,22 +1,19 @@
---
name: Differential equations 2
-dependsOn: [
-  scientific_computing.essential_maths.09_differential_equations_1
-]
+dependsOn: [scientific_computing.essential_maths.09_differential_equations_1]
tags: []
-attribution:
-- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training.
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ---- +--- ## YouTube lecture recording from October 2020 @@ -44,7 +41,6 @@ of the leaf at the time when its length is $\;y\;$ cm. --- - 1. Assume the length of the leaf is $\;y\;$ cm at time $\;t\;$ days after it was first observed. 2. Let the rate the leaf receives water be $\;k_1y\;$ where $\;k_1\;$ is a positive constant. @@ -59,7 +55,6 @@ When $k_1$ and $k_2$ are both equal to $1.0$, the solution looks like: ![Solution when k1 and k2 are both equal to 1.0](fig/10_01_solution.svg) - ## Example 2: solid tumour growth --- @@ -76,18 +71,13 @@ Assuming that the growth rate of the tumour depends only on the availability of 3. The rate at which the tumour acquires nutrients, and hence the rate at which the volume increases, is thus proportional to $\;V^{2/3}$. 
- This gives us the equation: -> $$\displaystyle\frac{{\rm d}V}{{\rm d}t} = kV^{2/3}$$ - -> $$\displaystyle\frac{{\rm d}V}{{\rm d}t} = kV^{2/3}$$ +> $$\displaystyle\frac{{\rm d}V}{{\rm d}t} = kV^{2/3}$$ > $$\displaystyle\frac{{\rm d}V}{{\rm d}t} = kV^{2/3}$$ Solve by separation of variables: -> $$\displaystyle\int V^{-2/3}~{\rm d}V = \int k~{\rm d}t$$ - -> $$\displaystyle V = \left({kt+c\over 3}\right)^3$$ +> $$\displaystyle\int V^{-2/3}~{\rm d}V = \int k~{\rm d}t$$ > $$\displaystyle V = \left({kt+c\over 3}\right)^3$$ where $c$ is a constant of integration, the value of which will depend upon the initial conditions. @@ -95,18 +85,15 @@ When $k=1$ and $c=10$, the solution looks like: ![Solution when k=1 and c=10](fig/10_02_solution.svg) - ## Second Order Differential Equations Let us try to solve the following equation: > $\displaystyle \frac{{\rm d}^2y}{{\rm d}x^2} = \left(\frac{{\rm d}y}{{\rm d}x}\right)^2$ - We will use the substitution $\displaystyle z = \frac{{\rm d}y}{{\rm d}x}$. This implies that $\displaystyle \frac{{\rm d}z}{{\rm d}x} = \frac{{\rm d}^2y}{{\rm d}x^2}$. - Substituting into the original equation, to eliminate $y$, gives > $$\displaystyle \frac{{\rm d}z}{{\rm d}x} = z^2$$ @@ -131,25 +118,22 @@ Determining the values of $A$ and $B$ can be done in several different ways, dep For example, if we know the following: -1. At $\;x=0,\;$ $\;\displaystyle \frac{{\rm d}y}{{\rm d}x} = -1\;$ and $\;y=0\;$. +- At $\;x=0,\;$ $\;\displaystyle \frac{{\rm d}y}{{\rm d}x} = -1\;$ and $\;y=0\;$. - We substitute the first condition into $\displaystyle \frac{{\rm d}y}{{\rm d}x} = -{1\over x+A}$ to obtain $\;A=1\;$. - - Then substitute $A$ and the second condition into the eventual solution - to find $\;B=0$. + We substitute the first condition into $\displaystyle \frac{{\rm d}y}{{\rm d}x} = -{1\over x+A}$ to obtain $\;A=1\;$. + + Then substitute $A$ and the second condition into the eventual solution + to find $\;B=0$. --- Alternatively, if we instead know that: -2. 
$\;y(0)=0\;$ and $\;y(e-1)=-1$.
+- $\;y(0)=0\;$ and $\;y(e-1)=-1$.

-   This time both conditions can be substituted into the solution:
-
-   > $$\displaystyle y(0)=0 \Rightarrow 0=B-\ln(A) \Rightarrow B=\ln(A)$$
-
-   > $$\displaystyle y(e-1)=-1 \Rightarrow -1=\ln(A)-\ln{e-1+A} \Rightarrow A=1$$
+  This time both conditions can be substituted into the solution:
+  > $$\displaystyle y(0)=0 \Rightarrow 0=B-\ln(A) \Rightarrow B=\ln(A)$$ > $$\displaystyle y(e-1)=-1 \Rightarrow -1=\ln(A)-\ln(e-1+A) \Rightarrow A=1$$

## More integration tricks

@@ -165,13 +149,13 @@ We can split apart the integral on the RHS using **partial fractions** in SymPy

We want $\displaystyle \qquad{1\over y(k_1-k_2y)}={A\over y}+{B\over (k_1-k_2y)}$:

-
```python
+import sympy as sp
y, k1, k2 = sp.symbols('y k_1 k_2')
sp.apart(1 / (y*(k1 - k2*y)),y)
```

-> $\displaystyle - \frac{k_{2}}{k_{1} \left(- k_{1} + k_{2} y\right)} + \frac{1}{k_{1} y}$
+> $\displaystyle - \frac{k_{2}}{k_{1} \left(- k_{1} + k_{2} y\right)} + \frac{1}{k_{1} y}$

So $\displaystyle A={1\over k_1}$ and $B={k_2\over k_1}$.

@@ -186,7 +170,6 @@ Try doing the algebraic manipulation of this to make $y$ the subject of the equa

where $d$ is a constant.

-
### Introductory problems

::::challenge{id="10_intro_01" title="Introductory problems 1"}
Find the general solutions of the following differential equations:

@@ -200,8 +183,6 @@ Find the general solutions of the following differential equations:

Check your answers by differentiating them.
::::

-
-
### Main problems

::::challenge{id="10_main_01" title="Main problems 1"}
A circular patch of oil on the surface of some water has a radius $r$ metres at time $t$ minutes.
When $t=0$ minutes, $r=1\,$m and when $t=10$ minutes, $r=2\,$m.

1. Predict the value $T$ of $t$ when $r=4\,$m, using a simple model in which the rate of increase of $r$ is taken to be constant.
-Find $T$ for this model.
+   Find $T$ for this model.
1. In a more refined model, the rate of increase of $r$ is taken to be proportional to $1/r$.
-Express this statement as a differential equation, and find the general solution. -Find $T$ for this model. + Express this statement as a differential equation, and find the general solution. + Find $T$ for this model. 1. Compare the two models used in 1. and 2., by sketching $r(t)$ on the same figure or plotting using Python. -Comment on the differences seen during different time intervals. -:::: + Comment on the differences seen during different time intervals. +:::: ::::challenge{id="10_main_02" title="Main problems 2"} A nuclear installation local to Oxford 'lost' 17g of Cobalt-60 between two inspections 6 months apart. @@ -226,8 +207,6 @@ If this explanation was correct, what mass of Cobalt was stored on the site? (The half life of $^{60}$Co = $5.26$ years.) :::: - - ::::challenge{id="10_main_03" title="Main problems 3"} By making a substitution $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} z = \dd{y}{x}$, solve the equation @@ -239,19 +218,18 @@ Hint: you will need to use partial fractions for part of this question, to break :::: - - ::::challenge{id="10_main_04" title="Main problems 4"} Suppose a cell contains a chemical (the solute) dissolved in it at a concentration of $\displaystyle c(t)$, and the concentration of the same substance outside the cell is a constant $k$. By Fick's law, if $\displaystyle c(t)$ and $k$ are unequal, solute moves across the cell wall at a rate proportional to the difference between $\displaystyle c(t)$ and $k$, towards the region of lower concentration. 1. Write down a differential equation which is satisfied by $\displaystyle c(t)$. 1. Solve this differential equation with the initial condition $\displaystyle c(0)=c_0$. -1. Sketch the solutions for $\displaystyle c_0 > k$ and $\displaystyle k > c_0$. +1. Sketch the solutions for $\displaystyle c_0 > k$ and $\displaystyle k > c_0$. 1. Blood glucose concentration is 5.1 mM and the concentration inside the cell is at 0.1 mM. 
-If glucose utilisation within the cell is totally inhibited, it takes 1 min for the intracellular concentration to reach 2.6 mM. -How long would it take for the concentration to reach 5.0 mM? + If glucose utilisation within the cell is totally inhibited, it takes 1 min for the intracellular concentration to reach 2.6 mM. + How long would it take for the concentration to reach 5.0 mM? 1. Calculate the amount of glucose (moles) entering the cell to bring the concentration from 0.1 mM to 5 mM assuming the cell is spherical with a diameter of $10\,\mu$m. + :::: ### Extension problems @@ -265,4 +243,5 @@ where $k_1$ and $k_2$ are mass action coefficients. 1. Form and solve a differential equation describing the change in protein concentration. 1. What concentration is reached after 'sufficient' time has elapsed? + :::: diff --git a/scientific_computing/essential_maths/11_differential_equations_3.md b/scientific_computing/essential_maths/11_differential_equations_3.md index 1b1ac13f..64451f64 100644 --- a/scientific_computing/essential_maths/11_differential_equations_3.md +++ b/scientific_computing/essential_maths/11_differential_equations_3.md @@ -1,19 +1,16 @@ --- name: Differential equations 3 -dependsOn: [ - scientific_computing.essential_maths.10_differential_equations_2 -] +dependsOn: [scientific_computing.essential_maths.10_differential_equations_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Steady State Solutions and Mass Action @@ -90,8 +87,7 @@ We can see that it is stable by examining the graph: We can solve the differential equation by separation of variables. > $$\displaystyle \int {1\over k_1 - k_2 s}~{\rm d}s = \int~{\rm d}t $$ - -> $$\displaystyle s(t) = Be^{-k_2 t} + {k_1\over k_2} $$ +> $$\displaystyle s(t) = Be^{-k_2 t} + {k_1\over k_2} $$ Thus, the concentration of $S$ relaxes exponentially to the steady state, no matter the initial condition. @@ -112,7 +108,6 @@ then: > $$\displaystyle \frac{{\rm d}s}{{\rm d}t} = k_1s - k_2 s^2$$ - This system will be at steady state when $\displaystyle k_1 s = k_2 s^2$, i.e. 
- $s=0$ or
@@ -126,7 +121,6 @@ Here, $k_1 = k_2 = 1$:

- The local behaviour near the fixed point at $s=1$ is the same as in the previous plot, so we can immediately see that it is a stable steady state
- The local behaviour near the fixed point at $s=0$ is the opposite: moving one way or the other, the gradient will take $s$ even further away

-
## Non-graphical method

Let us investigate this same behaviour in a more rigorous manner.

@@ -147,10 +141,9 @@ The derivative (after substituting $s$ and $s^2$ into the original differential

It is, since $\varepsilon\gg\varepsilon^2$, negative, pushing $s$ back towards the steady state, hence it is a stable steady state.

-
## Example: the logistic equation

-The growth of a cell colony can be modelled by the *logistic* equation
+The growth of a cell colony can be modelled by the _logistic_ equation

> $$\displaystyle \frac{{\rm d}N}{{\rm d}t} = rN\left(1 - {N\over K}\right) $$

@@ -164,12 +157,9 @@ Here is a plot, for $r=K=1$:

![Plot illustrating the steady states of the logistic equation](fig/11_04_logistic_ss.svg)

-
-
- for small positive $N$, $rN>0$ and $N<K$, so $1-{N\over K}>0$ and the population size will increase, meaning that $N=0$ is an unstable steady state.
-
  In fact the growth rate is positive for $0<N<K$ and negative for $N>K$, making $N=K$ a stable steady state.

@@ -189,8 +179,8 @@ This can be solved using partial fractions on the left hand side:

To solve a differential equation:

1. Calculate the general solution
-   1. Try to write it as a separable equation first
-   2. Other methods (e.g. integrating factors) not covered in this course
+   1. Try to write it as a separable equation first
+   2. Other methods (e.g. integrating factors) not covered in this course
2. This general solution will include an arbitrary constant; this may be eliminated using initial conditions (if these are given)
3. 
Can check your solution numerically using Python

@@ -223,17 +213,12 @@ Let us look at some examples:

   Then we get the following differential equations:

-   > $$\displaystyle \frac{{\rm d}[A]}{{\rm d}t} = -k[A] [B]$$
-
-   > $$\displaystyle \frac{{\rm d}[B]}{{\rm d}t} = -k[A] [B]$$
-
-   > $$\displaystyle \frac{{\rm d}[C]}{{\rm d}t} = k[A] [B]$$
-
+   > $$\displaystyle \frac{{\rm d}[A]}{{\rm d}t} = -k[A] [B]$$ > $$\displaystyle \frac{{\rm d}[B]}{{\rm d}t} = -k[A] [B]$$ > $$\displaystyle \frac{{\rm d}[C]}{{\rm d}t} = k[A] [B]$$

2. Predation of R by W

   > $$\displaystyle R+W\underset{^k}{\rightarrow} W$$
-
+
   Gives the following differential equation:

   > $$\displaystyle \frac{{\rm d}R}{{\rm d}t} = -kRW$$
@@ -254,7 +239,6 @@ Let us look at some examples:

   > $$\displaystyle \frac{{\rm d}[C]}{{\rm d}t} = -k[C]$$

-
## Numerically solving differential equations

What if we can't solve the differential equation (or don't want to)?

@@ -267,7 +251,7 @@ Given a differential equation

> $$\displaystyle \frac{{\rm d}y}{{\rm d}t} = f(y, t)$$

-with initial state $\;y(t = t_0) = y_0,\;$ we can *approximate* the state at $t = t_0 + \delta{t}$ as:
+with initial state $\;y(t = t_0) = y_0,\;$ we can _approximate_ the state at $t = t_0 + \delta{t}$ as:

> $$\displaystyle y_1 = y(t_0 + \delta{t}) \approx y_0 + f(y_0, t_0) \cdot \delta{t}$$

@@ -277,19 +261,15 @@ and the next state as

and so on!

-This mean's we can estimate the *entire time course of $y(t)$*, provided:
+This means we can estimate the _entire time course of $y(t)$_, provided:

1. We can calculate $f(y, t)$ (or approximate it with a computer)
2. We're patient enough to take really tiny steps $\delta{t}$

We have already seen examples of this, using SciPy's `odeint` function, although this uses more sophisticated methods than the one described here.
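The update rule above takes only a few lines of code. Here is a minimal forward-Euler sketch in Python (the function name `euler` and the step count are our own illustrative choices; this is the naive scheme described above, not the adaptive method `odeint` uses):

```python
import numpy as np


def euler(f, y0, t0, t1, n):
    """Approximate y(t) on [t0, t1] for dy/dt = f(y, t), using n forward Euler steps."""
    dt = (t1 - t0) / n
    t = t0 + dt * np.arange(n)   # step times t_0, t_0 + dt, ..., t_1 - dt
    y = np.empty(n + 1)
    y[0] = y0
    for k in range(n):
        # y_{k+1} = y_k + f(y_k, t_k) * dt
        y[k + 1] = y[k] + f(y[k], t[k]) * dt
    return y


# Example: dy/dt = -y with y(0) = 1; the exact solution is y(t) = exp(-t)
y = euler(lambda y, t: -y, 1.0, 0.0, 1.0, 10000)
print(y[-1])  # close to exp(-1) ~ 0.3679
```

Shrinking the step `dt` moves the estimate closer to the exact value, which is exactly the "patience" trade-off mentioned above.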
-
-
### Introductory problems

-
-
::::challenge{id="11_intro_01" title="Introductory problems 1"}
Determine the steady states and their stabilities, for each of the following:

@@ -298,12 +278,11 @@ Determine the steady states and their stabilities, for each of the following:
1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{x}{t} = e^{x}(x^{2}-1)$
1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{N}{t} = \displaystyle r_{0}N\left(1-\frac{N}{K}\right)$, where $r_{0}<0$ and $r_{0},\;K$ are constants
1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{x}{t} = \displaystyle \frac{Ax}{h+x}$, where $A$ and $h$ are negative constants
+
::::

### Main problems

-
-
::::challenge{id="11_main_01" title="Main problems 1"}
Not all chemical systems relax exponentially to steady state.
Consider the bimolecular decay reaction

@@ -315,22 +294,21 @@ Assuming $k$ is a mass action constant, form and solve a differential equation r

If $a(0)=a_0$ you should get $\displaystyle \quad a(t) = \frac{1}{2kt + \frac{1}{a_0}}.$
::::

-
::::challenge{id="11_main_02" title="Main problems 2"}
The $SIS$ model is an appropriate model for diseases that mutate quickly and can therefore infect people multiple times, such as the common cold or sexually transmitted infections like gonorrhea and chlamydia.

In the model, individuals are 'susceptible' until they are 'infected', and then they return to being 'susceptible' again.
Infection requires the interaction of susceptible individuals with infected individuals and therefore follows the law of mass action, whereas the rate at which an individual becomes susceptible again after infection is constant.

-1. Let $S$ and $I$ be the proportions of the population that are susceptible and infected.
+ If infection happens at rate $\beta$ and recovery happens at rate $\gamma$, write down differential equations for $S$ and $I$. 1. Noting that $S$ and $I$ are proportions of the population, which is assumed constant, reduce the system to a single differential equation in terms of $I$. -In other words, write down a single equation, involving just $I$ and its derivative. + In other words, write down a single equation, involving just $I$ and its derivative. 1. Find both steady states of $I$. Under what conditions on $\beta$ and $\gamma$ are each attainable? 1. Without solving the differential equation, sketch the behaviour of $S$ and $I$ over time, starting with a small quantity of infected individuals. -Illustrate how both steady states may be achieved. -:::: + Illustrate how both steady states may be achieved. +:::: ::::challenge{id="11_main_03" title="Main problems 3"} Consider a closed reaction system consisting of a single reversible reaction: @@ -343,13 +321,12 @@ where $k_f$ and $k_b$ are mass action coefficients. 1. Formulate a pair of coupled differential equations for the change in concentration of $A$ and $B$. 1. Noting that the total concentration $T$ of reactants is constant ($T = [A] + [B]$), reduce the system of equations to a single differential equation. -In other words, write down a single equation, involving either just $A$ and its derivative, or just $B$ and its derivative. + In other words, write down a single equation, involving either just $A$ and its derivative, or just $B$ and its derivative. 1. Find the steady-state concentrations of $A$ and $B$. 1. Solve the single differential equation to reveal the transient behaviour. -Sketch the behaviour for different illustrative initial conditions. -:::: - + Sketch the behaviour for different illustrative initial conditions. 
+:::: ::::challenge{id="11_main_04" title="Main problems 4"} Consider the simple model @@ -357,18 +334,18 @@ Consider the simple model $$ \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{s}{t} = k - {V_{\rm max}s\over K_M + s} $$ + in which species $s$ is produced at a fixed rate and consumed via Michaelis-Menten kinetics. Find the steady state of $s$, and verify that it is stable for any non-negative parameter values, provided $\displaystyle V_{\rm max} > k$. :::: - - ::::challenge{id="11_main_05" title="Main problems 5"} Recall the simple model of the production and degradation of a protein from the lecture, shown by the reaction chain $$ \overset{v_1}{\longrightarrow} S \overset{v_2}{\longrightarrow} $$ + where $v_1$ and $v_2$ are reaction rates rather than mass action coefficients. 1. Suppose $v_1 = k_1$ and $v_2 = k_2$. @@ -380,14 +357,13 @@ where $v_1$ and $v_2$ are reaction rates rather than mass action coefficients. $$ and take the parameter values to be $\displaystyle k_0=6/11,\;k_1=60/11,\;k_2=11,\;k_3=1$. Determine the number of steady states and the type of each. + :::: ### Extension problems - - ::::challenge{id="11_ext_01" title="Extension problems 1"} -Various mathematical models have been proposed for the initial growth of solid tumours, and some are summarised in [*The Model Muddle: In Search of Tumor Growth Laws*](https://doi.org/10.1158/0008-5472.can-12-4355). +Various mathematical models have been proposed for the initial growth of solid tumours, and some are summarised in [_The Model Muddle: In Search of Tumor Growth Laws_](https://doi.org/10.1158/0008-5472.can-12-4355). They are differential equations describing the rate of change of tumour volume $V$ as a function of time $t$, for example: 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{V(t)}{t} = rV(t)$ @@ -400,8 +376,6 @@ Solve each equation both analytically and numerically, using Python. 
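For example, the first (exponential) law can be integrated numerically with SciPy and compared against its analytic solution $V(t) = V(0)e^{rt}$; the parameter values here are illustrative rather than taken from the paper:

```python
import numpy as np
from scipy.integrate import solve_ivp

r, V0 = 0.5, 0.01  # illustrative growth rate and initial volume
t_eval = np.linspace(0, 10, 101)

# dV/dt = r V, solved over t in [0, 10]
sol = solve_ivp(lambda t, V: r * V, (0, 10), [V0],
                t_eval=t_eval, rtol=1e-8, atol=1e-10)

analytic = V0 * np.exp(r * t_eval)
print(np.max(np.abs(sol.y[0] - analytic)))  # should be tiny
```

The other growth laws fit the same pattern: swap the lambda for the corresponding right-hand side and plot the trajectories together.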
As was done in Figure 1A in the paper, compare the behaviours of the different growth laws over a suitable time interval for an initially small tumour, again using Python. :::: - - ::::challenge{id="11_ext_02" title="Extension problems 2"} Find the solution to the following differential equations subject to the specified boundary conditions, using integrating factors: @@ -410,6 +384,7 @@ Find the solution to the following differential equations subject to the specifi 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} x^3\,\dd{y}{x} + 2y = e^{1/x^2} \quad\text{with}\quad y(1)=e$ 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \sec(x)\,\dd{y}{x} + y = 1 \quad\text{with}\quad y(0) = 1$ 1. $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{y}{x} + y\tan(x) = \cos(x) \quad\text{with}\quad y(0)=1 \quad\text{for}\quad 0 \le x < \frac{\pi}{2}$ + :::: ::::challenge{id="11_ext_03" title="Extension problems 3"} @@ -423,5 +398,5 @@ Show that $\displaystyle y = Ae^{z_1 x} + Be^{z_2 x}$ is a solution to this equa 1. Recalling that any complex number $z$ can be written as $\displaystyle z = re^{i\theta} = r(\cos\theta + i\sin\theta)$, what does this tell you about the nature of the solution? 1. If $y(0)=1$ and $y(\pi/2)=2$ what is the particular solution of the differential equation? 
-:::: +:::: diff --git a/scientific_computing/essential_maths/12_linear_algebra_1.md b/scientific_computing/essential_maths/12_linear_algebra_1.md index abe12485..f886beb1 100644 --- a/scientific_computing/essential_maths/12_linear_algebra_1.md +++ b/scientific_computing/essential_maths/12_linear_algebra_1.md @@ -1,23 +1,21 @@ --- name: Linear algebra 1 -dependsOn: [ - scientific_computing.essential_maths.08_complex_numbers -] +dependsOn: [scientific_computing.essential_maths.08_complex_numbers] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Introduction to Matrices ---- +--- ## YouTube lecture recording from October 2020 @@ -63,7 +61,6 @@ While the algebraic manipulation is straightforward when solving two equations, What we want is a way to be able to easily manipulate **linear** systems, regardless of how big they are. - ## The matrix Matrices are a structure that allow us to more easily manipulate linear systems. @@ -115,7 +112,6 @@ $$ 5\times\left(\begin{matrix} 2&1\\ 3&-4 \end{matrix}\right)=\left(\begin{matrix}10&5\\ 15&-20\end{matrix}\right) $$ - ### Matrix multiplication To multiply two matrices, we multiply each **row** in the first matrix by each **column** in the second one, and put the results into a new matrix. @@ -126,7 +122,6 @@ $$ \left(\begin{matrix} 1&2 \\ 3&4 \end{matrix}\right) \left(\begin{matrix} 5&6\\7&8\end{matrix}\right) = \left(\begin{matrix} 1 \times 5 + 2 \times 7 & 1 \times 6 + 2 \times 8 \\ 3 \times 5 + 4 \times 7 & 3 \times 6 + 4\times 8\end{matrix}\right) = \left(\begin{matrix} 19&22\\43&46\end{matrix}\right) $$ - $$ \left(\begin{matrix} 1&2&3&4\\ 5&6&7&8 \end{matrix}\right) \left(\begin{matrix} 1&2&3\\ 4&5&6\\ 7&8&9\\ 10&11&12 \end{matrix}\right) = \left(\begin{matrix} 70&80&90\\ 158&184&210 \end{matrix}\right) $$ @@ -143,19 +138,17 @@ $$ \left(\begin{matrix} 2 & 3 & 1 \\ 2 & -1 & 3\end{matrix}\right)\left(\begin{matrix} 1 & 0 \\ -1 & -4\end{matrix}\right) =\;?\;?\;? 
$$ - ### Matrix multiplication is not commutative This means that $A \times B$ is not the same as $B \times A$. This can be easily seen from the fact that multiplying different sized matrices doesn't always work: -> $(3 x 2 \rm{matrix}) \times (2 x 2 \rm{matrix}) = (3 x 2 \rm{matrix})$ - -> $(2 x 2 \rm{matrix}) \times (3 x 2 \rm{matrix}) = ???$ +> $(3 x 2 \rm{matrix}) \times (2 x 2 \rm{matrix}) = (3 x 2 \rm{matrix})$ > $(2 x 2 \rm{matrix}) \times (3 x 2 \rm{matrix}) = ???$ However, even when sizes match, the product is usually not the same. ### The identity matrix + $I$ is the identity matrix, which has the property that: > $A I = I A = A$ @@ -219,11 +212,8 @@ Then: As a check, calculate $A^{-1}A$: > $$\displaystyle A^{-1}A= \frac{1}{2}\left(\begin{matrix}4&3\\ 2&2\end{matrix}\right)\left(\begin{matrix}2&-3\\ -2&4\end{matrix}\right) $$ - > $$\displaystyle = \frac{1}{2}\left(\begin{matrix}2&0\\ 0&2\end{matrix}\right)$$ - > $$\displaystyle = \left(\begin{matrix}1&0\\ 0&1\end{matrix}\right)$$ - > $$\displaystyle =I_2.$$ ## The transpose of a Matrix @@ -232,9 +222,7 @@ $A^T$ is the transpose of $A$. Swap elements across the leading diagonal so that $A^T_{ij}= A_{ji}$. -> $$\displaystyle A=\left(\begin{matrix}2&1&2\\ 1&4&6\\ 1&-1&2\end{matrix}\right)$$ - -> $$\displaystyle A^T=\left(\begin{matrix}2&1&1\\ 1&4&-1\\ 2&6&2\end{matrix}\right)$$ +> $$\displaystyle A=\left(\begin{matrix}2&1&2\\ 1&4&6\\ 1&-1&2\end{matrix}\right)$$ > $$\displaystyle A^T=\left(\begin{matrix}2&1&1\\ 1&4&-1\\ 2&6&2\end{matrix}\right)$$ ## Solving a linear system using matrices @@ -242,7 +230,7 @@ To solve a matrix system $\displaystyle A {\bf x} = {\bf b}$ for an unknown left - If it's of order 2 then use the formula to write $A^{-1}$ and hence ${\bf x} = A^{-1}{\bf b}$. -- If it's larger $(3\times3)$ then there's still a formula for $A^{-1}$ (not in this course). +- If it's larger $(3\times3)$ then there's still a formula for $A^{-1}$ (not in this course). 
- Use an analytical method (Gaussian elimination) to find the inverse (not in this course). @@ -250,7 +238,6 @@ To solve a matrix system $\displaystyle A {\bf x} = {\bf b}$ for an unknown left - Solve using linear algebra software (e.g. in Python, which we will see shortly). - ### Example of solving a 2x2 linear system > $$\displaystyle A^{-1}A{\bf x}=A^{-1}{\bf b}$$ @@ -278,9 +265,7 @@ We have: Thus: -> $$\displaystyle \left(\begin{matrix}x\\ y\end{matrix}\right) = \frac{1}{10}\left(\begin{matrix}5 &-5\\ 1&1\end{matrix}\right)\left(\begin{matrix}11\\ 9\end{matrix}\right) =\frac{1}{10} \left(\begin{matrix}10\\ 20\end{matrix}\right)$$ - -> $$\displaystyle =\left(\begin{matrix}1\\ 2\end{matrix}\right)$$ +> $$\displaystyle \left(\begin{matrix}x\\ y\end{matrix}\right) = \frac{1}{10}\left(\begin{matrix}5 &-5\\ 1&1\end{matrix}\right)\left(\begin{matrix}11\\ 9\end{matrix}\right) =\frac{1}{10} \left(\begin{matrix}10\\ 20\end{matrix}\right)$$ > $$\displaystyle =\left(\begin{matrix}1\\ 2\end{matrix}\right)$$ And $x=1, y=2$ @@ -304,14 +289,14 @@ In matrix form, this gives: > $\displaystyle \left(\begin{matrix} 1 & 5 & 3 & -1 \\ 1 & -2 & 1 & 4 \\ -3 & 1 & -1 & 2\\ 1 & 1 & 1 & 0 \end{matrix}\right) \left(\begin{matrix} x \\ y \\ z \\ w\end{matrix}\right) = \left(\begin{matrix} 5 \\ 2 \\ -5 \\ 0\end{matrix}\right).$ - #### Numerically, using NumPy ```python -## In python, we use numpy arrays to store the needed matrices -## the procedure linalg.solve, solves the system Ax = b -## We could also calculate the inverse of A (linalg.inv), and then multiply. -## But this is faster +# In python, we use numpy arrays to store the needed matrices +# the procedure linalg.solve, solves the system Ax = b +# We could also calculate the inverse of A (linalg.inv), and then multiply. 
+# But this is faster +import numpy as np A = np.array([[1,5,3,-1],[1,-2,1,4],[-3,1,-1,2],[1,1,1,0]]) b = np.array([5, 2, -5, 0]) @@ -323,14 +308,15 @@ print(x) print(np.matmul(A,x)) ``` -> ``` -> [-5.94444444 -5.11111111 11.05555556 -3.33333333] -> [ 5.0000000e+00 2.0000000e+00 -5.0000000e+00 -8.8817842e-16] -> ``` +```text +[-5.94444444 -5.11111111 11.05555556 -3.33333333] +[ 5.0000000e+00 2.0000000e+00 -5.0000000e+00 -8.8817842e-16] +``` #### Symbolically, using SymPy ```python +import sympy as sp A = sp.Matrix([[1,5,3,-1],[1,-2,1,4],[-3,1,-1,2],[1,1,1,0]]) A.inv() * sp.Matrix([5, 2, -5, 0]) @@ -339,10 +325,6 @@ A.inv() * sp.Matrix([5, 2, -5, 0]) $\displaystyle \left[\begin{matrix}- \frac{107}{18}\\- \frac{46}{9}\\\frac{199}{18}\\- \frac{10}{3}\end{matrix}\right]$ - - - - ### Introductory problems ::::challenge{id="12_intro_01" title="Introductory problems 1"} @@ -366,24 +348,19 @@ Where $A$ and $B$ are $3\times3$ matrices, and $\mathbf{x}$, $\mathbf{y}$, $\mat Check that your answers make sense by expanding your expressions to ensure you get back to the original equations. :::: - ::::challenge{id="12_intro_02" title="Introductory problems 2"} Given > $$\displaystyle A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix} $$ - > $$\displaystyle B = \begin{pmatrix} 1 & 4 \\ 7 & 2 \end{pmatrix} $$ - > $$\displaystyle C = \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} $$ - > $$\displaystyle D = \begin{pmatrix} 1 \\ 3 \end{pmatrix} $$ - > $$\displaystyle E = \begin{pmatrix} 2 & -1 \end{pmatrix} $$ 1. Write down $a_{21}$, $b_{12}$ and $c_{22}$ 1. Calculate: - - $A + A$ + - $A + A$ - $A - B$ - $4C$ 1. Calculate, where possible, or explain why the product is not defined: @@ -394,6 +371,7 @@ Given 1. Do A and B commute? Do A and C commute? 1. Does $(AB)C = A(BC)$? Does this either prove or disprove that matrix multiplication is associative? 1. Does $AC + BC = (A+B)C$? Does this either prove or disprove the distributive property of matrices? 
+ :::: ### Main problems @@ -402,24 +380,23 @@ Given If $\displaystyle A = \frac{1}{2}\left(\begin{array}{cc} 1 & 1 \\ 1 & 1 \end{array}\right)$, find $A^2$ and $A^3$ and comment on your results. :::: - ::::challenge{id="12_main_02" title="Main problems 2"} Given > $$\displaystyle A = \begin{pmatrix} 2 & 1 & 3 \\ 3 & -2 & 1 \\ -1 & 0 & 1 \end{pmatrix}; \qquad B = \begin{pmatrix} 0 & -1 & 1 \\ -5 & 2 & -1 \\ 3 & 0 & 2 \end{pmatrix}$$ Find: - - $AB$ - - $BA$ - - ${A}^T {B}^T$ - - ${B}^T {A}^T$ - - ${(AB)^T}$ - - ${(BA)^T}$ + +- $AB$ +- $BA$ +- ${A}^T {B}^T$ +- ${B}^T {A}^T$ +- ${(AB)^T}$ +- ${(BA)^T}$ Comment on your results. :::: - ::::challenge{id="12_main_03" title="Main problems 3"} Find the determinant, $|A|$, of the following matrices: @@ -428,15 +405,13 @@ Find the determinant, $|A|$, of the following matrices: :::: - ::::challenge{id="12_main_04" title="Main problems 4"} Find the inverse, $A^{-1}$, of the following matrices: 1. $A = \begin{pmatrix} 2 & 5 \\-1 & 4 \end{pmatrix}$ 1. $A = \begin{pmatrix} -3 & 2 \\-1 & 7 \end{pmatrix}$ -:::: - +:::: ::::challenge{id="12_main_05" title="Main problems 5"} @@ -459,39 +434,33 @@ x = np.linalg.solve(A, b) print(x) ``` -:::: +:::: ::::challenge{id="12_main_06" title="Main problems 6"} > $$\displaystyle X = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}; \qquad Y = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$$ Calculate: - - $XY$ - - $YX$ - - $X^{-1}$ - - $Y^{-1}$ - - $X^{-1}Y^{-1}$ - - $Y^{-1}X^{-1}$ - - $(XY)^{-1}$ - - $(YX)^{-1}$ + +- $XY$ +- $YX$ +- $X^{-1}$ +- $Y^{-1}$ +- $X^{-1}Y^{-1}$ +- $Y^{-1}X^{-1}$ +- $(XY)^{-1}$ +- $(YX)^{-1}$ Comment on your results. 
:::: - - - - ### Extension problems ::::challenge{id="12_ext_01" title="Extension problems 1"} Show that -> $$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix} = \lambda_1 \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix}$$ - -> $$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{-1}{\sqrt 2} \end{pmatrix} = \lambda_2 \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{-1}{\sqrt 2} \end{pmatrix}$$ +> $$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix} = \lambda_1 \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix}$$ > $$\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{-1}{\sqrt 2} \end{pmatrix} = \lambda_2 \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{-1}{\sqrt 2} \end{pmatrix}$$ where $\lambda_1$ and $\lambda_2$ are constants to be determined. :::: - diff --git a/scientific_computing/essential_maths/13_linear_algebra_2.md b/scientific_computing/essential_maths/13_linear_algebra_2.md index 784eceb7..890aebeb 100644 --- a/scientific_computing/essential_maths/13_linear_algebra_2.md +++ b/scientific_computing/essential_maths/13_linear_algebra_2.md @@ -1,24 +1,21 @@ --- name: Linear algebra 2 -dependsOn: [ - scientific_computing.essential_maths.12_linear_algebra_1 -] +dependsOn: [scientific_computing.essential_maths.12_linear_algebra_1] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Eigenvalues and Eigenvectors ---- +--- ## YouTube lecture recording from October 2020 @@ -40,11 +37,12 @@ $$ ### NumPy ```python +import numpy as np A = np.array([[3, 0, 2], [3, 0, -3], [0, 1, 1]]) np.linalg.inv(A) ``` -> ``` +> ```text > array([[ 0.2 , 0.13333333, 0. ], > [-0.2 , 0.2 , 1. ], > [ 0.2 , -0.2 , -0. 
]]) @@ -53,13 +51,13 @@ np.linalg.inv(A) ### SymPy ```python +import sympy as sp A = sp.Matrix([[3, 0, 2], [3, 0, -3], [0, 1, 1]]) A.inv() ``` > $\displaystyle \left[\begin{matrix}\frac{1}{5} & \frac{2}{15} & 0\\- \frac{1}{5} & \frac{1}{5} & 1\\\frac{1}{5} & - \frac{1}{5} & 0\end{matrix}\right]$ - ## It doesn't always work Consider the following system @@ -72,13 +70,13 @@ eq3: & 7x & +& 10y & + & 7z & = & c \end{array} $$ -```python +```python nolint A = np.array([[1, 1, 1], [2, 5, 2], [7, 10, 7]]) np.linalg.inv(A) ``` -> ``` +> ```text > --------------------------------------------------------------------------- > > LinAlgError Traceback (most recent call last) @@ -103,14 +101,13 @@ np.linalg.inv(A) > LinAlgError: Singular matrix > ``` - ## Singular matrices The **rank** of an $\;n\,\times\,n\;$ matrix $\;A\;$ is the number of linearly independent rows in $\;A\;$ (rows not combinations of other rows). When $\;\text{rank}(A) < n\;$ then -- The system $\;A\textbf{x} = \textbf{b}\;$ has *fewer* equations than unknowns +- The system $\;A\textbf{x} = \textbf{b}\;$ has _fewer_ equations than unknowns - The matrix is said to be singular - $A\;$ has no inverse - The determinant of $\;A\;$ is 0 @@ -120,7 +117,7 @@ When $\;\text{rank}(A) < n\;$ then An under-determined system (fewer equations than unknowns) may mean that there are **many solutions** or that there are **no solutions**. 
-An example with many solutions is +An example with many solutions is $$ \begin{align*} @@ -174,23 +171,21 @@ Previous example of a singular system: $$\displaystyle A = \left(\begin{matrix} 1& 1& 1\\ 2& 5& 2\\ 7& 10&7 \end{matrix}\right)$$ - ```python A = np.array([[1, 1, 1], [2, 5, 2], [7, 10, 7]]) np.linalg.matrix_rank(A) ``` -> ``` +> ```text > 2 > ``` - ```python import scipy.linalg scipy.linalg.null_space(A) ``` -> ``` +> ```text > array([[-7.07106781e-01], > [-1.11022302e-16], > [ 7.07106781e-01]]) @@ -202,24 +197,19 @@ Try it: > $$\left(\begin{matrix} 1& 1& 1\\ 2& 5& 2\\ 7& 11&7 \end{matrix}\right) \left(\begin{matrix} -1000\\ 0 \\ 1000 \end{matrix}\right) = \quad ?$$ - ```python np.matmul(A,np.array([-1000,0,1000])) ``` -> ``` +> ```text > array([0, 0, 0]) > ``` - ## Eigenvectors: motivation The **eigenvalues** and **eigenvectors** give an indication of how much effect the matrix has, and in what direction. -> $$\displaystyle A=\left(\begin{matrix} \cos(45)&-\sin(45)\\ \sin(45)&\cos(45)\\\end{matrix}\right)\qquad\text{has no scaling effect.}$$ - -> $$\displaystyle B=\left(\begin{matrix} 2& 0 \\ 0&\frac{1}{2}\\\end{matrix}\right)\qquad\qquad\text{doubles in }x\text{, but halves in }y\text{.}$$ - +> $$\displaystyle A=\left(\begin{matrix} \cos(45)&-\sin(45)\\ \sin(45)&\cos(45)\\\end{matrix}\right)\qquad\text{has no scaling effect.}$$ > $$\displaystyle B=\left(\begin{matrix} 2& 0 \\ 0&\frac{1}{2}\\\end{matrix}\right)\qquad\qquad\text{doubles in }x\text{, but halves in }y\text{.}$$ Repeated applications of $\;A\;$ stay the same distance from the origin, but repeated applications of $\;B\;$ move towards $\;(\infty, 0).$ @@ -244,6 +234,7 @@ Note that if $\;\textbf{v}\;$ is a solution, then so is a scaling $\;a\textbf{v} > $$\displaystyle A (a \textbf{v}) = \lambda (a \textbf{v}).$$ ## Finding Eigenvalues + Another way to write previous equation: $$ @@ -286,11 +277,7 @@ so $\displaystyle (A-\lambda I)$ must be singular: What are the eigenvalues for this matrix? 
-> $$\displaystyle A=\left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right)$$ - -> $$\displaystyle |A-\lambda I|=\left\vert\begin{matrix}-2-\lambda&-2\\ 1&-5-\lambda\end{matrix}\right\vert=(-2-\lambda)(-5-\lambda)-(-2)$$ - -> $$\displaystyle =10+5\lambda+\lambda^2+2\lambda+2=\lambda^2+7\lambda+12=(\lambda+3)(\lambda+4)=0$$ +> $$\displaystyle A=\left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right)$$ > $$\displaystyle |A-\lambda I|=\left\vert\begin{matrix}-2-\lambda&-2\\ 1&-5-\lambda\end{matrix}\right\vert=(-2-\lambda)(-5-\lambda)-(-2)$$ > $$\displaystyle =10+5\lambda+\lambda^2+2\lambda+2=\lambda^2+7\lambda+12=(\lambda+3)(\lambda+4)=0$$ So the eigenvalues are $\lambda_1=-3$ and $\lambda_2=-4$. @@ -298,13 +285,12 @@ So the eigenvalues are $\lambda_1=-3$ and $\lambda_2=-4$. Numpy: - ```python A = np.array([[-2, -2], [1, -5]]) np.linalg.eig(A)[0] ``` -> ``` +> ```text > array([-3., -4.]) > ``` @@ -314,14 +300,15 @@ SymPy: A2 = sp.Matrix([[-2, -2], [1, -5]]) A2.eigenvals() ``` + > $\displaystyle \left\{ -4 : 1, \ -3 : 1\right\}$ ## Finding Eigenvectors For an eigenvalue, the corresponding vector comes from substitution into $\;A \textbf{v} = \lambda \textbf{v}$: - ### Example + What are the eigenvectors for > $$\displaystyle A=\left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right)?$$ @@ -329,25 +316,19 @@ What are the eigenvectors for The eigenvalues are $\;\lambda_1=-3\;$ and $\;\lambda_2=-4.\;$ The eigenvectors are $\;\textbf{v}_1\;$ and $\;\textbf{v}_2\;$ where: -> $$\displaystyle A\textbf{v}_1=\lambda_1 \textbf{v}_1.$$ - -> $$\displaystyle \left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right) \left(\begin{matrix}u_1\\ v_1\\\end{matrix}\right) = \left(\begin{matrix}-3u_1\\ -3v_1\\\end{matrix}\right)$$ - -> $$\displaystyle u_1 = 2v_1. 
\text{ (from the top or bottom equation)}$$ - +> $$\displaystyle A\textbf{v}_1=\lambda_1 \textbf{v}_1.$$ > $$\displaystyle \left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right) \left(\begin{matrix}u_1\\ v_1\\\end{matrix}\right) = \left(\begin{matrix}-3u_1\\ -3v_1\\\end{matrix}\right)$$ > $$\displaystyle u_1 = 2v_1. \text{ (from the top or bottom equation)}$$ > $$\displaystyle \left(\begin{matrix}u_1\\ v_1\\\end{matrix}\right) = \left(\begin{matrix}2 \\ 1\\\end{matrix}\right), \left(\begin{matrix}1 \\ 0.5\\\end{matrix}\right), \left(\begin{matrix}-4.4 \\ -2.2\\\end{matrix}\right), \left(\begin{matrix}2\alpha \\ \alpha\\\end{matrix}\right)\ldots $$ ## Eigenvectors in Python Numpy: - ```python A = np.array([[-2, -2], [1, -5]]) np.linalg.eig(A)[1] ``` -> ``` +> ```text > array([[0.89442719, 0.70710678], > [0.4472136 , 0.70710678]]) > ``` @@ -359,13 +340,14 @@ c = sp.symbols('c') A2 = sp.Matrix([[-2, c], [1, -5]]) A2.eigenvects() ``` + > $\displaystyle \left[ \left( - \frac{\sqrt{4 c + 9}}{2} - \frac{7}{2}, \ 1, \ \left[ \left[\begin{matrix}\frac{3}{2} - \frac{\sqrt{4 c + 9}}{2}\\1\end{matrix}\right]\right]\right), \ \left( \frac{\sqrt{4 c + 9}}{2} - \frac{7}{2}, \ 1, \ \left[ \left[\begin{matrix}\frac{\sqrt{4 c + 9}}{2} + \frac{3}{2}\\1\end{matrix}\right]\right]\right)\right]$ ## Diagonalising matrices Any nonsingular matrix $A$ can be rewritten as a product of eigenvectors and eigenvalues. 
-If $\;A\;$ has eigenvalues $\;\lambda_1\;$ and $\;\lambda_2\;$ with corresponding eigenvectors $\;\left(\begin{matrix}u_1\\ v_1\\\end{matrix}\right)\;$ and +If $\;A\;$ has eigenvalues $\;\lambda_1\;$ and $\;\lambda_2\;$ with corresponding eigenvectors $\;\left(\begin{matrix}u_1\\ v_1\\\end{matrix}\right)\;$ and > $\displaystyle \left(\begin{matrix}u_2\\ v_2\\\end{matrix}\right)$ @@ -382,12 +364,10 @@ $$ This is like a scaling surrounded by rotations and separates how much effect the matrix has $\;(\lambda_i)\;$ from the directions $\;(\textbf{v}_i).$ - ## For example > $$\displaystyle A=\left(\begin{matrix}-2&-2\\ 1&-5\\\end{matrix}\right)$$ - ```python A = np.array([[-2, -2], [1, -5]]) w, v = np.linalg.eig(A) @@ -396,7 +376,7 @@ inv_v = np.linalg.inv(v) np.matmul( np.matmul(v, np.diag(w)) , inv_v ) ``` -> ``` +> ```text > array([[-2., -2.], > [ 1., -5.]]) > ``` @@ -419,7 +399,6 @@ Symmetric matricies have orthogonal eigenvectors, e.g. > $$\displaystyle A=\left(\begin{matrix}19&20&-16\\ 20&13&4 \\ -16&4&31\\\end{matrix}\right)$$ - ```python A = np.array([[19, 20, -16], [20, 13, 4], [-16, 4, 31]]) w, v = np.linalg.eig(A) @@ -430,7 +409,7 @@ print(np.dot(v[:,0],v[:,2])) print(np.dot(v[:,1],v[:,2])) ``` -> ``` +> ```text > [[ 0.66666667 -0.66666667 0.33333333] > [-0.66666667 -0.33333333 0.66666667] > [ 0.33333333 0.66666667 0.66666667]] @@ -443,6 +422,7 @@ print(np.dot(v[:,1],v[:,2])) > ``` ## Normalised eigenvectors + If > $$\displaystyle \left(\begin{matrix}x\\ y\\ z\\\end{matrix}\right),$$ @@ -454,12 +434,12 @@ is an eigenvector, then is the corresponding normalised vector: a vector of unit length (magnitude). ## Orthogonal matrices + A matrix is orthogonal if its columns are normalised orthogonal vectors. 
One can prove that if $\;M\;$ is orthogonal then: > $$\displaystyle M^TM=I\qquad M^T=M^{-1}$$ - Note that if the eigenvectors are written in orthogonal form then the diagonalising equation is simplified: @@ -478,7 +458,6 @@ A \end{align*} $$ - ## Summary - Matrix representation of simultaneous equations @@ -489,21 +468,12 @@ $$ - Python for solving systems, finding inverse, null space and eigenvalues/vectors - Diagonalising matrices (we will use this for systems of differential equations) - - - - - ### Introductory problems ::::challenge{id="13_intro_01" title="Introductory problems 1"} Given -> $$\displaystyle A = \left(\begin{array}{cc} 1 & 0 \\ 0 & i \end{array}\right);$$ - -> $$\displaystyle B = \left(\begin{array}{cc} 0 & i \\ i & 0 \end{array}\right);$$ - -> $$\displaystyle C = \left(\begin{array}{cc} \frac{1}{\sqrt{2}} & \frac{1}{2}\left(1-i\right) \\\frac{1}{2}\left(1+i\right) & -\frac{1}{\sqrt{2}} \end{array}\right);$$ +> $$\displaystyle A = \left(\begin{array}{cc} 1 & 0 \\ 0 & i \end{array}\right);$$ > $$\displaystyle B = \left(\begin{array}{cc} 0 & i \\ i & 0 \end{array}\right);$$ > $$\displaystyle C = \left(\begin{array}{cc} \frac{1}{\sqrt{2}} & \frac{1}{2}\left(1-i\right) \\\frac{1}{2}\left(1+i\right) & -\frac{1}{\sqrt{2}} \end{array}\right);$$ verify by hand, and using the `numpy.linalg` module, that @@ -520,8 +490,8 @@ A = np.array([[1, 0], [0, 1j]]) print(A * np.linalg.inv(A)) ``` -:::: +:::: ::::challenge{id="13_intro_02" title="Introductory problems 2"} Let A be an $n \times n$ invertible matrix. Let $I$ be the $n\times n$ identity matrix and let $B$ be an $n\times n$ matrix. @@ -530,7 +500,6 @@ Suppose that $ABA^{-1}=I$. Determine the matrix $B$ in terms of the matrix $A$. :::: - ### Main problems ::::challenge{id="13_main_01" title="Main problems 1"} @@ -552,31 +521,25 @@ Calculate and simplify $A^{2017} \mathbf{x}$. 
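A hint for the power: diagonalisation turns repeated matrix multiplication into scalar powers, since $A = VDV^{-1}$ implies $A^n = VD^nV^{-1}$. A numerical sketch with an illustrative symmetric matrix (not the $A$ from this problem):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])  # illustrative matrix
n = 17

w, V = np.linalg.eig(A)                     # A = V diag(w) V^{-1}
A_n = V @ np.diag(w**n) @ np.linalg.inv(V)  # A^n = V diag(w**n) V^{-1}

print(np.allclose(A_n, np.linalg.matrix_power(A, n)))  # True
```

For the problem itself, note that if $\mathbf{x}$ happens to be an eigenvector of $A$ with eigenvalue $\lambda$, then $A^{2017}\mathbf{x} = \lambda^{2017}\mathbf{x}$ directly.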
::::challenge{id="13_main_02" title="Main problems 2"} For each of the following matrices -> $$\displaystyle A = \left(\begin{array}{cc} 2 & 3 \\ 1 & 4 \end{array}\right);$$ - -> $$\displaystyle B = \left(\begin{array}{cc} 4 & 2 \\ 6 & 8 \end{array}\right);$$ - -> $$\displaystyle C = \left(\begin{array}{cc} 1 & 4 \\ 1 & 1 \end{array}\right);$$ - -> $$\displaystyle D = \left(\begin{array}{cc} x & 0 \\ 0 & y \end{array}\right),$$ +> $$\displaystyle A = \left(\begin{array}{cc} 2 & 3 \\ 1 & 4 \end{array}\right);$$ > $$\displaystyle B = \left(\begin{array}{cc} 4 & 2 \\ 6 & 8 \end{array}\right);$$ > $$\displaystyle C = \left(\begin{array}{cc} 1 & 4 \\ 1 & 1 \end{array}\right);$$ > $$\displaystyle D = \left(\begin{array}{cc} x & 0 \\ 0 & y \end{array}\right),$$ compute the determinant, eigenvalues and eigenvectors by hand. Check your results by verifying that $Q\mathbf{x} = \lambda_i \mathbf{x}$, where $Q=A$, $B$, $C$ or $D$, and by using the `numpy.linalg` module. ```python - # hint - import numpy as np +# hint +import numpy as np - A = np.array([[2, 3], [1, 4]]) +A = np.array([[2, 3], [1, 4]]) - e_vals, e_vecs = np.linalg.eig(A) +e_vals, e_vecs = np.linalg.eig(A) - print(e_vals) - print(e_vecs) +print(e_vals) +print(e_vecs) ``` -:::: +:::: ::::challenge{id="13_main_03" title="Main problems 3"} Orthogonal vectors. @@ -602,12 +565,6 @@ Find a $2{\times}2$ matrix $A$ such that $\displaystyle \quad A\left(\begin{arra ### Extension problems - ::::challenge{id="13_ext_01" title="Extension problems 1"} If there exists a matrix $M$ whose columns are those of normalised (unit length) and orthogonal vectors, prove that $M^TM=I$ which implies that $M^T=M^{-1}$. 
:::: - - - - - diff --git a/scientific_computing/essential_maths/14_system_1.md b/scientific_computing/essential_maths/14_system_1.md index c00eea8e..0a628e2a 100644 --- a/scientific_computing/essential_maths/14_system_1.md +++ b/scientific_computing/essential_maths/14_system_1.md @@ -1,25 +1,25 @@ --- name: Systems of differential equations 1 -dependsOn: [ - scientific_computing.essential_maths.11_differential_equations_3, - scientific_computing.essential_maths.13_linear_algebra_2 -] +dependsOn: + [ + scientific_computing.essential_maths.11_differential_equations_3, + scientific_computing.essential_maths.13_linear_algebra_2, + ] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Linear Systems ---- +--- ## YouTube lecture recording from October 2020 @@ -36,7 +36,6 @@ Our aim is to solve systems of equations of the form: for $\displaystyle i=1,\ldots,n$. - Let us first consider the simplest form: a $2\times2$ **linear** system $$ @@ -63,6 +62,7 @@ In order to understand these systems, we must first understand coupled linear sy $\displaystyle A = \left(\begin{matrix} 1 & 1\\2 & 0 \end{matrix}\right)$ ```python +import sympy as sp A = sp.Matrix([[1, 1],[2, 0]]) A.eigenvals() ``` @@ -76,8 +76,8 @@ $\displaystyle A = \left(\begin{matrix} 1 & 1\\2 & 0 \end{matrix}\right)$ ```python A.eigenvects() ``` -> $\displaystyle \left[ \left( -1, \ 1, \ \left[ \left[\begin{matrix}- \frac{1}{2}\\1\end{matrix}\right]\right]\right), \ \left( 2, \ 1, \ \left[ \left[\begin{matrix}1\\1\end{matrix}\right]\right]\right)\right]$ +> $\displaystyle \left[ \left( -1, \ 1, \ \left[ \left[\begin{matrix}- \frac{1}{2}\\1\end{matrix}\right]\right]\right), \ \left( 2, \ 1, \ \left[ \left[\begin{matrix}1\\1\end{matrix}\right]\right]\right)\right]$ ## Recap of diagonalisation @@ -103,6 +103,7 @@ $$ $$ we have three methods of analysing them mathematically: + - Turn them into one second order equation (if we can solve second order!) - Divide one by other, to get one equation independent of $t$ - Perform matrix diagonalisation (extends to $n \times n$ problems) @@ -127,7 +128,6 @@ y(0) &= 3. 
\end{align*} $$ - ## Method 1: Second order We start with: @@ -142,15 +142,10 @@ $$ We can convert that into a second order equation: $$ -\frac{{\rm d}^2x}{{\rm d}t^2} -= -\frac{{\rm d}x}{{\rm d}t} + \frac{{\rm d}y}{{\rm d}t} -= -\frac{{\rm d}x}{{\rm d}t} + 2x +\frac{{\rm d}^2x}{{\rm d}t^2} = \frac{{\rm d}x}{{\rm d}t} + \frac{{\rm d}y}{{\rm d}t} = \frac{{\rm d}x}{{\rm d}t} + 2x \quad \quad\implies \quad\quad -\boxed{\frac{{\rm d}^2x}{{\rm d}t^2} -= -\frac{{\rm d}x}{{\rm d}t} + 2x} + +\boxed{\frac{{\rm d}^2x}{{\rm d}t^2} = \frac{{\rm d}x}{{\rm d}t} + 2x} $$ ## Method 2: eliminate $t$ @@ -167,20 +162,16 @@ $$ Then, dividing: $$ -\frac{{\rm d}y}{{\rm d}x} -= -\frac{ \quad \frac{{\rm d}y}{{\rm d}t} \quad }{ \frac{{\rm d}x}{{\rm d}t} } +\frac{{\rm d}y}{{\rm d}x} = \frac{ \quad \frac{{\rm d}y}{{\rm d}t} \quad }{ \frac{{\rm d}x}{{\rm d}t} } \quad \quad\implies \quad\quad -\boxed{\frac{{\rm d}y}{{\rm d}x} -= -\frac{2x}{x+y} } + +\boxed{\frac{{\rm d}y}{{\rm d}x} = \frac{2x}{x+y} } $$ ## Method 3: diagonalisation Let $\displaystyle \mathbf{v}=\left(\begin{matrix}x\\y\end{matrix}\right),$ - then $$ @@ -188,10 +179,7 @@ $$ \frac{{\rm d}y}{{\rm d}t} = 2x, \quad \implies \quad \boxed{\frac{{\rm d}\mathbf{v}}{{\rm d}t} = A\mathbf{v},} $$ - -where $\displaystyle A = \left(\begin{matrix} 1 & 1\\2 & 0 \end{matrix}\right).$ - - +where $\displaystyle A = \left(\begin{matrix} 1 & 1\\2 & 0 \end{matrix}\right).$ Substitute @@ -209,14 +197,10 @@ We can now introduce a new variable $\displaystyle \;\mathbf{z} = \left(\begi > $$\displaystyle \frac{{\rm d}\mathbf{z}}{{\rm d}t} = \left(\begin{matrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{matrix}\right) \mathbf{z}.$$ - But now, because the matrix is diagonal, the system is no longer coupled. 
The first equation **only** involves $z_1$ and the second **only** involves $z_2$, so we can solve each one individually: -> $$\displaystyle \frac{{\rm d}z_1}{{\rm d}t} = \lambda_1 z_1 \qquad\implies\qquad z_1(t) = A\,e^{\lambda_1 t}$$ - - -> $$\displaystyle \frac{{\rm d}z_2}{{\rm d}t} = \lambda_2 z_2 \qquad\implies\qquad z_2(t) = B\,e^{\lambda_2 t}$$ +> $$\displaystyle \frac{{\rm d}z_1}{{\rm d}t} = \lambda_1 z_1 \qquad\implies\qquad z_1(t) = A\,e^{\lambda_1 t}$$ > $$\displaystyle \frac{{\rm d}z_2}{{\rm d}t} = \lambda_2 z_2 \qquad\implies\qquad z_2(t) = B\,e^{\lambda_2 t}$$ Finally, we can substitute $z_1$ and $z_2$ back in terms of $x$ and $y$ to find the solution to the original coupled system: @@ -226,18 +210,13 @@ We have Rearranging, we now have two equations relating $x$ and $y$: -> $$\displaystyle -2x + 2y = C\,e^{\lambda_1 t}$$ - -> $$\displaystyle 2x + y = D\,e^{\lambda_2 t}$$ - +> $$\displaystyle -2x + 2y = C\,e^{\lambda_1 t}$$ > $$\displaystyle 2x + y = D\,e^{\lambda_2 t}$$ -where $C=3A$ and $D=3B$. Using our initial conditions, $\,x(0)=0\,$ and $\,y(0)=3\,$ we find $C=6$ and $D=3$. +where $C=3A$ and $D=3B$. Using our initial conditions, $\,x(0)=0\,$ and $\,y(0)=3\,$ we find $C=6$ and $D=3$. 
Finally, solving the simultaneous equations, we have a solution: -> $$\displaystyle x(t) = -e^{\lambda_1 t} + e^{\lambda_2 t}$$ - -> $$\displaystyle y(t) = 2e^{\lambda_1 t} + e^{\lambda_2 t}$$ +> $$\displaystyle x(t) = -e^{\lambda_1 t} + e^{\lambda_2 t}$$ > $$\displaystyle y(t) = 2e^{\lambda_1 t} + e^{\lambda_2 t}$$ ## Summary @@ -245,13 +224,6 @@ Finally, solving the simultaneous equations, we have a solution: order ODEs - Best method depends on the system and what you need to ask about it - - - - - - - ### Introductory problems ::::challenge{id="14_intro_01" title="Introductory problems 1"} @@ -262,8 +234,6 @@ Find the general solution to the following system of ODEs: Sketch the form of the solution in the $x,\,y$ plane, using arrows to indicate where the solution moves over time. :::: - - ::::challenge{id="14_intro_02" title="Introductory problems 2"} Take the general decoupled linear system @@ -271,15 +241,15 @@ Take the general decoupled linear system 1. Integrate the two equations separately to solve for $x$ and $y$ in terms of $t$. 1. If you start at $t=0$, $x(0)=0$, $y(0)=0$ what happens to the solution over time? -1. If you start at a general position $x(0)=x_0$, $y(0)=y_0$ what happens to the solution as $t\rightarrow\infty$? +1. If you start at a general position $x(0)=x_0$, $y(0)=y_0$ what happens to the solution as $t\rightarrow\infty$? - What if $a$ and $b$ are both negative? - What if only one of $a$ or $b$ is negative? What if either $x_0$ or $y_0$ is negative? -1. Either by eliminating $t$ from the original equations or by eliminating $t$ from your solutions to part 1., find a general solution of the system. (Why not try both methods?) +1. Either by eliminating $t$ from the original equations or by eliminating $t$ from your solutions to part 1., find a general solution of the system. (Why not try both methods?) 1. Sketch this solution for - $\quad a>0,\;\;b>0,\;\;a=b$ - $\quad a>0,\;\;b<0,\;\;a=-b$. 
-:::: +:::: ### Main problems @@ -291,7 +261,6 @@ By reformulating the following system as one first order equation (i.e eliminati Sketch the form of the solutions in the $x,\,y$ plane. :::: - ::::challenge{id="14_main_02" title="Main problems 2"} Again by eliminating $t$ and reformulating the system as one first order equation, find the general solution to the following system of ODEs: @@ -314,15 +283,15 @@ Find the eigenvalues and two independent eigenvectors $\mathbf{v}_{1}$ and $\mat - What happens as $t\rightarrow \infty$? - What about $t\rightarrow -\infty$? - What is $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} \dd{y}{x}$ at a general point on the $y$-axis? -:::: +:::: ### Extension problems ::::challenge{id="14_ext_01" title="Extension problems 1"} -The force on a damped harmonic oscillator is $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} f = -k x - m \nu \dd{x}{t}$, where $x$ is a displacement, $k>0$ is a spring force constant, $m>0$ is the mass and $\nu>0$ is the strength of the damping. +The force on a damped harmonic oscillator is $\displaystyle \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} f = -k x - m \nu \dd{x}{t}$, where $x$ is a displacement, $k>0$ is a spring force constant, $m>0$ is the mass and $\nu>0$ is the strength of the damping. 1. Use Newton's 2nd law of motion to write down an equation for the acceleration $\displaystyle \def\ddd#1#2{{{{\rm d}^2#1}\over{\d{#2}^2}}} \def\d#1{{\rm d}#1} \ddd{x}{t}$. 1. Make the substitution $\displaystyle \def\ddd#1#2{{{{\rm d}^2#1}\over{\d{#2}^2}}} \def\d#1{{\rm d}#1} \def\dd#1#2{{\frac{{\rm d}#1}{{\rm d}#2}}} y=\dd{x}{t}\;\left(\text{and hence }\dd{y}{t} = \ddd{x}{t}\right)$ to obtain a system of two first-order linear ODEs. 
-:::: +:::: diff --git a/scientific_computing/essential_maths/15_system_2.md b/scientific_computing/essential_maths/15_system_2.md index 5e2a8320..25655e00 100644 --- a/scientific_computing/essential_maths/15_system_2.md +++ b/scientific_computing/essential_maths/15_system_2.md @@ -1,23 +1,21 @@ --- name: Systems of differential equations 2 -dependsOn: [ - scientific_computing.essential_maths.14_system_1 -] +dependsOn: [scientific_computing.essential_maths.14_system_1] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## System Simplification ---- +--- ## YouTube lecture recording from October 2020 @@ -33,16 +31,17 @@ Recap - So far we have looked at systems of **first order**, **linear** ODEs in **two dimensions** - These systems can be solved analytically - - 3 methods of solving systems of first order ODEs - - Diagonalisation extends to $N$-dimensional systems + - 3 methods of solving systems of first order ODEs + - Diagonalisation extends to $N$-dimensional systems Plan - Aim to look at systems of **first order**, **nonlinear** ODEs in **more dimensions** - How we go about modelling a problem - Simplifying systems of ODEs - - Reducing number of parameters - - Reducing number of equations + + - Reducing number of parameters + - Reducing number of equations - Phase plane analysis @@ -93,7 +92,7 @@ $$ \end{align*} $$ -Note that $\theta$, $\phi$ and $\tau$ are arbitrary values for scaling $N$, $P$, and $T$. +Note that $\theta$, $\phi$ and $\tau$ are arbitrary values for scaling $N$, $P$, and $T$. $$ \begin{align*} @@ -150,8 +149,7 @@ $$ \end{align*} $$ - -However, the enzyme is recycled: it is used in the complex and then released. This means that $e + c = e_{tot}$ where $e_{tot}$ is constant. +However, the enzyme is recycled: it is used in the complex and then released. This means that $e + c = e_{tot}$ where $e_{tot}$ is constant. 
Making the substitution $e = e_{tot} - c$ to eliminate $e$ we arrive at the 3 ODE system: @@ -190,14 +188,13 @@ $$ \end{align*} $$ - This means that we have used conservation and quasi-steady state to go from a 4-dimensional system $(s,e,c,p)$ to a two-dimensional approximation which captures some of the behaviour. Two dimensions are good because we can plot their behaviour on a phase plane diagram. ## Phase planes and nullclines -A system of **nonlinear** ODEs may have more than one fixed point (or may have none). +A system of **nonlinear** ODEs may have more than one fixed point (or may have none). Finding fixed points in two-dimensional systems is aided by **nullclines**. An $x$-nullcline is a line where $\dot{x}=0$ and a $y$-nullcline is a line where $\dot{y}=0$. @@ -217,12 +214,10 @@ has $x$-nullclines at $x=0$ and $1-x-y=0$; and $y$-nullclines at $y=0$ and $2-3x Nullcline intersections give us the fixed points. Nullclines can be annotated to give the direction (and magnitude) of the non-zero derivative. - ### Plot of the nullclines ![Plot of the nullclines of the ODE system](fig/15_01_nullclines.svg) - ### Plot of the phase plane ![Plot of the phase plane of the ODE system](fig/15_02_phase_plane.svg) @@ -230,45 +225,47 @@ The nullclines allow us to add arrows demonstrating the flow direction, and by following the arrows we can sketch the behaviour of solutions (green lines). The arrows can only cross the $x$-nullclines vertically, and the $y$-nullclines horizontally. - ### Python code to plot the phase plane ```python +import numpy as np +from matplotlib import pyplot as plt +import scipy.integrate def dX_dt(X, t): return np.array([ X[0]*(1. - X[0]) - X[0]*X[1], 2.*X[1]*(1.-X[1]/2.)
-3*X[0]*X[1]]) def plot_phase_plane(): - + plt.figure(figsize=(10,10)) - + init_x = [1.05, 0.9, 0.7, 0.5, 0.5, 0.32, 0.1] init_y = [1.0, 1.3, 1.6, 1.8, 0.2, 0.2, 0.2] - + plt.plot(init_x, init_y, 'g*', markersize=20) - + for v in zip(init_x,init_y): X0 = v # starting point X = scipy.integrate.odeint( dX_dt, X0, np.linspace(0,10,100)) plt.plot( X[:,0], X[:,1], lw=3, color='green') - - - + + + # plot nullclines x = np.linspace(-0.1,1.1,24) y = np.linspace(-0.1,2.1,24) - + plt.hlines(0,-1,15, color='#F39200', lw=4, label='y-nullcline 1') plt.plot(x,1 - x, color='#0072bd', lw=4, label='x-nullcline 2') plt.vlines(0,-1,15, color='#0072bd', lw=4, label='x-nullcline 1') plt.plot(x,2 - 3*x, color='#F39200', lw=4, label='y-nullcline 2') # quiverplot - define a grid and compute direction at each point - X , Y = np.meshgrid(x, y) # create a grid + X, Y = np.meshgrid(x, y) # create a grid DX = X*(1-X) - X*Y # evaluate dx/dt - DY = 2*Y*(1 - Y/2.0) - 3*X*Y # evaluate dy/dt - M = (np.hypot(DX, DY)) # norm growth rate - M[ M == 0] = 1. # avoid zero division errors + DY = 2*Y*(1 - Y/2.0) - 3*X*Y # evaluate dy/dt + M = (np.hypot(DX, DY)) # norm growth rate + M[ M == 0] = 1. # avoid zero division errors plt.quiver(X, Y, DX/M, DY/M, M) plt.xlim(-0.05,1.1) @@ -280,13 +277,12 @@ def plot_phase_plane(): ## Summary - Simplification - - Rescaling to dimensionless quantities - - Conservation - - Quasi-steady state approximation - -- Nullclines are a powerful way of finding steady states and phase flow + - Rescaling to dimensionless quantities + - Conservation + - Quasi-steady state approximation +- Nullclines are a powerful way of finding steady states and phase flow ### Introductory problems @@ -296,21 +292,19 @@ Find the fixed points of the following linear systems: 1. $\displaystyle \dot{x} = x+3y, \qquad \dot{y}=-6x+5y;$ 1. $\displaystyle \dot{x} = x+3y+4, \qquad \dot{y}=-6x+5y-1;$ 1. 
$\displaystyle \dot{x} = x+3y+1, \qquad \dot{y}=-6x+5y.$ -:::: - +:::: ::::challenge{id="15_intro_02" title="Introductory problems 2"} Find the fixed points of the following nonlinear systems: 1. $\displaystyle \dot{x} = -4y+2xy-8 \qquad \dot{y}=4y^2-x^2;$ 1. $\displaystyle \dot{x} = y-x^2+2, \qquad \dot{y}=2(x^2-y^2).$ + :::: ### Main problems - - ::::challenge{id="15_main_01" title="Main problems 1"} Consider the chemical reaction network: @@ -318,8 +312,8 @@ Consider the chemical reaction network: 1. Write down the system of two linear ODEs which describe the evolution of the concentrations of A and B in this system under the law of mass action. 1. Find the ratio of concentrations of A and B for which this system is in steady state: that is the concentrations do not change over time. -:::: +:::: ::::challenge{id="15_main_02" title="Main problems 2"} Consider the reversible enzyme reaction: @@ -333,7 +327,6 @@ Verify the Haldane relation, which states that when the reaction is in equilibri where $p$ and $s$ are the concentrations of $P$ and $S$, respectively. :::: - ::::challenge{id="15_main_03" title="Main problems 3"} The population of a host, $H(t)$, and a parasite, $P(t)$, are described approximately by the equations: @@ -357,8 +350,6 @@ Sketch the phase flow across the following lines: :::: - - ::::challenge{id="15_main_04" title="Main problems 4"} Consider a lake with some fish attractive to anglers. 
We wish to model the fish-angler interaction under the following assumptions: @@ -373,4 +364,3 @@ We wish to model the fish-angler interaction under the following assumptions: > $$ \dot{x} = rx(1 - x) - xy,\qquad \dot{y} = \beta x - y $$ :::: - diff --git a/scientific_computing/essential_maths/16_system_3.md b/scientific_computing/essential_maths/16_system_3.md index 355e5806..039c7e7a 100644 --- a/scientific_computing/essential_maths/16_system_3.md +++ b/scientific_computing/essential_maths/16_system_3.md @@ -1,24 +1,21 @@ --- name: Systems of differential equations 3 -dependsOn: [ - scientific_computing.essential_maths.15_system_2 -] +dependsOn: [scientific_computing.essential_maths.15_system_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Phase Planes and Stability ---- +--- ## YouTube lecture recording from October 2020 @@ -47,7 +44,6 @@ $$ ![Plot of the phase plane](fig/16_01_phase_plane.svg) - ## Linear ODEs for understanding nonlinear The decoupled ODE system @@ -65,10 +61,10 @@ Solutions look like $\;x=Ae^{\lambda_1 t}\;$ and $\;y=Be^{\lambda_2 t}\;$ and th or shrink exponentially depending on the values of $\;\lambda_1\;$ and $\;\lambda_2.$ If $\;\lambda_1 < 0\;$ and $\;\lambda_2 < 0\;$ then all the flow is towards the -fixed point. If $\;\lambda_1\;$ or $\;\lambda_2\;$ is positive then some flow will be driven away (towards infinity). +fixed point. If $\;\lambda_1\;$ or $\;\lambda_2\;$ is positive then some flow will be driven away (towards infinity). Adding in a constant (inhomogeneous) component shifts the fixed point -away from the origin. Where is the fixed point of +away from the origin. Where is the fixed point of $$ \begin{align*} @@ -89,7 +85,7 @@ $$ \end{align*} $$ -has a fixed point at the origin. The long-term growth or shrinkage of solutions over time is determined by the eigenvalues of the matrix +has a fixed point at the origin. 
The long-term growth or shrinkage of solutions over time is determined by the eigenvalues of the matrix > $$\displaystyle A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$ @@ -107,8 +103,7 @@ can be written $$ \left( \begin{array}{c} \dot{x} \\ \dot{y} \end{array} -\right) -= +\right) = \left( \begin{array}{cc} a & b \\ c& d \end{array} \right) @@ -127,8 +122,7 @@ It has a fixed point at $$ \left( \begin{array}{c} x \\ y \end{array} -\right) -= +\right) = -\left( \begin{array}{cc} a & b \\ c& d \end{array} \right)^{-1} @@ -182,7 +176,7 @@ $$ $$ This means that (really close to the fixed point) we can approximate -with a linear system. The eigenvalues $\;\lambda_1,\;\lambda_2\;$ of the matrix +with a linear system. The eigenvalues $\;\lambda_1,\;\lambda_2\;$ of the matrix $$ J = \left( @@ -190,7 +184,7 @@ J = \left( \right) $$ -will determine if a small perturbation away from $\;(x^*,\;y^*)\;$ will decay or grow. +will determine if a small perturbation away from $\;(x^*,\;y^*)\;$ will decay or grow. ## Steady state classification @@ -200,16 +194,14 @@ J = \left( \right) $$ - - $\lambda_1<\lambda_2<0$ Stable node - $\lambda_1=\lambda_2<0$ Stable star - $\lambda_1>\lambda_2>0$ Unstable node - $\lambda_1=\lambda_2>0$ Unstable star -- $\lambda_1<0<\lambda_2$ Saddle (or hyperbolic) point: unstable +- $\lambda_1<0<\lambda_2$ Saddle (or hyperbolic) point: unstable - Complex $\lambda$: Spiral (with real part determining stability) - Imaginary $\lambda$: Neutral (solution cycles round fixed point) - The presence of negative eigenvalues determines whether a steady state is physically viable. ### Eigenvalues of $J$ @@ -297,17 +289,13 @@ $$ If we now look again at the phase plane, after having calculated the stability of the fixed points, we can see that the arrows move towards the stable fixed points, and away from the unstable ones. 
- ![Plot of the phase plane](fig/16_01_phase_plane.svg) - ## Summary -- Eigenvalues tell us about the behaviour of linear systems +- Eigenvalues tell us about the behaviour of linear systems - Eigenvalues tell us about the stability of nonlinear systems - - ### Main problems > These questions are extensions of questions on the previous page. @@ -318,6 +306,7 @@ Classify the fixed points and discuss stability of the following linear systems: 1. $\displaystyle \dot{x} = x+3y, \qquad \dot{y}=-6x+5y;$ 1. $\displaystyle \dot{x} = x+3y+4, \qquad \dot{y}=-6x+5y-1;$ 1. $\displaystyle \dot{x} = x+3y+1, \qquad \dot{y}=-6x+5y.$ + :::: ::::challenge{id="16_main_02" title="Main problems 2"} @@ -325,6 +314,7 @@ Classify the fixed points and discuss stability of the following nonlinear syste 1. $\displaystyle \dot{x} = -4y+2xy-8 \qquad \dot{y}=4y^2-x^2;$ 1. $\displaystyle \dot{x} = y-x^2+2, \qquad \dot{y}=2(x^2-y^2).$ + :::: ::::challenge{id="16_main_03" title="Main problems 3"} @@ -365,6 +355,7 @@ The $\dot{x}$ notation represents the derivative with respect to non-dimensional 1. Calculate the steady states of the system. 1. Determine the stability of the fixed points in the case $\beta = r = 4$. 1. Draw the phase plane, including the nullclines and phase trajectories. + :::: ### Extension problems diff --git a/scientific_computing/essential_maths/17_probability_1.md b/scientific_computing/essential_maths/17_probability_1.md index cb987df2..5cc88198 100644 --- a/scientific_computing/essential_maths/17_probability_1.md +++ b/scientific_computing/essential_maths/17_probability_1.md @@ -1,19 +1,16 @@ --- name: Probability 1 -dependsOn: [ - scientific_computing.essential_maths.07_integration_2 -] +dependsOn: [scientific_computing.essential_maths.07_integration_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- Coming soon. diff --git a/scientific_computing/essential_maths/18_probability_2.md b/scientific_computing/essential_maths/18_probability_2.md index e950170d..6732647d 100644 --- a/scientific_computing/essential_maths/18_probability_2.md +++ b/scientific_computing/essential_maths/18_probability_2.md @@ -1,19 +1,16 @@ --- name: Probability 2 -dependsOn: [ - scientific_computing.essential_maths.17_probability_1 -] +dependsOn: [scientific_computing.essential_maths.17_probability_1] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- Coming soon. diff --git a/scientific_computing/essential_maths/19_probability_3.md b/scientific_computing/essential_maths/19_probability_3.md index 7832eea4..64f633ab 100644 --- a/scientific_computing/essential_maths/19_probability_3.md +++ b/scientific_computing/essential_maths/19_probability_3.md @@ -1,18 +1,16 @@ --- name: Probability 3 -dependsOn: [ - scientific_computing.essential_maths.18_probability_2 -] +dependsOn: [scientific_computing.essential_maths.18_probability_2] tags: [] -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- Coming soon. diff --git a/scientific_computing/essential_maths/index.md b/scientific_computing/essential_maths/index.md index e9d8a1e1..10b84f91 100644 --- a/scientific_computing/essential_maths/index.md +++ b/scientific_computing/essential_maths/index.md @@ -2,7 +2,8 @@ id: essential_maths name: Essential Maths dependsOn: [] -files: [ +files: + [ 01_graphs.md, 02_indices_and_logs.md, 03_differentiation_1.md, @@ -22,23 +23,22 @@ files: [ 17_probability_1.md, 18_probability_2.md, 19_probability_3.md, -] + ] summary: | - A course in maths, containing a primer in everything you need to know in order to analyse systems of differential equations. -attribution: -- citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 + A course in maths, containing a primer in everything you need to know in order to analyse systems of differential equations. +attribution: + - citation: This material has been adapted from material by Fergus Cooper from the "Essential Mathematics" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- The most recent substantial iteration of this course was developed by Fergus Cooper, Beth Dingley, and Elliot Howard-Spink. - ## Why Maths? Maths is the language we use to quantitatively describe the world. @@ -50,13 +50,13 @@ You are going into research, and you Some of you **may** feel you don't need to know any maths, but this course will be useful to you even if you never have to write down a system of differential equations yourself. 
-**Data analysis:** +### Data analysis - Interpretation and inference - Identify patterns, trends, relationships - Deal robustly with uncertainty and variation -**Describe the behaviour of systems:** +### Describe the behaviour of systems - Remove ambiguity: explicit assumptions - Quantitative hypotheses @@ -64,17 +64,15 @@ Some of you **may** feel you don't need to know any maths, but this course will - "If I make this intervention, I expect to see that change" - Explain **why** something is observed - -**Vital for dynamic and nonlinear systems** +### Vital for dynamic and nonlinear systems - Simple intuition breaks down - Most of biology is dynamic and nonlinear! - ## Course aims - Develop confidence in your mathematical abilities - - Extensive practice + - Extensive practice - Become able to communicate effectively with mathematical collaborators - Ensure you can read and understand mathematical papers in your field - Build on your ability to apply computational tools from **Python** to solve problems diff --git a/scientific_computing/linear_algebra/01-matrix-form-of-equations.md b/scientific_computing/linear_algebra/01-matrix-form-of-equations.md index 446ea1a8..ded368b8 100644 --- a/scientific_computing/linear_algebra/01-matrix-form-of-equations.md +++ b/scientific_computing/linear_algebra/01-matrix-form-of-equations.md @@ -1,24 +1,21 @@ --- name: Matrix form of equations -dependsOn: [ -] +dependsOn: [] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Systems of linear equations -Linear algebra is largely concerned with representing, manipulating and solving large +Linear algebra is largely concerned with representing, manipulating and solving large systems of linear equations. Consider the following 2 linear equations: $$ @@ -28,8 +25,8 @@ a_2x+b_2y &= c_2, \quad (2) \end{aligned} $$ -where the values $\;x\;$ and $\;y\;$ are to be found, and $\;a_1, \;b_1, \;a_2, \;b_2, -\;c_1\;$ and $\;c_2\;$ are given constants. We know that we can use linear combinations +where the values $\;x\;$ and $\;y\;$ are to be found, and $\;a_1, \;b_1, \;a_2, \;b_2, +\;c_1\;$ and $\;c_2\;$ are given constants. 
We know that we can use linear combinations
of these two equations to solve this system for $\;x\;$ and $\;y\;$, like so:

$$
@@ -42,9 +39,9 @@ $$

## The Matrix

-We can also write these linear equations using a matrix. Matrices are structures that
-allow us to more easily manipulate linear systems. While not particularly useful for
-just 2 equations, using a matrix representation allows us to generalise to, say $N$
+We can also write these linear equations using a matrix. Matrices are structures that
+allow us to more easily manipulate linear systems. While not particularly useful for
+just 2 equations, using a matrix representation allows us to generalise to, say $N$
 equations and $M$ unknowns, or to solve large systems of equations on a computer.

 Consider the original system:
@@ -52,7 +49,7 @@ Consider the original system:
 $$
 \begin{aligned}
 a_1x+b_1y &= c_1, \\
-a_2x+b_2y &= c_2. 
+a_2x+b_2y &= c_2.
 \end{aligned}
 $$

@@ -63,8 +60,8 @@ $$
 \left(\begin{matrix}x\\y\end{matrix}\right)
 =\left(\begin{matrix}c_1\\ c_2 \end{matrix}\right).
 $$
- 
-Think about how this form relates to the original linear system. 
+
+Think about how this form relates to the original linear system.

 ## Geometry of linear equations

 Consider the following system of equations

 $$
 \begin{aligned}
 x - 2y &= -1, \\
--x + 3y &= 3. 
+-x + 3y &= 3.
 \end{aligned}
 $$

@@ -83,12 +80,12 @@ $$
 A = \left(\begin{matrix} 1 & -2 \\ -1 & 3\end{matrix}\right)
 $$

-Plotting these two linear equations on a graph shows graphically the solution to this 
+Plotting these two linear equations on a graph shows graphically the solution to this
 equation given by

 $$
-\left(\begin{matrix} x \\ y \end{matrix}\right) = A^{-1} \left(\begin{matrix} -1 \\ 3 
-\end{matrix}\right) = \left(\begin{matrix} 3 \\ 2 
+\left(\begin{matrix} x \\ y \end{matrix}\right) = A^{-1} \left(\begin{matrix} -1 \\ 3
+\end{matrix}\right) = \left(\begin{matrix} 3 \\ 2
 \end{matrix}\right)
 $$

@@ -106,7 +103,7 @@ $$
 -x + 2y &= 3.
\end{aligned}
 $$

-and 
+and

 $$
 \begin{aligned}
@@ -116,24 +113,24 @@ $$

 ![(left) infinite solutions, (right) no solutions](images/01-sim2.svg)

-The first gives the plot on the left, while the second, which has a different vector of 
-constants on the RHS, gives the plot on the right. You can see that depending on the 
-constants, the system of equations can have an infinite number of solutions, or no 
-solutions at all. 
+The first gives the plot on the left, while the second, which has a different vector of
+constants on the RHS, gives the plot on the right. You can see that depending on the
+constants, the system of equations can have an infinite number of solutions, or no
+solutions at all.

-The matrix $A$ in this case is singular, and therefore does not have an inverse. Looking 
-at the equations again, you can see that the two rows of the matrix $A$ are multiples of 
-the other, and thus there is only *one* independent row. That is, the *rank* of the 
+The matrix $A$ in this case is singular, and therefore does not have an inverse. Looking
+at the equations again, you can see that the two rows of the matrix $A$ are multiples of
+each other, and thus there is only _one_ independent row. That is, the _rank_ of the
 matrix is one.

 ## Singular matrices

-The *rank* of an $\;n\,\times\,n\;$ matrix $\;A\;$ is the number of linearly independent rows in $\;A\;$ (rows not combinations of other rows).
+The _rank_ of an $\;n\,\times\,n\;$ matrix $\;A\;$ is the number of linearly independent rows in $\;A\;$ (rows not combinations of other rows).
When $\;\text{rank}(A) < n\;$ then

- The matrix is said to be 'rank deficient'
-- The system $\;A\textbf{x} = \textbf{b}\;$ has *fewer* equations than unknowns
+- The system $\;A\textbf{x} = \textbf{b}\;$ has _fewer_ independent equations than unknowns
- The matrix is said to be singular
- The matrix is said to be underdetermined
- $A\;$ has no inverse
@@ -142,9 +139,9 @@ When $\;\text{rank}(A) < n\;$ then

### The determinant

-One way of solving a system of equations represented by $A x = b$ is to calculate the 
-*inverse* of A, giving the solution as $x = A^{-1} b$. This can be done by calculating 
-what is known as the *determinant* of $A$.
+One way of solving a system of equations represented by $A x = b$ is to calculate the
+_inverse_ of $A$, giving the solution as $x = A^{-1} b$. This can be done by calculating
+what is known as the _determinant_ of $A$.

If

@@ -164,31 +161,32 @@ $$
A^{-1} = \frac{1}{ps-qr} \left(\begin{matrix} s & -q \\ -r & p\end{matrix}\right)
$$

-Calculating the inverse of a matrix using its determinant can be very costly for larger 
-matrices, therefore other algorithms are used (e.g. Gaussian Elimination, which is 
+Calculating the inverse of a matrix using its determinant can be very costly for larger
+matrices, so other algorithms are used (e.g. Gaussian Elimination, which is
introduced in the next section).

-If $|A| = 0$, A is said to be **singular** (have no inverse). Graphically, this is 
-represented by the parallel or non-intersecting lines in the figure above. 
+If $|A| = 0$, $A$ is said to be **singular** (it has no inverse). Graphically, this is
+represented by the parallel or non-intersecting lines in the figure above.
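The $2 \times 2$ inverse formula above translates directly into code. A minimal sketch (the helper name `inverse_2x2` is ours, not part of NumPy), including the singularity test on the determinant:

```python
import numpy as np

def inverse_2x2(M):
    # |A| = ps - qr for A = [[p, q], [r, s]]
    (p, q), (r, s) = M
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular: |A| = 0, so no inverse exists")
    # swap the diagonal entries, negate the off-diagonal, divide by the determinant
    return np.array([[s, -q], [-r, p]]) / det

A = np.array([[1.0, -2.0], [-1.0, 3.0]])  # the matrix from the geometry example
print(inverse_2x2(A))  # agrees with np.linalg.inv(A)
```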
### Using Python to calculate the inverse

-To find $A^{-1}$ for 
+To find $A^{-1}$ for

$$
A = \left(\begin{matrix}3&0&2\\ 3&0&-3\\ 0&1&1\end{matrix}\right)
$$
- 
+
you can use numpy like so:
- 
+
```python
+import numpy as np
 A = np.array([[3, 0, 2], [3, 0, -3], [0, 1, 1]])
 np.linalg.inv(A)
```

Output:

-```
+```text
array([[ 0.2       ,  0.13333333,  0.        ],
       [-0.2       ,  0.2       ,  1.        ],
       [ 0.2       , -0.2       , -0.        ]])
@@ -207,7 +205,7 @@ np.linalg.inv(A)

Output:

-```
+```text
Traceback (most recent call last):
  File "", line 1, in
  File "<__array_function__ internals>", line 6, in inv
@@ -225,34 +223,34 @@ numpy.linalg.LinAlgError: Singular matrix

- Strang, G. (2016). Introduction to linear algebra (Fifth ed.). Wellesley.
- Linear algebra and its applications by Gilbert Strang - lots of supplementary material
 via MIT course page here:
-https://github.com/mitmath/1806/blob/master/summaries.md
-- LA from an ODE perspective: Kapitula, T. (2015). Ordinary Differential Equations and 
+  <https://github.com/mitmath/1806/blob/master/summaries.md>
+
+- LA from an ODE perspective: Kapitula, T. (2015). Ordinary Differential Equations and
 Linear Algebra. Society for Industrial and Applied Mathematics.

## Problems

::::challenge{id=intersection-of-planes title="Intersection of planes" }

-1. Describe the intersection of the three planes $u+v+w+z = 6$, $u+w+z = 4$ and $u+w = 
-   2$ (all in four-dimensional space). Is it a line or a point or a fourth equation that 
-   leaves us with no solution. an empty set? What is the intersection if the fourth 
-   plane $u = −1$ is included? Find a fourth equation that leaves us with no solution.
+1. Describe the intersection of the three planes $u+v+w+z = 6$, $u+w+z = 4$ and $u+w =
+   2$ (all in four-dimensional space). Is it a line or a point or an empty set? What is
+   the intersection if the fourth plane $u = −1$ is included? Find a fourth equation
+   that leaves us with no solution.

:::solution

-The intersection of the 3 planes is the 1d line $u + w = 2$, $v=2$ and $z=2$. 
-Introducing a fourth equation that does not intersect this line (e.g. 
$u + w = 3$)
+leaves us with no solutions.

:::

::::

-
::::challenge{id=python-intersection-of-planes title="Python: Intersection of planes" }
-2. Sketch or plot in Python these three lines and decide if the equations are solvable: 
-   3 by 2 system $x + 2y = 2$, $x − y = 2$, and $y = 1$. What happens if all right-hand 
-   sides are zero? Is there any nonzero choice of right- hand sides that allows the 
-   three lines to intersect at the same point?
+
+Sketch or plot in Python these three lines and decide if the equations are solvable:
+the 3 by 2 system $x + 2y = 2$, $x − y = 2$, and $y = 1$. What happens if all right-hand
+sides are zero? Is there any nonzero choice of right-hand sides that allows the
+three lines to intersect at the same point?

:::solution

@@ -282,10 +280,10 @@ plot_lines(2, -1, 1)

::::challenge{id=upper-triangular-matrix title="Upper triangular matrix" }

-3. Write a Python function that takes in a $3 \times 3$ upper triangular matrix $A$ 
-   represented as an `ndarray`, and a rhs vector $b$, and solves the equation $A x = b$. 
-   i.e. the function will solve the following triangular system for $x = (x_1, x_2, 
-   x_3)$:
+Write a Python function that takes in a $3 \times 3$ upper triangular matrix $A$
+represented as an `ndarray`, and a rhs vector $b$, and solves the equation $A x = b$.
+i.e. the function will solve the following triangular system for $x = (x_1, x_2,
+x_3)$:

$$
\begin{aligned}
@@ -295,11 +293,12 @@ A_{33} x_3 &= b_3
\end{aligned}
$$

-   Generalise this function to a $n \times n$ triangular matrix input.
+Generalise this function to an $n \times n$ triangular matrix input.
:::solution

```python
+import numpy as np
+import scipy
 def solve_triangular(A, b):
     n = len(b)
     x = np.empty_like(b)
diff --git a/scientific_computing/linear_algebra/02-gaussian-elimination.md b/scientific_computing/linear_algebra/02-gaussian-elimination.md
index 763cbf91..2598995e 100644
--- a/scientific_computing/linear_algebra/02-gaussian-elimination.md
+++ b/scientific_computing/linear_algebra/02-gaussian-elimination.md
@@ -1,31 +1,29 @@
---
name: Gaussian Elimination
-dependsOn: [
-  'scientific_computing.linear_algebra.01-matrix-form-of-equations',
-]
+dependsOn: ["scientific_computing.linear_algebra.01-matrix-form-of-equations"]
tags: []
questions:
-- "What is the relationship between matrices and systems of linear equations?"
-- "What is a singular matrix and when does it occur?"
-- "What is Gaussian Elimination and why is it useful?"
+  - "What is the relationship between matrices and systems of linear equations?"
+  - "What is a singular matrix and when does it occur?"
+  - "What is Gaussian Elimination and why is it useful?"
learningOutcomes:
-- "Understand the main useful concepts for the solution of systems of linear equations"
-- "Understand singular matrices and the rank of a matrix"
-- "Understand and be able to implement Gaussian Elimination"
-attribution:
-- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 + - "Understand the main useful concepts for the solution of systems of linear equations" + - "Understand singular matrices and the rank of a matrix" + - "Understand and be able to implement Gaussian Elimination" +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Gaussian Elimination -Consider the problem: Find $x, y, z$ such that +Consider the problem: Find $x, y, z$ such that $$ \begin{aligned} @@ -35,18 +33,18 @@ eq3: & -x &+& 3y& -& 2z & = & 3 \end{aligned} $$ -**Gaussian elimination -- step 1** reduce the above system of equations so that the +**Gaussian elimination -- step 1** reduce the above system of equations so that the unknown $x$ is removed from the last two equations: $$ \begin{aligned} eq1: & 2x & +& y& -& z & = &3 \\ eq2 ~\rightarrow eq1 - 2 \times eq2: & & & y & - &11 z & =& -9 \\ -eq3~ \rightarrow ~ eq1 + 2 \times eq3: & & & 7y &- &5z& = & 9 
+eq3~ \rightarrow ~ eq1 + 2 \times eq3: & & & 7y &- &5z& = & 9
 \end{aligned}
 $$

-In this case the 2 coefficient for x in the first row is the *pivot*, and we are using 
+In this case the coefficient 2 of $x$ in the first row is the _pivot_, and we are using
 this pivot value to convert the other two x coefficients in the rows below to zeros.

 **Gaussian elimination -- step 2** remove the unknown $y$ from the last equation:
@@ -54,8 +52,8 @@ this pivot value to convert the other two x coefficients in the rows below to ze
 $$
 \begin{aligned}
 eq1: & 2x + y - z & = 3 \\
-eq2: & ~~~ y - 11 z & = -9 \\
-eq3 ~ \rightarrow ~ 7 \times eq2 - eq3: & ~~~ -72 z = -72
+eq2: & \quad y - 11 z & = -9 \\
+eq3 ~ \rightarrow ~ 7 \times eq2 - eq3: & \quad -72 z = -72
 \end{aligned}
 $$

 Now the pivot is the coefficient 1 of $y$ in eq2.
@@ -64,49 +62,49 @@

 $$
 \begin{aligned}
 eq1: & 2x + y - z & = 3 \\
-eq2: & ~~~ y - 11 z & = -9 \\
-eq3: & ~~~ -72 z & = -72 \end{aligned}
+eq2: & \quad y - 11 z & = -9 \\
+eq3: & \quad -72 z & = -72 \end{aligned}
 $$

-This system is said to be *upper triangular*. It is also known as *row echelon* form, 
-and the leading coefficients ([2, 1, -72] in this case) are known as the *pivots*.
+This system is said to be _upper triangular_. It is also known as _row echelon_ form,
+and the leading coefficients ([2, 1, -72] in this case) are known as the _pivots_.

 **Gaussian elimination -- step 3** We can now use back substitution to obtain $x,y,z$. 
-In this case +In this case $$ z = 1,$$ $$ eq2: ~~ y - 11 = -9 ~~ \implies ~~ y = 2,$$ -$$ eq1: ~~ 2x +2 -1 = 3 , ~~ \implies ~~ x = 1.$$ +$$ eq1: ~~ 2x +2 -1 = 3 , ~~ \implies ~~ x = 1.$$ ### Pivoting -Consider the following system +Consider the following system $$ \begin{aligned} eq1: & x & + & y & + & z & = & a \\ eq2: & 2x & + & 2y & + & 5z & = & b \\ -eq3: & 4x & +& 6y & + & 8z & = & c -\end{aligned} +eq3: & 4x & +& 6y & + & 8z & = & c +\end{aligned} $$ -This firstly reduces to +This firstly reduces to $$ \begin{aligned} eq1: & x & + & y & + & z & = & a \\ eq2:& & & & & 3z & = & b' \\ -eq3: & & & 2y & + & 4z & = & c' +eq3: & & & 2y & + & 4z & = & c' \end{aligned} $$ -The problem here is that we have zero for the pivot in eq2. This can easily be switched +The problem here is that we have zero for the pivot in eq2. This can easily be switched into upper triangular form by switching rows two and three. -**Partial pivoting**: In general, we should be worried about both zero and very small -pivot values, as in the latter case they will lead to division by a small value, which -can cause large roundoff errors. So common practice is to select a row/pivot value such -that the pivot value is as large as possible +**Partial pivoting**: In general, we should be worried about both zero and very small +pivot values, as in the latter case they will lead to division by a small value, which +can cause large roundoff errors. 
So common practice is to select a row/pivot value such
+that the pivot value is as large as possible.

### Singular matrices in Gaussian Elimination

$$
\begin{aligned}
eq1: & x & + & y & + & z & = & a \\
eq2: & 2x & + & 2y & + & 5z & = & b \\
-eq3: & 4x & +& 4y & + & 8z & = & c 
-\end{aligned} 
+eq3: & 4x & +& 4y & + & 8z & = & c
+\end{aligned}
$$

-This firstly reduces to 
+This firstly reduces to

$$
\begin{aligned}
@@ -127,60 +125,62 @@ eq2:& & & & & 3z & = & b' \\
eq3:& & & & & 4z & = & c'
\end{aligned}
$$

-We cannot solve this by switching rows in this case, which means that the matrix is 
+We cannot solve this by switching rows in this case, which means that the matrix is
singular and has no inverse.

### Gaussian Elimination Rules

1. Operate on LHS and RHS (or RHSs) at the same time
-2. Replace row with a sum/combination of rows
-3. Work on one column at a time, choosing a pivot (leading non-zero entry in a chosen 
+1. Replace row with a sum/combination of rows
+1. Work on one column at a time, choosing a pivot (leading non-zero entry in a chosen
   row), and eliminating all other non-zero values below that
-3. Switch rows to avoid zeros on the diagonal (*pivoting*)
-4. If (3) does not work, zeros on the diagonal (*pivots*) indicate a singular matrix
+1. Switch rows to avoid zeros on the diagonal (_pivoting_)
+1. If (4) does not work, zeros on the diagonal (_pivots_) indicate a singular matrix

-**Computational cost**: If the number of equations $n$ is large, then a number of 
+**Computational cost**: If the number of equations $n$ is large, then the number of
operations for gaussian elimination is $\mathcal{O}(n^3)$. 
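The worked example above can be checked numerically with NumPy (a sketch; eq2 is reconstructed here as $x + 5z = 6$, which is the unique equation consistent with the reduction step $eq1 - 2 \times eq2$ shown earlier):

```python
import numpy as np

# eq1:  2x + y -  z = 3
# eq2:   x     + 5z = 6   (reconstructed from the elimination step, an assumption)
# eq3:  -x + 3y - 2z = 3
A = np.array([[2.0, 1.0, -1.0],
              [1.0, 0.0, 5.0],
              [-1.0, 3.0, -2.0]])
b = np.array([3.0, 6.0, 3.0])
x = np.linalg.solve(A, b)  # internally uses an LU (elimination) factorisation
print(x)  # expect [1. 2. 1.], matching the back substitution above
```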
### Pseudocode -[Wikipedia](https://en.wikipedia.org/wiki/Gaussian_elimination#Pseudocode) has a -pseudocode implementation of the gaussian elimination algorithm which is helpful to +[Wikipedia](https://en.wikipedia.org/wiki/Gaussian_elimination#Pseudocode) has a +pseudocode implementation of the gaussian elimination algorithm which is helpful to understand how it works: - h := 1 /* Initialization of the pivot row */ - k := 1 /* Initialization of the pivot column */ - - while h ≤ m and k ≤ n - /* Find the k-th pivot: */ - i_max := argmax (i = h ... m, abs(A[i, k])) - if A[i_max, k] = 0 - /* No pivot in this column, pass to next column */ - k := k+1 - else - swap rows(h, i_max) - /* Do for all rows below pivot: */ - for i = h + 1 ... m: - f := A[i, k] / A[h, k] - /* Fill with zeros the lower part of pivot column: */ - A[i, k] := 0 - /* Do for all remaining elements in current row: */ - for j = k + 1 ... n: - A[i, j] := A[i, j] - A[h, j] * f - /* Increase pivot row and column */ - h := h + 1 - k := k + 1 - +```text +h := 1 /* Initialization of the pivot row */ +k := 1 /* Initialization of the pivot column */ + +while h ≤ m and k ≤ n + /* Find the k-th pivot: */ + i_max := argmax (i = h ... m, abs(A[i, k])) + if A[i_max, k] = 0 + /* No pivot in this column, pass to next column */ + k := k+1 + else + swap rows(h, i_max) + /* Do for all rows below pivot: */ + for i = h + 1 ... m: + f := A[i, k] / A[h, k] + /* Fill with zeros the lower part of pivot column: */ + A[i, k] := 0 + /* Do for all remaining elements in current row: */ + for j = k + 1 ... n: + A[i, j] := A[i, j] - A[h, j] * f + /* Increase pivot row and column */ + h := h + 1 + k := k + 1 +``` ::::challenge{id=gaussian-elimination title="Gaussian Elimination"} -1. Code a Python function that takes an 2D numpy array representing a matrix $A$, and a - 1D array representing a RHS $b$, and returns the solution of the linear equation $Ax - = b$. If you wish you can assume that the matrix has an inverse. 
Try it out on a few
-   test matrices and check your answer using 
+1. Code a Python function that takes a 2D numpy array representing a matrix $A$, and a
+   1D array representing a RHS $b$, and returns the solution of the linear equation $Ax
+   = b$. If you wish you can assume that the matrix has an inverse. Try it out on a few
+   test matrices and check your answer using
   [`scipy.linalg.solve`](https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.linalg.solve.html).

:::solution
+
```python
import numpy as np
import matplotlib.pylab as plt
@@ -266,21 +266,22 @@ for A, b in zip(As, bs):
    x_mine = solve_gaussian_elimination(A, b)
    np.testing.assert_almost_equal(x_scipy, x_mine)
```
+
:::
::::

## Condition Number

-Gaussian Elimination might still fail if $A$ is close to being singular, if a slight 
-change to its values causes it to be singular. In this case simple round-off error in 
-the floating point calculations can lead to zeros in the pivot positions. 
+Gaussian Elimination might still fail if $A$ is close to being singular, that is, if a
+slight change to its values causes it to be singular. In this case simple round-off
+error in the floating point calculations can lead to zeros in the pivot positions.

-Even if the pivot value is not exactly zero, a pivot value close to zero can lead to 
-large differences in the final result. In this case the matrix would be *nearly 
-singular*, or *ill-conditioned*. 
Most linear algebra packages will include a method of +calculating the _condition number_ of a matrix, which evaluates how sensitive the +solution is to the input values of the matrix or rhs vector. An identity matrix has a +condition number of 1, while an exactly singular matrix has a condition number of infinity. ::::challenge{id=condition-number title="Condition Number"} @@ -294,20 +295,19 @@ $$ \end{aligned} $$ -2. Solve system (1), and then solve system (2), below, in which the data on the right - have been rounded to two decimal places. In each case, find the exact solution. +1. Solve system (1), and then solve system (2), below, in which the data on the right + have been rounded to two decimal places. In each case, find the exact solution. -$$ -\begin{aligned} -4.5 x_1 + 3.1 x_2 &= 19.25, (2)\\ -1.6 x_1 + 1.1 x_2 &= 6.84. -\end{aligned} -$$ + $$ + \begin{aligned} + 4.5 x_1 + 3.1 x_2 &= 19.25, (2)\\ + 1.6 x_1 + 1.1 x_2 &= 6.84. + \end{aligned} + $$ -3. The entries in (2) differ from those in (1) by less than .05%. Find the percentage +1. The entries in (2) differ from those in (1) by less than .05%. Find the percentage error when using the solution of (2) as an approximation for the solution of (1). - -4. Use `numpy.linalg.cond` to produce the condition number of the coefficient matrix in +1. Use `numpy.linalg.cond` to produce the condition number of the coefficient matrix in (1). 
:::solution @@ -325,4 +325,3 @@ print('condition number is ', np.linalg.cond(A)) ::: :::: - diff --git a/scientific_computing/linear_algebra/04-LU-decomposition.md b/scientific_computing/linear_algebra/04-LU-decomposition.md index 9f1abf85..63cb06de 100644 --- a/scientific_computing/linear_algebra/04-LU-decomposition.md +++ b/scientific_computing/linear_algebra/04-LU-decomposition.md @@ -1,47 +1,45 @@ --- name: LU decomposition -dependsOn: [ - 'scientific_computing.linear_algebra.02-gaussian-elimination', -] +dependsOn: ["scientific_computing.linear_algebra.02-gaussian-elimination"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Matrix decompositions

-Matrix factorisations play a key role in the solution of problems of the type $A x = b$. 
-Often (e.g. ODE solvers), you have a fixed matrix $A$ that must be solved with many 
-different $b$ vectors. A matrix factorisation is effectivly a pre-processing step that 
-allows you to partition $A$ into multiple factors (e.g. $A = LU$ in the case of $LU$ 
-decomposition), so that the actual solve is as quick as possible. Different 
+Matrix factorisations play a key role in the solution of problems of the type $A x = b$.
+Often (e.g. ODE solvers), you have a fixed matrix $A$ that must be solved with many
+different $b$ vectors. A matrix factorisation is effectively a pre-processing step that
+allows you to partition $A$ into multiple factors (e.g. $A = LU$ in the case of $LU$
+decomposition), so that the actual solve is as quick as possible. Different
decompositions have other uses besides solving $A x = b$, for example:

-- the $LU$, $QR$ and Cholesky decomposition can be used to quickly find the determinant 
-  of a large matrix, since $\det(AB) = \det(A) \det(B)$ and the determinant of a 
-  triangular matrix is simply the product of its diagonal entries. 
-- The Cholesky decomposition can be used to [sample from a multivariate normal 
-  distribution](https://stats.stackexchange.com/questions/89796/can-i-use-the-cholesky-method-for-generating-correlated-random-variables-with-gi/89830#89830), 
-  and is a very efficient technique to solve $A x = b$ for the specific case of a 
-  positive definite matrix. 
-- The $QR$ decomposition can be used to solve a minimum least squares problem, to find 
-  the eigenvalues and eigenvectors of a matrix, and to calulcate the [Singular Value 
-  Decomposition](https://en.wikipedia.org/wiki/Singular_value_decomposition) (SVD), 
+- the $LU$, $QR$ and Cholesky decomposition can be used to quickly find the determinant
+  of a large matrix, since $\det(AB) = \det(A) \det(B)$ and the determinant of a
+  triangular matrix is simply the product of its diagonal entries.
+- The Cholesky decomposition can be used to [sample from a multivariate normal
+  distribution](https://stats.stackexchange.com/questions/89796/can-i-use-the-cholesky-method-for-generating-correlated-random-variables-with-gi/89830#89830),
+  and is a very efficient technique to solve $A x = b$ for the specific case of a
+  positive definite matrix.
+- The $QR$ decomposition can be used to solve a linear least-squares problem, to find
+  the eigenvalues and eigenvectors of a matrix, and to calculate the [Singular Value
+  Decomposition](https://en.wikipedia.org/wiki/Singular_value_decomposition) (SVD),
  which is itself another very useful decomposition!

## $LU$ decomposition

-The $LU$ decomposition is closely related to gaussian elimination. 
It takes the original +equation to be solved $A x = b$ and splits it up into two separate equations involving a unit lower triangular matrix $L$, and the row echelon matrix $U$: $$ @@ -51,13 +49,13 @@ U x &= y \end{aligned} $$ -where $A = LU$. The $L$ matrix is a *unit* lower triangular matrix and thus has ones on -the diagonal, whereas $U$ is in row echelon form with pivot values in the leading +where $A = LU$. The $L$ matrix is a _unit_ lower triangular matrix and thus has ones on +the diagonal, whereas $U$ is in row echelon form with pivot values in the leading coefficients of each row. $$ A = \left( \begin{matrix} -1 & 0 & 0 & 0 \\ +1 & 0 & 0 & 0 \\ \ast & 1 & 0 & 0 \\ \ast & \ast & 1 & 0 \\ \ast & \ast & \ast & 1 \end{matrix}\right) @@ -69,82 +67,80 @@ p_1 & \ast & \ast & \ast \\ \end{matrix}\right) $$ -Thus, we have converted our original problem of solving $A x = b$ into two separate -solves, first solving the equation $L y = b$, and then using the result $y$ to solve $U -x = y$. +Thus, we have converted our original problem of solving $A x = b$ into two separate +solves, first solving the equation $L y = b$, and then using the result $y$ to solve $U +x = y$. 
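In code, this factorise-once, solve-many pattern is what SciPy's `lu_factor`/`lu_solve` pair provides. A sketch (the matrix here is just an arbitrary example):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0, -1.0],
              [1.0, 0.0, 5.0],
              [-1.0, 3.0, -2.0]])
lu, piv = lu_factor(A)           # O(n^3) factorisation, done once
for b in (np.array([3.0, 6.0, 3.0]), np.ones(3)):
    x = lu_solve((lu, piv), b)   # two cheap triangular solves per right-hand side
    assert np.allclose(A @ x, b)
```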
$$
\begin{aligned}
\left(\begin{matrix}
1 & 0 & 0 & 0 \\
\ast & 1 & 0 & 0 \\
\ast & \ast & 1 & 0 \\
-\ast & \ast & \ast & 1 
+\ast & \ast & \ast & 1
\end{matrix}\right)
\left(\begin{matrix}
-y_1 \\ 
+y_1 \\
y_2 \\
y_3 \\
-y_4 
+y_4
\end{matrix}\right)
&=
\left(\begin{matrix}
-b_1 \\ 
+b_1 \\
b_2 \\
b_3 \\
-b_4 
+b_4
\end{matrix}\right) \\
\left(\begin{matrix}
-p_1 & \ast & \ast & \ast \\ 
+p_1 & \ast & \ast & \ast \\
0 & p_2 & \ast & \ast \\
0 & 0 & p_3 & \ast \\
-0 & 0 & 0 & p_4 
+0 & 0 & 0 & p_4
\end{matrix}\right)
\left(\begin{matrix}
-x_1 \\ 
+x_1 \\
x_2 \\
x_3 \\
-x_4 
+x_4
\end{matrix}\right)
&=
\left(\begin{matrix}
-y_1 \\ 
+y_1 \\
y_2 \\
y_3 \\
-y_4 
+y_4
\end{matrix}\right)
\end{aligned}
$$
-
-However, each of those solves is very cheap to compute, in this case for the 4x4 matrix 
-shown above the solution of $L y = b$ only needs 6 multiplication and 6 additions, 
-whereas $U x = y$ requires 4 divisions, 6 multiplications and 6 additions, leading to a 
-total of 28 arithmetic operations, much fewer in comparison with the 62 operations 
-required to solve the original equation $A x = b$. In general, $LU$ decomposition for an 
-$n \times n$ matrix takes about $2 n^3 / 3$ flops, or floating point operations, to 
+However, each of those solves is very cheap to compute, in this case for the 4x4 matrix
+shown above, the solution of $L y = b$ needs only 6 multiplications and 6 additions,
+whereas $U x = y$ requires 4 divisions, 6 multiplications and 6 additions, leading to a
+total of 28 arithmetic operations, far fewer than the 62 operations
+required to solve the original equation $A x = b$. In general, $LU$ decomposition for an
+$n \times n$ matrix takes about $2 n^3 / 3$ flops, or floating point operations, to
compute.
-
## $LU$ factorisation without pivoting

-A relativly simple $LU$ algorithm can be described if we assume that no pivoting is 
-required during a gaussian elimination. 
In this case, the gaussian elimination process
-is a sequence of $p$ linear operations $E_1, E_2, ..., E_p$, with each operation $E_i$ 
-being a row replacement that adds a multiple of one row to another below it (i.e. $E_i$ 
-is lower triangular). The final matrix after applying the sequence of row reductions is 
+A relatively simple $LU$ algorithm can be described if we assume that no pivoting is
+required during gaussian elimination. In this case, the gaussian elimination process
+is a sequence of $p$ linear operations $E_1, E_2, ..., E_p$, with each operation $E_i$
+being a row replacement that adds a multiple of one row to another below it (i.e. $E_i$
+is lower triangular). The final matrix after applying the sequence of row reductions is
 $U$ in row echelon form, that is:

$$
E_p \cdots E_2 E_1 A = U
$$

-Since we have $A = LU$, we can show that the sequence of operations $E_1, E_2, ..., E_p$ 
+Since we have $A = LU$, we can show that the sequence of operations $E_1, E_2, ..., E_p$
is also the sequence that reduces the matrix $L$ to an identity matrix:

$$
A = (E_p \cdots E_2 E_1)^{-1} U = LU,
$$

-therefore, 
+therefore,

$$
L = (E_p \cdots E_2 E_1)^{-1},
@@ -156,116 +152,115 @@ $$

(E_p \cdots E_2 E_1) L = (E_p \cdots E_2 E_1) (E_p \cdots E_2 E_1)^{-1} = I
$$
-
-This implies how we can build up the matrix $L$. We choose values for $L$ such that the 
-series of row operations $E_1, E_2, ..., E_p$ convert the matrix $L$ to the identity 
-matrix. Since each $E_i$ is lower triangular, we know that both $(E_p \cdots E_2 E_1)$ 
+This suggests how we can build up the matrix $L$. We choose values for $L$ such that the
+series of row operations $E_1, E_2, ..., E_p$ convert the matrix $L$ to the identity
+matrix. Since each $E_i$ is lower triangular, we know that both $(E_p \cdots E_2 E_1)$
and $(E_p \cdots E_2 E_1)^{-1}$ are also lower triangular. 
For example, consider the following matrix

$$
A = \left(\begin{matrix}
-3 & 2 & 1 & -3 \\
+3 & 2 & 1 & -3 \\
-6 & -2 & 1 & 5 \\
3 & -4 & -7 & 2 \\
--9 & -6 & -1 & 15
+-9 & -6 & -1 & 15
\end{matrix}\right)
$$

-After three row reductions, $R_2 \mathrel{{+}{=}} 2 R_1$, $R_3 \mathrel{{+}{=}} -1 R_1$,
+After three row reductions, $R_2 \mathrel{{+}{=}} 2 R_1$, $R_3 \mathrel{{+}{=}} -1 R_1$,
and $R_4 \mathrel{{+}{=}} 3 R_1$, we have the following result:

$$
E_3 E_2 E_1 A = \left(\begin{matrix}
-3 & 2 & 1 & -3 \\
+3 & 2 & 1 & -3 \\
0 & 2 & * & * \\
0 & -6 & * & * \\
-0 & 0 & * & *
+0 & 0 & * & *
\end{matrix}\right)
$$

-To build the 1st column of $L$, we simply divide the 1st column of $A$ by the pivot
+To build the 1st column of $L$, we simply divide the 1st column of $A$ by the pivot
value 3, giving

$$
L = \left(\begin{matrix}
-1 & 0 & 0 & 0 \\
+1 & 0 & 0 & 0 \\
-2 & 1 & 0 & 0 \\
1 & * & 1 & 0 \\
--3 & * & * & 1
+-3 & * & * & 1
\end{matrix}\right)
$$

-For the next column we do the same, using the new pivot value $A_{2,2} = 2$ in row 2 to
-reduce $A_{3,2}$ and $A_{4,2}$ to zero, and then dividing the column vector under the
+For the next column we do the same, using the new pivot value $A_{2,2} = 2$ in row 2 to
+reduce $A_{3,2}$ and $A_{4,2}$ to zero, and then dividing the column vector under the
pivot $(-6, 0)^T$ by the pivot value 2 to obtain the next column of $L$.

-Repeating this process for all the columns in $A$, we obtain the final factorisation.
-You can verify for yourself that repeating the same row operations we did to form $U$ to
+Repeating this process for all the columns in $A$, we obtain the final factorisation.
+You can verify for yourself that applying the same row operations we used to form $U$ to
the matrix $L$ reduces it to the identity matrix.

$$
L = \left(\begin{matrix}
-1 & 0 & 0 & 0 \\
+1 & 0 & 0 & 0 \\
-2 & 1 & 0 & 0 \\
1 & -3 & 1 & 0 \\
--3 & 0 & 2 & 1
+-3 & 0 & 2 & 1
\end{matrix}\right)
$$

$$
E_p \cdots E_2 E_1 A = U = \left(\begin{matrix}
-3 & 2 & 1 & -3 \\
+3 & 2 & 1 & -3 \\
0 & 2 & 3 & -1 \\
0 & 0 & 1 & 2 \\
-0 & 0 & 0 & 2
+0 & 0 & 0 & 2
\end{matrix}\right)
$$

## Pivoting

-Of course, for any practial $LU$ factorisation we need to consider pivoting. Any matrix
-$A$ can be factorised into $PLU$, where $P$ is a permutation matrix, and $L$ and $U$ are
-defined as before. During the gaussian elimination steps we store an array of row
-indices $p_i$ indicating that row $i$ is interchanged with row $p_i$, and the resultant
-array of $p_i$ can be used to build the permutation matrix $P$ (It would be wasteful to
-store the entire martix $P$ so the array $p_i$ is stored instead).
+Of course, for any practical $LU$ factorisation we need to consider pivoting. Any matrix
+$A$ can be factorised into $PLU$, where $P$ is a permutation matrix, and $L$ and $U$ are
+defined as before. During the Gaussian elimination steps we store an array of row
+indices $p_i$ indicating that row $i$ is interchanged with row $p_i$, and the resulting
+array of $p_i$ can be used to build the permutation matrix $P$ (it would be wasteful to
+store the entire matrix $P$, so the array $p_i$ is stored instead).

Thus, the LU algorithm proceeds as follows:

-1. Begin with the left-most column $i=0$, find an appropriate pivot (e.g. maximum entry
-   in the column) and designate this row as the pivot row. Interchange this row with row
-   $i$, and store the pivot row index as $p_i$. Use row replacements to create zeros
-   below the pivot. Create the corresponding column for $L$ by dividing by the pivot
+1. Begin with the left-most column $i=0$, find an appropriate pivot (e.g. the maximum entry
+   in the column) and designate this row as the pivot row. Interchange this row with row
+   $i$, and store the pivot row index as $p_i$. Use row replacements to create zeros
+   below the pivot. Create the corresponding column of $L$ by dividing by the pivot
   value.
-2.
Continue along to the next column $i$, again choosing a pivot row $p_i$,
-   interchanging it with row $i$ and creating zeros below the pivot, creating the new
-   column in $L$, and making sure to record which pivot row has been chosen for each
+2. Continue along to the next column $i$, again choosing a pivot row $p_i$,
+   interchanging it with row $i$ and creating zeros below the pivot, creating the new
+   column in $L$, and making sure to record which pivot row has been chosen for each
column. Repeat this step for all the columns of the matrix.
-3. Once the last column has been done, $U$ should be in row echlon form and $L$ should
-   be a unit lower triangular matrix. The array $p_i$ implicitly defines the permutation
+3. Once the last column has been done, $U$ should be in row echelon form and $L$ should
+   be a unit lower triangular matrix. The array $p_i$ implicitly defines the permutation
matrix $P$.

-In practice, most library implementation store $L$ and $U$ in the same matrix since they
+In practice, most library implementations store $L$ and $U$ in the same matrix, since they
are lower and upper triangular respectively.

## $LDL$ decomposition

-It is often very benificial when solving linear systems to consider and take advantage
-of any special structure that the matrix $A$ might possesses. The $LDL$ decomposition is
-a varient on LU decomposition which is only applicable to a symmetric matrix $A$ (i.e.
-$A = A^T$).
+It is often very beneficial when solving linear systems to consider and take advantage
+of any special structure that the matrix $A$ might possess. The $LDL$ decomposition is
+a variant of $LU$ decomposition that is only applicable to a symmetric matrix $A$ (i.e.
+$A = A^T$).
The advantage of this decomposition is that it exploits the
+redundant entries in the matrix to reduce the amount of computation to $n^3/3$, which is
+about half that required for the $LU$ decomposition.

## Other Reading

-- Linear algebra and its applications by David C. Lay. Chaper 2.5
-- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
+- Linear algebra and its applications by David C. Lay. Chapter 2.5
+- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
  Press, 1996). Chapter 3.2 & 4.1
-- https://en.wikipedia.org/wiki/LU_decomposition
-- https://en.wikipedia.org/wiki/Cholesky_decomposition#LDL_decomposition_2
+- <https://en.wikipedia.org/wiki/LU_decomposition>
+- <https://en.wikipedia.org/wiki/Cholesky_decomposition#LDL_decomposition_2>

## Software

@@ -281,11 +276,11 @@ about a half that required for the $LU$ decomposition.

::::challenge{id=lu-decomposition title="LU decomposition"}

-Take your gaussian elimination code that you wrote in the previous lesson and use it to
-write an LU decomposition function that takes in a martix $A$, and returns $L$, $U$ and
-the array $p_i$. You can check your answer using
-[`scipy.linalg.lu_factor`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lu_factor.html).
-Hint: the permutation matrix is tricky to construct, so you might want to use the test
+Take your Gaussian elimination code that you wrote in the previous lesson and use it to
+write an LU decomposition function that takes in a matrix $A$, and returns $L$, $U$ and
+the array $p_i$. You can check your answer using
+[`scipy.linalg.lu_factor`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lu_factor.html).
+Hint: the permutation matrix is tricky to construct, so you might want to use the test
given in the documentation for `lu_factor`.
:::solution @@ -370,5 +365,6 @@ for A in As: np.testing.assert_almost_equal(calculate_L_mult_U(LU_scipy), A[row_indices_scipy]) np.testing.assert_almost_equal(calculate_L_mult_U(LU_mine), A[row_indices_mine]) ``` + ::: :::: diff --git a/scientific_computing/linear_algebra/06-Cholesky-decomposition.md b/scientific_computing/linear_algebra/06-Cholesky-decomposition.md index ceae2a14..3d54021c 100644 --- a/scientific_computing/linear_algebra/06-Cholesky-decomposition.md +++ b/scientific_computing/linear_algebra/06-Cholesky-decomposition.md @@ -1,30 +1,27 @@ --- name: Cholesky decomposition -dependsOn: [ - 'scientific_computing.linear_algebra.04-LU-decomposition', -] +dependsOn: ["scientific_computing.linear_algebra.04-LU-decomposition"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Cholesky decomposition

-*Symmetric positive definite* matrices are a very special type of matrix that often
-arise in practice. From a computational point of view, this class of matrix is very
-attractive because it is possible to decompose a symmetic positive definite matrix $A$
-very efficiently into a single lower triangular matrix $G$ so that $A = GG^T$.
+_Symmetric positive definite_ matrices are a very special type of matrix that often
+arise in practice. From a computational point of view, this class of matrix is very
+attractive because it is possible to decompose a symmetric positive definite matrix $A$
+very efficiently into a single lower triangular matrix $G$ so that $A = GG^T$.

-A matrix $A$ is positive definite if $x^T A x > 0$ for any nonzero $x \in \mathbb{R}$.
-This statement by itself is not terribly intuitive, so lets look at also look at an
+A matrix $A$ is positive definite if $x^T A x > 0$ for any nonzero $x \in \mathbb{R}^n$.
+This statement by itself is not terribly intuitive, so let's also look at an
+example of a $2 \times 2$ matrix

$$

@@ -45,33 +42,33 @@ x &= (1,-1)^T \Rightarrow x^T A x = a_{11} - 2a_{12} + a_{22} > 0 \\
\end{aligned}
$$

-The first two equations show that the diagonal entries of $A$ must be positive, and
-combining the last two equations imply $|a_{12}| \le (a_{11} + a_{22}) / 2$, that is
-that the matrix has much of its "mass" on the diagonal (note: this is *not* the same as
-the matrix being diagonally dominant, where $|a_{ii}| > \sum_{i=1...n,j \ne i}
-|a_{ij}|$). These two observations for our $2 \times 2$ matrix also apply for a general
-$n \times n$ SPD matrix. One of the very nice consequences of this "weighty" diagonal
+The first two equations show that the diagonal entries of $A$ must be positive, and
+combining the last two equations implies $|a_{12}| \le (a_{11} + a_{22}) / 2$, that is,
+that the matrix has much of its "mass" on the diagonal (note: this is _not_ the same as
+the matrix being diagonally dominant, where $|a_{ii}| > \sum_{j=1...n,j \ne i}
+|a_{ij}|$). These two observations for our $2 \times 2$ matrix also apply for a general
+$n \times n$ SPD matrix. One of the very nice consequences of this "weighty" diagonal
for SPD matrices is that it precludes the need for pivoting.

-It can be shown that if $A$ is a SPD matrix, then the $LDL^T$ decomposition exists and
-that $D = \text{diag}(d_1, ..., d_n)$ has positive diagonal entries. Therefore, it is
-straightforward to see that $LDL^T$ = $GG^T$, where $G = L \text{diag}(\sqrt{d_1}, ...,
-\sqrt{d_n})$. The decomposition $A = GG^T$ is known as the cholesky decomposition and
-can be efficiently constructed in $n^3 / 3$ flops.
There are a number of algorithms to
-construct this decomposition, and both the [wikipedia
-entry](https://en.wikipedia.org/wiki/Cholesky_decomposition) and Chapter 4.2 of the
+It can be shown that if $A$ is an SPD matrix, then the $LDL^T$ decomposition exists and
+that $D = \text{diag}(d_1, ..., d_n)$ has positive diagonal entries. Therefore, it is
+straightforward to see that $LDL^T = GG^T$, where $G = L \text{diag}(\sqrt{d_1}, ...,
+\sqrt{d_n})$. The decomposition $A = GG^T$ is known as the Cholesky decomposition and
+can be efficiently constructed in $n^3 / 3$ flops. There are a number of algorithms to
+construct this decomposition, and both the [wikipedia
+entry](https://en.wikipedia.org/wiki/Cholesky_decomposition) and Chapter 4.2 of the
Matrix Computations textbook by Golub and Van Loan give a number of different variants.

-Note that a $LDL$ decomposition can also be used to calculate a cholesky decomposition,
-and this could be more efficient approach since (a) the SPD structure means that we can
-neglect pivoting in the $LDL$ decomposition, and (b) the $LDL$ decomposition does not
-requiring taking the square root of the diagonal elements.
+Note that an $LDL$ decomposition can also be used to calculate a Cholesky decomposition,
+and this could be a more efficient approach since (a) the SPD structure means that we can
+neglect pivoting in the $LDL$ decomposition, and (b) the $LDL$ decomposition does not
+require taking the square root of the diagonal elements.

### Other Reading

-- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
+- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
  Press, 1996). Chapter 4.2
-- https://en.wikipedia.org/wiki/Cholesky_decomposition
+- <https://en.wikipedia.org/wiki/Cholesky_decomposition>

### Software

@@ -89,16 +86,16 @@ requiring taking the square root of the diagonal elements.
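As a quick numerical sketch of the ideas above (the $3 \times 3$ matrix below is an arbitrary SPD example made up for illustration), `numpy.linalg.cholesky` computes the factor $G$ directly, and the factorisation can then be used to solve $A x = b$ with two cheap triangular solves:

```python
import numpy as np
from scipy.linalg import solve_triangular

# an arbitrary symmetric positive definite matrix
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

G = np.linalg.cholesky(A)          # lower triangular factor, A = G G^T
assert np.allclose(G @ G.T, A)

# solve A x = b via G y = b (forward solve) then G^T x = y (backward solve)
b = np.array([1.0, 2.0, 3.0])
y = solve_triangular(G, b, lower=True)
x = solve_triangular(G.T, y, lower=False)
assert np.allclose(A @ x, b)
```

`numpy.linalg.cholesky` raises a `LinAlgError` if the matrix is not positive definite, which is also a convenient numerical test for positive definiteness.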
::::challenge{id=sampling-random-fields title="Sampling random fields"}

-Imagine that we wanted to sample an array of values $x_i$, for $i = 1...n$, where each
-value is sampled from an independent normal distribution with standard deviation
+Imagine that we wanted to sample an array of values $x_i$, for $i = 1...n$, where each
+value is sampled from an independent normal distribution with standard deviation
$\sigma$

$$
x_i \sim \mathcal{N}(0, \sigma)
$$

-This could be achieved, for example, by sampling from a normal distribution with unit
-standard deviation, a function that typically exists in any computer language, then
+This could be achieved, for example, by sampling from a normal distribution with unit
+standard deviation, a function that typically exists in any computer language, then
multiplying by $\sigma$

$$

@@ -107,18 +104,18 @@ $$

where $\eta \sim \mathcal{N}(0, 1)$

-Now imagine that instead of an independent normal distribution you wish to sample
-$\mathbf{x} = [x_1, x_2, ..., x_n]$ from a multivariate normal distribution with some
+Now imagine that instead of an independent normal distribution you wish to sample
+$\mathbf{x} = [x_1, x_2, ..., x_n]$ from a multivariate normal distribution with some
covariance matrix $\Sigma$

$$
\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \Sigma)
$$

-We can achive this in practice by using the Cholesky decomposition. A covariance
-matrix is a symmetic positive semidefinite matrix (i.e. $x^T \Sigma x \ge 0$}, and
-therefore can be decomposed into $\Sigma = LL^T$.
+We can achieve this in practice by using the Cholesky decomposition. A covariance
+matrix is a symmetric positive semidefinite matrix (i.e. $x^T \Sigma x \ge 0$), and
+therefore can be decomposed into $\Sigma = LL^T$.
We can then draw a sample from
+$\mathcal{N}(\mathbf{0}, \Sigma)$ by scaling an independently generated random vector
by $L$

$$

@@ -127,18 +124,18 @@ $$

where each element of the vector $\eta$ is $\eta_i \sim \mathcal{N}(0, 1)$.

-Write Python code to randomly sample an n-dimensional vector $x$ from
+Write Python code to randomly sample an n-dimensional vector $x$ from

1. an independent normal distribution with variance $\sigma_1^2$.
-2. a multivariate normal distribution using a covariance matrix $\Sigma_{ij} =
-   \sigma_1^2 \exp{(-(i- j)^2 / \sigma_2^2)}$. Try different values for the magnitute
-   $\sigma_1$, and lenghtscale $\sigma_2$ parameters and their effect on the sampled
-   $\mathbf{x}$. Hint: while the expression for $\Sigma$ is guarrenteed to be positive
-   definte for all values of $\sigma_1$ and $\sigma_2$, numerical round-off can mean
-   that the Cholesky decomposition can fail. To guarrentee a positive definite
-   $\Sigma$, try adding a small amount (e.g. 1e-5) to the diagonal of $\Sigma$. This
-   is equivilent to adding a very small amount of independent normal noise to
+2. a multivariate normal distribution using a covariance matrix $\Sigma_{ij} =
+   \sigma_1^2 \exp{(-(i - j)^2 / \sigma_2^2)}$. Try different values for the magnitude
+   $\sigma_1$ and lengthscale $\sigma_2$ parameters and observe their effect on the sampled
+   $\mathbf{x}$. Hint: while the expression for $\Sigma$ is guaranteed to be positive
+   definite for all values of $\sigma_1$ and $\sigma_2$, numerical round-off can mean
+   that the Cholesky decomposition can fail. To guarantee a positive definite
+   $\Sigma$, try adding a small amount (e.g. 1e-5) to the diagonal of $\Sigma$. This
+   is equivalent to adding a very small amount of independent normal noise to
$\mathbf{x}$.
:::solution

@@ -173,53 +170,52 @@ plt.plot(sample_zero_mean_random_field(K1))
plt.plot(sample_zero_mean_random_field(K2))
plt.show()
```
+
:::
::::

-
::::challenge{id=likelihood title="Likelihood"}

-Now imagine that we have a vector of measurements $\mathbf{x}$, and we assume that a
-suitable model for these measurements is that they are generated from a zero-mean,
+Now imagine that we have a vector of measurements $\mathbf{x}$, and we assume that a
+suitable model for these measurements is that they are generated from a zero-mean,
multivariate normal distribution, i.e.

$$
\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \Sigma)
$$

-We assume that the covariance matrix is of the following form, with two parameters
-$\mathbf{\theta} = (\sigma_1, \sigma_2)$.
+We assume that the covariance matrix is of the following form, with two parameters
+$\mathbf{\theta} = (\sigma_1, \sigma_2)$.

$$
\Sigma_{ij} = \sigma_1^2 \exp{(-(i- j)^2/ \sigma_2^2)}
$$

-We can write down the *likelihood* of the covariance parameters $\mathbf{\theta}$, given
-a given dataset $\mathbf{x}$, by using the probability distribution function (PDF) for a
-zero-mean multivariate normal distribution
+We can write down the _likelihood_ of the covariance parameters $\mathbf{\theta}$, given
+a dataset $\mathbf{x}$, by using the probability density function (PDF) for a
+zero-mean multivariate normal distribution

$$
-P(\mathbf{\theta} | \mathbf{x}) = (2 \pi)^{\frac{n}{2}} \text{
-det}(\Sigma)^{\frac{1}{2}} \exp{\left( \frac{1}{2}\mathbf{x}^T \Sigma^{-1}
+P(\mathbf{\theta} | \mathbf{x}) = (2 \pi)^{-\frac{n}{2}}
+\text{det}(\Sigma)^{-\frac{1}{2}} \exp{\left( -\frac{1}{2}\mathbf{x}^T \Sigma^{-1}
\mathbf{x}\right)}
$$

Typically we work with the log of the likelihood for numerical reasons, which is

$$
-\mathcal{L} = -\frac{1}{2} \log(|\Sigma|) + \mathbf{x}^T \Sigma^{-1} \mathbf{x} +
+\mathcal{L} = -\frac{1}{2} \log(|\Sigma|) - \frac{1}{2} \mathbf{x}^T \Sigma^{-1} \mathbf{x} -
\frac{n}{2} \log(2\pi)
$$

-3.
Generate a simulated dataset $\mathbf{x}$ using your code for question (2) using
-   "true" parameters $\mathbf{\theta}^t = (\sigma^t_1, \sigma^t_2)$. Then calculate
-   the log-likelihood using the Cholesky decomposition to efficiently calculate the
-   log determinant and the inverse of the covariance matrix. Vary $\mathbf{\theta}$
-   and satisfy yourself that the maximum of the likelihood occurs at your "true"
-   parameters. In practice, when you don't know the true parameters, you could use an
-   optimisation algorithm to automatically determine the *most likely* model
-   parameters that give rise to your data.
-
+Generate a simulated dataset $\mathbf{x}$ with your code from question (2), using
+"true" parameters $\mathbf{\theta}^t = (\sigma^t_1, \sigma^t_2)$. Then calculate
+the log-likelihood using the Cholesky decomposition to efficiently calculate the
+log determinant and the inverse of the covariance matrix. Vary $\mathbf{\theta}$
+and satisfy yourself that the maximum of the likelihood occurs at your "true"
+parameters. In practice, when you don't know the true parameters, you could use an
+optimisation algorithm to automatically determine the _most likely_ model
+parameters that give rise to your data.
:::solution @@ -248,12 +244,12 @@ plt.clabel(contours, inline=True, fontsize=8) plt.imshow(L, extent=[0.5, 1.5, 5.0, 15.0], origin='lower', cmap='RdGy', alpha=0.5, vmin=levels[0], aspect='auto') -c = plt.colorbar(); +c = plt.colorbar() c.set_label(r'$\mathcal{L}$') plt.xlabel(r'$\sigma_1$') plt.ylabel(r'$\sigma_2$') plt.show() ``` + ::: :::: - diff --git a/scientific_computing/linear_algebra/07-QR-decomposition.md b/scientific_computing/linear_algebra/07-QR-decomposition.md index d05dbd7a..7da12749 100644 --- a/scientific_computing/linear_algebra/07-QR-decomposition.md +++ b/scientific_computing/linear_algebra/07-QR-decomposition.md @@ -1,77 +1,71 @@ --- name: QR decomposition -dependsOn: [ - 'scientific_computing.linear_algebra.06-Cholesky-decomposition', -] +dependsOn: ["scientific_computing.linear_algebra.06-Cholesky-decomposition"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

-
## QR decomposition

## The least-squares problem

-One of the most important application of the $QR$ decomposition is the least squares
-solution of a set of overdetermined equations. That is a set of $m$ linear equations
-with $n$ unknowns, with $m \ge n$. The least squares problem to be solved is the
-mimimisation of $||A x - b ||_2$, where $|| x ||_2 = \sqrt{x_1^2 + x_2^2 + ... + x_m^2}$
-is the standard 2-norm, and where $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ and $b
-\in \mathbb{R}^m$. In this case, the problem $Ax = b$ will often have no solution, and
-thus it is nessessary to consider $Ax$ and $b$ as *approximatelly* equal, and to
+One of the most important applications of the $QR$ decomposition is the least squares
+solution of a set of overdetermined equations, that is, a set of $m$ linear equations
+with $n$ unknowns, with $m \ge n$. The least squares problem to be solved is the
+minimisation of $||A x - b ||_2$, where $|| x ||_2 = \sqrt{x_1^2 + x_2^2 + ... + x_m^2}$
+is the standard 2-norm, and where $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ and $b
+\in \mathbb{R}^m$. In this case, the problem $Ax = b$ will often have no solution, and
+thus it is necessary to consider $Ax$ and $b$ as _approximately_ equal, and to
minimise the distance between them by minimising the loss function $||A x - b||_2$.
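To make the setup concrete, here is a minimal sketch of an overdetermined problem, $m = 5$ equations in $n = 2$ unknowns (a straight-line fit, with data values invented for illustration), solved with `numpy.linalg.lstsq`:

```python
import numpy as np

# five noisy observations of a line y = c0 + c1 * t (made-up data)
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

# m x n design matrix: each row is (1, t_i)
A = np.stack([np.ones_like(t), t], axis=1)

# minimise ||A x - b||_2
x, *_ = np.linalg.lstsq(A, b, rcond=None)
# x ≈ [1.04, 1.0]: intercept and slope of the best-fit line

# the residual of the least-squares solution is orthogonal to the columns of A
residual = b - A @ x
assert np.allclose(A.T @ residual, 0)
```

The orthogonality of the residual to the columns of $A$, checked at the end, is exactly the geometric property used to derive the least-squares solution.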
-To solve this least squares problem, we need to consider the subspace of all vectors in
-$\mathbb{R}^{m}$ that are formed from linear combinations of the columns of $A$. This is
-known as the column space of the $A$, and is denoted as $\text{Col }A$. Given that *any*
-linear combination of the columns of $A$ will lie in this space, we can say that $Ax$
+To solve this least squares problem, we need to consider the subspace of all vectors in
+$\mathbb{R}^{m}$ that are formed from linear combinations of the columns of $A$. This is
+known as the column space of $A$, and is denoted as $\text{Col }A$. Given that _any_
+linear combination of the columns of $A$ will lie in this space, we can say that $Ax$
will also lie in $\text{Col }A$ for any $x$.

-Now consider a projection of $b$ into the column space of $A$ to give a new vector
-$\hat{b}$ (i.e. $\hat{b}$ is the closest point in Col $A$ to $b$), see the diagram
-below. Because $\hat{b}$ is in the column space of $A$, we know that there is another
+Now consider a projection of $b$ into the column space of $A$ to give a new vector
+$\hat{b}$ (i.e. $\hat{b}$ is the closest point in Col $A$ to $b$), see the diagram
+below. Because $\hat{b}$ is in the column space of $A$, we know that there is another
vector $\hat{x}$ that satisfies

$$
A \hat{x} = \hat{b}
$$

-Since $\hat{b}$ is the closest point to $b$ in the column space of $A$, we can therefore
+Since $\hat{b}$ is the closest point to $b$ in the column space of $A$, we can therefore
say that $\hat{x}$ is the least-squares solution.
-
![least squares problem](images/linear-least-squares.svg)
-
-We can show that the vector $b - \hat{b} = b - A \hat{x}$ is orthogonal to Col $A$ and
-therefore also orthogonal to each column in $A$, so we have $a_j^T (b - A \hat{x})$ for
+We can show that the vector $b - \hat{b} = b - A \hat{x}$ is orthogonal to Col $A$ and
+therefore also orthogonal to each column in $A$, so we have $a_j^T (b - A \hat{x}) = 0$ for
each column $a_j$ of $A$. Putting these $n$ equations together we can write

$$
A^T (b - A \hat{x}) = 0
$$

-or rearranged slightly, we can find the least-sqaures solution $\hat{x}$ via the
+or rearranged slightly, we can find the least-squares solution $\hat{x}$ via the
solution of the equation

$$
A^T A \hat{x} = A^T b
$$

-The $QR$ decomposition divides $A = QR$ into an orthogonal matrix $Q$, and an upper
-triangular matrix $R$. Most importantly for the least-squares problem, the matrix $Q$ is
+The $QR$ decomposition divides $A = QR$ into an orthogonal matrix $Q$, and an upper
+triangular matrix $R$. Most importantly for the least-squares problem, the matrix $Q$ is
also an orthonormal basis for Col $A$ and therefore $\hat{b} = Q Q^T b$.
-Given this decomposition, it can be shown that the least squares solution of $A x = b$
+Given this decomposition, it can be shown that the least squares solution of $A x = b$
is given by

$$

@@ -84,10 +78,10 @@ $$
A\hat{x} = QR \hat{x} = QRR^{-1}Q^T b = Q Q^T b = \hat{b}
$$

-Therefore $A\hat{x} = \hat{b}$, which proves that $\hat{x}$ is the least-squares
+Therefore $A\hat{x} = \hat{b}$, which proves that $\hat{x}$ is the least-squares
solution for $A x = b$.

-Finally, we note that the inverse $R^{-1}$ should not be calculated directly, but
+Finally, we note that the inverse $R^{-1}$ should not be calculated directly, but
instead $\hat{x}$ should be found by solving

$$

@@ -96,9 +90,9 @@ $$

### Constructing the QR decomposition

-$QR$ decomposisions are normally computed via Householder reflections, Givens rotations
-or the Gram-Schmidt process. For a brief summary of the first two methods, it is useful
-to consider a simple $2 \times 2$ reflection or rotation of a 2d vector. For example,
+$QR$ decompositions are normally computed via Householder reflections, Givens rotations
+or the Gram-Schmidt process. For a brief summary of the first two methods, it is useful
+to consider a simple $2 \times 2$ reflection or rotation of a 2D vector. For example,
the matrix

$$

@@ -108,11 +102,11 @@ Q = \left(\begin{matrix}
\end{matrix}\right)
$$

-is a *rotation* matrix that when applied to a vector $x$ will result in $y = Qx$, where
-$y$ is rotated counterclockwise through the angle $\theta$. $Q$ is also *orthogonal*
+is a _rotation_ matrix that when applied to a vector $x$ will result in $y = Qx$, where
+$y$ is rotated counterclockwise through the angle $\theta$. $Q$ is also _orthogonal_
since $QQ^T = I$.
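Both properties of the rotation matrix — orthogonality, and counterclockwise rotation that preserves length — are easy to check numerically (the angle below is arbitrary):

```python
import numpy as np

theta = 0.3  # an arbitrary angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Q is orthogonal: Q Q^T = I
assert np.allclose(Q @ Q.T, np.eye(2))

# applying Q rotates a vector counterclockwise through theta,
# preserving its length
x = np.array([1.0, 0.0])
y = Q @ x
assert np.isclose(np.arctan2(y[1], y[0]), theta)
assert np.isclose(np.linalg.norm(y), np.linalg.norm(x))
```

Because orthogonal matrices preserve the 2-norm, multiplying $Ax - b$ by $Q^T$ does not change the least-squares objective — this is why orthogonal transformations are the tool of choice here.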
-Similarly, a $2 \times 2$ *reflection* matrix can be constructed as
+Similarly, a $2 \times 2$ _reflection_ matrix can be constructed as

$$
Q = \left(\begin{matrix}

@@ -121,107 +115,105 @@ Q = \left(\begin{matrix}
\end{matrix}\right)
$$

-which when applied to a vector $x$ will result in $y = Qx$, where $y$ is reflected
+which when applied to a vector $x$ will result in $y = Qx$, where $y$ is reflected
across the line defined by $\text{span}((\cos(\theta), \sin(\theta))^T)$.

-Rotations and reflections are often useful because they can be selected in order to
-introduce zeros to the vector they are applied to. Given an $m \times n$ matrix $A$, a
-series of $n$ *Householder reflections* can be applied to reduce $A$ to an upper
+Rotations and reflections are often useful because they can be selected in order to
+introduce zeros to the vector they are applied to. Given an $m \times n$ matrix $A$, a
+series of $n$ _Householder reflections_ can be applied to reduce $A$ to an upper
triangular matrix $R$

$$
H_n ... H_2 H_1 A = R
$$

-By setting $Q = H_1 H_2 ... H_n$, we can show that $A = QR$, and that $Q$ is an
+By setting $Q = H_1 H_2 ... H_n$, we can show that $A = QR$, and that $Q$ is an
orthogonal matrix which is also an orthonormal basis for the column space of $A$.

-Similarly, a *Givens rotation* can be used to zero a single component of $A$, so that a
+Similarly, a _Givens rotation_ can be used to zero a single component of $A$, so that
a series of rotations can be used to construct the upper triangular matrix $R$

$$
G_j ... G_2 G_1 A = R
$$

-so that $Q = G_1 G_2 ... G_j$, and $A = QR$.
For both the Householder and Givens
+methods, it is often useful to not construct the full matrix $Q$ but to keep $Q$
+factored as an implicit product of either $H_1 H_2 ... H_n$ or $G_1 G_2 ... G_j$. Fast
algorithms exist to calculate the product of these factored forms with another vector.

-The final method to contruct a $QR$ decomposition is using the Gram-Schmidt process,
-which is a process for contructing an orthogonal or orthonormal basis for a given
-subspace defined by the span of the set of vectors $x_1, x_2, ..., x_n$. If these $n$
-vectors are the columns of the $m \times n$ matrix $A$, then the Gram-Schmidt process
-can be used to directly contruct the orthonormal basis of the column space of $A$ given
-by $Q$, and that $A = QR$ where $R$ is an upper triangular matrix. The matrix $R$ can be
-calculated using $R = Q^T A$. Note that the classical Gram-Schmidt exhibits poor
-numerical qualities, therefore a modified version of the algorithm exists, which is
+The final method to construct a $QR$ decomposition is using the Gram-Schmidt process,
+which is a process for constructing an orthogonal or orthonormal basis for a given
+subspace defined by the span of the set of vectors $x_1, x_2, ..., x_n$. If these $n$
+vectors are the columns of the $m \times n$ matrix $A$, then the Gram-Schmidt process
+can be used to directly construct the orthonormal basis of the column space of $A$ given
+by $Q$, such that $A = QR$ where $R$ is an upper triangular matrix. The matrix $R$ can be
+calculated using $R = Q^T A$. Note that the classical Gram-Schmidt process exhibits poor
+numerical properties; therefore a modified version of the algorithm exists, which is
described in the Golub and Van Loan Matrix Computations textbook listed below.

-In terms of computational work, the Householder method takes $2n^2(m-n/3)$ flops to
-compute $Q$ in factored form, and another $2n^2(m-n/3)$ to get the full matrix $Q$,
-whereas the Gram-Schmidt method is more efficient at $2mn^2$ flops.
However, Householder
-is normally prefered in practice as even with the modified algorithm the numerical
-properies of the Gram-Schmidt are still poor in comparison with both Householder and
-Givens (i.e. the final orthogonality of $Q$ is not ideal), so is only useful when the
-columns of $A$ are already fairly independent. Using Givens rotations the matrix $R$ can
-be found in $2n^2(m-n/3)$, or the factorised form of the $QR$ decomposition can be found
-in the same amount of time. The full matrix $Q$ is not normally calculated via Givens
-rotations. Using Givens rotations is most useful when there are only few non-zeros in
+In terms of computational work, the Householder method takes $2n^2(m-n/3)$ flops to
+compute $Q$ in factored form, and another $2n^2(m-n/3)$ to get the full matrix $Q$,
+whereas the Gram-Schmidt method is more efficient at $2mn^2$ flops. However, Householder
+is normally preferred in practice as, even with the modified algorithm, the numerical
+properties of Gram-Schmidt are still poor in comparison with both Householder and
+Givens (i.e. the final orthogonality of $Q$ is not ideal), so it is only useful when the
+columns of $A$ are already fairly independent. Using Givens rotations, the matrix $R$ can
+be found in $2n^2(m-n/3)$ flops, or the factorised form of the $QR$ decomposition can be found
+in the same amount of time. The full matrix $Q$ is not normally calculated via Givens
+rotations. Using Givens rotations is most useful when there are only a few non-zeros in
$A$, and is more easily parallelised than Householder.

### Other Reading

-The discussion in this section relied on concepts such as orthogonal and orthonormal
-vector pairs, vector spaces and subspaces and basis vectors. It is well worth
+The discussion in this section relied on concepts such as orthogonal and orthonormal
+vector pairs, vector spaces and subspaces, and basis vectors. It is well worth
investigating these topics further in:

- Linear algebra and its applications by David C. Lay. 
Chapters 4 & 6.

Additional reading on the $QR$ decomposition can be found at:

-- Linear algebra and its applications by David C. Lay. Chaper 6.4
-- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
+- Linear algebra and its applications by David C. Lay. Chapter 6.4
+- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
Press, 1996). Chapter 5.2
-- https://en.wikipedia.org/wiki/QR_decomposition
+- <https://en.wikipedia.org/wiki/QR_decomposition>

### Software

- [`scipy.linalg.qr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr.html)
+- [`scipy.linalg.qr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr.html)

- [`scipy.linalg.qr_multiply`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_multiply.html)
+- [`scipy.linalg.qr_multiply`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_multiply.html)

- [`scipy.linalg.qr_update`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_update.html)
+- [`scipy.linalg.qr_update`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_update.html)

- [`scipy.linalg.qr_delete`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_delete.html)
- - [`scipy.linalg.qr_insert`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_insert.html)
- - [`numpy.linalg.lstsq`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html)
+- [`scipy.linalg.qr_delete`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_delete.html)
+- [`scipy.linalg.qr_insert`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr_insert.html)
+- [`numpy.linalg.lstsq`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html)

## Problems

::::challenge{id=model-fitting title="Model fitting"}

-For this exercises we will be using some data on Oxford's weather which is hosted by
-[Saad Jbabdi](https://users.fmrib.ox.ac.uk/~saad/) from the Wellcome Centre for
-Integrative NeuroImaging (FMRIB), which can be obtained
+For these exercises we will be using some data on Oxford's weather, which is hosted by
+[Saad Jbabdi](https://users.fmrib.ox.ac.uk/~saad/) from the Wellcome Centre for
+Integrative NeuroImaging (FMRIB), which can be obtained
[here](http://www.fmrib.ox.ac.uk/~saad/ONBI/OxfordWeather.txt).

-We wish to fit a quadratic model of the form $y = a x^2 + b x + c$ to the hours of
-sunlight observed in Oxford (7th column in `OxfordWeather.txt`) versus the month (2nd
-column). The dataset in question has $m > 3$ data points, so our model gives us a set of
-$m$ equations for 3 unknowns $a$, $b$, and $c$ that are overdetermined, that is, for
+We wish to fit a quadratic model of the form $y = a x^2 + b x + c$ to the hours of
+sunlight observed in Oxford (7th column in `OxfordWeather.txt`) versus the month (2nd
+column). The dataset in question has $m > 3$ data points, so our model gives us an
+overdetermined set of $m$ equations for the 3 unknowns $a$, $b$, and $c$; that is, for
each data point $(y_i, x_i)$ for $i=1..m$ we have:

$$
y_i = a x_i^2 + b x_i + c
$$

-Use a $QR$ decomposition to find the least-squares solution to these equations (you can
-check it using `np.linalg.lstsq` if you like), and therefore fit the model to the data.
+Use a $QR$ decomposition to find the least-squares solution to these equations (you can
+check it using `np.linalg.lstsq` if you like), and therefore fit the model to the data.
Plot the model and the data side by side to qualitatively evaluate the fit. 
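For the QR step itself, here is a minimal sketch on synthetic data (a known quadratic standing in for the weather file; the variable names are illustrative, not part of the exercise's required interface):

```python
import numpy as np
import scipy.linalg

# Synthetic stand-in for the weather data: points from y = 2x^2 - 3x + 1
x = np.linspace(0.0, 12.0, 50)
y = 2.0 * x**2 - 3.0 * x + 1.0

# Design matrix for y = a x^2 + b x + c: one row per data point
A = np.column_stack([x**2, x, np.ones_like(x)])

# Economy-size QR: A = QR with Q (m x 3, orthonormal columns), R (3 x 3, upper triangular)
Q, R = scipy.linalg.qr(A, mode="economic")

# The least-squares solution satisfies R beta = Q^T y, solved by back-substitution
beta = scipy.linalg.solve_triangular(R, Q.T @ y)

# Cross-check against numpy's built-in least-squares routine
beta_check, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta)  # approximately [2, -3, 1]
```

The same three lines (design matrix, economy QR, triangular solve) carry over directly once `x` and `y` come from the weather file instead.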
:::solution @@ -233,8 +225,8 @@ import numpy as np import scipy.linalg names = ['year', 'month', 'maxTemp', 'minTemp', 'hoursFrost', 'rain', 'hoursSun'] -df = pd.read_csv('OxfordWeather.txt', - delim_whitespace=True, header=None, names=names) +df = pd.read_csv('OxfordWeather.txt', + delim_whitespace=True, header=None, names=names) x = df.month.values.reshape(-1,1) y = df.hoursSun.values.reshape(-1,1) @@ -261,5 +253,6 @@ plt.xlabel('month') plt.ylabel('hoursSun') plt.show() ``` + ::: :::: diff --git a/scientific_computing/linear_algebra/index.md b/scientific_computing/linear_algebra/index.md index c05cb2c6..e259c513 100644 --- a/scientific_computing/linear_algebra/index.md +++ b/scientific_computing/linear_algebra/index.md @@ -1,35 +1,32 @@ --- id: linear_algebra name: Linear Algebra Solvers -dependsOn: [ - scientific_computing.essential_maths, -] -files: [ - 01-matrix-form-of-equations.md, - 02-gaussian-elimination.md, - 04-LU-decomposition.md, - 06-Cholesky-decomposition.md, - 07-QR-decomposition.md, -] +dependsOn: [scientific_computing.essential_maths] +files: + [ + 01-matrix-form-of-equations.md, + 02-gaussian-elimination.md, + 04-LU-decomposition.md, + 06-Cholesky-decomposition.md, + 07-QR-decomposition.md, + ] summary: | This course covers the basics of linear algebra focusing on the solution of dense systems of linear equations using direct methods and matrix decompositions. Both the theory and implementation of these methods are covered, as well as some standard Python libraries for linear algebra. -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- -Linear algebra is the branch of mathematics concerning sets of linear equations +Linear algebra is the branch of mathematics concerning sets of linear equations of the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, which can be written using matrices and vectors as $Ax = b$. 
The solution of these types of equations is fundamental to many areas of diff --git a/scientific_computing/ode_solvers/01-AM.md b/scientific_computing/ode_solvers/01-AM.md index cffbeadf..a295c956 100644 --- a/scientific_computing/ode_solvers/01-AM.md +++ b/scientific_computing/ode_solvers/01-AM.md @@ -1,103 +1,103 @@ --- name: ODE Solvers - AM Exercises -dependsOn: [ -] +dependsOn: [] tags: [] -attribution: -- citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - # Solving Ordinary Differential Equations (ODEs) -**You can download the slides for this lecture +**You can download the slides for this lecture [here](slides/LectureSolvingODEs.pdf)** The intentions behind this exercise are: + - To give you a clear goal to meet - To give you a chance to revise or learn a few things. Specifically: - The abstract base class pattern; - Unit testing with `unittest`; - Ordinary differential equation solver implementations. -A nice feature of this assignment is that you are provided with a testing infrastructure +A nice feature of this assignment is that you are provided with a testing infrastructure which gives an -implicit specification of the code you are required to write. This means that when +implicit specification of the code you are required to write. This means that when you've written code such that all the tests pass then you will know that you have completed the exercise. -### Getting the code for these exercises +## Getting the code for these exercises -Download the file [ODEsAM.zip](slides/ODEsAM.zip). You have been -provided with one source class (`AbstractOdeSolver`) and with two test files. -You will not need to edit any of these 3 source files. If you do make edits to -them during the course of the exercise then make sure that your final version -works with the original versions of these files. +Download the file [ODEsAM.zip](slides/ODEsAM.zip). 
You have been +provided with one source class (`AbstractOdeSolver`) and with two test files. +You will not need to edit any of these 3 source files. If you do make edits to +them during the course of the exercise then make sure that your final version +works with the original versions of these files. ::::challenge{id=forward-euler title="Forward Euler"} -**Running first test**: +**Running first test**: -Check to see what running the simplest test-suite `TestOdeSolvers.py` does at this -stage. It should stop with `ModuleNotFoundError: No module named +Check to see what running the simplest test-suite `TestOdeSolvers.py` does at this +stage. It should stop with `ModuleNotFoundError: No module named 'ForwardEulerOdeSolver'`. In order for the test to run, you need to create the -`ForwardEulerOdeSolver` class in `ForwardEulerOdeSolver.py`. This class should +`ForwardEulerOdeSolver` class in `ForwardEulerOdeSolver.py`. This class should inherit from `AbstractOdeSolver`, and should provide the one method: -`Solve()`. (This method's implementation does not yet have to contain any code -other than `pass`.) Now check that the `TestOdeSolvers.py` test-suite runs. It -is expected to throw exceptions at this stage. +`Solve()`. (This method's implementation does not yet have to contain any code +other than `pass`.) Now check that the `TestOdeSolvers.py` test-suite runs. It +is expected to throw exceptions at this stage. + +:::solution -:::solution ```python from AbstractOdeSolver import AbstractOdeSolver # Abstract base class class ForwardEulerOdeSolver(AbstractOdeSolver): - def Solve(self): pass + def Solve(self): + pass ``` -This answer is just for completeness. There's nothing to this part + +This answer is just for completeness. There's nothing to this part apart from a warm-up. ::: -**Writing a forward Euler solver**: - -The good news is that 1 out of 3 of the tests in the `TestOdeSolvers` test-suite -pass immediately. 
The first test `test_abstract_class_methods` verifies that -all the functionality I have written in the abstract base class works and -consequently the test passes. The other two tests which check -`ForwardEulerOdeSolver.Solve()` works correctly are the tests which are failing. -If you need a revision tutorial about the Forward Euler method then please ask -now. - -Your task is to implement the `Solve()` method so that it solves from the -initial values over a time-interval with a fixed time step using forward Euler. -On exit from the method you should have populated `solutionTrace` with a list of -$(x,y)$ values (as pairs/tuples) and `timeTrace` with the list of corresponding -times. The test-suite uses two simple model ordinary differential 2-d systems -of equations. These are plugged into the solver as right-hand side functions. +**Writing a forward Euler solver**: + +The good news is that 1 out of 3 of the tests in the `TestOdeSolvers` test-suite +pass immediately. The first test `test_abstract_class_methods` verifies that +all the functionality I have written in the abstract base class works and +consequently the test passes. The other two tests which check +`ForwardEulerOdeSolver.Solve()` works correctly are the tests which are failing. +If you need a revision tutorial about the Forward Euler method then please ask +now. + +Your task is to implement the `Solve()` method so that it solves from the +initial values over a time-interval with a fixed time step using forward Euler. +On exit from the method you should have populated `solutionTrace` with a list of +$(x,y)$ values (as pairs/tuples) and `timeTrace` with the list of corresponding +times. The test-suite uses two simple model ordinary differential 2-d systems +of equations. These are plugged into the solver as right-hand side functions. They are defined at the top of the test-suite file and both have straightforward analytic solutions. 
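For orientation, one forward Euler sweep over a 2-d system can be sketched as follows; this is a generic illustration under assumed names, not the assignment's `AbstractOdeSolver` interface:

```python
import math

def forward_euler(rhs, y0, t0, t1, n_steps):
    """Integrate dy/dt = rhs(y, t) over [t0, t1] with n_steps fixed steps.

    Returns (times, states); states is a list of (x, y) tuples.
    """
    dt = (t1 - t0) / n_steps
    times, states = [t0], [y0]
    y = y0
    for i in range(n_steps):
        dydt = rhs(y, times[-1])
        y = (y[0] + dt * dydt[0], y[1] + dt * dydt[1])
        # Recompute time from the step index rather than accumulating dt,
        # which avoids drift from repeated floating-point addition
        times.append(t0 + (i + 1) * dt)
        states.append(y)
    return times, states

# Circle system x' = -y, y' = x starting at (1, 0): one full turn takes 2*pi
ts, ys = forward_euler(lambda y, t: (-y[1], y[0]), (1.0, 0.0),
                       0.0, 2.0 * math.pi, 10000)
print(ys[-1])  # close to the starting point (1, 0)
```

Note how the initial state is recorded before the loop, mirroring the traces the tests expect.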
-As you develop your `Solve()` method you should reduce the number of failing -lines. One or two of the test lines are harder to satisfy than others. Attempt -to have no failures but the fewer the better. Please make sure that you have +As you develop your `Solve()` method you should reduce the number of failing +lines. One or two of the test lines are harder to satisfy than others. Attempt +to have no failures but the fewer the better. Please make sure that you have comments in your code. - :::solution -Here's a version which first builds the lists to the correct length and then +Here's a version which first builds the lists to the correct length and then writes in using indexing. ```python -def Solve(self): +class ForwardEulerOdeSolver(AbstractOdeSolver): + def Solve(self): """ The forward Euler solve method """ # Fulfil the "throw if not setup" test if self.numberOfTimeSteps <= 0 : @@ -124,16 +124,9 @@ def Solve(self): # Update solution and time to new values values = (x,y) - time = self.startTime + (i+1)*self.timeStepSize; + time = self.startTime + (i+1)*self.timeStepSize self.solutionTrace[i+1] = values self.timeTrace[i+1] = time -``` - -Here's an outline of a version with appends to the back of the lists. -This code also illustrates where testing may fail if the time-stepper uses -repeated addition. - -```python # Initial state is recorded self.solutionTrace.append(self.initialValues) self.timeTrace.append(self.startTime) @@ -143,27 +136,29 @@ repeated addition. values = self.initialValues for i in range(0, self.numberOfTimeSteps): - ... # Update solution and time to new values values = (x,y) - time = self.startTime + (i+1)*self.timeStepSize; - accumulated_time += self.timeStepSize; # Wrong way to do this - self.solutionTrace.append(values); + time = self.startTime + (i+1)*self.timeStepSize + accumulated_time += self.timeStepSize # Wrong way to do this + self.solutionTrace.append(values) self.timeTrace.append(accumulated_time) - ... 
``` +Here's an outline of a version with appends to the back of the lists. +This code also illustrates where testing may fail if the time-stepper uses +repeated addition. + ::: :::: ::::challenge{id=higher-order title="A Higher-Order Solver"} -Now run `TestHigherOrderOdeSolverRunner`. This new test-suite +Now run `TestHigherOrderOdeSolverRunner`. This new test-suite requires you make another class `HigherOrderOdeSolver.py` which should -again provide only a `Solve()` method. It tests the convergence behaviour of -your two solvers and checks that the higher-order one is better. My model answer -(which passes the test) uses a standard second order Runge-Kutta method. Again -work on your solution until there are few lines (ideally no lines) of failing +again provide only a `Solve()` method. It tests the convergence behaviour of +your two solvers and checks that the higher-order one is better. My model answer +(which passes the test) uses a standard second order Runge-Kutta method. Again +work on your solution until there are few lines (ideally no lines) of failing tests. :::solution @@ -213,20 +208,21 @@ class HigherOrderOdeSolver(AbstractOdeSolver): ::: :::: - - ::::challenge{id=library-function title="Using a library function"} Write a solver implementation which uses `scipy.integrate.odeint` at its core. -You should be able to do this with very few lines of code, including just a -single call to `odeint`. Test it in a similar way to the +You should be able to do this with very few lines of code, including just a +single call to `odeint`. Test it in a similar way to the `HigherOrderOdeSolver`, using progressively smaller time-steps. -- What do you notice about the reported error? +- What do you notice about the reported error? - What control do you have over the error? 
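As a hint for the shape of the call, a minimal `odeint` sketch on the circle system $x' = -y$, $y' = x$ looks like this (names are illustrative rather than the exercise's solver class):

```python
import numpy as np
from scipy.integrate import odeint

def rhs(y, t):
    # Circle system: x' = -y, y' = x
    return [-y[1], y[0]]

# Even with only a handful of requested output times, odeint takes its own
# adaptive internal steps between them, controlled by rtol and atol
times = np.linspace(0.0, 2.0 * np.pi, 5)
trace = odeint(rhs, [1.0, 0.0], times, rtol=1e-10, atol=1e-10)
print(trace[-1])  # close to the analytic end point (1, 0)
```

Tightening or loosening `rtol`/`atol` is the main control you have over the reported error.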
:::solution + ```python +import numpy as np +from scipy.integrate import odeint class OdeIntOdeSolver(AbstractOdeSolver): def Solve(self): """ The solve method using scipy.integrate.odeint""" @@ -242,8 +238,9 @@ class OdeIntOdeSolver(AbstractOdeSolver): With the default setting and the same test as before I get something similar to: -``` -#step_size euler_error hi_error oi_error + +```text +#step_size euler_error hi_error oi_error 6.283185307179586 4.442882938158366 14.647777676841754 1.0706378471935034e-07 3.141592653589793 7.088890675779233 14.73085508342019 1.1477237333346198e-07 1.5707963267948966 6.5841823894717395 2.953368442233411 1.1211067003602387e-07 @@ -251,21 +248,21 @@ similar to: ``` This is because `odeint` has absolute and relative error tolerances -set and it uses adaptive time-steps *between* the time-steps -which we care about. That means that the first $2\pi$ step around the +set and it uses adaptive time-steps _between_ the time-steps +which we care about. That means that the first $2\pi$ step around the circle is about as accurate as the finer tests which follow it. We can control the error by setting the error tolerances, with -```python +```python nolint self.solutionTrace =\ odeint(self.rhsFunction, self.initialValues, times, rtol=1e-12, atol=1e-12) ``` we get: -``` -#step_size euler_error hi_error oi_error +```text +#step_size euler_error hi_error oi_error 6.283185307179586 4.442882938158366 14.647777676841754 2.4189691882833333e-12 3.141592653589793 7.088890675779233 14.73085508342019 2.2407390802495764e-12 1.5707963267948966 6.5841823894717395 2.953368442233411 1.990068084042871e-12 @@ -277,18 +274,16 @@ we get: ::::challenge{id=convergence title="Convergence Behaviour"} - **[Extension]** Use the test output in `TestHigherOrderOdeSolver` as the +**[Extension]** Use the test output in `TestHigherOrderOdeSolver` as the model for plotting convergence behaviour for these two solvers and at -least one other solver that you should write. 
Use `matplotlib` or some other -plotting program to make a log-log plot with time-step on the x-axis and error +least one other solver that you should write. Use `matplotlib` or some other +plotting program to make a log-log plot with time-step on the x-axis and error on the y-axis. :::: - ::::challenge{id=cpp title="C++"} -**[Extension]** If you love `C++` then you are welcome to investigate coding ODE -solvers in C++ by investigating the assignment which formed the basis of this exercise: +**[Extension]** If you love `C++` then you are welcome to investigate coding ODE +solvers in C++ by investigating the assignment which formed the basis of this exercise: [https://www.cs.ox.ac.uk/people/joe.pitt-francis/oxfordonly/FridayAssignment.pdf](https://www.cs.ox.ac.uk/people/joe.pitt-francis/oxfordonly/FridayAssignment.pdf) :::: - diff --git a/scientific_computing/ode_solvers/02-PM.md b/scientific_computing/ode_solvers/02-PM.md index dde99627..fea0acd9 100644 --- a/scientific_computing/ode_solvers/02-PM.md +++ b/scientific_computing/ode_solvers/02-PM.md @@ -1,112 +1,109 @@ --- name: ODE Solvers - PM Exercises -dependsOn: [ - scientific_computing.ode_solvers.01-AM, -] +dependsOn: [scientific_computing.ode_solvers.01-AM] tags: [] -attribution: -- citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- -In these exercises we will use Python to solve a variety of Ordinary Differential -Equations (ODEs), including initial and boundary value problems. Prior to attempting the -exercises you should be familiar with how to solve first order ODEs using separable -solutions and integrating factors. You should also know how solve a second order ODE -with constant coefficients and how to reduce a general second order ODE to a system of -coupled first order equations. If you are not familiar with this material then speak to -a module leader or demonstrator who will give you a tutorial. 
Advanced exercises are
-marked with a star ($\star$); you should attempt these only if you have time and have
+In these exercises we will use Python to solve a variety of Ordinary Differential
+Equations (ODEs), including initial and boundary value problems. Prior to attempting the
+exercises you should be familiar with how to solve first order ODEs using separable
+solutions and integrating factors. You should also know how to solve a second order ODE
+with constant coefficients and how to reduce a general second order ODE to a system of
+coupled first order equations. If you are not familiar with this material then speak to
+a module leader or demonstrator who will give you a tutorial. Advanced exercises are
+marked with a star ($\star$); you should attempt these only if you have time and have
completed the other exercises.

## Motivation

-Ordinary differential equations are used very frequently in modelling biological
-and physiological processes. Most commonly they are used to model the way in
-which quantities of interest (such as concentrations of drugs, viral load, or
-population densities) change as a function of time. The earlier exercises below
-are revision of the kinds of ODEs you may have encountered at A level or as an
-undergraduate. The later exercises are taken from real models of chemical and
+Ordinary differential equations are used very frequently in modelling biological
+and physiological processes. Most commonly they are used to model the way in
+which quantities of interest (such as concentrations of drugs, viral load, or
+population densities) change as a function of time. The earlier exercises below
+are revision of the kinds of ODEs you may have encountered at A level or as an
+undergraduate. The later exercises are taken from real models of chemical and
biological systems.

- The intentions behind this exercise are:
+
- To give you a clear goal to meet
- To give you a chance to revise or learn a few things. 
Specifically: - The abstract base class pattern; - Unit testing with `unittest`; - Ordinary differential equation solver implementations. -A nice feature of this assignment is that you are provided with a testing +A nice feature of this assignment is that you are provided with a testing infrastructure which gives an -implicit specification of the code you are required to write. This means that +implicit specification of the code you are required to write. This means that when you've written code -such that all the tests pass then you will know that you have completed the +such that all the tests pass then you will know that you have completed the exercise. - ::::challenge{id=initial-value-problems title="Initial value problems"} Solve the following initial value problems by hand: - 1. $\dfrac{\textrm{d} y}{\textrm{d} x} = x^{2}$, $\;$ subject to $y(0)=1$. - 2. $\dfrac{\textrm{d} x}{\textrm{d} t} = \dfrac{t^{2}}{x}$, $\;$ subject to $x(0)=1$. - 3. $\dfrac{\textrm{d} y}{\textrm{d} x} + \dfrac{y}{x}=1$, $\;$ subject to $y(1)=0$. - 4. $\dfrac{\textrm{d} y_{1}}{\textrm{d} x} = -y_{2}$ and $\dfrac{\textrm{d} - y_{2}}{\textrm{d} x} = y_{1}$, $\;$ subject to - $y_{1}(0)=1$ and $y_{2}(0)=0$. - 5. $\dfrac{\textrm{d}^{2} y}{\textrm{d} x^{2}} +3\dfrac{\textrm{d} y}{\textrm{d} x} +1. $\dfrac{\textrm{d} y}{\textrm{d} x} = x^{2}$, $\;$ subject to $y(0)=1$. +2. $\dfrac{\textrm{d} x}{\textrm{d} t} = \dfrac{t^{2}}{x}$, $\;$ subject to $x(0)=1$. +3. $\dfrac{\textrm{d} y}{\textrm{d} x} + \dfrac{y}{x}=1$, $\;$ subject to $y(1)=0$. +4. $\dfrac{\textrm{d} y_{1}}{\textrm{d} x} = -y_{2}$ and $\dfrac{\textrm{d} + y_{2}}{\textrm{d} x} = y_{1}$, $\;$ subject to + $y_{1}(0)=1$ and $y_{2}(0)=0$. +5. $\dfrac{\textrm{d}^{2} y}{\textrm{d} x^{2}} +3\dfrac{\textrm{d} y}{\textrm{d} x} -4y=0$, $\;$ subject to $y(0)=1$ and $y'(0)=0$. - Solve each of the above initial value problems numerically using your best Python ODE solver and compare with the analytic solutions. 
Note that in order to solve the last
problem numerically you will need to reformulate the equation as a system of first order
-ODEs. See if, for some of your answers, you can make a figure similar to those below,
-made using model solutions.
+ODEs. See if, for some of your answers, you can make a figure similar to those below,
+made using model solutions.

-![Solution for question 2.](images/Figure_1a.png)
-![Solution for question 4.](images/Figure_1b.png)
+![Solution for question 2.](images/Figure_1a.png)
+![Solution for question 4.](images/Figure_1b.png)

:::solution

The analytical solutions to the initial value problems are given as follows:

- 1. $\dfrac{\textrm{d} y}{\textrm{d} x} = x^{2}$, $y(0)=1$ $\;$ $\Rightarrow$ $\;$
- $y(x)=\frac{x^{3}}{3}+1.$}
- 2. $\dfrac{\textrm{d} x}{\textrm{d} t} = \dfrac{t^{2}}{x}$, $x(0)=1$ $\;$
- $\Rightarrow$ \ $x(t)=\sqrt{\frac{2t^{3}}{3}+1}.$
- 3. $\dfrac{\textrm{d} y}{\textrm{d} x} + \dfrac{y}{x}=1$, $y(1)=0$ $\;$
- $\Rightarrow$ $\;$ $y(x)=\frac{x}{2}-\frac{1}{2x}.$
- 4. $\dfrac{\textrm{d} y_{1}}{\textrm{d} x} = -y_{2}$, $\dfrac{\textrm{d}
- y_{2}}{\textrm{d} x} = y_{1}$, $y_{1}(0)=1$, $y_{2}(0)=0$ $\;$ $\Rightarrow$ $\;$
- $y_{1}(x)=\cos{x}, \; y_{1}(x)=\sin{x}.$
- 5. $\dfrac{\textrm{d}^{2} y}{\textrm{d} x^{2}} +3\dfrac{\textrm{d} y}{\textrm{d} x}
- -4y=0$, $y(0)=1$, $y'(0)=0$ $\;$ $\Rightarrow$ $\;$ $y(x)=\frac{4}{5}e^{x}+
+1. $\dfrac{\textrm{d} y}{\textrm{d} x} = x^{2}$, $y(0)=1$ $\;$ $\Rightarrow$ $\;$
+   $y(x)=\frac{x^{3}}{3}+1.$
+2. $\dfrac{\textrm{d} x}{\textrm{d} t} = \dfrac{t^{2}}{x}$, $x(0)=1$ $\;$
+   $\Rightarrow$ $\;$ $x(t)=\sqrt{\frac{2t^{3}}{3}+1}.$
+3. $\dfrac{\textrm{d} y}{\textrm{d} x} + \dfrac{y}{x}=1$, $y(1)=0$ $\;$
+   $\Rightarrow$ $\;$ $y(x)=\frac{x}{2}-\frac{1}{2x}.$
+4. $\dfrac{\textrm{d} y_{1}}{\textrm{d} x} = -y_{2}$, $\dfrac{\textrm{d}
+   y_{2}}{\textrm{d} x} = y_{1}$, $y_{1}(0)=1$, $y_{2}(0)=0$ $\;$ $\Rightarrow$ $\;$
+   $y_{1}(x)=\cos{x}, \; y_{2}(x)=\sin{x}.$
+5. 
$\dfrac{\textrm{d}^{2} y}{\textrm{d} x^{2}} +3\dfrac{\textrm{d} y}{\textrm{d} x} + -4y=0$, $y(0)=1$, $y'(0)=0$ $\;$ $\Rightarrow$ $\;$ $y(x)=\frac{4}{5}e^{x}+ \frac{1}{5}e^{-4x} .$ - -The following example Python codes can be modified to solve all the above +The following example Python codes can be modified to solve all the above initial value problems: -```python +```python nolint """ Code for question (ii) """ +from matplotlib import pyplot as plt +from scipy.integrate import odeint +import numpy as np +import matplotlib.pyplot as plt # Function to solve dxdt=tˆ2/x. def dxdtfor2dsolver(u, t): - # dxdt=t*t/x (dydt is unused) - return (t*t/u[0], 0) + # dxdt=t*t/x (dydt is unused) + return (t*t/u[0], 0) def dxdt(x, t): - # dxdt=t*t/x - return t*t/x + # dxdt=t*t/x + return t*t/x # My best solver solver = HigherOrderOdeSolver() @@ -122,7 +119,7 @@ oi = odeint(dxdt, 1.0, solver.GetTimeTrace()) t = solver.GetTimeTrace() analytic = np.sqrt((np.power(t, 3)*2)/3 + 1.0) -plt.plot(solver.GetTimeTrace(),solver.GetXTrace(),'b+', +plt.plot(solver.GetTimeTrace(),solver.GetXTrace(),'b+', label='My best solver') plt.plot(solver.GetTimeTrace(), oi, 'gx', label='odeint') plt.plot(solver.GetTimeTrace(), analytic, 'r', label='Analytic') @@ -133,8 +130,13 @@ plt.ylabel('x') plt.show() ``` -```python +```python nolint """ Code for question (iv) """ +from matplotlib import pyplot as plt +from scipy.integrate import odeint +import numpy as np +import matplotlib.pyplot as plt +import math def dydx(y, x): # This one ought to look familiar (rhs_circle) return (-y[1], y[0]) @@ -156,7 +158,7 @@ y = solver.GetYTrace() analyticx = np.cos(t) analyticy = np.sin(t) -plt.plot(solver.GetXTrace(), solver.GetYTrace(), 'b+', +plt.plot(solver.GetXTrace(), solver.GetYTrace(), 'b+', label='My best solver') plt.plot(oi[:,0], oi[:,1], 'gx', label='odeint') plt.plot(analyticx, analyticy, 'r', label='Analytic') @@ -171,7 +173,6 @@ plt.show() ::: :::: - ::::challenge{id=boundary-value-problems title="Boundary 
value problems"} Use `scipy.integrate.solve_bvp` to solve the boundary value problem @@ -182,19 +183,22 @@ $$ \end{aligned} $$ -subject to the boundary conditions $y(0)=1$ and $y(1)=1$. +subject to the boundary conditions $y(0)=1$ and $y(1)=1$. -Solve the same boundary value problem, but now with the boundary conditions $y'(0) = 0$ +Solve the same boundary value problem, but now with the boundary conditions $y'(0) = 0$ and $y(1)=1$. :::solution ```python +from scipy.integrate import solve_bvp +import matplotlib.pyplot as plt +import numpy as np # Solving y'' + 3y'- -4y # y[0] is y, y[1] is y' # dy[0]/dx = y[1] and dy[1]/dx = -3y[1]+4y[0] def dydx(x, y): - return np.vstack((y[1], -3*y[1]+4*y[0])) + return np.vstack((y[1], -3*y[1]+4*y[0])) def bcs_a(yat0, yat1): # Dirichlet: y at both ends = 1, i.e. y(x=1)-1 = 0 @@ -209,22 +213,23 @@ sol_a = solve_bvp(dydx, bcs_a, x, init_y) sol_b = solve_bvp(dydx, bcs_b, x, init_y) plt.plot(sol_a.x, sol_a.y[0], 'b-+', label='(a)') plt.plot(sol_b.x, sol_b.y[0], 'r-*', label='(b)') -plt.title('Exercise (a)'); plt.legend() -plt.xlabel('x'); plt.ylabel('y') +plt.title('Exercise (a)') +plt.legend() +plt.xlabel('x') +plt.ylabel('y') plt.show() ``` ::: :::: - ::::challenge{id=chemical-reaction-systems title="Chemical reaction systems"} -Mathematical models of simple chemical or biochemical reaction mechanisms often take the -form of non-linear systems of ordinary differential equations (derived using the -standard chemical laws of mass action). Often the various reactions making up the system -happen on very different time scales leading to a *stiff* system. An example is -Robertson's chemical reaction model, in which the concentrations of three reacting +Mathematical models of simple chemical or biochemical reaction mechanisms often take the +form of non-linear systems of ordinary differential equations (derived using the +standard chemical laws of mass action). 
Often the various reactions making up the system +happen on very different time scales leading to a _stiff_ system. An example is +Robertson's chemical reaction model, in which the concentrations of three reacting chemical species evolve according to the system of equations $$ @@ -244,40 +249,41 @@ $$ $$ **Note: your mileage may vary with this question,** because with more modern versions - of `scipy` it becomes increasingly hard to stop the integration - scheme being "clever" and using a sophisticated scheme earlier. - -1. Read the documentation for `scipy.integrate.ode`. Solve this - system using the Dormand \& Prince `dopri` solver, which is a - high-order Runge-Kutta solver, until $t=100$ in steps of $\Delta t=1$. (Note that the interface is a bit - more difficult to set up than `odeint` but an example is given with - the documentation.) What warning do you get from the solver? Can you change any of the - `dopri` parameters to get rid of warnings? How long does the - integrator take to solve the system? +of `scipy` it becomes increasingly hard to stop the integration +scheme being "clever" and using a sophisticated scheme earlier. + +1. Read the documentation for `scipy.integrate.ode`. Solve this + system using the Dormand \& Prince `dopri` solver, which is a + high-order Runge-Kutta solver, until $t=100$ in steps of $\Delta t=1$. (Note that the interface is a bit + more difficult to set up than `odeint` but an example is given with + the documentation.) What warning do you get from the solver? Can you change any of the + `dopri` parameters to get rid of warnings? How long does the + integrator take to solve the system? 2. If you are still getting warnings for the `dopri` then you will - have a hint to help you select a better `scipy.integrate.ode` - method. Switch to such a method. Again work to remove all warnings. + have a hint to help you select a better `scipy.integrate.ode` + method. Switch to such a method. Again work to remove all warnings. 3. 
($\star$) Explain what is happening mathematically and chemically. 4. Repeat parts (a)--(b) for the system -$$ -\begin{aligned} - \dfrac{\textrm{d} y_{1}}{\textrm{d} x} &= -0.04y_1 + y_2 y_3, \\ - \dfrac{\textrm{d} y_{2}}{\textrm{d} x} &=0.04y_1 - y_2 y_3 - 30y_2^2,\\ - \dfrac{\textrm{d} y_{3}}{\textrm{d} x} &= 30y_2^2, -\end{aligned} -$$ + $$ + \begin{aligned} + \dfrac{\textrm{d} y_{1}}{\textrm{d} x} &= -0.04y_1 + y_2 y_3, \\ + \dfrac{\textrm{d} y_{2}}{\textrm{d} x} &=0.04y_1 - y_2 y_3 - 30y_2^2,\\ + \dfrac{\textrm{d} y_{3}}{\textrm{d} x} &= 30y_2^2, + \end{aligned} + $$ -with the same initial conditions. Which solver is faster now? Which -solver gives you warnings now? + with the same initial conditions. + Which solver is faster now? + Which solver gives you warnings now? 5. Which solver should you use in which situation? -6. ($\star$) If you are feeling brave then assess how one of the fixed time-step solvers +6. ($\star$) If you are feeling brave then assess how one of the fixed time-step solvers you have written yourself measures up to the - solver you used in part (b). There are 3 species so you may want - to extend the functionality to cope with $y_3$. Use the solution - from part (b) as a reference to measure your error. How small do - you need to make your time-step to get within a particular error? + solver you used in part (b). There are 3 species so you may want + to extend the functionality to cope with $y_3$. Use the solution + from part (b) as a reference to measure your error. How small do + you need to make your time-step to get within a particular error? :::solution @@ -285,8 +291,8 @@ Code for all parts of this question is given below. To answer part (c) there is a large discrepancy between numerical values (or concentrations) so effects are occurring on different -scales. The proper term for this is *multiscale* but in ODEs -we often call these systems "stiff". Problems arise a little after the +scales. 
The proper term for this is _multiscale_ but in ODEs
+we often call these systems "stiff". Problems arise a little after the
 initial time when there is a small amount of $y_2$ which leads to
 massive gradients.

@@ -299,7 +305,6 @@
 import matplotlib.pyplot as plt
 import time
 from scipy.integrate import ode

-#def Robertson(Y,t):
 def Robertson(t, Y):
     dYdt = [-0.04*Y[0] + 10000*Y[1]*Y[2],
             0.04*Y[0] - 10000*Y[1]*Y[2] - 30000000*Y[1]**2,
@@ -313,8 +318,12 @@
 def SimpleODE(t, Y):
     return dYdt

-t=0; y=[1,0,0]
-times=[t]; y1=[1]; y2=[0]; y3=[0]
+t=0
+y=[1,0,0]
+times=[t]
+y1=[1]
+y2=[0]
+y3=[0]

 # Experiments for part (a)
 solver = ode(Robertson).set_integrator('dopri')
@@ -375,27 +384,26 @@ plt.show()

 :::
 ::::

-
 ::::challenge{id=zombie-invasion title="When Zombies Attack!"}

-($\star$)Inspired by a famous SIR model for epidemics there is a
+($\star$) Inspired by a famous SIR model for epidemics there is an
 SZR model for zombie invasion. The paper describing this model is
 available to download from
-[https://mysite.science.uottawa.ca/rsmith43/Zombies.pdf](https://mysite.science.uottawa.ca/rsmith43/Zombies.pdf)
+[https://mysite.science.uottawa.ca/rsmith43/Zombies.pdf](https://mysite.science.uottawa.ca/rsmith43/Zombies.pdf)
 which includes modelling code in `Matlab`.

-A more advanced model included in the paper, which includes latent infection and
+A more advanced model included in the paper, which includes latent infection and
 quarantine, is known as the SIZRQ model. 
This model is defined by

$$
\begin{aligned}
\dfrac{\textrm{d} S}{\textrm{d} t} &= \Pi - \beta SZ - \delta S, \\
- \dfrac{\textrm{d} I}{\textrm{d} t} &= \beta S Z -\rho I - \delta I - \kappa I,
+ \dfrac{\textrm{d} I}{\textrm{d} t} &= \beta S Z -\rho I - \delta I - \kappa I,
\\
- \dfrac{\textrm{d} Z}{\textrm{d} t} &= \rho I + \zeta R - \alpha S Z -\sigma Z,
+ \dfrac{\textrm{d} Z}{\textrm{d} t} &= \rho I + \zeta R - \alpha S Z -\sigma Z,
\\
- \dfrac{\textrm{d} R}{\textrm{d} t} &= \delta S + \delta I + \alpha SZ -\zeta R +
+ \dfrac{\textrm{d} R}{\textrm{d} t} &= \delta S + \delta I + \alpha SZ -\zeta R +
\gamma Q, \\
\dfrac{\textrm{d} Q}{\textrm{d} t} &= \kappa I +\sigma Z - \gamma Q,
\end{aligned}
@@ -413,43 +421,57 @@
and parameter values

$$
\begin{aligned}
- \Pi = 0, \;
- \alpha = 0.005, \;
- & \beta = 0.0095, \;
- \zeta = 0.1, \;
+ \Pi = 0, \;
+ \alpha = 0.005, \;
+ & \beta = 0.0095, \;
+ \zeta = 0.1, \;
 \delta = 0.0001, \\
- \rho = 0.5, \;
- & \kappa = 0.1, \;
- \sigma = 0.01, \;
+ \rho = 0.5, \;
+ & \kappa = 0.1, \;
+ \sigma = 0.01, \;
 \gamma = 0.01.
\end{aligned}
$$

1. Solve this system of equations numerically.

-2. How realistic is this type of model? Can you think of any improvements to the
+2. How realistic is this type of model? Can you think of any improvements to the
 model?

:::solution

```python
+import matplotlib.pyplot as plt
+import numpy as np
+from scipy.integrate import odeint
# Function to solve the SIZRQ Zombie ODE system. 
-alpha = 0.005; zeta = 0.1; rho = 0.5; -sigma = 0.01; pi = 0; beta = 0.0095; -delta = 0.0001; kappa = 0.1;gamma = 0.01; +alpha = 0.005 +zeta = 0.1 +rho = 0.5 +sigma = 0.01 +pi = 0 +beta = 0.0095 +delta = 0.0001 +kappa = 0.1 +gamma = 0.01 def Zombie(Y, t): - S = Y[0]; I = Y[1]; Z = Y[2]; R = Y[3]; Q = Y[4] - dSdt = pi - beta*S*Z - delta*S #Susceptible - dIdt = beta*S*Z - rho*I - delta*I - kappa*I; # Infected - dZdt = rho*I + zeta*R - alpha*S*Z - sigma*Z; #Zombie - dRdt = delta*S + delta*I + alpha*S*Z - zeta*R + gamma*Q; #Removed - dQdt = kappa*I + sigma*Z - gamma*Q; #Quarantined + S = Y[0] + I = Y[1] + Z = Y[2] + R = Y[3] + Q = Y[4] + dSdt = pi - beta*S*Z - delta*S # Susceptible + dIdt = beta*S*Z - rho*I - delta*I - kappa*I # Infected + dZdt = rho*I + zeta*R - alpha*S*Z - sigma*Z # Zombie + dRdt = delta*S + delta*I + alpha*S*Z - zeta*R + gamma*Q # Removed + dQdt = kappa*I + sigma*Z - gamma*Q # Quarantined return [dSdt, dIdt, dZdt, dRdt, dQdt] -S0=500; I0=0; Z0=0; R0=0; Q0=0 -StartTime=0; EndTime = 50 +S0=500 +I0=Z0=R0=Q0=0 +StartTime=0 +EndTime = 50 t = np.linspace(StartTime, EndTime, 100) -Y=odeint(Zombie, [S0,I0,Z0,R0,Q0], t) +Y = odeint(Zombie, [S0,I0,Z0,R0,Q0], t) plt.plot(t,Y[:,0], label='Susceptible') plt.plot(t,Y[:,1], label='Infected') @@ -457,8 +479,10 @@ plt.plot(t,Y[:,2], label='Zombie') plt.plot(t,Y[:,3], label='Removed') plt.plot(t,Y[:,4], label='Quarantined') -plt.legend();plt.title('When zombies attack') -plt.xlabel('time');plt.show() +plt.legend() +plt.title('When zombies attack') +plt.xlabel('time') +plt.show() ``` Possible improvements include the incorporation of delays, spatial @@ -473,16 +497,16 @@ heterogeneity, and more realistic nonlinear interaction terms. which may occur either at predetermined times or when a given condition on the values of the state variables or their derivatives is met. The following model describes a zombie outbreak in which there is -periodic culling (Spoiler alert. It looks like we all die - whatever we do. 
- If you think that's depressing then you might have to read this
- [followup paper](http://dx.doi.org/10.1080/23737867.2014.11414478)).
+periodic culling (Spoiler alert. It looks like we all die
+whatever we do.
+If you think that's depressing then you might have to read this
+[followup paper](http://dx.doi.org/10.1080/23737867.2014.11414478))
 of the zombie population:

$$
\begin{aligned}
\dfrac{\textrm{d} S}{\textrm{d} t} &= \Pi - \beta SZ - \delta S, & \; t &\ne t_{n} ,\\
-\dfrac{\textrm{d} Z}{\textrm{d} t} &= \beta SZ + \zeta R - \alpha S Z, & t &\ne t_{n},
+\dfrac{\textrm{d} Z}{\textrm{d} t} &= \beta SZ + \zeta R - \alpha S Z, & t &\ne t_{n},
\\
\dfrac{\textrm{d} R}{\textrm{d} t} &= \delta S + \alpha SZ -\zeta R, & t &\ne t_{n}, \\
\triangle Z &= -kZ, & t &= t_{n}.
\end{aligned}
@@ -503,53 +527,62 @@
and with parameter values

$$
\begin{aligned}
- \Pi = 0, \; \alpha = 0.005, \; \beta = 0.0095, \; \zeta = 0.1, \; \delta = 0.0001,
+ \Pi = 0, \; \alpha = 0.005, \; \beta = 0.0095, \; \zeta = 0.1, \; \delta = 0.0001,
 \; k = 0.25,
\end{aligned}
$$

and culling every 10 units of time (i.e. $t_1=10$, $t_2=20$, etc.). Hint: you
will need to loop over time intervals to solve this model.

-
:::solution

```python
+import matplotlib.pyplot as plt
+import numpy as np
+from scipy.integrate import odeint
# Function to solve the SZR Zombie ODE system with culling. 
-alpha = 0.005; zeta = 0.1; pi = 0 -beta = 0.0095; delta = 0.0001 +alpha = 0.005 +zeta = 0.1 +pi = 0 +beta = 0.0095 +delta = 0.0001 def SZR(Y,t): - S = Y[0]; Z = Y[1]; R = Y[2] - #Susceptible / Zombie / Removed + S = Y[0] + Z = Y[1] + R = Y[2] + # Susceptible / Zombie / Removed dSdt = pi - beta*S*Z - delta*S dZdt = beta*S*Z + zeta*R - alpha*S*Z dRdt = delta*S + alpha*S*Z - zeta*R return [dSdt, dZdt, dRdt] -S0=500; Z0=0; R0=0 +S0=500 +Z0=0 +R0=0 EndTime = 50 CullEffect = 0.30 CullInterval = 10.0 NumCull = round(EndTime/CullInterval) Y0 = [S0, Z0, R0] -for i in range (0, NumCull): - t = np.linspace(CullInterval*i, CullInterval*(i+1), 100) - Y = odeint(SZR, Y0, t) - # Get state at end - Y0 = Y[-1,:] - CullSize = CullEffect*Y0[1] - Y0[1] -= CullSize - Y0[2] += CullSize - plt.plot(t,Y[:,0], 'r', label='Susceptible') - plt.plot(t,Y[:,1], 'g', label='Zombie') - plt.plot(t,Y[:,2], 'b', label='Removed') -plt.legend();plt.title('Zombie culling at '+str(CullEffect*100)+'%') -plt.xlabel('time');plt.show() +for i in range(0, NumCull): + t = np.linspace(CullInterval*i, CullInterval*(i+1), 100) + Y = odeint(SZR, Y0, t) + # Get state at end + Y0 = Y[-1,:] + CullSize = CullEffect*Y0[1] + Y0[1] -= CullSize + Y0[2] += CullSize + plt.plot(t,Y[:,0], 'r', label='Susceptible') + plt.plot(t,Y[:,1], 'g', label='Zombie') + plt.plot(t,Y[:,2], 'b', label='Removed') +plt.legend() +plt.title('Zombie culling at '+str(CullEffect*100)+'%') +plt.xlabel('time') +plt.show() # Turn that graph on its side and you have a Christmas tree! 
``` ::: :::: - diff --git a/scientific_computing/ode_solvers/index.md b/scientific_computing/ode_solvers/index.md index 0c31cb63..d5b36c7c 100644 --- a/scientific_computing/ode_solvers/index.md +++ b/scientific_computing/ode_solvers/index.md @@ -1,22 +1,15 @@ --- id: ode_solvers name: Solving ODEs -dependsOn: [ - scientific_computing.essential_maths, -] -files: [ - 01-AM.md, - 02-PM.md, -] -attribution: -- citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +dependsOn: [scientific_computing.essential_maths] +files: [01-AM.md, 02-PM.md] +attribution: + - citation: This material has been adapted from material by Joe Pitt-Francis from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- diff --git a/scientific_computing/optimisation/01-nonlinear-optimisation.md b/scientific_computing/optimisation/01-nonlinear-optimisation.md index 55c5f274..947c31c8 100644 --- a/scientific_computing/optimisation/01-nonlinear-optimisation.md +++ b/scientific_computing/optimisation/01-nonlinear-optimisation.md @@ -1,31 +1,29 @@ --- name: Non-linear Optimisation -dependsOn: [ -] +dependsOn: [] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Mathematical formulation

-Optimisation aims to find the minimum (or equivilently the maximum) of some *objective*,
-or *loss* function $f$, given a set of $n$ parameters $\theta$
+Optimisation aims to find the minimum (or equivalently the maximum) of some _objective_,
+or _loss_ function $f$, given a set of $n$ parameters $\theta$

$$
\min_{\theta \in \mathcal{R}^n} f(\theta)
$$

-We might also have a set of *constraints*, for example a parameter might be required to
-be non-negative (e.g. a concentration or population number). These are often written as
+We might also have a set of _constraints_, for example a parameter might be required to
+be non-negative (e.g. a concentration or population number). These are often written as
 a set of equality $\mathcal{E}$ and inequality $\mathcal{I}$ constraints

$$
@@ -34,60 +32,58 @@
c_i(\theta) = 0, & i \in \mathcal{E} \\
c_i(\theta) \ge 0, & i \in \mathcal{I}
\end{cases}
$$

-Or these might simply be defined as *bounds* in parameter space that restrict the
+Or these might simply be defined as _bounds_ in parameter space that restrict the
 minimisation to a given domain $\Omega \in \mathcal{R}^n$

$$
\min_{\theta \in \Omega} f(\theta)
$$

+## Useful terms
-## Useful terms
-
-*Modelling* is the process of defining the objective function $f$, the parameters of
-interest $\theta$, and the constraints. The algorithms for performing the minimisation
-fall under the field of optimisation. Sub-fields of this are concerned with the
-minimisation of discrete function, often called *integer programming*. Confusingly, it
-is common to see the terms "optimisation" and "programming" used interchangeably, as the
-latter term was coined before the 1940s, and does not refer to computer software
+_Modelling_ is the process of defining the objective function $f$, the parameters of
+interest $\theta$, and the constraints. The algorithms for performing the minimisation
+fall under the field of optimisation. Sub-fields of this are concerned with the
+minimisation of discrete functions, often called _integer programming_. Confusingly, it
+is common to see the terms "optimisation" and "programming" used interchangeably, as the
+latter term was coined in the 1940s, and does not refer to computer software
 programming at all.

-If the function $f$ is linear, then there are specific algorithms for this class of
-problem that fall under the topic of *linear programming*, or *linear optimisation*. The
-more general problem of a non-linear $f$ is termed *non-linear programming*, or
-*non-linear optimisation*. If a set of equality and/or inequality constraints are needed
-then algorithms that deal with these fall under the topic of *constrained* optimisation.
-
-An important distinction when looking at optimisation problems is the notion of *global*
-versus *local* optimisation. The latter finds a point in parameter space $\theta_m$ that
-has a function value $f(\theta_m)$ greater than the surrounding points, but might not
-necessarily be the global minimum. These algorithms are often initialised to a point
-that is near to the minima of interest. The more general problem of global optimisation
-is significantly more difficult as it requires that the optimisation be robust to
-finding and rejecting such local minima. For a function that is *convex*, then local and
-global minimisation are the same, which is very advantagous since local minimisation
-algorithms are often both faster and often have more guarentees of convergence. The
-function $f$ is a convex function if its domain $\Omega$ is a convex set, and for any
+If the function $f$ is linear, then there are specific algorithms for this class of
+problem that fall under the topic of _linear programming_, or _linear optimisation_. The
+more general problem of a non-linear $f$ is termed _non-linear programming_, or
+_non-linear optimisation_. If a set of equality and/or inequality constraints are needed
+then algorithms that deal with these fall under the topic of _constrained_ optimisation.
+
+An important distinction when looking at optimisation problems is the notion of _global_
+versus _local_ optimisation. The latter finds a point in parameter space $\theta_m$ that
+has a function value $f(\theta_m)$ less than or equal to that of the surrounding points,
+but might not necessarily be the global minimum. These algorithms are often initialised
+to a point that is near to the minima of interest. The more general problem of global
+optimisation is significantly more difficult as it requires that the optimisation be
+robust to finding and rejecting such local minima. For a _convex_ function, local and
+global minimisation are the same, which is very advantageous since local minimisation
+algorithms are often faster and have stronger guarantees of convergence. The
+function $f$ is a convex function if its domain $\Omega$ is a convex set, and for any
 two points $\theta_x$ and $\theta_y$:

$$
-f(\alpha \theta_x + (1 - \alpha) \theta_y ) \le \alpha f(\theta_x) + (1 - \alpha)
+f(\alpha \theta_x + (1 - \alpha) \theta_y ) \le \alpha f(\theta_x) + (1 - \alpha)
 f(\theta_y), \text{ for all } \alpha \in [0, 1]. 
$$

-The term *convex programming* is used to describe the case of contrained optimisation
-where $f$ is convex, the equality constraints are linear and the inequality contraints
+The term _convex programming_ is used to describe the case of constrained optimisation
+where $f$ is convex, the equality constraints are linear and the inequality constraints
 are concave.

## Non-linear optimisation and Root-Finding

-Non-linear optimisation is closely related to finding the roots, or zeros, of a
-function. This can be seen easily by considering the fact that at each local minima or
-maxima of the function the value of the gradient of $f$ is zero, i.e. $\nabla f = 0$.
-Therefore finding a local minima or maxima of $f$ corresponds to finding the zeros of
+Non-linear optimisation is closely related to finding the roots, or zeros, of a
+function. This can be seen easily by considering the fact that at each local minimum or
+maximum of the function the value of the gradient of $f$ is zero, i.e. $\nabla f = 0$.
+Therefore finding a local minimum or maximum of $f$ corresponds to finding the zeros of
 the function $g = \nabla f$.

-
### Other reading

- Numerical optimization by Nocedal, Jorge; Wright, Stephen J., 1960-
@@ -96,10 +92,6 @@ the function $g = \nabla f$.
- Luenberger, Linear and Nonlinear Programming, 3/e, Springer
- Bertsekas, Nonlinear Programming, Athena
- Ruszczynski, Nonlinear Optimization, Princeton University Press
-- Broyden, C. G. (1972). "Quasi-Newton Methods". In Murray, W. (ed.). Numerical Methods
-  for Unconstrained Optimization. London: Academic Press. pp. 87–106. ISBN
+- Broyden, C. G. (1972). "Quasi-Newton Methods". In Murray, W. (ed.). Numerical Methods
+  for Unconstrained Optimization. London: Academic Press. pp. 87–106. ISBN
 0-12-512250-0. 
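The correspondence above — minimise $f$ by finding a zero of $g = \nabla f$ — can be sketched in a few lines. The one-dimensional objective and the hand-written Newton iteration below are illustrative assumptions, not part of the course code:

```python
# Minimise f by finding a zero of g = f' with the Newton iteration
#   x_{n+1} = x_n - g(x_n) / g'(x_n).
# f(x) = (x - 2)**2 + 1 is a made-up convex example, so g(x) = 2*(x - 2).

def newton_root(g, dg, x0, tol=1e-10, max_iter=50):
    """Return an approximate zero of g, starting from x0."""
    x = x0
    for _ in range(max_iter):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def f(x):
    return (x - 2) ** 2 + 1

def g(x):
    # gradient of f
    return 2 * (x - 2)

def dg(x):
    # second derivative of f
    return 2.0

x_min = newton_root(g, dg, x0=10.0)
print(x_min)  # the minimiser of f, x = 2
```

Because this $f$ is quadratic, the iteration lands on the minimiser in a single step; for a general smooth $f$ the same loop is exactly Newton's method applied to $\nabla f$.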
- - - - diff --git a/scientific_computing/optimisation/02-line-search-methods.md b/scientific_computing/optimisation/02-line-search-methods.md index 1a1670ab..9a5d8137 100644 --- a/scientific_computing/optimisation/02-line-search-methods.md +++ b/scientific_computing/optimisation/02-line-search-methods.md @@ -1,98 +1,93 @@ --- name: Line Search Methods -dependsOn: [ - scientific_computing.optimisation.01-nonlinear-optimisation, -] +dependsOn: [scientific_computing.optimisation.01-nonlinear-optimisation] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - - ### Gradient descent -One of the simplest local optimisation algoriths is *gradient descent*. 
It is
-initialised at some point in parameter space $a_0$, and at each iteration the function
-$f(x)$ is reduced by following the direction of *steepest descent* $-\nabla f(a)$
+One of the simplest local optimisation algorithms is _gradient descent_. It is
+initialised at some point in parameter space $a_0$, and at each iteration the function
+$f(x)$ is reduced by following the direction of _steepest descent_ $-\nabla f(a)$

$$
a_{n+1} = a_n - \gamma \nabla f(a_n)
$$

-This is an example of an important class of algorithms called the *line search* methods.
-These algorithms choose a *search direction* $p_k$ at each iteration $k$, and search
-along the 1D line from the initial point $a_k$ to a new point
+This is an example of an important class of algorithms called the _line search_ methods.
+These algorithms choose a _search direction_ $p_k$ at each iteration $k$, and search
+along the 1D line from the initial point $a_k$ to a new point

$$
a_{k+1} = a_k + \alpha p_k
-$$
+$$

-with a lower function value. The problem at each iteration becomes a one-dimensional
-optimisation problem along $p_k$ to find the optimal value of $\alpha$. Each line search
-algorithm is thus defined on how it chooses both the search direction $p_k$ and the
+with a lower function value. The problem at each iteration becomes a one-dimensional
+optimisation problem along $p_k$ to find the optimal value of $\alpha$. Each line search
+algorithm is thus defined by how it chooses both the search direction $p_k$ and the
 optimal $\alpha$.

-![](images/Gradient_descent.gif)
+![animation illustrating gradient descent](images/Gradient_descent.gif)

-### Plateaus with low gradient
+### Plateaus with low gradient

-An obvious downside to simple gradient descent can be seen for functions which have
-regions of zero or small gradients, or plateaus. Here a gradient descent algorithm with a
-constant $\gamma$ will proceed very slowly, if at all. 
This motivates another important
-line search algorithm, *Newtons method*.
+An obvious downside to simple gradient descent can be seen for functions which have
+regions of zero or small gradients, or plateaus. Here a gradient descent algorithm with a
+constant $\gamma$ will proceed very slowly, if at all. This motivates another important
+line search algorithm, _Newton's method_.

-The Newtons direction $p^N_k$ can be derived by considering the second-order Taylor
+The Newton direction $p^N_k$ can be derived by considering the second-order Taylor
 expansion of the function $f(x)$

$$
-f(a_k + p) \approx f(a_k) + p^T \nabla f(a_k) + \frac{1}{2} p^T \nabla^2 f(a_k) p =
+f(a_k + p) \approx f(a_k) + p^T \nabla f(a_k) + \frac{1}{2} p^T \nabla^2 f(a_k) p =
 m_k(p).
$$

-We find the value of $p$ that minimises $m_k(p)$ by setting the derivative of $m_k$ to
-zero, leading to
+We find the value of $p$ that minimises $m_k(p)$ by setting the derivative of $m_k$ to
+zero, leading to

$$
p_k^N = - (\nabla^2 f(a_k))^{-1} \nabla f(a_k)
$$

-Unlike the steepest descent, Newtons method has a natural step length $\alpha \approx
-1$, which is suitable for a wide variety of problems and can quickly cross areas of low
-gradient. Naturally, since the algorithm is based on a *second-order* approximation of
+Unlike the steepest descent, Newton's method has a natural step length $\alpha \approx
+1$, which is suitable for a wide variety of problems and can quickly cross areas of low
+gradient. Naturally, since the algorithm is based on a _second-order_ approximation of
 the function $f$, it works better if this approximation is reasonably accurate.

-Newtons method can be used as long as the inverse of the second derivative of the
-function $(\nabla^2 f(a_k))^{-1}$, exists (e.g. it will always exist for a positive
-definite $\nabla^2 f$). 
However, even when this inverse does exist it is possible that
-the direction $p^N_k$ does not satisfy the descent condition $f(a_k + \alpha p^N_k) <
-f(a_k)$ (or equivilently $\nabla f(a_k)^T p^N < 0$), so many modifications to Newtons
-methods, falling under a class of methods called *Quasi-Newton* methods, have been
-proposed to satisfy this descent condition.
-
-Quasi-Newton methods do not require the (often onerous) calculation of the hession
-$\nabla^2 f(x)$ like Newtons, instead they form an approximation to the hessian $B_k
-\approx \nabla^2 f(a_k)$ that is updated at each step using the information given by the
-gradient evaluations $\nabla f(a_k)$. Two popular methods of performing this update are
-the *symmetric-rank-one* (SR1), and the *Broyden, Fletcher, Goldfarb, and Shanno,
-(BFGS)* formula. Once the approximation $B_k$ is formed then the search direction is
+Newton's method can be used as long as the inverse of the second derivative of the
+function $(\nabla^2 f(a_k))^{-1}$, exists (e.g. it will always exist for a positive
+definite $\nabla^2 f$). However, even when this inverse does exist it is possible that
+the direction $p^N_k$ does not satisfy the descent condition $f(a_k + \alpha p^N_k) <
+f(a_k)$ (or equivalently $\nabla f(a_k)^T p^N < 0$), so many modifications to Newton's
+method, falling under a class of methods called _Quasi-Newton_ methods, have been
+proposed to satisfy this descent condition.
+
+Quasi-Newton methods do not require the (often onerous) calculation of the Hessian
+$\nabla^2 f(x)$ like Newton's method; instead they form an approximation to the Hessian $B_k
+\approx \nabla^2 f(a_k)$ that is updated at each step using the information given by the
+gradient evaluations $\nabla f(a_k)$. Two popular methods of performing this update are
+the _symmetric-rank-one_ (SR1), and the _Broyden, Fletcher, Goldfarb, and Shanno,
+(BFGS)_ formula. 
Once the approximation $B_k$ is formed then the search direction is
+calculated via

$$
p_k = -B_k^{-1} \nabla f(a_k)
$$

-For more details of other line search methods, please see Chapter 3 of the Nocedal and
-Wright textbook, or in the other textbooks listed at the end of this lesson. Finally, it
-should be noted that the *conjugate gradient* method can also be used for non-linear
+For more details of other line search methods, please see Chapter 3 of the Nocedal and
+Wright textbook, or in the other textbooks listed at the end of this lesson. Finally, it
+should be noted that the _conjugate gradient_ method can also be used for non-linear
 optimisation, where the search direction is given by

$$
@@ -101,37 +96,37 @@

### Step length

-In line search methods, choosing the step length $\alpha_k$ is a non-trivial task.
-Ideally we would want to chose $\alpha_k$ to minimise the function along the
+In line search methods, choosing the step length $\alpha_k$ is a non-trivial task.
+Ideally we would want to choose $\alpha_k$ to minimise the function along the
 one-dimensional search direction $p_k$. That is, we wish to minimise

$$
\phi(\alpha_k) = f(a_k + \alpha_k p_k),\text{ }\alpha_k > 0.
$$

-In general it is too expensive to do this minimisation exactly, so approximate methods
-are used so that multiple trial $\alpha_k$ values are trialled, stopping when a candidate
-is found that satisfies a set of *conditions*. There are two main conditions used, the
-*Wolfe conditions* and the *Goldstein* conditions.
+In general it is too expensive to do this minimisation exactly, so approximate methods
+are used in which multiple trial $\alpha_k$ values are tested, stopping when a candidate
+is found that satisfies a set of _conditions_. There are two main conditions used, the
+_Wolfe conditions_ and the _Goldstein_ conditions. 
-
-The two Wolfe conditions are the *sufficient decrease* condition, which ensures that the
-reduction in the function value is proportional to the step length $\alpha_k$ and the
-gradient in the direction of the step
+The two Wolfe conditions are the _sufficient decrease_ condition, which ensures that the
+reduction in the function value is proportional to the step length $\alpha_k$ and the
+gradient in the direction of the step

$$
f(a_k + \alpha_k p_k) \le f(a_k) + c_1 \alpha_k \nabla f(a_k)^T p_k.
$$

-The second Wolfe condition is the *curvature* condition, which prevents unacceptibly
-short steps by ensuring that the slope of $\phi$ is greater than some constant $c_2$
+The second Wolfe condition is the _curvature_ condition, which prevents unacceptably
+short steps by ensuring that the slope of $\phi$ is greater than some constant $c_2$
 times the initial slope $\phi'(0)$

$$
\nabla f(a_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f(a_k)^T p_k,
$$

-where $c_2 \in (c_1, 1)$. Typical values are $c_1 = 10^{-4}$ and $c_2 = 0.9$. The
-*strong Wolfe* conditions restrict the gradient $\phi'$ to be small, so as to exclude
+where $c_2 \in (c_1, 1)$. Typical values are $c_1 = 10^{-4}$ and $c_2 = 0.9$. The
+_strong Wolfe_ conditions restrict the gradient $\phi'$ to be small, so as to exclude
 points that are too far from stationary points of $\phi$

$$

@@ -142,26 +137,26 @@ $$

|\nabla f(a_k + \alpha_k p_k)^T p_k| \le c_2 |\nabla f(a_k)^T p_k|,

$$

-The Goldstein conditions are similar in spirit to the Wolfe conditions, and are formed
+The Goldstein conditions are similar in spirit to the Wolfe conditions, and are formed
 from the two inequalities

$$

-f(a_k) + (1 - c) \alpha_k \nabla f(a_k)^T p_k \le f(a_k + \alpha_k p_k) \le f(a_k) + c
+f(a_k) + (1 - c) \alpha_k \nabla f(a_k)^T p_k \le f(a_k + \alpha_k p_k) \le f(a_k) + c
\alpha_k \nabla f(a_k)^T p_k.
$$

-with $0 < c < 1/2$.
The first inequality prevents small step sizes while the second is
-the same sufficient decrease condition as in the Wolfe conditions. The Goldstein
-conditions are often used in Newton-type methods but for quasi-Newton methods the Wolfe
+with $0 < c < 1/2$. The first inequality prevents small step sizes while the second is
+the same sufficient decrease condition as in the Wolfe conditions. The Goldstein
+conditions are often used in Newton-type methods but for quasi-Newton methods the Wolfe
 conditions are preferred.

 The diagrams from the text by Nocedal and Wright illustrate the two conditions

-![](images/conditions.jpg)
+![Wolfe versus Goldstein conditions](images/conditions.jpg)

-Algorithms for choosing candidate step size values $\alpha_k$ can be complicated, so we
-will only mention here one of the simplest, which is the *backtracking* method. This
-approach implicitly satisfies the condition on too small $\alpha_k$, and only repeatedly
-test for the common sufficient decrease condition that appears in both the Wolfe and
+Algorithms for choosing candidate step size values $\alpha_k$ can be complicated, so we
+will only mention here one of the simplest, which is the _backtracking_ method. This
+approach implicitly satisfies the condition on too small $\alpha_k$, and only repeatedly
+tests for the common sufficient decrease condition that appears in both the Wolfe and
 Goldstein conditions.

### Backtracking algorithm

@@ -170,57 +165,54 @@ Choose $\bar{\alpha} > 0$, $\rho \in (0, 1)$, $c \in (0, 1)$

 $\alpha_k := \bar{\alpha}$

-**repeat** until $f(a_k + \alpha_k p_k) \le f(a_k) + c \alpha_k \nabla f(a_k)^T p_k$
+**repeat** until $f(a_k + \alpha_k p_k) \le f(a_k) + c \alpha_k \nabla f(a_k)^T p_k$

-> $\alpha_k := \rho \alpha_k$
-
-**end repeat**
+> $\alpha_k := \rho \alpha_k$
+
+**end repeat**

### Software

-- Scipy has a wide variety of (mostly) line search and trust region algorithms in
- [`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html). 
There +- Scipy has a wide variety of (mostly) line search and trust region algorithms in + [`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html). There are 14 local minimisers, so we won't list them all here! -- It is worth noting that Scipy includes the - [`line_search`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.line_search.html#scipy.optimize.line_search) - function, which allows you to use their line search satisfying the strong Wolfe +- It is worth noting that Scipy includes the + [`line_search`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.line_search.html#scipy.optimize.line_search) + function, which allows you to use their line search satisfying the strong Wolfe conditions with your own custom search direction. -- Scipy also includes a - [`HessianUpdateStrategy`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.HessianUpdateStrategy.html#scipy.optimize.HessianUpdateStrategy), - which provides an interface for specifying an approximate Hessian for use in - quasi-Newton methods, along with two implementations - [`BFGS`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.BFGS.html#scipy.optimize.BFGS) - and +- Scipy also includes a + [`HessianUpdateStrategy`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.HessianUpdateStrategy.html#scipy.optimize.HessianUpdateStrategy), + which provides an interface for specifying an approximate Hessian for use in + quasi-Newton methods, along with two implementations + [`BFGS`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.BFGS.html#scipy.optimize.BFGS) + and [`SR1`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.SR1.html#scipy.optimize.SR1). - -### Problems +### Problems ::::challenge{id=line-search-problems title="Line Search Problems" } -1. Program the steepest descent and Newton algorithms using the backtracking line - search. 
Use them to minimize the [Rosenbrock
- function](https://en.wikipedia.org/wiki/Rosenbrock_function). Set the initial step
- length $\alpha_0 = 1$ and print the step length used by each method at each
- iteration. First try the initial point $x_0 = (1.2, 1.2)^T$ and then the more
- difficult starting point $x_0 = (−1.2, 1)^T$.
-2. Plot the function surface using `matplotlib` and overlay the line search segments so
- you can visualise the progress of your algorithm. Observe the difference between
- the algorithms when the gradient of the rosenbrock function is low (i.e. at the
+1. Program the steepest descent and Newton algorithms using the backtracking line
+ search. Use them to minimise the [Rosenbrock
+ function](https://en.wikipedia.org/wiki/Rosenbrock_function). Set the initial step
+ length $\alpha_0 = 1$ and print the step length used by each method at each
+ iteration. First try the initial point $x_0 = (1.2, 1.2)^T$ and then the more
+ difficult starting point $x_0 = (−1.2, 1)^T$.
+2. Plot the function surface using `matplotlib` and overlay the line search segments so
+ you can visualise the progress of your algorithm. Observe the difference between
+ the algorithms when the gradient of the Rosenbrock function is low (i.e. at the
 bottom of the curved valley)
-3. Repeat (1) and (2) above using the line search implemented in Scipy
- [`scipy.optimize.line_search`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.line_search.html),
+3. Repeat (1) and (2) above using the line search implemented in Scipy
+ [`scipy.optimize.line_search`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.line_search.html),
 which uses the strong Wolfe conditions.
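Before attempting the exercise, the backtracking algorithm above can be sketched in a few lines. This is an illustrative sketch only: the quadratic test function and starting point are assumptions made here, and the Rosenbrock set-up in the exercise would replace them.

```python
import numpy as np


def backtracking(f, grad_f, a_k, p_k, alpha_bar=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until the sufficient decrease condition holds."""
    alpha = alpha_bar
    while f(a_k + alpha * p_k) > f(a_k) + c * alpha * grad_f(a_k) @ p_k:
        alpha *= rho  # alpha_k := rho * alpha_k
    return alpha


# Illustrative quadratic f(x) = x^T x, steepest descent direction from (1, 1)
f = lambda x: x @ x
grad_f = lambda x: 2 * x
a = np.array([1.0, 1.0])
p = -grad_f(a)

alpha = backtracking(f, grad_f, a, p)
print(alpha)  # 0.5 for this particular function and direction
```

Because the loop only ever shrinks $\alpha_k$ from the initial $\bar{\alpha}$, the method implicitly avoids overly small steps provided $\bar{\alpha}$ is not chosen too small.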
- :::solution ```python -import matplotlib.pylab as plt import numpy as np from matplotlib import cm import scipy.optimize +from matplotlib import pyplot as plt def steepest_descent(x, f, grad_f, hessian_f): return -grad_f(x) @@ -328,5 +320,6 @@ for x0 in [x01, x02]: ax2.legend() plt.show() ``` + ::: :::: diff --git a/scientific_computing/optimisation/03-trust-region-methods.md b/scientific_computing/optimisation/03-trust-region-methods.md index b4876bb9..d0bf4624 100644 --- a/scientific_computing/optimisation/03-trust-region-methods.md +++ b/scientific_computing/optimisation/03-trust-region-methods.md @@ -1,31 +1,28 @@ --- name: Trust Region Methods -dependsOn: [ - scientific_computing.optimisation.02-line-search-methods, -] +dependsOn: [scientific_computing.optimisation.02-line-search-methods] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ### Saddle points -Saddle point pose a particular challenge in non-linear optimisation, particularly in -higher dimensions. The plots below show two examples of saddle points in two dimensions. -Like local minima and maxima, these are stationary points where the gradient of the -function is zero $\nabla f = 0$, but where the value of the function rises along certain -directions and reduces along others (left plot). An alternative type of saddle point -arises when the hessian is singular, and are characterised by a plateau around the -stationary point, like the [monkey saddle](https://en.wikipedia.org/wiki/Monkey_saddle) -depicted in the plot to the right. +Saddle point pose a particular challenge in non-linear optimisation, particularly in +higher dimensions. The plots below show two examples of saddle points in two dimensions. +Like local minima and maxima, these are stationary points where the gradient of the +function is zero $\nabla f = 0$, but where the value of the function rises along certain +directions and reduces along others (left plot). An alternative type of saddle point +arises when the hessian is singular, and are characterised by a plateau around the +stationary point, like the [monkey saddle](https://en.wikipedia.org/wiki/Monkey_saddle) +depicted in the plot to the right. 
![Two examples of saddle points](images/saddle.svg) @@ -36,97 +33,87 @@ f(\mathbf{\theta}^\star + \delta \mathbf{\theta}) = f(\mathbf{\theta}^\star) + $$ where $\lambda_i$ is the $i$th eigenvalue of the Hessian, and $\nabla \mathbf{v}_i$ is the motion of $\delta \mathbf{\theta}$ along the $i$th eigenvector of -the Hessian. If the $i$th eigenvalue is negative/positive then along the $i$th +the Hessian. If the $i$th eigenvalue is negative/positive then along the $i$th eigenvector the function $f$ will achieve a maximum/minimum at $\mathbf{\theta}^\star$. -*Gradient descent* algorithms will move away or towards $\mathbf{\theta}^\star$ with a +_Gradient descent_ algorithms will move away or towards $\mathbf{\theta}^\star$ with a step given by $-\lambda_i \delta \mathbf{v}_i$. So for negative eigenvalues the -motion will be towards lower values of $f$ *away* from $\mathbf{\theta}^\star$. For -positive eigenvalues the motion will be towards lower values of $f$ *towards* +motion will be towards lower values of $f$ _away_ from $\mathbf{\theta}^\star$. For +positive eigenvalues the motion will be towards lower values of $f$ _towards_ $\mathbf{\theta}^\star$. The problem here is the size of the step, which is very small for small values of $\lambda_i$. -*Newton methods* rescale the step size by $\lambda_i$ so that it becomes +_Newton methods_ rescale the step size by $\lambda_i$ so that it becomes $-\delta \mathbf{v}_i$. For negative eigenvalues, this has the undesirable -characteristic that these methods move towards *increasing* values of $f$ (i.e. -towards the critical point) along corresponding eigenvectors. Since for -positive eigenvalues it is *also* moving towards the critical point, this means -that saddle points act as *attractors* for these types of methods. - -*Trust region methods* restate the optimisation problem as a sequence of -optimisations of a second order approximation to -$f$ in a local *trust-region* surrounding the current point $a_k$. 
The exact
-solution to each of these subproblems can be shown to be
+characteristic that these methods move towards _increasing_ values of $f$ (i.e.
+towards the critical point) along corresponding eigenvectors. Since for
+positive eigenvalues it is _also_ moving towards the critical point, this means
+that saddle points act as _attractors_ for these types of methods.
+
+_Trust region methods_ restate the optimisation problem as a sequence of
+optimisations of a second order approximation to
+$f$ in a local _trust-region_ surrounding the current point $a_k$. The exact
+solution to each of these subproblems can be shown to be
$-(\nabla^2 f(a_k) + \lambda_t I)^{-1} \nabla f(a_k)$, where the value of $\lambda_t$ is
related to the size of the trust region. In comparison with the previous methods above,
-this is equivilent to moving with a step given by
+this is equivalent to moving with a step given by
$-\frac{\lambda_i}{\lambda_i + \lambda_t}\delta \mathbf{v}_i$. As long as
$\lambda_t$ is chosen to be larger than the most negative
eigenvalue then the direction of each step is now always
-towards more negative values of $f$. As long as $\lambda_t$ is small
+towards more negative values of $f$. As long as $\lambda_t$ is small
 compared with $\lambda_i$ then we avoid the small step sizes associated with
-gradient descent.
+gradient descent.
+where $g_k = \nabla f(a_k)$, $B_k$ is an approximation to the hessian matrix $B_k +\approx \nabla^2 f(a_k)$ or the hessian itself $B_k = \nabla^2 f(a_k)$. Trust region +methods aim to find the $p$ that minimises $m_k$ in a local trust region $||p|| < +\Delta_k$ around the current point $a_k$, where $\Delta_k$ is the trust region radius. -Solving the minimisation given above is normally done approximately, with different -trust region methods varying how the approximation is achieved. Choosing the -trust-region radius is fundamental to this class of methods, and is done by comparing +Solving the minimisation given above is normally done approximately, with different +trust region methods varying how the approximation is achieved. Choosing the +trust-region radius is fundamental to this class of methods, and is done by comparing the actual to the predicted reduction in the function value $$ \rho_k = \frac{f(a_k) - f(a_k + p_k)}{m_k(0) - m_k(p_k)}. $$ -Since $m_k(0) - m_k(p_k)$ is always positive, if $\rho_k$ is negative then the actual -function value is increasing, the step is rejected and the trust region radius -$\Delta_k$ is decreased in order to improve the approximate model $m_k$. If $\rho_k$ is -positive but much smaller than one then we do not alter $\Delta_k$. If $\rho_k$ is close -to or greater than 1 we can be confident in our model and thus increase $\Delta_k$. The -general algorithm for a trust region method (reproduced from the text by Nocedal and +Since $m_k(0) - m_k(p_k)$ is always positive, if $\rho_k$ is negative then the actual +function value is increasing, the step is rejected and the trust region radius +$\Delta_k$ is decreased in order to improve the approximate model $m_k$. If $\rho_k$ is +positive but much smaller than one then we do not alter $\Delta_k$. If $\rho_k$ is close +to or greater than 1 we can be confident in our model and thus increase $\Delta_k$. 
The
+general algorithm for a trust region method (reproduced from the text by Nocedal and
 Wright cited below) is:

### Trust region algorithm

-Given $a_0$, $\hat{\Delta} > 0$, $\Delta_0 \in (0, \hat{\Delta})$, and $\nu \in [0,
+Given $a_0$, $\hat{\Delta} > 0$, $\Delta_0 \in (0, \hat{\Delta})$, and $\nu \in [0,
\frac{1}{4})$:

-**for** $k = 0, 1, 2, ...$
-> Obtain $p_k$ by (approximately) minimising $m_k(p)$ where $||p|| < \Delta_k$
-> $\rho_k := \frac{f(a_k) - f(a_k + p_k)}{m_k(0) - m_k(p_k)}$
-> **if** $\rho_k < \frac{1}{4}$
->> $\Delta\_{k+1} := \frac{1}{4} \Delta_k$
-
-> **else**
->> **if** $\rho_k > \frac{3}{4}$ and $||p_k|| = \Delta_k$
->>> $\Delta\_{k+1} := \min(2 \Delta_k, \hat{\Delta})$
-
->> **else**
->>> $\Delta\_{k+1} := \Delta_k$
-
-> **if** $\rho\_k > \nu$
->> $a\_{k+1} := a_k + p_k$
+**for** $k = 0, 1, 2, ...$

-> **else**
->> $a\_{k+1} := a_k$
+> Obtain $p_k$ by (approximately) minimising $m_k(p)$ where $||p|| < \Delta_k$
+>
+> $\rho_k := \frac{f(a_k) - f(a_k + p_k)}{m_k(0) - m_k(p_k)}$
+>
+> **if** $\rho_k < \frac{1}{4}$
+>
+> > $\Delta\_{k+1} := \frac{1}{4} \Delta_k$
+>
+> **else**
+>
+> > **if** $\rho_k > \frac{3}{4}$ and $||p_k|| = \Delta_k$
+> >
+> > > $\Delta\_{k+1} := \min(2 \Delta_k, \hat{\Delta})$
+> >
+> > **else**
+> >
+> > > $\Delta\_{k+1} := \Delta_k$
+>
+> **if** $\rho\_k > \nu$
+>
+> > $a\_{k+1} := a_k + p_k$
+>
+> **else**
+>
+> > $a\_{k+1} := a_k$

-**end for**
+**end for**

### Solving the trust region subproblem

-We will describe two algorithms for minimising $m_k(p)$, the *Cauchy point* and the
-*dogleg* methods. The Cauchy point first solves a linear version of $m_k$ defined as
+We will describe two algorithms for minimising $m_k(p)$, the _Cauchy point_ and the
+_dogleg_ methods.
The Cauchy point first solves a linear version of $m_k$ defined as

$$
p^s_k = \arg\min_{p \in \mathcal{R}^n} f(a_k) + g_k^T p \text{ for }||p|| \le \Delta_k
$$

@@ -146,7 +133,7 @@ $$
p_k^C = -\tau_k \frac{\Delta_k}{|| g_k ||} g_k,
$$

-where
+where

$$
\tau_k = \begin{cases}
@@ -155,26 +142,25 @@ $$
\end{cases}
$$

-The second method we describe is the *dogleg* method, which is applicable when $B_k$ is
-a positive definite matrix. If the original hessian is positive definite then this
-method is directly applicable, or one of the quasi-Newton positive definite
-approximation to the hessian could also be used. The dogleg method is derived by
-considering the path of the $p$ that minimises $m_k(p)$ with increasing $\Delta_k$,
-which forms a curved path in parameter space. The method approximates this path with two
-straight line segments. The first segment follows the steepest descent direction and is
+The second method we describe is the _dogleg_ method, which is applicable when $B_k$ is
+a positive definite matrix. If the original hessian is positive definite then this
+method is directly applicable, or one of the quasi-Newton positive definite
+approximations to the hessian could also be used. The dogleg method is derived by
+considering the path of the $p$ that minimises $m_k(p)$ with increasing $\Delta_k$,
+which forms a curved path in parameter space. The method approximates this path with two
+straight line segments. The first segment follows the steepest descent direction and is
 given by

$$
p_k^U = -\frac{g_k^T g_k}{g_k^T B_k g_k} g_k
$$

-The second step is along the path between $p_k^U$ and $p^B_k = -B_k^{-1} g_k$.
In the
+case where $p_k^B$ is _inside_ the trust region $||p_k^B|| \le \Delta_k$ then $p_k^B$
+can be used without modification. Otherwise the point of intersection with the
+trust-region radius must be calculated, which can be done by solving the following
 quadratic equation
-
$$
||p_k^U + (\tau - 1)(p_k^B - p_k^U)||^2 = \Delta_k^2
$$
@@ -188,21 +174,19 @@ p_k^U + (\tau - 1)(p_k^B - p_k^U), & 1 \le \tau \le 2.
\end{cases}
$$
-
### Problems

::::challenge{id="the-dog-leg" title="The Dog Leg"}

-1. Let $f(x) = 10 \left( x_2 − x^2_1 \right)^2 + (1 − x_1)^2$. At $x = (0,−1)$ draw the
- contour lines of the quadratic model
-
- $$
- m_k(p) = f(a_k) + g_k^T p + \frac{1}{2} p^T B_k p,
- $$
-
- assuming that $B\_k$ is the Hessian of $f$. Draw the family of solutions of $\min_{p
- \in \mathcal{R}^n}m_k(p)$ so that $||p|| \le \Delta_k$ as the trust region radius
- varies from $\Delta_k = 0$ to $\Delta_k = 2$. Repeat this at $x = (0, 0.5)$.
+Let $f(x) = 10 \left( x_2 − x^2_1 \right)^2 + (1 − x_1)^2$. At $x = (0,−1)$ draw the contour lines of the quadratic model
+
+$$
+m_k(p) = f(a_k) + g_k^T p + \frac{1}{2} p^T B_k p,
+$$
+
+assuming that $B\_k$ is the Hessian of $f$.
+Draw the family of solutions of $\min_{p\in \mathcal{R}^n}m_k(p)$ so that $||p|| \le \Delta_k$ as the trust region radius varies from $\Delta_k = 0$ to $\Delta_k = 2$.
+Repeat this at $x = (0, 0.5)$.

:::solution

@@ -255,19 +239,20 @@ for x0 in [np.array([0., -1.]), np.array([0., 0.5])]:

plt.show()
```
+
:::
::::

-
::::challenge{id="dogleg-method" title="Dogleg method"}

-2. Write a program that implements the dogleg method. Choose $B_k$ to be the exact
- Hessian. Apply it to minimise the function in (1) from the same two starting
- points. If you wish, experiment with the update rule for the trust region by
- changing the constants in the trust region algorithm given above, or by designing
- your own rules.
-
+Write a program that implements the dogleg method. Choose $B_k$ to be the exact
+Hessian.
Apply it to minimise the function in (1) from the same two starting +points. If you wish, experiment with the update rule for the trust region by +changing the constants in the trust region algorithm given above, or by designing +your own rules. + :::solution + ```python def line_sphere_intersect(o, u, r2): """ @@ -351,5 +336,6 @@ for x0 in [np.array([0., -1.]), np.array([0., 0.5])]: plt.show() ``` + ::: :::: diff --git a/scientific_computing/optimisation/05-finite-difference-method.md b/scientific_computing/optimisation/05-finite-difference-method.md index 7280df5c..6b467684 100644 --- a/scientific_computing/optimisation/05-finite-difference-method.md +++ b/scientific_computing/optimisation/05-finite-difference-method.md @@ -1,120 +1,121 @@ --- name: Derivative-free methods -dependsOn: [ - scientific_computing.optimisation.03-trust-region-methods, -] +dependsOn: [scientific_computing.optimisation.03-trust-region-methods] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk
+ image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+ license: CC-BY-4.0
+ - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+ url: https://www.universe-hpc.ac.uk
+ image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+ license: CC-BY-4.0
---

-The line search and trust region methods introduced in the previous lesson all required
-that the user be able to calculate the gradient of the function $\nabla f$. However, in
-many cases the gradient is either not available or too error-prone to be of use. For
-example, the function $f$ might be only available as a compiled executable or the result
-of a physical experiment. The model might be stochastic, or the gradient evaluation
-might be noisy due to numerical innacuracies, or of sufficiently complexity that the
-gradient is either unknown or too expensive to compute.
-
-Here we describe two of the most common methods for derivative-free optimisation, using
-a finite difference approximation to approximate the derivative, and the [Nelder-Mead
-algorithm](https://doi.org/10.1093/comjnl/7.4.308), which is a Simplex search method.
+The line search and trust region methods introduced in the previous lesson all required
+that the user be able to calculate the gradient of the function $\nabla f$. However, in
+many cases the gradient is either not available or too error-prone to be of use. For
+example, the function $f$ might only be available as a compiled executable or the result
+of a physical experiment. The model might be stochastic, or the gradient evaluation
+might be noisy due to numerical inaccuracies, or of sufficient complexity that the
+gradient is either unknown or too expensive to compute.
+
+Here we describe two of the most common methods for derivative-free optimisation, using
+a finite difference approximation to approximate the derivative, and the [Nelder-Mead
+algorithm](https://doi.org/10.1093/comjnl/7.4.308), which is a Simplex search method.
However, there are a large number of derivative-free methods, ranging from the classical
-[*Direct Search
-methods*](https://www.sciencedirect.com/science/article/pii/S0377042700004234) like
-*Pattern search*, *Simplex search*, *Rosenbrock'* or *Powell's* methods. Then there are
-emulator or model -based methods that build up a model of the function $f$ and minimise
-that using a gradient-based method, a powerful example of this class of methods is
-[Bayesian
-Optimisation](http://papers.nips.cc/paper/4522-practical-bayesian-optimization).
+[_Direct Search
+methods_](https://www.sciencedirect.com/science/article/pii/S0377042700004234) like
+_Pattern search_, _Simplex search_, _Rosenbrock's_ or _Powell's_ methods. Then there are
+emulator or model-based methods that build up a model of the function $f$ and minimise
+that using a gradient-based method; a powerful example of this class of methods is
+[Bayesian
+Optimisation](http://papers.nips.cc/paper/4522-practical-bayesian-optimization).
Many
+global optimisation algorithms are derivative-free, including _randomised algorithms_
+such as [Simulated Annealing](https://science.sciencemag.org/content/220/4598/671), and
+_evolution-based_ strategies such as the popular [Covariance matrix adaptation evolution
+strategy (CMA-ES)](https://arxiv.org/abs/1604.00772), or _swarm algorithms_ inspired
+from bees/ants like [Particle Swarm
 Optimisation](https://doi.org/10.1109/ICNN.1995.488968).

## Finite difference

-The simplest way of converting a gradient-based optimisation algorithm to a derivative
+The simplest way of converting a gradient-based optimisation algorithm to a derivative
 free one is to approximate the gradient of the function using finite differences.

-The Finite Difference (FD) method is based on taking a Taylor series expansion of either
-$f(x+h)$ and $f(x-h)$ (and others) for a small parameter $f$ about $x$. Consider a
+The Finite Difference (FD) method is based on taking a Taylor series expansion of either
+$f(x+h)$ and $f(x-h)$ (and others) for a small parameter $h$ about $x$. Consider a
 smooth function $f(x)$ then its Taylor expansion is

$$
-f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \frac{h^4}{24} f'''''(x) + \ldots
+f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \frac{h^4}{24} f''''(x) + \ldots
$$

$$
f(x-h) = f(x) - h f'(x) + \frac{h^2}{2} f''(x) - \frac{h^3}{6} f'''(x) + \frac{h^4}{24} f''''(x) - \ldots
$$

-From this, we can compute three different *schemes* (approximations) to $u'(x)$:
+From this, we can compute three different _schemes_ (approximations) to $f'(x)$:

**Forward difference**:
+
$$
f'(x) = \frac{f(x+h)-f(x)}{h} + O(h)
$$

**Backward difference**:
+
$$
f'(x) = \frac{f(x)-f(x-h)}{h} + O(h)
$$

**Centered difference**:
+
$$
f'(x) = \frac{f(x+h)-f(x-h)}{2h} + O(h^2)
$$

-Finite difference approximations are easily computed, but suffer from innacuracies which
-can cause optimisation algorithms to fail or perform poorely.
As well as the error in
-the FD approximation itself (e.g. $O(h^2)$ for centered difference), the function
-evaluation itself might have some numerical or stochastic "noise". If this noise
-dominates over the (small) step size $h$, then it is entirely probable that the
-calculated steepest descent $-\nabla f(x)$ will **not** be a direction of descent for
+Finite difference approximations are easily computed, but suffer from inaccuracies which
+can cause optimisation algorithms to fail or perform poorly. As well as the error in
+the FD approximation itself (e.g. $O(h^2)$ for centered difference), the function
+evaluation itself might have some numerical or stochastic "noise". If this noise
+dominates over the (small) step size $h$, then it is entirely probable that the
+calculated steepest descent $-\nabla f(x)$ will **not** be a direction of descent for
 $f$.

### Software

-It is very common that optimisation libraries provide a finite difference approximation
-to the Jacobian $\nabla f$ if it is not supplied, as is done for the gradient-based
+It is very common that optimisation libraries provide a finite difference approximation
+to the Jacobian $\nabla f$ if it is not supplied, as is done for the gradient-based
 methods in
[`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html).

-More dedicated libraries can give superior approximations to the gradient, like the
-[`numdifftools`](https://numdifftools.readthedocs.io/en/latest/index.html) package. This
-library provides higher order FD approximations and *Richardson extrapolation* to
-evaluate the limit of $h \rightarrow 0$, and can calculate Jacobians and Hessians of
-user-supplied functions.
+More dedicated libraries can give superior approximations to the gradient, like the
+[`numdifftools`](https://numdifftools.readthedocs.io/en/latest/index.html) package.
This +library provides higher order FD approximations and _Richardson extrapolation_ to +evaluate the limit of $h \rightarrow 0$, and can calculate Jacobians and Hessians of +user-supplied functions. ### Problems ::::challenge{id=comparing-methods title="Comparing optimisation methods"} -Use the following methods from -[`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html) to minimize -the 2D [Rosenbrock +Use the following methods from +[`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html) to minimize +the 2D [Rosenbrock function](https://en.wikipedia.org/wiki/Rosenbrock_function): - - Nelder-Mead Simplex - - Conjugate Gradient - - BFGS Quasi-Newton - - Newton-CG - - SHG Global Optimisation + +- Nelder-Mead Simplex +- Conjugate Gradient +- BFGS Quasi-Newton +- Newton-CG +- SHG Global Optimisation Either use $x_0 = (−1.2, 1)^T$ as the starting point, or experiment with your own. -In each case perform the optimisation with and without a user-supplied jacobian and -evaluate the effect on the number of evaluations of the function $f$ required to -converge to the optimum. Optional: You can take the derivatives by hand, or use -automatic differentiation via the [`autograd`](https://github.com/HIPS/autograd) or +In each case perform the optimisation with and without a user-supplied jacobian and +evaluate the effect on the number of evaluations of the function $f$ required to +converge to the optimum. 
Optional: You can take the derivatives by hand, or use +automatic differentiation via the [`autograd`](https://github.com/HIPS/autograd) or [`JAX`](https://github.com/google/jax) packages :::solution @@ -125,7 +126,7 @@ import matplotlib.pyplot as plt from scipy.optimize import minimize, shgo from autograd import grad -def convex_function(x): +def convex(x): return np.sum(np.array(x)**2, axis=0) def rosenbrock(x): @@ -178,11 +179,10 @@ def optimize(function, method, autodiff): if method == 'shgo': bounds = [(-10, 10), (-10.0, 10.0)] res = shgo(function, bounds, callback=fill_eval_points, - options={'disp': True}) + options={'disp': True}) else: res = minimize(function, x0, method=method, callback=fill_eval_points, - jac = jac, - options={'disp': True}) + jac=jac, options={'disp': True}) nx, ny = (100, 100) x = np.linspace(-5, 5, nx) @@ -216,15 +216,15 @@ def optimize(function, method, autodiff): if __name__ == '__main__': - for f in [convex_function, rosenbrock, rastrigin]: + for f in [convex, rosenbrock, rastrigin]: for m in ['shgo','nelder-mead', 'cg', 'bfgs', 'newton-cg']: for a in [False, True]: - if m == 'newton-cg' and a == False: + if m == 'newton-cg' and a is False: continue - if m == 'shgo' and a == True: + if m == 'shgo' and a is True: continue optimize(f, m, a) ``` + ::: :::: - diff --git a/scientific_computing/optimisation/06-nelder-mead.md b/scientific_computing/optimisation/06-nelder-mead.md index 5494dbb9..a5e83a34 100644 --- a/scientific_computing/optimisation/06-nelder-mead.md +++ b/scientific_computing/optimisation/06-nelder-mead.md @@ -1,70 +1,67 @@ --- name: Nelder-Mead method -dependsOn: [ - scientific_computing.optimisation.05-finite-difference-method, -] +dependsOn: [scientific_computing.optimisation.05-finite-difference-method] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- -The [Nelder-Mead method](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method) is -popular and implementations exist in many optimisation software libraries. It is based -on the idea of a simplex in parameter space of dimension $n$, which is formed from the -convex hull of $n + 1$ points in $\mathcal{R}^n$. These points $x_i$ are ordered +The [Nelder-Mead method](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method) is +popular and implementations exist in many optimisation software libraries. It is based +on the idea of a simplex in parameter space of dimension $n$, which is formed from the +convex hull of $n + 1$ points in $\mathcal{R}^n$. 
These points $x_i$ are ordered
 according to their function value so that

 $$
 f(x_1) \le f(x_2) \le \cdots \le f(x_{n+1})
 $$

-For each iteration of the algorithm, there are five different points of interest, the
+For each iteration of the algorithm, there are five different points of interest, the
 first of which is the centroid of the $n$ points with the lowest $f$ values

 $$
 \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i
 $$

-The other four points are defined by considering the line joining $\bar{x}$ and the
+The other four points are defined by considering the line joining $\bar{x}$ and the
 point with the highest $f$ value $x_{n+1}$

 $$
 \bar{x}(t) = \bar{x} + t(x_{n+1} - \bar{x})
 $$

-The four points are the *reflection*, *expanding*, the *inside contraction* and *outside
-contraction* points, given by $\bar{x}(-1)$, $\bar{x}(-2)$, $\bar{x}(1/2)$, and
+The four points are the _reflection_, _expansion_, _inside contraction_ and _outside
+contraction_ points, given by $\bar{x}(-1)$, $\bar{x}(-2)$, $\bar{x}(1/2)$, and
 $\bar{x}(-1/2)$ respectively.

-The Nelder-Mead algorithm tries to replace $x_{n+1}$ by reflecting, expanding, or
-contracting the simplex to one of these points. If it cannot find a better point, then
-all the vertices on the simplex are shrunk towards the best vertex $x_1$.
+The Nelder-Mead algorithm tries to replace $x_{n+1}$ by reflecting, expanding, or
+contracting the simplex to one of these points. If it cannot find a better point, then
+all the vertices on the simplex are shrunk towards the best vertex $x_1$. 
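The centroid and four trial points above can be sketched in a few lines of Python (a minimal illustration, not part of the course code; the function name `trial_points` is ours):

```python
import numpy as np

def trial_points(simplex):
    """Given a simplex ordered so that f(x_1) <= ... <= f(x_{n+1}),
    return the centroid and the four Nelder-Mead trial points."""
    simplex = np.asarray(simplex, dtype=float)
    x_worst = simplex[-1]                  # x_{n+1}, the highest-f vertex
    centroid = simplex[:-1].mean(axis=0)   # mean of the n best vertices

    def x_bar(t):
        # point on the line joining the centroid and the worst vertex
        return centroid + t * (x_worst - centroid)

    return {
        "centroid": centroid,
        "reflection": x_bar(-1),
        "expansion": x_bar(-2),
        "inside_contraction": x_bar(1 / 2),
        "outside_contraction": x_bar(-1 / 2),
    }

# a simplex in R^2: three vertices, already ordered by function value
points = trial_points([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Note that all four trial points lie on the same line through the centroid, differing only in the parameter $t$, which is why the algorithm can be written with a single helper like `x_bar`.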
-![Nelder–Mead simplex search over the Rosenbrock banana
-function](images/Nelder-Mead_Rosenbrock.gif)
+![Nelder–Mead simplex search over the Rosenbrock banana
+function](images/Nelder-Mead_Rosenbrock.gif)

 ## Algorithm

-[Scholarpedia](http://www.scholarpedia.org/article/Nelder-Mead_algorithm) and
-[Wikipedia](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method) provide diagrams
-and pseudocode of the Nelder-Mead algorithm, as does Chapter 9.5 of the Nocedal and
+[Scholarpedia](http://www.scholarpedia.org/article/Nelder-Mead_algorithm) and
+[Wikipedia](https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method) provide diagrams
+and pseudocode of the Nelder-Mead algorithm, as does Chapter 9.5 of the Nocedal and
 Wright textbook given below.

 ## Initialisation and termination

-It is not obvious how Nelder-Mead should be initialised, given a single starting point
-$x_0$ by the user. The (Gau 2012) paper provides an initialisation routine that was also
-used in MATLAB's fminsearch function. The $x_0$ point is used as one of the vertices,
-with the remaining $n$ vertices set to $x_0 + \tau_i e_i$, where $e_i$ is a unit vector
-in the $i^{th}$ coordinate and
+It is not obvious how Nelder-Mead should be initialised, given a single starting point
+$x_0$ by the user. The Gao and Han (2012) paper provides an initialisation routine that
+was also used in MATLAB's `fminsearch` function. The $x_0$ point is used as one of the
+vertices, with the remaining $n$ vertices set to $x_0 + \tau_i e_i$, where $e_i$ is a
+unit vector in the $i^{th}$ coordinate and

 $$
 \tau_i = \begin{cases}
@@ -73,19 +70,19 @@ $$
 \end{cases}
 $$

-For termination, Nelder and Mead recommended stopping the iteration when the standard
-deviation of the function evaluations reduces below a certain tolerance. MATLAB's
-fminsearch terminates when
-$\max\_{2 \le i \le n+1} |f\_i - f\_1| \le \text{TolFun}$ and $\max\_{2 \le i \le n+1}
-|| x\_i - x\_1 ||_\infty \le \text{TolX}$, or if the maximum number of iterations of
+For termination, Nelder and Mead recommended stopping the iteration when the standard
+deviation of the function evaluations reduces below a certain tolerance. MATLAB's
+`fminsearch` terminates when
+$\max\_{2 \le i \le n+1} |f\_i - f\_1| \le \text{TolFun}$ and $\max\_{2 \le i \le n+1}
+|| x\_i - x\_1 ||_\infty \le \text{TolX}$, or if the maximum number of iterations or
 function evaluations is reached.

 ## Other Reading

-- Nelder, John A.; R. Mead (1965). "A simplex method for function minimization".
+- Nelder, John A.; R. Mead (1965). "A simplex method for function minimization".
   Computer Journal. 7 (4): 308–313. doi:10.1093/comjnl/7.4.308.
-- [Gao, F., & Han, L. (2012). Implementing the Nelder-Mead simplex algorithm with
-  adaptive parameters. Computational Optimization and Applications, 51(1),
+- [Gao, F., & Han, L. (2012). Implementing the Nelder-Mead simplex algorithm with
+  adaptive parameters. Computational Optimization and Applications, 51(1),
   259-277.](http://www.webpages.uidaho.edu/~fuchang/res/ANMS.pdf)
- Numerical optimization by Nocedal, Jorge; Wright, Stephen J., 1960-, Chapter 9

@@ -93,14 +90,15 @@ function evaluations is reached.

 ::::challenge{id=nelder-mead title="Nelder-Mead algorithm"}

-Code up the Nelder-Mead algorithm and compare its performance against the steepest
-descent, Newton and dogleg algorithms you did in the last lesson. You can evaluate them
+Code up the Nelder-Mead algorithm and compare its performance against the steepest
+descent, Newton and dogleg algorithms you did in the last lesson. 
You can evaluate them on the 2D quadratic function $f(x, y) = x^2 + y^2$, the 2D [Rosenbrock -function](https://en.wikipedia.org/wiki/Rosenbrock_function) or on one of many different -[optimisation test +function](https://en.wikipedia.org/wiki/Rosenbrock_function) or on one of many different +[optimisation test functions](https://en.wikipedia.org/wiki/Test_functions_for_optimization) :::solution + ```python import numpy as np import matplotlib.pyplot as plt @@ -239,5 +237,6 @@ if __name__ == '__main__': ) plt.show() ``` + ::: :::: diff --git a/scientific_computing/optimisation/index.md b/scientific_computing/optimisation/index.md index ff4b0bfc..d0dc1b5f 100644 --- a/scientific_computing/optimisation/index.md +++ b/scientific_computing/optimisation/index.md @@ -1,29 +1,24 @@ --- name: Non-linear Optimisation id: non-linear-optimisation -dependsOn: [ - scientific_computing.essential_maths, -] -files: [ - 01-nonlinear-optimisation.md, - 02-line-search-methods.md, - 03-trust-region-methods.md, - 05-finite-difference-method.md, - 06-nelder-mead.md, -] +dependsOn: [scientific_computing.essential_maths] +files: + [ + 01-nonlinear-optimisation.md, + 02-line-search-methods.md, + 03-trust-region-methods.md, + 05-finite-difference-method.md, + 06-nelder-mead.md, + ] summary: | - This course introduces the concept of continuous integration and how to set it up for a Python project using GitHub Actions. -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk
- image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
- license: CC-BY-4.0
-- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
- url: https://www.universe-hpc.ac.uk
- image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
- license: CC-BY-4.0
-
+  This course introduces methods for solving non-linear optimisation problems, covering line search and trust region methods, finite difference approximation of derivatives, and the Nelder-Mead algorithm.
+attribution:
+  - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
 ---
-
-
-
diff --git a/scientific_computing/sparse_linear_algebra/01-sparse-matrices.md b/scientific_computing/sparse_linear_algebra/01-sparse-matrices.md
index 942fb463..03479439 100644
--- a/scientific_computing/sparse_linear_algebra/01-sparse-matrices.md
+++ b/scientific_computing/sparse_linear_algebra/01-sparse-matrices.md
@@ -1,61 +1,57 @@
 ---
 name: Sparse Matrices
-dependsOn: [
-]
+dependsOn: []
 tags: []
-attribution:
-- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Why sparse matrices -Taking advantage of any special structure in the matrix of interest is always of great -importance when designing a linear algebra algorithm/solver. Thus far we have discussed -special structures such as symmetric or positive definite matrices, but one of the most -common matrix structures in scientific computing is that of a *sparse matrix*, or a -matrix containing many zero elements. Since zeros can be ignored in many computations, -for example multiplication or addition, a sparse matrix will specify a special data -structure so that only the *non-zero* elements of the matrix are actually stored and -used in computation. 
Note also that a sparse matrix can itself be symmetric or positive -definite, and it is often necessary to take all these properties into account when +Taking advantage of any special structure in the matrix of interest is always of great +importance when designing a linear algebra algorithm/solver. Thus far we have discussed +special structures such as symmetric or positive definite matrices, but one of the most +common matrix structures in scientific computing is that of a _sparse matrix_, or a +matrix containing many zero elements. Since zeros can be ignored in many computations, +for example multiplication or addition, a sparse matrix will specify a special data +structure so that only the _non-zero_ elements of the matrix are actually stored and +used in computation. Note also that a sparse matrix can itself be symmetric or positive +definite, and it is often necessary to take all these properties into account when designing your algorithm. -The most obvious benefit of a sparse matrix is low memory usage. A dense matrix needs to -store all the elements of a matrix of size $n$, and thus the memory requirements scale -as $\mathcal{O}(n^2)$. For example, the following show the memory requirements of a -matrix of double precision numbers (taken from the excellent -[scipy-lectures](http://scipy-lectures.org/advanced/scipy_sparse/introduction.html#why-sparse-matrices) - +The most obvious benefit of a sparse matrix is low memory usage. A dense matrix needs to +store all the elements of a matrix of size $n$, and thus the memory requirements scale +as $\mathcal{O}(n^2)$. 
For example, the following shows the memory requirements of a
+matrix of double precision numbers (taken from the excellent
+[scipy-lectures](http://scipy-lectures.org/advanced/scipy_sparse/introduction.html#why-sparse-matrices)):

 ![sparse versus dense matrix memory usage](images/sparse_versus_dense.svg)

-A sparse matrix only stores non-zero elements, and in many different applications this
-represents a huge memory saving as matrices are often very sparse, holding only a few
+A sparse matrix only stores non-zero elements, and in many different applications this
+represents a huge memory saving as matrices are often very sparse, holding only a few
 non-zero elements. Some typical applications are:

-- solution of partial differential equations (PDEs), such as the finite difference
+- solution of partial differential equations (PDEs), such as the finite difference
   method illustrated below
-- applications of graphs or networks (e.g. electrical networks, website links), where
+- applications of graphs or networks (e.g. electrical networks, website links), where
   non-zero elements of the matrix represent edges between nodes
-
-Note that while a sparse matrix has obvious benefits in terms of matrix multiplication,
-where the zero elements can simply be ignored, direct solver algorithms such as $LU$
-decomposition for the problem $Ax = b$, where $A$ is sparse, need considerably more
-thought as the zeros in $A$ can have propagating effects, and there is no guarantee that
-the decomposition of a $A$ or its inverse will be itself sparse, there can be a
-significant amount of what is known as *fill-in* (i.e. non-zero elements where there
-were zeros in the original matrix). This fact motivates a separate class of *iterative*
-(as opposed to *direct*) solvers that only rely on the matrix multiplication of $A$ with
-a vector, ignoring the internal sparsity structure of $A$ and only taking advantage of
-the increased speed of the matrix multiplication itself.
+Note that while a sparse matrix has obvious benefits in terms of matrix multiplication,
+where the zero elements can simply be ignored, direct solver algorithms such as $LU$
+decomposition for the problem $Ax = b$, where $A$ is sparse, need considerably more
+thought as the zeros in $A$ can have propagating effects: there is no guarantee that
+the decomposition of $A$ or its inverse will itself be sparse, and there can be a
+significant amount of what is known as _fill-in_ (i.e. non-zero elements where there
+were zeros in the original matrix). This fact motivates a separate class of _iterative_
+(as opposed to _direct_) solvers that only rely on the matrix multiplication of $A$ with
+a vector, ignoring the internal sparsity structure of $A$ and only taking advantage of
+the increased speed of the matrix multiplication itself. 
These iterative solvers will be -covered in the following chapter, but in this chapter we will focus on the practical +Note that while a sparse matrix has obvious benefits in terms of matrix multiplication, +where the zero elements can simply be ignored, direct solver algorithms such as $LU$ +decomposition for the problem $Ax = b$, where $A$ is sparse, need considerably more +thought as the zeros in $A$ can have propagating effects, and there is no guarantee that +the decomposition of a $A$ or its inverse will be itself sparse, there can be a +significant amount of what is known as _fill-in_ (i.e. non-zero elements where there +were zeros in the original matrix). This fact motivates a separate class of _iterative_ +(as opposed to _direct_) solvers that only rely on the matrix multiplication of $A$ with +a vector, ignoring the internal sparsity structure of $A$ and only taking advantage of +the increased speed of the matrix multiplication itself. These iterative solvers will be +covered in the following chapter, but in this chapter we will focus on the practical requirements of constructing and using sparse matrices using the `scipy.sparse` library. - diff --git a/scientific_computing/sparse_linear_algebra/02-coo-matrix.md b/scientific_computing/sparse_linear_algebra/02-coo-matrix.md index 047852eb..15bae08e 100644 --- a/scientific_computing/sparse_linear_algebra/02-coo-matrix.md +++ b/scientific_computing/sparse_linear_algebra/02-coo-matrix.md @@ -1,24 +1,18 @@ --- name: COOrdinate format -dependsOn: [ - 'scientific_computing.sparse_linear_algebra.01-sparse-matrices', -] +dependsOn: ["scientific_computing.sparse_linear_algebra.01-sparse-matrices"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk
- image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
- license: CC-BY-4.0
-- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
- url: https://www.universe-hpc.ac.uk
- image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
- license: CC-BY-4.0
-
-
+attribution:
+  - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
 ---

-
-
 As an example of a sparse matrix format, this section describes one of the sparse
 formats implemented in Scipy: the COOrdinate format (COO). This is also known as
 the "ijv" or "triplet" format, and stores the non-zero elements in three arrays, `row`,
@@ -30,30 +24,18 @@ formats implemented in Scipy, the The COOrdinate format (COO). This is also know

 - fast matrix-vector multiplication
 - fast elementwise operations (e.g. multiply each element by 2 is just `data * 2`)

-However, slicing using this format is difficult.
+However, slicing using this format is difficult.

-Here are some examples of the COO matrix format using
-[`scipy.sparse.coo_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html). 
Again, these have been taken from
-[scipy-lectures](http://scipy-lectures.org/advanced/scipy_sparse/introduction.html#why-sparse-matrices),
-which is an excellent resource and contains examples of the other sparse matrix formats
+Here are some examples of the COO matrix format using
+[`scipy.sparse.coo_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html).
+Again, these have been taken from
+[scipy-lectures](http://scipy-lectures.org/advanced/scipy_sparse/introduction.html#why-sparse-matrices),
+which is an excellent resource and contains examples of the other sparse matrix formats
 implemented in Scipy.

-### create empty COO matrix:
+### create empty COO matrix

 ```python
+from scipy import sparse
+import numpy as np
 mtx = sparse.coo_matrix((3, 4), dtype=np.int8)
 mtx.todense()
 ```

 Output:
-```
+
+```text
 matrix([[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]], dtype=int8)
 ```

-### create using (data, ij) tuple:
+### create using (data, ij) tuple

 ```python
 row = np.array([0, 3, 1, 0])
@@ -66,7 +63,7 @@ mtx.todense()

 Output:

-```
+```text
 >>> mtx
 <4x4 sparse matrix of type '<class 'numpy.int64'>'
   with 4 stored elements in COOrdinate format>
 matrix([[4, 0, 9, 0],
         [0, 7, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 5]])
 ```

-### duplicates entries are summed together:
+### duplicate entries are summed together

 ```python
 row = np.array([0, 0, 1, 3, 1, 0, 0])
@@ -90,7 +86,7 @@ mtx.todense()

 Output:

-```
+```text
 >>> mtx.todense()
 matrix([[3, 0, 1, 0],
         [0, 2, 0, 0],
@@ -98,7 +94,7 @@ matrix([[3, 0, 1, 0],
         [0, 0, 0, 1]])
 ```

-### no slicing…:
+### no slicing…

 ```python
 mtx[2, 3]
 ```

 Output:

-```
+```text
 >>> mtx[2, 3]
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
diff --git a/scientific_computing/sparse_linear_algebra/03-finite-difference.md b/scientific_computing/sparse_linear_algebra/03-finite-difference.md
index 24616d23..114cc5c3 100644
--- a/scientific_computing/sparse_linear_algebra/03-finite-difference.md
+++ 
b/scientific_computing/sparse_linear_algebra/03-finite-difference.md @@ -1,27 +1,21 @@ --- name: Finite Difference Matrix -dependsOn: [ - 'scientific_computing.sparse_linear_algebra.02-coo-matrix', -] +dependsOn: ["scientific_computing.sparse_linear_algebra.02-coo-matrix"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - - -Many matrices in scientific computing contain mostly zeros, particularly those arising -from the discretisation of partial differential equations (PDEs). 
Here we will construct
-a sparse matrix using `scipy.sparse` that is derived from the finite difference
+Many matrices in scientific computing contain mostly zeros, particularly those arising
+from the discretisation of partial differential equations (PDEs). Here we will construct
+a sparse matrix using `scipy.sparse` that is derived from the finite difference
 discretisation of the Poisson equation. In 1D, the Poisson equation is

 $$
@@ -34,13 +28,13 @@ $$
 u_{xx} \approx \frac{u(x + h) - 2u(x) + u(x-h)}{h^2}
 $$

-We will discretise $u_{xx} = 0$ at $N$ regular points along $x$ from 0 to 1, given by
+We will discretise $u_{xx} = 0$ at $N$ regular points along $x$ from 0 to 1, given by
 $x_1, x_2, \ldots, x_N$:

    +----+----+----------+----+> x
    0   x_1  x_2    ...  x_N  1

-Using this set of point and the discretised equation, this gives a set of $N$ equations
+Using this set of points and the discretised equation, we obtain a set of $N$ equations,
 at each interior point on the domain:

$$
@@ -49,9 +43,9 @@ $$

 where $v_i \approx u(x_i)$.

-To solve these equations we will need additional equations at $x=0$ and $x=1$, known as
-the *boundary conditions*. 
For this example we will use $u(x) = g(x)$ at $x=0$ and $x=1$ +(also known as a non-homogenous Dirichlet bc), so that $v_0 = g(0)$, and $v\_{N+1} = g(1)$, and the equation at $x_1$ becomes: $$ @@ -77,7 +71,7 @@ $$ v_2 \\ \vdots \\ v_{N-1}\\ -v_{N} +v_{N} \end{bmatrix} = \begin{bmatrix} f(x_1) \\ f(x_2) \\ @@ -94,7 +88,6 @@ $$ The relevant sparse matrix here is $A$, given by - $$ A = \begin{bmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ @@ -103,10 +96,9 @@ A = \begin{bmatrix} -2 & 1 & & & \\ & & & 1 & -2 \end{bmatrix} $$ -As you can see, the number of non-zero elements grows linearly with the size $N$, so a +As you can see, the number of non-zero elements grows linearly with the size $N$, so a sparse matrix format is much preferred over a dense matrix holding all $N^2$ elements! - ## Additional Reading For more on the Finite Difference Method for solving PDEs: diff --git a/scientific_computing/sparse_linear_algebra/04-scipy-sparse.md b/scientific_computing/sparse_linear_algebra/04-scipy-sparse.md index f88c10ce..e2cd0b2d 100644 --- a/scientific_computing/sparse_linear_algebra/04-scipy-sparse.md +++ b/scientific_computing/sparse_linear_algebra/04-scipy-sparse.md @@ -1,20 +1,16 @@ --- name: Scipy.sparse and problems -dependsOn: [ - 'scientific_computing.sparse_linear_algebra.03-finite-difference', -] +dependsOn: ["scientific_computing.sparse_linear_algebra.03-finite-difference"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- There are seven available sparse matrix types in `scipy.sparse`: @@ -27,54 +23,53 @@ There are seven available sparse matrix types in `scipy.sparse`: - `coo_matrix`: COOrdinate format (aka IJV, triplet format) - `dia_matrix`: DIAgonal format -As indicated by the excellent -[documentation](https://docs.scipy.org/doc/scipy/reference/sparse.html), the -`dok_matrix` or `lil_matrix` formats are preferable to construct matrices as they +As indicated by the excellent +[documentation](https://docs.scipy.org/doc/scipy/reference/sparse.html), the +`dok_matrix` or `lil_matrix` formats are preferable to construct matrices as they support basic slicing and indexing similar to a standard NumPy array. 
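To illustrate the construct-then-convert pattern recommended above (a minimal sketch, not from the course text; the variable names are ours):

```python
import numpy as np
import scipy.sparse as sp

# build incrementally in LIL format, which supports NumPy-style indexing
mtx = sp.lil_matrix((4, 4))
mtx[0, 0] = 1.0
mtx[1, 2:4] = [2.0, 3.0]  # basic slicing works like a dense array

# ...then convert to CSR (or CSC) for fast arithmetic and solves
csr = mtx.tocsr()
print(csr.toarray())
```

Attempting the same element assignments on a `csr_matrix` directly works but emits a `SparseEfficiencyWarning`, which is why the documentation suggests constructing in LIL or DOK first.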
-You will notice that the FD matrix we have constructed for the Poisson problem is -composed entirely of diagonal elements, as is often the case. If you were constructing a -similar matrix in MATLAB, you would use the -[`spdiags`](https://uk.mathworks.com/help/matlab/ref/spdiags.html) function, and -`scipy.sparse` has its own -[equivalent](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.spdiags.html). -However, all the `scipy.sparse` formats also have special methods -[`setdiag`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.setdiag.html) +You will notice that the FD matrix we have constructed for the Poisson problem is +composed entirely of diagonal elements, as is often the case. If you were constructing a +similar matrix in MATLAB, you would use the +[`spdiags`](https://uk.mathworks.com/help/matlab/ref/spdiags.html) function, and +`scipy.sparse` has its own +[equivalent](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.spdiags.html). +However, all the `scipy.sparse` formats also have special methods +[`setdiag`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.setdiag.html) which provide a more object-orientated method of doing the same thing. 
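As a sketch of the `setdiag` route (assuming a small $N$ for display; the tridiagonal $A$ is the finite difference matrix for the Poisson problem constructed earlier):

```python
import numpy as np
import scipy.sparse as sp

N = 5
A = sp.lil_matrix((N, N))
A.setdiag(-2.0 * np.ones(N))      # main diagonal
A.setdiag(np.ones(N - 1), k=1)    # superdiagonal
A.setdiag(np.ones(N - 1), k=-1)   # subdiagonal
A = A.tocsr()                     # convert for fast arithmetic and solves
print(A.toarray())
```

Each `setdiag` call fills the $k$-th diagonal from the supplied values, so the whole tridiagonal structure is expressed in three lines rather than a loop over matrix entries.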
Scipy has a few different direct solvers for sparse matrices, given below:

-
-[`spsolve`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve.html#scipy.sparse.linalg.spsolve):
+
+[`spsolve`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve.html#scipy.sparse.linalg.spsolve):
 This solves $Ax=b$ where $A$ is converted into CSC or CSR form.

-
-[`spsolve_triangular`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve_triangular.html#scipy.sparse.linalg.spsolve_triangular):
+
+[`spsolve_triangular`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve_triangular.html#scipy.sparse.linalg.spsolve_triangular):
 Solves $Ax=b$, where $A$ is assumed to be triangular.

-
-[`factorized`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.factorized.html#scipy.sparse.linalg.factorized):
-This computes the $LU$ decomposition of the input matrix $A$, returning a Python
+[`factorized`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.factorized.html#scipy.sparse.linalg.factorized):
+This computes the $LU$ decomposition of the input matrix $A$, returning a Python
 function that can be called to solve $Ax = b$.

-[`splu`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.splu.html#scipy.sparse.linalg.splu):
-This computes the $LU$ decomposition of the input matrix $A$ using the popular SuperLU
+[`splu`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.splu.html#scipy.sparse.linalg.splu):
+This computes the $LU$ decomposition of the input matrix $A$ using the popular SuperLU
 library. 
It returns a Python object of class
-[`SuperLU`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.SuperLU.html#scipy.sparse.linalg.SuperLU),
+[`SuperLU`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.SuperLU.html#scipy.sparse.linalg.SuperLU),
that has a `solve` method you can use to solve $Ax = b$.

-Note, `scipy.sparse.linalg` also has many iterative solvers, which we will investigate
+Note that `scipy.sparse.linalg` also has many iterative solvers, which we will investigate
further in the next chapter.

### Problems

-Your goal for this problem is to construct the FD matrix $A$ given above, using
+Your goal for the problems below is to construct the FD matrix $A$ given above, using
`scipy.sparse`:

::::challenge{id=scipy-sparse-poisson title="Visualise Poisson Matrix"}

-1. Visualise the matrix $A$ using the Matplotlib
-   [`spy`](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.spy.html) plot
+Visualise the matrix $A$ using the Matplotlib [`spy`](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.spy.html) plot.

:::solution
+
```python
import scipy.sparse.linalg
import scipy.sparse as sp
@@ -88,17 +83,21 @@ A = sp.spdiags([e, -2*e, e], [-1, 0, 1], N, N, format='csc')
plt.spy(A)
plt.show()
```
+
:::
::::

::::challenge{id=scipy-sparse-poisson-solve title="Solve Poisson problem"}
-2. Solve the Poisson problem using:
-   - $f(x) = 2 \cos(x) / e^x$, with boundary conditions $g(0) = 0$ and $g(2 \pi)=0$. The
-     analytical solution is $u_{a}(x) = -\sin(x) / e^x$.
-   - $f(x) = 2 \sin(x) / e^x$, with boundary conditions $g(0) = 1$ and $g(2 \pi)=1 / e^{2
-     \pi}$. The analytical solution is $u_{a}(x) = \cos(x) / e^x$
+
+Solve the Poisson problem using:
+
+- $f(x) = 2 \cos(x) / e^x$, with boundary conditions $g(0) = 0$ and $g(2 \pi)=0$. The
+  analytical solution is $u_{a}(x) = -\sin(x) / e^x$.
+- $f(x) = 2 \sin(x) / e^x$, with boundary conditions $g(0) = 1$ and $g(2 \pi)=1 / e^{2
+  \pi}$.
The analytical solution is $u_{a}(x) = \cos(x) / e^x$.

:::solution
+
```python
L = 2*np.pi
x = np.linspace(0, L, N+2)
@@ -128,20 +127,21 @@ for f, a in zip([fcos, fsin], [analytical_cos, analytical_sin]):
plt.legend()
plt.show()
```
+
:::
::::

-
::::challenge{id=sparse-versus-dense-mult title="Sparse versus dense: matrix multiplication"}
-3. Vary the number of discretisation points $N$ and calculate $AA$ using both sparse and
-   dense matrices. For each $N$ calculate the time to calculate the matix
-   multiplicatiion using Python's
-   [`time.perf_counter`](https://docs.python.org/3/library/time.html#time.perf_counter),
-   and plot execution time versus $N$ for dense and sparse matrix multiplication.
-   Comment on how the time varies with $N$.
+Vary the number of discretisation points $N$ and calculate $AA$ using both sparse and
+dense matrices. For each $N$ measure the time taken for the matrix
+multiplication using Python's
+[`time.perf_counter`](https://docs.python.org/3/library/time.html#time.perf_counter),
+and plot execution time versus $N$ for dense and sparse matrix multiplication.
+Comment on how the time varies with $N$.

:::solution
+
```python
import time
@@ -174,18 +174,20 @@ plt.ylabel('time taken')
plt.legend()
plt.show()
```
+
:::
::::

::::challenge{id=sparse-versus-dense-solve title="Sparse versus dense: solving Poisson problem"}
-4. Vary the number of discretisation points $N$ and solve the Poisson problem with
-   varying $N$, and with using both the sparse and direct $LU$ solvers. For each $N$
-   record the time taken for both the dense and sparse solvers, and record the numerical
-   error $||\mathbf{v} - \mathbf{v}_a||_2$. Generate plots of both error and time versus
+Vary the number of discretisation points $N$ and solve the Poisson problem,
+   using both the sparse and dense $LU$ solvers.
For each $N$ + record the time taken for both the dense and sparse solvers, and record the numerical + error $||\mathbf{v} - \mathbf{v}_a||_2$. Generate plots of both error and time versus $N$, and comment on how they vary with $N$ :::solution + ```python times = np.empty(num, dtype=float) times_dense = np.empty(num, dtype=float) @@ -227,5 +229,6 @@ plt.ylabel('time taken') plt.legend() plt.show() ``` + ::: :::: diff --git a/scientific_computing/sparse_linear_algebra/06-jacobi-relaxation-methods.md b/scientific_computing/sparse_linear_algebra/06-jacobi-relaxation-methods.md index fb9689ea..e4e4247c 100644 --- a/scientific_computing/sparse_linear_algebra/06-jacobi-relaxation-methods.md +++ b/scientific_computing/sparse_linear_algebra/06-jacobi-relaxation-methods.md @@ -1,61 +1,56 @@ --- name: Jacobi and Relaxation Methods -dependsOn: [ - 'scientific_computing.sparse_linear_algebra.04-scipy-sparse', -] +dependsOn: ["scientific_computing.sparse_linear_algebra.04-scipy-sparse"] tags: [] -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Iterative Methods

-Previously we have discussed *direct* linear algebra solvers based on decompositions of
-the original matrix $A$. The amount of computational effort required to achieve these
-decomposisions is $\mathcal{O}(n^3)$, where $n$ is the number of rows of a square
-matrix. They are therefore unsuitable for the large, sparse systems of equations that
-are typically encountered in scientific applications. An alternate class of linear
-algebra solvers are the *iterative* methods, which produce a series of *approximate*
-solutions $x_k$ to the $A x = b$ problem. The performance of each algorithm is then
-based on how quickly, or how many iterations $k$ are required, for the solution $x_k$ to
+Previously we have discussed _direct_ linear algebra solvers based on decompositions of
+the original matrix $A$. The amount of computational effort required to achieve these
+decompositions is $\mathcal{O}(n^3)$, where $n$ is the number of rows of a square
+matrix. They are therefore unsuitable for the large, sparse systems of equations that
+are typically encountered in scientific applications. An alternative class of linear
+algebra solvers are the _iterative_ methods, which produce a series of _approximate_
+solutions $x_k$ to the $A x = b$ problem. The performance of each algorithm is then
+based on how quickly, or how many iterations $k$ are required, for the solution $x_k$ to
converge to within a set tolerance of the true solution $x$.
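To make this concrete, here is a small sketch (our own illustration, not part of the course code) that records how the residual norm $||\mathbf{b} - A\mathbf{x}_k||_2$ shrinks as an iterative solver runs, using `scipy.sparse.linalg.cg` on a small sparse system:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

# Small symmetric positive definite finite-difference-style matrix
N = 50
e = np.ones(N)
A = sp.spdiags([-e, 2 * e, -e], [-1, 0, 1], N, N, format="csr")
b = np.ones(N)

residual_norms = []  # ||b - A x_k|| recorded after each iteration

def record(xk):
    residual_norms.append(np.linalg.norm(b - A @ xk))

# info == 0 indicates the solver converged to within tolerance
x, info = scipy.sparse.linalg.cg(A, b, callback=record)

print(info, len(residual_norms), residual_norms[-1])
```

Plotting `residual_norms` against the iteration number is the usual way to compare how quickly different iterative methods converge.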
-

## Jacobi Method

-The Jacobi method is the simplest of the iterative methods, and relies on the fact that
-the matrix is *diagonally dominant*. Starting from the problem definition:
+The Jacobi method is the simplest of the iterative methods, and relies on the fact that
+the matrix is _diagonally dominant_. Starting from the problem definition:

$$
A\mathbf{x} = \mathbf{b}
$$

-we decompose $A$ in to $A = L + D + U$, where $L$ is lower triangular, $D$ is diagonal,
-$U$ is upper triangular.
+we decompose $A$ into $A = L + D + U$, where $L$ is lower triangular, $D$ is diagonal,
+and $U$ is upper triangular.

$$
A\mathbf{x} = L\mathbf{x} + D\mathbf{x} + U\mathbf{x} = \mathbf{b}
$$

-We then assume that we have an initial guess at the solution $\mathbf{x}^0$, and try to
-find a new estimate $\mathbf{x}^1$. Assuming that the diagonal $D$ dominates over $L$
-and $U$, a sensible choice would be to insert $x^0$ and the unknown $x^1$ into the
+We then assume that we have an initial guess at the solution $\mathbf{x}^0$, and try to
+find a new estimate $\mathbf{x}^1$. Assuming that the diagonal $D$ dominates over $L$
+and $U$, a sensible choice would be to insert $x^0$ and the unknown $x^1$ into the
equation like so:

$$
L\mathbf{x}^0 + D\mathbf{x}^1 + U\mathbf{x}^0 = \mathbf{b}
$$

-we can rearrange to get an equation for $x^1$. This is easily solved as we can take the
+we can rearrange to get an equation for $x^1$. This is easily solved as we can take the
inverse of the diagonal matrix by simply inverting each diagonal element individually:

$$
@@ -70,8 +65,8 @@ $$

## Relaxation methods

-The Jacobi method is an example of a relaxation method, where the matrix $A$ is split
-into a dominant part $M$ (which is easy to solve), and the remainder $N$. That is, $A =
+The Jacobi method is an example of a relaxation method, where the matrix $A$ is split
+into a dominant part $M$ (which is easy to solve), and the remainder $N$.
That is, $A =
M - N$

$$
@@ -82,59 +77,59 @@ $$
\mathbf{x}_{k+1} = M^{-1}N\mathbf{x}_k + M^{-1}\mathbf{b}
$$

-This can be rearranged in terms of the *residual* $\mathbf{r}_k = \mathbf{b} - A
+This can be rearranged in terms of the _residual_ $\mathbf{r}_k = \mathbf{b} - A
\mathbf{x}_k$ to the update equation

$$
\mathbf{x}_{k+1} = \mathbf{x}_{k} + M^{-1}\mathbf{r}_k
$$

-For the Jacobi method $M = D$ and $N = -(L + U)$. Other relaxation methods include
-Gauss-Seidel, where $M = (D + L)$ and $N = -U$, and successive over-relaxation (SOR),
-where $M = \frac{1}{\omega} D + L$ and $N = -(\frac{\omega - 1}{\omega} D + U)$, where
-$\omega$ is the *relaxation* parameter that is within the range $0 \le \omega \le 2$.
+For the Jacobi method $M = D$ and $N = -(L + U)$. Other relaxation methods include
+Gauss-Seidel, where $M = (D + L)$ and $N = -U$, and successive over-relaxation (SOR),
+where $M = \frac{1}{\omega} D + L$ and $N = -(\frac{\omega - 1}{\omega} D + U)$, where
+$\omega$ is the _relaxation_ parameter that is within the range $0 \le \omega \le 2$.

-For any relaxation method to converge we need $\rho(M^{-1}N) < 1$, where $\rho()$ is the
-*spectral radius* of $M^{-1} N$, which is defined as the largest eigenvalue $\lambda$ of
+For any relaxation method to converge we need $\rho(M^{-1}N) < 1$, where $\rho()$ is the
+_spectral radius_ of $M^{-1} N$, which is defined as the largest eigenvalue $\lambda$ of
a given matrix $G$:

$$
\rho(G) = \max\{|\lambda| : \lambda \in \lambda(G)\}
$$

-For the SOR method, the relaxation parameter $\omega$ is generally chosen to minimise
-$\rho(M^{-1}N)$, so that the speed of convergence is maximised. In some cases this
+For the SOR method, the relaxation parameter $\omega$ is generally chosen to minimise
+$\rho(M^{-1}N)$, so that the speed of convergence is maximised.
In some cases this +optimal $\omega$ is known, for example for finite difference discretisation of the [Poisson equation](https://www.sciencedirect.com/science/article/pii/S0893965908001523). -However, in many cases sophisticated eigenvalue analysis is required to determine the -optimal $\omega$. +However, in many cases sophisticated eigenvalue analysis is required to determine the +optimal $\omega$. ### Other Reading -- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University - Press, 1996). Chapter 10 -- Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., ... & Van - der Vorst, H. (1994). Templates for the solution of linear systems: building blocks +- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University + Press, 1996). Chapter 10 +- Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., ... & Van + der Vorst, H. (1994). Templates for the solution of linear systems: building blocks for iterative methods. Society for Industrial and Applied Mathematics. ### Problems - ::::challenge{id=2d-poisson-jacobi-relaxation title="2D Poisson Jacobi Relaxation"} -This exercise involves the manipulation and solution of the linear system resulting from -the finite difference solution to Poisson's equation in *two* dimensions. Let $A$ be a -sparse symmetric positive definite matrix of dimension $(N-1)^2 \times (N-1)^2$ created +This exercise involves the manipulation and solution of the linear system resulting from +the finite difference solution to Poisson's equation in _two_ dimensions. 
Let $A$ be a +sparse symmetric positive definite matrix of dimension $(N-1)^2 \times (N-1)^2$ created using `scipy.sparse` (for a given $N$) by the function `buildA` as follows: + ```python import numpy as np import scipy.sparse as sp def buildA(N): dx = 1 / N - nvar = (N - 1)**2; - e1 = np.ones((nvar), dtype=float); + nvar = (N - 1)**2 + e1 = np.ones((nvar), dtype=float) e2 = np.copy(e1) e2[::N-1] = 0 e3 = np.copy(e1) @@ -143,7 +138,7 @@ def buildA(N): (-e1, -e3, 4*e1, -e2, -e1), (-(N-1), -1, 0, 1, N-1), nvar, nvar ) - A = A / dx**2; + A = A / dx**2 return A ``` @@ -170,12 +165,12 @@ We will consider manipulation of the matrix $A$ and solution of the linear systems $A\mathbf{U}_i=\mathbf{f}_i$. The solution to this linear system corresponds to a finite difference solution to Poisson's equation $-\nabla^2 u = f$ on the unit square with zero Dirichlet boundary conditions where $f$ is -either $\sin(\pi x) \sin (\pi y)$ or $\max(x,1-x) \max(y,1-y)$. PDEs of this type occur +either $\sin(\pi x) \sin (\pi y)$ or $\max(x,1-x) \max(y,1-y)$. PDEs of this type occur (usually with some additional reaction and or convection terms) very frequently in mathematical modelling of physiological processes, and even in image -analysis. +analysis. -1. Write a function to solve a linear system using the Jacobi method. In +- Write a function to solve a linear system using the Jacobi method. In terms of $N$, how many iterations does it take to converge? (Try $N=4,8,16,32,64$.) @@ -225,18 +220,19 @@ plt.xlabel('N') plt.ylabel('iterations') plt.show() ``` + ::: :::: ::::challenge{id=2d-poisson-sor-relaxation title="2D Poisson SOR Relaxation"} -2. Write a function to solve a linear system using the SOR method. For - $N=64$ and right-hand-side $\mathbf{f}_2$ determine numerically the best - choice of the relaxation parameter to 2 decimal places and compare this - with theory. 
Hint, use
-   [`scipy.optimize.minimize_scalar`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize_scalar.html#scipy.optimize.minimize_scalar).
+- Write a function to solve a linear system using the SOR method.
+  For $N=64$ and right-hand-side $\mathbf{f}_2$ determine numerically the best choice of the relaxation parameter to 2 decimal places and compare this with theory.
+  Hint: use
+  [`scipy.optimize.minimize_scalar`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize_scalar.html#scipy.optimize.minimize_scalar).

:::solution
+
```python
def SOR(A, b, omega, x0=None, tol=1e-5, max_iter=300):
    if x0 is None:
@@ -270,5 +266,6 @@ def SOR_iterations(omega):
res = scipy.optimize.minimize_scalar(SOR_iterations, bracket=[0.1, 1.0, 1.99], tol=1e-2)
print('ideal omega is', res.x, 'versus analytic value of', 2 / (1 + np.sin(np.pi/N)))
```
+
:::
::::
diff --git a/scientific_computing/sparse_linear_algebra/07-conjugate-gradient-method.md b/scientific_computing/sparse_linear_algebra/07-conjugate-gradient-method.md
index 04df96cb..0bbef8b4 100644
--- a/scientific_computing/sparse_linear_algebra/07-conjugate-gradient-method.md
+++ b/scientific_computing/sparse_linear_algebra/07-conjugate-gradient-method.md
@@ -1,24 +1,21 @@
---
name: Krylov subspace methods and CG
-dependsOn: [
-  'scientific_computing.sparse_linear_algebra.06-jacobi-relaxation-methods'
-]
+dependsOn: ["scientific_computing.sparse_linear_algebra.06-jacobi-relaxation-methods"]
tags: []
-attribution:
-- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## The Krylov subspace -The set of basis vectors for the Krylov subspace $\mathcal{K}_k(A, b)$ are formed by +The set of basis vectors for the Krylov subspace $\mathcal{K}_k(A, b)$ are formed by repeated application of a matrix $A$ on a vector $b$ $$ @@ -27,44 +24,43 @@ $$ ## Krylov subspace methods -Krylov subspace methods are an important family of iterative algorithms for solving -$Ax=b$. Lets suppose that $A$ is an $n \times n$ invertible matrix, and our only -knowledge of $A$ is its matrix-vector product with an arbitrary vector $\mathbf{x}$. -Repeated application of $A$, $n$ times on an initial vector $b$ gives us a sequence of -vectors $\mathbf{b}, A \mathbf{b}, A^2 \mathbf{b}, ..., A^{n} \mathbf{b}$. 
Since we are
-in $n$-dimensional space, we are guaranteed that these $n+1$ vectors are linearly
+Krylov subspace methods are an important family of iterative algorithms for solving
+$Ax=b$. Let's suppose that $A$ is an $n \times n$ invertible matrix, and our only
+knowledge of $A$ is its matrix-vector product with an arbitrary vector $\mathbf{x}$.
+Repeated application of $A$, $n$ times on an initial vector $b$ gives us a sequence of
+vectors $\mathbf{b}, A \mathbf{b}, A^2 \mathbf{b}, ..., A^{n} \mathbf{b}$. Since we are
+in $n$-dimensional space, we are guaranteed that these $n+1$ vectors are linearly
dependent, and therefore

$$
a_0 \mathbf{b} + a_1 A \mathbf{b} + ... + a_n A^n \mathbf{b} = 0
$$

-for some set of coefficients $a_i$. Let now pick the smallest index $k$ such that $a_k
+for some set of coefficients $a_i$. Let us now pick the smallest index $k$ such that $a_k
\ne 0$, and multiply the above equation by $\frac{1}{a_k} A^{-k-1}$, giving

$$
A^{-1} b = -\frac{1}{a_k} ( a_{k+1} b + ... + a_n A^{n-k-1} b )
$$

-This implies that $A^{-1} b$ can be found only via repeated application of $A$, and also
+This implies that $A^{-1} b$ can be found only via repeated application of $A$, and also
motivates the search for solutions from the Krylov subspace.

-For each iteration $k$ of a Krylov subspace method, we choose the "best" linear
-combination of the Krylov basis vectors $\mathbf{b}, A\mathbf{b}, ... , A^{k−1}
-\mathbf{b}$ to form an improved solution $\mathbf{x}_k$. Different methods give various
+For each iteration $k$ of a Krylov subspace method, we choose the "best" linear
+combination of the Krylov basis vectors $\mathbf{b}, A\mathbf{b}, ... , A^{k-1}
+\mathbf{b}$ to form an improved solution $\mathbf{x}_k$. Different methods give various
definitions of "best", for example:

1. The residual $\mathbf{r}_k = \mathbf{b} - A\mathbf{x}_k$
-is orthogonal to $\mathcal{K}_k$ (Conjugate Gradients).
+   is orthogonal to $\mathcal{K}_k$ (Conjugate Gradients).
2.
The residual $\mathbf{r}_k$ has minimum norm for $\mathbf{x}_k$
-in $\mathcal{K}_k$ (GMRES and MINRES).
-3. $\mathbf{r}_k$ is orthogonal to a different space $\mathcal{K}_k(A^T)$ (BiConjugate
+   in $\mathcal{K}_k$ (GMRES and MINRES).
+3. $\mathbf{r}_k$ is orthogonal to a different space $\mathcal{K}_k(A^T)$ (BiConjugate
   Gradients).

## Conjugate Gradient Method

-
-Here we will give a brief summary of the CG method, for more details you can consult the
+Here we will give a brief summary of the CG method; for more details you can consult the
text by Golub and Van Loan (Chapter 10).

The CG method is based on minimising the function

$$
\phi(x) = \frac{1}{2}x^T A x - x^T b
$$

-If we set $x$ to the solution of $Ax =b$, that is $x = A^{-1} b$, then the value of
-$\phi(x)$ is at its minimum $\phi(A^{-1} b) = -b^T A^{-1} b / 2$, showing that solving
+If we set $x$ to the solution of $Ax = b$, that is $x = A^{-1} b$, then the value of
+$\phi(x)$ is at its minimum $\phi(A^{-1} b) = -b^T A^{-1} b / 2$, showing that solving
$Ax = b$ and minimising $\phi$ are equivalent.

-At each iteration $k$ of CG we are concerned with the *residual*, defined as $r_k = b -
-A x_k$. If the residual is nonzero, then at each step we wish to find a positive
-$\alpha$ such that $\phi(x_k + \alpha p_k) < \phi(x_k)$, where $p_k$ is the *search
-direction* at each $k$. For the classical steepest descent optimisation algorithm the
-search direction would be the residual $p_k = r_k$, however, steepest descent can suffer
-from convergence problems, so instead we aim to find a set of search directions $p_k$ so
-that $p_k^T r\_{k-1} \ne 0$ (i.e. at each step we are guaranteed to reduce $\phi$), and
-that the search directions are linearly independent. The latter condition guarantees
-that the method will converge in at most $n$ steps, where $n$ is the size of the square
+At each iteration $k$ of CG we are concerned with the _residual_, defined as $r_k = b -
+A x_k$.
If the residual is nonzero, then at each step we wish to find a positive +$\alpha$ such that $\phi(x_k + \alpha p_k) < \phi(x_k)$, where $p_k$ is the _search +direction_ at each $k$. For the classical steepest descent optimisation algorithm the +search direction would be the residual $p_k = r_k$, however, steepest descent can suffer +from convergence problems, so instead we aim to find a set of search directions $p_k$ so +that $p_k^T r\_{k-1} \ne 0$ (i.e. at each step we are guaranteed to reduce $\phi$), and +that the search directions are linearly independent. The latter condition guarantees +that the method will converge in at most $n$ steps, where $n$ is the size of the square matrix $A$. It can be shown that the best set of search directions can be achieved by setting @@ -98,187 +94,212 @@ p_k &= r_{k-1} + \beta_k p_{k-1} \\ \end{aligned} $$ -This ensures that the search direction $\mathbf{p}\_k$ is the closest vector to -$\mathbf{r}_{k-1}$ that is also *A-conjugate* to $\mathbf{p}\_1, ..., -\mathbf{p}\_{k-1}$, i.e. $p^T_i A p_j=0$ for all $i \ne j$, which gives the algorithm its -name. After $k$ iterations the sequence of residuals $\mathbf{r}_i$ for $i=1..k$ form a +This ensures that the search direction $\mathbf{p}\_k$ is the closest vector to +$\mathbf{r}_{k-1}$ that is also _A-conjugate_ to $\mathbf{p}\_1, ..., +\mathbf{p}\_{k-1}$, i.e. $p^T_i A p_j=0$ for all $i \ne j$, which gives the algorithm its +name. After $k$ iterations the sequence of residuals $\mathbf{r}_i$ for $i=1..k$ form a set of mutually orthogonal vectors that span the Krylov subspace $\mathcal{K}_k$. -Directly using the above equations in an iterative algorithm results in the standard CG -algorithm. A more efficient algorithm can be derived from this by computing the -residuals recursively via $r_k = r\_{k-1} - \alpha_k A p_k$, leading to the final -algorithm given below (reproduced from +Directly using the above equations in an iterative algorithm results in the standard CG +algorithm. 
A more efficient algorithm can be derived from this by computing the +residuals recursively via $r_k = r\_{k-1} - \alpha_k A p_k$, leading to the final +algorithm given below (reproduced from [Wikipedia](https://en.wikipedia.org/wiki/Conjugate_gradient_method)): ![Conjugate Gradient algorithm](images/cg_pseudocode.svg) ### Preconditioning -The CG method works well (i.e. converges quickly) if the *condition number* of the -matrix $A$ is low. The condition number of a matrix gives a measure of how much the -solution $x$ changes in response to a small change in the input $b$, and is a property -of the matrix $A$ itself, so can vary from problem to problem. In order to keep the -number of iterations small for iterative solvers, it is therefore often necessary to use -a *preconditioner*, which is a method of transforming what might be a difficult problem +The CG method works well (i.e. converges quickly) if the _condition number_ of the +matrix $A$ is low. The condition number of a matrix gives a measure of how much the +solution $x$ changes in response to a small change in the input $b$, and is a property +of the matrix $A$ itself, so can vary from problem to problem. In order to keep the +number of iterations small for iterative solvers, it is therefore often necessary to use +a _preconditioner_, which is a method of transforming what might be a difficult problem with a poorly conditioned $A$, into a well conditioned problem that is easy to solve. 
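As a rough numerical illustration of why conditioning matters (our own sketch — the diagonal scaling below is just a way to manufacture a badly conditioned SPD matrix), a simple Jacobi (diagonal) preconditioner passed to `scipy.sparse.linalg.cg` via its `M` argument can reduce the iteration count:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg

# SPD system, scaled so its condition number is poor
N = 100
e = np.ones(N)
T = sp.spdiags([-e, 2 * e, -e], [-1, 0, 1], N, N)
d = np.linspace(1, 10, N)
D = sp.diags(d)
A = (D @ T @ D).tocsr()  # congruence transform: still SPD, but badly scaled
b = np.ones(N)

def cg_iterations(M=None):
    count = [0]
    def cb(xk):
        count[0] += 1
    _, info = scipy.sparse.linalg.cg(A, b, M=M, callback=cb, maxiter=5000)
    return count[0], info

# Jacobi preconditioner: approximate A^{-1} by 1 / diag(A)
M_jacobi = sp.diags(1.0 / A.diagonal())

iters_plain, info_plain = cg_iterations()
iters_prec, info_prec = cg_iterations(M=M_jacobi)
print(iters_plain, iters_prec)
```

Here the diagonal preconditioner essentially undoes the bad scaling, so the preconditioned solve needs no more (and typically far fewer) iterations than the plain one.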
-Consider the case of preconditioning for the CG methods, we start from the standard
-problem $A x = b$, and we wish to solve an *equivalent* transformed problem given by
+Considering the case of preconditioning for the CG method, we start from the standard
+problem $A x = b$, and we wish to solve an _equivalent_ transformed problem given by

$$
\tilde{A} \tilde{x} = \tilde{b}
$$

-where $\tilde{A} = C^{-1} A C^{-1}$, $\tilde{x} = Cx$, $\tilde{b} = C^{-1} b $, and $C$
+where $\tilde{A} = C^{-1} A C^{-1}$, $\tilde{x} = Cx$, $\tilde{b} = C^{-1} b$, and $C$
is a symmetric positive definite matrix.

-We then simply apply the standard CG method as given above to this transformed problem.
-This leads to an algorithm which is then simplified by instead computing the transformed
-quantities $\tilde{p}_k = C p_k$, $\tilde{x}_k = C x_k$, and $\tilde{r}_k = C^{-1} r_k$.
-Finally we define a matrix $M = C^2$, which is known as the *preconditioner*, leading to
-the final preconditioned CG algorithm given below (reproduced and edited from
+We then simply apply the standard CG method as given above to this transformed problem.
+This leads to an algorithm which is then simplified by instead computing the transformed
+quantities $\tilde{p}_k = C p_k$, $\tilde{x}_k = C x_k$, and $\tilde{r}_k = C^{-1} r_k$.
+Finally we define a matrix $M = C^2$, which is known as the _preconditioner_, leading to
+the final preconditioned CG algorithm given below (reproduced and edited from
[Wikipedia](https://en.wikipedia.org/wiki/Conjugate_gradient_method)):

-
-$\mathbf{r}\_0 := \mathbf{b} - \mathbf{A x}\_0$
-
-$\mathbf{z}\_0 := \mathbf{M}^{-1} \mathbf{r}\_0$
-
-$\mathbf{p}\_0 := \mathbf{z}\_0$
-
-$k := 0 \, $
+> $\mathbf{r}\_0 := \mathbf{b} - \mathbf{A x}\_0$
+>
+> $\mathbf{z}\_0 := \mathbf{M}^{-1} \mathbf{r}\_0$
+>
+> $\mathbf{p}\_0 := \mathbf{z}\_0$
+>
+> $k := 0 \, $

**repeat until $|| \mathbf{r}_k ||_2 < \epsilon ||\mathbf{b}||_2$**

-> $\alpha\_k := \frac{\mathbf{r}\_k^T \mathbf{z}\_k}{ \mathbf{p}\_k^T
-\mathbf{A p}\_k }$
-
-> $\mathbf{x}\_{k+1} := \mathbf{x}\_k + \alpha\_k \mathbf{p}\_k$
-> $\mathbf{r}\_{k+1} := \mathbf{r}\_k - \alpha_k \mathbf{A p}\_k$
+> $\alpha\_k := \frac{\mathbf{r}\_k^T \mathbf{z}\_k}{ \mathbf{p}\_k^T \mathbf{A p}\_k }$
+>
+> $\mathbf{x}\_{k+1} := \mathbf{x}\_k + \alpha\_k \mathbf{p}\_k$
+>
+> $\mathbf{r}\_{k+1} := \mathbf{r}\_k - \alpha_k \mathbf{A p}\_k$
+>
+> **if** $r\_{k+1}$ is sufficiently small exit loop **end if**
+>
+> $\mathbf{z}\_{k+1} := \mathbf{M}^{-1} \mathbf{r}\_{k+1}$
+>
+> $\beta\_k := \frac{\mathbf{r}\_{k+1}^T \mathbf{z}\_{k+1}}{\mathbf{r}\_k^T \mathbf{z}\_k}$
+>
+> $\mathbf{p}\_{k+1} := \mathbf{z}\_{k+1} + \beta_k \mathbf{p}\_k$
+>
+> $k := k + 1 \, $

-> **if** $r\_{k+1}$ is sufficiently small then exit loop **end if**

+**end repeat**
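The pseudocode above translates almost line for line into Python. The following is a sketch of our own (plain NumPy, not production code), where `apply_Minv` stands in for applying $M^{-1}$ to a vector:

```python
import numpy as np

def preconditioned_cg(A, b, apply_Minv, x0=None, eps=1e-8, max_iter=1000):
    """Solve A x = b for SPD A; apply_Minv(v) should return M^{-1} v."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        # terminate when the residual is small relative to ||b||
        if np.linalg.norm(r) < eps * b_norm:
            break
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap          # residual updated recursively
        z_new = apply_Minv(r_new)
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

# Usage: tridiagonal SPD matrix with a Jacobi (diagonal) preconditioner
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = preconditioned_cg(A, b, apply_Minv=lambda v: v / np.diag(A))
print(np.linalg.norm(A @ x - b))
```

With `apply_Minv = lambda v: v` this reduces to the unpreconditioned CG algorithm.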
-> $\mathbf{z}\_{k+1} := \mathbf{M}^{-1} \mathbf{r}\_{k+1}$ - -> $\beta\_k := \frac{\mathbf{r}\_{k+1}^T \mathbf{z}\_{k+1}}{\mathbf{r}\_k^T}$ - -> $\mathbf{p}\_{k+1} := \mathbf{z}\_{k+1} + \beta_k \mathbf{p}\_k$ - -> $k := k + 1 \, $ - -**end repeat** - -The key point to note here is that the preconditioner is used by inverting $M$, so this -matrix must be "easy" to solve in some fashion, and also result in a transformed problem +The key point to note here is that the preconditioner is used by inverting $M$, so this +matrix must be "easy" to solve in some fashion, and also result in a transformed problem with better conditioning. -**Termination**: The CG algorithm is normally run until convergence to a given -tolerance which is based on the norm of the input vector $b$. In the algorithm above we -iterate until the residual norm is less than some fraction (set by the user) of the norm +**Termination**: The CG algorithm is normally run until convergence to a given +tolerance which is based on the norm of the input vector $b$. In the algorithm above we +iterate until the residual norm is less than some fraction (set by the user) of the norm of $b$. -What preconditioner to choose for a given problem is often highly problem-specific, but -some useful general purpose preconditioners exist, such as the *incomplete Cholesky -preconditioner* for preconditioned CG (see Chapter 10.3.2 of the Golub & Van Loan text -given below). Chapter 3 of the [Barrett et al. -text](https://www.netlib.org/templates/templates.pdf), also cited below, contains +What preconditioner to choose for a given problem is often highly problem-specific, but +some useful general purpose preconditioners exist, such as the _incomplete Cholesky +preconditioner_ for preconditioned CG (see Chapter 10.3.2 of the Golub & Van Loan text +given below). Chapter 3 of the [Barrett et al. 
+text](https://www.netlib.org/templates/templates.pdf), also cited below, contains
descriptions of a few more commonly used preconditioners.

### Other Reading

-- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
-  Press, 1996). Chapter 10
-- Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., ... & Van
-  der Vorst, H. (1994). Templates for the solution of linear systems: building blocks
+- Golub, G. H. & Van Loan, C. F. Matrix Computations, 3rd Ed. (Johns Hopkins University
+  Press, 1996). Chapter 10
+- Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., ... & Van
+  der Vorst, H. (1994). Templates for the solution of linear systems: building blocks
  for iterative methods. Society for Industrial and Applied Mathematics.

### Python: scipy.sparse.linalg

-Once again the best resource for Python is the [`scipi.sparse.linalg`
-documentation](https://docs.scipy.org/doc/scipy/reference/sparse.linalg.html). The
+Once again the best resource for Python is the [`scipy.sparse.linalg`
+documentation](https://docs.scipy.org/doc/scipy/reference/sparse.linalg.html). The
available iterative solvers in Scipy are:

-- [BIConjugate Gradient iteration
+- [BIConjugate Gradient iteration
  (BiCG)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.bicg.html#scipy.sparse.linalg.bicg)
-  - Applicable to non-symmetric problems. Requires the matrix-vector product of $A$
+  - Applicable to non-symmetric problems. Requires the matrix-vector product of $A$
    and its transpose $A^T$.
-- [Quasi-Minimal Residual iteration +- [Quasi-Minimal Residual iteration (QMR)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.qmr.html#scipy.sparse.linalg.qmr) - Applicable to non-symmetric $A$ - - Designed as an improvement of BiCG, avoids one of the two failure situations of + - Designed as an improvement of BiCG, avoids one of the two failure situations of BiCG - - Computational costs slightly higher than BiCG, still requires the transpose + - Computational costs slightly higher than BiCG, still requires the transpose $A^T$. -- [Conjugate Gradient Squared iteration +- [Conjugate Gradient Squared iteration (CGS)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.cgs.html#scipy.sparse.linalg.cgs) - Applicable to non-symmetric $A$ - - Often converges twice as fast as BiCG, but is often irregular and can diverge if + - Often converges twice as fast as BiCG, but is often irregular and can diverge if starting guess is close to solution. - Unlike BiCG, the two matrix-vector products cannot be parallelized. -- [BIConjugate Gradient STABilized iteration +- [BIConjugate Gradient STABilized iteration (BiCGSTAB)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.bicgstab.html#scipy.sparse.linalg.bicgstab) - Applicable to non-symmetric $A$ - Computational cost similar to BiCG, but does not require the transpose of $A$. - - Simliar convergence speed as CGS, but avoids the irregular convergence properties + - Similar convergence speed as CGS, but avoids the irregular convergence properties of this method -- [Conjugate Gradient iteration +- [Conjugate Gradient iteration (CG)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.cg.html#scipy.sparse.linalg.cg) + - Applicable only to symmetric positive definite $A$.
- Speed of convergence depends on the condition number -- [Generalized Minimal RESidual iteration +- [Generalized Minimal RESidual iteration (GMRES)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.gmres.html#scipy.sparse.linalg.gmres) - Applicable to non-symmetric $A$ - - Best convergence properties, but each additional iteration becomes increasingly + - Best convergence properties, but each additional iteration becomes increasingly expensive, with large storage costs. - - To limit the increasing cost with additional iterations, it is necessary to - periodically *restart* the method. When to do this is highly dependence on the + - To limit the increasing cost with additional iterations, it is necessary to + periodically _restart_ the method. When to do this is highly dependent on the properties of $A$ - Requires only matrix-vector products with $A$ -- - [LGMRES](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lgmres.html#scipy.sparse.linalg.lgmres) - - Modification to GMRES that uses alternating residual vectors to improve +- [LGMRES](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lgmres.html#scipy.sparse.linalg.lgmres) + - Modification to GMRES that uses alternating residual vectors to improve convergence. - - It is possible to supply the algorithm with "guess" vectors used to augment the - Krylov subspace, which is useful if you are solving several very similar + - It is possible to supply the algorithm with "guess" vectors used to augment the + Krylov subspace, which is useful if you are solving several very similar matrices one after another.
-- [MINimum RESidual iteration +- [MINimum RESidual iteration (MINRES)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.minres.html#scipy.sparse.linalg.minres) - Applicable to symmetric $A$ - [GCROT(m,k)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.gcrotmk.html#scipy.sparse.linalg.gcrotmk) -`scipy.sparse.linalg` also contains two iterative solvers for least-squares problems, -[`lsqr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsqr.html#scipy.sparse.linalg.lsqr) -and +`scipy.sparse.linalg` also contains two iterative solvers for least-squares problems, +[`lsqr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsqr.html#scipy.sparse.linalg.lsqr) +and [`lsmr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsmr.html#scipy.sparse.linalg.lsmr) ### Problems ::::challenge{id=comparing-solvers title="Comparing solvers"} -For this problem we are going to compare many of the different types of solvers, both +For this problem we are going to compare many of the different types of solvers, both direct and iterative, that we've looked at thus far. -Note: We will be using the Finite Difference matrix $A$ based on the two-dimensional -finite difference approximation of the Poisson problem that we developed in the previous +Note: We will be using the Finite Difference matrix $A$ based on the two-dimensional +finite difference approximation of the Poisson problem that we developed in the previous exercise. For $N=4,8,16,32,64,128$ try the following: -1. Solve the linear systems using $\mathbf{U}_i=A^{-1} \mathbf{f}_i$ (see - [`scipy.linalg.inv`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html) + +1. Solve the linear systems using $\mathbf{U}_i=A^{-1} \mathbf{f}_i$ (see + [`scipy.linalg.inv`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html)) and record the time this takes on a $\log$-$\log$ graph.
(Omit the case $N=128$ - and note this may take a while for $N=64$.) + and note this may take a while for $N=64$.) -2. Solve the linear systems using the $\text{LU}$ and Cholesky decompositions. Plot the +2. Solve the linear systems using the $\text{LU}$ and Cholesky decompositions. Plot the time this takes on the same graph. -3. Now solve the systems iteratively using a conjugate gradients solver (you can use the - one in `scipy.linalg.sparse`, or you can code up your own). How many iterations - are needed for each problem? Explain the results for the right-hand-side - $\mathbf{f}_1$. For the right-hand-side $\mathbf{f}_2$ what is the relationship +3. Now solve the systems iteratively using a conjugate gradients solver (you can use the + one in `scipy.sparse.linalg`, or you can code up your own). How many iterations + are needed for each problem? Explain the results for the right-hand-side + $\mathbf{f}_1$. For the right-hand-side $\mathbf{f}_2$ what is the relationship between the number of iterations and $N$? How long do the computations take? 4. Repeat using the `scipy.sparse.linalg` BICGSTAB and GMRES solvers.
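Before attempting the full comparison, a minimal, self-contained timing sketch for a single matrix may help structure a solution. The 1D Poisson matrix below is an illustrative stand-in of our own choosing (the exercise itself uses the 2D finite-difference matrix from the previous lesson), and the variable names are ours:

```python
import time

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Illustrative stand-in: 1D Poisson matrix (tridiagonal, symmetric positive definite)
N = 64
e = np.ones(N)
A = sp.spdiags([-e, 2 * e, -e], [-1, 0, 1], N, N).tocsr()
b = np.ones(N)

# Direct sparse solve, timed
t0 = time.perf_counter()
x_direct = spla.spsolve(A, b)
t_direct = time.perf_counter() - t0

# Conjugate gradient solve, timed, counting iterations via the callback
n_iter = 0

def count_iterations(xk):
    global n_iter
    n_iter += 1

t0 = time.perf_counter()
x_cg, info = spla.cg(A, b, callback=count_iterations)
t_cg = time.perf_counter() - t0

print(f"direct: {t_direct:.2e}s, cg: {t_cg:.2e}s in {n_iter} iterations (info={info})")
```

Wrapping this pattern in a loop over $N$ and over the solvers, and recording the times in an array, gives the data needed for the $\log$-$\log$ plots.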
:::solution + +We redefine our functions from the previous lesson: + ```python +import scipy.sparse as sp +import numpy as np +def buildf1(N): + x = np.arange(0, 1, 1/N).reshape(N, 1) + y = x.T + f = np.dot(np.sin(np.pi*x), np.sin(np.pi*y)) + return f[1:,1:].reshape(-1,1) +``` + +```python +def buildf2(N): + x = np.arange(0, 1, 1/N).reshape(N, 1) + y = x.T + f = np.dot(np.maximum(x,1-x), np.maximum(y,1-y)) + return f[1:,1:].reshape(-1, 1) +``` + +```python +def buildA(N): + dx = 1 / N + nvar = (N - 1)**2 + e1 = np.ones((nvar), dtype=float) + e2 = np.copy(e1) + e2[::N-1] = 0 + e3 = np.copy(e1) + e3[N-2::N-1] = 0 + A = sp.spdiags( + (-e1, -e3, 4*e1, -e2, -e1), + (-(N-1), -1, 0, 1, N-1), nvar, nvar + ) + A = A / dx**2 + return A +``` + +```python +import numpy as np +from matplotlib import pyplot as plt +import time +import scipy num = 20 times = np.empty((num, 2, 6), dtype=float) iterations = np.empty((num, 2, 3), dtype=int) @@ -370,5 +391,6 @@ A = buildA(10) f = buildf1(10) np.testing.assert_almost_equal(A@f, 19.57739348*f) ``` + ::: :::: diff --git a/scientific_computing/sparse_linear_algebra/index.md b/scientific_computing/sparse_linear_algebra/index.md index d9047435..82bfb9b3 100644 --- a/scientific_computing/sparse_linear_algebra/index.md +++ b/scientific_computing/sparse_linear_algebra/index.md @@ -1,39 +1,35 @@ --- id: sparse_linear_algebra name: Sparse Matrices and Iterative Solvers -dependsOn: [ - scientific_computing.linear_algebra, -] -files: [ - 01-sparse-matrices.md, - 02-coo-matrix.md, - 03-finite-difference.md, - 04-scipy-sparse.md, - 06-jacobi-relaxation-methods.md, - 07-conjugate-gradient-method.md, -] +dependsOn: [scientific_computing.linear_algebra] +files: + [ + 01-sparse-matrices.md, + 02-coo-matrix.md, + 03-finite-difference.md, + 04-scipy-sparse.md, + 06-jacobi-relaxation-methods.md, + 07-conjugate-gradient-method.md, + ] summary: | This course covers the basics of iterative methods for solving linear systems, and how to use sparse matrices to
represent and solve these systems efficiently. -attribution: -- citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: This material has been adapted from material by Martin Robinson from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- Many problems in science and engineering involve the solution of large systems -of linear equations that are *sparse*, i.e. most of the entries in the matrix -are zero. In this case, it is often more efficient to use *iterative* methods +of linear equations that are _sparse_, i.e. most of the entries in the matrix +are zero. In this case, it is often more efficient to use _iterative_ methods to solve the system of equations, rather than the direct methods covered in the previous course. 
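To make the storage argument concrete, here is a small illustrative sketch of our own (the matrix size and the tridiagonal example are assumptions, not part of the course material): a tridiagonal system stores roughly $3n$ entries instead of $n^2$, and an iterative solver only ever touches those non-zero entries.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
e = np.ones(n)

# Tridiagonal matrix: only 3n - 2 of the n*n entries are non-zero
A = sp.spdiags([-e, 2 * e, -e], [-1, 0, 1], n, n).tocsr()
print(f"dense storage: {n * n} entries, sparse storage: {A.nnz} entries")

# Iterative solve (conjugate gradient) using only matrix-vector products
b = np.ones(n)
x, info = spla.cg(A, b)
print("converged" if info == 0 else f"cg returned info = {info}")
```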
This course covers the basics of iterative methods for solving linear systems, and how to use sparse matrices to represent and solve these systems efficiently. - diff --git a/software_architecture_and_design/functional/higher_order_functions_cpp.md b/software_architecture_and_design/functional/higher_order_functions_cpp.md index ae206588..08081fbc 100644 --- a/software_architecture_and_design/functional/higher_order_functions_cpp.md +++ b/software_architecture_and_design/functional/higher_order_functions_cpp.md @@ -1,44 +1,40 @@ --- name: Higher Order Functions -dependsOn: [ - software_architecture_and_design.functional.side_effects_cpp, -] +dependsOn: [software_architecture_and_design.functional.side_effects_cpp] tags: [cpp] -attribution: - - citation: > - This material was adapted from an "Introduction to C++" course developed by the - Oxford RSE group. - url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: > + This material was adapted from an "Introduction to C++" course developed by the + Oxford RSE group. 
+ url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - ## First Class Functions Languages that treat functions as first-class citizens allow functions to be passed as arguments to other functions, returned from functions, or assigned to -variables. In C++ this is typically done via lamda functions or function objects. +variables. In C++ this is typically done via lambda functions or function objects. ### Lambda Functions -*Lambda functions* are small, nameless functions which are defined in the +_Lambda functions_ are small, nameless functions which are defined in the normal flow of the program, typically as they are needed. They consist of three parts, delimited by square, round, then curly brackets. The curly brackets form the -body of the function, for example +body of the function, for example -~~~cpp +```cpp ignore auto hello_world = []() { std::cout << "hello world" << std::endl; }; hello_world(); -~~~ +``` The `auto`{.Cpp} keyword allows the compiler to determine the correct type for the lambda, rather than you declaring it manually (impossible for lambda @@ -47,26 +43,26 @@ functions). You can call or execute a lambda as you would any other function. The round brackets contain the list of arguments to the function, and the square brackets **capture** variables from the outside scope, for example -~~~cpp +```cpp int i = 1; auto add_i_to_arg = [i](int arg) { return arg + i; }; std::cout << add_i_to_arg(3) << std::endl; // prints 4 -~~~ +``` This captures `i` by value.
To capture by reference use `&`: -~~~cpp +```cpp int i = 1; auto add_arg_to_i = [&i](int arg) { i += arg; }; add_arg_to_i(3); std::cout << i << std::endl; // prints 4 -~~~ +``` You can capture all variables used in the lambda function using either `[=]`, which captures everything by value, or `[&]`, which captures everything by reference: -~~~cpp +```cpp int i = 1; auto add_i_to_arg = [=](int arg) { return arg + i; }; std::cout << add_i_to_arg(3) << std::endl; // prints 4 @@ -74,13 +70,13 @@ std::cout << add_i_to_arg(3) << std::endl; // prints 4 auto add_arg_to_i = [&](int arg) { i += arg; }; add_arg_to_i(3); std::cout << i << std::endl; // prints 4 -~~~ +``` ### Function objects -A lambda function in C++ is syntactical suger for a function object, which is +A lambda function in C++ is syntactic sugar for a function object, which is simply a class with a round bracket operator defined. For example you could -define the last `add_i_to_arg` lamda manually as a function object +define the last `add_i_to_arg` lambda manually as a function object ```cpp class AddIToArg { @@ -102,7 +98,7 @@ int main() { ``` Under the hood, when you write your lambda the compiler simply creates and -compiles the equivilant function object for you. If you want more control of the +compiles the equivalent function object for you. If you want more control of the process, you can write the function object manually. ### Polymorphic function @@ -118,14 +114,14 @@ add = [](int i) { return i + 2; }; The two lambdas have different types, even though they are both functions that take a single `int` as an argument and return another `int`.
-``` -/home/mrobins/git/cpp_tmp/prodecural.cpp:14:35: error: no match for 'operator=' (operand types are 'main()::<lambda(int)>' and 'main()::<lambda(int)>') +```text +/home/mrobins/git/cpp_tmp/procedural.cpp:14:35: error: no match for 'operator=' (operand types are 'main()::<lambda(int)>' and 'main()::<lambda(int)>') 14 | add = [](int i) { return i + 2; }; | ^ -/home/mrobins/git/cpp_tmp/prodecural.cpp:13:15: note: candidate: 'main()::<lambda(int)>& main()::<lambda(int)>::operator=(const main()::<lambda(int)>&)' (deleted) +/home/mrobins/git/cpp_tmp/procedural.cpp:13:15: note: candidate: 'main()::<lambda(int)>& main()::<lambda(int)>::operator=(const main()::<lambda(int)>&)' (deleted) 13 | auto add = [](int i) { return i + 1; }; | ^ -/home/mrobins/git/cpp_tmp/prodecural.cpp:13:15: note: no known conversion for argument 1 from 'main()::<lambda(int)>' to 'const main()::<lambda(int)>&' +/home/mrobins/git/cpp_tmp/procedural.cpp:13:15: note: no known conversion for argument 1 from 'main()::<lambda(int)>' to 'const main()::<lambda(int)>&' ``` This causes problems if, for example, you want to store a collection of function @@ -147,16 +143,15 @@ for (const auto& op: ops) { std::cout << result << std::endl; // prints 6 ``` -`std::function` is an example of *type erasure*. - +`std::function` is an example of _type erasure_. ## Higher Order Functions One of the main uses of lambda functions is to create temporary functions to pass into higher order functions. A higher order function is simply a function -that has other functions as one of its arguments. +that has other functions as one of its arguments. -To illustrate the benifits of higher order functions, let us define two +To illustrate the benefits of higher order functions, let us define two functions, one that calculates the sum of a `std::vector<int>`, the other which calculates the maximum value of the same vector type.
@@ -192,17 +187,17 @@ int reduce(const std::vector<int>& data, std::function<int(int, int)> bin_op) { int main() { std::vector<int> data = {1, 2, 3, 4, -1}; - std::cout << reduce(data, std::plus<int>()) << std::endl; - std::cout << reduce(data, std::multiplies<int>()) << std::endl; - std::cout << reduce(data, [](int a, int b) { return std::max(a, b); }) << std::endl; - std::cout << reduce(data, [](int a, int b) { return std::min(a, b); }) << std::endl; + std::cout << reduce(data, std::plus<int>()) << std::endl; + std::cout << reduce(data, std::multiplies<int>()) << std::endl; + std::cout << reduce(data, [](int a, int b) { return std::max(a, b); }) << std::endl; + std::cout << reduce(data, [](int a, int b) { return std::min(a, b); }) << std::endl; } ``` Excellent! We have reduced the amount of code we need to write, reducing the number of possible bugs and making the code easier to maintain in the future. -C++ actually has a `std::reduce`, which is part of the *algorithms* standard library. +C++ actually has a `std::reduce`, which is part of the _algorithms_ standard library. ### The Algorithms Library @@ -213,8 +208,8 @@ recognising their conceptual similarities. Using the algorithms library means: (a) you reduce the amount of (algorithmic) code you need to write, reducing bugs and increasing maintainability (b) you make clear to the reader what your code is doing, since these are commonly used algorithms -(b) you benifit from bullet proof, efficient implementations written by the same teams that write the compiler you are using -(c) you can benifit from *executors* to instantly parallise or vectorise your code for high performance. +(c) you benefit from bulletproof, efficient implementations written by the same teams that write the compiler you are using +(d) you can benefit from _executors_ to instantly parallelise or vectorise your code for high performance.
Let's go through a few examples inspired by the common functional algorithms "map", "filter" and "reduce" (also the inspiration for the MapReduce @@ -226,18 +221,18 @@ First the map, or `std::transform`: std::vector<double> data = {1.0, 2.0, -1.1, 5.0}; // transform in-place -std::transform(std::begin(data), std::end(data), std::begin(data), +std::transform(std::begin(data), std::end(data), std::begin(data), [](const double& x) { return 2.0 * x; } ); std::vector<double> new_data(data.size()); // transform to a new collection -std::transform(std::begin(data), std::end(data), std::begin(new_data), +std::transform(std::begin(data), std::end(data), std::begin(new_data), [](const double& x) { return 3.14 * std::pow(x, 2); } ); ``` -Then the filter, or `std::copy_if`, which we will use to print out all the prime numbers to 1000. +Then the filter, or `std::copy_if`, which we will use to print out all the prime numbers to 1000. Here we also introduce two more useful tools: @@ -260,7 +255,7 @@ bool is_prime(int n) { } int main() { - std::vector<int> data(1000); + std::vector<int> data(1000); std::iota(data.begin(), data.end(), 1); // fill with numbers 1 -> 1000 std::copy_if(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "), @@ -277,9 +272,9 @@ self-contained `is_prime` function we can potentially reuse. Finally, the reduce, or `std::reduce`, which we will use to calculate the minimum and maximum elements of a vector. At the same time we introduce another algorithm `std::generate`, which assigns values to a range based on a generator function, and some -of the random number generation options in the standard library. +of the random number generation options in the standard library.
-~~~ cpp +```cpp #include #include #include @@ -296,7 +291,7 @@ int main() { std::normal_distribution<double> dist(5, 2); auto gen_random = [&]() { return dist(gen);}; - std::vector<double> data(1000); + std::vector<double> data(1000); std::generate(data.begin(), data.end(), gen_random); auto calc_min_max = [](std::tuple<double, double> acc, double x) { @@ -308,28 +303,26 @@ int main() { auto [min, max] = std::accumulate(data.begin(), data.end(), std::make_tuple(0., 0.), calc_min_max); std::cout << "min is "<< min << " max is "<< max << std::endl; } -~~~ - +``` ::::challenge{id=sum_squares title="Sum of Squares"} Use `std::accumulate` to write a function that calculates the sum of the squares of the values in a vector. Your function should behave as below: -~~~ cpp +```cpp std::cout << sum_of_squares({0}) << std::endl; std::cout << sum_of_squares({1, 3, -2}) << std::endl; -~~~ +``` -~~~ +```text 0 14 -~~~ - +``` :::solution -~~~cpp +```cpp #include #include #include @@ -339,51 +332,53 @@ int sum_of_squares(const std::vector<int>& data) { auto sum_squares = [](int sum, int x) { return sum + std::pow(x, 2); }; return std::accumulate(data.begin(), data.end(), 0, sum_squares); } -~~~ -::: +``` +::: Now let's assume we're reading in these numbers from an input file, so they arrive as a list of strings. Write a new function `map_str_to_int` using `std::transform` that passes the following tests: -~~~ cpp +```cpp std::cout << sum_of_squares(map_str_to_int({"1", "2", "3"})) << std::endl; std::cout << sum_of_squares(map_str_to_int({"-1", "-2", "-3"})) << std::endl; -~~~ +``` -~~~ +```text 14 14 -~~~ +``` :::solution -~~~cpp + +```cpp const std::vector<int> map_str_to_int(const std::vector<std::string>& data) { std::vector<int> new_data(data.size()); auto str_to_int = [](std::string x) { return std::atoi(x.c_str()); }; std::transform(data.begin(), data.end(), new_data.begin(), str_to_int); return new_data; } -~~~ +``` + ::: Finally, we'd like it to be possible for users to comment out numbers in the input file they give to our program.
Extend your `map_str_to_int` function so that the following tests pass: -~~~ cpp +```cpp std::cout << sum_of_squares(map_str_to_int({"1", "2", "3"})) << std::endl; std::cout << sum_of_squares(map_str_to_int({"1", "2", "#100", "3"})) << std::endl; -~~~ +``` -~~~ +```text 14 14 14 -~~~ +``` :::solution -~~~cpp +```cpp std::vector<int> map_str_to_int(const std::vector<std::string>& data) { std::vector<int> new_data; std::vector<std::string> filtered_data; @@ -395,11 +390,11 @@ std::vector<int> map_str_to_int(const std::vector<std::string>& data) { [](std::string x) { return std::atoi(x.c_str()); }); return new_data; } -~~~ +``` or -~~~cpp +```cpp std::vector<int> map_str_to_int(const std::vector<std::string>& data) { std::vector<int> new_data; new_data.reserve(data.size()); @@ -409,21 +404,22 @@ std::vector<int> map_str_to_int(const std::vector<std::string>& data) { } new_data.push_back(std::atoi(x.c_str())); } - return new_data; + return new_data; } -~~~ +``` Here you can start to see a limitation of the algorithms in the standard library, in that it is difficult to efficiently compose together multiple -elemental algorithms into more complex algorithm. The *ranges* library is an +elemental algorithms into more complex algorithms. The _ranges_ library is a C++20 addition to the standard library that aims to solve this problem; you can read more about the ranges library [here](https://en.cppreference.com/w/cpp/ranges). ::: :::: -## Key Points: +## Key Points + - Higher-order functions in C++: Functions that accept other functions as arguments or return them as results, enabling more modular, reusable, and expressive code. - Lambda expressions: Anonymous functions defined using lambda syntax, often utilized as arguments for higher-order functions, offering flexibility and conciseness. - Polymorphic functions: `std::function` allows functions to be passed around as objects with a common interface, enabling polymorphism and more flexible higher-order function usage. - Standard library algorithms: C++ includes a variety of higher-order functions (e.g.
`std::transform`, `std::for_each`, and `std::accumulate`) to perform common operations on data structures efficiently and with less boilerplate code. \ No newline at end of file +- Standard library algorithms: C++ includes a variety of higher-order functions (e.g. `std::transform`, `std::for_each`, and `std::accumulate`) to perform common operations on data structures efficiently and with less boilerplate code. diff --git a/software_architecture_and_design/functional/higher_order_functions_python.md b/software_architecture_and_design/functional/higher_order_functions_python.md index 0277929d..fdcece3c 100644 --- a/software_architecture_and_design/functional/higher_order_functions_python.md +++ b/software_architecture_and_design/functional/higher_order_functions_python.md @@ -1,19 +1,16 @@ --- name: Higher Order Functions -dependsOn: [ - software_architecture_and_design.functional.side_effects_python, -] +dependsOn: [software_architecture_and_design.functional.side_effects_python] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## First Class Functions @@ -24,7 +21,7 @@ variables. This is a powerful feature of functional programming languages, and i In Python, functions are first-class citizens, which means that they can be passed to other functions as arguments, for example: -~~~ python +```python def add_one(x): return x + 1 @@ -32,31 +29,30 @@ def apply_function(f, x): return f(x) print(apply_function(add_one, 1)) -~~~ +``` -~~~ +```text 2 -~~~ +``` ## Lambda Functions -*Lambda functions* are small, nameless functions which are defined in the +_Lambda functions_ are small, nameless functions which are defined in the normal flow of the program, typically as they are needed. The structure of these functions is not dissimilar to a normal python function definition - we have a -keyword `lambda`, a list of parameters, a colon, then the function body. In +keyword `lambda`, a list of parameters, a colon, then the function body. In Python, the function body is limited to a single expression, which becomes the return value. - -~~~ python -add_one = lambda x: x + 1 +```python +add_one = lambda x: x + 1 # NOQA E731 print(add_one(1)) -~~~ +``` -~~~ +```text 2 -~~~ +``` We have assigned the lambda function to a variable, so we can see it more clearly, but we'd normally use it immediately. Most style guides (we'll come back to these later in the course) consider it bad style to assign a lambda to a variable. 
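For instance, a lambda is usually written inline at the exact point a higher-order function needs it, such as the `key` argument to `sorted` (an illustrative snippet of our own; the word list is made up):

```python
words = ["banana", "Cherry", "apple"]

# The lambda is written inline, used once, and never given a name
print(sorted(words, key=lambda w: w.lower()))
```

```text
['apple', 'banana', 'Cherry']
```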
@@ -75,7 +71,7 @@ Due to their simplicity, it can be useful to have a lambda function as the inner One of the main uses of lambda functions is to create temporary functions to pass into higher order functions. A higher order function is simply a function -that has other functions as one of its arguments. +that has other functions as one of its arguments. To illustrate the benefits of higher order functions, let us define two functions, one that calculates the sum of a list of values, the other @@ -114,7 +110,7 @@ print(reduce(data, lambda a, b: max(a, b))) print(reduce(data, lambda a, b: min(a, b))) ``` -``` +```text 9 0 4 @@ -128,20 +124,20 @@ number of possible bugs and making the code easier to maintain in the future. Python has a number of higher order functions built in, including `map`, `filter` and `reduce`. Note that the `map` and `filter` functions in Python use -**lazy evaluation**. This means that values in an iterable collection are not -actually calculated until you need them. We'll explain some of the implications +**lazy evaluation**. This means that values in an iterable collection are not +actually calculated until you need them. We'll explain some of the implications of this a little later, but for now, we'll just use `list()` to convert the -results to a normal list. In these examples we also see the more typical usage +results to a normal list. In these examples we also see the more typical usage of lambda functions. The `map` function takes a function and applies it to each value in an -**iterable**. Here, 'iterable' means any object that can be iterated over - for +**iterable**. Here, 'iterable' means any object that can be iterated over - for more details see the [Iterable Abstract Base Class documentation](https://docs.python.org/3/library/collections.abc.html#collections.abc.Iterable). The results of each of those applications become the values in the **iterable** that is returned.
-~~~python +```python l = [1, 2, 3] def add_one(x): @@ -150,16 +146,16 @@ def add_one(x): # Returns a map object so need to cast to list print(list(map(add_one, l))) print(list(map(lambda x: x + 1, l))) -~~~ +``` -~~~ +```text [2, 3, 4] [2, 3, 4] -~~~ +``` Like `map`, `filter` takes a function and applies it to each value in an iterable, keeping the value if the result of the function application is `True`. -~~~ python +```python l = [1, 2, 3] def is_gt_one(x): @@ -168,18 +164,18 @@ def is_gt_one(x): # Returns a filter object so need to cast to list print(list(filter(is_gt_one, l))) print(list(filter(lambda x: x > 1, l))) -~~~ +``` -~~~ +```text [2, 3] [2, 3] -~~~ +``` The `reduce` function is different. This function uses a function which accepts two values to accumulate the values in the iterable. The simplest uses here are to calculate the sum or product of a sequence. -~~~ python +```python from functools import reduce l = [1, 2, 3] @@ -189,94 +185,96 @@ def add(a, b): print(reduce(add, l)) print(reduce(lambda a, b: a + b, l)) -~~~ +``` -~~~ +```text 6 6 -~~~ +``` These are the fundamental components of the MapReduce style, and can be combined to perform much more complex data processing operations. - ::::challenge{id=sum_squares title="Sum of Squares"} Using `map` and `reduce`, write a function that calculates the sum of the squares of the values in a list. Your function should behave as below: -~~~ python +```python def sum_of_squares(l): # Your code here + return print(sum_of_squares([0])) print(sum_of_squares([1])) print(sum_of_squares([1, 2, 3])) print(sum_of_squares([-1])) print(sum_of_squares([-1, -2, -3])) -~~~ +``` -~~~ +```text 0 1 14 1 14 -~~~ +``` :::solution -~~~ python +```python from functools import reduce def sum_of_squares(l): squares = map(lambda x: x * x, l) return reduce(lambda a, b: a + b, squares) -~~~ +``` + ::: Now let's assume we're reading in these numbers from an input file, so they arrive as a list of strings.
Modify your function so that it passes the following tests: -~~~ python +```python print(sum_of_squares(['1', '2', '3'])) print(sum_of_squares(['-1', '-2', '-3'])) -~~~ +``` -~~~ +```text 14 14 -~~~ +``` :::solution -~~~ python + +```python from functools import reduce def sum_of_squares(l): integers = map(int, l) squares = map(lambda x: x * x, integers) return reduce(lambda a, b: a + b, squares) -~~~ +``` ::: Finally, like comments in Python, we'd like it to be possible for users to comment out numbers in the input file they give to our program. Extend your function so that the following tests pass (don't worry about passing the first set of tests with lists of integers): -~~~ python +```python print(sum_of_squares(['1', '2', '3'])) print(sum_of_squares(['-1', '-2', '-3'])) print(sum_of_squares(['1', '2', '#100', '3'])) -~~~ +``` -~~~ +```text 14 14 14 -~~~ +``` :::solution -~~~ python +```python from functools import reduce def sum_of_squares(l): @@ -284,7 +282,8 @@ def sum_of_squares(l): integers = map(int, not_comments) squares = map(lambda x: x * x, integers) return reduce(lambda a, b: a + b, squares) -~~~ +``` + ::: :::: @@ -303,9 +302,9 @@ Is this as much as you would expect for the number of cores your CPU has? **Hint:** To time the execution of a Python script we can use the Linux program `time`: -~~~bash +```bash time python3 my_script.py -~~~ +``` Would we get the same benefits from parallel equivalents of the `filter` and `reduce` functions? Why, or why not? @@ -323,36 +322,36 @@ with, rather than always getting back a `map` or `filter` iterable. ### List Comprehensions The **list comprehension** is probably the most commonly used comprehension -type. As you might expect from the name, list comprehensions produce a list -from some other iterable type. In effect they are the same as using `map` +type. As you might expect from the name, list comprehensions produce a list +from some other iterable type. 
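A comprehension typically does the same job as `map`/`filter` plus `list()`; a quick sketch comparing the two spellings (using the same doubling example this section uses):

```python
# Two spellings of "double every number in range(5)"
doubled_with_map = list(map(lambda i: 2 * i, range(5)))
doubled_with_comprehension = [2 * i for i in range(5)]

print(doubled_with_map)            # [0, 2, 4, 6, 8]
print(doubled_with_comprehension)  # [0, 2, 4, 6, 8]
```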
In effect they are the same as using `map`
and/or `filter` and using `list()` to cast the result to a list, as we did
previously.

All comprehension types are structured in a similar way, using the syntax for a
literal of that type (in the case below, a list literal) containing what looks
-like the top of a for loop. To the left of the `for` we put the equivalent of
+like the top of a for loop. To the left of the `for` we put the equivalent of
the map operation we want to use:

-~~~python
+```python
print([i for i in range(5)])
print([2 * i for i in range(5)])
-~~~
+```

-~~~
+```text
[0, 1, 2, 3, 4]
[0, 2, 4, 6, 8]
-~~~
+```

We can also use list comprehensions to perform the equivalent of a filter
operation, by putting the filter condition at the end:

-~~~python
+```python
print([2 * i for i in range(5) if i % 2 == 0])
-~~~
+```

-~~~
+```text
[0, 4, 8]
-~~~
+```

### Dictionary and Set Comprehensions

@@ -361,37 +360,37 @@ comprehensions but use the dictionary or set literal syntax.

So set comprehensions are:

-~~~python
+```python
print({2 * i for i in range(5)})
-~~~
+```

-~~~
+```text
{0, 2, 4, 6, 8}
-~~~
+```

While dictionary comprehensions are:

-~~~python
+```python
print({i: 2 * i for i in range(5)})
-~~~
+```

-~~~
+```text
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
-~~~
+```

:::callout
+
## Why No Tuple Comprehension

Raymond Hettinger, one of the Python core developers, said in 2013:

-~~~
+```text
Generally, lists are for looping; tuples for structs. Lists are homogeneous;
tuples heterogeneous. Lists for variable length.
-~~~
+```

Since tuples aren't intended to represent sequences, there's no need for them
to have a comprehension structure.

:::

-
## Generators

**Generator expressions** look exactly like you might expect a tuple
@@ -400,21 +399,21 @@ Since tuples aren't intended to represent sequences, there's no need for them to

What happens if we try to use them in the same way as we did list
comprehensions?
-~~~python
+```python
print((2 * i for i in range(5)))
-~~~
+```

-~~~
+```text
<generator object <genexpr> at 0x7efc21efcdd0>
-~~~
+```

Like the `map` and `filter` functions, generator expressions are not evaluated
until you iterate over them.

-~~~python
+```python
for i in (2 * i for i in range(5)):
    print(i)
-~~~
+```

## Decorators

@@ -422,7 +421,7 @@ Decorators are higher order functions that take a function as an argument, modif

Let's look at the following code for ways to "decorate" functions.

-~~~python
+```python
def with_logging(func):
    """A decorator which adds logging to a function."""

@@ -450,9 +449,9 @@ def add_two(n):
print(add_one(1))
print(add_two(1))
-~~~
+```

-~~~
+```text
Before function call
Adding one
After function call
2
Before function call
Adding two
After function call
3
-~~~
+```

In this example, we see a decorator (`with_logging`) and two different syntaxes
for applying the decorator to a function. The decorator is implemented here as a
@@ -503,18 +502,18 @@ different use-cases, but we won’t worry about that here.

For the function to measure, you may wish to use this as an example:

-~~~python
+```python
def measure_me(n):
    total = 0
    for i in range(n):
        total += i * i
    return total
-~~~
+```

:::solution

-~~~python
+```python
import time

def profile(func):
@@ -537,21 +536,22 @@ def measure_me(n):
    return total

print(measure_me(1000000))
-~~~
+```

-~~~
+```text
Took 0.124199753 seconds
333332833333500000
-~~~
+```
+
:::
::::

-## Key Points:
+## Key Points

-- *First-Class Functions*: functions that can be passed as arguments to other functions, returned from functions, or assigned to variables.
-- *Lambda Functions*: small, nameless functions defined in the normal flow of the program with a keyword lambda.
-- *Higher-Order Functions*: a function that has other functions as one of its arguments.
-- *Map, Filter and Reduce*: built-in higher order functions in Python that use lazy evaluation.
-- *Comprehensions*: a more Pythonic way to structure map and filter operations.
-- *Generators*: similar to list comprehensions, but behave differently and not evaluated until you iterate over them.
-- *Decorators*: higher-order functions that take a function as an argument, modify it, and return it.
+- _First-Class Functions_: functions that can be passed as arguments to other functions, returned from functions, or assigned to variables.
+- _Lambda Functions_: small, nameless functions defined in the normal flow of the program with the keyword `lambda`.
+- _Higher-Order Functions_: a function that takes one or more functions as arguments.
+- _Map, Filter and Reduce_: built-in higher order functions in Python that use lazy evaluation.
+- _Comprehensions_: a more Pythonic way to structure map and filter operations.
+- _Generators_: similar to list comprehensions, but behave differently and are not evaluated until you iterate over them.
+- _Decorators_: higher-order functions that take a function as an argument, modify it, and return it.
diff --git a/software_architecture_and_design/functional/index.md b/software_architecture_and_design/functional/index.md
index 9bb67c9a..6424d36e 100644
--- a/software_architecture_and_design/functional/index.md
+++ b/software_architecture_and_design/functional/index.md
@@ -1,37 +1,35 @@
---
id: functional
name: Functional Programming
-dependsOn: [
-  software_architecture_and_design.procedural,
-]
-files: [
+dependsOn: [software_architecture_and_design.procedural]
+files:
+  [
    side_effects_cpp.md,
    side_effects_python.md,
    recursion_cpp.md,
    recursion_python.md,
    higher_order_functions_cpp.md,
    higher_order_functions_python.md,
-]
+  ]
summary: |
-  Functional Programming is based around the idea that programs are constructed
-  by applying and composing/chaining **functions**. This course will introduce
-  you to the basics of functional programming in either Python or C++.
-attribution:
-  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
-    url: https://www.sabsr3.ox.ac.uk
-    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
+  Functional Programming is based around the idea that programs are constructed
+  by applying and composing/chaining **functions**. This course will introduce
+  you to the basics of functional programming in either Python or C++.
+attribution:
+  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

Functional programming is a programming paradigm where programs are constructed
by applying and composing/chaining **functions**.

Functional programming is based on the [mathematical definition of a
-function](https://en.wikipedia.org/wiki/Function_(mathematics)) `f()`, which
+function](<https://en.wikipedia.org/wiki/Function_(mathematics)>) `f()`, which
applies a transformation to some input data giving us some other data as a
result (i.e. a mapping from input `x` to output `f(x)`).
Thus, a program written in a functional style becomes a series of transformations on data which are @@ -47,7 +45,6 @@ focussed on **what** transformations are done to the data, rather than **how** these transformations are performed (i.e. a detailed sequence of steps which update the state of the code to reach a desired state). - :::callout In his introduction to functional programming in Advanced R, Hadley Wickham gives a good summary of the style: @@ -57,4 +54,4 @@ In his introduction to functional programming in Advanced R, Hadley Wickham give > Each function taken by itself is simple and straightforward to understand; complexity is handled by composing functions in various ways. > > -- Hadley Wickham - [Functional Style](https://adv-r.hadley.nz/fp.html) -::: \ No newline at end of file +> ::: diff --git a/software_architecture_and_design/functional/recursion_cpp.md b/software_architecture_and_design/functional/recursion_cpp.md index 3e21b914..3dd8f1c9 100644 --- a/software_architecture_and_design/functional/recursion_cpp.md +++ b/software_architecture_and_design/functional/recursion_cpp.md @@ -1,28 +1,25 @@ --- name: Recursion -dependsOn: [ - software_architecture_and_design.functional.higher_order_functions_cpp, -] +dependsOn: [software_architecture_and_design.functional.higher_order_functions_cpp] tags: [cpp] -attribution: - - citation: > - This material was adapted from an "Introduction to C++" course developed by the - Oxford RSE group. 
- url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: > + This material was adapted from an "Introduction to C++" course developed by the + Oxford RSE group. + url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Recursion Recursion is one of the common strategies used in Functional Programming. Instead of using loops to **iteratively** apply an operation, we can express a -result in terms of previous results. To do this, the function needs to call +result in terms of previous results. To do this, the function needs to call itself to get the previous result, this is called **recursion**. The following two code examples implement the calculation of a factorial using @@ -33,7 +30,7 @@ iteration and recursion, respectively. 
Recall that the factorial of a number `n` // factorial // @param n: the number to calculate the factorial of // @return: the factorial of n -int factorial(int n): +int factorial(int n) { int product = 1; for (int i = 2; i <= n; ++i) { product *= i; @@ -42,14 +39,14 @@ int factorial(int n): } ``` -Functions in procedural programming are *procedures* that describe a detailed +Functions in procedural programming are _procedures_ that describe a detailed list of instructions to tell the computer what to do step by step and how to change the state of the program and advance towards the result. They often use -*iteration* to repeat a series of steps. Functional programming, on the other -hand, often uses *recursion* - an ability of a function to call/repeat +_iteration_ to repeat a series of steps. Functional programming, on the other +hand, often uses _recursion_ - an ability of a function to call/repeat itself until a particular condition is reached. -~~~cpp +```cpp // factorial // @param n: the number to calculate the factorial of // @return: the factorial of n @@ -62,9 +59,9 @@ int factorial(int n) { } return n * factorial(n - 1); } -~~~ +``` -Note: this implementation is an example of *tail recursion*, which is typically +Note: this implementation is an example of _tail recursion_, which is typically optimised by the compiler back to an iterative implementation (since this is faster). @@ -88,8 +85,8 @@ int main() { // 1 * // / \ // 2 3 - Node t = Node('+', { Node(1), - Node('*', { Node(2), + Node t = Node('+', { Node(1), + Node('*', { Node(2), Node(3) }) } @@ -98,11 +95,11 @@ int main() { ``` Write: + 1. a function that traverses the tree and returns the total number of nodes 2. 
a function that traverses the tree and returns the result of the expression - :::solution ```cpp @@ -168,10 +165,11 @@ int evaluate2(const Node& t) { return std::accumulate(t.children.begin() + 1, t.children.end(), evaluate(t.children[0]), op); } ``` + ::: :::: -## Key Points: +## Key Points - Recursion is a programming technique where a function calls itself, allowing solutions to problems that can be broken down into smaller subproblems -- Recursion is a useful approach for calculation and operations on tree data structures. \ No newline at end of file +- Recursion is a useful approach for calculation and operations on tree data structures. diff --git a/software_architecture_and_design/functional/recursion_python.md b/software_architecture_and_design/functional/recursion_python.md index 1915acc3..47575e94 100644 --- a/software_architecture_and_design/functional/recursion_python.md +++ b/software_architecture_and_design/functional/recursion_python.md @@ -1,33 +1,30 @@ --- name: Recursion -dependsOn: [ - software_architecture_and_design.functional.higher_order_functions_python, -] +dependsOn: [software_architecture_and_design.functional.higher_order_functions_python] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Recursion

Recursion is one of the common strategies used in Functional Programming.
Instead of using loops to **iteratively** apply an operation, we can express a
-result in terms of previous results. To do this, the function needs to call
+result in terms of previous results. To do this, the function needs to call
itself to get the previous result; this is called **recursion**.

The following two code examples implement the calculation of a factorial using
iteration and recursion, respectively. Recall that the factorial of a number `n`
(denoted by `n!`) is calculated as the product of integer numbers from 1 to `n`.

-~~~python
+```python
def factorial(n):
    """Calculate the factorial of a given number.

@@ -43,16 +40,16 @@
        factorial = factorial * i
    return factorial
-~~~
+```

-Functions in procedural programming are *procedures* that describe a detailed
+Functions in procedural programming are _procedures_ that describe a detailed
list of instructions to tell the computer what to do step by step and how to
change the state of the program and advance towards the result. They often use
-*iteration* to repeat a series of steps. Functional programming, on the other
-hand, typically uses *recursion* - an ability of a function to call/repeat
+_iteration_ to repeat a series of steps. Functional programming, on the other
+hand, typically uses _recursion_ - an ability of a function to call/repeat
itself until a particular condition is reached.
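The "particular condition" that stops the recursion is usually called the base case. A small sketch of why it matters (the `countdown` helper is illustrative, not from the course): without the `n == 0` branch the calls would nest until CPython's recursion limit is hit and a `RecursionError` is raised.

```python
import sys

def countdown(n):
    if n == 0:        # base case: stops the chain of recursive calls
        return 0
    return countdown(n - 1)

print(countdown(10))            # 0
print(sys.getrecursionlimit())  # CPython's default limit is usually 1000
```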
-~~~python +```python def factorial(n): """Calculate the factorial of a given number. @@ -66,7 +63,7 @@ def factorial(n): return 1 # exit from recursion, prevents infinite loops else: return n * factorial(n-1) # recursive call to the same function -~~~ +``` ::::challenge{id="recursion_on_trees" title="Recursion on trees"} @@ -95,11 +92,11 @@ t = Node('+', [Node('1'), ``` Write: + 1. a function that traverses the tree and returns the total number of nodes 2. a function that traverses the tree and returns the result of the expression - :::solution ```python @@ -136,10 +133,11 @@ def evaluate(tree): else: raise ValueError(f"Unknown operator: {tree.value}") ``` + ::: :::: -## Key Points: +## Key Points - Recursion is a programming technique where a function calls itself, allowing solutions to problems that can be broken down into smaller subproblems - Recursion is a useful approach for calculation and operations on tree data structures. diff --git a/software_architecture_and_design/functional/side_effects_cpp.md b/software_architecture_and_design/functional/side_effects_cpp.md index 93ce016c..348a73c0 100644 --- a/software_architecture_and_design/functional/side_effects_cpp.md +++ b/software_architecture_and_design/functional/side_effects_cpp.md @@ -1,20 +1,16 @@ --- +--- name: State and Side Effects -dependsOn: [ -] +dependsOn: [] tags: [cpp] -attribution: - - citation: > - This material was adapted from an "Introduction to C++" course developed by the - Oxford RSE group. 
- url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material was adapted from an "Introduction to C++" course developed by the Oxford RSE group. + url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Program state @@ -79,7 +75,7 @@ std::getline(myfile, line); // Same call to getline, but result is different! ``` The main downside of having a state that is constantly updated is that it makes -it harder for us to *reason* about our code, to work out what it is doing. +it harder for us to _reason_ about our code, to work out what it is doing. However, the main upside is that we can use state to make calculations more efficient, for example to sum up the values in a vector we use a single variable `sum` to hold the state of the computation @@ -97,15 +93,15 @@ for (const auto& x: data) { Functional computations only rely on the values that are provided as inputs to a function and not on the state of the program that precedes the function call. They do not modify data that exists outside the current function, including the -input data - this property is referred to as the *immutability of data*. This -means that such functions do not create any *side effects*, i.e. 
do not perform
+input data - this property is referred to as the _immutability of data_. This
+means that such functions do not create any _side effects_, i.e. do not perform
any action that affects anything other than the value they return. A pure
function is therefore the computational version of a mathematical function.

For example: printing text, writing to a file, modifying the value of an input
argument, or changing the value of a global variable. Functions without side
effects that return the same data each time the same input arguments are
provided are called
-*pure functions*.
+_pure functions_.

::::challenge{id="pure-functions" title="Pure Functions"}

@@ -132,6 +128,7 @@ void increment_x(int& x) {
```

:::solution
+
```cpp
// a pure function, no side effects :)
int increment_and_return_x(const int& x) {
@@ -159,6 +156,7 @@ void increment_x(int& x) {
  x += one;
}
```
+
:::
::::

@@ -330,12 +328,10 @@ int main() {
  return 0;
}
```
+
:::
::::

-
-
-
## Benefits of Functional Code

There are a few benefits we get when working with pure functions:

@@ -355,7 +351,7 @@ will be, or how to measure them.

**Composability** refers to the ability to make a new function from a chain of
other functions by piping the output of one as the input to the next. If a
-function does not have side effects or non-deterministic behaviour, then all
+function does not have side effects or non-deterministic behaviour, then all
of its behaviour is reflected in the value it returns. As a consequence of
this, any chain of combined pure functions is itself pure, so we keep all these
benefits when we are combining functions into a larger program.

@@ -365,20 +361,20 @@ benefits when we are combining functions into a larger program.
*a lot of data, we can often improve performance by splitting data and
*distributing the computation across multiple processors. The output of a pure
*function depends only on its input, so we will get the right result regardless
-*of when or where the code runs.
+\*of when or where the code runs. There are other advantageous properties that can be derived from the functional approach to coding. In languages which support functional programming, a -function is a *first-class object* like any other object - not only can you +function is a _first-class object_ like any other object - not only can you compose/chain functions together, but functions can be used as inputs to, passed around or returned as results from other functions (remember, in functional -programming *code is data*). This is why functional programming is suitable for +programming _code is data_). This is why functional programming is suitable for processing data efficiently - in particular in the world of Big Data, where code is much smaller than the data, sending the code to where data is located is cheaper and faster than the other way round. Let's see how we can do data processing using functional programming. -## Key Points: +## Key Points - Program state is composed of variables' values, including those modified by functions and interactions with the Operating System. - Functional computations rely only on input values, are immutable, and do not create side effects. Pure functions are testable, composable, and parallelizable. diff --git a/software_architecture_and_design/functional/side_effects_python.md b/software_architecture_and_design/functional/side_effects_python.md index 2363c123..98cc8230 100644 --- a/software_architecture_and_design/functional/side_effects_python.md +++ b/software_architecture_and_design/functional/side_effects_python.md @@ -1,33 +1,30 @@ --- name: Side Effects -dependsOn: [ -] +dependsOn: [] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - ## Program state -In programming, the term "state" refers to the current status or condition of a program -at a particular moment in time. It can include various pieces of information, such as -the values of variables, data structures, or objects, and the state of the system +In programming, the term "state" refers to the current status or condition of a program +at a particular moment in time. It can include various pieces of information, such as +the values of variables, data structures, or objects, and the state of the system resources, such as the memory, files, or network connections. -The state of a program can be modifiable or immutable, depending on whether it can be -changed or not during the program's execution. 
Modifiable state can be a powerful tool
-in programming, as it allows us to store and update temporary data that we can use to
-make calculations more efficient. However, it also introduces complexity and potential
-pitfalls, as changes in state can lead to unpredictable behavior, bugs, or security
+changed or not during the program's execution. Modifiable state can be a powerful tool
+in programming, as it allows us to store and update temporary data that we can use to
+make calculations more efficient. However, it also introduces complexity and potential
+pitfalls, as changes in state can lead to unpredictable behavior, bugs, or security
vulnerabilities.

The current state of a given program is composed of many different parts, some

@@ -60,10 +57,11 @@ more unclear that this is being updated. The global variable and function might
even be declared in a separate file and brought in via an `import`

```python
-z = 3 
+z = 3

def my_cool_function(x, y):
+    global z
    x = y
-    z = z + 1;
+    z = z + 1


y = [3, 2]
@@ -76,6 +74,7 @@ opaque, part of the current state. This includes memory allocations, and file
IO.

```python
+import numpy as np
x = np.zeros(1000)  # do we have enough RAM available for this?
myfile = open("example.txt", "w")  # does this file exist? Do I have write permissions?
@@ -85,44 +84,45 @@ line = myfile.readline()  # Same call to readline, but result is different!

The main downside of having a state that is constantly updated is that it makes
-it harder for us to *reason* about our code, to work out what it is doing.
+it harder for us to _reason_ about our code, to work out what it is doing.
However, the upside is that we can use state to store temporary data to make
calculations more efficient.
For example an iteration loop that keeps track of a running total is a common pattern in procedural programming: -```python +```python nolint result = 0 for x in data: result += expensive_computation(x) ``` + ## Side Effects and Pure Functions -By considering how we use state in our programs, we can improve our programming by -making it more predictable, reliable, and testable. One way to achieve this is by -adopting functional programming principles, which promote the use of pure functions that -do not modify any external state and rely only on their input parameters to produce -their output. Pure functions are easier to reason about and test, and they enable -composability, parallelism, and other benefits that can improve the quality and +By considering how we use state in our programs, we can improve our programming by +making it more predictable, reliable, and testable. One way to achieve this is by +adopting functional programming principles, which promote the use of pure functions that +do not modify any external state and rely only on their input parameters to produce +their output. Pure functions are easier to reason about and test, and they enable +composability, parallelism, and other benefits that can improve the quality and efficiency of our code. Functional computations only rely on the values that are provided as inputs to a function and not on the state of the program that precedes the function call. They do not modify data that exists outside the current function, including the -input data - this property is referred to as the *immutability of data*. This -means that such functions do not create any *side effects*, i.e. do not perform +input data - this property is referred to as the _immutability of data_. This +means that such functions do not create any _side effects_, i.e. do not perform any action that affects anything other than the value they return. 
For example: printing text, writing to a file, modifying the value of an input
argument, or changing the value of a global variable. Functions without side
effects that return the same data each time the same input arguments are
provided are called
-*pure functions*.
+_pure functions_.

::::challenge{id="pure-functions" title="Pure Functions"}

Which of these functions are pure? If you're not sure, explain your reasoning to
someone else, do they agree?

-~~~python
+```python
def add_one(x):
    return x + 1

@@ -136,15 +136,17 @@ def append_item_2(a_list, item):
    result = a_list + [item]
    return result
-~~~
+```

:::solution
+
## Solution

1. `add_one` is pure - it has no effects other than to return a value and this value will always be the same when given the same inputs
2. `say_hello` is not pure - printing text counts as a side effect, even though it is the clear purpose of the function
3. `append_item_1` is not pure - the argument `a_list` gets modified as a side effect - try this yourself to prove it
4. `append_item_2` is pure - the result is a new variable, so this time `a_list` does not get modified - again, try this yourself
+
:::
::::

@@ -187,7 +189,7 @@ def get_neighbors(grid, i, j):
            (indices[:, 1] >= 0) & (indices[:, 1] < cols)
    valid_indices[4] = False  # exclude current cell
    return grid[indices[valid_indices][:, 0], indices[valid_indices][:, 1]]
-    
+

# Test
grid = np.array([[0, 0, 0, 0, 0],
                 [0, 0, 1, 0, 0],
@@ -254,7 +256,6 @@ assert np.array_equal(new_grid, grid), "Grid should be unchanged"

:::
::::

-
## Benefits of Functional Code

There are a few benefits we get when working with pure functions:

@@ -274,17 +275,17 @@ will be, or how to measure them.

**Composability** refers to the ability to make a new function from a chain of
other functions by piping the output of one as the input to the next.
If a -function does not have side effects or non-deterministic behaviour, then all +function does not have side effects or non-deterministic behaviour, then all of its behaviour is reflected in the value it returns. As a consequence of this, any chain of combined pure functions is itself pure, so we keep all these benefits when we are combining functions into a larger program. As an example of this, we could make a function called `add_two`, using the `add_one` function we already have. -~~~python +```python def add_two(x): return add_one(add_one(x)) -~~~ +``` **Parallelisability** is the ability for operations to be performed at the same time (independently). If we know that a function is fully pure and we have got @@ -294,7 +295,9 @@ function depends only on its input, so we will get the right result regardless of when or where the code runs. :::callout + ## Everything in Moderation + Despite the benefits that pure functions can bring, we should not be trying to use them everywhere. Any software we write needs to interact with the rest of the world somehow, which requires side effects. With pure functions you cannot @@ -311,16 +314,16 @@ as they return new data objects instead of changing existing ones. There are other advantageous properties that can be derived from the functional approach to coding. In languages which support functional programming, a -function is a *first-class object* like any other object - not only can you +function is a _first-class object_ like any other object - not only can you compose/chain functions together, but functions can be used as inputs to, passed around or returned as results from other functions (remember, in functional -programming *code is data*). This is why functional programming is suitable for +programming _code is data_). 
This is why functional programming is suitable for processing data efficiently - in particular in the world of Big Data, where code is much smaller than the data, sending the code to where data is located is cheaper and faster than the other way round. Let's see how we can do data processing using functional programming. -## Key Points: +## Key Points - Program state is composed of variables' values, including those modified by functions and interactions with the Operating System. - Functional computations rely only on input values, are immutable, and do not create side effects. Pure functions are testable, composable, and parallelizable. diff --git a/software_architecture_and_design/index.md b/software_architecture_and_design/index.md index be7eb32f..c4040de4 100644 --- a/software_architecture_and_design/index.md +++ b/software_architecture_and_design/index.md @@ -10,4 +10,4 @@ summary: | Programming design concepts and patterns, such as procedural, object-orientated, functional, generic. Covers data structures, memory allocation, ownership. --- -Programming design concepts and patterns, such as procedural, object-orientated, functional, generic. Covers data structures, memory allocation, ownership. \ No newline at end of file +Programming design concepts and patterns, such as procedural, object-orientated, functional, generic. Covers data structures, memory allocation, ownership. diff --git a/software_architecture_and_design/object_orientated/classes.md b/software_architecture_and_design/object_orientated/classes.md index 3ff73c2d..ba7e4751 100644 --- a/software_architecture_and_design/object_orientated/classes.md +++ b/software_architecture_and_design/object_orientated/classes.md @@ -1,18 +1,19 @@ --- name: Classes -dependsOn: [ -] +dependsOn: [] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +learningOutcomes: + - Explain the object orientated programming paradigm. + - Define a class to encapsulate data. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Structuring Data @@ -20,10 +21,11 @@ attribution: One of the main difficulties we encounter when building more complex software is how to structure our data. So far, we've been processing data from a single source and with a simple tabular structure, but it would be useful to be able to combine data from a range of different sources and with more data than just an array of numbers. -~~~ python +```python +import numpy as np data = np.array([[1., 2., 3.], [4., 5., 6.]]) -~~~ +``` Using this data structure has the advantage of being able to use NumPy operations to process the data and Matplotlib to plot it, but often we need to have more structure than this. 
For example, we may need to attach more information about the patients and store this alongside our measurements of inflammation. @@ -31,7 +33,7 @@ For example, we may need to attach more information about the patients and store We can do this using the Python data structures we're already familiar with, dictionaries and lists. For instance, say we wish to store a list of patients on a clinical inflammation trial. We could attach a name to each of our patients: -~~~ python +```python patients = [ { 'name': 'Alice', @@ -42,7 +44,7 @@ patients = [ 'data': [4., 5., 6.], }, ] -~~~ +``` ::::challenge{id=structuring-data title="Structuring Data"} @@ -52,15 +54,15 @@ When used as below, it should produce the expected output. If you're not sure where to begin, think about ways you might be able to effectively loop over two collections at once. Also, don't worry too much about the data type of the `data` value, it can be a Python list, or a NumPy array - either is fine. -~~~ python +```python nolint data = np.array([[1., 2., 3.], [4., 5., 6.]]) output = attach_names(data, ['Alice', 'Bob']) print(output) -~~~ +``` -~~~ +```text [ { 'name': 'Alice', @@ -71,13 +73,13 @@ print(output) 'data': [4., 5., 6.], }, ] -~~~ +``` :::solution One possible solution, perhaps the most obvious, is to use the `range` function to index into both lists at the same location: -~~~ python +```python def attach_names(data, names): """Create datastructure containing patient records.""" output = [] @@ -87,10 +89,10 @@ def attach_names(data, names): 'data': data[i]}) return output -~~~ +``` However, this solution has a potential problem that can occur sometimes, depending on the input. -What might go wrong with this solution? How could we fix it? +What might go wrong with this solution? How could we fix it? 
::: @@ -107,7 +109,7 @@ Checking that our inputs are valid in this way is an example of a precondition, If you've not previously come across the `zip` function, read [this section](https://docs.python.org/3/library/functions.html#zip) of the Python documentation. -~~~ python +```python def attach_names(data, names): """Create datastructure containing patient records.""" assert len(data) == len(names) @@ -118,7 +120,8 @@ def attach_names(data, names): 'data': data_row}) return output -~~~ +``` + ::: :::: @@ -127,15 +130,15 @@ def attach_names(data, names): Using nested dictionaries and lists should work for some of the simpler cases where we need to handle structured data, but they get quite difficult to manage -once the structure becomes a bit more complex. For this reason, in the object +once the structure becomes a bit more complex. For this reason, in the object oriented paradigm, we use **classes** to help with managing this data and the -operations we would want to perform on it. A class is a **template** +operations we would want to perform on it. A class is a **template** (blueprint) for a structured piece of data, so when we create some data using a class, we can be certain that it has the same structure each time. With our list of dictionaries we had in the example above, we have no real guarantee that each dictionary has the same structure, e.g. the same keys -(`name` and `data`) unless we check it manually. With a class, if an object is +(`name` and `data`) unless we check it manually. With a class, if an object is an **instance** of that class (i.e. it was made using that template), we know it will have the structure defined by that class. Different programming languages make slightly different guarantees about how strictly the structure will match, @@ -145,7 +148,7 @@ derived from the same class must follow the same behaviour. 
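The contrast between loosely structured dictionaries and a class template can be seen in a small sketch. This is illustrative only and not part of the lesson code; the `PatientRecord` class name is hypothetical:

```python
# Hypothetical sketch: dictionaries offer no guarantee of shared structure,
# while a class acts as a template that every instance follows.
p1 = {"name": "Alice", "data": [1.0, 2.0, 3.0]}
p2 = {"Name": "Bob", "values": [4.0, 5.0, 6.0]}  # mistyped keys go unnoticed
print(p1.keys() == p2.keys())  # the two records have silently diverged


class PatientRecord:
    """Every instance is guaranteed to have `name` and `data` attributes."""

    def __init__(self, name, data):
        self.name = name
        self.data = data


records = [PatientRecord("Alice", [1.0, 2.0, 3.0]),
           PatientRecord("Bob", [4.0, 5.0, 6.0])]
# Every instance made from the template has the same attributes:
print(all(hasattr(r, "name") and hasattr(r, "data") for r in records))
```

Running this prints `False` for the mismatched dictionaries and `True` for the class instances, which is exactly the structural guarantee described above.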
You may not have realised, but you should already be familiar with some of the classes that come bundled as part of Python, for example:

-~~~ python
+```python
my_list = [1, 2, 3]
my_dict = {1: '1', 2: '2', 3: '3'}
my_set = {1, 2, 3}

print(type(my_list))
print(type(my_dict))
print(type(my_set))
-~~~
+```

-~~~
+```text
<class 'list'>
<class 'dict'>
<class 'set'>
-~~~
+```

Lists, dictionaries and sets are a slightly special type of class, but they behave in much the same way as a class we might define ourselves:
@@ -169,8 +172,8 @@ Lists, dictionaries and sets are a slightly special type of class, but they beha

The behaviours we may have seen previously include:

- Lists can be appended to
-- Lists can be indexed
-- Lists can be sliced
+- Lists can be indexed
+- Lists can be sliced
- Key-value pairs can be added to dictionaries
- The value at a key can be looked up in a dictionary
- The union of two sets can be found (the set of values present in any of the sets)
@@ -180,7 +183,7 @@ The behaviours we may have seen previously include:

Let's start with a minimal example of a class representing our patients.

-~~~ python
+```python
# file: inflammation/models.py

class Patient:
@@ -190,18 +193,18 @@ class Patient:

alice = Patient('Alice')
print(alice.name)
-~~~
+```

-~~~
+```text
Alice
-~~~
+```

-Here we've defined a class with one method: `__init__`. This method is the
+Here we've defined a class with one method: `__init__`. This method is the
**initialiser** method, which is responsible for setting up the initial values
and structure of the data inside a new instance of the class - this is very
similar to **constructors** in other languages, so the term is often used in
-Python too. The `__init__` method is called every time we create a new instance
-of the class, as in `Patient('Alice')`. The argument `self` refers to the
+Python too. The `__init__` method is called every time we create a new instance
+of the class, as in `Patient('Alice')`.
The argument `self` refers to the instance on which we are calling the method and gets filled in automatically by Python - we do not need to provide a value for this when we call the method. @@ -226,16 +229,16 @@ we add functions which operate on the data the class contains. These functions are the member functions or methods. Methods on classes are the same as normal functions, except that they live -inside a class and have an extra first parameter `self`. Using the name `self` +inside a class and have an extra first parameter `self`. Using the name `self` is not strictly necessary, but is a very strong convention - it is extremely -rare to see any other name chosen. When we call a method on an object, the -value of `self` is automatically set to this object - hence the name. As we saw +rare to see any other name chosen. When we call a method on an object, the +value of `self` is automatically set to this object - hence the name. As we saw with the `__init__` method previously, we do not need to explicitly provide a value for the `self` argument, this is done for us by Python. Let's add another method on our Patient class that adds a new observation to a Patient instance. -~~~ python +```python # file: inflammation/models.py class Patient: @@ -266,18 +269,19 @@ print(alice) observation = alice.add_observation(3) print(observation) print(alice.observations) -~~~ +``` -~~~ +```text <__main__.Patient object at 0x7fd7e61b73d0> {'day': 0, 'value': 3} [{'day': 0, 'value': 3}] -~~~ +``` Note also how we used `day=None` in the parameter list of the `add_observation` method, then initialise it if the value is indeed `None`. This is one of the common ways to handle an optional argument in Python, so we'll see this pattern quite a lot in real projects. :::callout + ## Class and Static Methods Sometimes, the function we're writing doesn't need access to any data belonging to a particular object. 
@@ -296,10 +300,10 @@ Both of these method types are created using **decorators** - for more informati

Why is the `__init__` method not called `init`? There are a few special method names that we can use which Python will use to provide a few common behaviours, each of which begins and ends with a **d**ouble-**under**score, hence the name **dunder method**.

-When writing your own Python classes, you'll almost always want to write an `__init__` method, but there are a few other common ones you might need sometimes. You may have noticed in the code above that the method `print(alice)` returned `<__main__.Patient object at 0x7fd7e61b73d0>`, which is the string represenation of the `alice` object. We
+When writing your own Python classes, you'll almost always want to write an `__init__` method, but there are a few other common ones you might need sometimes. You may have noticed in the code above that the call `print(alice)` returned `<__main__.Patient object at 0x7fd7e61b73d0>`, which is the string representation of the `alice` object. We
may want the print statement to display the object's name instead. We can achieve this by overriding the `__str__` method of our class.

-~~~ python
+```python
# file: inflammation/models.py

class Patient:
@@ -331,11 +335,11 @@ class Patient:

alice = Patient('Alice')
print(alice)
-~~~
+```

-~~~
+```text
Alice
-~~~
+```

These dunder methods are not usually called directly, but rather provide the implementation of some functionality we can use - we didn't call `alice.__str__()`, but it was called for us when we did `print(alice)`.
Some we see quite commonly are: @@ -356,19 +360,19 @@ Your class should: - Have an author - When printed using `print(book)`, show text in the format "title by author" -~~~ python +```python nolint book = Book('A Book', 'Me') print(book) -~~~ +``` -~~~ +```text A Book by Me -~~~ +``` :::solution -~~~ python +```python class Book: def __init__(self, title, author): self.title = title @@ -376,7 +380,8 @@ class Book: def __str__(self): return self.title + ' by ' + self.author -~~~ +``` + ::: :::: @@ -385,7 +390,7 @@ class Book: The final special type of method we will introduce is a **property**. Properties are methods which behave like data - when we want to access them, we do not need to use brackets to call the method manually. -~~~ python +```python # file: inflammation/models.py class Patient: @@ -402,19 +407,20 @@ alice.add_observation(4) obs = alice.last_observation print(obs) -~~~ +``` -~~~ +```text {'day': 1, 'value': 4} -~~~ +``` You may recognise the `@` syntax from episodes on functional programming - -`property` is another example of a **decorator**. In this case the `property` +`property` is another example of a **decorator**. In this case the `property` decorator is taking the `last_observation` function and modifying its behaviour, -so it can be accessed as if it were a normal attribute. It is also possible to +so it can be accessed as if it were a normal attribute. It is also possible to make your own decorators, but we won't cover it here. -## Key Points: +## Key Points + - Object oriented programming is a programming paradigm based on the concept of classes, which encapsulate data and code. - Classes allow us to organise data into distinct concepts. - By breaking down our data into classes, we can reason about the behaviour of parts of our data. 
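The `@property` pattern introduced above can be condensed into a minimal, self-contained sketch. This is a cut-down, hypothetical variant of the lesson's `Patient` class, shown only so the decorator can be run in isolation:

```python
class Patient:
    def __init__(self, name):
        self.name = name
        self.observations = []

    @property
    def last_observation(self):
        # A property is accessed without brackets, like a plain attribute
        return self.observations[-1]


alice = Patient("Alice")
alice.observations.append({"day": 0, "value": 3})
alice.observations.append({"day": 1, "value": 4})
print(alice.last_observation)  # note: no brackets after last_observation
```

This prints `{'day': 1, 'value': 4}` - the method runs, but from the caller's side it reads like plain data.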
diff --git a/software_architecture_and_design/object_orientated/classes_cpp.md b/software_architecture_and_design/object_orientated/classes_cpp.md
index 62123387..3dcab4f7 100644
--- a/software_architecture_and_design/object_orientated/classes_cpp.md
+++ b/software_architecture_and_design/object_orientated/classes_cpp.md
@@ -1,20 +1,18 @@
---
name: Classes
-dependsOn: [
-]
+dependsOn: []
tags: [cpp]
-attribution:
-  - citation: >
-      This material was adapted from an "Introduction to C++" course developed by the
-      Oxford RSE group.
-    url: https://www.rse.ox.ac.uk
-    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
+attribution:
+  - citation: >
+      This material was adapted from an "Introduction to C++" course developed by the
+      Oxford RSE group.
+    url: https://www.rse.ox.ac.uk
+    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

## Prerequisites

The code blocks in this lesson will assume that some boilerplate C++ code is present.
In particular, we will assume that the following headers are included:

-~~~ cpp
+```cpp
#include <iostream>
#include <string>
#include <vector>
-~~~
+```

We will also assume that you are using the C++17 language standard, or later.
This will be the default with most modern compilers.
Furthermore, for this lesson it will be assumed you are writing all your code in
a single file. Class definitions will be assumed to exist before `main`, and all
other code will be assumed to exist within `main`.

-
## Structuring Data

Let's assume we're writing a game and we have data related to characters and items.
In a procedural style, with no classes, we might structure our game data with separate vectors for each attribute:

-~~~ cpp
+```cpp
// Character data
std::vector<std::string> character_names;
std::vector<int> character_healthPoints;
@@ -51,11 +48,12 @@
std::vector<std::string> item_names;
std::vector<float> item_positions_x;
std::vector<float> item_positions_y;
-~~~
+```

Structuring our data in this way does have some advantages: it may be possible to update every character's position at once, and having those positions contiguous in memory may make this an efficient operation.
-However, as we build out the functioanlity of characters and items, this will quickly become unwealdy.
+However, as we build out the functionality of characters and items, this will quickly become unwieldy.
+
1. There will be lots of vectors!
2. A function that operates on a character might need to take many arguments.
3. There is a very loose tie between attributes: the implicit assumption that position `i` contains information about item `i`.
@@ -68,12 +66,12 @@ Write a function, called `move_character` that will take an index and update the

One possible solution is as follows.

-~~~ cpp
+```cpp
void move_character(int character_index, float new_x, float new_y) {
  character_positions_x[character_index] = new_x;
  character_positions_y[character_index] = new_y;
}
-~~~
+```

This works fine, but has a few undesirable characteristics.

@@ -111,7 +109,7 @@ Let's start by tidying up one of the problems with the previous solution where w

It's not ideal that the position is two unrelated floats.
Let's create a class called `Position`:

-~~~ cpp
+```cpp
class Position {
public:
  float x;
@@ -120,7 +118,7 @@ public:
  Position(float x, float y) : x(x), y(y) {}
};
-~~~
+```

Let's break down the syntax.

@@ -136,18 +134,17 @@ Let's break down the syntax.

Here's an example of how to create an object of the Position class:

-~~~ cpp
+```cpp
// Creates a Position object at coordinates (10.0, 20.0) called pos
Position pos(10.0, 20.0);
-~~~
+```

We can then modify the position as follows:

-~~~ cpp
+```cpp
pos.x = 30.0; // Changes the x coordinate to 30.0
pos.y = 40.0; // Changes the y coordinate to 40.0
-~~~
-
+```

:::::challenge{id=using-position title="Using Position"}

@@ -158,7 +155,7 @@ Then, update the `move_character` method appropriately.

By using a `std::vector<Position>` we can reduce the number of vectors we are storing.

-~~~ cpp
+```cpp
// Character data
std::vector<std::string> character_names;
std::vector<int> character_healthPoints;
@@ -167,31 +164,31 @@ std::vector<Position> character_positions;

// Item data
std::vector<std::string> item_names;
std::vector<Position> item_positions;
-~~~
+```

We can use the Position object just like it's a float.

-- Because the object is simple*, the compiler will automatically generate the methods necessary to assign one Position to another.
-- Because the object is small (it just contains two floats), it does not need to be passed by reference.
-~~~ cpp
+- Because the object is simple\*, the compiler will automatically generate the methods necessary to assign one Position to another.
+- Because the object is small (it just contains two floats), it does not need to be passed by reference.
+
+```cpp
void move_character(int character_index, Position new_position) {
  character_positions[character_index] = new_position;
}
-~~~
+```

-If you're interested in the * next to simple above, you may want to read about the [rule of zero](https://en.cppreference.com/w/cpp/language/rule_of_three).
+If you're interested in the \* next to simple above, you may want to read about the [rule of zero](https://en.cppreference.com/w/cpp/language/rule_of_three).

::::
:::::

-
:::::challenge{id=character-and-item-classes title="Write a class for characters and items"}

Write a class that encapsulates the data relating to characters and items.

::::solution

-~~~ cpp
+```cpp
class Character {
public:
  std::string name;
@@ -210,16 +207,17 @@ public:
  Item(std::string name, Position position) : name(name), position(position) {}
};
-~~~
+```
+
::::
:::::

After writing the three classes `Position`, `Character` and `Item`, we can re-write all of the data that we originally had as:

-~~~ cpp
+```cpp
std::vector<Character> characters;
std::vector<Item> items;
-~~~
+```

## Encapsulating Behaviour

@@ -229,7 +227,7 @@ To define the behaviour of a class we add functions which operate on the data th

Methods on classes are the same as normal functions, except that they live inside a class.
We can relocate our `move_character` method from being a free function to being a member function of the class `Character`:

-~~~ cpp
+```cpp
class Character {
public:
  std::string name;
@@ -238,16 +236,16 @@ public:
  Character(std::string name, int healthPoints, Position position)
    : name(name), healthPoints(healthPoints), position(position) {}
-
+
  void move(Position new_position) {
    position = new_position;
  }
};
-~~~
+```

We can then create an object of type `Character` and change its position:

-~~~ cpp
+```cpp
// Create a Character object
Position initialPosition(10.0, 20.0); // Position at coordinates (10.0, 20.0)
Character character("Hero", 100, initialPosition); // Character named "Hero" with 100 health points at initialPosition
@@ -255,8 +253,7 @@
// Call move to change the character's position
Position newPosition(30.0, 40.0); // New position at coordinates (30.0, 40.0)
character.move(newPosition); // Move the character to newPosition
-~~~ - +``` ## Taking the basics further @@ -267,14 +264,13 @@ Let's briefly touch on a few additional features. It's generally good practice for class data to be private, and for it to be accessed through a function called a **getter**. - We'll start by making the position member of the Character class private. Then, we'll add a public "getter" method to provide access to it. With a getter, you can control how a class's data is accessed. Similarly, with a setter, you can control how data is modified. Our `move` method is an example of a setter, although it would be more standard to call the method `setPosition`. For example, you could check if new data is valid before setting a variable, or you could make a variable read-only (by providing a getter but not a setter). -~~~ cpp +```cpp class Character { private: Position position; @@ -294,7 +290,7 @@ public: return position; } }; -~~~ +``` ::::challenge{id="make-all-data-private" title="Make all data private"} @@ -302,7 +298,7 @@ Make all data members private, and implement getters to access the data. :::solution -~~~ cpp +```cpp class Character { private: std::string name; @@ -349,7 +345,8 @@ public: return position; } }; -~~~ +``` + ::: :::: @@ -361,7 +358,7 @@ This can make your classes more intuitive to use. Now let's overload the `==` operator to compare two Character objects. We'll say that two characters are the same if they have the same name and healthPoints: -~~~ cpp +```cpp class Character { // ...existing code... 
@@ -369,11 +366,11 @@ class Character { return name == other.name && healthPoints == other.healthPoints; } }; -~~~ +``` You can now compare two `Character` objects like this: -~~~ cpp +```cpp Character character1("Hero", 100, Position(10.0, 20.0)); Character character2("Hero", 100, Position(30.0, 40.0)); if (character1 == character2) { @@ -381,7 +378,7 @@ if (character1 == character2) { } else { std::cout << "The characters are different.\n"; } -~~~ +``` ### Static Members @@ -391,7 +388,7 @@ They are declared with the keyword `static`. Let's add a static member to the Character class to keep track of how many characters have been created. Every time a new character is created, we'll increment this counter: -~~~ cpp +```cpp class Character { // ...existing code... @@ -400,40 +397,37 @@ class Character { public: Character(std::string name, int healthPoints, Position position) : name(name), healthPoints(healthPoints), position(position) { - characterCount++; // this line is new, and the counter is + characterCount++; // this line is new, and the counter is } static int getCharacterCount() { return characterCount; } }; -~~~ +``` In C++, a static data member is usually initialised outside the class body, typically in a source file. However, as of C++17, you can declare and initialize an inline static data member inside the class body. The inline keyword tells the compiler that the static member might be defined in multiple translation units (i.e., source files), but they all refer to the same member. - - ::::challenge{id="experiment-with-classes" title="Experiment with classes"} Add data or behaviour to these classes. :::: +## Key Points -## Key Points: - Object oriented programming is a programming paradigm based on the concept of classes, which encapsulate data and behaviour. - Classes allow us to organise data into distinct concepts. - By breaking down our data into classes, we can reason about the behaviour of parts of our data. 
-
## Full code sample for lesson

Here is working code for this lesson that defines the classes and then gives an example of how to use them.

You can also see this code in action, and play with it and run it, on [Compiler Explorer](https://gcc.godbolt.org/z/x7b38ba4e):

-~~~ cpp
+```cpp
#include <iostream>
#include <string>
#include <vector>
@@ -536,4 +530,4 @@ int main() {
  return 0;
}
-~~~
+```
diff --git a/software_architecture_and_design/object_orientated/index.md b/software_architecture_and_design/object_orientated/index.md
index 7c7ee09a..547944df 100644
--- a/software_architecture_and_design/object_orientated/index.md
+++ b/software_architecture_and_design/object_orientated/index.md
@@ -1,30 +1,28 @@
---
id: object_orientated
name: Object-Orientated Programming
-dependsOn: [
-  software_architecture_and_design.procedural,
-]
-files: [
+dependsOn: [software_architecture_and_design.procedural]
+files:
+  [
    classes.md,
    classes_cpp.md,
    inheritance_and_composition.md,
    inheritance_and_composition_cpp.md,
    polymorphism.md,
    polymorphism_cpp.md,
-]
+  ]
summary: |
-  The Object Oriented Paradigm builds upon the Procedural Paradigm, but builds code around data.
-  This course will introduce you to the basics of Object Oriented Programming in either Python or C++.
-attribution:
-  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
-    url: https://www.sabsr3.ox.ac.uk
-    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
+  The Object Oriented Paradigm builds upon the Procedural Paradigm, but builds code around data.
+ This course will introduce you to the basics of Object Oriented Programming in either Python or C++. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## The Object Oriented Paradigm @@ -40,4 +38,4 @@ Since all of the data representing a single object is now stored together, it ma We need some way to describe the template for how an object's data should be structured, which we call a **class**. So, a class is a template describing the structure of some collection of data, along with the code necessary to describe the behaviour of that data. -You may wish to think of the Object Oriented Paradigm as focussing on the **nouns** of a computation. \ No newline at end of file +You may wish to think of the Object Oriented Paradigm as focussing on the **nouns** of a computation. 
diff --git a/software_architecture_and_design/object_orientated/inheritance_and_composition.md b/software_architecture_and_design/object_orientated/inheritance_and_composition.md
index c972fc55..47c8c608 100644
--- a/software_architecture_and_design/object_orientated/inheritance_and_composition.md
+++ b/software_architecture_and_design/object_orientated/inheritance_and_composition.md
@@ -1,19 +1,20 @@
---
name: Inheritance and Composition
-dependsOn: [
-  software_architecture_and_design.object_orientated.classes,
-]
+dependsOn: [software_architecture_and_design.object_orientated.classes]
tags: [python]
-attribution:
-  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
-    url: https://www.sabsr3.ox.ac.uk
-    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
+learningOutcomes:
+  - Define composition in relation to a class.
+  - Define inheritance in relation to a class.
+  - Explain the difference between composition and inheritance.
+attribution:
+  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Relationships Between Classes @@ -33,12 +34,12 @@ That time, we used a function which converted temperatures in Celsius to Kelvin In the same way, in object oriented programming, we can make things components of other things. -We often use composition where we can say 'x *has a* y' - for example in our inflammation project, we might want to say that a doctor *has* patients or that a patient *has* observations. +We often use composition where we can say 'x _has a_ y' - for example in our inflammation project, we might want to say that a doctor _has_ patients or that a patient _has_ observations. In the case of our example, we're already saying that patients have observations, so we're already using composition here. We're currently implementing an observation as a dictionary with a known set of keys though, so maybe we should make an `Observation` class as well. -~~~ python +```python # file: inflammation/models.py class Observation: @@ -76,19 +77,19 @@ alice = Patient('Alice') obs = alice.add_observation(3) print(obs) -~~~ +``` -~~~ +```text 3 -~~~ +``` Now we're using a composition of two custom classes to describe the relationship between two types of entity in the system that we're modelling. ### Inheritance The other type of relationship used in object oriented programming is **inheritance**. -Inheritance is about data and behaviour shared by classes, because they have some shared identity - 'x *is a* y'. 
-If class `X` inherits from (*is a*) class `Y`, we say that `Y` is the **superclass** or **parent class** of `X`, or `X` is a **subclass** of `Y`. +Inheritance is about data and behaviour shared by classes, because they have some shared identity - 'x _is a_ y'. +If class `X` inherits from (_is a_) class `Y`, we say that `Y` is the **superclass** or **parent class** of `X`, or `X` is a **subclass** of `Y`. If we want to extend the previous example to also manage people who aren't patients we can add another class `Person`. But `Person` will share some data and behaviour with `Patient` - in this case both have a name and show that name when you print them. @@ -97,7 +98,7 @@ Since we expect all patients to be people (hopefully!), it makes sense to implem To write our class in Python, we used the `class` keyword, the name of the class, and then a block of the functions that belong to it. If the class **inherits** from another class, we include the parent class name in brackets. -~~~ python +```python # file: inflammation/models.py class Observation: @@ -145,14 +146,14 @@ print(bob) obs = bob.add_observation(4) print(obs) -~~~ +``` -~~~ +```text Alice 3 Bob AttributeError: 'Person' object has no attribute 'add_observation' -~~~ +``` As expected, an error is thrown because we cannot add an observation to `bob`, who is a Person but not a Patient. @@ -167,13 +168,12 @@ The order in which it does this search is known as the **method resolution order The line `super().__init__(name)` gets the parent class, then calls the `__init__` method, providing the `name` variable that `Person.__init__` requires. This is quite a common pattern, particularly for `__init__` methods, where we need to make sure an object is initialised as a valid `X`, before we can initialise it as a valid `Y` - e.g. a valid `Person` must have a name, before we can properly initialise a `Patient` model with their inflammation data. 
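The initialisation order described above can be condensed into a small standalone sketch, a trimmed-down version of the `Person` and `Patient` classes that keeps only the pieces relevant to `super().__init__`:

```python
class Person:
    """A person, which must always have a name."""

    def __init__(self, name):
        self.name = name


class Patient(Person):
    """A patient is a person with inflammation observations."""

    def __init__(self, name):
        # First initialise the Person part (a valid Person has a name)...
        super().__init__(name)
        # ...then add the Patient-specific data.
        self.observations = []


alice = Patient("Alice")
print(alice.name)          # set by Person.__init__
print(alice.observations)  # set by Patient.__init__
```

Running this shows that `alice` is a fully initialised `Person` (she has a `name`) before any `Patient`-specific state is attached.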
- ## Composition vs Inheritance When deciding how to implement a model of a particular system, you often have a choice of either composition or inheritance, where there is no obviously correct choice. -For example, it's not obvious whether a photocopier *is a* printer and *is a* scanner, or *has a* printer and *has a* scanner. +For example, it's not obvious whether a photocopier _is a_ printer and _is a_ scanner, or _has a_ printer and _has a_ scanner. -~~~ python +```python class Machine: pass @@ -186,9 +186,9 @@ class Scanner(Machine): class Copier(Printer, Scanner): # Copier `is a` Printer and `is a` Scanner pass -~~~ +``` -~~~ python +```python class Machine: pass @@ -203,7 +203,7 @@ class Copier(Machine): # Copier `has a` Printer and `has a` Scanner self.printer = Printer() self.scanner = Scanner() -~~~ +``` Both of these would be perfectly valid models and would work for most purposes. However, unless there's something about how you need to use the model which would benefit from using a model based on inheritance, it's usually recommended to opt for **composition over inheritance**. @@ -218,33 +218,33 @@ It exists in Python, but is often not present in other Object Oriented languages Although this might seem useful, like in our inheritance-based model of the photocopier above, it's best to avoid it unless you're sure it's the right thing to do, due to the complexity of the inheritance heirarchy. Often using multiple inheritance is a sign you should instead be using composition - again like the photocopier model above. - ::::challenge{id="a-model-patient" title="A Model Patient"} Above we gave an example of a `Patient` class which inherits from `Person`. Let's can start with extending the system such that there must be a `Doctor` class to hold the data representing a single doctor, which: - - must have a `name` attribute - - must have a list of patients that this doctor is responsible for. 
+ +- must have a `name` attribute +- must have a list of patients that this doctor is responsible for. In addition to these, try to think of an extra feature you could add to the models which would be useful for managing a dataset like this - imagine we're -running a clinical trial, what else might we want to know? Try using Test +running a clinical trial, what else might we want to know? Try using Test Driven Development for any features you add: write the tests first, then add the feature. Once you've finished the initial implementation, do you have much duplicated -code? Is there anywhere you could make better use of composition or inheritance +code? Is there anywhere you could make better use of composition or inheritance to improve your implementation? For any extra features you've added, explain them and how you implemented them -to your neighbour. Would they have implemented that feature in the same way? +to your neighbour. Would they have implemented that feature in the same way? :::solution -One example solution is shown below. You may start by writing some tests (that will initially fail), and then +One example solution is shown below. You may start by writing some tests (that will initially fail), and then develop the code to satisfy the new requirements and pass the tests. -~~~ python -# file: tests/test_patient.py -"""Tests for the Patient model.""" +```python +# file: tests/test_patient.py +"""Tests for the Patient model.""" def test_create_patient(): """Check a patient is created correctly given a name.""" @@ -288,11 +288,11 @@ def test_no_duplicate_patients(): alice = Patient("Alice") doc.add_patient(alice) doc.add_patient(alice) - assert len(doc.patients) == 1 + assert len(doc.patients) == 1 ... -~~~ +``` -~~~ python +```python # file: inflammation/models.py ... 
class Person: @@ -317,7 +317,7 @@ class Patient(Person): day = 0 new_observation = Observation(day, value) self.observations.append(new_observation) - return new_observation + return new_observation class Doctor(Person): """A doctor in an inflammation study.""" @@ -333,9 +333,11 @@ class Doctor(Person): return self.patients.append(new_patient) ... -~~~ +``` + ::: :::: -## Key Points: -- Relationships between concepts can be described using inheritance (*is a*) and composition (*has a*). \ No newline at end of file +## Key Points + +- Relationships between concepts can be described using inheritance (_is a_) and composition (_has a_). diff --git a/software_architecture_and_design/object_orientated/inheritance_and_composition_cpp.md b/software_architecture_and_design/object_orientated/inheritance_and_composition_cpp.md index b3d07b88..3a710baa 100644 --- a/software_architecture_and_design/object_orientated/inheritance_and_composition_cpp.md +++ b/software_architecture_and_design/object_orientated/inheritance_and_composition_cpp.md @@ -1,21 +1,18 @@ --- name: Inheritance and Composition -dependsOn: [ - software_architecture_and_design.object_orientated.classes_cpp, -] +dependsOn: [software_architecture_and_design.object_orientated.classes_cpp] tags: [cpp] -attribution: - - citation: > - This material was adapted from an "Introduction to C++" course developed by the - Oxford RSE group. - url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: > + This material was adapted from an "Introduction to C++" course developed by the + Oxford RSE group. 
+ url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Prerequisites @@ -23,11 +20,11 @@ attribution: The code blocks in this lesson will assume that some boilerplate C++ code is present. In particular, we will assume that the following headers are included: -~~~ cpp +```cpp #include <iostream> #include <string> #include <vector> -~~~ +``` We will also assume that you are using the C++17 language standard, or later. This will be the default with most modern compilers. @@ -52,10 +49,9 @@ That time, we used a function which converted temperatures in Celsius to Kelvin In the same way, in object oriented programming, we can make things components of other things. -We often use composition where we can say 'x *has a* y' - for example in our game, we might want to say that a character *has* an inventory, and that an inventory *has* items. - -In the case of our example, we're already saying that a character *has a* position, so we're already using composition here. +We often use composition where we can say 'x _has a_ y' - for example in our game, we might want to say that a character _has_ an inventory, and that an inventory _has_ items. +In the case of our example, we're already saying that a character _has a_ position, so we're already using composition here. :::::challenge{id=inventory title="Write an inventory"} @@ -68,7 +64,7 @@ Modify your `Character` class to contain an `Inventory` data member.
Here is an example of what that might look like: -~~~ cpp +```cpp class Inventory { private: std::vector<Item> items; @@ -114,44 +110,43 @@ public: return inventory.getItem(index); } }; -~~~ +``` + :::: ::::: We now have several examples of composition: -- Character *has a* position -- Item *has a* position -- Characher *has an* inventory -- Inventory *has many* items +- Character _has a_ position +- Item _has a_ position +- Character _has an_ inventory +- Inventory _has many_ items You can see how we can build quickly build up complex behaviours. -Now have a think: would it be simple to build this behavour without classes? +Now have a think: would it be simple to build this behaviour without classes? It would probably be very messy. - ### Inheritance The other type of relationship used in object oriented programming is **inheritance**. -Inheritance is about data and behaviour shared by classes, because they have some shared identity - 'x *is a* y'. +Inheritance is about data and behaviour shared by classes, because they have some shared identity - 'x _is a_ y'. For instance, we might have two types of character: warriors and mages. We can create two classes: `Warrior` and `Mage`. But, fundamentally, they are both characters and have common code such as an inventory and a position. We should not duplicate this code. -We achieve this through *inheritance*. -If class `Warrior` inherits from (*is a*) `Character`, we say that `Character` is the **base class**, **parent class**, or **superclass** of `Warrior`. +We achieve this through _inheritance_. +If class `Warrior` inherits from (_is a_) `Character`, we say that `Character` is the **base class**, **parent class**, or **superclass** of `Warrior`. We say that `Warrior` is a **derived class**, **child class**, or **subclass** of `Character`. - The base class provides a set of attributes and behaviors that the derived class can inherit. The derived class can then add or override these attributes and behaviors as needed.
This terminology is common across many object-oriented programming languages. A Warrior class may look something like this: -~~~ cpp +```cpp class Warrior : public Character { private: int strength; @@ -168,7 +163,7 @@ public: return strength; } }; -~~~ +``` Let's examine the syntax: @@ -180,13 +175,11 @@ Let's examine the syntax: 4. **Methods**: `void physicalAttack()` is a public method unique to `Warrior`. This could be an example of method overriding, if there was a `physicalAttack()` method in the `Character` class that we wanted to behave differently for `Warrior`. `int getStrength() const` is a getter method for `strength`. - Note: in this example, `Character(name, health, position, inventoryCapacity)` is the call to the base class constructor, which will be executed before the body of the `Warrior` constructor. After the base class constructor has been called, the `Warrior` constructor will continue with its own initialisation, setting the value of `strength` in this case. This sequence ensures that the base class portion of the `Warrior` object is properly constructed before the `Warrior` constructor attempts to use it or modify it. This is a fundamental feature of how constructors and inheritance work together in C++. - :::::challenge{id=mage title="Write a Mage class"} Write a class called `Mage` that inherits from `Character`, and give it some unique data and behaviour. @@ -195,7 +188,7 @@ Write a class called `Mage` that inherits from `Character`, and give it some uni Here is an example of what that might look like: -~~~ cpp +```cpp class Mage : public Character { private: int manaPoints; @@ -212,15 +205,15 @@ public: return manaPoints; } }; -~~~ +``` + :::: ::::: - ## Composition vs Inheritance When deciding how to implement a model of a particular system, you often have a choice of either composition or inheritance, where there is no obviously correct choice. 
-For example, it's not obvious whether a photocopier *is a* printer and *is a* scanner, or *has a* printer and *has a* scanner. +For example, it's not obvious whether a photocopier _is a_ printer and _is a_ scanner, or _has a_ printer and _has a_ scanner. Both of these would be perfectly valid models and would work for most purposes. However, unless there's something about how you need to use the model which would benefit from using a model based on inheritance, it's usually recommended to opt for **composition over inheritance**. @@ -231,7 +224,6 @@ Composition, on the other hand, tends to offer greater flexibility. It allows you to change behavior on the fly by changing the component at runtime and leads to a more decoupled system, which is easier to maintain and evolve. The downside can be that it might result in a little more boilerplate code as you delegate methods to the component classes. - :::::challenge{id=swords-and-shields title="Swords and Shields"} Swords and shields are types of `Item`. @@ -243,7 +235,7 @@ Update your code to reflect this, and identify the inheritance and composition n Here is an example of what that might look like: -~~~ cpp +```cpp class Sword : public Item { private: int damage; @@ -312,11 +304,11 @@ public: return equippedSword; } }; -~~~ +``` Then we can use that functionality like this: -~~~ cpp +```cpp Sword sword("Excalibur", 10); Shield shield("Aegis", 5); @@ -345,21 +337,21 @@ if (mage.getEquippedSword()) { } return 0; -~~~ +``` :::: ::::: -## Key Points: -- Relationships between concepts can be described using inheritance (*is a*) and composition (*has a*). +## Key Points +- Relationships between concepts can be described using inheritance (_is a_) and composition (_has a_). ## Full code sample for lession -Here is working code for this lession that defines the classes and then gives an example of how to use them. +Here is working code for this lesson that defines the classes and then gives an example of how to use them. 
You can also see this code in action, and play with it and run it, on [Compiler Explorer](https://gcc.godbolt.org/z/K51dPz1os): -~~~ cpp +```cpp #include <iostream> #include <string> #include <vector> @@ -538,4 +530,4 @@ int main() { return 0; } -~~~ +``` diff --git a/software_architecture_and_design/object_orientated/polymorphism.md b/software_architecture_and_design/object_orientated/polymorphism.md index 0af432a1..5ef079a4 100644 --- a/software_architecture_and_design/object_orientated/polymorphism.md +++ b/software_architecture_and_design/object_orientated/polymorphism.md @@ -1,19 +1,19 @@ --- name: Polymorphism -dependsOn: [ - software_architecture_and_design.object_orientated.inheritance_and_composition, -] +dependsOn: [software_architecture_and_design.object_orientated.inheritance_and_composition] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +learningOutcomes: + - Define polymorphism. + - Apply polymorphism principles to class design. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Class-based polymorphism @@ -34,7 +34,7 @@ want to be able to: We can implement this in our classes like so: -~~~ python +```python ... class Person: """A person.""" @@ -46,7 +46,7 @@ class Person: return self.name def set_id(self, id): - raise NotImplementedError('set_id not implemented') + raise NotImplementedError('set_id not implemented') def get_id(self): return self.id @@ -74,7 +74,7 @@ class Doctor(Person): def set_id(self, id): self.id = 'D' + str(id).zfill(4) ... -~~~ +``` Here we have defined the **interface** for our `Person` class, which is that there should be a `set_id` method. We have also defined the `__str__` method, @@ -104,12 +104,11 @@ print(alice) print(bob) ``` -``` +```text Doctor: Alice Patient: Bob ``` - We can also store collections of different types of people in a single list: ```python @@ -118,7 +117,7 @@ for person in people: print(person) ``` -``` +```text Doctor: Alice Patient: Bob ``` @@ -131,7 +130,6 @@ if an object has the right methods, it can be treated as if it is of a particular type. Using our example above, if an object has a `set_id` and `__str__` method, it can be treated as if it is a `Person` object. 
For example, - ```python class Administrator: """An administrator in an inflammation study.""" @@ -213,7 +211,7 @@ class Trial: ::: :::: -## Key Points: +## Key Points + - Class-based Polymorphism in programming languages allows objects of different classes to be treated as if they were the same type - Python uses duck typing to allow polymorphism in a flexible way, "if it looks like a duck and quacks like a duck, it must be a duck" - diff --git a/software_architecture_and_design/object_orientated/polymorphism_cpp.md b/software_architecture_and_design/object_orientated/polymorphism_cpp.md index 193cbdf0..850c7958 100644 --- a/software_architecture_and_design/object_orientated/polymorphism_cpp.md +++ b/software_architecture_and_design/object_orientated/polymorphism_cpp.md @@ -1,21 +1,18 @@ --- name: Polymorphism -dependsOn: [ - software_architecture_and_design.object_orientated.inheritance_and_composition_cpp, -] +dependsOn: [software_architecture_and_design.object_orientated.inheritance_and_composition_cpp] tags: [cpp] -attribution: - - citation: > - This material was adapted from an "Introduction to C++" course developed by the - Oxford RSE group. - url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: > + This material was adapted from an "Introduction to C++" course developed by the + Oxford RSE group. 
+ url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Prerequisites @@ -23,11 +20,11 @@ attribution: The code blocks in this lesson will assume that some boilerplate C++ code is present. In particular, we will assume that the following headers are included: -~~~ cpp +```cpp #include <iostream> #include <memory> #include <vector> -~~~ +``` We will also assume that you are using the C++17 language standard, or later. This will be the default with most modern compilers. @@ -52,20 +49,19 @@ In C++ we archive this with **method overriding**: For this lesson we'll simplify the overall example, but feel free to modify your more extensive classes: -~~~ cpp +```cpp class Character { public: virtual void performAttack() const { // Default implementation } }; -~~~ +``` Here, the **virtual** keyword indicates that this function can be overridden in derived classes. We can then add the `performAttack()` method to the derived classes: - -~~~ cpp +```cpp class Warrior : public Character { public: void performAttack() const override { @@ -79,9 +75,9 @@ public: std::cout << "Mage casts a spell!" << std::endl; } }; -~~~ +``` -Notice that the **virtual** keyword is onpy present in the base class. +Notice that the **virtual** keyword is only present in the base class. The **override** keyword indicates that the function is intended to override a virtual function from the base class.
It is not mandatory to add the **override** keyword, but it is considered best practice for the following reasons: @@ -90,11 +86,10 @@ It is not mandatory to add the **override** keyword, but it is considered best p - **Detecting Errors at Compilation**: When you use the `override` keyword, the compiler performs a check to ensure that the function being declared in the derived class is indeed overriding a virtual function from the base class. It helps detect errors, such as misspelled function names or accidental deviations from the base class function signature. If the function in the derived class does not match any base class virtual function, a compilation error is generated, alerting you to the mistake. - We can use this new code in many ways, but in general we will need a pointer or reference to the base class. Here's an example which we will then break down: -~~~ cpp +```cpp std::vector<std::unique_ptr<Character>> characters; characters.push_back(std::make_unique<Warrior>()); characters.push_back(std::make_unique<Mage>()); @@ -102,13 +97,12 @@ for (const auto& character : characters) { character->performAttack(); } -~~~ +``` -~~~ +```text Warrior attacks! Mage casts a spell! -~~~ - +``` - `std::vector<std::unique_ptr<Character>> characters;`: This declares a vector named `characters` that holds [`std::unique_ptr` smart pointers](https://en.cppreference.com/w/cpp/memory/unique_ptr) to `Character` objects. The use of `std::unique_ptr` ensures that the ownership and memory management of the objects in the vector are handled automatically. @@ -123,7 +117,6 @@ Mage casts a spell! During each iteration of the loop, the `performAttack()` function is called on each `Character` object, including both `Warrior` and `Mage` objects. Polymorphism comes into play here, as the virtual `performAttack()` function is called on each object, and the appropriate overridden implementation in the derived class is executed based on the actual object type.
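For comparison with the Python version of this lesson, the same run-time dispatch can be sketched in a few lines of Python (illustrative only; the class and method names mirror the C++ example above, and the expected output matches it):

```python
# Illustrative sketch: the same dynamic dispatch, written in Python.
class Character:
    def perform_attack(self):
        raise NotImplementedError("perform_attack not implemented")


class Warrior(Character):
    def perform_attack(self):
        return "Warrior attacks!"


class Mage(Character):
    def perform_attack(self):
        return "Mage casts a spell!"


characters = [Warrior(), Mage()]
for character in characters:
    # The method that runs depends on the actual type of each object.
    print(character.perform_attack())
```

In Python every method behaves like a C++ `virtual` function, so no extra keyword is needed; the base-class method raising `NotImplementedError` plays the role the default implementation plays in the C++ version.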
- ## Abstract classes Sometimes we want a base class to define a structure that is common to all derived classes, but we don't want to be able to directly instantiate that object. @@ -131,7 +124,7 @@ In our example, it may be that we can never have a character that is not either In this case, we would like `Character` to become an abstract class. An abstract class cannot be instantiated directly, and it is meant to serve as a base for derived classes by providing an interface that derived classes must implement. -A class becomes abstract if it has at least one *pure virtual function*, that is, a virtual function that does not have an implementaiton. +A class becomes abstract if it has at least one _pure virtual function_, that is, a virtual function that does not have an implementation. 1. **Pure Virtual Function**: The `Character` class would have at least one pure virtual function, declared as follows: @@ -195,21 +188,22 @@ virtual ~Character() = default; ``` - In C++, when an object is deleted through a pointer to a base class type, the destructor of the base class is called, but not the derived class destructors. -This can lead to a problem known as *slicing*, where only the base class portion of the object is destroyed, resulting in a potential resource leak or undefined behavior. + This can lead to a problem known as _slicing_, where only the base class portion of the object is destroyed, resulting in a potential resource leak or undefined behavior. - When deleting an object through a base class pointer or reference, the derived class destructor is also called, ensuring that the derived class's resources are properly released. - In the given example, although the `Character` class does not contain any member variables that need explicit cleanup, adding a virtual destructor is a good practice for future-proofing the code. 
If derived classes add their own resources or dynamically allocated memory, the virtual destructor will ensure proper destruction of those resources when deleting derived class objects through base class pointers. - Therefore, when making a class abstract and intended to be used as a base class, it is generally advisable to include a virtual destructor in the base class, even if it has no explicit cleanup to perform. -## Key Points: +## Key Points + - Class-based Polymorphism in programming languages allows objects of different classes to be treated as if they were the same type. -- Classes can be made abstract by providing at least one pure virtual function, but you should remember the virual destructor, too. +- Classes can be made abstract by providing at least one pure virtual function, but you should remember the virtual destructor, too. ## Full code sample for lession -Here is working code for this lession that defines the classes and then gives an example of how to use them. +Here is working code for this lesson that defines the classes and then gives an example of how to use them. You can also see this code in action, and play with it and run it, on [Compiler Explorer](https://gcc.godbolt.org/z/KoaoET9v9): -~~~ cpp +```cpp #include <iostream> #include <memory> #include <vector> @@ -246,4 +240,4 @@ int main() { return 0; } -~~~ +``` diff --git a/software_architecture_and_design/procedural/arrays_python.md b/software_architecture_and_design/procedural/arrays_python.md index 9fee7179..8dfeaa01 100644 --- a/software_architecture_and_design/procedural/arrays_python.md +++ b/software_architecture_and_design/procedural/arrays_python.md @@ -1,19 +1,16 @@ --- name: Arrays -dependsOn: [ - software_architecture_and_design.procedural.functions_python, -] +dependsOn: [software_architecture_and_design.procedural.functions_python] tags: [python] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- NumPy is a widely-used Python library for numerical computing that provides @@ -34,18 +31,18 @@ install NumPy in. For example, say we wish to create a new "learning numpy" project, and create a new environment within this project. We can type the following: -~~~bash +```bash mkdir learning_numpy cd learning_numpy python3 -m venv venv source venv/bin/activate -~~~ +``` Then we can use `pip` to install `numpy`: -~~~bash +```bash pip install numpy -~~~ +``` This reflects a typical way of working with virtual environments: a new project comes along, we create a new virtual environment in a location close to where @@ -54,61 +51,61 @@ virtual environment and install the packages we'll need for the work. 
## NumPy Arrays vs Python Lists -NumPy's array type represents a multidimensional array or tensor *Mi,j,k...n* +NumPy's array type represents a multidimensional array or tensor **M**~i,j,k...n~ The NumPy array seems at first to be just like a list: -~~~python +```python import numpy as np my_array = np.array(range(5)) my_array -~~~ +``` Note here we are importing the NumPy module as `np`, an established convention for using NumPy which means we can refer to NumPy using `np.` instead of the slightly more laborious `numpy.`. -~~~ +```text array([0, 1, 2, 3, 4]) -~~~ +``` -Ok, so they *look* like a list. +Ok, so they _look_ like a list. -~~~python +```python my_array[2] -~~~ +``` -~~~ +```text 2 -~~~ +``` We can access them like a list. We can also access NumPy arrays as a collection in `for` loops: -~~~python +```python for element in my_array: print("Hello" * element) -~~~ +``` -~~~ +```text Hello HelloHello HelloHelloHello HelloHelloHelloHello -~~~ +``` However, we can also see our first weakness of NumPy arrays versus Python lists: -~~~python +```python my_array.append(4) -~~~ +``` -~~~ +```text Traceback (most recent call last): File "", line 1, in AttributeError: 'numpy.ndarray' object has no attribute 'append' -~~~ +``` For NumPy arrays, you typically don't change the data size once you've defined your array, whereas for Python lists, you can do this efficiently. Also, NumPy @@ -119,36 +116,35 @@ return... One great advantage of NumPy arrays is that most operations can be applied element-wise automatically, and in a very Pythonic way! -~~~python +```python my_array + 2 -~~~ +``` `+` in this context is an elementwise operation performed on all the matrix elements, and gives us: -~~~ +```text array([2, 3, 4, 5, 6]) -~~~ +``` ::::challenge{id=elementwise-operations Title="Other elementwise operations"} Try using `-`, `*`, `/` in the above statement instead. Do they do what you expect? 
-
:::solution

-~~~python
+```python
my_array - 2
my_array * 2
my_array / 2
-~~~
+```

Will yield the following respectively:

-~~~
+```text
array([-2, -1, 0, 1, 2])
array([0, 2, 4, 6, 8])
array([0. , 0.5, 1. , 1.5, 2. ])
-~~~
+```

Note the final one with `/` - digits after the `.` are omitted if they don't
show anything interesting (i.e. they are zero).
@@ -162,18 +158,18 @@ performance of NumPy over Python lists.

First, using Python lists we can do the following, that creates a 2D list of
size 10000x10000, sets all elements to zero, then adds 10 to all those elements:

-~~~python
+```python
nested_list = [[0 for _ in range(10000)] for _ in range(10000)]
nested_list = [[x+10 for x in column] for column in nested_list]
-~~~
+```

That took a while! In NumPy we replicate this by doing:

-~~~python
+```python
import numpy as np
array = np.zeros((10000, 10000))
array = array + 10
-~~~
+```

Here, we import the NumPy library, use a specialised function to set up a NumPy
array of size 10000x10000 with elements set to zero, and then - in a very Pythonic
@@ -188,60 +184,62 @@ inflammation in patients who have been given a new treatment for arthritis.

Let's download this dataset now. First, create a new directory inflammation and
`cd` to it:

-~~~bash
-$ mkdir inflammation
-$ cd inflammation
-~~~
+```bash
+mkdir inflammation
+cd inflammation
+```

If on WSL or Linux (e.g. 
Ubuntu or the Ubuntu VM), then do: -~~~bash -$ wget https://www.uhpc-training.co.uk/material/software_architecture_and_design/procedural/inflammation/inflammation.zip -~~~ +```bash +wget https://www.uhpc-training.co.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip +``` Or, if on a Mac, do: -~~~bash -$ curl -O https://www.uhpc-training.co.uk/material/software_architecture_and_design/procedural/inflammation/inflammation.zip -~~~ +```bash +curl -O https://www.uhpc-training.co.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip +``` Once done, you can unzip this file using the `unzip` command in Bash, which will unpack all the files in this zip archive into the current directory: -~~~bash -$ unzip inflammation.zip -~~~ +```bash +unzip inflammation.zip +``` -This zip file contains some code as well as the datasets we need stored in the `data` directory (which is what we're +This zip file contains some code as well as the datasets we need stored in the `data` directory (which is what we're interested in). -~~~bash -$ cd data -~~~ +```bash +cd data +``` :::callout + ## What Does the Patient Inflammation Data Contain? Each dataset records inflammation measurements from a separate clinical trial of the drug, and each dataset contains information for 60 patients, who had their inflammation levels recorded for 40 days whilst participating in the trial. 
![Snapshot of the inflammation dataset](fig/inflammation-study-pipeline.png)
-*Inflammation study pipeline from the [Software Carpentry Python novice lesson](https://swcarpentry.github.io/python-novice-inflammation/fig/lesson-overview.svg)*
+_Inflammation study pipeline from the [Software Carpentry Python novice lesson](https://swcarpentry.github.io/python-novice-inflammation/fig/lesson-overview.svg)_

Each of the data files uses the popular
[comma-separated (CSV) format](https://en.wikipedia.org/wiki/Comma-separated_values)
to represent the data, where:

- Each row holds inflammation measurements for a single patient,
- Each column represents a successive day in the trial,
- Each cell represents an inflammation reading on a given day for a patient
  (in some arbitrary units of inflammation measurement).
+
:::

We can first use NumPy to load our dataset into a Python variable:

-~~~python
+```python
data = np.loadtxt(fname='../data/inflammation-01.csv', delimiter=',')
data
-~~~
+```

-~~~
+```text
array([[0., 0., 1., ..., 3., 0., 0.],
[0., 1., 2., ..., 1., 0., 1.],
[0., 1., 1., ..., 2., 1., 1.],
@@ -249,11 +247,12 @@ array([[0., 0., 1., ..., 3., 0., 0.],
[0., 1., 1., ..., 1., 1., 1.],
[0., 0., 0., ..., 0., 2., 0.],
[0., 0., 1., ..., 1., 1., 0.]])
-~~~
+```

So, the data in this case has 60 rows (one for each patient) and 40 columns (one
for each day) as we would expect. Each cell in the data represents an
inflammation reading on a given day for a patient. So this shows the results of
measuring the inflammation of 60 patients over a 40 day period.

-:::callout
+:::callout{variant="info"}
+
## In the Corner

What may also surprise you is that when Python displays an array,
@@ -269,64 +268,64 @@ which can be confusing when plotting data. 
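The amount of summarisation in NumPy's printed output is configurable. A sketch using NumPy's real `np.set_printoptions` function (the synthetic zero array here stands in for the inflammation data, so the example is self-contained):

```python
import numpy as np

# a stand-in array, larger than the default 1000-element print threshold
big = np.zeros((60, 40))

# by default, NumPy summarises large arrays with "..." in the corners
print(big)

# raising the threshold asks NumPy to print every element instead
np.set_printoptions(threshold=np.inf)
print(big)

# restore the default summarisation threshold
np.set_printoptions(threshold=1000)
```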
Let's ask what type of thing `data` refers to:

-~~~python
+```python
print(type(data))
-~~~
+```

-~~~
+```text
<class 'numpy.ndarray'>
-~~~
+```

The output tells us that `data` currently refers to
an N-dimensional array, the functionality for which is provided by the NumPy
library.

:::callout
+
## Data Type

A NumPy array contains one or more elements of the same type. The `type`
function will only tell you that a variable is a NumPy array but won't tell you
the type of thing inside the array. We can find out the type of the data
contained in the NumPy array.

-~~~python
+```python
print(data.dtype)
-~~~
+```

-~~~
+```text
float64
-~~~
+```

This tells us that the NumPy array's elements are 64-bit floating-point
numbers.
:::

-With the following command, we can see the array's *shape*:
+With the following command, we can see the array's _shape_:

-~~~python
+```python
print(data.shape)
-~~~
+```

-~~~
+```text
(60, 40)
-~~~
+```

The output tells us that the `data` array variable contains 60 rows and 40
columns, as we would expect.

We can also access specific elements of our 2D array (such as the first value) like this:

-~~~python
+```python
data[0, 0]
-~~~
+```

-~~~
+```text
0.0
-~~~
+```

Or the value in the middle of the dataset:

-~~~python
+```python
data[30, 20]
-~~~
+```

-~~~
+```text
13.0
-~~~
+```
-

### Slicing our Inflammation Data

@@ -334,16 +333,16 @@ An index like `[30, 20]` selects a single element of an array, but similar to ho
For example, we can select the first ten days (columns) of values for the first
four patients (rows) like this:

-~~~python
+```python
data[0:4, 0:10]
-~~~
+```

-~~~
+```text
array([[0., 0., 1., 3., 1., 2., 4., 7., 8., 3.],
[0., 1., 2., 1., 2., 1., 3., 2., 2., 6.],
[0., 1., 1., 3., 3., 2., 6., 2., 5., 9.],
[0., 0., 2., 0., 4., 2., 2., 1., 6., 7.]])
-~~~
+```

So here `0:4` means, "Start at index 0 and go up to, but not including, index
4." Again, the up-to-but-not-including takes a bit of getting used to, but the
@@ -352,105 +351,107 @@ values in the slice. 
And as we saw with lists, we don't have to start slices at 0: -~~~python +```python data[5:10, 0:10] -~~~ +``` Which will show us data from patients 5-9 (rows) across the first 10 days (columns): -~~~ +```text array([[0., 0., 1., 2., 2., 4., 2., 1., 6., 4.], [0., 0., 2., 2., 4., 2., 2., 5., 5., 8.], [0., 0., 1., 2., 3., 1., 2., 3., 5., 3.], [0., 0., 0., 3., 1., 5., 6., 5., 5., 8.], [0., 1., 1., 2., 1., 3., 5., 3., 5., 8.]]) -~~~ +``` We also don't have to include the upper and lower bound on the slice. If we don't include the lower bound, Python uses 0 by default; if we don't include the upper, the slice runs to the end of the axis, and if we don't include either (i.e., if we just use ':' on its own), the slice includes everything: -~~~python +```python small = data[:3, 36:] small -~~~ +``` The above example selects rows 0 through 2 and columns 36 through to the end of the array: -~~~ +```text array([[2., 3., 0., 0.], [1., 1., 0., 1.], [2., 2., 1., 1.]]) -~~~ +``` :::callout + ## Numpy Memory Numpy memory management can be tricksy: -~~~python +```python x = np.arange(5) y = x[:] -~~~ +``` -~~~python +```python y[2] = 0 x -~~~ +``` -~~~ +```text array([0, 1, 0, 3, 4]) -~~~ +``` It does not behave like lists! -~~~python +```python x = list(range(5)) y = x[:] -~~~ +``` -~~~python +```python y[2] = 0 x -~~~ +``` -~~~ +```text [0, 1, 2, 3, 4] -~~~ +``` We can use `np.copy()` to force the use of separate memory and actually copy the -values. Otherwise NumPy tries its hardest to make slices be *views* on data, +values. Otherwise NumPy tries its hardest to make slices be _views_ on data, referencing existing values and not copying them. 
So an example using `np.copy()`: -~~~python +```python x = np.arange(5) y = np.copy(x) y[2] = 0 x -~~~ +``` -~~~ +```text array([0, 1, 2, 3, 4]) -~~~ +``` + ::: ### Elementwise Operations on Multiple Arrays As we've seen, arrays also know how to perform common mathematical operations on their values element-by-element: -~~~python +```python doubledata = data * 2.0 -~~~ +``` Will create a new array `doubledata`, each element of which is twice the value of the corresponding element in `data`: -~~~python +```python print('original:') data[:3, 36:] print('doubledata:') doubledata[:3, 36:] -~~~ +``` -~~~ +```text original: array([[2., 3., 0., 0.], [1., 1., 0., 1.], @@ -459,37 +460,37 @@ doubledata: array([[4., 6., 0., 0.], [2., 2., 0., 2.], [4., 4., 2., 2.]]) -~~~ +``` If, instead of taking an array and doing arithmetic with a single value (as above), you did the arithmetic operation with another array of the same shape, the operation will be done on corresponding elements of the two arrays: -~~~python +```python tripledata = doubledata + data -~~~ +``` Will give you an array where `tripledata[0,0]` will equal `doubledata[0,0]` plus `data[0,0]`, and so on for all other elements of the arrays. -~~~python +```python print('tripledata:') print(tripledata[:3, 36:]) -~~~ +``` -~~~ +```text tripledata: array([[6., 9., 0., 0.], [3., 3., 0., 3.], [6., 6., 3., 3.]]) -~~~ +``` ::::challenge{id=stacking-arrays title="Stacking Arrays"} Arrays can be concatenated and stacked on top of one another, using NumPy's `vstack` and `hstack` functions for vertical and horizontal stacking, respectively. 
-~~~python +```python import numpy as np A = np.array([[1,2,3], [4,5,6], [7,8,9]]) @@ -503,9 +504,9 @@ print(B) C = np.vstack([A, A]) print('C = ') print(C) -~~~ +``` -~~~ +```text A = [[1 2 3] [4 5 6] @@ -521,7 +522,7 @@ C = [1 2 3] [4 5 6] [7 8 9]] -~~~ +``` Write some additional code that slices the first and last columns of our inflammation `data` array, and stacks them into a 60x2 array, to give us data from the first and last days of our trial across all patients. @@ -536,20 +537,21 @@ the index itself can be a slice or array. For example, `data[:, :1]` returns a two dimensional array with one singleton dimension (i.e. a column vector). -~~~python +```python D = np.hstack([data[:, :1], data[:, -1:]]) print('D = ') print(D) -~~~ +``` -~~~ +```text D = [[0. 0.] [0. 1.] ... [0. 0.] [0. 0.]] -~~~ +``` + ::: :::: @@ -557,34 +559,34 @@ D = You can also do [dot products](https://en.wikipedia.org/wiki/Dot_product) of NumPy arrays: -~~~python +```python a = np.array([[1, 2], [3, 4]]) b = np.array([[5, 6], [7, 8]]) np.dot(a, b) -~~~ +``` -~~~ +```text array([[19, 22], [43, 50]]) -~~~ - +``` ### More Complex Operations Often, we want to do more than add, subtract, multiply, and divide array elements. NumPy knows how to do more complex operations, too. If we want to find the average inflammation for all patients on all days, for example, we can ask NumPy to compute `data`'s mean value: -~~~python +```python print(np.mean(data)) -~~~ +``` -~~~ +```text 6.14875 -~~~ +``` `mean` is a function that takes an array as an argument. :::callout + ## Not All Functions Have Input Generally, a function uses inputs to produce outputs. @@ -592,15 +594,17 @@ However, some functions produce outputs without needing any input. For example, checking the current time doesn't require any input. 
-~~~
+```python
import time
print(time.ctime())
-~~~
+```
-{: .language-python}

-~~~
+```text
Fri Sep 30 14:52:40 2022
-~~~
+```
-{: .output}

For functions that don't take in any arguments,
@@ -614,34 +618,34 @@ Let's use three of those functions to get some descriptive
values about the dataset. We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.

-~~~python
+```python
maxval, minval, stdval = np.max(data), np.min(data), np.std(data)

print('max inflammation:', maxval)
print('min inflammation:', minval)
print('std deviation:', stdval)
-~~~
+```

Here we've assigned the return value from `np.max(data)` to the variable
`maxval`, the value from `np.min(data)` to `minval`, and so on.

-~~~
+```text
max inflammation: 20.0
min inflammation: 0.0
std deviation: 4.613833197118566
-~~~
+```

When analyzing data, though, we often want to look at variations in statistical
values, such as the maximum inflammation per patient or the average inflammation
per day. One way to do this is to create a new temporary array of the data we
want, then ask it to do the calculation:

-~~~python
+```python
np.max(data[0, :])
-~~~
+```

So here, we're looking at the maximum inflammation across all days for the first
patient, which is

-~~~
+```text
18.0
-~~~
+```

What if we need the maximum inflammation for each patient over all days (as in
the next diagram on the left) or the average for each day (as in the diagram on
@@ -652,11 +656,11 @@ an axis:

To support this functionality, most array functions allow us to specify the axis
we want to work on. If we ask for the average across axis 0 (rows in our 2D
example), we get:

-~~~python
+```python
print(np.mean(data, axis=0))
-~~~
+```

-~~~
+```text
[ 0. 0.45 1.11666667 1.75 2.43333333 3.15 3.8 3.88333333
5.23333333 5.51666667 5.95 5.9 8.35 7.73333333 8.36666667 9.5
9.58333333
@@ -665,66 +669,66 @@ print(np.mean(data, axis=0))
7.33333333 6.58333333 6.06666667 5.95 5.11666667 3.6 3.3
3.56666667 2.48333333 1.5 1.13333333 0.56666667]
-~~~
+```

As a quick check, we can ask this array what its shape is:

-~~~python
+```python
print(np.mean(data, axis=0).shape)
-~~~
+```

-~~~
+```text
(40,)
-~~~
+```

The expression `(40,)` tells us we have a one-dimensional array of 40 values, so this is
the average inflammation per day for all patients. If we average across axis 1
(columns in our 2D example), we get:

-~~~python
+```python
patients_avg = np.mean(data, axis=1)
patients_avg
-~~~
+```

-~~~
+```text
[ 5.45 5.425 6.1 5.9 5.55 6.225 5.975 6.65 6.625
6.525 6.775 5.8 6.225 5.75 5.225 6.3 6.55 5.7
5.85 6.55 5.775 5.825 6.175 6.1 5.8 6.425 6.05
6.025 6.175 6.55 6.175 6.35 6.725 6.125 7.075 5.725
5.925 6.15 6.075 5.75 5.975 5.725 6.3 5.9 6.75
5.925 7.225 6.15 5.95 6.275 5.7 6.1 6.825 5.975
6.725 5.7 6.25 6.4 7.05 5.9 ]
-~~~
+```

Which is the average inflammation per patient across all days.

::::challenge{id=change-in-inflammation title="Change in Inflammation"}

This patient data is _longitudinal_ in the sense that each row represents a
-series of observations relating to one individual. This means that
+series of observations relating to one individual. This means that
the change in inflammation over time is a meaningful concept.

The `np.diff()` function takes a NumPy array and returns the
-differences between two successive values along a specified axis. For
+differences between two successive values along a specified axis. For
example, a NumPy array that looks like this:

-~~~python
+```python
npdiff = np.array([ 0, 2, 5, 9, 14])
-~~~
+```

Calling `np.diff(npdiff)` would do the following calculations and
put the answers in another array. 
-~~~python
+```python
[ 2 - 0, 5 - 2, 9 - 5, 14 - 9 ]
-~~~
+```

-~~~python
+```python
np.diff(npdiff)
-~~~
+```

-~~~python
+```text
array([2, 3, 4, 5])
-~~~
+```

Which axis would it make sense to use this function along?

@@ -734,9 +738,10 @@ difference between two arbitrary patients.

The column axis (1) is in days, so the
difference is the change in inflammation -- a meaningful concept.

-~~~python
+```python
np.diff(data, axis=1)
-~~~
+```
+
:::

If the shape of an individual data file is `(60, 40)` (60 rows and 40
@@ -753,23 +758,23 @@ it matter if the change in inflammation is an increase or a decrease?

:::solution

By using the `np.max()` function after you apply the `np.diff()`
-function, you will get the largest difference between days. We can *functionally
-compose* these together.
+function, you will get the largest difference between days. We can _functionally
+compose_ these together.

-~~~python
+```python
np.max(np.diff(data, axis=1), axis=1)
-~~~
+```

-~~~python
+```text
array([ 7., 12., 11., 10., 11., 13., 10., 8., 10., 10., 7.,
7., 13., 7., 10., 10., 8., 10., 9., 10., 13., 7.,
12., 9., 12., 11., 10., 10., 7., 10., 11., 10., 8.,
11., 12., 10., 9., 10., 13., 10., 7., 7., 10., 13.,
12., 8., 8., 10., 10., 9., 8., 13., 10., 7., 10.,
8., 12., 10., 7., 12.])
-~~~
+```

-If inflammation values *decrease* along an axis, then the difference from
+If inflammation values _decrease_ along an axis, then the difference from
one element to the next will be negative. If you are interested in the
**magnitude** of the change and not the
direction, the `np.absolute()` function will provide that.

Notice the difference if you get the largest _absolute_ difference
between readings. 
-~~~python
+```python
np.max(np.absolute(np.diff(data, axis=1)), axis=1)
-~~~
+```

-~~~python
+```text
array([ 12., 14., 11., 13., 11., 13., 10., 12., 10., 10., 10.,
12., 13., 10., 11., 10., 12., 13., 9., 10., 13., 9.,
12., 9., 12., 11., 10., 13., 9., 13., 11., 11., 8.,
11., 12., 13., 9., 10., 13., 11., 11., 13., 11., 13.,
13., 10., 9., 10., 10., 9., 9., 13., 10., 9., 10.,
11., 13., 10., 10., 12.])
-~~~
+```
+
:::
::::

@@ -798,63 +804,63 @@ This is another really powerful feature of NumPy, and covers a 'special case' of

By default, array operations are element-by-element:

-~~~python
+```python
np.arange(5) * np.arange(5)
-~~~
+```

-~~~
+```text
array([ 0, 1, 4, 9, 16])
-~~~
+```

If we multiply arrays with non-matching shapes we get an error:

-~~~python
+```python
np.arange(5) * np.arange(6)
-~~~
+```

-~~~
+```text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (5,) (6,)
-~~~
+```

Or with a multi-dimensional array:

-~~~python
+```python
np.zeros([2,3]) * np.zeros([2,4])
-~~~
+```

-~~~
+```text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (2,3) (2,4)
-~~~
+```

Arrays must match in all dimensions in order to be compatible:

-~~~python
+```python
np.ones([3, 3]) * np.ones([3, 3]) # Note elementwise multiply, *not* matrix multiply.
-~~~
+```

-~~~
+```text
array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
-~~~
+```

-**Except**, that if one array has any Dimension size of 1, then the data is
-*automatically REPEATED to match the other dimension. This is known as
-**broadcasting*.
+**Except** that if one array has any dimension of size 1, then the data is
+_automatically repeated_ to match the other dimension. This is known as
+**broadcasting**.
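Before applying broadcasting to the inflammation data, here is a minimal sketch of the rule on its own: two arrays whose shapes differ only where one of them has size 1 are stretched to a common shape.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)  # shape (3, 1)
row = np.arange(4).reshape(1, 4)  # shape (1, 4)

# the size-1 dimensions are repeated: (3, 1) * (1, 4) -> (3, 4)
table = col * row
print(table.shape)
print(table)
```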
So, let's consider a subset of our inflammation data (just so we can easily see
what's going on):

-~~~python
+```python
subset = data[:10, :10]
subset
-~~~
+```

-~~~
+```text
array([[0., 0., 1., 3., 1., 2., 4., 7., 8., 3.],
[0., 1., 2., 1., 2., 1., 3., 2., 2., 6.],
[0., 1., 1., 3., 3., 2., 6., 2., 5., 9.],
@@ -865,29 +871,29 @@ array([[0., 0., 1., 3., 1., 2., 4., 7., 8., 3.],
[0., 0., 1., 2., 3., 1., 2., 3., 5., 3.],
[0., 0., 0., 3., 1., 5., 6., 5., 5., 8.],
[0., 1., 1., 2., 1., 3., 5., 3., 5., 8.]])
-~~~
+```

Let's assume we wanted to multiply each of the 10 individual day values in a
patient row for every patient, by the contents of the following array:

-~~~python
+```python
multiplier = np.arange(1, 11)
multiplier
-~~~
+```

-~~~
+```text
array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
-~~~
+```

So, the first day value in a patient row is multiplied by 1, the second day by
2, the third day by 3, etc. We can just do:

-~~~python
+```python
subset * multiplier
-~~~
+```

-~~~
+```text
array([[ 0., 0., 3., 12., 5., 12., 28., 56., 72., 30.],
[ 0., 2., 6., 4., 10., 6., 21., 16., 18., 60.],
[ 0., 2., 3., 12., 15., 12., 42., 16., 45., 90.],
@@ -898,7 +904,7 @@ array([[ 0., 0., 3., 12., 5., 12., 28., 56., 72., 30.],
[ 0., 0., 3., 8., 15., 6., 14., 24., 45., 30.],
[ 0., 0., 0., 12., 5., 30., 42., 40., 45., 80.],
[ 0., 2., 3., 8., 5., 18., 35., 24., 45., 80.]])
-~~~
+```

Which gives us what we want, since each value in `multiplier` is applied
successively to each value in a patient's row, over every patient's row. So,
@@ -911,11 +917,12 @@ automatically repeats the data in `multiplier` to match the number of patients
(the first dimension in `subset`) so the `*` operation can be applied over
arrays of equal shape.

-## Key Points:
+## Key Points
+
- Processing NumPy arrays is generally much faster than processing Python lists.
- NumPy arrays have specialised capabilities to support complex mathematical operations, and are less flexible than Python lists. 
- Slicing NumPy arrays returns a reference to the original dataset, not a copy of it like with Python lists.
- NumPy arrays only hold elements of a single data type and are generally fixed in size.
- Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics.
- Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis.
-- Broadcasting allows you to apply an operation to two arrays of different shape, repeating the data in an array of a one-long dimension to match the larger array.
\ No newline at end of file
+- Broadcasting allows you to apply an operation to two arrays of different shape, repeating the data along any dimension of length one to match the larger array.
diff --git a/software_architecture_and_design/procedural/containers_cpp.md b/software_architecture_and_design/procedural/containers_cpp.md
index 6887a98b..964e786e 100644
--- a/software_architecture_and_design/procedural/containers_cpp.md
+++ b/software_architecture_and_design/procedural/containers_cpp.md
@@ -1,34 +1,31 @@
---
name: Containers
-dependsOn: [
-  software_architecture_and_design.procedural.variables_cpp,
-]
+dependsOn: [software_architecture_and_design.procedural.variables_cpp]
tags: [cpp]
-attribution:
-  - citation: >
-      This material was adapted from an "Introduction to C++" course developed by the
-      Oxford RSE group.
-    url: https://www.rse.ox.ac.uk
-    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
+attribution:
+  - citation: >
+      This material was adapted from an "Introduction to C++" course developed by the
+      Oxford RSE group. 
+ url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- -*Container* types are those that can hold other objects, and C++ supports a number of -different [containers](https://en.cppreference.com/w/cpp/container) we can use to hold -data of differing types in a multitude of ways. +_Container_ types are those that can hold other objects, and C++ supports a number of +different [containers](https://en.cppreference.com/w/cpp/container) we can use to hold +data of differing types in a multitude of ways. ## Vector and Arrays -One of the most fundamental data structures in any language is the array, used to hold -many values at once in a contiguous region of memory. There are two array containers in -C++, depending on if you have a fixed sized array (`std::array`), or variable sized -(`std::vector`). When using these a user needs to specify the type of value held by that -container, and for the `std::array` you also need to specify the length. For example, to +One of the most fundamental data structures in any language is the array, used to hold +many values at once in a contiguous region of memory. There are two array containers in +C++, depending on if you have a fixed sized array (`std::array`), or variable sized +(`std::vector`). When using these a user needs to specify the type of value held by that +container, and for the `std::array` you also need to specify the length. 
For example, to
define a `std::vector` of `double` values you could write

```cpp
std::vector<double> x;
```

or a `std::array` of five `int` values is declared as:

```cpp
std::array<int, 5> y;
```

-The angle bracket syntax here is an example of using *templates* in C++, and both
-`std::vector` and `std::array` are examples of *templated* classes. Templates in C++ are
-a form of *generic programming*, and allow us to write classes (and functions) that can
-accept many different types. All the container in C++ need to be able to hold any type
-of value, and therefore all of the container types in C++ are templated on the value
-type. The `std::array` class represents an array with a pre-defined size, and so this
-size is another template arguement. Note that unlike arguements to functions, all
-template arguements must be know *at compile time*.
-
-Since `std::array` has more limited use compared with `std::vector`, we will focus the
-remainder of this section on `std::vector`. The interface to `std::array` is very
-similar, and you can read about this particular container
-[here](https://en.cppreference.com/w/cpp/container/array). We will also from here on
-refer to `std::vector` as a vector, rather than the more general term "array", to match
+The angle bracket syntax here is an example of using _templates_ in C++, and both
+`std::vector` and `std::array` are examples of _templated_ classes. Templates in C++ are
+a form of _generic programming_, and allow us to write classes (and functions) that can
+accept many different types. All the containers in C++ need to be able to hold any type
+of value, and therefore all of the container types in C++ are templated on the value
+type. The `std::array` class represents an array with a pre-defined size, and so this
+size is another template argument. Note that unlike arguments to functions, all
+template arguments must be known _at compile time_.
+
+Since `std::array` has more limited use compared with `std::vector`, we will focus the
+remainder of this section on `std::vector`. The interface to `std::array` is very
+similar, and you can read about this particular container
+[here](https://en.cppreference.com/w/cpp/container/array). We will also from here on
+refer to `std::vector` as a vector, rather than the more general term "array", to match
with the name of the class itself.

### Creating and Extracting Things from Vectors

-To define a vector initialised to a list of values, we can simply write a comma
-separated list of items in curly brackets. We can also define a two dimensional vector
+To define a vector initialised to a list of values, we can simply write a comma
+separated list of items in curly brackets. We can also define a two dimensional vector
by defining a "vector of vectors".

-~~~cpp
+```cpp
std::vector<int> odds = {1, 3, 5, 7, 9, 11, 15};

std::vector<std::vector<int>> more_numbers = { {1, 2}, {3, 4, 5}, {6, 7, 8} };
-~~~
+```

-We can see that our multi-dimensional vector can contain elements themselves of any size
-and depth. This could be used as way of representing matrices, but later we'll learn a
+We can see that our multi-dimensional vector can contain elements themselves of
+differing sizes. This could be used as a way of representing matrices, but later we'll learn a
better way to represent these.

-This curly bracket syntax is for representing *initializer lists* in C++. These
-initializer lists can only be used when initialising, or constructing, an instance of a
-class, and cannot be used once the instance has been already created.
+This curly bracket syntax is for representing _initializer lists_ in C++. These
+initializer lists can only be used when initialising, or constructing, an instance of a
+class, and cannot be used once the instance has already been created. For example, the
+following code will give a compile error:

-~~~cpp
+```cpp
std::vector<int> odds;
odds = {1, 3, 5, 7, 9, 11, 15};
-~~~
+```

-Note that every value in a vector must be of the same type, and this must match the type
+Note that every value in a vector must be of the same type, and this must match the type
that the `std::vector` is templated on.

-We can select individual elements from vecotrs by indexing them. Looking at our `odds`
+We can select individual elements from vectors by indexing them. Looking at our `odds`
list:

![index-list](../fig/05-index-list-odd.png)

For example:

-~~~cpp
-std::cout << odds[0] << ' ' << odds[-1] << std::endl;
-~~~
+```cpp
+std::cout << odds[0] << ' ' << odds.back() << std::endl;
+```

-This will print the first and last elements of a list:
+This will print the first and last elements of a list (note that, unlike Python,
+C++ has no negative indexing; `back()` returns the last element):

-~~~
+```text
1 15
-~~~
+```

-We can replace elements within a specific part of the list (note that in C++, indexes
+We can replace elements within a specific part of the list (note that in C++, indexes
start at 0):

-~~~cpp
+```cpp
odds[6] = 13;
-~~~
+```

-To add elements to the *end* of the vector use `push_back`, remove elements from
-the *end* of the vector using `pop_back`. You can resize the vector using
+To add elements to the _end_ of the vector use `push_back`, remove elements from
+the _end_ of the vector using `pop_back`. You can resize the vector using
`resize`. Get the current size of the vector using `size`.

-~~~cpp
+```cpp
std::vector<double> x;
x.push_back(1.0);
x.push_back(2.0); // x holds {1.0, 2.0}
@@ -121,79 +118,80 @@ x.pop_back(); // x holds {1.0}
x.resize(3); // x holds {1.0, ?, ?}
std::cout << x.size() << std::endl; // 3
-~~~
+```

## Loop or iterate over a Vector

-Every container in C++ defines its own *iterators*, which can be used to iterate
+Every container in C++ defines its own _iterators_, which can be used to iterate
over that container. 
-~~~cpp +```cpp for (std::vector::iterator i = x.begin(); i != x.end(); ++i) { std:cout << *i << std::endl; } -~~~ +``` -An iterator acts like a pointer to each element of the vector, and thus it can be +An iterator acts like a pointer to each element of the vector, and thus it can be dereferenced using `*` to obtain a reference to the value pointed to. -We can simplify this rather verbose iterator classname by using the `auto`{.Cpp} +We can simplify this rather verbose iterator classname by using the `auto`{.Cpp} keyword. This tells the compiler to infer the correct type (i.e. what is returned from `x.begin()`{.Cpp}: -~~~cpp +```cpp for (auto i = x.begin(); i != x.end(); ++i) { std:cout << *i << std::endl; } -~~~ +``` -Another `for` loop in C++ is the *range-based* loop, and these have the most compact +Another `for` loop in C++ is the _range-based_ loop, and these have the most compact syntax, and work with any container that has `begin` and `end` methods. -~~~cpp +```cpp std::vector x = {1.0, 2.0, 3.0, 4.0}; for (double i: x) { std:cout << i << std::endl; } -~~~ +``` You can also use `auto`{.Cpp} here to simplify things... -~~~cpp +```cpp for (auto i: x) { std:cout << i << std::endl; } -~~~ +``` The previous code snippet could not alter the contents of the vector -because `i` was a *copy* of each element of x. You can instead make `i` a +because `i` was a _copy_ of each element of x. 
You can instead make `i` a reference to either edit values -~~~cpp +```cpp for (auto& i: x) { i = 1.0; // set each element to 1.0 } -~~~ +``` or to provide a constant reference to each value (thus avoiding any copies) -~~~cpp +```cpp for (const auto& i: x) { std::cout << i << std::endl; // print each element to the console } -~~~ +``` ::::challenge{id=dot_product title="Dot Product" } Write code to calculate the scalar (dot) product of two `std::vector` variables :::solution + ```cpp std::vector x = {1.0, 2.0, 3.0}; std::vector y = {1.0, 2.0, 3.0}; @@ -206,24 +204,26 @@ for (int i = 0; i < x.size(); ++i) { std::cout << "dot with vectors = "<< dot << std::endl; ``` + ::: :::: ::::challenge{id=matrix_multiply title="Matrix multiply" } -Write code to multiply two 3 x 3 matrices $C = AB$ using `std::array`. Think about how you would -store your matrices. You could use a flat array `std::array`, or -you could use nested arrays `std::array, 3>`. Output the +Write code to multiply two 3 x 3 matrices $C = AB$ using `std::array`. Think about how you would +store your matrices. You could use a flat array `std::array`, or +you could use nested arrays `std::array, 3>`. Output the result in a nicely formatted way, for example: -~~~ +```text C = | 1, 2, 3 | | 4, 5, 6 | | 7, 8, 9 | -~~~ +``` :::solution + ```cpp std::array,3> A = {{{5, 8, 2}, {8, 3, 1}, {5, 3, 9}}}; std::array,3> B = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}; @@ -232,7 +232,7 @@ std::array,3> C = {}; for (int i = 0; i < 3; ++i) { for (int j = 0; j < 3; ++j) { for (int k = 0; k < 3; ++k) { - C[i][j] += A[i][k] * B[k][j]; + C[i][j] += A[i][k] * B[k][j]; } } } @@ -250,28 +250,29 @@ for (int i = 0; i < 3; ++i) { } } ``` + ::: ### Deleting Values, big-O notation and std::list -Deleting elements from the end of a vector is simple and fast and can be done using the -`pop_back` function, which takes constant, or $\mathcal{O}(1)$ time using big-O notation. 
This -means that the time taken is a constant or fixed amount of time independent of the size -of the vector. Deleting elements from the *start* or *middle* of the vector is more -difficult. A vector in C++ is an implementation of an *array* data structure, and -therefore the values contained occupy a *contiguous* section of memory, the start of -which is also the start of the vector. When deleting an element from the start or -middle, the remainder of the vector must be shifted down to maintain the contiguous -nature of the vector and the alignment of the first element to the start of the -allocated memory. Therefore deleting elements from the start or middle of a vector -takes an amount of time that scales linearly with the size of the vector $n$, or +Deleting elements from the end of a vector is simple and fast and can be done using the +`pop_back` function, which takes constant, or $\mathcal{O}(1)$ time using big-O notation. This +means that the time taken is a constant or fixed amount of time independent of the size +of the vector. Deleting elements from the _start_ or _middle_ of the vector is more +difficult. A vector in C++ is an implementation of an _array_ data structure, and +therefore the values contained occupy a _contiguous_ section of memory, the start of +which is also the start of the vector. When deleting an element from the start or +middle, the remainder of the vector must be shifted down to maintain the contiguous +nature of the vector and the alignment of the first element to the start of the +allocated memory. Therefore deleting elements from the start or middle of a vector +takes an amount of time that scales linearly with the size of the vector $n$, or $\mathcal{O}(n)$ time. 
For example, if we want to delete an element from the middle of a vector while -preserving the order of the elements, we can do the - following: +preserving the order of the elements, we can do the +following: -~~~cpp +```cpp std::vector<int> x = {1, 2, 3, 4}; auto delete_this = x.begin() + 1; // an iterator to "2" for (auto i = x.begin(); i != x.end(); i++) { @@ -285,20 +286,20 @@ for (auto i = x.begin(); i != x.end(); i++) { std::cout << *i << ", "; } std::cout << "]" << std::endl; -~~~ +``` -Notice that this requires a loop through all the elements of the vector, hence the time -taken is $\mathcal{O}(n)$. The output of this program will show us the vector with a '2' +Notice that this requires a loop through all the elements of the vector, hence the time +taken is $\mathcal{O}(n)$. The output of this program will show us the vector with a '2' removed: -~~~ +```text [1, 3, 4, ] -~~~ +``` -A linked list is a data structure that provides constant-time insertion or deletion of +A linked list is a data structure that provides constant-time insertion or deletion of elements in the middle/start of the container. The C++ implementation of a linked list is `std::list`, which you can use like this: -~~~cpp +```cpp std::list<int> x = {1, 2, 3, 4}; auto delete_this = std::next(x.begin()); // an iterator to "2" (list iterators are not random access, so use std::next from <iterator> rather than + 1) x.erase(delete_this); @@ -308,7 +309,7 @@ for (auto i = x.begin(); i != x.end(); i++) { std::cout << *i << ", "; } std::cout << "]" << std::endl; -~~~ +``` ## Move semantics for containers @@ -332,9 +333,9 @@ is useful to have an awareness of how this allocation works. Generally the memory allocation is handled automatically by the allocator, which reserves a certain amount of memory (its capacity) which might be greater than the size of the vector. Whenever the size of the vector exceeds this capacity the allocator -reallocates the memory for that vector, reserving a greater amount. +reallocates the memory for that vector, reserving a greater amount. 
-~~~cpp +```cpp std::vector x; int old_capacity = x.capacity(); for (int i = 0; i < 3000; i++) { @@ -344,9 +345,9 @@ for (int i = 0; i < 3000; i++) { std::cout << "Size = " << x.size() << " Capacity = " << x.capacity() << std::endl; } } -~~~ +``` -~~~ +```text Size = 1 Capacity = 1 Size = 2 Capacity = 2 Size = 3 Capacity = 4 @@ -360,16 +361,16 @@ Size = 257 Capacity = 512 Size = 513 Capacity = 1024 Size = 1025 Capacity = 2048 Size = 2049 Capacity = 4096 -~~~ +``` Memory allocations are in general slow, so if the user has knowledge of the neccessary size of the vector, then this process can be optimised by reserving the correct amount of memory using `std::vector::reserve()`{.cpp} -~~~cpp +```cpp std::vector x; x.reserve(3000); -~~~ +``` Another implication of memory reallocation for any container is that memory reallocation neccessarily invalidates any iterators currently pointing at @@ -377,7 +378,7 @@ specific elements (since they are now at a new memory address). This can be a source of bugs, so be aware that growing or resizing a vector can invalidate your iterators! -~~~cpp +```cpp std::vector data = {1, 2, 3, 4}; // keep track of how much data we've already processed auto processed = data.begin(); @@ -394,16 +395,16 @@ for (int i = 0; i < 10; i++) { for (; processed != data.end(); processed++) { process_data(*processed); } -~~~ +``` If the function `process_data` prints out the value given, then the output might look like the below. 
In this case the reallocated vector has been moved to a section of memory far away from the original location, and all the intermediate memory locations are processed as well as the vector itself: -~~~ +```text 1 2 3 4 0 0 1041 0 540155953 540287027 540024880 825503793 891301920 892416052 859126069 808727840 808925234 891303730 842018868 808990772 892483616 926101557 941634361 808661305 808597809 842610720 808857908 941634101 842086709 959852598 942684192 943141431 941633588 842610736 875770421 825833504 926101555 941633587 825242164 943077432 942684192 925907257 941634103 942944825 909194803 909261088 892416049 958412597 859189556 825635636 942684192 858863158 941634864 959789104 959461431 842283040 925905206 941633586 892876848 942684471 825506080 825504566 941633840 942682676 959461174 959789344 892482872 958412857 943075892 842608948 859060512 875639857 958411059 859189556 943207731 842283040 925905206 941635123 926364983 825373744 892483616 892547896 958411824 808531506 892679473 825506080 892547894 941635384 875705650 875966770 859060512 876033840 958411315 943075892 842608948 892483872 842477625 958412597 859189556 858796340 842283296 942945337 958412082 959527216 858798132 959461664 808531506 941635640 825504313 959721526 943012128 892481844 941635385 942750005 909456697 892483616 909456182 958412339 943075892 842608948 943011872 825439800 958412853 859189556 875968564 959789344 825833527 958411824 909392181 825439281 842283040 808663090 958410804 809055538 909128245 825506080 892547894 941635128 926429753 942946358 842283296 875837494 941633847 808793394 808988726 892483616 892612661 958412342 859189556 808728627 842283296 909260854 958412343 909392181 876032305 959789344 859387959 941634612 942944825 842479666 943012128 942813492 958412597 925905716 842610741 842283040 959983670 941635636 909130037 842085680 892811296 943272758 958412597 825505845 959787057 959789088 891303992 808661305 842610995 942684192 858863158 941634864 825635380 892942640 842283296 
825505846 941634105 909654069 943010099 825506080 942945078 941634614 859190578 808989493 842610720 909259833 941633588 942813748 909718067 892483616 943009845 958412340 859189556 892350772 959461664 808727862 958413110 825242420 960049200 892483616 808857653 958410808 876163636 943140917 825506080 909390646 941634609 959527221 943142192 942684192 876165177 941634361 825635380 808597296 959461664 943273266 958411571 859189556 943207731 842283296 926101816 958412852 825702704 926298168 842610720 909326388 958412337 808465204 892614713 943012128 858927412 941633588 942750005 909456697 842610720 925906227 958411319 909392181 875968049 942684192 943141431 958411318 825505845 808530227 892483616 875705394 958410802 875573302 808464953 842610720 909326388 941635121 892876848 859125303 0 0 49 0 1641085147 5 469321016 -564037215 0 1 2 3 0 0 81 0 1 2 3 4 0 1 2 3 4 5 6 7 8 9 -~~~ +``` ## Strings as Containers @@ -412,7 +413,7 @@ Conceptually, a string is a type of container, in this case of letters. The C++. The `std::string`{.cpp} container class has most of the same iterators and functions as `std::vector`{.cpp} and behaves in a very similar way: -~~~cpp +```cpp #include //... @@ -424,19 +425,19 @@ for (auto i = element.begin(); i != element.end(); i++) { std::cout << *i; } std::cout << std::endl; -~~~ +``` gives the output -~~~ +```text x oxygen! 
-~~~ +``` As well as acting like a vector, `std::string`{.cpp} also has some useful string-specific functionality, like being able to concatenate strings, to find and extract substrings or to easily print out to screen -~~~cpp +```cpp using namespace std::string_literals; const std::string oxygen = "oxygen"; // initialise with a const char * @@ -447,12 +448,12 @@ const auto first_hydrogen = water.substr(0, first_dash); // first_hydrogen is a std::cout << "water is " << water << std::endl; std::cout << "first element in water is " << first_hydrogen << std::endl; -~~~ +``` -~~~ +```text water is hydrogen-oxygen-hydrogen first element in water is hydrogen -~~~ +``` ## Map and Set @@ -467,14 +468,14 @@ map, the other is the value type that is stored in the map. The `std::map` class implements a mapping between the key type to the value type. For example, we can store and access the populations of various UK cities like so: -~~~cpp +```cpp #include //... -std::map> populations = { +std::map> populations = { {"Liverpool": 467995}, - {"Edinburgh": 448850}, + {"Edinburgh": 448850}, {"Manchester": 430818} } @@ -486,13 +487,13 @@ for (const auto& [key, value] : m) { std::cout << std::endl; const auto key = "Liverpool"s; -std::cout << "The population of " << key << " is " << populations[key] << std::endl; -~~~ +std::cout << "The population of " << key << " is " << populations[key] << std::endl; +``` -~~~ -[Edinburgh] = 448850; [Liverpool] = 467995; [Manchester] = 430818; [Oxford] = 137343; +```text +[Edinburgh] = 448850; [Liverpool] = 467995; [Manchester] = 430818; [Oxford] = 137343; The population of Liverpool is 467995 -~~~ +``` A set is similar to a map that only contains keys (no values). The C++ implementation of a set is `std::set`. Each element of the set is unique (just @@ -500,16 +501,16 @@ like the keys in a map). ## Tuples -A tuple is a fixed-size container of values that can be of *different* types. 
It is most useful for holding a collection of useful variables, or returning multiple values from a function. -~~~cpp +```cpp std::tuple<std::string, int, std::string> y = {"Apple", 1, "Cherry"}; auto fruits = std::make_tuple(3.14, 2, "Cherry"); -~~~ +``` -Values can be obtained from a tuple via *destructuring*. For C++17 and onwards, the syntax is +Values can be obtained from a tuple via _destructuring_. For C++17 and onwards, the syntax is ```cpp auto [weight, number, name] = fruits; @@ -518,16 +519,12 @@ auto [weight, number, name] = fruits; ``` Note that previously the syntax was more cumbersome: ```cpp -double weight; +double weight; int number; std::string name; std::tie(weight, number, name) = fruits; ``` ## General Rule Your programs will be faster and more readable if you use the appropriate @@ -545,10 +542,10 @@ Download the iris dataset hosted by the UCI Machine Learning Repository [here](h 2. sepal width in cm 3. petal length in cm 4. petal width in cm -5. class: +5. class: -- Iris Setosa -- Iris Versicolour - -- Iris Virginica)_) + -- Iris Virginica)\_) 
Below is an example code for reading in the dataset and printing @@ -577,7 +574,7 @@ int main() { continue; } std::replace(line.begin(), line.end(), ',', ' '); - + std::istringstream iss(line); double sepal_len, unused; std::string iris_class; @@ -588,6 +585,7 @@ int main() { ``` :::solution + ```cpp #include #include @@ -611,7 +609,7 @@ int main() { continue; } std::replace(line.begin(), line.end(), ',', ' '); - + std::istringstream iss(line); double sepal_len, unused; std::string iris_class; @@ -637,9 +635,9 @@ int main() { for (const auto& [iris_class, bounds]: stats) { const auto& [min, max] = bounds; - std::cout << iris_class << " = (" << min << " - " << max << ")" << std::endl; + std::cout << iris_class << " = (" << min << " - " << max << ")" << std::endl; } } ``` -:::: \ No newline at end of file +:::: diff --git a/software_architecture_and_design/procedural/containers_python.md b/software_architecture_and_design/procedural/containers_python.md index c3513972..70ce77d9 100644 --- a/software_architecture_and_design/procedural/containers_python.md +++ b/software_architecture_and_design/procedural/containers_python.md @@ -1,30 +1,26 @@ --- name: Containers -dependsOn: [ - software_architecture_and_design.procedural.variables_python, -] +dependsOn: [software_architecture_and_design.procedural.variables_python] tags: [python] learningOutcomes: - Use container-type variables to hold multiple sets of data. - Use indexing and other access methods to access data within containers. - Differentiate between mutable and immutable variable types. -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
- url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- -*Container* types are those that can hold other objects, and Python supports a +_Container_ types are those that can hold other objects, and Python supports a number of different containers we can use to hold data of differing types in a multitude of ways. - ## Lists One of the most fundamental data structures in any language is the array, used @@ -39,10 +35,10 @@ use more specialised types of containers we'll see later. To define a list we simply write a comma separated list of items in square brackets: -~~~python +```python odds = [1, 3, 5, 7, 9, 11, 15] more_numbers = [ [1, 2], [3, 4, 5], [ [6, 7], [8] ] ] -~~~ +``` We can see that our multi-dimensional array can contain elements themselves of any size and depth. 
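To make this concrete, a short illustrative sketch (not from the original lesson) shows how chained indexing reaches into the nested structure:

```python
more_numbers = [[1, 2], [3, 4, 5], [[6, 7], [8]]]

print(more_numbers[1])        # a whole sub-list: [3, 4, 5]
print(more_numbers[2][0][1])  # chain indexes to drill down: 7
```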
This could be used as way of representing matrices, but @@ -52,9 +48,9 @@ A list in Python is just an ordered collection of items which can be of any type. By comparison, in many languages an array is an ordered collection of items of a single type - so a list is more flexible than an array. -~~~python +```python various_things = [1, 2, 'banana', 3.4, [1, 2] ] -~~~ +``` We can select individual elements from lists by indexing them. Looking at our `odds` list: @@ -62,68 +58,68 @@ We can select individual elements from lists by indexing them. Looking at our `o For example: -~~~python +```python print(odds[0], odds[-1]) -~~~ +``` This will print the first and last elements of a list: -~~~ +```text 1 15 -~~~ +``` We can replace elements within a specific part of the list (note that in Python, indexes start at 0): -~~~python +```python odds[6] = 13 -~~~ +``` -We can also *slice* lists to either extract or set an arbitrary subset of the list. +We can also _slice_ lists to either extract or set an arbitrary subset of the list. ![slice-list](fig/05-slice-list-odd.png) -Note that here, we are selecting the *boundaries* between elements, and not the indexes. +Note that here, we are selecting the _boundaries_ between elements, and not the indexes. For example, to show us elements 3 to 5 (inclusive) in our list: -~~~python +```python odds[2:5] -~~~ +``` We select the boundaries `2` and `5`, which produce: -~~~ +```text [5, 7, 9] -~~~ +``` We can also leave out either start or end parts and they will assume their maximum possible value: -~~~python +```python odds[5:] -~~~ +``` -~~~ +```text [11, 13] -~~~ +``` Or even: -~~~python +```python odds[:] -~~~ +``` Which will show us the whole list. -~~~ +```text [1, 3, 5, 7, 9, 11, 13] -~~~ +``` ::::challenge{id=slicing-from-the-end, title="Sliding from the End"} Use slicing to access only the last four characters of a string or entries of a list. 
-~~~python +```python string_for_slicing = "Observation date: 02-Feb-2013" list_for_slicing = [["fluorine", "F"], ["chlorine", "Cl"], @@ -133,19 +129,19 @@ list_for_slicing = [["fluorine", "F"], print(string_for_slicing) print(list_for_slicing) -~~~ +``` -~~~ +```text 'Observation date: 02-Feb-2013' [['fluorine', 'F'], ['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']] -~~~ +``` So what would you use to see the following? -~~~ +```text '2013' [['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']] -~~~ +``` Would your solution work regardless of whether you knew beforehand the length of the string or list @@ -157,10 +153,11 @@ Hint: Remember that indices can be negative as well as positive :::solution Use negative indices to count elements from the end of a container (such as list or string): -~~~python +```python string_for_slicing[-4:] list_for_slicing[-4:] -~~~ +``` + ::: :::: @@ -168,16 +165,17 @@ list_for_slicing[-4:] Conceptually, a string is a type of container, in this case of letters. We can also index and slice strings in the same way as a list: -~~~python +```python element = 'oxygen' print(element[1], element[0:3], element[3:6]) -~~~ +``` -~~~ +```text x oxy gen -~~~ +``` + +:::callout{variant="tip"} -:::callout ## Only One Way to do It Which demonstrates a key design principle behind Python: "there should be one - and preferably only one - obvious way to do it." @@ -185,78 +183,80 @@ Which demonstrates a key design principle behind Python: "there should be one - ::: :::callout + ## Mutability -An important thing to remember is that Python variables are simply *references* to values, and also that they fall into two distinct types: +An important thing to remember is that Python variables are simply _references_ to values, and also that they fall into two distinct types: + +- Immutable types: value changes by referencing a newly created value (e.g. when adding a letter in a string). 
Note you cannot change individual elements of an immutable container (e.g. you can't change a single character in a string directly 'in place') +- Mutable types: values can be changed 'in place', e.g. changing or adding an item in a list -* Immutable types: value changes by referencing a newly created value (e.g. when adding a letter in a string). Note you cannot change individual elements of an immutable container (e.g. you can't change a single character in a string directly 'in place') -* Mutable types: values can be changed 'in place', e.g. changing or adding an item in a list ::: This matters when you are 'copying' variables or passing them as arguments to functions. So since strings are immutable types, we cannot change elements 'in place': -~~~python +```python element[2] = 'y' -~~~ +``` -~~~ +```text Traceback (most recent call last): File "", line 1, in TypeError: 'str' object does not support item assignment -~~~ +``` Sometimes it's useful to be able to split a string by a delimiter into a list: -~~~python +```python str = 'This is a string' a_list = str.split() print(a_list) -~~~ +``` -~~~ +```text ['This', 'is', 'a', 'string'] -~~~ +``` We can also join together a list of strings into a single string: -~~~python +```python new_str = ' '.join(a_list) print(new_str) -~~~ +``` -~~~ +```text This is a string -~~~ +``` ### Adding and Deleting Elements We can also add and delete elements from a Python list at any time: -~~~python +```python odds.append(21) del odds[0] odds -~~~ +``` Which will show us the list with a '21' added to the end and its first element removed: -~~~ +```text [3, 5, 7, 9, 11, 13, 21] -~~~ +``` ### Checking an Elements are in a List We can also check if an element is within a list: -~~~python +```python 9 in odds -~~~ +``` -~~~ +```text True -~~~ +``` ## Tuples @@ -264,38 +264,38 @@ A tuple is an immutable sequence. It is like a list, in terms of indexing, repet With a single element tuple, you need to end the assignment with a comma. 
-~~~python +```python x = 0, y = ('Apple', 'Banana', 'Cherry') type(x) -~~~ +``` -~~~ +```text -~~~ +``` So similarly to lists for indexing elements: -~~~python +```python y[1] -~~~ +``` -~~~ +```text 'Banana' -~~~ +``` But as we mentioned, it's an immutable type: -~~~python +```python my_tuple = ('Hello', 'World') my_tuple[0] = 'Goodbye' -~~~ +``` -~~~ +```text Traceback (most recent call last): File "", line 1, in TypeError: 'tuple' object does not support item assignment -~~~ +``` ## Dictionaries @@ -305,100 +305,100 @@ This is also known as an "associative array", "map" or "hash" in other languages In a list, we use a number to look up an element: -~~~python +```python names = 'Martin Luther King'.split(' ') names[1] -~~~ +``` -~~~ +```text 'Luther' -~~~ +``` In a dictionary, we look up an element using another object of our choice: -~~~python -me = { 'name': 'Joe', 'age': 39, +```python +me = { 'name': 'Joe', 'age': 39, 'Jobs': ['Programmer', 'Teacher'] } me -~~~ +``` -~~~ +```text {'name': 'Joe', 'age': 39, 'Jobs': ['Programmer', 'Teacher']} -~~~ +``` -~~~python +```python me['Jobs'] -~~~ +``` -~~~ +```text ['Programmer', 'Teacher'] -~~~ +``` -~~~python +```python me['age'] -~~~ +``` -~~~ +```text 39 -~~~ +``` -~~~python +```python type(me) -~~~ +``` -~~~ +```text -~~~ +``` ### Keys and Values The things we can use to look up with are called keys: -~~~python +```python me.keys() -~~~ +``` -~~~ +```text dict_keys(['name', age', 'Jobs']) -~~~ +``` The things we can look up are called values: -~~~python +```python me.values() -~~~ +``` -~~~ +```text dict_values(['Joe', 39, ['Programmer', 'Teacher']]) -~~~ +``` When we test for containment on the dict itself we essentially test on its keys: -~~~python +```python 'Jobs' in me -~~~ +``` -~~~ +```python True -~~~ +``` -~~~python +```python 'Joe' in me -~~~ +``` -~~~ +```python False -~~~ +``` But we can also test on the values of a dict: -~~~python +```python 'Joe' in me.values() -~~~ +``` -~~~ +```python True -~~~ 
+``` ### Immutable Keys Only @@ -406,63 +406,63 @@ The way in which dictionaries work is one of the coolest things in computer scie One consequence of this implementation is that you can only use immutable things as keys. -~~~python +```python good_match = { - ("Lamb", "Mint"): True, + ("Lamb", "Mint"): True, ("Bacon", "Chocolate"): False - } -~~~ +} +``` But: -~~~python +```python nolint illegal = { - ["Lamb", "Mint"]: True, + ["Lamb", "Mint"]: True, ["Bacon", "Chocolate"]: False - } -~~~ +} +``` -~~~ +```text Traceback (most recent call last): File "", line 3, in TypeError: unhashable type: 'list' -~~~ +``` Remember -- square brackets denote lists, round brackets denote tuples. -## Beware 'Copying' of Containers! +## Beware 'Copying' of Containers -Here, note that `y` is not equal to the contents of `x`, it is a second label on the *same object*. So when we change `y`, we are also changing `x`. This is generally true for mutable types in Python. +Here, note that `y` is not equal to the contents of `x`, it is a second label on the _same object_. So when we change `y`, we are also changing `x`. This is generally true for mutable types in Python. -~~~python +```python x = [1, 2, 3] y = x y[1] = 20 print(x, y) -~~~ +``` -~~~ +```text [1, 20, 3] [1, 20, 3] -~~~ +``` Instead, if we wanted to ensure our changes occurred separately on an actual copy of the contents, we could do: -~~~python +```python x = [1, 2, 3] y = x[:] y[1] = 20 print(x, y) -~~~ +``` -~~~ +```text [1, 2, 3] [1, 20, 3] -~~~ +``` In this case, we are using `x[:]` to create a new list containing all the elements of `x` which is assigned to `y`. This happens whenever we take any sized slice from a list. This gets more complicated when we consider nested lists. 
-~~~python +```python x = [['a', 'b'] , 'c'] y = x z = x[:] @@ -471,11 +471,11 @@ x[0][1] = 'd' z[1] = 'e' print(x, y, z) -~~~ +``` -~~~ +```text [['a', 'd'], 'c'] [['a', 'd'], 'c'] [['a', 'd'], 'e'] -~~~ +``` Note that `x` and `y` are the same as we may expect. But `z`, despite being a copy of `x`'s original contents, now contains `'d'` in its nested list. @@ -484,7 +484,7 @@ The copies that we make through slicing are called shallow copies: we don't copy all the objects they contain, only the references to them. This is why the nested list in `x[0]` is not copied, so `z[0]` still refers to it. It is possible to actually create copies of all the contents, however deeply nested -they are - this is called a *deep copy*. Python provides methods for that in its +they are - this is called a _deep copy_. Python provides methods for that in its standard library in the `copy` module. ## General Rule @@ -494,7 +494,8 @@ container type for your data's meaning. For example, always use a set for lists which can't in principle contain the same data twice, always use a dictionary for anything which feels like a mapping from keys to values. -## Key Points: +## Key Points + - Python containers can contain values of any type. - Lists, sets, and dictionaries are mutable types whose values can be changed after creation. - Lists store elements as an ordered sequence of potentially non-unique values. @@ -502,6 +503,6 @@ for anything which feels like a mapping from keys to values. - Dictionary keys are required to be of an immutable type. - Sets are an unordered collection of unique elements. - Containers can contain other containers as elements. -- Use `x[a:b]` to extract a subset of data from `x`, with `a` and `b` representing element *boundaries*, not indexes. +- Use `x[a:b]` to extract a subset of data from `x`, with `a` and `b` representing element _boundaries_, not indexes. - Tuples are an immutable type whose values cannot be changed after creation and must be re-created. 
-- Doing `x = y`, where `y` is a container, doesn't copy its elements, it just creates a new reference to it. \ No newline at end of file +- Doing `x = y`, where `y` is a container, doesn't copy its elements, it just creates a new reference to it. diff --git a/software_architecture_and_design/procedural/functions_cpp.md b/software_architecture_and_design/procedural/functions_cpp.md index 840166af..b386976c 100644 --- a/software_architecture_and_design/procedural/functions_cpp.md +++ b/software_architecture_and_design/procedural/functions_cpp.md @@ -3,14 +3,14 @@ name: Functions dependsOn: [software_architecture_and_design.procedural.containers_cpp] tags: [cpp] attribution: -- citation: This material was adapted from an "Introduction to C++" course developed by the Oxford RSE group. - url: https://www.rse.ox.ac.uk - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 -- citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 + - citation: This material was adapted from an "Introduction to C++" course developed by the Oxford RSE group. + url: https://www.rse.ox.ac.uk + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- # Functions @@ -114,7 +114,7 @@ int main() { [`[< compiler explorer >]`](https://gcc.godbolt.org/z/F1MBsS) -# Pass by reference +## Pass by reference A common way of allowing a function to change the value of a variable outside the function is to use _references_. 
You can do this by adding diff --git a/software_architecture_and_design/procedural/functions_python.md b/software_architecture_and_design/procedural/functions_python.md index d7ef1f5e..43d499b4 100644 --- a/software_architecture_and_design/procedural/functions_python.md +++ b/software_architecture_and_design/procedural/functions_python.md @@ -1,27 +1,26 @@ --- name: Functions -dependsOn: [ - software_architecture_and_design.procedural.containers_python, -] +dependsOn: [software_architecture_and_design.procedural.containers_python] tags: [python] learningOutcomes: - - Define a function that takes parameters. - - Return a value from a function. - - Test and debug a function. - - Set default values for function parameters. - - Explain why we should divide programs into small, single-purpose functions. -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - + - Define a function that takes parameters. + - Return a value from a function. + - Test and debug a function. + - Set default values for function parameters. + - Explain why we should divide programs into small, single-purpose functions. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- + ## Using Functions + In most modern programming languages these procedures are called **functions**. Python has many pre-defined functions built in and we've already met some of them. @@ -29,83 +28,86 @@ To use, or **call**, a function we use the name of the function, followed by bra All functions in Python **return** a single value as their result. :::callout + ## Return Values + Though all functions return a single value in Python, this value may itself be: + - a collection of values - `None` - a special value that is interpreted as though nothing has been returned + ::: -~~~python +```python char_count = len('Python') print(char_count) -~~~ +``` -~~~ +```text 6 -~~~ +``` Some functions are a little different in that they belong to an object, so must be accessed through the object using the dot operator. These are called **methods** or **member functions**. We've already seen some of these as well, but we'll see more when we get to the Object Oriented Paradigm later. -~~~python +```python nums = [1, 2, 3] nums.append(4) print(nums) -~~~ +``` -~~~ +```text [1, 2, 3, 4] -~~~ +``` The append function is actually also one of these functions that return `None`. We can test this again by printing its output. -~~~ python +```python nums = [1, 2, 3] result = nums.append(4) print(result) print(nums) -~~~ +``` -~~~ +```text None [1, 2, 3, 4] -~~~ +``` It's relatively common for a function to return `None` if the purpose of the function is to modify one of its input values. 
That's the case here - the purpose of the `append` function is to append a value to an existing list. - ## Creating Functions Although Python has many built in functions, it wouldn't be much use if we couldn't also define our own. Most languages use a keyword to signify a **function definition**, in Python that keyword is `def`. -~~~ python +```python def add_one(value): return value + 1 two = add_one(1) print(two) -~~~ +``` -~~~ +```text 2 -~~~ +``` -~~~ python +```python def say_hello(name): return 'Hello, ' + name + '!' print(say_hello('World')) -~~~ +``` -~~~ +```text Hello, World! -~~~ +``` To define a function, we use `def`, followed by the name of the function and its **parameters** in brackets. Just like with other code blocks (like `for` and `if`), we use a colon to signify the body of the function and indent the body by four spaces. @@ -114,7 +116,6 @@ Note that we used the word **argument** when we were calling a function, but **p The parameters of a function are the names of the variables which are created inside the function to accept its input data. The arguments of a function are the values that we give to a function when we call it, to put into its parameters. - Sometimes, it's useful for a parameter to have a default value. When we call a function, parameters with default values can be used in one of three ways: @@ -122,22 +123,23 @@ When we call a function, parameters with default values can be used in one of th 2. We can provide our own value in the normal way 3. We can provide a value in the form of a **named argument** - arguments which are not named are called **positional arguments** -~~~ python +```python def say_hello(name='World'): return 'Hello, ' + name + '!' print(say_hello()) print(say_hello('Python')) print(say_hello(name='Named Argument')) -~~~ +``` -~~~ +```text Hello, World! Hello, Python! Hello, Named Argument! 
-~~~ +``` :::callout + ## Declarations and Definitions Some languages have a distinction between **declaration** and **definition** (or **implementation**) of a function. @@ -157,20 +159,21 @@ See [this page](https://docs.microsoft.com/en-us/cpp/cpp/declarations-and-defini Write a short function called `fence` that takes two parameters called original and wrapper and returns a new string that has the wrapper character at the beginning and end of the original. A call to your function should look like this: -~~~ python +```python nolint print(fence('name', '*')) -~~~ +``` -~~~ +```text *name* -~~~ +``` :::solution -~~~ python +```python def fence(original, wrapper): return wrapper + original + wrapper -~~~ +``` + ::: :::: @@ -181,7 +184,7 @@ How many different ways can you call this function using combinations of named a :::solution -~~~ python +```python def say_hello(greeting='Hello', name='World'): return greeting + ', ' + name + '!' @@ -203,7 +206,7 @@ print(say_hello('Hello', name='World')) # Both named arguments print(say_hello(greeting='Hello', name='World')) -~~~ +``` You should have found that Python will not let you provide positional arguments after named ones. @@ -219,7 +222,7 @@ Within a function, any variables that are created (such as parameters or other v For example, what would be the output from the following: -~~~ python +```python f = 0 k = 0 @@ -231,7 +234,7 @@ multiply_by_10(2) multiply_by_10(8) print(k) -~~~ +``` 1. 20 2. 80 @@ -244,7 +247,7 @@ with those defined outside of the function. This is really useful, since it means we don’t have to worry about conflicts with variable names that are defined outside of our function that may cause it -to behave incorrectly. This is known as variable scoping. +to behave incorrectly. This is known as variable scoping. ::: :::: @@ -253,15 +256,15 @@ to behave incorrectly. This is known as variable scoping. 
One of the main reasons for defining a function is to encapsulate our code, so that it can be used without having to worry about exactly how the computation is -performed. This means we're free to implement the function however we want, +performed. This means we're free to implement the function however we want, including deferring some part of the task to another function that already exists. For example, if some data processing code we're working on needs to be able to accept temperatures in Fahrenheit, we might need a way to convert these into -Kelvin. So we could write these two temperature conversion functions: +Kelvin. So we could write these two temperature conversion functions: -~~~ python +```python def fahr_to_cels(fahr): # Convert temperature in Fahrenheit to Celsius cels = (fahr - 32) * (5 / 9) @@ -275,17 +278,17 @@ def fahr_to_kelv(fahr): print(fahr_to_kelv(32)) print(fahr_to_kelv(212)) -~~~ +``` But if we look at these two functions, we notice that the conversion from -Fahrenheit to Celsius is actually duplicated in both functions. This makes +Fahrenheit to Celsius is actually duplicated in both functions. This makes sense, since this is a necessary step in both functions, but duplicated code is wasteful and increases the chance of us making an error - what if we made a typo in one of the equations?
So, we can remove the duplicated code, by calling one function from inside the other: -~~~ python +```python def fahr_to_cels(fahr): # Convert temperature in Fahrenheit to Celsius cels = (fahr - 32) * (5 / 9) @@ -299,12 +302,12 @@ def fahr_to_kelv(fahr): print(fahr_to_kelv(32)) print(fahr_to_kelv(212)) -~~~ +``` Now we've removed the duplicated code, but we might actually want to go one step further and remove some of the other unnecessary bits: -~~~python +```python def fahr_to_cels(fahr): # Convert temperature in Fahrenheit to Celsius return (fahr - 32) * (5 / 9) @@ -315,7 +318,7 @@ def fahr_to_kelv(fahr): print(fahr_to_kelv(32)) print(fahr_to_kelv(212)) -~~~ +``` Now we have each function down to one statement, which should be easier to read and hopefully has reduced the chance of us making a mistake. Whether you actually prefer the second or third version is up to you, but we should at least try to reduce duplication where possible. @@ -327,7 +330,7 @@ As a common example to illustrate each of the paradigms, we'll write some code t First, let's create a data structure to keep track of the papers that a group of academics are publishing. Note that we could use an actual `date` type to store the publication date, but they're much more complicated to work with, so we'll just use the year. -~~~python +```python academics = [ { 'name': 'Alice', @@ -352,11 +355,11 @@ academics = [ ] } ] -~~~ +``` We want a convenient way to add new papers to the data structure. -~~~python +```python def write_paper(academics, name, title, date): paper = { 'title': title, @@ -367,11 +370,11 @@ def write_paper(academics, name, title, date): if academic['name'] == name: academic['papers'].append(paper) break -~~~ +``` We're introducing a new keyword here, `break`, which exits from inside a loop. When the `break` keyword is encountered, execution jumps to the next line -outside of the loop. If there isn't a next line, as in our example here, then +outside of the loop.
If there isn't a next line, as in our example here, then it's the end of the current block of code. This is useful when we have to search for something in a list - once we've found @@ -381,7 +384,9 @@ items. What happens if we call this function for an academic who doesn't exist? :::callout + ## Exceptions + In many programming languages, we use **exceptions** to indicate that exceptional behaviour has occurred and the flow of execution should be diverted. Exceptions are often **raised** (or **thrown** in some other programming languages) as the result of an error condition. @@ -391,7 +396,7 @@ For the moment we'll just raise the exception, and assume that it will get handl In Python, exceptions may also be used to alter the flow of execution even when an error has not occurred. For example, when iterating over a collection, a `StopIteration` exception is the way in which Python tells a loop construct to terminate, though this is hidden from you. -~~~ python +```python def write_paper(academics, name, title, date): paper = { 'title': title, @@ -405,7 +410,8 @@ def write_paper(academics, name, title, date): else: raise KeyError('Named academic does not exist') -~~~ +``` + {: .language-python} The `for-else` structure used here is relatively unusual, but can be useful when you're using a loop to search for a value. @@ -415,12 +421,11 @@ When you're using a loop to search for something, this means that it has not bee For more information see [this section](https://docs.python.org/3/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops) of the Python documentation. ::: - ::::challenge{id=passing-lists-to-functions title="Passing Lists to Functions"} We have seen previously that functions are not able to change the value of a variable which is used as their argument.
-~~~python +```python def append_to_list(l): l.append('appended') l = [1, 2, 3] @@ -431,17 +436,18 @@ a_list = ['this', 'is', 'a', 'list'] print(append_to_list(a_list)) print(a_list) -~~~ +``` Before running this code, think about what you expect the output to be. Now run the code, does it behave as you expected? Why does the function behave in this way? :::solution + +```text [1, 2, 3, 'again'] ['this', 'is', 'a', 'list', 'appended'] -~~~ +``` The reason for this behaviour is that lists are **mutable** so when we pass one in to a function any modifications are made to the actual list as it exists in memory. Using `=` to assign a new value creates a new list in memory (it does not modify the existing list) and assigns it to the variable / name `l`. @@ -461,7 +467,7 @@ Write a function called `count_papers`, that when called with `count_papers(acad One possible solution is: -~~~python +```python def count_papers(academics): count = 0 @@ -472,11 +478,12 @@ def count_papers(academics): total = count_papers(academics) print(total) -~~~ +``` -~~~ +```text 3 -~~~ +``` + ::: :::: @@ -488,7 +495,7 @@ Write a function called `list_papers`, that when called with `list_papers(academ One possible solution is: -~~~python +```python def list_papers(academics): papers = [] @@ -496,10 +503,12 @@ def list_papers(academics): papers = papers + academic['papers'] return papers -~~~ +``` + ::: :::: -## Key Points: +## Key Points + - Functions allow us to separate out blocks of code which perform a common task -- Functions have their own scope and do not clash with variables defined outside \ No newline at end of file +- Functions have their own scope and do not clash with variables defined outside diff --git a/software_architecture_and_design/procedural/index.md b/software_architecture_and_design/procedural/index.md index 20d1a11b..8964219c 100644 --- a/software_architecture_and_design/procedural/index.md +++ b/software_architecture_and_design/procedural/index.md @@ -1,32 +1,25
@@ --- id: procedural name: Procedural Programming -files: [ - variables_cpp.md, - functions_cpp.md, - containers_cpp.md, - variables_python.md, - containers_python.md, - functions_python.md, - arrays_python.md, -] -dependsOn: [ - technology_and_tooling.ide -] +files: + [ + [variables_cpp.md, functions_cpp.md, containers_cpp.md], + [variables_python.md, containers_python.md, functions_python.md, arrays_python.md], + ] +dependsOn: [technology_and_tooling.ide] summary: | Procedural Programming is based around the idea that code should be structured into a sequence of **procedures** that operate on data. This course will introduce you to the basics of procedural programming in either Python or C++. -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## The Procedural Paradigm @@ -38,4 +31,4 @@ returns some output. We can then use these procedures to perform computation, without having to be concerned with exactly how the computation is performed. You may wish to think of the Procedural Paradigm as focussing on the **verbs** -of a computation. \ No newline at end of file +of a computation. diff --git a/software_architecture_and_design/procedural/variables_cpp.md b/software_architecture_and_design/procedural/variables_cpp.md index b87080ec..6a4bcbfc 100644 --- a/software_architecture_and_design/procedural/variables_cpp.md +++ b/software_architecture_and_design/procedural/variables_cpp.md @@ -35,7 +35,7 @@ clang++ --version You should see something like: -``` +```text Homebrew clang version 15.0.3 Target: x86_64-apple-darwin22.1.0 Thread model: posix @@ -50,7 +50,7 @@ which clang++ You should see something like: -``` +```text /usr/local/opt/llvm/bin/clang++ ``` @@ -81,7 +81,7 @@ int six = 2 * 3; std::cout << "six = " << six << std::endl; ``` -``` +```text six = 6 ``` @@ -107,16 +107,16 @@ If we try to use a variable that hasn't been defined, we get a compiler error: int seven = sixe + 1; ``` -``` +````text /Users/martinjrobins/git/thing/procedural.cpp:7:17: error: use of undeclared identifier 'sixe'; did you mean 'six'? int seven = sixe + 1; - ^~~~ + ^~~~ six /Users/martinjrobins/git/thing/procedural.cpp:5:9: note: 'six' declared here int six = 2 * 3; ^ 1 error generated.
-``` +```` Note here we accidentally wrote `sixe` instead of `six`, so the compiler recognised this as an _undeclared identifier_ and gave an error. It even @@ -138,11 +138,11 @@ const int six = 2 * 3; six = 7; ``` -``` +````text /Users/martinjrobins/git/thing/procedural.cpp:8:9: error: cannot assign to variable 'six' with const-qualified type 'const int' six = 7; - ~~~ ^ -``` + ~~~ ^ +```` The compiler has saved us again! You can assist the compiler (and perhaps more importantly, other readers of your code!) by always marking variables that you @@ -314,7 +314,7 @@ int main() { } ``` -``` +```text Joe Frederick 'Bloggs' ``` @@ -369,7 +369,7 @@ std::cout << "seven = " << r_number2 << std::endl; std::cout << "seven = " << six << std::endl; ``` -``` +```text six = 6 seven = 7 seven = 7 @@ -409,7 +409,7 @@ int six = 6; int& r_six = six; ``` -An rvalue refernce is more general in that it can also be bound to temporaries, +An rvalue reference is more general in that it can also be bound to temporaries, or rvalues. An rvalue could be a literal like `6` or the result of an expression like `a + b` (i.e. something that you might see on the right hand side of an assignment `=` statement). An rvalue reference is declared using two ampersands `&&`. @@ -433,13 +433,13 @@ the `std::move` function to do this more efficiently by changing the lvalue references to rvalue references. ```cpp -T tmp(std::move(war_and_peace); +T tmp(std::move(war_and_peace)); war_and_peace = std::move(moby_dick); moby_dick = std::move(tmp); ``` The `std::move` function allows us to transfer the value of variable `a` to -variable `b`, without the requiriment of maintaining the value of `a`. Note that +variable `b`, without the requirement of maintaining the value of `a`. Note that after we have moved `a` its value is now unspecified, so after the last statement in the snippet above, the value of `tmp` will be unspecified.
@@ -499,7 +499,7 @@ be represented by a `float`, according to the rules dictated [here](https://en.cppreference.com/w/cpp/language/implicit_conversion). Since the value now in `y` is different to the value in `x`, the result is: -``` +```text x != y ``` @@ -535,7 +535,7 @@ std::cout << "mean is " << mean << std::endl; Here we are creating a vector of `double` with all the elements initialised to 1.0. This program outputs: -``` +```text mean is 0 ``` diff --git a/software_architecture_and_design/procedural/variables_python.md b/software_architecture_and_design/procedural/variables_python.md index 3469970d..c6eb8ec9 100644 --- a/software_architecture_and_design/procedural/variables_python.md +++ b/software_architecture_and_design/procedural/variables_python.md @@ -1,24 +1,20 @@ --- name: Variables -dependsOn: [ - technology_and_tooling.bash_shell.bash, - technology_and_tooling.ide.cpp, -] +dependsOn: [technology_and_tooling.bash_shell.bash, technology_and_tooling.ide.cpp] tags: [python] learningOutcomes: - Describe the fundamental types of variables. - Assign values to basic variables and make use of them. - Print the content of variables. -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Getting started @@ -27,7 +23,7 @@ Create a new folder and open it in VSCode, e.g. on the command-line (bash or oth ```bash mkdir procedural -cd procedual +cd procedural code . ``` @@ -48,10 +44,10 @@ python And then you are presented with something like: -``` +```text Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. ->>> +>>> ``` And lo and behold! You are presented with yet another prompt. @@ -60,7 +56,7 @@ But note that shell commands won't work again until we exit the interpreter. Whi You can exit the interpreter and get back to the shell by typing: -```python +```text >>> exit() ``` @@ -87,9 +83,9 @@ six = 2 * 3 print(six) ``` -~~~ +```text 6 -~~~ +``` Note that in terms of naming variables, Python's variables must begin with a letter. @@ -98,42 +94,41 @@ the command "Run Selection/Line in Python Terminal"). If we look for a variable that hasn't ever been defined, we get an error telling us so: -~~~python +```python nolint print(seven) -~~~ +``` -~~~ +```text Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'seven' is not defined -~~~ +``` You can also assign an arbitrary number of variables on the same line: -~~~python +```python one, two = 1, 2 -~~~ +``` ::::challenge{title="Sorting out references" id=sorting_reference} What does the following program print out?
-~~~python +```python first, second = 1, 2 third, fourth = second, first print(third, fourth) -~~~ +``` :::solution - ~~~ - 2 1 - ~~~ +```text +2 1 +``` ::: :::: - Although we commonly refer to `variables` even in Python (because it is the common terminology), we really mean `names` or `identifiers`. In Python, `variables` are name tags for values, or labelled boxes that contain a value. @@ -156,10 +151,10 @@ more easily run out. So in Python, when I write: -~~~python +```python number = 1 number = 2 -~~~ +``` The following things happen: @@ -170,36 +165,35 @@ The following things happen: 1. The old address, containing '1', now has no labels. 1. The garbage collector frees the memory at the old address. - ## Objects and Types An object, like `number`, has a type. We can use `type()` to tell us the type of the variable. For our variable above: -~~~python +```python type(number) -~~~ +``` Note we don't need to use `print` - the Python interpreter will just output the result: -~~~ +```text <class 'int'> -~~~ +``` -Depending on its type, an object can have different properties: data fields *inside* the object. +Depending on its type, an object can have different properties: data fields _inside_ the object. Consider a Python complex number for example, which Python supports natively: -~~~python +```python z = 3+1j -~~~ +``` We can see what properties and methods an object has available using the dir function: -~~~python +```python dir(z) -~~~ +``` -~~~ +```text ['__abs__' '__add__' '__bool__' @@ -249,7 +243,7 @@ dir(z) 'conjugate' 'imag' 'real'] - ~~~ +``` You can see that there are several methods whose name starts and ends with `__` (e.g. `__init__`): these are special methods that Python uses internally, and @@ -258,29 +252,29 @@ some of them later on in this course as they become useful. The others (in this case, `conjugate`, `imag` and `real`) are the methods and fields through which we can interact with this object.
-~~~python +```python type(z) -~~~ +``` -~~~ +```text <class 'complex'> -~~~ +``` -~~~python +```python z.real -~~~ +``` -~~~ +```text 3.0 -~~~ +``` -~~~python +```python z.imag -~~~ +``` -~~~ +```text 1.0 -~~~ +``` A property of an object is accessed with a dot. The jargon is that the "dot operator" is used to obtain a property of an object. @@ -288,28 +282,28 @@ Since we're not declaring the type of a variable, how does it work it out? -Python is an interpreted language that is *dynamically typed*, which means the -type of a variable is determined and *bound* to the variable at runtime from its +Python is an interpreted language that is _dynamically typed_, which means the +type of a variable is determined and _bound_ to the variable at runtime from its given value. So when we assign a floating point number, for example, its type is inferred: ### Floats -~~~python +```python weight_kg = 55 weight_lb = 2.2 * weight_kg print('Weight in lb', weight_lb) -~~~ +``` Note we can add as many things that we want to `print` by separating them with a comma. -For a float, a number after a point is optional. But the *dot* makes it a float. +For a float, a number after a point is optional. But the _dot_ makes it a float. -~~~ +```text Weight in lb 121.00000000000001 -~~~ +``` -So the thing with floats is that they are *representation* of a real number. +So the thing with floats is that they are a _representation_ of a real number. Representing a third or the root of 2 would be impossible for a computer, so these are really approximations of real numbers using a ubiquitous standard ([IEEE-754](https://docs.python.org/3/tutorial/floatingpoint.html#representation-error)).
@@ -324,12 +318,12 @@ An important thing to remember, particularly in numerical analyses, is that a `f Draw diagrams showing what variables refer to what values after each statement in the following program: -~~~ +```python weight = 70.5 age = 35 weight = weight * 1.14 age = age + 20 -~~~ +``` ::: @@ -337,47 +331,47 @@ age = age + 20 Note that before, we also used a `string` in our use of `print`. In Python, we can use either single quotes or double quotes, or even both if we need to include quotes within a string, e.g.: -~~~python +```python given = 'Joe' middle = "Frederick" family = "'Bloggs'" full = given + " " + middle + " " + family print(full) -~~~ +``` Here we use the `+` operator to concatenate strings together. -~~~ +```text Joe Frederick 'Bloggs' -~~~ +``` With quotes, the main thing is to be consistent in how you use them (i.e. not like we've used them above!). -We've looked at properties on objects. But many objects can also have *methods* (types of functions) associated with them, which we can use to perform operations on the object. +We've looked at properties on objects. But many objects can also have _methods_ (types of functions) associated with them, which we can use to perform operations on the object. For strings, we also can do things like: -~~~python +```python given.upper() -~~~ +``` Which returns the upper case version of the string. -~~~ +```text 'JOE' -~~~ +``` Note it isn't changing `given`'s string itself, it's returning a new string in uppercase. There are other methods we can use on strings, such as: -~~~python +```python ' Hello'.strip() -~~~ +``` -~~~ +```text 'Hello' -~~~ +``` We'll be looking at classes and objects in more detail later today. @@ -385,61 +379,60 @@ We'll be looking at classes and objects in more detail later today. 
We can use boolean variables to capture `True` or `False`, useful in conditionals and loops, e.g.: -~~~python +```python is_joe = (given == 'Joe') flag = False print(is_joe, flag) -~~~ +``` -~~~ +```text True False -~~~ +``` ### No Value? We can also assign a variable with no value: -~~~python +```python nothing = None print(nothing) -~~~ +``` -~~~ +```text None -~~~ +``` `None` is the special Python value for a no-value variable. If that's the output, what's the type of `nothing`? -~~~python +```python type(nothing) -~~~ +``` -~~~ +```text <class 'NoneType'> -~~~ +``` - ### Converting Between Types With floats, ints and strings, we can use in-built functions to convert between types: -~~~python +```python age, house_number = 30, '76' print(str(age), float(age), int(house_number), float(house_number)) -~~~ +``` -~~~ +```text 30 30.0 76 76.0 -~~~ +``` +## Key Points -## Key Points: - Python is an interpreted, dynamically typed language. - Run the interpreter from the command line by typing `python`. - Use `variable = value` to assign a value to a variable in order to record it in memory. - Variables are created on demand whenever a value is assigned to them. - Use `print(something)` to display the value of `something`. - `None` as an empty variable value has its own type. -- Convert a variable to another type by using `new_type_name(variable)`. \ No newline at end of file +- Convert a variable to another type by using `new_type_name(variable)`.
diff --git a/software_project_management/collaboration/github.md b/software_project_management/collaboration/github.md index 9f22d823..c1a575ba 100644 --- a/software_project_management/collaboration/github.md +++ b/software_project_management/collaboration/github.md @@ -1,23 +1,17 @@ --- name: Github -dependsOn: [ - software_project_management.version_control_with_git.collaboration -] +dependsOn: [software_project_management.version_control_with_git.collaboration] tags: [github] -attribution: - - citation: > - Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" - url: https://github.com/OxfordRSE/intermediate-git-collaboration - image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - - - +attribution: + - citation: > + Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" + url: https://github.com/OxfordRSE/intermediate-git-collaboration + image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## GitHub @@ -28,7 +22,7 @@ GitHub offers a suite of project management tools alongside a cloud hosting platform for git projects. Here, we'll cover a few of the key parts of GitHub that you'll need to know for collaborating. -## *Contributors* and *collaborators* +## _Contributors_ and _collaborators_ GitHub grew out of open source software development, so its roots are very pro collaboration. 
By default, repositories are **public**, meaning that they @@ -38,51 +32,51 @@ Letting people view your repository, and allowing them to make their own copies to play around with, is usually a good thing. But you probably don't want just anyone to be able to edit **your** copy of the repository! -GitHub uses *contributors* to define anyone who has added a change to a -repository, and *collaborators* to define people who are allowed to change your +GitHub uses _contributors_ to define anyone who has added a change to a +repository, and _collaborators_ to define people who are allowed to change your version of the code. -## *Forking* repositories +## _Forking_ repositories -If you're not a *collaborator* on a repository, you can still make changes to -your own copy of a **public** repository. You make your own copy by *forking* +If you're not a _collaborator_ on a repository, you can still make changes to +your own copy of a **public** repository. You make your own copy by _forking_ the repository on GitHub. This will create a copy of the repository under your -account, which you can then *clone*, modify, and update. If you make a change -that you're happy with, you can always submit a *pull request* from your fork +account, which you can then _clone_, modify, and update. If you make a change +that you're happy with, you can always submit a _pull request_ from your fork back to the original repository to request that they include your code. -We won't cover *forks* here, because on almost all projects you will work on -you will be a *collaborator* with the ability to modify the original remote +We won't cover _forks_ here, because on almost all projects you will work on +you will be a _collaborator_ with the ability to modify the original remote repository directly.
![To create a fork, navigate to the repository you want to fork on GitHub, and click the 'fork' button (highlighted here in yellow).](fig/github-fork.png) -## *Issues* +## _Issues_ -GitHub uses *issues* as its primary project tracking feature. It also has -many other features that do similar things, but *issues* are the most heavily +GitHub uses _issues_ as its primary project tracking feature. It also has +many other features that do similar things, but _issues_ are the most heavily used. -*Issues* are used to identify areas of improvement for the repository. They -can be anything: bug reports, questions, ideas. Once created, *issues* can be +_Issues_ are used to identify areas of improvement for the repository. They +can be anything: bug reports, questions, ideas. Once created, _issues_ can be assigned to developers, marked as 'closed' when no longer relevant, and be grouped together into larger project features like milestones. ![To create an issue, click the 'Issues' button (highlighted here in yellow) and then 'New Issue'.](fig/github-issues.png) -It is very common for repositories to use the *issue* tracker as their primary +It is very common for repositories to use the _issue_ tracker as their primary means of organisation. Many software and other projects hosted on GitHub will -request that users submit bug reports directly through the *issue* interface. +request that users submit bug reports directly through the _issue_ interface. -## *Pull requests* +## _Pull requests_ When working alone, it is common to work only on the `main` branch, or to use -other branches and then *merge* freely whenever you feel like it. When working +other branches and then _merge_ freely whenever you feel like it. When working collaboratively, a more structured approach is required. The collaborative -version of *merging* is the *pull request* or *PR*. +version of _merging_ is the _pull request_ or _PR_. 
-A *pull request* is a **request** to **pull** the changes in one branch into -another branch. When a *pull request* is created, the two branches are +A _pull request_ is a **request** to **pull** the changes in one branch into +another branch. When a _pull request_ is created, the two branches are designated and the code is compared to show what changes will be made to the target branch (the first one in the figure). @@ -91,32 +85,32 @@ target branch (the first one in the figure). Pull requests on small projects between trusted collaborators may be merged (i.e. accepted) immediately, while more mature or complex projects may have processes that need to be completed before a PR is allowed to be merged. -This process is likely to include *code review* and/or automated testing -(*CI/CD*). +This process is likely to include _code review_ and/or automated testing +(_CI/CD_). -## *Code review* +## _Code review_ -*Code review* is a process of pooling knowledge and experience, and inviting +_Code review_ is a process of pooling knowledge and experience, and inviting additional perspectives on changes so that they are more likely to be helpful and less likely to include mistakes or be written in confusing ways. -GitHub incorporates *code review* as part of the *pull request* framework. -When a *pull request* is opened, a review can be requested from a -*collaborator*. Anyone with the ability to comment on the *pull request* is +GitHub incorporates _code review_ as part of the _pull request_ framework. +When a _pull request_ is opened, a review can be requested from a +_collaborator_. Anyone with the ability to comment on the _pull request_ is able to contribute to the review, whether or not they are invited. The review happens using the list of changes that are included in the body -of the *pull request*. Any line or section of code can be selected for comment. +of the _pull request_. Any line or section of code can be selected for comment. 
Comments should be courteous and constructive. It's fine to include questions where you don't understand something, or where you want to allow a developer to consider another perspective without ordering them to change things, but it should always be clear from a comment what, if anything, you want the developer to **do** in response. -## *Continuous Integration* and *Continuous Deployment* +## _Continuous Integration_ and _Continuous Deployment_ GitHub allows all accounts around 3000 hours of virtual machine time per month. -This is used to test (*integration*) and deploy your code. Almost all modern +This is used to test (_integration_) and deploy your code. Almost all modern code should include some tests that guarantee that the code does what it is supposed to do. These tests can be set up to run automatically, whenever changes are made to the repository, so that it becomes immediately obvious if a change @@ -126,6 +120,6 @@ Many projects are deployed to some or other platform (this tutorial, for example, runs on [GitHub Pages](https://pages.github.com/)) and its web code is regenerated automatically whenever the content code is updated. -You do not need a deep knowledge of *CI/CD* on this course, but you should +You do not need a deep knowledge of _CI/CD_ on this course, but you should be aware what they are so that you can pick up the details of the specific implementation used by any project you collaborate on. diff --git a/software_project_management/collaboration/index.md b/software_project_management/collaboration/index.md index f924f582..254e476c 100644 --- a/software_project_management/collaboration/index.md +++ b/software_project_management/collaboration/index.md @@ -1,29 +1,23 @@ --- name: Collaboration on GitHub id: collaboration -dependsOn: [ - technology_and_tooling.version_control -] -files: [ - refresher.md, - workflows.md, - issues.md, -] -attribution: - - citation: > - "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). 
carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 +dependsOn: [technology_and_tooling.version_control] +files: [refresher.md, workflows.md, issues.md] +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 summary: | - This course will show you how to use Git for effective collaboration with others - using the GitHub platform. After completion of the course, you will be able to - contribute to projects on GitHub, including opening issues and pull requests. + This course will show you how to use Git for effective collaboration with others + using the GitHub platform. After completion of the course, you will be able to + contribute to projects on GitHub, including opening issues and pull requests. 
--- This course will show you how to use Git for effective collaboration with others @@ -35,4 +29,4 @@ contribute to projects on GitHub, including opening issues and pull requests. we use Git to version control our code in conjunction with [GitHub](https://github.com/) for code backup and sharing. GitHub is one of the leading integrated products and social platforms for modern software development, monitoring and management - it will help us with -version control, issue management, code review, code testing/Continuous Integration, and collaborative development. \ No newline at end of file +version control, issue management, code review, code testing/Continuous Integration, and collaborative development. diff --git a/software_project_management/collaboration/issues.md b/software_project_management/collaboration/issues.md index 84f826c1..f527d349 100644 --- a/software_project_management/collaboration/issues.md +++ b/software_project_management/collaboration/issues.md @@ -1,26 +1,27 @@ --- name: Issue Management -dependsOn: [ - software_project_management.collaboration.refresher -] +dependsOn: [software_project_management.collaboration.refresher] tags: [github] -attribution: - - citation: > - "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. 
https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: > - Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" - url: https://github.com/OxfordRSE/intermediate-git-collaboration - image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Understand what issues are and why they are useful. + - Create an issue on GitHub. + - Classify an issue depending on its purpose. + - Refer to colleagues or yourself in aspects related to issues. +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. 
https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: > + Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" + url: https://github.com/OxfordRSE/intermediate-git-collaboration + image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Introduction @@ -29,7 +30,7 @@ Developing software is a project and, like most projects, it consists of multiple tasks. Keeping track of identified issues with the software, the list of tasks the team has to do, progress on each, prioritising tasks for future development, planning sprints and releases, etc., can quickly become a -non-trivial task in itself. Without a good team project management process and +non-trivial task in itself. Without a good team project management process and framework, it can be hard to keep track of what’s done, or what needs doing, and particularly difficult to convey that to others in the team or share the responsibilities. @@ -37,7 +38,7 @@ responsibilities. ## Using GitHub to Manage Issues With Software As a piece of software is used, bugs and other issues will inevitably come to -light - nothing is perfect! If you work on your code with collaborators, or +light - nothing is perfect! 
If you work on your code with collaborators, or
have non-developer users, it can be helpful to have a single shared record of
all the problems people have found with the code, not only to keep track of
them for you to work on later, but to avoid people emailing you to report a bug that
@@ -45,7 +46,7 @@ you already know about!

GitHub provides **Issues** - a framework for managing bug reports, feature
requests, and lists of future work.

-Go back to the home page for your `oxrse_unit_conv` repository in GitHub, and click on
+Go back to the home page for your `oxrse_unit_conv` repository in GitHub, and click on
 the **Issues** tab. You should see a page listing the open issues on your repository -
currently there should be none.

@@ -55,10 +56,10 @@ Let's go through the process of creating a new issue. Start by clicking the `New

![Creating a new issue in GitHub](fig/github-new-issue.png)

-When you create an issue, you can add a range of details to them. They can be *assigned to a specific developer* for example - this can be a helpful way to know who, if anyone, is currently working to fix the issue, or a way to assign
+When you create an issue, you can add a range of details to it. Issues can be _assigned to a specific developer_ for example - this can be a helpful way to know who, if anyone, is currently working to fix the issue, or a way to assign
 responsibility to someone to deal with it.

-They can also be assigned a *label*. The labels available for issues can be customised, and given a colour, allowing you to see at a glance the state of your code's issues.
+They can also be assigned a _label_. The labels available for issues can be customised, and given a colour, allowing you to see at a glance the state of your code's issues.
The [default labels available in GitHub](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) include:

- `bug` - indicates an unexpected problem or unintended behavior
- `documentation` - indicates a need for improvements or additions to documentation
@@ -70,17 +71,18 @@ They can also be assigned a *label*. The labels available for issues can be cust
- `question` - indicates that an issue, pull request, or discussion needs more information
- `wontfix` - indicates that work won't continue on an issue, pull request, or discussion

-You can also create your own custom labels to help with classifying issues. There are no
-rules really about naming the labels - use whatever makes sense for your project. Some
-conventional custom labels include: `status:in progress` (to indicate that someone
-started working on the issue), `status:blocked` (to indicate that the progress on
-addressing issue is blocked by another issue or activity), `bug` (to indicate that this
-issue is a report of a bug or fault in the code), `enhancement` (to indicate that this
+You can also create your own custom labels to help with classifying issues. There are no
+rules really about naming the labels - use whatever makes sense for your project. Some
+conventional custom labels include: `status:in progress` (to indicate that someone
+started working on the issue), `status:blocked` (to indicate that progress on
+addressing the issue is blocked by another issue or activity), `bug` (to indicate that this
+issue is a report of a bug or fault in the code), `enhancement` (to indicate that this
 issue is for a new feature for the software)

-:::callout
+:::callout
+
 ## Manage Issues With Your Code Openly
+
 Having open, publicly-visible lists of the limitations
and problems with your code is incredibly helpful.
Even if some issues end up languishing unfixed for years,
letting users know about them can save them a huge amount of work
@@ -89,27 +91,26 @@ also help you see at a glance what state your code is in, making it easier
to prioritise future work!
:::

-
:::challenge{id=first-issue title="Our First Issue!"}

-The `oxrse_unit_conv` repo that you cloned previously
-([https://github.com/OxfordRSE/oxrse_unit_conv](https://github.com/OxfordRSE/oxrse_unit_conv)).
-is a small toy Python project that implements some classes for SI and non-SI units (you
+The `oxrse_unit_conv` repo that you cloned previously
+([https://github.com/OxfordRSE/oxrse_unit_conv](https://github.com/OxfordRSE/oxrse_unit_conv))
+is a small toy Python project that implements some classes for SI and non-SI units (you
 can read the `README.md` file for more information), and implements conversions
-between values of different units. There are some initial units defined, but many are
-missing.
+between values of different units. There are some initial units defined, but many are
+missing.

-Individually, with a critical eye, think of an aspect of the code in this repo that
-needs improvement. This could be to add a new unit to the project, or it could be to add
-any other functionality that you think would be useful, or to fix any bugs that you
+Individually, with a critical eye, think of an aspect of the code in this repo that
+needs improvement. This could be to add a new unit to the project, or it could be to add
+any other functionality that you think would be useful, or to fix any bugs that you
 find.
 :::

### Issue (and Pull Request) Templates

GitHub also allows you to set up issue and pull request templates for your
-software project. Such templates provide a structure for the issue/pull request
+software project. Such templates provide a structure for the issue/pull request
 descriptions, and/or prompt issue reporters and collaborators
to fill in answers to pre-set questions.
They can help contributors raise issues or submit pull requests in a way that is clear, helpful and provides enough information for @@ -119,13 +120,13 @@ own](https://docs.github.com/en/communities/using-templates-to-encourage-useful- ## Using GitHub's Notifications & Referencing System to Communicate -GitHub implements a comprehensive [notifications system](https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/configuring-notifications) -to keep the team up-to-date with activities in your code repository and notify you when something happens or changes +GitHub implements a comprehensive [notifications system](https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/configuring-notifications) +to keep the team up-to-date with activities in your code repository and notify you when something happens or changes in your software project. You can choose whether to watch or unwatch an individual repository, -or can choose to only be notified of certain event types such as updates to issues, pull requests, direct mentions, -etc. GitHub also provides an additional useful notification feature for collaborative work - **Mentions**. -In addition to referencing team members (which will result in an appropriate notification), GitHub allows us -to reference issues, pull requests and comments from one another - providing a useful way of connecting things +or can choose to only be notified of certain event types such as updates to issues, pull requests, direct mentions, +etc. GitHub also provides an additional useful notification feature for collaborative work - **Mentions**. +In addition to referencing team members (which will result in an appropriate notification), GitHub allows us +to reference issues, pull requests and comments from one another - providing a useful way of connecting things and conversations in your project. 
### Referencing Team Members Using Mentions @@ -134,7 +135,7 @@ The mention system notifies team members when somebody else references them in an issue, comment or pull request - you can use this to notify people when you want to check a detail with them, or let them know something has been fixed or changed (much easier than writing out all the same information again in an -email). +email). You can use the mention system to link to/notify an individual GitHub account or a whole team for notifying multiple people. Typing `@` in GitHub will @@ -142,15 +143,15 @@ bring up a list of all accounts and teams linked to the repository that can be "mentioned". People will then receive notifications based on their preferred notification methods - e.g. via email or GitHub's User Interface. -### Referencing Issues, Pull Requests and Comments +### Referencing Issues, Pull Requests and Comments GitHub also lets you mention/reference one issue or pull request from another (and people "watching" these will be notified of any such updates). Whilst writing the description of an issue, or commenting on one, if you type `#` you should see a list of the issues and pull requests on the -repository. They are coloured green if they're open, or white if they're +repository. They are coloured green if they're open, or white if they're closed. Continue typing the issue number, and the list will narrow down, then -you can hit `Return` to select the entry and link the two. For +you can hit `Return` to select the entry and link the two. For example, if you realise that several of your bugs have common roots, or that one enhancement can't be implemented before you've finished another, you can use the mention system to indicate the depending issue(s). 
This is a simple way to add
@@ -167,16 +168,17 @@ and GitHub will render it nicely using the identifier's short form and link to t

:::challenge{id=first-mention title="Our First Mention/Reference!"}

-Add a mention to one of your team members using the `@` notation
-in a comment within an issue or a pull request in your repository - e.g. to
-ask them a question or a clarification on something or to do some additional work.
+Add a mention to one of your team members using the `@` notation
+in a comment within an issue or a pull request in your repository - e.g. to
+ask them a question or a clarification on something or to do some additional work.

-Alternatively, add another issue to your repository and reference the issue you created
+Alternatively, add another issue to your repository and reference the issue you created
 in the previous exercise using the `#` notation.
 :::

-## Key Points:
+## Key Points
+
- We should use GitHub's **Issues** to keep track of software problems and other requests for change - even if we are the only developer and user.
- GitHub’s **Mentions** play an important part in communicating between collaborators and are used as a way of alerting team members of activities and referencing one issue/pull request/comment/commit from another.
- Without a good project and issue management framework, it can be hard to keep track of what’s done, or what needs doing, and particularly difficult to convey that to others in the team or to share the responsibilities.
diff --git a/software_project_management/collaboration/refresher.md b/software_project_management/collaboration/refresher.md
index 4f6ff75a..75161baa 100644
--- a/software_project_management/collaboration/refresher.md
+++ b/software_project_management/collaboration/refresher.md
@@ -1,63 +1,66 @@
---
name: Git Refresher
-dependsOn: [
-]
+dependsOn: []
tags: [git]
-attribution:
-  - citation: >
-      "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022).
carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: > - Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" - url: https://github.com/OxfordRSE/intermediate-git-collaboration - image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Commit changes in a software project to a local repository and publish them in a remote repository on GitHub. + - Create and use branches for managing different threads of code development. + - Learn to use feature branch workflow to effectively collaborate with a team on a software project. +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. 
https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: > + Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" + url: https://github.com/OxfordRSE/intermediate-git-collaboration + image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Git -Git is a version control system for tracking changes in computer files and coordinating work on those files among +Git is a version control system for tracking changes in computer files and coordinating work on those files among multiple people. It is primarily used for source code management in software development but it can be used to -track changes in files in general - it is particularly effective for tracking text-based files (e.g. source code files, +track changes in files in general - it is particularly effective for tracking text-based files (e.g. source code files, CSV, Markdown, HTML, CSS, Tex, etc. files). Git has several important characteristics: + - support for non-linear development allowing you and your colleagues to work on different parts of a project concurrently, - support for distributed development allowing for multiple people to be working on the same project (even the same file) at the same time, - every change recorded by Git remains part of the project history and can be retrieved at a later date, so even -if you make a mistake you can revert to a point before it. + if you make a mistake you can revert to a point before it. 
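That last characteristic - every recorded change stays in the project history - can be demonstrated in a throwaway repository, for example with `git revert`, which undoes an earlier commit by adding a new commit rather than rewriting history:

```shell
# Throwaway repo showing that history is preserved and mistakes are revertible.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "you@example.com"   # identity needed to commit in a fresh repo
git config user.name "Your Name"

echo "good" > notes.txt
git add notes.txt
git commit --quiet -m "Good change"

echo "mistake" > notes.txt
git commit --quiet -am "Mistaken change"

# Undo the mistake without deleting anything from the history:
git revert --no-edit HEAD
git log --oneline   # all three commits (good, mistake, revert) remain recorded
cat notes.txt       # contents are back to "good"
```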
The diagram below shows a typical software development lifecycle with Git
and the commonly used commands to interact
with different parts of Git infrastructure, such as:

-- **working directory** - a directory (including any subdirectories) where your
-  project files live and where you are currently working. It is also known as
-  the “untracked” area of Git. Any changes to files will be marked by Git in
-  the working directory. If you make changes to the working directory and do
-  not explicitly tell Git to save them - you will likely lose those changes.
-  Using `git add filename` command, you tell Git to start tracking changes to
+
+- **working directory** - a directory (including any subdirectories) where your
+  project files live and where you are currently working. It is also known as
+  the “untracked” area of Git. Any changes to files will be marked by Git in
+  the working directory. If you make changes to the working directory and do
+  not explicitly tell Git to save them - you will likely lose those changes.
+  Using the `git add filename` command, you tell Git to start tracking changes to
 file `filename` within your working directory.
-- **staging area (index)** - once you tell Git to start tracking changes to
-  files (with `git add filename` command), Git saves those changes in the
-  staging area. Each subsequent change to the same file needs to be followed by
-  another `git add filename` command to tell Git to update it in the staging
-  area. To see what is in your working directory and staging area at any moment
+- **staging area (index)** - once you tell Git to start tracking changes to
+  files (with the `git add filename` command), Git saves those changes in the
+  staging area. Each subsequent change to the same file needs to be followed by
+  another `git add filename` command to tell Git to update it in the staging
+  area. To see what is in your working directory and staging area at any moment
 (i.e. what changes Git is tracking), run the command `git status`.
-- **local repository** - stored within the `.git` directory of your project, - this is where Git wraps together all your changes from the staging area and - puts them using the `git commit` command. Each commit is a new, permanent - snapshot (checkpoint, record) of your project in time, which you can share or +- **local repository** - stored within the `.git` directory of your project, + this is where Git wraps together all your changes from the staging area and + puts them using the `git commit` command. Each commit is a new, permanent + snapshot (checkpoint, record) of your project in time, which you can share or revert back to. -- **remote repository** - this is a version of your project that is hosted - somewhere on the Internet (e.g. on GitHub, GitLab or somewhere else). While - your project is nicely version-controlled in your local repository, and you +- **remote repository** - this is a version of your project that is hosted + somewhere on the Internet (e.g. on GitHub, GitLab or somewhere else). While + your project is nicely version-controlled in your local repository, and you have snapshots of its versions from the past, if your machine crashes - you still may lose all your work. Working with a remote @@ -65,9 +68,9 @@ with different parts of Git infrastructure, such as: in order to collaborate with others and to backup your work on a different machine. ![Development lifecycle with Git](fig/git-lifecycle.png) -*Software development lifecycle with Git from [PNGWing](https://www.pngwing.com/en/free-png-sazxf) (licenced for non-commercial reuse)* +_Software development lifecycle with Git from [PNGWing](https://www.pngwing.com/en/free-png-sazxf) (licenced for non-commercial reuse)_ -### Forking the project +### Forking the project This course is based on a project that is hosted on GitHub. The project repository is [https://github.com/OxfordRSE/oxrse_unit_conv](https://github.com/OxfordRSE/oxrse_unit_conv). 
@@ -84,14 +87,14 @@ local machine and start working on it.
To clone the project repository to your local machine, you need to copy the
URL of the repository. You can do this by clicking the green "Code" button in
the top right corner of the project repository page on GitHub and copying the URL
-from either the "HTTPS" tab, or if you have [generated](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent)
+from either the "HTTPS" tab, or if you have [generated](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent)
 and [added](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account)
an SSH key to your GitHub account (recommended), from the "SSH" tab. Once you
have copied the URL, you can clone the repository to your local machine using
the `git clone` command.

-~~~bash
+```bash
 git clone git@github.com:OxfordRSE/oxrse_unit_conv.git
-~~~
+```

### Installing the project

@@ -102,32 +105,32 @@ command in the project directory. This will install the
project in the "editable" mode, which means that any changes you make to the
project files will be immediately reflected in the installed version of the
project.

-~~~bash
+```bash
 python3 -m venv venv
 source venv/bin/activate
 pip install -e .
-~~~
+```

### Viewing Changes

-The first thing to do upon navigating into our software project's directory
-root is to check the current status of our local working directory and
+The first thing to do upon navigating into our software project's directory
+root is to check the current status of our local working directory and
 repository.

-~~~bash
+```bash
 git status
-~~~
+```

-~~~
+```text
On branch main
Your branch is up to date with 'origin/main'.

Untracked files:
  (use "git add <file>..."
to include in what will be committed)
- venv/
+ venv/

nothing added to commit but untracked files present (use "git add" to track)
-~~~
+```

As expected, Git is telling us that we have an untracked directory - the
directory "venv" - present in our working
@@ -148,7 +151,7 @@ containing ".venv/". It does not matter much in this case
where within the file you add these lines, so let's do it at the end. Your
`.gitignore` should look something like this:

-~~~
+```text
# IDEs
.vscode/
.idea/
@@ -167,12 +170,12 @@ something like this:

# Virtual environments
venv/
.venv/
-~~~
+```

You may notice that we are already not tracking certain files and directories
with useful comments about what exactly we are ignoring. You may also notice
that each line in `.gitignore` is actually a pattern, so you can ignore multiple
-files that match a pattern (e.g. "*.png" will ignore all PNG files in the
+files that match a pattern (e.g. "\*.png" will ignore all PNG files in the
 current directory).

If you run the `git status` command now, you will notice that Git has cleverly
@@ -180,34 +183,34 @@ understood that you want to ignore changes to the "venv"
directory so it is not warning us about it any more. However, it has now
detected a change to the `.gitignore` file that needs to be committed.

-~~~bash
-$ git status
-~~~
+```bash
+git status
+```

-~~~
+```text
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..."
to discard changes in working directory)
-  modified: .gitignore
+  modified: .gitignore

no changes added to commit (use "git add" and/or "git commit -a")
-~~~
+```

-To commit the changes `.gitignore` to the local repository, we first have to add this file to
+To commit the changes to `.gitignore` to the local repository, we first have to add this file to
the staging area to prepare it for committing:

-~~~bash
-$ git add .gitignore
-~~~
+```bash
+git add .gitignore
+```

Now we can commit to the local repository with:

-~~~bash
-$ git commit -m "Ignoring virtual env. folder."
-~~~
+```bash
+git commit -m "Ignoring virtual env. folder."
+```

Remember to use meaningful messages for your commits.

@@ -221,29 +224,30 @@ for example, by two collaborators making different changes to the same lines in
a file. By pulling first, we are made aware of any changes made by others, in particular if there are any conflicts between their changes and ours.

-~~~bash
-$ git pull
-~~~
+```bash
+git pull
+```

-Now we've ensured our repository is synchronised with the remote one, we can now push our changes. GitHub has recently [strengthened authentication requirements for Git operations](https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/
-) accessing GitHub from the command line over HTTPS.
+Now that we've ensured our repository is synchronised with the remote one, we can push our changes. GitHub has recently [strengthened authentication requirements for Git operations](https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/) accessing GitHub from the command line over HTTPS.
This means you cannot use passwords for authentication
+over HTTPS any more - you either need to [set up and use a personal access token](https://catalyst.zoho.com/help/tutorials/githubbot/generate-access-token.html) for additional security if you want to continue
to use HTTPS, or switch to using a private and public key pair over SSH, before you can push the changes you made locally to the remote. So, when you run the command below:

-~~~bash
-$ git push origin main
-~~~
+```bash
+git push origin main
+```

:::callout
+
## Authentication Errors

If you get a warning that HTTPS access is deprecated, or a token is required, then you accidentally cloned the repository using HTTPS and not SSH. You can fix this from the command line by resetting the remote repository URL setting on your local repo:

-~~~bash
-$ git remote set-url origin git@github.com:OxfordRSE/oxrse_unit_conv.git
-~~~
+```bash
+git remote set-url origin git@github.com:OxfordRSE/oxrse_unit_conv.git
+```
+
:::

In the above command,
@@ -253,34 +257,37 @@ repository locally);
`main` is the name of our main (and currently only) development branch.

:::callout
+
## Git Remotes

-Note that systems like Git allow us to synchronise work between any two or more copies of the same repository -
+
+Note that systems like Git allow us to synchronise work between any two or more copies of the same repository -
the ones that are not located on your machine are "Git remotes" for you. In practice, though, it is easiest to agree with your collaborators to use one copy as a central hub (such as GitHub or GitLab), where everyone pushes their
-changes to. This also avoid risks associated with keeping the "central copy" on someone’s laptop.
+changes to. This also avoids the risks associated with keeping the "central copy" on someone’s laptop.
You can have more than one remote configured for your local repository, each of which generally is either read-only or read/write for you.
Collaborating with others involves managing these remote repositories and pushing and pulling information to and from them when you need to share work.

![git-distributed](fig/git-distributed.png)
-*Git - distributed version control system, from (https://www.w3docs.com/learn-git/git-repository.html) (freely available)*
+_Git - distributed version control system, from [w3docs](https://www.w3docs.com/learn-git/git-repository.html) (freely available)_
:::

## Git Branches
+
When we do `git status`, Git also tells us that we are currently on the `main` branch of the project. A branch is one version of your project (the files in your repository) that can contain its own set of commits. We can create a new branch, make changes to the code which we then commit to the branch, and, once we are happy with those changes, merge them back to the main branch. To see what other branches are available, do:

-~~~bash
-$ git branch
-~~~
+```bash
+git branch
+```

-~~~
+```text
* main
-~~~
+```

At the moment, there's only one branch (`main`) and hence only one version of the code available. When you create a Git repository for the first time, by default you only get one version (i.e. branch) - `main`. Let's have a look at
@@ -292,26 +299,26 @@ While it is technically OK to commit your changes directly to the `main` branch,
and you may often find yourself doing so for some minor changes, the best practice is to use a new branch for each separate and self-contained unit/piece of work you want to add to the project. This unit of work is also often called a
-*feature* and the branch where you develop it is called a
-*feature branch*. Each feature branch should have its own meaningful name -
-*indicating its purpose (e.g. "issue23-fix").
+_feature_ and the branch where you develop it is called a
+_feature branch_. Each feature branch should have its own meaningful name -
+indicating its purpose (e.g. "issue23-fix").
If we keep making changes and pushing them directly to `main` branch on GitHub, then anyone who downloads our software from there will get all of our work in progress - whether or not it's ready to use! So, working on a separate branch for each feature you are adding is good for several reasons: -* it enables the main branch to remain stable while you and the team explore and test the new code on a feature -branch, -* it enables you to keep the untested and not-yet-functional feature branch code under version control and -backed up, -* you and other team members may work on several features at the same time independently from one another, -* if you decide that the feature is not working or is no longer needed - you can easily and safely discard that -branch without affecting the rest of the code. +- it enables the main branch to remain stable while you and the team explore and test the new code on a feature + branch, +- it enables you to keep the untested and not-yet-functional feature branch code under version control and + backed up, +- you and other team members may work on several features at the same time independently from one another, +- if you decide that the feature is not working or is no longer needed - you can easily and safely discard that + branch without affecting the rest of the code. Branches are commonly used as part of a feature-branch workflow, shown in the diagram below. 
![Git feature branch workflow diagram](fig/git-feature-branch.svg)
-*Git feature branches>, adapted from [Git Tutorial by sillevl](https://sillevl.gitbooks.io/git/content/collaboration/workflows/gitflow/) (Creative Commons Attribution 4.0 International License)*
+_Git feature branches, adapted from [Git Tutorial by sillevl](https://sillevl.gitbooks.io/git/content/collaboration/workflows/gitflow/) (Creative Commons Attribution 4.0 International License)_

In the software development workflow, we typically have a main branch which is the version of the code that is tested, stable and reliable. Then, we normally
@@ -319,7 +326,7 @@ have a development branch (called `develop` or `dev` by convention)
that we use for work-in-progress code. As we work on adding new features to the code, we create new feature branches that first get merged into `develop` after a thorough testing process. After even more testing - the `develop` branch will get
-merged into `main`. The points when feature branches are merged to `develop`,
+merged into `main`. The points when feature branches are merged to `develop`,
and `develop` to `main` depend entirely on the practice/strategy established in the team. For example, for smaller projects (e.g. if you are working alone on a project or in a very small team), feature branches sometimes get directly merged
@@ -332,80 +339,87 @@ is broken should be in `main`.

Let's create a `develop` branch to work on:

-~~~bash
-$ git branch develop
-~~~
+```bash
+git branch develop
+```

This command does not give any output, but if we run `git branch` again, without giving it a new branch name, we can see the list of branches we have - including the new one we have just made.

-~~~bash
-$ git branch
-~~~
+```bash
+git branch
+```

-~~~
+```text
  develop
* main
-~~~
+```

The `*` indicates the currently active branch. So how do we switch to our new branch?
We use the `git checkout` command with the name of the branch: -~~~bash -$ git checkout develop -~~~ +```bash +git checkout develop +``` -~~~ +```text Switched to branch 'develop' -~~~ +``` :::callout + ## Create and Switch to Branch Shortcut + A shortcut to create a new branch and immediately switch to it: -~~~bash -$ git checkout -b develop -~~~ +```bash +git checkout -b develop +``` + ::: ### Updating Branches -If we start updating and committing files now, the commits will happen on the `develop` branch and will not affect + +If we start updating and committing files now, the commits will happen on the `develop` branch and will not affect the version of the code in `main`. We add and commit things to `develop` branch in the same way as we do to `main`. First make a small modification to `README.md` in your IDE, for example change "Oxford RSE" in the title to your name. Then, if we do: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text On branch develop Changes not staged for commit: (use "git add ..." to update what will be committed) (use "git checkout -- ..." to discard changes in working directory) - modified: README.md + modified: README.md no changes added to commit (use "git add" and/or "git commit -a") -~~~ +``` Git is telling us that we are on branch `develop` and which tracked files have been modified in our working directory. We can now `add` and `commit` the changes in the usual way. -~~~bash -$ git add README.md -$ git commit -m "personalised README.md" -~~~ +```bash +git add README.md +git commit -m "personalised README.md" +``` :::callout + ## Currently Active Branch + Remember, `add` and `commit` commands always act on the currently active branch. You have to be careful and aware of which branch you are working with at any given moment. `git status` can help with that, and you will find yourself invoking it very often. 
:::

### Pushing New Branch Remotely
+
We push the contents of the `develop` branch to GitHub in the same way as we pushed the `main` branch. However, as we have just created this branch locally, it still does not exist in our remote repository. You can check that in GitHub by listing all branches.
@@ -415,16 +429,18 @@ listing all branches.

To push a new local branch remotely for the first time, you could use the `-u` switch and the name of the branch you are creating and pushing to:

-~~~bash
-$ git push -u origin develop
-~~~
+```bash
+git push -u origin develop
+```

:::callout
- ## Git Push With `-u` Switch
+
+## Git Push With `-u` Switch
+
Using the `-u` switch with the `git push` command is a handy shortcut for: (1) creating the new remote branch and (2) setting your local branch to automatically track the remote one at the same time. You need to use the `-u` switch only once to set up that association between your branch and the remote one explicitly.
- After that you could simply use `git push` without specifying the remote repository, if you wished so. We still prefer
+After that you could simply use `git push` without specifying the remote repository, if you wish. We still prefer
to explicitly state this information in commands.
:::

@@ -440,51 +456,53 @@ Now the others can check out the `develop` branch too and continue to develop co
After the initial push of the new branch, each subsequent push to it is done in the usual manner (i.e. without the `-u` switch):

-~~~bash
-$ git push origin develop
-~~~
+```bash
+git push origin develop
+```

:::callout
+
## What is the Relationship Between Originating and New Branches?

It's natural to think that new branches have a parent/child relationship with their originating branch, but in actual Git terms, branches themselves do not have parents but single commits do.
Any commit can have zero parents
-(a root, or initial, commit), one parent (a regular commit), or multiple parents (a merge commit), and using this
+(a root, or initial, commit), one parent (a regular commit), or multiple parents (a merge commit), and using this
structure, we can build a 'view' of branches from a set of commits and their relationships. A common way to look at it is that Git branches are really only [lightweight, movable pointers to commits](https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell). So as a new commit is added to a branch, the branch pointer is moved to the new commit.

-What this means is that when you accomplish a merge between two branches, Git is able to determine the common 'commit ancestor' through
+What this means is that when you accomplish a merge between two branches, Git is able to determine the common 'commit ancestor' through
the commits in a 'branch', and use that common ancestor to determine which commits need to be merged onto the destination branch. It also means that, in theory, you could merge any branch with any other at any time... although it may not make sense to do so!
:::

### Merging Into Main Branch
+
Once you have tested your changes on the `develop` branch, you will want to merge them onto the `main` branch. To do so, make sure you have all your changes committed and switch to `main`:

-~~~bash
-$ git checkout main
-~~~
+```bash
+git checkout main
+```

-~~~
+```text
Switched to branch 'main'
Your branch is up to date with 'origin/main'.
-~~~
+```

To merge the `develop` branch on top of `main` do:

-~~~bash
-$ git merge develop
-~~~
+```bash
+git merge develop
+```

-~~~
+```text
Updating 05e1ffb..be60389
Fast-forward
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
-~~~
+```

If there are no conflicts, Git will merge the branches without complaining and bring in all the commits from `develop` on top of the last commit from `main` (here as a simple fast-forward). If there are merge conflicts (e.g.
a team collaborator modified the same @@ -492,12 +510,14 @@ portion of the same file you are working on and checked in their changes before will be marked and you will need to resolve those conflicts and commit the changes before attempting to merge again. Since we have no conflicts, we can now push the `main` branch to the remote repository: -~~~bash +```bash git push origin main -~~~ +``` :::callout + ## All Branches Are Equal + In Git, all branches are equal - there is nothing special about the `main` branch. It is called that by convention and is created by default, but it can also be called something else. A good example is `gh-pages` branch which is the main branch for website projects hosted on GitHub (rather than `main`, which can @@ -505,7 +525,9 @@ be safely deleted for such projects). ::: :::callout + ## Keeping Main Branch Stable + Good software development practice is to keep the `main` branch stable while you and the team develop and test new functionalities on feature branches (which can be done in parallel and independently by different team members). The next step is to merge @@ -514,7 +536,7 @@ well with the rest of the code (and not just in isolation). We talk more about d of the following episodes. ::: -## Key Points: +## Key Points - A branch is one version of your project that can contain its own set of commits. -- Feature branches enable us to develop / explore / test new code features without affecting the stable `main` code. \ No newline at end of file +- Feature branches enable us to develop / explore / test new code features without affecting the stable `main` code. 
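The whole branch workflow from this episode can be rehearsed end to end in a scratch repository. The sketch below assumes only that `git` is installed locally; it uses a throwaway directory and omits the `push` steps, since no remote repository is involved:

```shell
#!/usr/bin/env bash
# Rehearse the main/develop branch workflow from this episode in a scratch repository.
set -euo pipefail

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/main       # make sure the initial branch is called main
git config user.name "Example Learner"      # local identity so commits work anywhere
git config user.email "learner@example.com"

# One commit on main to start from
echo "# Demo project" > README.md
git add README.md
git commit --quiet -m "initial commit"

# Create a develop branch, switch to it, and commit a change there
git checkout --quiet -b develop
echo "work in progress" >> README.md
git add README.md
git commit --quiet -m "personalised README.md"

# main is untouched until we merge develop back into it
git checkout --quiet main
git merge --quiet develop    # fast-forward: main catches up with develop
git log --oneline            # both commits are now reachable from main
```

Running this leaves you on `main` with both commits reachable, mirroring the fast-forward merge output shown in this episode.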
diff --git a/software_project_management/collaboration/workflows.md b/software_project_management/collaboration/workflows.md
index e9ae6318..b8f38838 100644
--- a/software_project_management/collaboration/workflows.md
+++ b/software_project_management/collaboration/workflows.md
@@ -1,29 +1,30 @@
---
name: Collaborative Workflow
-dependsOn: [
-  software_project_management.collaboration.issues
-]
+dependsOn: [software_project_management.collaboration.issues]
tags: [github]
-attribution:
-  - citation: >
-      "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057"
-    url: https://doi.org/10.5281/zenodo.6532057
-    image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg
-    license: CC-BY-4.0
-  - citation: >
-      Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration"
-    url: https://github.com/OxfordRSE/intermediate-git-collaboration
-    image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
-
+learningOutcomes:
+  - Describe commonly used code review techniques.
+  - Describe how code reviews and pull requests can be used within teams to increase communication and improve code.
+  - Raise a pull request via GitHub to be reviewed by others.
+  - Conduct and submit a code review of a pull request.
+  - List the characteristics of what makes a good code review process.
+attribution:
+  - citation: >
+      "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo.
https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: > + Matt Jaquiery, Abhishek Dasgupta (2022) "Intermediate Git Collaboration" + url: https://github.com/OxfordRSE/intermediate-git-collaboration + image: https://avatars.githubusercontent.com/u/38728121?s=200&v=4 + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - ## Introduction Software is often designed and built as part of a team, so in this episode we'll @@ -33,14 +34,15 @@ our code by engaging in code review process with other team members. :::callout ## Collaborative Code Development Models -The way your team provides contributions to the shared codebase depends on the type of development model you use in your project. + +The way your team provides contributions to the shared codebase depends on the type of development model you use in your project. Two commonly used models are: - **fork and pull model** - where anyone can **fork** an existing repository (to create their copy of the project linked to the source) and push changes to their - personal fork. A contributor can work independently on their own fork as they + personal fork. A contributor can work independently on their own fork as they do not need permissions on the source repository to push modifications to a fork - they own. The changes from contributors can then be **pulled** into the source + they own. The changes from contributors can then be **pulled** into the source repository by the project maintainer on request and after a code review process. 
This model is popular with open source projects as it reduces the start up costs for new contributors and allows them to work independently without upfront @@ -48,16 +50,16 @@ Two commonly used models are: model when you are an external collaborator on a project rather than a core team member. -- **shared repository model** - where collaborators are granted push access to a single shared code repository. - Even though collaborators have write access to the main - development and production branches, the best practice of creating feature branches for new developments and - when changes need to be made is still followed. This is to enable easier testing of the new code and - initiate code review and general discussion about a set of changes before they are merged - into the development branch. This model is more prevalent with teams and organisations +- **shared repository model** - where collaborators are granted push access to a single shared code repository. + Even though collaborators have write access to the main + development and production branches, the best practice of creating feature branches for new developments and + when changes need to be made is still followed. This is to enable easier testing of the new code and + initiate code review and general discussion about a set of changes before they are merged + into the development branch. This model is more prevalent with teams and organisations collaborating on private projects. 
::: - + Regardless of the collaborative code development model you and your collaborators use - code reviews are one of the widely accepted best practices for software development in teams and something you should adopt in your @@ -76,11 +78,12 @@ Discuss as a group: what do you think are the reasons behind, and advantages of, :::solution The purposes of code review include: + - improving internal code readability, understandability, quality and maintainability - checking for coding standards compliance, code uniformity and consistency - checking for test coverage and detecting bugs and code defects early - detecting performance problems and identifying code optimisation points -- finding alternative/better solutions. +- finding alternative/better solutions. An effective code review prevents errors from creeping into your software by improving code quality at an early stage of the software development process. It helps with learning, i.e. sharing knowledge about the codebase, @@ -105,14 +108,14 @@ compared to fixing the same defect in the development and maintenance stages, respectively. Since the cost of bug fixes grows in orders of magnitude throughout the software lifecycle, it is far more efficient to find and fix defects as close as possible to the point where they were introduced. -There are several **code review techniques** with various degree of formality and the -use of a technical infrastructure, here we will be using a **Tool-assisted code review** -, using GitHub's Pull Requests. It is a lightweight tool, included with GitHub's core -service for free and has gained popularity within the software development community in -recent years. 
This tool helps with the following tasks: (1) collecting and displaying
-the updated files and highlighting what has changed, (2) facilitating a conversation
-between team members (reviewers and developers), and (3) allowing code administrators
-and product managers a certain control and overview of the code development workflow.
+There are several **code review techniques**, with varying degrees of formality and reliance on technical infrastructure; here we will be using **tool-assisted code review**, via GitHub's Pull Requests. It is a lightweight tool, included for free with GitHub's core service, and has gained popularity within the software development community in recent years. This tool helps with the following tasks: (1) collecting and displaying the updated files and highlighting what has changed, (2) facilitating a conversation between team members (reviewers and developers), and (3) giving code administrators and product managers a degree of control over, and an overview of, the code development workflow.

## Adding code via GitHub's Pull Requests

@@ -122,66 +125,66 @@ you've pushed to a branch in a repository on GitHub and that your code is ready
for review. Once a pull request is opened, you can discuss and review the potential changes with others on the team and add follow-up commits based on the feedback before your changes are merged from your feature branch into the
-base branch.
-
-How you create your feature branches and open pull requests in GitHub will depend on your collaborative code
+base branch.
+
+How you create your feature branches and open pull requests in GitHub will depend on your collaborative code
development model:

-- In the shared repository model, in order to create a feature branch and open a
-  pull request based on it you must have write access to the source repository or, for organisation-owned repositories,
-  you must be a member of the organisation that owns the repository.
Once you have access to the repository, you proceed +- In the shared repository model, in order to create a feature branch and open a + pull request based on it you must have write access to the source repository or, for organisation-owned repositories, + you must be a member of the organisation that owns the repository. Once you have access to the repository, you proceed to create a feature branch on that repository directly. - In the fork and pull model, where you do not have write permissions to the source repository, you need to fork the repository first before you create a feature branch (in your fork) to base your pull request on. -In both development models, it is recommended to create a feature branch for your work and -the subsequent pull request, even though you can submit pull requests from any branch or commit. This is because, +In both development models, it is recommended to create a feature branch for your work and +the subsequent pull request, even though you can submit pull requests from any branch or commit. This is because, with a feature branch, you can push follow-up commits as a response to feedback and update your proposed changes within -a self-contained bundle. +a self-contained bundle. -The only difference in creating a pull request between the two models is how you create -the feature branch. In either model, once you are ready to merge your changes in - you +The only difference in creating a pull request between the two models is how you create +the feature branch. In either model, once you are ready to merge your changes in - you will need to specify the base branch and the head -branch. - +branch. + ## Issues, Pull Requests and Code Review In Action -Let's see this in action - you and your fellow learners are going to be organised in small teams and assume to be -collaborating in the shared repository model. 
You will be added as a collaborator to another team member's repository
-(which becomes the shared repository in this context) and, likewise, you will add other team members as collaborators
-on your repository. You can form teams of two and work on each other's repositories. If there are 3 members in
-your group you can go in a round robin fashion (the first team member does a pull request on the second member's
-repository and receives a pull request on their repository from the third team member). If you are going through the
-material on your own and do not have a collaborator, you can do pull requests on your own repository from one to
+Let's see this in action - you and your fellow learners are going to be organised into small teams, and we will assume you are
+collaborating in the shared repository model. You will be added as a collaborator to another team member's repository
+(which becomes the shared repository in this context) and, likewise, you will add other team members as collaborators
+on your repository. You can form teams of two and work on each other's repositories. If there are 3 members in
+your group you can go in a round-robin fashion (the first team member does a pull request on the second member's
+repository and receives a pull request on their repository from the third team member). If you are going through the
+material on your own and do not have a collaborator, you can do pull requests on your own repository from one to
another branch.

Recall the `oxrse_unit_conv` repo that you cloned previously ([https://github.com/OxfordRSE/oxrse_unit_conv](https://github.com/OxfordRSE/oxrse_unit_conv)). This is a small toy Python project that implements some classes for SI and non-SI units (you can read the `README.md` file for more information), and implements conversions
-between values of different units.
+between values of different units.

-In the previous section you each implemented an issue to add a new feature (e.g. a new
-unit) or bugfix.
Now your taks is to implement this feature or bugfix, along with tests
-to make sure your new code works correctly or that the bug is fixed. You can use the
-existing tests as a guide for how to write new tests. You can also use the existing
+In the previous section you each created an issue to add a new feature (e.g. a new unit) or bugfix. Now your task is to implement this feature or bugfix, along with tests to make sure your new code works correctly or that the bug is fixed. You can use the existing tests as a guide for how to write new tests. You can also use the existing
tests to ensure that your changes do not break any existing functionality.

-You will propose changes to their repository (the shared repository in this context) via
issues and pull requests
-(acting as the code author) and engage in code review with your team member (acting as a
-code reviewer). Similarly, you will receive a pull request on your repository from
-another team member, in which case the roles will be reversed. The following diagram
+You will propose changes to their repository (the shared repository in this context) via issues and pull requests (acting as the code author) and engage in code review with your team member (acting as a code reviewer). Similarly, you will receive a pull request on your repository from another team member, in which case the roles will be reversed. The following diagram
depicts the branches that you should have in the repository.

![Branches for a feature and its tests](fig/exercise-feature-branch.svg)
-*Adapted from [Git Tutorial by sillevl](https://sillevl.gitbooks.io/git/content/collaboration/workflows/gitflow/) (Creative Commons Attribution 4.0 International License)*
+_Adapted from [Git Tutorial by sillevl](https://sillevl.gitbooks.io/git/content/collaboration/workflows/gitflow/) (Creative Commons Attribution 4.0 International License)_

To achieve this, the following steps are needed.
-#### Step 1: Adding Collaborators to a Shared Repository +### Step 1: Adding Collaborators to a Shared Repository -You need to add the other team member(s) as collaborator(s) on your repository +You need to add the other team member(s) as collaborator(s) on your repository to enable them to create branches and pull requests. To do so, each repository owner needs to: 1. Head over to Settings section of your software project's repository in GitHub. @@ -194,186 +197,191 @@ to enable them to create branches and pull requests. To do so, each repository o 5. Once they accept the invitation, they will have the collaborator-level access to your repository and will show up in the list of your collaborators. -See the full details on [collaborator permissions for personal repositories](https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-github-user-account/managing-user-account-settings/permission-levels-for-a-user-account-repository) +See the full details on [collaborator permissions for personal repositories](https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-github-user-account/managing-user-account-settings/permission-levels-for-a-user-account-repository) to understand what collaborators will be able to do within your repository. Note that repositories owned by an organisation have a [more granular access control](https://docs.github.com/en/get-started/learning-about-github/access-permissions-on-github) compared to that of personal repositories. -#### Step 2: Create an issue for the feature you are going to implement +### Step 2: Create an issue for the feature you are going to implement -You might already have an issue from the previous section, but if not, head over to the +You might already have an issue from the previous section, but if not, head over to the "Issues" tab and create a new issue that you will implement. -#### Step 3: Create a Feature Branch +### Step 3: Create a Feature Branch -1. 
Obtain the GitHub URL of the shared repository you will be working on and clone it
-   locally if you havn't already. This will create a copy of the repository locally on
+1. Obtain the GitHub URL of the shared repository you will be working on and clone it
+   locally if you haven't already. This will create a copy of the repository locally on
    your machine along with all of its (remote) branches.
-   ~~~bash
-   $ git clone
-   $ cd
-   ~~~
-2. Organise within you team what naming convention you will use for new branches. A
-   common choice it to use the issue number and one or more keywords, for example
-   `i23-feature-name`.
+
+   ```bash
+   git clone <repository-url>
+   cd <repository-name>
+   ```
+
+2. Organise within your team what naming convention you will use for new branches. A
+   common choice is to use the issue number and one or more keywords, for example
+   `i23-feature-name`.
 3. Create and checkout the new branch in your local repository

-   ~~~bash
-   $ git checkout -b i23-feature-name
-   ~~~
-
-   You are now located in the new (local) `i23-feature-name` branch and are ready to
+   ```bash
+   git checkout -b i23-feature-name
+   ```
+
+   You are now located in the new (local) `i23-feature-name` branch and are ready to
    start adding your code.

-#### Step 4: Adding New Code
+### Step 4: Adding New Code

 :::challenge{id=implement title="Implement Feature/Bugfix and Tests"}

-Now implement the new feature or bugfix that you have described in your issue. It is a
-good idea to commit often while developing, providing you with a history of commits you
-can go back to, and others in your team with information of development progressing
-elsewhere in the collaboration. You can "tag" a commit with an issue by including an
+Now implement the new feature or bugfix that you have described in your issue. It is a
+good idea to commit often while developing, providing you with a history of commits you
+can go back to, and giving others in your team information about development progressing
+elsewhere in the collaboration.
You can "tag" a commit with an issue by including an
issue number reference (e.g. "#23") in the commit message.

-~~~bash
-$ git add -A
-$ git commit -m "#23 add test for unit nmol/sec"
-~~~
-
-Make sure you write tests to ensure that the bug has been fixed or the feature works as
-expected. For a bug fix, you effectivly start with a test which is simply the code that
-leads to this bug. Then its a matter of implementing fixes until the test passes. For a
-feature you can either start off by writing a test that illustrates how you will
-implement the feature, and will pass once this is done (this is normally given the name
-"Test-Driven Development"), or you can test the feature once you have written it to
+```bash
+git add -A
+git commit -m "#23 add test for unit nmol/sec"
+```
+
+Make sure you write tests to ensure that the bug has been fixed or the feature works as
+expected. For a bug fix, you effectively start with a test which is simply the code that
+leads to this bug. Then it's a matter of implementing fixes until the test passes. For a
+feature you can either start off by writing a test that illustrates how you will
+implement the feature, and will pass once this is done (this approach is known as
+"Test-Driven Development"), or you can test the feature once you have written it to
 check that the code works.
 :::

-:::callout
+:::callout{variant="tip"}
+
 ## Testing Based on Requirements
+
 Tests should test functionality, which stems from the software requirements, rather than an implementation. Tests can
 be seen as a reflection of those requirements - checking if the requirements are satisfied.
 :::

 Remember to commit your new code to your branch `i23-feature-name`.
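The test-first flow described above can be sketched in miniature. Note that this is a hypothetical example: `convert_rate` and its unit table stand in for whatever function your project actually exposes; only the `nmol/sec` unit name is taken from the example commit message above.

```python
# Hypothetical sketch of writing the test first (issue #23).
# `convert_rate` is a placeholder for your project's real function.

# Conversion factors into a common base unit (here: nmol/sec).
_TO_NMOL_PER_SEC = {"nmol/sec": 1.0, "mol/sec": 1e9}


def convert_rate(value, from_unit, to_unit):
    """Convert a rate between units via the common base unit."""
    return value * _TO_NMOL_PER_SEC[from_unit] / _TO_NMOL_PER_SEC[to_unit]


def test_nmol_per_sec():
    # Written first, this test fails until the new unit is supported,
    # then passes once the implementation is correct.
    assert convert_rate(1.0, "mol/sec", "nmol/sec") == 1e9


test_nmol_per_sec()
```

Running `pytest` would pick up `test_nmol_per_sec` automatically; committing the failing test first and the fix afterwards gives the reviewer a clear before/after story.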
+### Step 5: Submitting a Pull Request

-
-#### Step 5: Submitting a Pull Request
-
-When you have finished adding your code and tests and have committed the changes to your
-local `i23-feature-name`, and are ready for the others in the team to review them, you
+When you have finished adding your code and tests and have committed the changes to your
+local `i23-feature-name` branch, and are ready for the others in the team to review them, you
 have to do the following:

 1. Push your local feature branch `i23-feature-name` remotely to the shared repository.

-   ~~~bash
-   $ git push -u origin i23-feature-name
-   ~~~
-2. Normally step one will provide a handy url for you to create the PR. However, if not,
-   or you wish to do it manualy, Head over to the remote repository in GitHub and locate
-   your new (`i23-feature-name`) branch from the dropdown box on the Code tab (you can
+
+   ```bash
+   git push -u origin i23-feature-name
+   ```
+
+1. Normally step one will provide a handy URL for you to create the PR. However, if not,
+   or you wish to do it manually, head over to the remote repository in GitHub and locate
+   your new (`i23-feature-name`) branch from the dropdown box on the Code tab (you can
    search for your branch or use the "View all branches" option).

 ![All repository branches in GitHub](fig/github-branches.png)

-3. Open a pull request by clicking "Compare & pull request" button.
+1. Open a pull request by clicking the "Compare & pull request" button.

 ![Submitting a pull request in GitHub](fig/github-create-pull-request.png)

-4. Select the base and the head branch, e.g. `main` and `i23-feature-name`,
-   respectively. Recall that the base branch is where you want your changes to be merged
+1. Select the base and the head branch, e.g. `main` and `i23-feature-name`,
+   respectively. Recall that the base branch is where you want your changes to be merged
    and the head branch contains your changes.
-5. Add a comment describing the nature of the changes, and then submit the pull request.
-6.
Repository moderator and other collaborators on the repository (code reviewers) will be notified of your pull request by GitHub.
-7. At this point, the code review process is initiated.
+1. Add a comment describing the nature of the changes, and then submit the pull request.
+1. The repository moderator and other collaborators on the repository (code reviewers) will be notified of your pull request by GitHub.
+1. At this point, the code review process is initiated.

 You should receive a similar pull request from other team members on your repository.

 #### Step 6: Code Review

-1. The repository moderator/code reviewers reviews your changes and provides feedback to you
+1. The repository moderator and code reviewers review your changes and provide feedback to you
 in the form of comments.
-2. Respond to their comments and do any subsequent commits, as requested by reviewers.
-3. The tests are automatically run by the Continuous Integration setup via Github
-   Actions, and a report will be generated. Once all tests pass your PR will be given a
+1. Respond to their comments and make any subsequent commits, as requested by reviewers.
+1. The tests are automatically run by the Continuous Integration setup via GitHub
+   Actions, and a report will be generated. Once all tests pass, your PR will be given a
 nice green tick.
-3. It may take a few rounds of exchanging comments, discussions, and additional commits
-   until the team is ready to accept your changes and all tests pass.
+1. It may take a few rounds of exchanging comments, discussions, and additional commits
+   until the team is ready to accept your changes and all tests pass.

 Perform the above actions on the pull request you received, this time acting as the
 moderator/code reviewer.

 #### Step 7: Closing a Pull Request

-1.
Once the moderator approves your changes and all tests pass, either one of you can + merge onto the base branch (who actually does the merging may differ from team to team). ![Merging a pull request in GitHub](fig/github-merge-pull-request.png) -2. Delete the merged branch to reduce the clutter in the repository. +1. Delete the merged branch to reduce the clutter in the repository. Repeat the above actions for the pull request you received. -If the work on the feature branch is completed and it is sufficiently tested, the +If the work on the feature branch is completed and it is sufficiently tested, the feature branch can now be merged into the `main` branch. ## Best Practice for Code Review - -There are multiple perspectives to a code review process - from general practices to technical details -relating to different roles involved in the process. It is critical for the code's quality, stability and maintainability -that the team decides on this process and sticks to it. Here are some examples of best practices for you to consider + +There are multiple perspectives to a code review process - from general practices to technical details +relating to different roles involved in the process. It is critical for the code's quality, stability and maintainability +that the team decides on this process and sticks to it. Here are some examples of best practices for you to consider (also check these useful code review blogs from [Swarmia](https://www.swarmia.com/blog/a-complete-guide-to-code-reviews/?utm_term=code%20review&utm_campaign=Code+review+best+practices&utm_source=adwords&utm_medium=ppc&hsa_acc=6644081770&hsa_cam=14940336179&hsa_grp=131344939434&hsa_ad=552679672005&hsa_src=g&hsa_tgt=kwd-17740433&hsa_kw=code%20review&hsa_mt=b&hsa_net=adwords&hsa_ver=3&gclid=Cj0KCQiAw9qOBhC-ARIsAG-rdn7_nhMMyE7aeSzosRRqZ52vafBOyMrpL4Ypru0PHWK4Rl8QLIhkeA0aAsxqEALw_wcB) and [Smartbear](https://smartbear.com/learn/code-review/best-practices-for-peer-code-review/)): - + 1. 
Decide the focus of your code review process, e.g., consider some of the following: - - code design and functionality - does the code fit in the overall design and does it do what was intended? + - code design and functionality - does the code fit in the overall design and does it do what was intended? - code understandability and complexity - is the code readable and would another developer be able to understand it? - tests - does the code have automated tests? - naming - are names used for variables and functions descriptive, do they follow naming conventions? - - comments and documentation - are there clear and useful comments that explain - complex designs well and focus on the "why/because" rather than the "what/how"? -2. Do not review code too quickly and do not review for too long in one sitting. According to - [“Best Kept Secrets of Peer Code Review” (Cohen, - 2006)](https://www.amazon.co.uk/Best-Kept-Secrets-Peer-Review/dp/1599160676) - the - first hour of review matters the most as detection of defects significantly drops - after this period. - [Studies into code - review](https://smartbear.com/resources/ebooks/the-state-of-code-review-2020-report/) - also show that you should not review more than 400 lines of code at a time. + - comments and documentation - are there clear and useful comments that explain + complex designs well and focus on the "why/because" rather than the "what/how"? +1. Do not review code too quickly and do not review for too long in one sitting. According to + [“Best Kept Secrets of Peer Code Review” (Cohen, 2006)](https://www.amazon.co.uk/Best-Kept-Secrets-Peer-Review/dp/1599160676) - the + first hour of review matters the most as detection of defects significantly drops + after this period. + [Studies into code + review](https://smartbear.com/resources/ebooks/the-state-of-code-review-2020-report/) + also show that you should not review more than 400 lines of code at a time. 
Conducting more frequent shorter reviews seems to be more effective.
-3. Decide on the level of depth for code reviews to maintain the balance between the creation time
-   and time spent reviewing code - e.g. reserve them for critical portions of code and
-   avoid nit-picking on small details. Try using automated checks and linters when
-   possible, e.g. for consistent usage of certain terminology across the code and code
+1. Decide on the level of depth for code reviews to maintain the balance between the time
+   spent creating code and the time spent reviewing it - e.g. reserve reviews for critical portions of code and
+   avoid nit-picking on small details. Try using automated checks and linters when
+   possible, e.g. for consistent usage of certain terminology across the code and code
    styles.
-4. Communicate clearly and effectively - when reviewing code, be explicit about the action you request from the author.
-5. Foster a positive feedback culture:
-   - give feedback about the code, not about the author
-   - accept that there are multiple correct solutions to a problem
-   - sandwich criticism with positive comments and praise
-7. Utilise multiple code review techniques - use email, pair programming,
-   over-the-shoulder, team discussions and tool-assisted or any combination that works
-   for your team. However, for the most effective and efficient code reviews,
+1. Communicate clearly and effectively - when reviewing code, be explicit about the action you request from the author.
+1. Foster a positive feedback culture:
+
+   - give feedback about the code, not about the author
+   - accept that there are multiple correct solutions to a problem
+   - sandwich criticism with positive comments and praise
+
+1. Utilise multiple code review techniques - use email, pair programming,
+   over-the-shoulder, team discussions and tool-assisted or any combination that works
+   for your team. However, for the most effective and efficient code reviews,
    a tool-assisted process is recommended.
-9.
From a more technical perspective:
+1. From a more technical perspective:
   - use a feature branch for pull requests as you can push follow-up commits if you need to update your proposed changes
-   - avoid large pull requests as they are more difficult to review. You can refer to
-     some [studies](https://jserd.springeropen.com/articles/10.1186/s40411-018-0058-0)
-     and [Google
-     recommendations](https://google.github.io/eng-practices/review/developer/small-cls.html)
+   - avoid large pull requests as they are more difficult to review. You can refer to
+     some [studies](https://jserd.springeropen.com/articles/10.1186/s40411-018-0058-0)
+     and [Google
+     recommendations](https://google.github.io/eng-practices/review/developer/small-cls.html)
      as to what a "large pull request" is but be aware that it is not an exact science.
   - don't force push to a pull request as it changes the repository history and can corrupt your pull request for other collaborators
-   - use pull request states in GitHub effectively (based on your team's code review
-     process) - e.g. in GitHub you can open a pull request in a `DRAFT` state to show
-     progress or request early feedback; `READY FOR REVIEW` when you are ready for
-     feedback; `CHANGES REQUESTED` to let the author know they need to fix the requested
-     changes or discuss more; `APPROVED` to let the author they can merge their pull
+   - use pull request states in GitHub effectively (based on your team's code review
+     process) - e.g. in GitHub you can open a pull request in a `DRAFT` state to show
+     progress or request early feedback; `READY FOR REVIEW` when you are ready for
+     feedback; `CHANGES REQUESTED` to let the author know they need to fix the requested
+     changes or discuss more; `APPROVED` to let the author know they can merge their pull
      request.
:::challenge{id=own-environment title="Code Review in Your Own Working Environment"} -At the start of this episode we briefly looked at a number of techniques for doing code -review, and as an example, went on to see how we can use GitHub Pull Requests to review -team member code changes. Finally, we also looked at some best practices for doing code +At the start of this episode we briefly looked at a number of techniques for doing code +review, and as an example, went on to see how we can use GitHub Pull Requests to review +team member code changes. Finally, we also looked at some best practices for doing code reviews in general. -Now think about how you typically develop code, and how you might institute code review -practices within your own working environment. Write down briefly for your own reference +Now think about how you typically develop code, and how you might institute code review +practices within your own working environment. Write down briefly for your own reference (perhaps using bullet points) some answers to the following questions: - Which 2 or 3 key circumstances would code review be most useful for you and your colleagues? @@ -384,8 +392,10 @@ practices within your own working environment. Write down briefly for your own r - How long would the activity take? - Who would ideally be involved? - Any particular practices you would use? + ::: -## Key Points: +## Key Points + - Code review is a team software quality assurance practice where team members look at parts of the codebase in order to improve their code's readability, understandability, quality and maintainability. -- It is important to agree on a set of best practices and establish a code review process in a team to help to sustain a good, stable and maintainable code for many years. \ No newline at end of file +- It is important to agree on a set of best practices and establish a code review process in a team to help to sustain a good, stable and maintainable code for many years. 
diff --git a/software_project_management/continuous_integration/code_coverage.md b/software_project_management/continuous_integration/code_coverage.md index d6648630..cd5d776c 100644 --- a/software_project_management/continuous_integration/code_coverage.md +++ b/software_project_management/continuous_integration/code_coverage.md @@ -1,20 +1,22 @@ --- name: Code Coverage -dependsOn: [ - software_project_management.continuous_integration.github_actions -] +dependsOn: [software_project_management.continuous_integration.github_actions] tags: [codecov, github] -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Describe the types of code coverage. + - List the benefits of code coverage. + - Add and configure use of code coverage tool to our software project. + - Run a code coverage tool to understand how much of our code is being tested using unit tests. + - Configure GitHub Actions to automate the process of code coverage analysis over a code repository. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+    url: https://www.sabsr3.ox.ac.uk
+    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
 ---

 ## Testing code coverage

@@ -60,18 +62,18 @@

 Once you have done that, you should see your repository on the Codecov webpage.

 Next to your repository, you should see text like:

-> Not yet enabled [setup repo]()
+> Not yet enabled [setup repo]

 You will see a token of the form `CODECOV_TOKEN=XXXXXXXXXX`.
 This token will allow your GitHub Actions workflow to communicate with Codecov.
 It is not strictly needed if your repository is public, but let's run through the process of setting it up anyway.

 1. Go to your repository on GitHub
-1. Go to *Settings*
-1. On the left, go to *Secrets and variables* then to *Actions*
-1. Click *New repository secret*
+1. Go to _Settings_
+1. On the left, go to _Secrets and variables_ then to _Actions_
+1. Click _New repository secret_
 1. Set the name to `CODECOV_TOKEN` and the value to the token string, copied from Codecov
-1. Click *Add secret*
+1. Click _Add secret_

 This will allow you to use the token in a GitHub Actions workflow, without anyone who looks at your workflow seeing what the token is.
 We will see shortly how this is used.

@@ -83,15 +85,15 @@

 Your repository is now set up and ready to use with Codecov.

 We will use the [pytest-cov](https://pytest-cov.readthedocs.io/en/latest/) tool to generate coverage information when running our unit tests with pytest.
To install this locally, using pip, run:

-~~~ bash
+```bash
 pip install pytest-cov
-~~~
+```

 We can now run pytest in the following way to generate an xml file containing coverage information for the whole project:

-~~~ bash
+```bash
 pytest --cov-config=.coveragerc --cov=./ci_course --cov-report=xml
-~~~
+```

 Let's break that down:

@@ -112,69 +114,67 @@

 Then, run the pytest command above and verify that it generates a `coverage.xml` file.

 The file should contain something along these lines:

-~~~ xml
+```xml
 <?xml version="1.0" ?>
 <coverage ...>
   <sources>
     <source>/home/runner/work/test_ci/test_ci/ci_course</source>
   </sources>
   <!-- per-package, per-class and per-line coverage entries follow -->
 </coverage>
-~~~
+```
+
 :::
 ::::

-
 ## Use GitHub Actions to automate this process

 We want to generate coverage, and upload the results to Codecov, every time someone commits or opens a pull request.

-
-
 ::::challenge{id="run-coverage-on-github" title="Run coverage on GitHub"}

 Write a GitHub Actions workflow that generates a coverage report, and uploads it to codecov.
You will find the following step helpful: -~~~ yml +```yml - name: Upload coverage reports to Codecov uses: codecov/codecov-action@v3 with: token: ${{ secrets.CODECOV_TOKEN }} fail_ci_if_error: true files: coverage.xml -~~~ +``` You can read more about the `codecov-action` step here: [https://github.com/codecov/codecov-action](https://github.com/codecov/codecov-action) @@ -183,51 +183,49 @@ You can read more about the `codecov-action` step here: The file should contain something along these lines: -~~~ yml +```yml name: Coverage on: push: - branches: [ "main" ] + branches: ["main"] pull_request: - branches: [ "main" ] + branches: ["main"] workflow_dispatch: jobs: build: - runs-on: ubuntu-latest steps: + - uses: actions/checkout@v3 + + - name: Set up Python 3.10 + uses: actions/setup-python@v3 + with: + python-version: "3.10" + + - name: Install dependencies + run: | + python -m pip install --upgrade pip setuptools wheel + python -m pip install .[dev] + + - name: Run coverage + run: | + pytest --cov-config=.coveragerc --cov=./ci_course --cov-report=xml + cat coverage.xml + + - name: Upload coverage reports to Codecov + uses: codecov/codecov-action@v3 + with: + token: ${{ secrets.CODECOV_TOKEN }} + fail_ci_if_error: true + files: coverage.xml +``` - - uses: actions/checkout@v3 - - - name: Set up Python 3.10 - uses: actions/setup-python@v3 - with: - python-version: "3.10" - - - name: Install dependencies - run: | - python -m pip install --upgrade pip setuptools wheel - python -m pip install .[dev] - - - name: Run coverage - run: | - pytest --cov-config=.coveragerc --cov=./ci_course --cov-report=xml - cat coverage.xml - - - name: Upload coverage reports to Codecov - uses: codecov/codecov-action@v3 - with: - token: ${{ secrets.CODECOV_TOKEN }} - fail_ci_if_error: true - files: coverage.xml -~~~ ::: :::: - Once you have committed this workflow file, go to GitHub and check that it has run successfully. If it has, go to Codecov and explore the coverage report. 
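The `--cov-config=.coveragerc` option used in the commands above assumes a `.coveragerc` file at the repository root. As a hedged sketch, the option names below are standard coverage.py settings, but the omit patterns are only plausible guesses for this project:

```ini
# Hypothetical .coveragerc -- adjust the omit patterns to your own layout.
[run]
# Leave test files and packaging scripts out of the measurement
omit =
    */tests/*
    setup.py

[report]
# Lines matching these regexes are never counted as missing
exclude_lines =
    pragma: no cover
    if __name__ == .__main__.:
```

With this in place, the same pytest command works locally and in the workflow, so the coverage numbers reported by Codecov match what you see on your machine.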
@@ -243,9 +241,10 @@ Identify which parts of the project are not covered, and write a new test to ens

 Update `test_functionality.py`, for instance by adding the following line to `test_minimum()`:

-~~~ python
+```python nolint
 assert ci_course.minimum("hi", "there") is None
-~~~
+```
+
 :::
 ::::
diff --git a/software_project_management/continuous_integration/documentation.md b/software_project_management/continuous_integration/documentation.md
index bf4dce60..9ece923d 100644
--- a/software_project_management/continuous_integration/documentation.md
+++ b/software_project_management/continuous_integration/documentation.md
@@ -1,24 +1,22 @@
 ---
 name: Documentation
-dependsOn: [
-  software_project_management.continuous_integration.code_coverage
-]
-
+dependsOn: [software_project_management.continuous_integration.code_coverage]
 tags: [sphinx, readthedocs]
-attribution:
-  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
-    url: https://www.sabsr3.ox.ac.uk
-    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
-
+learningOutcomes:
+  - List benefits of having good documentation for software.
+  - Describe the key features of the Sphinx and Read the Docs documentation and hosting tools.
+  - Use Sphinx to generate documentation for a software project.
+attribution:
+  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - Writing documentation for software is crucial. 1. **Improves Code Understanding**: Good documentation gives a clear understanding of the flow, architecture, and functionalities of the software, reducing the time it takes for new team members to understand how things work. @@ -45,7 +43,6 @@ Writing documentation for software is crucial. We will be using [Sphinx](https://www.sphinx-doc.org/en/master/) and [Read the Docs](https://readthedocs.org/) to create and deploy documentation pages for our repository. - ## Sphinx Sphinx is a powerful and flexible open-source documentation generation tool primarily used for Python, but it can be used for other programming languages as well. @@ -62,7 +59,6 @@ Here are some key features: 1. **Theming Support**: Sphinx supports themes for its HTML output, allowing documentation to match the aesthetic and branding of a project or organization. - ## Read the Docs Read the Docs is a free and open-source platform for hosting software documentation. @@ -80,14 +76,13 @@ Here are some of its key features: 1. **PDF and EPUB Export**: Users can download a PDF or EPUB version of your documentation for offline reading. - ## Getting started From your repository, run: -~~~ bash +```bash pip install -e ."[dev,docs]" -~~~ +``` to ensure you have all development and documentation dependencies installed. @@ -95,9 +90,9 @@ Next, create a directory at the top level of your project called `docs`. 
From the `docs` directory, run

-~~~
+```shell
 sphinx-quickstart
-~~~
+```

 Use the default values, but fill in a unique project name.

@@ -114,7 +109,7 @@

 Next, go to [Read the Docs](https://readthedocs.org/).

 - Follow the instructions, leaving everything as default

 You should then see your documentation building!
-Wait for it to complete, and then click *View Docs*.
+Wait for it to complete, and then click _View Docs_.
 This will take you to the website `https://<project-name>.readthedocs.io/en/latest/`.

 ::::challenge{id="start-documenting" title="Start documenting"}
diff --git a/software_project_management/continuous_integration/github_actions.md b/software_project_management/continuous_integration/github_actions.md
index 37dc43d5..e4af1454 100644
--- a/software_project_management/continuous_integration/github_actions.md
+++ b/software_project_management/continuous_integration/github_actions.md
@@ -1,31 +1,34 @@
 ---
 name: GitHub Actions
-dependsOn: [
-]
+dependsOn: []
 tags: [github]
-attribution:
-  - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training.
-    url: https://www.sabsr3.ox.ac.uk
-    image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
-
+learningOutcomes:
+  - Describe the structure and steps of a basic GitHub Actions workflow.
+  - Build a basic workflow and run it on GitHub.
+  - Create a workflow for a Python program to run a static code analysis tool and unit tests over the codebase.
+  - Diagnose and fix a workflow fault.
+  - Parameterise the running of a workflow over multiple operating systems.
+attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. + url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Overview -With a GitHub repository there's a very easy way to set up CI that runs when your -repository changes: simply add a [.yml file](https://learnxinyminutes.com/docs/yaml/) to your repository in the directory +With a GitHub repository there's a very easy way to set up CI that runs when your +repository changes: simply add a [.yml file](https://learnxinyminutes.com/docs/yaml/) to your repository in the directory -~~~ +```text .github/workflows -~~~ +``` -Each file in this directory represents a *workflow* and will, when triggered, spin up a virtual machine and run the sequence of commands in the file. +Each file in this directory represents a _workflow_ and will, when triggered, spin up a virtual machine and run the sequence of commands in the file. Information about the specifications of these VMs can be found [here](https://docs.github.com/en/free-pro-team@latest/actions/reference/specifications-for-github-hosted-runners). At the time of writing, each VM will have a 2-core CPU, 7GB of RAM and 14 GB of SSD space available, and each workflow can run for up to 6 hours. @@ -39,13 +42,13 @@ In this section you will create several workflows by using the wizard and built- We will start with a minimal example to demonstrate various features of a GitHub Actions workflow. 
Create a file in your repository called:

-~~~
+```text
 .github/workflows/basic.yml
-~~~
+```

 Copy the following code, then commit and push the changes to GitHub.

-~~~ yml
+```yml
 name: Basic GitHub Actions Workflow

 on:
@@ -58,9 +61,9 @@ jobs:
     runs-on: ubuntu-latest

     steps:
-    - name: Run a one-line script
-      run: echo "Hello, world!"
-~~~
+      - name: Run a one-line script
+        run: echo "Hello, world!"
+```

 Here's a brief breakdown of this basic workflow:

@@ -76,26 +79,26 @@

 6. The `run` field specifies the command to run. Here, it's just echoing "Hello, world!" to the console.

-If you now navigate to the *Actions* tab on your GitHub repository, you should see that this workflow has run and succeeded.
+If you now navigate to the _Actions_ tab on your GitHub repository, you should see that this workflow has run and succeeded.
 In this case it was run because we just pushed a change.
-We can also trigger this workflow by opening a pull request, or by navigating navigating to the workflow via the *Actions* tab and then selecting the *Run Workflow" dropdown (this is the `workflow_dispatch` trigger).
+We can also trigger this workflow by opening a pull request, or by navigating to the workflow via the _Actions_ tab and then selecting the _Run Workflow_ dropdown (this is the `workflow_dispatch` trigger).

 ## Creating a Python-specific workflow

 Now let's do something more useful.
-Navigate to the GitHub *Actions* tab and then click *New Workflow* (near the top left).
+Navigate to the GitHub _Actions_ tab and then click _New Workflow_ (near the top left).
 This will let us start with a preset workflow containing many of the elements we are interested in.
-Search for "python package" and select the following workflow by pressing *Configure*: +Search for "python package" and select the following workflow by pressing _Configure_: -~~~ +```text Python package By GitHub Actions Create and test a Python package on multiple Python versions. -~~~ +``` This takes us into the web editor. We will make the following changes to the workflow: @@ -105,15 +108,16 @@ We will make the following changes to the workflow: 1. add the `workflow_dispatch` trigger, just like in the basic file 1. Change the "Install dependencies" step to run the following block: - ~~~ bash - python -m pip install --upgrade pip setuptools wheel - python -m pip install .[dev] - ~~~ + + ```bash + python -m pip install --upgrade pip setuptools wheel + python -m pip install .[dev] + ``` 1. Change the "Lint with flake8" step to just run `flake8` (with no options at all) Then use the web interface to commit the changes. -Go over to the *Actions* tab to see it running. +Go over to the _Actions_ tab to see it running. Let's go through what is happening in this workflow: @@ -137,7 +141,6 @@ This job consists of a series of steps: 5. **Test with pytest:** The last step runs the `pytest` command to execute tests. `pytest` is a Python testing framework. - ## Identify and fix the errors If all has gone to plan, the workflow should fail. @@ -173,7 +176,7 @@ Push your new workflow, and check that it runs as expected. 
The full file might look like this: -~~~ yml +```yml # This workflow will install Python dependencies, run tests and lint with a variety of Python versions # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python @@ -181,14 +184,13 @@ name: Operating systems on: push: - branches: [ "main" ] + branches: ["main"] pull_request: - branches: [ "main" ] + branches: ["main"] workflow_dispatch: jobs: build: - runs-on: ${{ matrix.os }} strategy: fail-fast: false @@ -196,28 +198,26 @@ jobs: os: [ubuntu-latest, macos-latest, windows-latest] steps: - - uses: actions/checkout@v3 - - name: Set up Python 3.10 - uses: actions/setup-python@v3 - with: - python-version: "3.10" - - name: Install dependencies - run: | - python -m pip install --upgrade pip setuptools wheel - python -m pip install .[dev] - - name: Lint with flake8 - run: | - flake8 - - name: Test with pytest - run: | - pytest - -~~~ + - uses: actions/checkout@v3 + - name: Set up Python 3.10 + uses: actions/setup-python@v3 + with: + python-version: "3.10" + - name: Install dependencies + run: | + python -m pip install --upgrade pip setuptools wheel + python -m pip install .[dev] + - name: Lint with flake8 + run: | + flake8 + - name: Test with pytest + run: | + pytest +``` ::: :::: - ## Next steps 1. \[optional\] read more about [GitHub's hosted runners](https://docs.github.com/en/free-pro-team@latest/actions/reference/specifications-for-github-hosted-runners). 
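One further note on the workflow above: the `matrix` block is not limited to operating systems and can vary several dimensions at once. The sketch below is illustrative only (the Python version list is an assumption, not part of the course repository); it combines the OS matrix with multiple Python versions:

```yml
jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
```

GitHub expands every OS/version combination into its own job, so this sketch would start nine jobs in parallel.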
diff --git a/software_project_management/continuous_integration/index.md b/software_project_management/continuous_integration/index.md index 23693318..fee0e3df 100644 --- a/software_project_management/continuous_integration/index.md +++ b/software_project_management/continuous_integration/index.md @@ -1,27 +1,19 @@ --- name: Continuous Integration id: continuous_integration -dependsOn: [ - software_project_management.collaboration -] -files: [ - github_actions.md, - code_coverage.md, - documentation.md -] +dependsOn: [software_project_management.collaboration] +files: [github_actions.md, code_coverage.md, documentation.md] summary: | - This course introduces the concept of continuous integration and how to set it up for a Python project using GitHub Actions. -attribution: - - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. - url: https://www.sabsr3.ox.ac.uk - image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - + This course introduces the concept of continuous integration and how to set it up for a Python project using GitHub Actions. +attribution: + - citation: This material has been adapted from the "Software Engineering" module of the SABS R³ Center for Doctoral Training. 
+ url: https://www.sabsr3.ox.ac.uk + image: https://www.sabsr3.ox.ac.uk/sites/default/files/styles/site_logo/public/styles/site_logo/public/sabsr3/site-logo/sabs_r3_cdt_logo_v3_111x109.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## What is Continuous Integration? @@ -60,7 +52,6 @@ Here are some key princples of CI: 1. **Automate deployment**: The process of deploying the software should be automated, ensuring it's easily reproducible and reducing the chances of human error. This might include creating Python wheels and deploying them to [PyPI](https://pypi.org/), or building public documentation pages. - ## How do we do it? There are many CI infrastructures and services, free and paid for, and subject to change as they evolve their features. @@ -70,7 +61,6 @@ In this course you will be using [GitHub Actions](https://github.com/features/ac There are other free options, for instance [Travis CI](https://travis-ci.com/) and [AppVeyor](https://www.appveyor.com/). All three of these make use of common features across many CI implementations, and you are certainly advised to look at the options to see some of the commonalities and differences in how features are typically provided. - ## This course In this course we aim to walk you through a hands-on session which will set up CI for a small Python project, and see some of its benefits in action. @@ -80,7 +70,6 @@ We will go through: 1. Generating code coverage information ([link](./continuous_integration/code_coverage)) 1. 
Generating and deploying documentation ([link](./continuous_integration/documentation))

-
## Getting started

Go to [https://github.com/OxfordRSE/ci_course_start](https://github.com/OxfordRSE/ci_course_start) and press "Use this template" -> "Create a new repository".
diff --git a/software_project_management/index.md b/software_project_management/index.md
index 240b9d06..ad8b866b 100644
--- a/software_project_management/index.md
+++ b/software_project_management/index.md
@@ -9,4 +9,4 @@ summary: |
   Courses on software project management such as on collaborative programming and continuous integration.
---

-Courses on software project management such as on collaborative programming and continuous integration.
\ No newline at end of file
+Courses on software project management such as on collaborative programming and continuous integration.
diff --git a/technology_and_tooling/bash_shell/01-intro.md b/technology_and_tooling/bash_shell/01-intro.md
index ee6ade89..7290538f 100644
--- a/technology_and_tooling/bash_shell/01-intro.md
+++ b/technology_and_tooling/bash_shell/01-intro.md
@@ -1,16 +1,17 @@
---
name: Introducing the Shell
-dependsOn: [
-]
+dependsOn: []
tags: [bash]
+learningOutcomes:
+  - Describe the benefits of using the shell over other styles of interface.
attribution:
-- citation: >
+  - citation: >
      This material was originally taken from training materials developed by the
      University of Southampton Research Software Group, which are based on
      the Software Carpentries course "Version Control with Git".
-  url: https://github.com/Southampton-RSG-Training/shell-novice/
-  image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png
-  license: CC-BY-4.0
+    url: https://github.com/Southampton-RSG-Training/shell-novice/
+    image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png
+    license: CC-BY-4.0
---

The Bash shell is a text-based program that interactively allows you to run other programs.
@@ -23,30 +24,31 @@ And then the shell allows you to type in another command…
And so on.

:::callout
+
## Analogies

Imagine that working with the shell is a little like working with a voice assistant like Siri or Alexa. You ask your computer questions, and your computer responds with an answer.
:::

-The shell is called *the shell* because it encloses the machine's **operating system** - which could be Windows, Mac OS X, or Linux - giving you a wrapper-like interface to interact with it. Another, more general way, of referring to the shell is the **command line**, since it provides an interface into which you type commands line-by-line.
+The shell is called _the shell_ because it encloses the machine's **operating system** - which could be Windows, Mac OS X, or Linux - giving you a wrapper-like interface to interact with it. Another, more general way of referring to the shell is the **command line**, since it provides an interface into which you type commands line-by-line.

## Why use it?

So why use the Bash shell?

- **Capturing a process:** Being able to capture how programs are run and in what order in a Bash script - and essentially automating how we run that process - is invaluable.
-It's really helpful with making your pipelines reproducible: once you've defined this process in a script, you can re-run it whenever you want.
-This is both helpful for others to achieve the same results, but also for yourself
-perhaps six months from now, when it would be otherwise difficult to remember exactly what you did.
-What you are effectively doing is building a narrative - telling a story in recorded, programmatic form - of how you generated your research results.
+ It's really helpful with making your pipelines reproducible: once you've defined this process in a script, you can re-run it whenever you want.
+ This is both helpful for others to achieve the same results, but also for yourself
+ perhaps six months from now, when it would be otherwise difficult to remember exactly what you did.
+ What you are effectively doing is building a narrative - telling a story in recorded, programmatic form - of how you generated your research results. - **Repetition:** Bash is great at repeating the same commands many times. -This could be renaming a hundred files in a certain way, or something more complex, such as running a data analysis program over many input data files, -or running a program to generate a chart for every one of those output data files produced by that program. + This could be renaming a hundred files in a certain way, or something more complex, such as running a data analysis program over many input data files, + or running a program to generate a chart for every one of those output data files produced by that program. - **Availability:** Bash is available on different types of machines. -You can already use the Bash shell on computers like Macs and those that run Linux, where it's already installed, but you can also install and use it on Windows. + You can already use the Bash shell on computers like Macs and those that run Linux, where it's already installed, but you can also install and use it on Windows. - **Using other computational resources:** if you need to use another computational resource, such as a supercomputer to run your programs even faster, they almost exclusively use the shell. diff --git a/technology_and_tooling/bash_shell/02-filedir.md b/technology_and_tooling/bash_shell/02-filedir.md index fcd4b4b3..d718b860 100644 --- a/technology_and_tooling/bash_shell/02-filedir.md +++ b/technology_and_tooling/bash_shell/02-filedir.md @@ -1,17 +1,20 @@ --- name: Files and Directories -dependsOn: [ - technology_and_tooling.bash_shell.01-intro -] +dependsOn: [technology_and_tooling.bash_shell.01-intro] tags: [bash] +learningOutcomes: + - Translate an absolute path into a relative path and vice versa. + - Construct absolute and relative paths that identify specific files and directories. 
+ - Use options and arguments to change the behaviour of a shell command. + - Demonstrate the use of tab completion and explain its advantages. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/shell-novice/ - image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/shell-novice/ + image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- The part of the operating system responsible for managing files and directories is called the **file system**. @@ -20,13 +23,13 @@ which hold information, and directories (also called "folders", for example, on Windows systems), which hold files or other directories. -The shell has a notion of *where you currently are*, and as we'll see, works by running programs at that location. For this reason, the most fundamental skills to using the shell are navigating and browsing the file system, so let's take a look at some important commands that let us do these things. +The shell has a notion of _where you currently are_, and as we'll see, works by running programs at that location. For this reason, the most fundamental skills to using the shell are navigating and browsing the file system, so let's take a look at some important commands that let us do these things. To start exploring them, let's open a shell window: -~~~bash +```bash $ -~~~ +``` The dollar sign is a **prompt**, which represents our input interface to the shell. @@ -34,25 +37,26 @@ It shows us that the shell is waiting for input; your shell may show something more elaborate. 
### Working out who we are and where we are + Type the command `whoami`, then press the `Enter` key (sometimes called `Return`) to send the command to the shell. The command's output is the identity of the current user, i.e., it shows us who the shell thinks we are (yours will be something different!): -~~~bash -$ whoami -~~~ +```bash +whoami +``` -~~~ +```text nelle -~~~ +``` So what's happening? When we type `whoami` the shell: -1. Finds a program called `whoami` -2. Runs that program -3. Displays that program's output (if there is any), then -4. Displays a new prompt to tell us that it's ready for more commands +1. Finds a program called `whoami` +2. Runs that program +3. Displays that program's output (if there is any), then +4. Displays a new prompt to tell us that it's ready for more commands Next, let's find out where we are in our file system by running a command called `pwd` (which stands for "print working directory"). @@ -65,15 +69,16 @@ Here, the computer's response is `/Users/nelle`, which is Nelle's **home directory**: -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle -~~~ +``` :::callout + ## Home directory The home directory path will look different on different operating systems. @@ -85,6 +90,7 @@ Windows. ::: :::callout + ## Alphabet Soup If the command to find out who we are is `whoami`, the command to find @@ -99,7 +105,8 @@ its jargon. The result is as inconsistent as the roolz uv Inglish speling, but we're stuck with it now. ::: -:::callout +:::callout{variant="tip"} + ## Real typing timesavers Save yourself some unnecessary keypresses! @@ -147,12 +154,13 @@ which is why `nelle` is the last part of the directory's name. ![2. Home Directories](fig/home-directories.png) -:::callout +:::callout{variant="info"} + ## Path Notice that there are two meanings for the `/` character. When it appears at the front of a file or directory name, -it refers to the root directory. 
When it appears *inside* a name, +it refers to the root directory. When it appears _inside_ a name, it's just a separator. ::: @@ -163,11 +171,11 @@ But how can we tell what's in directories, and how can we move around the file s We're currently in our home directory, and can see what's in it by running `ls`, which stands for "listing" (the `...` refers to other files and directories that have been left out for clarity): -~~~bash -$ ls -~~~ +```bash +ls +``` -~~~ +```text shell-novice Misc Solar.pdf Applications Movies Teaching Desktop Music ThunderbirdTemp @@ -175,7 +183,7 @@ Development Notes.txt VirtualBox VMs Documents Pictures bin Downloads Pizza.cfg mbox ... -~~~ +``` Of course, this listing will depend on what you have in your own home directory. @@ -187,33 +195,33 @@ We need to get into the repository directory `shell-novice`, so what if we want Before we do this, `pwd` shows us that we're in `/Users/nelle`. -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle -~~~ +``` Let's first get hold of some example files we can explore. First, download the example zip file to your home directory. If on WSL or Linux (e.g. Ubuntu or the Ubuntu VM), then do: -~~~bash -$ wget https://train.oxrse.uk/material/HPCu/technology_and_tooling/bash_shell/shell-novice.zip -~~~ +```bash +wget https://train.oxrse.uk/material/HPCu/technology_and_tooling/bash_shell/shell-novice.zip +``` Or, if on a Mac, do: -~~~bash -$ curl -O https://train.oxrse.uk/material/HPCu/technology_and_tooling/bash_shell/shell-novice.zip -~~~ +```bash +curl -O https://train.oxrse.uk/material/HPCu/technology_and_tooling/bash_shell/shell-novice.zip +``` Once done, you can unzip this file using the `unzip` command in Bash, which will unpack all the files in this zip archive into the current directory: -~~~bash -$ unzip shell-novice.zip -~~~ +```bash +unzip shell-novice.zip +``` If you do `ls` now, you should see a new `shell-novice` directory. 
@@ -223,51 +231,51 @@ which is a bit misleading: the command doesn't change the directory, it changes the shell's idea of what directory we are in. -~~~bash -$ cd shell-novice -~~~ +```bash +cd shell-novice +``` `cd` doesn't print anything, but if we run `pwd` after it, we can see that we are now in `/Users/nelle/shell-novice`: -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle/shell-novice -~~~ +``` If we run `ls` without arguments now, it lists the contents of `/Users/nelle/shell-novice`, because that's where we now are: -~~~bash -$ ls -~~~ +```bash +ls +``` -~~~ -AUTHORS Gemfile _config.yml _includes bin files setup.md -CITATION LICENSE.md _episodes _layouts code index.md shell -CODE_OF_CONDUCT.md Makefile _episodes_rmd aio.md data reference.md slides -CONTRIBUTING.md README.md _extras assets fig requirements.txt -~~~ +```text +AUTHORS Gemfile _config.yml _includes bin files setup.md +CITATION LICENSE.md _episodes _layouts code index.md shell +CODE_OF_CONDUCT.md Makefile _episodes_rmd aio.md data reference.md slides +CONTRIBUTING.md README.md _extras assets fig requirements.txt +``` `ls` prints the names of the files and directories in the current directory in alphabetical order, arranged neatly into columns (where there is space to do so). 
We can make its output more comprehensible by using the **flag** `-F`, which tells `ls` to add a trailing `/` to the names of directories: -~~~bash -$ ls -F -~~~ +```bash +ls -F +``` -~~~ -AUTHORS Gemfile _config.yml _includes/ bin/ files/ setup.md -CITATION LICENSE.md _episodes/ _layouts/ code/ index.md shell/ -CODE_OF_CONDUCT.md Makefile _episodes_rmd/ aio.md data/ reference.md slides/ -CONTRIBUTING.md README.md _extras/ assets/ fig/ requirements.txt -~~~ +```text +AUTHORS Gemfile _config.yml _includes/ bin/ files/ setup.md +CITATION LICENSE.md _episodes/ _layouts/ code/ index.md shell/ +CODE_OF_CONDUCT.md Makefile _episodes_rmd/ aio.md data/ reference.md slides/ +CONTRIBUTING.md README.md _extras/ assets/ fig/ requirements.txt +``` Here, we can see that this directory contains a number of **sub-directories**. @@ -280,6 +288,7 @@ the shell thinks we're trying to run a command called `ls-F`, which doesn't exist. :::callout + ## What's In A Name? You may have noticed that all of these files' names are "something dot @@ -296,41 +305,41 @@ bytes: it's up to us and our programs to interpret those bytes according to the rules for PDF documents, images, and so on. Naming a PNG image of a whale as `whale.mp3` doesn't somehow -magically turn it into a recording of whalesong, though it *might* +magically turn it into a recording of whalesong, though it _might_ cause the operating system to try to open it with a music player when someone double-clicks it. ::: For this exercise, we need to change our working directory to `shell-novice`, and then `shell` (within the `shell-novice` directory). As we have already used cd to move into `shell-novice` we can get to `shell` by using `cd` again: -~~~bash -$ cd shell -~~~ +```bash +cd shell +``` Note that we are able to add directories together by using `/`. 
Now if we view the contents of that directory:

-~~~bash
-$ ls -F
-~~~
+```bash
+ls -F
+```

-~~~
-shell-novice-data.zip tools/ test_directory/
-~~~
+```text
+shell-novice-data.zip tools/ test_directory/
+```

Note that under Git Bash in Windows, the `/` is appended automatically.

-Now let's take a look at what's in the directory `test_directory`, by running `ls -F test_directory`. So here, we're giving the shell the command `ls` with the **arguments** `-F` and `test_directory`. The first argument is the `-F` flag we've seen before. The second argument --- the one *without* a leading dash --- tells `ls` that
+Now let's take a look at what's in the directory `test_directory`, by running `ls -F test_directory`. So here, we're giving the shell the command `ls` with the **arguments** `-F` and `test_directory`. The first argument is the `-F` flag we've seen before. The second argument --- the one _without_ a leading dash --- tells `ls` that
we want a listing of something other than our current working directory:

-~~~bash
-$ ls -F test_directory
-~~~
+```bash
+ls -F test_directory
+```

-~~~
+```text
creatures/ molecules/ notes.txt solar.pdf
data/ north-pacific-gyre/ pizza.cfg writing/
-~~~
+```

The output shows us that there are some files and sub-directories.
Organising things hierarchically in this way helps us keep track of our work:
@@ -345,9 +354,10 @@ it tells `ls` how to find
something from where we are,
rather than from the root of the file system.

:::callout
+
## Parameters vs. Arguments

According to [Wikipedia](<https://en.wikipedia.org/wiki/Parameter_(computer_programming)#Parameters_and_arguments>),
the terms argument and **parameter**
mean slightly different things.
In practice,
most people use them interchangeably or inconsistently, so we will too.
:::

-If we run `ls -F /test_directory` (*with* a leading slash) we get a different response,
+If we run `ls -F /test_directory` (_with_ a leading slash) we get a different response,
because `/test_directory` is an **absolute path**:

-~~~bash
-$ ls -F /test_directory
-~~~
+```bash
+ls -F /test_directory
+```

-~~~
+```text
ls: /test_directory: No such file or directory
-~~~
+```

The leading `/` tells the computer to follow the path from the root of the
file system,
so it always refers to exactly one directory,
no matter where we are when we run the command.
In this case, there is no `test_directory` directory in the root of the
file system.

-Typing `ls -F test_directory` is a bit painful, so a handy shortcut is to type in the first few letters and press the *TAB* key, for example:
+Typing `ls -F test_directory` is a bit painful, so a handy shortcut is to type in the first few letters and press the _TAB_ key, for example:

-~~~bash
-$ ls -F tes
-~~~
+```bash
+ls -F tes
+```

-Pressing *TAB*, the shell automatically completes the directory name:
+Pressing _TAB_, the shell automatically completes the directory name:

-~~~bash
-$ ls -F test_directory/
-~~~
+```bash
+ls -F test_directory/
+```

-This is known as *tab completion* on any matches with those first few letters.
-If there are more than one files or directories that match those letters, the shell will show you both --- you can then enter more characters (then using *TAB* again) until it is able to identify the precise file you want and finish the tab completion.
+This is known as _tab completion_ on any matches with those first few letters.
+If more than one file or directory matches those letters, the shell will show you all of them --- you can then enter more characters (and press _TAB_ again) until it is able to identify the precise file you want and finish the tab completion.
Let's change our directory to `test_directory`: -~~~bash -$ cd test_directory -~~~ +```bash +cd test_directory +``` We know how to go down the directory tree: but how do we go up? @@ -399,48 +409,48 @@ We could use an absolute path, e.g. `cd /Users/nelle/shell-novice/novice/shell`. but it's almost always simpler to use `cd ..` to go up one level: -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle/shell-novice/novice/shell/test_directory -~~~ +``` -~~~bash -$ cd .. -~~~ +```bash +cd .. +``` `..` is a special directory name meaning "the directory containing this one", or more succinctly, the **parent** of the current directory. -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle/shell-novice/novice/shell/ -~~~ +``` Let's go back into our test directory: -~~~bash -$ cd test_directory -~~~ +```bash +cd test_directory +``` The special directory `..` doesn't usually show up when we run `ls`. If we want to display it, we can give `ls` the `-a` flag: -~~~bash -$ ls -F -a -~~~ +```bash +ls -F -a +``` -~~~ -./ creatures/ molecules/ notes.txt solar.pdf -../ data/ north-pacific-gyre/ pizza.cfg writing/ -~~~ +```text +./ creatures/ molecules/ notes.txt solar.pdf +../ data/ north-pacific-gyre/ pizza.cfg writing/ +``` `-a` stands for "show all"; it forces `ls` to show us file and directory names that begin with `.`, @@ -452,6 +462,7 @@ It may seem redundant to have a name for it, but we'll see some uses for it soon. :::callout + ## Special Names The special names `.` and `..` don't belong to `ls`; @@ -467,21 +478,21 @@ your computer's file system, not any particular program you can run in it. 
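If you'd like to try `.` and `..` without touching the lesson's files, you can build a small throwaway tree; the `demo` names below are invented for illustration, and `mkdir`/`rm -r` are previewed here from the next episode:

```bash
mkdir -p demo/sub   # make a two-level throwaway tree
cd demo/sub
ls ..               # lists the contents of demo: the single name "sub"
ls .                # same as plain "ls"; prints nothing, since sub is empty
cd ../..            # climb two levels, back to where we started
rm -r demo          # tidy up the throwaway tree
```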
Another handy feature is that we can reference our home directory with `~`, e.g.:

-~~~bash
-$ ls ~/shell-novice
-~~~
+```bash
+ls ~/shell-novice
+```

-~~~
-AUTHORS Gemfile _config.yml _includes bin files setup.md
-CITATION LICENSE.md _episodes _layouts code index.md shell
-CODE_OF_CONDUCT.md Makefile _episodes_rmd aio.md data reference.md slides
-CONTRIBUTING.md README.md _extras assets fig requirements.txt
-~~~
+```text
+AUTHORS Gemfile _config.yml _includes bin files setup.md
+CITATION LICENSE.md _episodes _layouts code index.md shell
+CODE_OF_CONDUCT.md Makefile _episodes_rmd aio.md data reference.md slides
+CONTRIBUTING.md README.md _extras assets fig requirements.txt
+```

Which again shows us our repository directory.

Note that `~` only works if it is the first character in the
-path: `here/there/~/elsewhere` is *not* `/Users/nelle/elsewhere`.
+path: `here/there/~/elsewhere` is _not_ `/Users/nelle/elsewhere`.

## Exercises

@@ -490,10 +501,10 @@
::::challenge{id=relative-path-resolution title="Relative Path Resolution"}
If `pwd` displays `/Users/thing`, what will `ls ../backup` display?

-1. `../backup: No such file or directory`
-2. `2012-12-01 2013-01-08 2013-01-27`
-3. `2012-12-01/ 2013-01-08/ 2013-01-27/`
-4. `original pnas_final pnas_sub`
+1. `../backup: No such file or directory`
+2. `2012-12-01 2013-01-08 2013-01-27`
+3. `2012-12-01/ 2013-01-08/ 2013-01-27/`
+4. `original pnas_final pnas_sub`

:::solution
**4** is correct. `ls` shows the contents of the path you give it,
@@ -506,14 +517,14 @@ If `pwd` displays `/Users/backup`,
and `-r` tells `ls` to display things in reverse order,
what command will display:

-~~~bash
+```text
pnas-sub/ pnas-final/ original/
-~~~
+```

-1. `ls pwd`
-2. `ls -r -F`
-3. `ls -r -F /Users/backup`
-4. Either \#2 or \#3 above, but not \#1.
+1. `ls pwd`
+2. `ls -r -F`
+3. `ls -r -F /Users/backup`
+4. Either \#2 or \#3 above, but not \#1.

:::solution
**4** is correct.
The current directory (as shown by `pwd`) is `/Users/backup`, so `ls` @@ -521,4 +532,4 @@ will give the same result with or without `/Users/backup`. Then, in order to get the output in reverse order, and with a `/` after the directories, we need the `-r` and `-F` flags. ::: -:::: \ No newline at end of file +:::: diff --git a/technology_and_tooling/bash_shell/03-create.md b/technology_and_tooling/bash_shell/03-create.md index 40e46c30..225ad54e 100644 --- a/technology_and_tooling/bash_shell/03-create.md +++ b/technology_and_tooling/bash_shell/03-create.md @@ -1,17 +1,19 @@ --- name: Creating Things -dependsOn: [ - technology_and_tooling.bash_shell.02-filedir -] +dependsOn: [technology_and_tooling.bash_shell.02-filedir] tags: [bash] +learningOutcomes: + - Create a directory hierarchy that matches a given diagram. + - Create files in that hierarchy using an editor or by copying and renaming existing files. + - Delete, copy and move specified files and/or directories. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/shell-novice/ - image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/shell-novice/ + image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- We now know how to explore files and directories, @@ -19,39 +21,39 @@ but how do we create them in the first place? 
First, let's check where we are: -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle/shell-novice/shell/test_directory -~~~ +``` If you're not in this directory, use the `cd` command to navigate to it as covered in the last lesson, for example: -~~~bash -$ cd ~/shell-novice/shell/test_directory -~~~ +```bash +cd ~/shell-novice/shell/test_directory +``` ### Creating a new directory Now let's use `ls -F` to see what our test directory contains: -~~~bash -$ ls -F -~~~ +```bash +ls -F +``` -~~~ +```text creatures/ molecules/ notes.txt solar.pdf data/ north-pacific-gyre/ pizza.cfg writing/ -~~~ +``` Let's create a new directory called `thesis` using the command `mkdir thesis` (which has no output): -~~~bash -$ mkdir thesis -~~~ +```bash +mkdir thesis +``` As you might (or might not) guess from its name, `mkdir` means "make directory". @@ -59,28 +61,29 @@ Since `thesis` is a relative path (i.e., doesn't have a leading slash), the new directory is created in the current working directory: -~~~bash -$ ls -F -~~~ +```bash +ls -F +``` -~~~ +```text creatures/ north-pacific-gyre/ thesis/ data/ notes.txt writing/ Desktop/ pizza.cfg molecules/ solar.pdf -~~~ +``` However, there's nothing in it yet - this will show no output: -~~~bash -$ ls -F thesis -~~~ +```bash +ls -F thesis +``` ### Creating a new text file Now we'll create a new file using a text editor in this new directory. :::callout + ## Which Editor? When we say, "`nano` is a text editor," we really do mean "text": it can @@ -104,10 +107,10 @@ and how comfortable you are with the terminal. Let's first change our working directory to `thesis` using `cd`, and then we'll use the `Nano` editor to create a text file called `draft.txt`, and then save it in that directory. -~~~bash -$ cd thesis -$ nano draft.txt -~~~ +```bash +cd thesis +nano draft.txt +``` We add a filename after the `nano` command to tell it that we want to edit (or in this case create) a file. 
@@ -125,47 +128,48 @@ but `ls` now shows that we have created a file called `draft.txt`: Now we've saved the file, we can use `ls` to see that there is a new file in the directory called `draft.txt`: -~~~bash -$ ls -~~~ +```bash +ls +``` -~~~ +```text draft.txt -~~~ +``` We can use the shell on its own to take a look at its contents using the `cat` command (which we can use to print the contents of files): -~~~bash -$ cat draft.txt -~~~ +```bash +cat draft.txt +``` -~~~ +```text It's not "publish or perish" any more, it's "share and thrive". -~~~ +``` ### Deleting files and directories Now, let's assume we didn't actually need to create this file. We can delete it by running `rm draft.txt`: -~~~bash -$ rm draft.txt -~~~ +```bash +rm draft.txt +``` This command removes files (`rm` is short for "remove"). If we run `ls` again, its output is empty once more, which tells us that our file is gone: -~~~bash -$ ls -~~~ +```bash +ls +``` :::callout + ## Deleting Is Forever The Bash shell doesn't have a trash bin that we can recover deleted -files from. Instead, +files from. Instead, when we delete files, they are unhooked from the file system so that their storage space on disk can be recycled. Tools for finding and recovering deleted files do exist, but there's no guarantee they'll @@ -176,42 +180,42 @@ file's disk space right away. But what if we want to delete a directory, perhaps one that already contains a file? Let's re-create that file and then move up one directory using `cd ..`: -~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/nelle/shell-novice/test_directory/thesis -~~~ +``` -~~~bash -$ nano draft.txt -$ ls -~~~ +```bash +nano draft.txt +ls +``` -~~~ +```text draft.txt -~~~ +``` -~~~bash -$ cd .. -$ pwd -~~~ +```bash +cd .. 
+pwd +``` -~~~ +```text /Users/nelle/shell-novice/shell/test_directory -~~~ +``` If we try to remove the entire `thesis` directory using `rm thesis`, we get an error message: -~~~bash -$ rm thesis -~~~ +```bash +rm thesis +``` -~~~ +```text rm: cannot remove `thesis': Is a directory -~~~ +``` On a Mac, it may look a bit different (`rm: thesis: is a directory`), but means the same thing. @@ -221,38 +225,39 @@ which is short for "remove directory". It doesn't work yet either, though, because the directory we're trying to remove isn't empty (again, it may look a bit different on a Mac): -~~~bash -$ rmdir thesis -~~~ +```bash +rmdir thesis +``` -~~~ +```text rmdir: failed to remove `thesis': Directory not empty -~~~ +``` This little safety feature can save you a lot of grief, particularly if you are a bad typist. To really get rid of `thesis` we must first delete the file `draft.txt`: -~~~bash -$ rm thesis/draft.txt -~~~ +```bash +rm thesis/draft.txt +``` The directory is now empty, so `rmdir` can delete it: -~~~bash -$ rmdir thesis -~~~ +```bash +rmdir thesis +``` :::callout + ## With Great Power Comes Great Responsibility Removing the files in a directory just so that we can remove the directory quickly becomes tedious. Instead, we can use `rm` with the `-r` flag (which stands for "recursive"): -~~~bash -$ rm -r thesis -~~~ +```bash +rm -r thesis +``` This removes everything in the directory, then the directory itself. If the directory contains sub-directories, `rm -r` does the same thing to @@ -264,36 +269,36 @@ without care. Let's create that directory and file one more time. 
-~~~bash -$ pwd -~~~ +```bash +pwd +``` -~~~ +```text /Users/user/shell-novice/shell/test_directory -~~~ +``` -~~~bash -$ mkdir thesis -~~~ +```bash +mkdir thesis +``` Again, put anything you like in this file (note we're giving the `thesis` path to `nano` as well as the `draft.txt` filename, so we create it in that directory): -~~~bash -$ nano thesis/draft.txt -$ ls thesis -~~~ +```bash +nano thesis/draft.txt +ls thesis +``` -~~~ +```text draft.txt -~~~ +``` `draft.txt` isn't a particularly informative name, so let's change the file's name using `mv`, which is short for "move": -~~~bash -$ mv thesis/draft.txt thesis/quotes.txt -~~~ +```bash +mv thesis/draft.txt thesis/quotes.txt +``` The first parameter tells `mv` what we're "moving", while the second is where it's to go. @@ -303,13 +308,13 @@ which has the same effect as renaming the file. Sure enough, `ls` shows us that `thesis` now contains one file called `quotes.txt`: -~~~bash -$ ls thesis -~~~ +```bash +ls thesis +``` -~~~ +```text quotes.txt -~~~ +``` Just for the sake of inconsistency, `mv` also works on directories --- there is no separate `mvdir` command. @@ -323,28 +328,28 @@ but put the file somewhere new. In this case, the directory name we use is the special directory name `.` that we mentioned earlier. -~~~bash -$ mv thesis/quotes.txt . -~~~ +```bash +mv thesis/quotes.txt . +``` The effect is to move the file from the directory it was in to the current working directory. `ls` now shows us that `thesis` is empty: -~~~bash -$ ls thesis -~~~ +```bash +ls thesis +``` Further, `ls` with a filename or directory name as a parameter only lists that file or directory. 
We can use this to see that `quotes.txt` is still in our current directory:

-~~~bash
-$ ls quotes.txt
-~~~
+```bash
+ls quotes.txt
+```

-~~~
+```text
quotes.txt
-~~~
+```

### Copying files

@@ -354,28 +359,28 @@ We can check that it did the right thing using `ls` with two paths as parameters --- like most Unix commands, `ls` can be given thousands of paths at once:

-~~~bash
-$ cp quotes.txt thesis/quotations.txt
-$ ls quotes.txt thesis/quotations.txt
-~~~
+```bash
+cp quotes.txt thesis/quotations.txt
+ls quotes.txt thesis/quotations.txt
+```

-~~~
+```text
quotes.txt thesis/quotations.txt
-~~~
+```

To prove that we made a copy, let's delete the `quotes.txt` file in the current directory and then run that same `ls` again (we can get to this command by pressing the up arrow twice).

-~~~bash
-$ rm quotes.txt
-$ ls quotes.txt thesis/quotations.txt
-~~~
+```bash
+rm quotes.txt
+ls quotes.txt thesis/quotations.txt
+```

-~~~
+```text
ls: cannot access quotes.txt: No such file or directory
thesis/quotations.txt
-~~~
+```

This time it tells us that it can't find `quotes.txt` in the current directory, but it does find the copy in `thesis` that we didn't delete.

@@ -405,33 +410,33 @@ Both **1** and **2** will leave you with a file called `statistics.txt` at the e

::::challenge{id=moving-copying title="Moving and Copying"}
What is the output of the closing `ls` command in the sequence shown below?

-~~~bash
-$ pwd
-~~~
+```bash
+pwd
+```

-~~~
+```text
/Users/jamie/data
-~~~
+```

-~~~bash
-$ ls
-~~~
+```bash
+ls
+```

-~~~bash
+```text
proteins.dat
-~~~
+```

-~~~bash
-$ mkdir recombine
-$ mv proteins.dat recombine
-$ cp recombine/proteins.dat ../proteins-saved.dat
-$ ls
-~~~
+```bash
+mkdir recombine
+mv proteins.dat recombine
+cp recombine/proteins.dat ../proteins-saved.dat
+ls
+```

-1. `proteins-saved.dat recombine`
-2. `recombine`
-3. `proteins.dat recombine`
-4. `proteins-saved.dat`
+1. `proteins-saved.dat recombine`
+2. `recombine`
+3. `proteins.dat recombine`
+4. `proteins-saved.dat`

:::solution
The correct answer is **2**.

@@ -446,80 +451,82 @@ So as it's in the directory above the current one (`..`), it won't show up when

::::challenge{id=organising-directories-files title="Organising Directories and Files"}
Jamie is working on a project and she sees that her files aren't very well organized:

-~~~bash
-$ ls -F
-~~~
+```bash
+ls -F
+```

-~~~bash
+```text
analyzed/ fructose.dat raw/ sucrose.dat
-~~~
+```

The `fructose.dat` and `sucrose.dat` files contain output from her data analysis. What command(s) covered in this lesson does she need to run so that the commands below will produce the output shown?

-~~~bash
-$ ls -F
-~~~
+```bash
+ls -F
+```

-~~~
+```text
analyzed/ raw/
-~~~
+```

-~~~bash
-$ ls analyzed
-~~~
+```bash
+ls analyzed
+```

-~~~
+```text
fructose.dat sucrose.dat
-~~~
+```

:::solution
`ls` lists the contents of the current directory, whilst `ls analyzed` lists the contents of the `analyzed` directory. So we need to move the files `fructose.dat` and `sucrose.dat` out of the current directory, and into the `analyzed` directory, which we do with `mv`.
-
-~~~bash
-$ ls -F
-$ mv fructose.dat analyzed/
-$ mv sucrose.dat analyzed/
-$ ls analyzed
-~~~
+
+```bash
+ls -F
+mv fructose.dat analyzed/
+mv sucrose.dat analyzed/
+ls analyzed
+```
+
:::
::::

::::challenge{id=copy-multiple-filenames title="Copy with Multiple Filenames"}
What does `cp` do when given several filenames and a directory name, as in:

-~~~bash
-$ mkdir backup
-$ cp thesis/citations.txt thesis/quotations.txt backup
-~~~
+```bash
+mkdir backup
+cp thesis/citations.txt thesis/quotations.txt backup
+```

:::solution
It copies the files into the directory, keeping their original names.
-~~~bash -$ ls backup -~~~ +```bash +ls backup +``` -~~~ +```text citations.txt quotations.txt -~~~ +``` + ::: What does `cp` do when given three or more filenames, as in: -~~~bash -$ ls -F -~~~ +```bash +ls -F +``` -~~~ +```text intro.txt methods.txt survey.txt -~~~ +``` -~~~bash -$ cp intro.txt methods.txt survey.txt -~~~ +```bash +cp intro.txt methods.txt survey.txt +``` :::solution You should get an error and the command does nothing. @@ -527,11 +534,11 @@ When passing 3 or more arguments, the last one needs to be a directory. However, -~~~bash -$ cp intro.txt methods.txt -~~~ +```bash +cp intro.txt methods.txt +``` Will not fail even though both of the arguments are existing files - it will copy the contents -of `intro.txt` *over* the contents of `methods.txt`. So be careful! +of `intro.txt` _over_ the contents of `methods.txt`. So be careful! ::: -:::: \ No newline at end of file +:::: diff --git a/technology_and_tooling/bash_shell/04-pipefilter.md b/technology_and_tooling/bash_shell/04-pipefilter.md index e5925f29..14099cdd 100644 --- a/technology_and_tooling/bash_shell/04-pipefilter.md +++ b/technology_and_tooling/bash_shell/04-pipefilter.md @@ -1,24 +1,27 @@ --- name: Pipes and Filters -dependsOn: [ - technology_and_tooling.bash_shell.03-create -] +dependsOn: [technology_and_tooling.bash_shell.03-create] tags: [bash] +learningOutcomes: + - Explain the advantage of linking commands with pipes and filters. + - Combine sequences of commands to get new output. + - Redirect a command’s output to a file. + - Briefly describe how pipes channel input and output between piped commands. + - Explain what usually happens if a program or pipeline isn’t given any input to process. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". 
- url: https://github.com/Southampton-RSG-Training/shell-novice/ - image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/shell-novice/ + image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- Now that we know a few basic commands, we can finally look at the shell's most powerful feature: the ease with which it lets us combine existing programs in new ways. - ## Joining commands together using files One way we can use programs together is to have the output of one command captured @@ -27,29 +30,30 @@ in a file, and use that file as the input to another command. We'll start with a directory called `data`, which is in the `shell-novice/data` directory, one directory up from `test_directory`. i.e. from `test_directory`: -~~~bash -$ cd ../.. -$ cd data -~~~ +```bash +cd ../.. +cd data +``` Doing `ls` shows us three files in this directory: -~~~ +```text sc_climate_data.csv sc_climate_data_10.csv sc_climate_data_1000.csv -~~~ +``` The data in these files is taken from a real climate science research project that is looking into woody biomass yields. The files are as follows: -* sc_climate_data.csv: the entire 20MB data set. -* sc_climate_data_1000.csv: a subset of the entire data, but only 1000 data rows. -* sc_climate_data_10.csv: a much smaller subset with only 10 rows of data. +- sc_climate_data.csv: the entire 20MB data set. +- sc_climate_data_1000.csv: a subset of the entire data, but only 1000 data rows. +- sc_climate_data_10.csv: a much smaller subset with only 10 rows of data. We'll largely be working on the 10-row version, since this allows us to more easily reason about the data in the file and the operations we're performing on it. :::callout + ## Why not just use the entire 20MB data set? Running various commands over a 20MB data set could take some time. 
@@ -68,19 +72,19 @@ with lines in the file equating to rows. Let's run the command `wc *.csv`: -* `wc` is the "word count" command, it counts the number of lines, words, and characters in files. -* The `*` in `*.csv` matches zero or more characters, so the shell turns `*.csv` into a complete list of `.csv` files: +- `wc` is the "word count" command, it counts the number of lines, words, and characters in files. +- The `*` in `*.csv` matches zero or more characters, so the shell turns `*.csv` into a complete list of `.csv` files: -~~~bash -$ wc *.csv -~~~ +```bash +wc *.csv +``` -~~~ +```text 1048576 1048577 21005037 sc_climate_data.csv 11 12 487 sc_climate_data_10.csv 1001 1002 42301 sc_climate_data_1000.csv 1049588 1049591 21047825 total -~~~ +``` Sometimes we need to pass multiple filenames to a single command, or find or use filenames that match a given pattern, @@ -102,7 +106,7 @@ with `p`) or `preferred.p` (there isn't at least one character after the `.p`). When the shell sees a wildcard, it expands the wildcard to create a -list of matching filenames *before* running the command that was +list of matching filenames _before_ running the command that was asked for. As an exception, if a wildcard expression does not match any file, Bash will pass the expression as a parameter to the command as it is. For example typing `ls *.pdf` in the data directory @@ -115,16 +119,16 @@ themselves. It's the shell, not the other programs, that expands the wildcards. Going back to `wc`, if we run `wc -l` instead of just `wc`, the output shows only the number of lines per file: -~~~bash -$ wc -l *.csv -~~~ +```bash +wc -l *.csv +``` -~~~ +```text 1048576 sc_climate_data.csv 11 sc_climate_data_10.csv 1001 sc_climate_data_1000.csv 1049588 total -~~~ +``` We can also use `-w` to get only the number of words, or `-c` to get only the number of characters. @@ -134,9 +138,9 @@ It's an easy question to answer when there are only three files, but what if there were 6000? 
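As an aside, a safe way to check what a wildcard will expand to is to hand it to `echo`, which simply prints its arguments. A minimal sketch in a throwaway directory (the path and file names below are invented for illustration, not part of the lesson data):

```bash
# Work in a scratch directory so no real files are touched
mkdir -p /tmp/wildcard-demo
cd /tmp/wildcard-demo
touch ethane.pdb methane.pdb propane.dat

# The shell expands *.pdb to the matching names *before* echo runs
echo *.pdb
```

This prints `ethane.pdb methane.pdb`, confirming that the expansion happens in the shell, not inside `echo`.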
Our first step toward a solution is to run the command: -~~~bash -$ wc -l *.csv > lengths.txt -~~~ +```bash +wc -l *.csv > lengths.txt +``` The greater than symbol, `>`, tells the shell to **redirect** the command's output to a file instead of printing it to the screen. @@ -147,46 +151,46 @@ everything that `wc` would have printed has gone into the file `lengths.txt` ins `ls lengths.txt` confirms that the file exists: -~~~bash -$ ls lengths.txt -~~~ +```bash +ls lengths.txt +``` -~~~ +```text lengths.txt -~~~ +``` We can now send the content of `lengths.txt` to the screen using `cat lengths.txt`. `cat` is able to print the contents of files one after another. There's only one file in this case, so `cat` just shows us what it contains: -~~~bash -$ cat lengths.txt -~~~ +```bash +cat lengths.txt +``` -~~~ +```text 1048576 sc_climate_data.csv 11 sc_climate_data_10.csv 1001 sc_climate_data_1000.csv 1049588 total -~~~ +``` Now let's use the `sort` command to sort its contents. We will also use the -n flag to specify that the sort is numerical instead of alphabetical. -This does *not* change the file; +This does _not_ change the file; instead, it sends the sorted result to the screen: -~~~bash -$ sort -n lengths.txt -~~~ +```bash +sort -n lengths.txt +``` -~~~ +```text 11 sc_climate_data_10.csv 1001 sc_climate_data_1000.csv 1048576 sc_climate_data.csv 1049588 total -~~~ +``` We can put the sorted list of lines in another temporary file called `sorted-lengths.txt` by putting `> sorted-lengths.txt` after the command, @@ -194,14 +198,14 @@ just as we used `> lengths.txt` to put the output of `wc` into `lengths.txt`. 
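The same redirect-then-read pattern can be sketched end to end with throwaway data (the file paths here are made up for the example):

```bash
# Create a small unsorted file, redirect sort's output to a second file,
# then print the result with cat
printf 'beta\nalpha\ngamma\n' > /tmp/names.txt
sort /tmp/names.txt > /tmp/sorted-names.txt
cat /tmp/sorted-names.txt
```

The final `cat` shows `alpha`, `beta`, `gamma`: the sorted result went to `sorted-names.txt`, not to the screen, until we asked for it.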
Once we've done that, we can run another command called `head` to get the first few lines in `sorted-lengths.txt`: -~~~bash -$ sort -n lengths.txt > sorted-lengths.txt -$ head -1 sorted-lengths.txt -~~~ +```bash +sort -n lengths.txt > sorted-lengths.txt +head -1 sorted-lengths.txt +``` -~~~ +```text 11 sc_climate_data_10.csv -~~~ +``` Using the parameter `-1` with `head` tells it that we only want the first line of the file; @@ -216,18 +220,17 @@ even once you understand what `wc`, `sort`, and `head` do, all those intermediate files make it hard to follow what's going on. Fortunately, there's a way to make this much simpler. - ## Using pipes to join commands together We can make it easier to understand by running `sort` and `head` together: -~~~bash -$ sort -n lengths.txt | head -1 -~~~ +```bash +sort -n lengths.txt | head -1 +``` -~~~ - 11 sc_climate_data_10.csv -~~~ +```text + 11 sc_climate_data_10.csv +``` The vertical bar between the two commands is called a **pipe**. It tells the shell that we want to use @@ -241,16 +244,16 @@ we don't have to know or care. We can even use another pipe to send the output of `wc` directly to `sort`, which then sends its output to `head`: -~~~bash -$ wc -l *.csv | sort -n | head -1 -~~~ +```bash +wc -l *.csv | sort -n | head -1 +``` -~~~ +```text 11 sc_climate_data_10.csv -~~~ +``` -This is exactly like a mathematician nesting functions like *log(3x)* -and saying "the log of three times *x*". +This is exactly like a mathematician nesting functions like _log(3x)_ +and saying "the log of three times _x_". In our case, the calculation is "head of sort of line count of `*.csv`". @@ -271,10 +274,11 @@ and write to standard output. The key is that any program that reads lines of text from standard input and writes lines of text to standard output can be combined with every other program that behaves this way as well. 
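That composability can be demonstrated with nothing but standard tools and some inline data; `printf` stands in here for any program that writes lines to standard output:

```bash
# Three programs joined purely through standard input and standard output,
# with no temporary files in between
printf '3 c\n1 a\n2 b\n' | sort -n | head -2
```

This prints the two numerically smallest lines (`1 a` then `2 b`); each stage neither knows nor cares what produced its input.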
-You can *and should* write your programs this way +You can _and should_ write your programs this way so that you and other people can put those programs into pipes to multiply their power. :::callout + ## Redirecting Input As well as using `>` to redirect a program's output, we can use `<` to @@ -292,32 +296,34 @@ If you're interested in how pipes work in more technical detail, see the descrip ## Exercises ::::challenge{id=double-chevron-meaning title="What does Double Chevron Mean?"} + ## What does `>>` mean? What is the difference between: -~~~bash +```bash echo hello > testfile01.txt -~~~ +``` And: -~~~bash +```bash echo hello >> testfile02.txt -~~~ +``` Hint: Try executing each command twice in a row and then examining the output files. :::solution If there isn't a file already there with the name `testfile01.txt`, both `>` and `>>` will create one. -However, if there *is* a file, then `>` will *overwrite* the contents of the file, whilst `>>` will *append* to the existing contents. +However, if there _is_ a file, then `>` will _overwrite_ the contents of the file, whilst `>>` will _append_ to the existing contents. ::: :::: For those interested in the technical details of how pipes work: :::callout + ## What's happening 'under the hood' - pipes in more detail Here's what actually happens behind the scenes when we create a pipe. @@ -358,4 +364,4 @@ through `wc` to `sort`, and from `sort` through `head` to the screen. ![1. 
Redirects and Pipes](fig/redirects-and-pipes.png) -::: \ No newline at end of file +::: diff --git a/technology_and_tooling/bash_shell/05-script.md b/technology_and_tooling/bash_shell/05-script.md index b3907d2f..da1250c8 100644 --- a/technology_and_tooling/bash_shell/05-script.md +++ b/technology_and_tooling/bash_shell/05-script.md @@ -1,17 +1,19 @@ --- name: Shell Scripts -dependsOn: [ - technology_and_tooling.bash_shell.04-pipefilter -] +dependsOn: [technology_and_tooling.bash_shell.04-pipefilter] tags: [bash] +learningOutcomes: + - Write a shell script that runs a command or series of commands for a fixed set of files. + - Run a shell script from the command line. + - Write a shell script that operates on a set of files defined by the user on the command line. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/shell-novice/ - image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/shell-novice/ + image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- We are finally ready to see what makes the shell such a powerful programming environment. @@ -26,46 +28,47 @@ these are actually small programs. Let's start by going back to `data` and putting some commands into a new file called `middle.sh` using an editor like `nano`: -~~~bash -$ cd ~/shell-novice/data -$ nano middle.sh -~~~ +```bash +cd ~/shell-novice/data +nano middle.sh +``` So why the .sh extension to the filename? Adding `.sh` is the convention to show that this is a Bash shell script. 
Enter the following line into our new file: -~~~bash +```bash head -15 sc_climate_data_1000.csv | tail -5 -~~~ +``` Then save it and exit `nano` (using `Control-O` to save it and then `Control-X` to exit `nano`). This pipe selects lines 11-15 of the file `sc_climate_data_1000.csv`. It selects the first 15 lines of that file using `head`, then passes that to `tail` to show us only the last 5 lines - hence lines 11-15. -Remember, we are *not* running it as a command just yet: +Remember, we are _not_ running it as a command just yet: we are putting the commands in a file. Once we have saved the file, we can ask the shell to execute the commands it contains. Our shell is called `bash`, so we run the following command: -~~~bash -$ bash middle.sh -~~~ +```bash +bash middle.sh +``` -~~~ +```text 299196.8188,972890.0521,48.07,61.41,0.78 324196.8188,972890.0521,48.20,-9999.00,0.72 274196.8188,968890.0521,47.86,60.94,0.83 275196.8188,968890.0521,47.86,61.27,0.83 248196.8188,961890.0521,46.22,58.98,1.43 -~~~ +``` Sure enough, our script's output is exactly what we would get if we ran that pipeline directly. :::callout + ## Text vs. Whatever We usually call programs like Microsoft Word or LibreOffice Writer "text @@ -87,47 +90,48 @@ but that would probably take longer than just retyping the command. Instead, let's edit `middle.sh` and replace `sc_climate_data_1000.csv` with a special variable called `$1`: -~~~bash -$ nano middle.sh -~~~ +```bash +nano middle.sh +``` -~~~ +```text head -15 "$1" | tail -5 -~~~ +``` Inside a shell script, `$1` means the first filename (or other argument) passed to the script on the command line. 
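Positional parameters can be seen in isolation with a throwaway script (the path below is arbitrary, chosen just for this sketch):

```bash
# $1 and $2 are the first and second command-line arguments
echo 'echo "first: $1, second: $2"' > /tmp/show-args.sh
bash /tmp/show-args.sh draft.txt quotes.txt
```

This prints `first: draft.txt, second: quotes.txt`: whatever arguments follow the script name are slotted into `$1`, `$2`, and so on.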
We can now run our script like this: -~~~bash -$ bash middle.sh sc_climate_data_1000.csv -~~~ +```bash +bash middle.sh sc_climate_data_1000.csv +``` -~~~ +```text 299196.8188,972890.0521,48.07,61.41,0.78 324196.8188,972890.0521,48.20,-9999.00,0.72 274196.8188,968890.0521,47.86,60.94,0.83 275196.8188,968890.0521,47.86,61.27,0.83 248196.8188,961890.0521,46.22,58.98,1.43 -~~~ +``` or on a different file like this (our full data set!): -~~~bash -$ bash middle.sh sc_climate_data.csv -~~~ +```bash +bash middle.sh sc_climate_data.csv +``` -~~~ +```text 299196.8188,972890.0521,48.07,61.41,0.78 324196.8188,972890.0521,48.20,-9999.00,0.72 274196.8188,968890.0521,47.86,60.94,0.83 275196.8188,968890.0521,47.86,61.27,0.83 248196.8188,961890.0521,46.22,58.98,1.43 -~~~ +``` Note the output is the same, since our full data set contains the same first 1000 lines as `sc_climate_data_1000.csv`. :::callout + ## Double-Quotes Around Arguments We put the `$1` inside of double-quotes in case the filename happens to contain any spaces. @@ -137,9 +141,9 @@ If we left out these quotes, and `$1` expanded to a filename like `climate data.csv`, the command in the script would effectively be: -~~~bash +```bash head -15 climate data.csv | tail -5 -~~~ +``` This would call `head` on two separate files, `climate` and `data.csv`, which is probably not what we intended. @@ -151,10 +155,10 @@ which is probably not what we intended. In the `test_directory/molecules` directory, you have a shell script called `script.sh` containing the following commands: -~~~bash +```bash head $2 $1 tail -n $3 $1 -~~~ +``` The shell allows us to access arguments other than just the first. Here, we are using `$2` and `$3` to obtain and use the second and third arguments passed to the script (where arguments are separated by spaces, as with any other commands). @@ -165,9 +169,9 @@ certain machines if we don't. 
While you are in the molecules directory, you type the following command:

-~~~bash
+```bash
bash script.sh '*.pdb' -1 -1
-~~~
+```

Which of the following outputs would you expect to see?

@@ -178,27 +182,27 @@ Which of the following outputs would you expect to see?

4. An error because of the quotes around `*.pdb`

:::solution
-The answer is **2**. The quotes around the wildcard `'*.pdb'` mean it isn't expanded when we call the script - but it will get expanded *inside* the script. There, it gets expanded to match every file in the directory that ends in `*.pdb`, and effectively the script calls:
+The answer is **2**. The quotes around the wildcard `'*.pdb'` mean it isn't expanded when we call the script - but it will get expanded _inside_ the script. There, it gets expanded to match every file in the directory that ends in `.pdb`, and effectively the script calls:

-~~~bash
+```bash
head -1 *.pdb
tail -n -1 *.pdb
-~~~
+```

This prints out the first line (`head -1`) of each `.pdb` file, and then the last line of each `.pdb` file. If we'd called the script as:

-~~~bash
+```bash
bash script.sh *.pdb -1 -1
-~~~
+```

Then it wouldn't work as the wildcard would've expanded before the script started and we'd have effectively run it as:

-~~~bash
+```bash
bash script.sh cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb -1 -1
-~~~
+```

This would have caused an error, as we expect the second and third arguments to be numbers for `head` and `tail`!
::: -:::: \ No newline at end of file +:::: diff --git a/technology_and_tooling/bash_shell/06-loop.md b/technology_and_tooling/bash_shell/06-loop.md index 72110fb6..5b9b8ec7 100644 --- a/technology_and_tooling/bash_shell/06-loop.md +++ b/technology_and_tooling/bash_shell/06-loop.md @@ -1,24 +1,27 @@ --- name: Loops -dependsOn: [ - technology_and_tooling.bash_shell.05-script -] +dependsOn: [technology_and_tooling.bash_shell.05-script] tags: [bash] +learningOutcomes: + - Write a loop that applies one or more commands separately to each file in a set of files. + - Trace the values taken on by a loop variable during execution of the loop. + - Explain the difference between the name and the value of a variable. + - Re-run recently executed commands without retyping them. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/shell-novice/ - image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/shell-novice/ + image: https://southampton-rsg-training.github.io/shell-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- Wildcards and tab completion are two ways to reduce typing as well as typing mistakes. Another is to tell the shell to do something over and over again, which could save us considerable time, depending on how many times we need the shell to do this thing. -### Couldn't we just... +### Couldn't we just Suppose we have several hundred genome data files named `basilisk.dat`, `minotaur.dat`, `unicorn.dat`, and so on. In this example, @@ -27,41 +30,41 @@ but the principles can be applied to many many more files at once. 
Let's first go to the `creatures` directory (using tab completion to enter the full directory will save considerable typing here!): -~~~bash -$ cd ~/shell-novice/shell/test_directory/creatures -$ ls -~~~ +```bash +cd ~/shell-novice/shell/test_directory/creatures +ls +``` -~~~ +```text basilisk.dat minotaur.dat unicorn.dat -~~~ +``` We would like to modify these files, but also save a version of the original files and rename them as `original-basilisk.dat`, `original-minotaur.dat`, `original-unicorn.dat`. We can't use the following (don't type this, it's just for illustrative purposes): -~~~bash -$ mv *.dat original-*.dat -~~~ +```bash +mv *.dat original-*.dat +``` Because as we learnt previously, with wildcards that would expand to: -~~~bash -$ mv basilisk.dat minotaur.dat unicorn.dat original-*.dat -~~~ +```bash +mv basilisk.dat minotaur.dat unicorn.dat original-*.dat +``` This wouldn't back up our files, instead we would get an error. If on a Mac or Linux it would look like: -~~~ +```text mv: target `original-*.dat' is not a directory -~~~ +``` Or if on Windows using Git Bash, we would see: -~~~ +```text usage: mv [-f | -i | -n] [-v] source target mv [-f | -i | -n] [-v] source ... directory -~~~ +``` Even though the error is different, the cause is the same. It arises when `mv` receives more than two inputs. When this happens, it @@ -77,26 +80,26 @@ Here's a simple example that displays the first three lines of each file in turn Let's create a new shell script using `nano` called `top.sh` that uses a loop. 
-~~~bash -$ nano top.sh -~~~ +```bash +nano top.sh +``` In that file enter the following: -~~~bash +```bash for filename in basilisk.dat minotaur.dat unicorn.dat do head -3 $filename done -~~~ +``` After saving it by using `Control-O` and `Control-X`, run the script: -~~~bash -$ bash top.sh -~~~ +```bash +bash top.sh +``` -~~~ +```text COMMON NAME: basilisk CLASSIFICATION: basiliscus vulgaris UPDATED: 1745-05-02 @@ -106,7 +109,7 @@ UPDATED: 1764-09-12 COMMON NAME: unicorn CLASSIFICATION: equus monoceros UPDATED: 1738-11-24 -~~~ +``` So what's happening, and how does the loop work? @@ -118,6 +121,7 @@ the name of the thing currently being operated on is assigned to the **variable** called `filename`. :::callout + ## What is a variable? Variables are used to store information that we want to refer to later, and are a fundamental concept in @@ -151,6 +155,7 @@ the command that's actually being run is our old friend `head`, so this loop prints out the first three lines of each data file in turn. :::callout + ## Why the extra spaces? Note the use of spaces for indentation before the `head` command. @@ -162,6 +167,7 @@ such as these, code becomes much harder to read. ::: :::callout + ## Dos and don'ts of variable naming We have called the variable in this loop `filename` @@ -169,24 +175,24 @@ in order to make its purpose clearer to human readers. The shell itself doesn't care what the variable is called; if we wrote this loop as: -~~~bash +```bash for x in basilisk.dat minotaur.dat unicorn.dat do head -3 $x done -~~~ +``` or: -~~~bash +```bash for temperature in basilisk.dat minotaur.dat unicorn.dat do head -3 $temperature done -~~~ +``` it would work exactly the same way. -*Don't do this.* +_Don't do this._ Programs are only useful if people can understand them, so meaningless names like `x`, or misleading names like `temperature`, increase the odds that the program won't do what its readers think it does. @@ -199,12 +205,12 @@ run a loop over them all? 
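It can help to watch a loop variable take each value in turn; a sketch that needs no real data files, since the list is literal and the body only echoes:

```bash
# The variable is assigned each listed value in turn;
# $filename retrieves the current value on every pass
for filename in basilisk.dat minotaur.dat unicorn.dat
do
  echo "filename is now: $filename"
done
```

This prints one `filename is now: ...` line per value, tracing exactly what the variable held on each pass.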
Using what we've learnt we can solve our original problem using the following loop. In a new script called `rename.sh` enter the following: -~~~bash +```bash for filename in *.dat do mv $filename original-$filename done -~~~ +``` Note that here, we use `*.dat` to get a list of all files ending in `.dat`, which is very similar to doing `ls *.dat`. @@ -214,123 +220,127 @@ The first time, when `$filename` expands to `basilisk.dat`, the shell executes: -~~~bash +```bash mv basilisk.dat original-basilisk.dat -~~~ +``` The second time, the command is: -~~~bash +```bash mv minotaur.dat original-minotaur.dat -~~~ +``` The third time, the command is: -~~~bash +```bash mv unicorn.dat original-unicorn.dat -~~~ +``` Note that once you've run this command once, running it again has an interesting effect that we likely don't intend - the `.dat` files we end up with are: -~~~ +```text original-original-basilisk.dat original-original-unicorn.dat original-original-minotaur.dat -~~~ +``` This is because the `.dat` files picked up by `for filename in *.dat` will now match on `original-basilisk.dat`, -`original-unicorn.dat`, and `original-minotaur.dat`, and each of these files is then renamed with *yet another* +`original-unicorn.dat`, and `original-minotaur.dat`, and each of these files is then renamed with _yet another_ `original-` prefix added to it. This is another example of why you should always ensure you have a backup of files before you operate on them! :::callout + ## Measure Twice, Run Once A loop is a way to do many things at once --- or to make many mistakes at -once if it does the wrong thing. One way to check what a loop *would* do +once if it does the wrong thing. One way to check what a loop _would_ do is to echo the commands it would run instead of actually running them. 
For example, we could write our file renaming loop like this: -~~~bash +```bash for filename in *.dat do echo mv $filename original-$filename done -~~~ +``` Instead of running `mv`, this loop runs `echo`, which prints out: -~~~bash +```bash mv basilisk.dat original-basilisk.dat mv minotaur.dat original-minotaur.dat mv unicorn.dat original-unicorn.dat -~~~ +``` -*without* actually running those commands. We can then use up-arrow to +_without_ actually running those commands. We can then use up-arrow to redisplay the loop, back-arrow to get to the word `echo`, delete it, and then press "enter" to run the loop with the actual `mv` commands. This isn't foolproof, but it's a handy way to see what's going to happen when you're still learning how loops work. ::: - ## Exercises ::::challenge{id=save-to-file-1 title="Saving to a File in a Loop, Part 1"} In the same directory, what is the effect of this loop? -~~~bash +```bash for sugar in *.dat do echo $sugar cat $sugar > xylose.dat done -~~~ +``` -1. Prints `fructose.dat`, `glucose.dat`, and `sucrose.dat`, and the text from `sucrose.dat` will be saved to a file called - `xylose.dat`. -2. Prints `fructose.dat`, `glucose.dat`, and `sucrose.dat`, and the text from all three files would be - concatenated and saved to a file called `xylose.dat`. -3. Prints `fructose.dat`, `glucose.dat`, `sucrose.dat`, and - `xylose.dat`, and the text from `sucrose.dat` will be saved to a file called `xylose.dat`. -4. None of the above. +1. Prints `fructose.dat`, `glucose.dat`, and `sucrose.dat`, and the text from `sucrose.dat` will be saved to a file called + `xylose.dat`. +2. Prints `fructose.dat`, `glucose.dat`, and `sucrose.dat`, and the text from all three files would be + concatenated and saved to a file called `xylose.dat`. +3. Prints `fructose.dat`, `glucose.dat`, `sucrose.dat`, and + `xylose.dat`, and the text from `sucrose.dat` will be saved to a file called `xylose.dat`. +4. None of the above. :::solution + 1. Correct. 2.
Incorrect, since we're using the `>` redirect operator, which will overwrite any previous contents of `xylose.dat`. 3. Incorrect, since the file `xylose.dat` did not yet exist when `*.dat` was expanded. 4. Incorrect. + ::: :::: ::::challenge{id=save-to-file-2 title="Saving to a File in a Loop, Part 2"} In another directory, where `ls` returns: -~~~ +```text fructose.dat glucose.dat sucrose.dat maltose.txt -~~~ +``` What would be the output of the following loop? -~~~bash +```bash for datafile in *.dat do cat $datafile >> sugar.dat done -~~~ +``` -1. All of the text from `fructose.dat`, `glucose.dat` and `sucrose.dat` would be - concatenated and saved to a file called `sugar.dat`. -2. The text from `sucrose.dat` will be saved to a file called `sugar.dat`. -3. All of the text from `fructose.dat`, `glucose.dat`, `sucrose.dat` and `maltose.txt` - would be concatenated and saved to a file called `sugar.dat`. -4. All of the text from `fructose.dat`, `glucose.dat` and `sucrose.dat` would be printed - to the screen and saved to a file called `sugar.dat` +1. All of the text from `fructose.dat`, `glucose.dat` and `sucrose.dat` would be + concatenated and saved to a file called `sugar.dat`. +2. The text from `sucrose.dat` will be saved to a file called `sugar.dat`. +3. All of the text from `fructose.dat`, `glucose.dat`, `sucrose.dat` and `maltose.txt` + would be concatenated and saved to a file called `sugar.dat`. +4. All of the text from `fructose.dat`, `glucose.dat` and `sucrose.dat` would be printed + to the screen and saved to a file called `sugar.dat` :::solution + 1. Correct. 2. Incorrect, since we're looping through each of the other `.dat` files (`fructose.dat` and `glucose.dat`) whose contents would also be included. 3. Incorrect, since `maltose.txt` has a `.txt` extension and not a `.dat` extension, so won't match on `*.dat` and won't be included in the loop. 4.
Incorrect, since the `>>` operator redirects all output to the `sugar.dat` file, so we won't see any screen output. + ::: :::: @@ -338,33 +348,33 @@ done Suppose we want to preview the commands the following loop will execute without actually running those commands: -~~~bash +```bash for file in *.dat do analyze $file > analyzed-$file done -~~~ +``` What is the difference between the two loops below, and which one would we want to run? -~~~bash +```bash # Version 1 for file in *.dat do echo analyze $file > analyzed-$file done -~~~ +``` -~~~bash +```bash # Version 2 for file in *.dat do echo "analyze $file > analyzed-$file" done -~~~ +``` :::solution Version 2 is the one that successfully acts as a dry run. In version 1, since the `>` file redirect is not within quotes, the script will create three files `analyzed-basilisk.dat`, `analyzed-minotaur.dat`, and `analyzed-unicorn.dat`, which is not what we want. ::: -:::: \ No newline at end of file +:::: diff --git a/technology_and_tooling/bash_shell/fig/filesystem-challenge.png b/technology_and_tooling/bash_shell/fig/filesystem-challenge.png index f245f5f0..1d3571dc 100644 Binary files a/technology_and_tooling/bash_shell/fig/filesystem-challenge.png and b/technology_and_tooling/bash_shell/fig/filesystem-challenge.png differ diff --git a/technology_and_tooling/bash_shell/fig/filesystem.png b/technology_and_tooling/bash_shell/fig/filesystem.png index 01e1f570..59d0ba1d 100644 Binary files a/technology_and_tooling/bash_shell/fig/filesystem.png and b/technology_and_tooling/bash_shell/fig/filesystem.png differ diff --git a/technology_and_tooling/bash_shell/fig/home-directories.png b/technology_and_tooling/bash_shell/fig/home-directories.png index 3c62ee6f..b684a104 100644 Binary files a/technology_and_tooling/bash_shell/fig/home-directories.png and b/technology_and_tooling/bash_shell/fig/home-directories.png differ diff --git a/technology_and_tooling/best_practices/code_style_python.md
b/technology_and_tooling/best_practices/code_style_python.md index 5f3c3b50..d59031af 100644 --- a/technology_and_tooling/best_practices/code_style_python.md +++ b/technology_and_tooling/best_practices/code_style_python.md @@ -1,12 +1,11 @@ --- -name: Code Style -dependsOn: [ -] +name: Code Style +dependsOn: [] tags: [python] --- :::callout -This material was edited from the original in "Intermediate Research Software +This material was edited from the original in "Intermediate Research Software Development Skills" hosted by the Software Carpentries ::: @@ -18,9 +17,10 @@ sharing it with others, ask yourself what kind of code should you be writing and worth spending some time learning a bit about Python coding style conventions to make sure that your code is consistently formatted and readable by yourself and others. -> *"Any fool can write code that a computer can understand. Good programmers write code that humans can understand."* - [Martin Fowler](https://en.wikiquote.org/wiki/Martin_Fowler), British software engineer, author and international speaker on software development +> _"Any fool can write code that a computer can understand. Good programmers write code that humans can understand."_ - [Martin Fowler](https://en.wikiquote.org/wiki/Martin_Fowler), British software engineer, author and international speaker on software development ## Python Coding Style Guide + One of the most important things we can do to make sure our code is readable by others (and ourselves a few months down the line) is to make sure that it is descriptive, cleanly and consistently formatted and uses sensible, @@ -35,7 +35,9 @@ PEP here stands for Python Enhancement Proposals; PEPs are design documents for specifications or conventions for how to do something in Python, a description of a new feature in Python, etc. 
:::callout + ## Style consistency + One of the [key insights from Guido van Rossum](https://www.python.org/dev/peps/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds), one of the PEP8 authors, is that code is read much more often than it is @@ -50,10 +52,11 @@ As we have already covered in the [episode on PyCharm IDE](../13-ides/index.html (reserved words) and syntax errors to help us with coding. PyCharm also gives us recommendations for formatting the code - these recommendations are mostly taken from the PEP8 style guide. -A full list of style guidelines for this style +A full list of style guidelines for this style is available from the [PEP8 website](https://www.python.org/dev/peps/pep-0008/); here we highlight a few. ### Indentation + Python uses indentation to group the statements that belong to a particular block of code. Spaces are the recommended indentation method in Python code. The guideline is to use 4 spaces per indentation level - so 4 spaces on level one, 8 spaces on level two and so on. @@ -62,7 +65,9 @@ introduce an error by missing a single space character, etc.) and do not follow follow this guideline or not, be consistent and follow the style already used in the project. :::callout + ## Indentation in Python 2 vs Python 3 + Python 2 allowed code indented with a mixture of tabs and spaces. Python 3 disallows mixing the use of tabs and spaces for indentation. Whichever you choose, be consistent throughout the project. @@ -86,7 +91,7 @@ list or dictionary definitions can all take more than one line. The preferred wa using Python's implied line continuation inside delimiters such as parentheses (`()`), brackets (`[]`) and braces (`{}`), or a hanging indent. -~~~python +```python nolint # Add an extra level of indentation (extra 4 spaces) to distinguish arguments from the rest of the code that follows def long_function_name( var_one, var_two, var_three, @@ -117,19 +122,20 @@ a_long_list2 = [ # ...
79 ] -~~~ +``` More details on good and bad practices for continuation lines can be found in [PEP8 guideline on indentation](https://www.python.org/dev/peps/pep-0008/#indentation). -### Maximum Line Length +### Maximum Line Length + All lines should be up to 80 characters long; for lines containing comments or docstrings (to be covered later) the line length limit should be 73 - see [this discussion](https://stackoverflow.com/questions/15438326/python-pep-8-docstring-line-length) for reasoning behind these numbers. Some teams strongly prefer a longer line length, and seem to have settled on the length of 100. Long lines of code can be broken over multiple lines by wrapping expressions in delimiters, as mentioned above (preferred method), or using a backslash (`\`) at the end of the line to indicate line continuation (slightly less preferred method). -~~~python +```python nolint # Using delimiters ( ) to wrap a multi-line expression if (a == True and b == False): @@ -137,118 +143,136 @@ if (a == True and # Using a backslash (\) for line continuation if a == True and \ b == False: -~~~ + pass +``` ### Should a Line Break Before or After a Binary Operator? + Lines should break before binary operators so that the operators do not get scattered across different columns on the screen. In the example below, the eye does not have to do the extra work to tell which items are added and which are subtracted: -~~~python +```python nolint # PEP 8 compliant - easy to match operators with operands income = (gross_wages + taxable_interest + (dividends - qualified_dividends) - ira_deduction - student_loan_interest) -~~~ +``` ### Blank Lines + Top-level function and class definitions should be surrounded with two blank lines. Method definitions inside a class should be surrounded by a single blank line. You can use blank lines in functions, sparingly, to indicate logical sections.
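To make these blank-line rules concrete, here is a minimal, hypothetical sketch (the `fahr_to_cels` function and `TemperatureLog` class are invented for illustration, not taken from the project):

```python
# Two blank lines surround top-level definitions; one blank line separates
# methods inside a class (hypothetical example code).


def fahr_to_cels(fahr):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahr - 32) * (5 / 9)


class TemperatureLog:
    """Collect temperature readings, stored in Celsius."""

    def __init__(self):
        self.readings = []

    def record_fahr(self, fahr):
        """Store a Fahrenheit reading, converted to Celsius."""
        self.readings.append(fahr_to_cels(fahr))
```

The blank lines carry no meaning for the interpreter; they only signal to the reader where one logical unit ends and the next begins.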
### Whitespace in Expressions and Statements + Avoid extraneous whitespace in the following situations: + - immediately inside parentheses, brackets or braces - ~~~ - # PEP 8 compliant: - my_function(colour[1], {id: 2}) - # Not PEP 8 compliant: - my_function( colour[ 1 ], { id: 2 } ) - ~~~ - {: .language-python} + ```python nolint + # PEP 8 compliant: + my_function(colour[1], {id: 2}) + + # Not PEP 8 compliant: + my_function( colour[ 1 ], { id: 2 } ) + ``` - Immediately before a comma, semicolon, or colon (unless doing slicing where the colon acts like a binary operator -in which case it should should have equal amounts of whitespace on either side) - ~~~ - # PEP 8 compliant: - if x == 4: print(x, y); x, y = y, x + in which case it should have equal amounts of whitespace on either side) + + ```python nolint + # PEP 8 compliant: + if x == 4: print(x, y); x, y = y, x - # Not PEP 8 compliant: - if x == 4 : print(x , y); x , y = y, x - ~~~ - {: .language-python} + # Not PEP 8 compliant: + if x == 4 : print(x , y); x , y = y, x + ``` - Immediately before the open parenthesis that starts the argument list of a function call - ~~~ - # PEP 8 compliant: - my_function(1) - # Not PEP 8 compliant: - my_function (1) - ~~~ - {: .language-python} + ```python nolint + # PEP 8 compliant: + my_function(1) + + # Not PEP 8 compliant: + my_function (1) + ``` - Immediately before the open parenthesis that starts an indexing or slicing - ~~~ - # PEP 8 compliant: - my_dct['key'] = my_lst[id] - first_char = my_str[:, 1] - # Not PEP 8 compliant: - my_dct ['key'] = my_lst [id] - first_char = my_str [:, 1] - ~~~ - {: .language-python} + ```python nolint + # PEP 8 compliant: + my_dct['key'] = my_lst[id] + first_char = my_str[:1] + + # Not PEP 8 compliant: + my_dct ['key'] = my_lst [id] + first_char = my_str [:1] + ``` - More than one space around an assignment (or other) operator to
align it with another - ~~~ - # PEP 8 compliant: - x = 1 - y = 2 - student_loan_interest = 3 - - # Not PEP 8 compliant: - x = 1 - y = 2 - student_loan_interest = 3 - ~~~ - {: .language-python} + + ```python nolint + # PEP 8 compliant: + x = 1 + y = 2 + student_loan_interest = 3 + + # Not PEP 8 compliant: + x                     = 1 + y                     = 2 + student_loan_interest = 3 + ``` - Avoid trailing whitespace anywhere - it is not necessary and can cause errors. For example, if you use -backslash (`\`) for continuation lines and have a space after it, the continuation line will not be -interpreted correctly. + backslash (`\`) for continuation lines and have a space after it, the continuation line will not be + interpreted correctly. - Surround these binary operators with a single space on either side: assignment (=), -augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), -booleans (and, or, not). + augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), + booleans (and, or, not).
- Don't use spaces around the = sign when used to indicate a keyword argument assignment or to indicate a -default value for an unannotated function parameter - - ~~~python - # PEP 8 compliant use of spaces around = for variable assignment - axis = 'x' - angle = 90 - size = 450 - name = 'my_graph' - - # PEP 8 compliant use of no spaces around = for keyword argument assignment in a function call - my_function( - 1, - 2, - axis=axis, - angle=angle, - size=size, - name=name) - ~~~ + default value for an unannotated function parameter + + ```python nolint + # PEP 8 compliant use of spaces around = for variable assignment + axis = 'x' + angle = 90 + size = 450 + name = 'my_graph' + + # PEP 8 compliant use of no spaces around = for keyword argument assignment in a function call + my_function( + 1, + 2, + axis=axis, + angle=angle, + size=size, + name=name) + ``` ### String Quotes + In Python, single-quoted strings and double-quoted strings are the same. PEP8 does not make a recommendation for this apart from picking one rule and consistently sticking to it. When a string contains single or double quote characters, use the other one to avoid backslashes in the string as it improves readability. ### Naming Conventions + There are a lot of different naming styles in use, including: + - b (single lowercase letter) - B (single uppercase letter) - lowercase @@ -256,21 +280,23 @@ There are a lot of different naming styles in use, including: - UPPERCASE - UPPER_CASE_WITH_UNDERSCORES - CapitalisedWords (or PascalCase) (note: when using acronyms in CapitalisedWords, capitalise all the letters of the acronym, -e.g HTTPServerError) + e.g. HTTPServerError) - camelCase (differs from CapitalisedWords/PascalCase by the initial lowercase character) - Capitalised_Words_With_Underscores As with other style guide recommendations - consistency is key. Pick one and stick to it, or follow the one already established if joining a project mid-way.
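Putting several of these naming styles side by side, here is a small illustrative sketch (all names here are hypothetical, invented for the example):

```python
# snake_case for functions and variables, UPPER_CASE_WITH_UNDERSCORES for
# constants, CapitalisedWords for classes; acronyms keep all their capitals.

MAX_RETRIES = 3  # module-level constant


def count_vowels(text):
    """Return the number of vowels in the given text."""
    return sum(1 for char in text.lower() if char in "aeiou")


class HTTPServerError(Exception):
    """All letters of the acronym HTTP stay capitalised."""


vowel_count = count_vowels("unicorn")  # snake_case variable
```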
Some things to be wary of when naming things in the code: + - Avoid using the characters 'l' (lowercase letter L), 'O' (uppercase letter o), or 'I' (uppercase letter i) -as single character variable names. In some fonts, these characters are indistinguishable from the numerals -one and zero. When tempted to use 'l', use 'L' instead. + as single character variable names. In some fonts, these characters are indistinguishable from the numerals + one and zero. When tempted to use 'l', use 'L' instead. - Avoid using non-ASCII (e.g. UNICODE) characters for identifiers -- If your audience is international and English is the common language, try to use English words for identifiers and -comments whenever possible but try to avoid abbreviations/local slang as they may not be understood by everyone. Also consider -sticking with either ‘American’ or 'British' English spellings and try not to mix the two. +- If your audience is international and English is the common language, try to use English words for identifiers and + comments whenever possible but try to avoid abbreviations/local slang as they may not be understood by everyone. Also consider + sticking with either 'American' or 'British' English spellings and try not to mix the two. :::callout + ## Function, Variable, Class, Module, Package Naming - Function and variable names should be lowercase, with words separated by underscores as necessary to improve readability. @@ -284,15 +310,18 @@ is available from PEP8. ::: ### Comments + Comments allow us to provide the reader with additional information on what the code does - reading and understanding source code is slow, laborious and can lead to misinterpretation, plus it is always a good idea to keep others in mind -when writing code. A good rule of thumb is to assume that someone will *always* read your code at a later date, +when writing code.
A good rule of thumb is to assume that someone will _always_ read your code at a later date, and this includes a future version of yourself. It can be easy to forget why you did something a particular way in six months' time. Write comments as complete sentences and in English unless you are 100% sure the code will never be read by people who don't speak your language. :::callout + ## The Good, the Bad, and the Ugly Comments + As a side reading, check out the ['Putting comments in code: the good, the bad, and the ugly' blogpost](https://www.freecodecamp.org/news/code-comments-the-good-the-bad-and-the-ugly-be9cc65fbf83/). Remember - a comment should answer the ‘why’ question. Occasionally the “what” question. The “how” question should be answered by the code itself. @@ -300,23 +329,25 @@ The “how” question should be answered by the code itself. Block comments generally apply to some (or all) code that follows them, and are indented to the same level as that code. Each line of a block comment starts with a `#` and a single space (unless it is indented text inside the comment). -~~~python + +```python def fahr_to_cels(fahr): # Block comment example: convert temperature in Fahrenheit to Celsius cels = (fahr - 32) * (5 / 9) return cels -~~~ +``` An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a `#` and a single space and should be used sparingly. -~~~python + +```python def fahr_to_cels(fahr): cels = (fahr - 32) * (5 / 9) # Inline comment example: convert temperature in Fahrenheit to Celsius return cels -~~~ +``` Python doesn't have any multi-line comments, like you may have seen in other languages like C++ or Java. However, there - are ways to do it using *docstrings* as we'll see in a moment. +are ways to do it using _docstrings_ as we'll see in a moment.
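To illustrate the earlier "why, not how" advice, here is a hedged sketch contrasting a redundant comment with one that records intent (the scenario in the comment is invented):

```python
# Redundant "what" comment - it merely restates the code:
retry_limit = 3  # set retry_limit to 3

# Useful "why" comment - it records reasoning the code itself cannot express:
retry_limit = 3  # the data server occasionally drops requests, so retry a few times
```

Both lines execute identically; only the second comment will still be worth reading in six months' time.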
The reader should be able to understand a single function or method from its code and its comments, and should not have to look elsewhere in the code for clarification. The kind of things that need to be commented are: @@ -325,79 +356,79 @@ The reader should be able to understand a single function or method from its cod - The expected format of input files or database schemas However, there are some restrictions. Comments that simply restate what the code does are redundant, and comments must be - accurate and updated with the code, because an incorrect comment causes more confusion than no comment at all. +accurate and updated with the code, because an incorrect comment causes more confusion than no comment at all. ::::challenge{id=code-style title="Improve Code Style of Our Project"} Let's look at improving the coding style of our project. First create a new feature branch called `style-fixes` off our `develop` branch and switch to it (from the project root): -~~~bash -$ git checkout develop -$ git checkout -b style-fixes -~~~ +```bash +git checkout develop +git checkout -b style-fixes +``` Next look at the `inflammation-analysis.py` file in PyCharm and identify where the above guidelines have not been followed. Fix the discovered inconsistencies and commit them to the feature branch. :::solution -Modify `inflammation-analysis.py` from PyCharm, which is helpfully marking +Modify `inflammation-analysis.py` from PyCharm, which is helpfully marking inconsistencies with coding guidelines by underlining them. There are a few things to fix in `inflammation-analysis.py`, for example: -1. Line 24 in `inflammation-analysis.py` is too long and not very readable. A better - style would be to use multiple lines and hanging indent, with the closing brace `}' - aligned either with the first non-whitespace character of the last line of list or - the first character of the line that starts the multiline construct or simply moved +1.
Line 24 in `inflammation-analysis.py` is too long and not very readable. A better + style would be to use multiple lines and hanging indent, with the closing brace `}' + aligned either with the first non-whitespace character of the last line of list or + the first character of the line that starts the multiline construct or simply moved to the end of the previous line. All three acceptable modifications are shown below. - ~~~python - # Using hanging indent, with the closing '}' aligned with the first non-blank character of the previous line - view_data = { - 'average': models.daily_mean(inflammation_data), - 'max': models.daily_max(inflammation_data), - 'min': models.daily_min(inflammation_data) - } - ~~~ - - ~~~python - # Using hanging indent with the, closing '}' aligned with the start of the multiline contruct - view_data = { - 'average': models.daily_mean(inflammation_data), - 'max': models.daily_max(inflammation_data), - 'min': models.daily_min(inflammation_data) - } - ~~~ - - ~~~python - # Using hanging indent where all the lines of the multiline contruct are indented except the first one - view_data = { - 'average': models.daily_mean(inflammation_data), - 'max': models.daily_max(inflammation_data), - 'min': models.daily_min(inflammation_data)} - ~~~ - -2. Variable 'InFiles' in `inflammation-analysis.py` uses CapitalisedWords naming - convention which is recommended for class names but not variable names. By - convention, variable names should be in lowercase with optional underscores so you + ```python nolint + # Using hanging indent, with the closing '}' aligned with the first non-blank character of the previous line + view_data = { + 'average': models.daily_mean(inflammation_data), + 'max': models.daily_max(inflammation_data), + 'min': models.daily_min(inflammation_data) + } + ``` + + ```python nolint + # Using hanging indent with the closing '}' aligned with the start of the multiline construct + view_data = { + 'average': models.daily_mean(inflammation_data), + 'max': models.daily_max(inflammation_data), + 'min': models.daily_min(inflammation_data) + } + ``` + + ```python nolint + # Using hanging indent where all the lines of the multiline construct are indented except the first one + view_data = { + 'average': models.daily_mean(inflammation_data), + 'max': models.daily_max(inflammation_data), + 'min': models.daily_min(inflammation_data)} + ``` + +2. Variable 'InFiles' in `inflammation-analysis.py` uses CapitalisedWords naming + convention which is recommended for class names but not variable names. By + convention, variable names should be in lowercase with optional underscores so you should rename the variable 'InFiles' to, e.g., 'infiles' or 'in_files'. -3. There is an extra blank line on line 20 in `inflammation-analysis.py`. Normally, you - should not use blank lines in the middle of the code unless you want to separate - logical units - in which case only one blank line is used. Note how PyCharm is +3. There is an extra blank line on line 20 in `inflammation-analysis.py`. Normally, you + should not use blank lines in the middle of the code unless you want to separate + logical units - in which case only one blank line is used. Note how PyCharm is warning us by underlining the whole line. -4. Only one blank line after the end of definition of function `main` and the rest of - the code on line 30 in `inflammation-analysis.py` - should be two blank lines. +4. Only one blank line after the end of definition of function `main` and the rest of + the code on line 30 in `inflammation-analysis.py` - should be two blank lines. Note how PyCharm is warning us by underlining the whole line. Finally, let's add and commit our changes to the feature branch. We will check the status of our working directory first. -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text On branch style-fixes Changes not staged for commit: (use "git add ..." to update what will be committed) @@ -405,29 +436,32 @@ Changes not staged for commit: modified: inflammation-analysis.py no changes added to commit (use "git add" and/or "git commit -a") -~~~ +``` Git tells us we are on branch `style-fixes` and that we have unstaged and uncommitted changes to `inflammation-analysis.py`. Let's commit them to the local repository. + +```bash +git add inflammation-analysis.py +git commit -m "Code style fixes." +``` + ::: :::: - -:::challenge{id=improve-code-style-of-others title="(Optional) Improve Code Style of +:::challenge{id=improve-code-style-of-others title="(Optional) Improve Code Style of Your Other Python Projects"} -If you have another Python project, check to which extent it conforms to PEP8 coding +If you have another Python project, check to what extent it conforms to PEP8 coding style. ::: ### Documentation Strings aka Docstrings + If the first thing in a function is a string that is not assigned to a variable, that string is attached to the function as its documentation. Consider the following code implementing a function for calculating the nth Fibonacci number: -~~~python + +```python def fibonacci(n): """Calculate the nth Fibonacci number.
@@ -445,11 +479,11 @@ def fibonacci(n): return 1 return fibonacci(n - 1) + fibonacci(n - 2) -~~~ +``` Note here we are explicitly documenting our input variables, what is returned by the function, and also when the `ValueError` exception is raised. Along with a helpful description of what the function does, this information can -act as a *contract* for readers to understand what to expect in terms of behaviour when using the function, +act as a _contract_ for readers to understand what to expect in terms of behaviour when using the function, as well as how to use it. A special comment string like this is called a **docstring**. We do not need to use triple quotes when writing one, but @@ -465,26 +499,25 @@ the [Sphinx/ReadTheDocs docstring style](https://sphinx-rtd-tutorial.readthedocs for the `param`, `raises` and `returns` - other docstring formats exist as well. ## Python PEP 257 - Recommendations for Docstrings -PEP 257 is another one of Python Enhancement Proposals and this one deals with docstring + +PEP 257 is another of the Python Enhancement Proposals, and this one deals with docstring conventions to standardise how they are used. For example, on the subject of module-level docstrings, PEP 257 says: -~~~ The docstring for a module should generally list the classes, exceptions and functions (and any other objects) that are exported by the module, with a one-line summary of each. (These summaries generally give less detail than the summary line in the object's docstring.) The docstring for a package (i.e., the docstring of the package's `__init__.py` module) should also list the modules and subpackages exported by the package. -~~~ -Note that `__init__.py` file used to be a required part of a package (pre Python 3.3) +Note that the `__init__.py` file used to be a required part of a package (pre Python 3.3) where a package was typically implemented as a directory containing an `__init__.py` file which got implicitly executed when a package was imported.
So, at the beginning of a module file we can just add a docstring explaining the nature of a module. For example, if `fibonacci()` was included in a module with other functions, our module could have at the start of it: -~~~python +```python """A module for generating numerical sequences of numbers that occur in nature. Functions: @@ -493,63 +526,67 @@ Functions: ... """ ... -~~~ +``` The docstring for a function or a module is returned when calling the `help` function and passing its name - for example from the interactive Python console/terminal available -from the command line or when rendering code documentation online +from the command line or when rendering code documentation online (e.g. see [Python documentation](https://docs.python.org/3.8/library/index.html)). PyCharm also displays the docstring for a function/module in a little help popup window when using tab-completion. -~~~python +```python help(fibonacci) - ~~~ +``` ::::challenge{id=fix-docstrings title="Fix the Docstrings"} -Look into `models.py` in PyCharm and improve docstrings for functions -`daily_mean`, `daily_min`, `daily_max`. Commit those changes to feature branch +Look into `models.py` in PyCharm and improve docstrings for functions +`daily_mean`, `daily_min`, `daily_max`. Commit those changes to feature branch `style-fixes`. :::solution For example, the improved docstrings for the above functions would contain explanations for parameters and return values. -~~~python + +```python +import numpy as np def daily_mean(data): - """Calculate the daily mean of a 2D inflammation data array for each day. - - :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). - :returns: An array of mean values of measurements for each day. - """ - return np.mean(data, axis=0) -~~~ -~~~python + """Calculate the daily mean of a 2D inflammation data array for each day. 
+ + :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). + :returns: An array of mean values of measurements for each day. + """ + return np.mean(data, axis=0) +``` + +```python def daily_max(data): - """Calculate the daily maximum of a 2D inflammation data array for each day. + """Calculate the daily maximum of a 2D inflammation data array for each day. - :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). - :returns: An array of max values of measurements for each day. - """ - return np.max(data, axis=0) -~~~ + :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). + :returns: An array of max values of measurements for each day. + """ + return np.max(data, axis=0) +``` -~~~python +```python def daily_min(data): - """Calculate the daily minimum of a 2D inflammation data array for each day. + """Calculate the daily minimum of a 2D inflammation data array for each day. - :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). - :returns: An array of minimum values of measurements for each day. - """ - return np.min(data, axis=0) -~~ + :param data: A 2D data array with inflammation data (each row contains measurements for a single patient across all days). + :returns: An array of minimum values of measurements for each day. + """ + return np.min(data, axis=0) +``` -Once we are happy with modifications, as usual before staging and commit our changes, +Once we are happy with modifications, as usual before staging and committing our changes, we check the status of our working directory: -~~~bash -$ git status -~~~ -~~~ +```bash +git status +``` + +```text On branch style-fixes Changes not staged for commit: (use "git add ..."
to update what will be committed) @@ -557,20 +594,23 @@ Changes not staged for commit: modified: inflammation/models.py no changes added to commit (use "git add" and/or "git commit -a")
-~~~
+```

As expected, Git tells us we are on branch `style-fixes` and that we have unstaged and uncommitted changes to `inflammation/models.py`. Let's commit them to the local repository.

-~~~bash
-$ git add inflammation/models.py
-$ git commit -m "Docstring improvements."
-~~~
+
+```bash
+git add inflammation/models.py
+git commit -m "Docstring improvements."
+```
+
:::
::::

In the previous exercises, we made some code improvements on feature branch `style-fixes`. We have committed our changes locally but have not pushed this branch remotely for others to have a look at our code before we merge it onto the `develop` branch.
Let's do that now, namely:
+
- push `style-fixes` to GitHub
- merge `style-fixes` into `develop` (once we are happy with the changes)
- push updates to `develop` branch to GitHub (to keep it up to date with the latest developments)
@@ -578,18 +618,21 @@ onto the `develop` branch.
Let's do that now, namely: Here is a set of commands that will achieve the above set of actions (remember to use `git status` often in between other Git commands to double check which branch you are on and its status):

-~~~bash
-$ git push -u origin style-fixes
-$ git checkout develop
-$ git merge style-fixes
-$ git push origin develop
-$ git checkout main
-$ git merge develop
-$ git push origin main
-~~~
+
+```bash
+git push -u origin style-fixes
+git checkout develop
+git merge style-fixes
+git push origin develop
+git checkout main
+git merge develop
+git push origin main
+```

:::callout
+
## Typical Code Development Cycle
+
What you've done in the exercises in this episode mimics a typical software development workflow - you work locally on code on a feature branch, test it to make sure it works correctly and as expected, then record your changes using version
diff --git a/technology_and_tooling/best_practices/index.md b/technology_and_tooling/best_practices/index.md
index 157e5673..2244d3ea 100644
--- a/technology_and_tooling/best_practices/index.md
+++ b/technology_and_tooling/best_practices/index.md
@@ -1,16 +1,8 @@
--- name: Best Practices id: best_practices
-dependsOn: [ - technology_and_tooling.ide, - software_architecture_and_design.procedural -]
-files: [ - code_style_python.md, - linters_python.md -]
+dependsOn: [technology_and_tooling.ide, software_architecture_and_design.procedural]
+files: [code_style_python.md, linters_python.md]
summary: | This course covers how to style your Python code, and use linters to enforce a consistent style and highlight any code that can lead to commonly encountered bugs or problems.
- --- -
diff --git a/technology_and_tooling/best_practices/linters_python.md b/technology_and_tooling/best_practices/linters_python.md
index 85d474b9..f9d16179 100644
--- a/technology_and_tooling/best_practices/linters_python.md
+++ b/technology_and_tooling/best_practices/linters_python.md
@@ -1,50 +1,48 @@
--- name: Linters
-dependsOn: [ - technology_and_tooling.best_practices.code_style_python -]
+dependsOn: [technology_and_tooling.best_practices.code_style_python]
tags: [python]
---

:::callout
-This material was edited from the original in "Intermediate Research Software
+This material was edited from the original in "Intermediate Research Software
Development Skills" hosted by the Software Carpentries.
:::

## Verifying Code Style Using Linters

-We've seen how we can use PyCharm to help us format our Python code in a consistent style.
-This aids reusability, since consistent-looking code is easier to modify since it's easier to read and understand
-if it's consistent. We can also use tools to identify consistency issues in a report-style too,
-using [**code linters**](https://en.wikipedia.org/wiki/Lint_%28software%29).
-Linters analyse source code to identify and report on stylistic and even programming errors. Let's look at a very well
+We've seen how we can use PyCharm to help us format our Python code in a consistent style.
+This aids reusability, since consistent-looking code is easier to read and understand, and therefore easier to modify.
+We can also use tools that identify consistency issues and report on them,
+namely [**code linters**](https://en.wikipedia.org/wiki/Lint_%28software%29).
+Linters analyse source code to identify and report on stylistic and even programming errors. Let's look at a very well
used one of these called `pylint`.

First, let's ensure we are on the `style-fixes` branch once again.
-~~~bash -$ git checkout style-fixes -~~~ +```bash +git checkout style-fixes +``` Pylint is just a Python package so we can install it in our virtual environment using: -~~~bash -$ pip3 install pylint -$ pylint --version -~~~ +```bash +pip3 install pylint +pylint --version +``` We should see the version of Pylint, something like: -~~~ +```text pylint 2.13.3 ... -~~~ +``` We should also update our `requirements.txt` with this new addition: -~~~bash -$ pip3 freeze > requirements.txt -~~~ +```bash +pip3 freeze > requirements.txt +``` Pylint is a command-line tool that can help our code in many ways: @@ -56,26 +54,26 @@ Pylint is a command-line tool that can help our code in many ways: Pylint can also identify **code smells**. :::callout + ## How Does Code Smell? -There are many ways that code can exhibit bad design whilst not breaking any rules and working correctly. A *code smell* is a characteristic that indicates that there is an underlying problem with source code, e.g. large classes or methods, methods with too many parameters, duplicated statements in both if and else blocks of conditionals, etc. They aren't functional errors in the code, but rather are certain structures that violate principles of good design and impact design quality. They can also indicate that code is in need of maintenance and refactoring. +There are many ways that code can exhibit bad design whilst not breaking any rules and working correctly. A _code smell_ is a characteristic that indicates that there is an underlying problem with source code, e.g. large classes or methods, methods with too many parameters, duplicated statements in both if and else blocks of conditionals, etc. They aren't functional errors in the code, but rather are certain structures that violate principles of good design and impact design quality. They can also indicate that code is in need of maintenance and refactoring. 
The phrase has its origins in Chapter 3 "Bad smells in code" by Kent Beck and Martin Fowler in [Fowler, Martin (1999). Refactoring. Improving the Design of Existing Code. Addison-Wesley. ISBN 0-201-48567-2](https://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672/). ::: - -Pylint recommendations are given as warnings or errors, and Pylint also scores the code with an overall mark. -We can look at a specific file (e.g. `inflammation-analysis.py`), or a module -(e.g. `inflammation`). Let's look at our `inflammation` module and code inside it (namely `models.py` and `views.py`). +Pylint recommendations are given as warnings or errors, and Pylint also scores the code with an overall mark. +We can look at a specific file (e.g. `inflammation-analysis.py`), or a module +(e.g. `inflammation`). Let's look at our `inflammation` module and code inside it (namely `models.py` and `views.py`). From the project root do: -~~~bash -$ pylint inflammation -~~~ +```bash +pylint inflammation +``` You should see an output similar to the following: -~~~ +```text ************* Module inflammation.models inflammation/models.py:5:82: C0303: Trailing whitespace (trailing-whitespace) inflammation/models.py:6:66: C0303: Trailing whitespace (trailing-whitespace) @@ -85,22 +83,22 @@ inflammation/views.py:4:0: W0611: Unused numpy imported as np (unused-import) ------------------------------------------------------------------ Your code has been rated at 8.00/10 (previous run: 8.00/10, +0.00) -~~~ +``` -Your own outputs of the above commands may vary depending on how you have implemented and fixed the code in -previous exercises and the coding style you have used. +Your own outputs of the above commands may vary depending on how you have implemented and fixed the code in +previous exercises and the coding style you have used. -The five digit codes, such as `C0303`, are unique identifiers for warnings, with the first character indicating -the type of warning. 
There are five different types of warnings that Pylint looks for, and you can get a summary of
+The five-character codes, such as `C0303`, are unique identifiers for warnings, with the first character indicating
+the type of warning. There are five different types of warnings that Pylint looks for, and you can get a summary of
them by doing:

-~~~bash
-$ pylint --long-help
-~~~
+```bash
+pylint --long-help
+```

Near the end you'll see:

-~~~
+```text
Output:
Using the default text output, the message format is :
MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE
@@ -111,53 +109,57 @@ Near the end you'll see:
* (E) error, for probable bugs in the code
* (F) fatal, if an error occurred which prevented pylint from doing further processing.
-~~~
+```

-So for an example of a Pylint Python-specific `warning`, see the "W0611: Unused numpy imported
+So for an example of a Pylint Python-specific `warning`, see the "W0611: Unused numpy imported
as np (unused-import)" warning.

-It is important to note that while tools such as Pylint are great at giving you a starting point to consider how to
+It is important to note that while tools such as Pylint are great at giving you a starting point to consider how to
improve your code, they won't find everything that may be wrong with it.

:::callout
+
## How Does Pylint Calculate the Score?

The Python formula used is (with the variables representing numbers of each type of infraction and `statement` indicating the total number of statements):

-~~~python
+```python nolint
10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)
-~~~
+```
+
:::

-For example, with a total of 31 statements of models.py and views.py, with a count of the errors shown above, we get
+For example, with a total of 31 statements in models.py and views.py and the infraction counts shown above, we get
a score of 8.00. Note that whilst there is a maximum score of 10, given the formula, there is no minimum score - it's
quite possible to get a negative score!
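To make the scoring arithmetic concrete, here is a small Python sketch of the formula above. The infraction counts below are hypothetical, chosen for round numbers rather than taken from the exact run shown earlier:

```python
def pylint_score(error, warning, refactor, convention, statement):
    """Pylint's default score formula, as given in the callout above."""
    return 10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)

# Hypothetical counts: one warning and five convention messages across 30 statements.
print(pylint_score(error=0, warning=1, refactor=0, convention=5, statement=30))  # 8.0

# Errors are weighted five times more heavily, so the score can easily go negative.
print(pylint_score(error=2, warning=0, refactor=0, convention=0, statement=1))  # -90.0
```

Because `statement` divides the penalty, the same number of infractions hurts a small module far more than a large one.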
-::::challenge{id=further-improve-code-style title="Further Improve Code Style of Our
+::::challenge{id=further-improve-code-style title="Further Improve Code Style of Our
Project"}

-Select and fix a few of the issues with our code that Pylint detected. Make sure you do not break the rest of the
-code in the process and that the code still runs. After making any changes, run Pylint again to verify you've
+Select and fix a few of the issues with our code that Pylint detected. Make sure you do not break the rest of the
+code in the process and that the code still runs. After making any changes, run Pylint again to verify you've
resolved these issues.

-Make sure you commit and push `requirements.txt` and any file with further code style improvements you did and
+Make sure you commit and push `requirements.txt` and any files with further code style improvements you made, and
merge onto your development and main branches.

-~~~bash
-$ git add requirements.txt
-$ git commit -m "Added Pylint library"
-$ git push origin style-fixes
-$ git checkout develop
-$ git merge style-fixes
-$ git push origin develop
-$ git checkout main
-$ git merge develop
-$ git push origin main
-~~~
+
+```bash
+git add requirements.txt
+git commit -m "Added Pylint library"
+git push origin style-fixes
+git checkout develop
+git merge style-fixes
+git push origin develop
+git checkout main
+git merge develop
+git push origin main
+```
+
::::

-::::challenge{id=improve-code-style-other title="Improve Code Style of Your Other
+::::challenge{id=improve-code-style-other title="Improve Code Style of Your Other
Python Projects"}

If you have a Python project you are working on or you worked on in the past, run it past Pylint to see what issues with your code are detected, if any.
::::

-It is possible to automate these kind of code checks with GitHub's Continuous Integration service GitHub Actions -
+It is possible to automate these kinds of code checks with GitHub's Continuous Integration service GitHub Actions -
we will come back to automated linting in the episode on ["Diagnosing Issues and Improving Robustness"](../24-diagnosing-issues-improving-robustness/index.html).
diff --git a/technology_and_tooling/docker/advanced-containers.md b/technology_and_tooling/docker/advanced-containers.md
index 59d8bdd9..28e1edd6 100644
--- a/technology_and_tooling/docker/advanced-containers.md
+++ b/technology_and_tooling/docker/advanced-containers.md
@@ -2,26 +2,24 @@
name: "More Complex Containers" teaching: 30 exercises: 30
-dependsOn: [ - technology_and_tooling.docker.creating-container-images -]
+dependsOn: [technology_and_tooling.docker.creating-container-images]
tags: [docker]
-attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0
+attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0
---

In order to create and use your own container images, you may need more information than
-our previous example.
You may want to use files from outside the container, -that are not included within the container image, either by copying the files -into the container image, or by making them visible within a running container from their -existing location on your host system. You may also want to learn a little bit -about how to install software within a running container or a container image. -This episode will look at these advanced aspects of running a container or building +our previous example. You may want to use files from outside the container, +that are not included within the container image, either by copying the files +into the container image, or by making them visible within a running container from their +existing location on your host system. You may also want to learn a little bit +about how to install software within a running container or a container image. +This episode will look at these advanced aspects of running a container or building a container image. Note that the examples will get gradually more and more complex -- most day-to-day use of containers and container images can be accomplished using the first 1--2 sections on this page. @@ -31,10 +29,10 @@ using the first 1--2 sections on this page. In your shell, change to the `sum` folder in the `docker-intro` folder and look at the files inside. -~~~bash -$ cd ~/Desktop/docker-intro/sum -$ ls -~~~ +```bash +cd ~/Desktop/docker-intro/sum +ls +``` This folder has both a `Dockerfile` and a Python script called `sum.py`. Let's say we wanted to try running the script using a container based on our recently created `alpine-python` @@ -47,13 +45,13 @@ What command would we use to run Python from the `alpine-python` container? If we try running the container and Python script, what happens? 
-~~~bash -$ docker container run alice/alpine-python python3 sum.py -~~~ +```bash +docker container run alice/alpine-python python3 sum.py +``` -~~~ +```text python3: can't open file 'sum.py': [Errno 2] No such file or directory -~~~ +``` :::challenge{id=no-such-file title="No such file or directory"} @@ -82,6 +80,7 @@ _visible_ within the container that is about to be started, and inside this cont directory `/temp` -- the target. :::callout + ## Types of mounts You will notice that we set the mount `type=bind`, there are other types of mount that @@ -93,48 +92,49 @@ topic. You can find more information on the different mount types in Let's try running the command now: -~~~bash -$ docker container run --mount type=bind,source=${PWD},target=/temp alice/alpine-python python3 sum.py -~~~ +```bash +docker container run --mount type=bind,source=${PWD},target=/temp alice/alpine-python python3 sum.py +``` But we get the same error! -~~~ +```text python3: can't open file 'sum.py': [Errno 2] No such file or directory -~~~ +``` This final piece is a bit tricky -- we really have to remember to put ourselves inside the container. Where is the `sum.py` file? It's in the directory that's been mapped to `/temp` -- so we need to include that in the path to the script. This command should give us what we need: -~~~bash -$ docker container run --mount type=bind,source=${PWD},target=/temp alice/alpine-python python3 /temp/sum.py -~~~ +```bash +docker container run --mount type=bind,source=${PWD},target=/temp alice/alpine-python python3 /temp/sum.py +``` Note that if we create any files in the `/temp` directory while the container is running, these files will appear on our host filesystem in the original directory and will stay there even when the container stops. :::callout + ## Other Commonly Used Docker Run Flags Docker run has many other useful flags to alter its function. A couple that are commonly used include `-w` and `-u`. 
-The `--workdir`/`-w` flag sets the working directory a.k.a. runs the command
+The `--workdir`/`-w` flag sets the working directory; that is, it runs the command
being executed inside the directory specified. For example, the following code would run the `pwd` command in a container started from the latest ubuntu image in the `/home/alice` directory and print
-`/home/alice`. If the directory doesn't exist in the image it will create it.
+`/home/alice`. If the directory doesn't exist in the image, it will create it.

-~~~
+```text
docker container run -w /home/alice/ ubuntu pwd
-~~~
+```

The `--user`/`-u` flag lets you specify the username you would like to run the
-container as. This is helpful if you'd like to write files to a mounted folder
-and not write them as `root` but rather your own user identity and group.
+container as. This is helpful if you'd like to write files to a mounted folder
+and not write them as `root` but rather as your own user identity and group.
A common example of the `-u` flag is `--user $(id -u):$(id -g)` which will fetch the current user's ID and group and run the container as that user.
:::
@@ -164,7 +164,7 @@ Here's a breakdown of each piece of the command above

- `docker container run`: use Docker to run a container
- `--mount type=bind,source=${PWD},target=/temp`: connect my current working directory (`${PWD}`) as a folder
-inside the container called `/temp`
+ inside the container called `/temp`
- `alice/alpine-python`: name of the container image to use to run the container
- `python3 /temp/sum.py`: what commands to run in the container

@@ -182,24 +182,25 @@ Can you find the folder that's connected to your host computer? What's inside?
The docker command to run the container interactively is:

-~~~bash
-$ docker container run --mount type=bind,source=${PWD},target=/temp -it alice/alpine-python sh
-~~~
+```bash
+docker container run --mount type=bind,source=${PWD},target=/temp -it alice/alpine-python sh
+```

Once inside, you should be able to navigate to the `/temp` folder and see that its contents are the same as the files on your host computer:

-~~~bash
+```bash
/# cd /temp
/# ls
-~~~
+```
+
:::
::::

Mounting a directory can be very useful when you want to run the software inside
-your container on many different input files. In other situations, you may want
+your container on many different input files. In other situations, you may want
to save or archive an authoritative version of your data by adding it to the
-container image permanently. That's what we will cover next.
+container image permanently. That's what we will cover next.

## Including your scripts and data within a container image

@@ -211,50 +212,50 @@ image itself.

In your shell, you should still be in the `sum` folder in the `docker-intro` folder.

-~~~bash
-$ pwd
-~~~
+```bash
+pwd
+```

-~~~bash
-$ /Users/yourname/Desktop/docker-intro/sum
-~~~
+```text
+/Users/yourname/Desktop/docker-intro/sum
+```

Let's add a new line to the `Dockerfile` we've been using so far to create a copy of `sum.py`. We can do so by using the `COPY` keyword.

-~~~dockerfile
+```dockerfile
COPY sum.py /home
-~~~
+```

This line will cause Docker to copy the file from your computer into the container's filesystem. Let's build the container image like before, but give it a different name:

-~~~bash
-$ docker image build -t alice/alpine-sum .
-~~~
+```bash
+docker image build -t alice/alpine-sum .
+```

:::callout
+
## The Importance of Command Order in a Dockerfile

When you run `docker build`, it executes the build in the order specified in the `Dockerfile`.
-This order is important for rebuilding and you typically will want to put your `RUN`
+This order is important for rebuilding, and you typically will want to put your `RUN`
commands before your `COPY` commands.

Docker builds the layers of commands in order. This becomes important when you need to rebuild container images.
-If you change layers later in the `Dockerfile` and rebuild the container image, Docker doesn't need to
+If you change layers later in the `Dockerfile` and rebuild the container image, Docker doesn't need to
rebuild the earlier layers but will instead use a stored (called "cached") version of those layers.

-For example, in an instance where you wanted to copy `multiply.py` into the container
+For example, suppose you wanted to copy `multiply.py` into the container
image instead of `sum.py`.
-If the `COPY` line came before the `RUN` line, it would need to rebuild the whole image.
+If the `COPY` line came before the `RUN` line, it would need to rebuild the whole image.
If the `COPY` line came second then it would use the cached `RUN` layer from the previous build and then only rebuild the `COPY` layer.
:::

-
::::challenge{id=did-it-work title="Did it work?"}

Can you remember how to run a container interactively? Try that with this one.

@@ -263,15 +264,17 @@ Once inside, try running the Python script.

:::solution
You can start the container interactively like so:
-~~~bash
-$ docker container run -it alice/alpine-sum sh
-~~~
+
+```bash
+docker container run -it alice/alpine-sum sh
+```

You should be able to run the python command inside the container like this:

-~~~bash
+```bash
/# python3 /home/sum.py
-~~~
+```
+
:::
::::

@@ -284,13 +287,16 @@ run `docker image ls` you'll see the size of each container image all the way on the screen. The bigger your container image becomes, the harder it will be to easily download.
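The layer-caching behaviour described in the "Importance of Command Order" callout above can be sketched in a short, hypothetical `Dockerfile` (package names match the `alpine-python` examples used in this lesson). With this ordering, editing `sum.py` only invalidates the final layer on rebuild:

```dockerfile
FROM alpine

# Layer 1: slow package installation; reused from the build cache
# on every rebuild, as long as this line does not change.
RUN apk add --update python3 py3-pip python3-dev

# Layer 2: fast file copy; only this layer (and any later ones)
# is rebuilt when sum.py changes.
COPY sum.py /home
```

If the `COPY` came first, every edit to `sum.py` would also force the slow `RUN` layer to be rebuilt.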
:::callout
+
## Security warning
+
Login credentials, including passwords, tokens, secure access tokens, or other secrets, must never be stored in a container. If secrets are stored, they are at high risk of being found and exploited when made public.
:::

:::callout
+
## Copying alternatives

Another trick for getting your own files into a container image is by using the `RUN`
keyword and downloading the files from the internet. For example, if your code is in a GitHub repository, you could include this statement in your Dockerfile to download the latest version every time you build the container image:

-~~~
+```text
RUN git clone https://github.com/alice/mycode
-~~~
+```

Similarly, the `wget` command can be used to download any file publicly available on the internet:

-~~~
+```text
RUN wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.10.0/ncbi-blast-2.10.0+-x64-linux.tar.gz
-~~~
+```

-Note that the above `RUN` examples depend on commands (`git` and `wget` respectively) that
-must be available within your container: Linux distributions such as Alpine may require you to
+Note that the above `RUN` examples depend on commands (`git` and `wget` respectively) that
+must be available within your container: Linux distributions such as Alpine may require you to
install such commands before using them within `RUN` statements.
:::

@@ -321,47 +327,47 @@ Here are some ideas:

### Make the `sum.py` script run automatically

-~~~dockerfile
+```dockerfile
FROM alpine
RUN apk add --update python3 py3-pip python3-dev
COPY sum.py /home

# Run the sum.py script as the default command
CMD ["python3", "/home/sum.py"]
-~~~
+```

Build and test it:

-~~~bash
-$ docker image build -t alpine-sum:v1 .
-$ docker container run alpine-sum:v1
-~~~
+```bash
+docker image build -t alpine-sum:v1 .
+docker container run alpine-sum:v1
+```

You'll notice that you can run the container without arguments just fine, resulting in `sum = 0`, but this is boring.
Supplying arguments, however, doesn't work:

-~~~bash
+```bash
docker container run alpine-sum:v1 10 11 12
-~~~
+```

results in

-~~~
+```text
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"10\": executable file not found in $PATH": unknown.
-~~~
+```

This is because the arguments `10 11 12` are interpreted as a
-*command* that replaces the default command given by `CMD
+_command_ that replaces the default command given by `CMD
["python3", "/home/sum.py"]` in the image.

-To achieve the goal of having a command that *always* runs when a
-container is run from the container image *and* can be passed the arguments given on the
+To achieve the goal of having a command that _always_ runs when a
+container is run from the container image _and_ can be passed the arguments given on the
command line, use the keyword `ENTRYPOINT` in the `Dockerfile`.

-~~~dockerfile
+```dockerfile
FROM alpine

COPY sum.py /home
@@ -374,45 +380,48 @@ ENTRYPOINT ["python3", "/home/sum.py"]

# Give default arguments, in case none are supplied on
# the command-line
CMD ["10", "11"]
-~~~
+```

Build and test it:

-~~~bash
+```bash
docker image build -t alpine-sum:v2 .
# Most of the time you are interested in the sum of 10 and 11:
docker container run alpine-sum:v2
# Sometimes you have more challenging calculations to do:
docker container run alpine-sum:v2 12 13 14
-~~~
+```

:::callout
+
## Overriding the ENTRYPOINT
+
Sometimes you don't want to run the image's `ENTRYPOINT`.
For example if you have a specialized container image that does only sums, but you need an interactive shell to examine the container: -~~~bash -$ docker container run -it alpine-sum:v2 /bin/sh -~~~ +```bash +docker container run -it alpine-sum:v2 /bin/sh +``` will yield -~~~ +```text Please supply integer arguments -~~~ +``` You need to override the `ENTRYPOINT` statement in the container image like so: -~~~bash -$ docker container run -it --entrypoint /bin/sh alpine-sum:v2 -~~~ +```bash +docker container run -it --entrypoint /bin/sh alpine-sum:v2 +``` + ::: -### Add the `sum.py` script to the `PATH` so you can run it directly: +### Add the `sum.py` script to the `PATH` so you can run it directly -~~~dockerfile +```dockerfile FROM alpine RUN apk add --update python3 py3-pip python3-dev @@ -422,26 +431,28 @@ COPY sum.py /home RUN chmod +x /home/sum.py # add /home folder to the PATH ENV PATH /home:$PATH -~~~ +``` Build and test it: -~~~bash -$ docker image build -t alpine-sum:v3 . -$ docker container run alpine-sum:v3 sum.py 1 2 3 4 -~~~ +```bash +docker image build -t alpine-sum:v3 . +docker container run alpine-sum:v3 sum.py 1 2 3 4 +``` :::callout + ## Best practices for writing Dockerfiles -Take a look at Nüst et al.'s "[_Ten simple rules for writing Dockerfiles for reproducible data science_](https://doi.org/10.1371/journal.pcbi.1008316)" \[1\] -for some great examples of best practices to use when writing Dockerfiles. -The [GitHub repository](https://github.com/nuest/ten-simple-rules-dockerfiles) associated with the paper also has a set of [example `Dockerfile`s](https://github.com/nuest/ten-simple-rules-dockerfiles/tree/master/examples) + +Take a look at Nüst et al.'s "[_Ten simple rules for writing Dockerfiles for reproducible data science_](https://doi.org/10.1371/journal.pcbi.1008316)" \[1\] +for some great examples of best practices to use when writing Dockerfiles. 
+The [GitHub repository](https://github.com/nuest/ten-simple-rules-dockerfiles) associated with the paper also has a set of [example `Dockerfile`s](https://github.com/nuest/ten-simple-rules-dockerfiles/tree/master/examples) demonstrating how the rules highlighted by the paper can be applied. -[1] Nüst D, Sochat V, Marwick B, Eglen SJ, Head T, et al. (2020) Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology 16(11): e1008316. https://doi.org/10.1371/journal.pcbi.1008316 +[1] Nüst D, Sochat V, Marwick B, Eglen SJ, Head T, et al. (2020) Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology 16(11): e1008316. ::: -## Key points: +## Key points - Docker allows containers to read and write files from the Docker host. -- You can include files from your Docker host into your Docker container images by using the COPY instruction in your Dockerfile. \ No newline at end of file +- You can include files from your Docker host into your Docker container images by using the COPY instruction in your Dockerfile. diff --git a/technology_and_tooling/docker/creating-container-images.md b/technology_and_tooling/docker/creating-container-images.md index cfd49974..29d36ebf 100644 --- a/technology_and_tooling/docker/creating-container-images.md +++ b/technology_and_tooling/docker/creating-container-images.md @@ -2,20 +2,19 @@ name: "Creating Containers" teaching: 20 exercises: 15 -dependsOn: [ - technology_and_tooling.docker.docker-hub -] +dependsOn: [technology_and_tooling.docker.docker-hub] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. 
- url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0
+attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0
---

There are lots of reasons why you might want to create your **own** Docker container image.
+
- You can't find a container image with all the tools you need on Docker Hub.
- You want to have a container image to "archive" all the specific software versions you ran for a project.
- You want to share your workflow with someone else.

@@ -25,39 +24,39 @@ There are lots of reasons why you might want to create your **own** Docker conta

Before creating a reproducible installation, let's experiment with installing software inside a container. Start a container from the `alpine` container image we used before, interactively:

-~~~bash
-$ docker container run -it alpine sh
-~~~
+```bash
+docker container run -it alpine sh
+```

Because this is a basic container, there are a lot of things not installed -- for example, `python3`.

-~~~bash
+```bash
/# python3
-~~~
+```

-~~~
+```text
sh: python3: not found
-~~~
+```

Inside the container, we can run commands to install Python 3. The Alpine version of Linux has an installation tool called `apk` that we can use to install Python 3.
-~~~bash +```bash /# apk add --update python3 py3-pip python3-dev -~~~ +``` We can test our installation by running a Python command: -~~~bash +```bash /# python3 --version -~~~ +``` Once Python is installed, we can add Python packages using the pip package installer: -~~~bash -/# pip install cython -~~~ +```bash +/# pip install pandas +``` ::::challenge{id=searching-for-help title="Searching for Help"} @@ -67,9 +66,9 @@ Can you find instructions for installing R on Alpine Linux? Do they work? A quick search should hopefully show that the way to install R on Alpine Linux is: -~~~bash +```bash /# apk add R -~~~ +``` ::: :::: @@ -82,9 +81,9 @@ known as a `Dockerfile`. If you haven't already, exit out of the interactively running container. -~~~bash +```bash /# exit -~~~ +``` ## Put installation instructions in a `Dockerfile` @@ -94,44 +93,46 @@ can be used to create a new container image. From your shell, go to the folder you downloaded at the start of the lesson and print out the Dockerfile inside: -~~~bash -$ cd ~/Desktop/docker-intro/basic -$ cat Dockerfile -~~~ +```bash +cd ~/Desktop/docker-intro/basic +cat Dockerfile +``` -~~~ +```text FROM RUN RUN CMD -~~~ +``` Let's break this file down: -- The first line, `FROM`, indicates which container image we're starting with. It is the "base" container image we are going to start from. +- The first line, `FROM`, indicates which container image we're starting with. It is the "base" container image we are going to start from. - The next two lines `RUN`, will indicate installation commands we want to run. These -are the same commands that we used interactively above. + are the same commands that we used interactively above. - The last line, `CMD`, indicates the default command we want a -container based on this container image to run, if no other command is provided. 
It is recommended -to provide `CMD` in *exec-form* (see the -[`CMD` section](https://docs.docker.com/engine/reference/builder/#cmd) -of the Dockerfile documentation for more details). It is written as a -list which contains the executable to run as its first element, -optionally followed by any arguments as subsequent elements. The list -is enclosed in square brackets (`[]`) and its elements are -double-quoted (`"`) strings which are separated by commas. For -example, `CMD ["ls", "-lF", "--color", "/etc"]` would translate -to `ls -lF --color /etc`. + container based on this container image to run, if no other command is provided. It is recommended + to provide `CMD` in _exec-form_ (see the + [`CMD` section](https://docs.docker.com/engine/reference/builder/#cmd) + of the Dockerfile documentation for more details). It is written as a + list which contains the executable to run as its first element, + optionally followed by any arguments as subsequent elements. The list + is enclosed in square brackets (`[]`) and its elements are + double-quoted (`"`) strings which are separated by commas. For + example, `CMD ["ls", "-lF", "--color", "/etc"]` would translate + to `ls -lF --color /etc`. :::callout -## *shell-form* and *exec-form* for CMD + +## _shell-form_ and _exec-form_ for CMD + Another way to specify the parameter for the [`CMD` instruction](https://docs.docker.com/engine/reference/builder/#cmd) -is the *shell-form*. Here you type the command as you would call it +is the _shell-form_. Here you type the command as you would call it from the command line. Docker then silently runs this command in the -image's standard shell. `CMD cat /etc/passwd` is equivalent to `CMD +image's standard shell. `CMD cat /etc/passwd` is equivalent to `CMD ["/bin/sh", "-c", "cat /etc/passwd"]`. 
We recommend to prefer the -more explicit *exec-form* because we will be able to create more +more explicit _exec-form_ because we will be able to create more flexible container image command options and make sure complex commands are unambiguous in this format. ::: @@ -146,17 +147,18 @@ to replicate the installation we did above? Based on our experience above, edit the `Dockerfile` (in your text editor of choice) to look like this: -~~~dockerfile +```dockerfile FROM alpine RUN apk add --update python3 py3-pip python3-dev -RUN pip install cython +RUN pip install pandas CMD ["python3", "--version"] -~~~ +``` + ::: :::: The recipe provided by the `Dockerfile` shown in the solution to the preceding exercise will use Alpine Linux as the base container image, -add Python 3 and the Cython library, and set a default command to request Python 3 to report its version information. +add Python 3 and the Pandas library, and set a default command to request Python 3 to report its version information. ## Create a new Docker image @@ -167,15 +169,16 @@ resulting container as a new container image. To do this we will use the `docker image build` command. We have to provide `docker image build` with two pieces of information: + - the location of the `Dockerfile` - the name of the new container image. Remember the naming scheme from before? You should name -your new image with your Docker Hub username and a name for the container image, like this: `USERNAME/CONTAINER_IMAGE_NAME`. + your new image with your Docker Hub username and a name for the container image, like this: `USERNAME/CONTAINER_IMAGE_NAME`. All together, the build command that you should run on your computer, will have a similar structure to this: -~~~bash -$ docker image build -t USERNAME/CONTAINER_IMAGE_NAME . -~~~ +```bash +docker image build -t USERNAME/CONTAINER_IMAGE_NAME . +``` The `-t` option names the container image; the final dot indicates that the `Dockerfile` is in our current directory. 
@@ -183,17 +186,18 @@ our current directory. For example, if my user name was `alice` and I wanted to call my container image `alpine-python`, I would use this command: -~~~bash -$ docker image build -t alice/alpine-python . -~~~ +```bash +docker image build -t alice/alpine-python . +``` :::callout + ## Build Context Notice that the final input to `docker image build` isn't the Dockerfile -- it's a directory! In the command above, we've used the current working directory (`.`) of the shell as the final input to the `docker image build` command. This option provides -what is called the *build context* to Docker -- if there are files being copied +what is called the _build context_ to Docker -- if there are files being copied into the built container image [more details in the next episode](/advanced-containers) they're assumed to be in this location. Docker expects to see a Dockerfile in the build context also (unless you tell it to look elsewhere). @@ -204,40 +208,40 @@ only what you need for the container image in a build context directory, as we'v in this example. ::: - ::::challenge{id=review title="Review!"} 1. Think back to earlier. What command can you run to check if your container image was created -successfully? (Hint: what command shows the container images on your computer?) + successfully? (Hint: what command shows the container images on your computer?) 2. We didn't specify a tag for our container image name. What tag did Docker automatically use? 3. What command will run a container based on the container image you've created? What should happen by default -if you run such a container? Can you make it do something different, like print -"hello world"? + if you run such a container? Can you make it do something different, like print + "hello world"? :::solution 1. To see your new image, run `docker image ls`. You should see the name of your new -container image under the "REPOSITORY" heading. + container image under the "REPOSITORY" heading. 2. 
In the output of `docker image ls`, you can see that Docker has automatically -used the `latest` tag for our new container image. + used the `latest` tag for our new container image. 3. We want to use `docker container run` to run a container based on a container image. The following command should run a container and print out our default message, the version of Python: -~~~bash -$ docker container run alice/alpine-python -~~~ +```bash +docker container run alice/alpine-python +``` To run a container based on our container image and print out "Hello world" instead: -~~~bash -$ docker container run alice/alpine-python echo "Hello World" -~~~ +```bash +docker container run alice/alpine-python echo "Hello World" +``` + ::: :::: @@ -252,46 +256,47 @@ There are a lot of choices when it comes to installing software -- sometimes too Here are some things to consider when creating your own container image: - **Start smart**, or, don't install everything from scratch! If you're using Python -as your main tool, start with a [Python container image](https://hub.docker.com/_/python). Same with [R](https://hub.docker.com/r/rocker/r-ver/). We've used Alpine Linux as an example -in this lesson, but it's generally not a good container image to start with for initial development and experimentation because it is -a less common distribution of Linux; using [Ubuntu](https://hub.docker.com/_/ubuntu), [Debian](https://hub.docker.com/_/debian) and [CentOS](https://hub.docker.com/_/centos) are all -good options for scientific software installations. The program you're using might -recommend a particular distribution of Linux, and if so, it may be useful to start with a container image for that distribution. + as your main tool, start with a [Python container image](https://hub.docker.com/_/python). Same with [R](https://hub.docker.com/r/rocker/r-ver/). 
We've used Alpine Linux as an example + in this lesson, but it's generally not a good container image to start with for initial development and experimentation because it is + a less common distribution of Linux; using [Ubuntu](https://hub.docker.com/_/ubuntu), [Debian](https://hub.docker.com/_/debian) and [CentOS](https://hub.docker.com/_/centos) are all + good options for scientific software installations. The program you're using might + recommend a particular distribution of Linux, and if so, it may be useful to start with a container image for that distribution. - **How big?** How much software do you really need to install? When you have a choice, -lean towards using smaller starting container images and installing only what's needed for -your software, as a bigger container image means longer download times to use. + lean towards using smaller starting container images and installing only what's needed for + your software, as a bigger container image means longer download times to use. - **Know (or Google) your Linux**. Different distributions of Linux often have distinct sets of tools for installing software. The `apk` command we used above is the software package installer for Alpine Linux. The installers for various common Linux distributions are listed below: - - Ubuntu: `apt` or `apt-get` - - Debian: `deb` - - CentOS: `yum` - Most common software installations are available to be installed via these tools. - A web search for "install X on Y Linux" is usually a good start for common software - installation tasks; if something isn't available via the Linux distribution's installation - tools, try the options below. + - Ubuntu: `apt` or `apt-get` + - Debian: `deb` + - CentOS: `yum` + Most common software installations are available to be installed via these tools. 
+ A web search for "install X on Y Linux" is usually a good start for common software + installation tasks; if something isn't available via the Linux distribution's installation + tools, try the options below. - **Use what you know**. You've probably used commands like `pip` or `install.packages()` -before on your own computer -- these will also work to install things in container images (if the basic scripting -language is installed). + before on your own computer -- these will also work to install things in container images (if the basic scripting + language is installed). - **README**. Many scientific software tools have a README or installation instructions -that lay out how to install software. You want to look for instructions for Linux. If -the install instructions include options like those suggested above, try those first. + that lay out how to install software. You want to look for instructions for Linux. If + the install instructions include options like those suggested above, try those first. In general, a good strategy for installing software is: + - Make a list of what you want to install. - Look for pre-existing container images. - Read through instructions for software you'll need to install. - Try installing everything interactively in your base container -- take notes! - From your interactive installation, create a `Dockerfile` and then try to build -the container image from that. + the container image from that. ## Share your new container image on Docker Hub -Container images that you release publicly can be stored on the Docker Hub for free. If you +Container images that you release publicly can be stored on the Docker Hub for free. If you name your container image as described above, with your Docker Hub username, all you need to do is run the opposite of `docker image pull` -- `docker image push`. 
-~~~bash
-$ docker image push alice/alpine-python
-~~~
+```bash
+docker image push alice/alpine-python
+```

Make sure to substitute the full name of your container image!

@@ -299,6 +304,7 @@ In a web browser, open , and
on your user page you should now see your container image listed, for anyone to use or build on.

:::callout
+
## Logging In

Technically, you have to be logged into Docker on your computer for this to work.
@@ -309,7 +315,7 @@ try `docker image push` again.

## What's in a name? (again)

-You don't *have* to name your
+You don't _have_ to name your
container images using the
`USERNAME/CONTAINER_IMAGE_NAME:TAG` naming scheme. On your own computer, you can call
container images whatever you want, and refer to them by the names you choose. It's
only when you want to share a container image that it needs the
@@ -321,14 +327,15 @@ on her own computer. She now wants to share it in her `alice` Docker Hub account
with the name `workflow-complete` and a tag of `v1`. Her `docker image tag`
command would look like this:

-~~~bash
-$ docker image tag workflow-test alice/workflow-complete:v1
-~~~
+```bash
+docker image tag workflow-test alice/workflow-complete:v1
+```

She could then push the renamed container image to Docker Hub,
using `docker image push alice/workflow-complete:v1`.

-## Key Points:
+## Key Points
+
- `Dockerfiles` specify what is within Docker container images.
- The docker image build command is used to build a container image from a Dockerfile.
-- You can share your Docker container images through the Docker Hub so that others can create Docker containers from your container images. \ No newline at end of file
+- You can share your Docker container images through the Docker Hub so that others can create Docker containers from your container images.
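Following the "start smart" advice given above, the same recipe can be sketched starting from an official Python base image instead of bare Alpine. This is an illustrative alternative, not part of the lesson's exercises; the `python:3.8-slim` tag and the `pandas` package are example choices:

```dockerfile
# Start from an official Python base image rather than installing Python by hand
FROM python:3.8-slim

# Install the Python packages the project needs
RUN pip install pandas

# Default command: report the Python version
CMD ["python3", "--version"]
```

Built and run in the same way (`docker image build -t USERNAME/CONTAINER_IMAGE_NAME .`), this gives the same default behaviour with one fewer installation step.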
diff --git a/technology_and_tooling/docker/docker-hub.md b/technology_and_tooling/docker/docker-hub.md index 71da6757..7c2fa5f3 100644 --- a/technology_and_tooling/docker/docker-hub.md +++ b/technology_and_tooling/docker/docker-hub.md @@ -2,28 +2,27 @@ name: "Finding Containers on Docker Hub" teaching: 10 exercises: 10 -dependsOn: [ - technology_and_tooling.docker.managing-containers -] +dependsOn: [technology_and_tooling.docker.managing-containers] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- -In the previous episode, we ran a few different containers derived from different +In the previous episode, we ran a few different containers derived from different container images: `hello-world`, `alpine`, -and maybe `busybox`. Where did these container images come from? The Docker Hub! +and maybe `busybox`. Where did these container images come from? The Docker Hub! ## Introducing the Docker Hub The Docker Hub is an online repository of container images, a vast number of which are publicly available. A large number of the container images are curated by the developers of the software that they package. 
Also, many commonly used pieces of software that have been containerized into images are officially endorsed, which means that you can trust the container images to have been checked for functionality and stability, and that they don't contain malware.

:::callout
+
## Docker can be used without connecting to the Docker Hub

Note that while the Docker Hub is well integrated into Docker functionality, the
@@ -41,6 +40,7 @@ The top-left provides information about the name, short description, popularity
The top-right provides the command to pull this container image to your computer.

The main body of the page contains many useful headings, such as:
+
- Which tags (i.e., container image versions) are supported;
- Summary information about where to get help, which computer architectures are supported, etc.;
- A longer description of the container image;
@@ -52,24 +52,24 @@ The "How to use the image" section of most container images' pages will provide

## Exploring Container Image Versions

A single Docker Hub page can have many different versions of container images,
-based on the version of the software inside. These
+based on the version of the software inside. These
versions are indicated by "tags". When referring to the
specific version of a container image by its tag, you use a colon, `:`, like this:

-```
+```text
CONTAINER_IMAGE_NAME:TAG
```

So if I wanted to download the `python` container image, with Python 3.8, I would use this name:

```bash
-$ docker image pull python:3.8
+docker image pull python:3.8
```

But if I wanted to download a Python 3.6 container image, I would use this name:

```bash
-$ docker image pull python:3.6
+docker image pull python:3.6
```

The default tag (which is used if you don't specify one) is called `latest`.
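To make the `CONTAINER_IMAGE_NAME:TAG` structure concrete, here is a small illustrative shell sketch (plain string handling, no Docker required) that splits an image reference at the colon:

```bash
# Split an image reference of the form NAME:TAG
ref="python:3.8"

name="${ref%%:*}"  # everything before the colon
tag="${ref##*:}"   # everything after the colon

echo "name=$name tag=$tag"  # prints: name=python tag=3.8
```

The same two-part pattern is what Docker fills in for you when you omit the tag: `python` alone is read as `python:latest`.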
@@ -83,11 +83,12 @@ groups like [rocker](https://hub.docker.com/u/rocker), a group that builds commu The name for these group- or individually-managed container images have this format: -``` +```text OWNER/CONTAINER_IMAGE_NAME:TAG ``` :::callout + ## Repositories The technical name for the contents of a Docker Hub page is a "repository." @@ -95,9 +96,10 @@ The tag indicates the specific version of the container image that you'd like to use from a particular repository. So a slightly more accurate version of the above example is: -``` +```text OWNER/REPOSITORY:TAG ``` + ::: ::::challenge{id=in-a-name title="What's in a name?"} @@ -110,15 +112,17 @@ later in this lesson, so you don't actually need to pull the container image -- constructing the correct `docker pull` command is sufficient. :::solution + ## Solution First, search for `rocker` in Docker Hub. Then look for their `tidyverse` container image. You can look at the list of tags, or just guess that the tag is `3.6.1`. Altogether, that means that the name of the container image we want to download is: -~~~bash -$ docker image pull rocker/tidyverse:3.6.1 -~~~ +```bash +docker image pull rocker/tidyverse:3.6.1 +``` + ::: :::: @@ -156,10 +160,11 @@ Once you find a container image, use the skills from the previous episode to dow the container image and explore it. ::: -## Key Points: +## Key Points + - The Docker Hub is an online repository of container images. - Many Docker Hub container images are public, and may be officially endorsed. - Each Docker Hub page about a container image provides structured information and subheadings - Most Docker Hub pages about container images contain sections that provide examples of how to use those container images. - Many Docker Hub container images have multiple versions, indicated by tags. 
-- The naming convention for Docker container images is: `OWNER/CONTAINER_IMAGE_NAME:TAG` \ No newline at end of file +- The naming convention for Docker container images is: `OWNER/CONTAINER_IMAGE_NAME:TAG` diff --git a/technology_and_tooling/docker/docker-image-examples.md b/technology_and_tooling/docker/docker-image-examples.md index 4131a5de..d06532e1 100644 --- a/technology_and_tooling/docker/docker-image-examples.md +++ b/technology_and_tooling/docker/docker-image-examples.md @@ -2,17 +2,15 @@ name: "Examples" teaching: 20 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.advanced-containers -] +dependsOn: [technology_and_tooling.docker.advanced-containers] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- Now that we have learned the basics of working with Docker container images and containers, @@ -33,7 +31,6 @@ In this [GitHub Actions example](e01-github-actions), you can learn more about continuous integration in the cloud and how you can use container images with GitHub to automate repetitive tasks like testing code or deploying websites. 
- ## Using Containers on an HPC Cluster It is possible to run containers on shared computing systems run by a university or national @@ -41,7 +38,7 @@ computing center. As a researcher, you can build container images and test conta computer and then run your full-scale computing work on a shared computing system like a high performance cluster or high throughput grid. -The catch? Most university and national computing centers do not support *running* +The catch? Most university and national computing centers do not support _running_ containers with Docker commands, and instead use a similar tool called Singularity or Shifter. However, both of these programs can be used to run containers based on Docker container images, so often people create their container image as a Docker container image, so they can @@ -54,5 +51,6 @@ following resources show what it can look like: - [Introduction to Singularity](https://carpentries-incubator.github.io/singularity-introduction/): See the episode titled "Running MPI parallel jobs using Singularity containers" - [Container Workflows at Pawsey](https://pawseysc.github.io/container-workflows/): See the episode titled "Run containers on HPC with Shifter (and Singularity)" -## Key points: -- There are many ways you might use Docker and existing container images in your research project. \ No newline at end of file +## Key points + +- There are many ways you might use Docker and existing container images in your research project. 
diff --git a/technology_and_tooling/docker/e01-github-actions.md b/technology_and_tooling/docker/e01-github-actions.md index 7a7a3f57..7d18105c 100644 --- a/technology_and_tooling/docker/e01-github-actions.md +++ b/technology_and_tooling/docker/e01-github-actions.md @@ -3,26 +3,24 @@ name: "Using Docker with Github Actions" layout: episode teaching: 30 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.docker-image-examples -] +dependsOn: [technology_and_tooling.docker.docker-image-examples] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- - Docker has become an industry standard in providing run-time environments to cloud services. This lesson shows how you can use Docker images inside Github Actions. Our specific example will show a neat way to build a simple website that goes with any project you might have going. 
# Github Actions
+
Github Actions are a means of automating repetitive tasks in maintaining software projects:

- Testing if your software works correctly (Continuous Integration)
@@ -37,7 +35,7 @@ These are tasks that you could do on your own computer, but consider the followi
- Someone else contributed to your package but didn't run the same version of the
document converter: the documentation looks different now.

-These are just some of the bad things that may happen. To address these issues,
+These are just some of the bad things that may happen. To address these issues,
it is often desirable to have a consistent, controlled environment in which to run
these tasks, collectively known as CI/CD. Github can perform these actions for you
inside Docker containers. If your project is open source, this service
@@ -62,14 +60,18 @@ A fabulous tool for building web content from Markdown files is Pandoc. You coul
army knife of document conversion: it is very, very versatile. In this instance we
will only use its most basic operation. (If you are familiar with RMarkdown: Pandoc
is what powers RMarkdown).

-:::callout
+:::callout{variant="discussion"}
+
## Why Pandoc?
+
There are other engines that can do this for you, but here are some features that
win some people over:
+
- Supports citations (from BibTeX or CSL database)
- Rendered equations (using MathJax, optionally numbered)
- Code highlighting
- Highly customizable
+
:::

We take you through the process of creating a project on Github from scratch and
The instructions for -doing so will be shown in the dialog on Github, or you can also see [Software Carpentry lesson on Version -Control with Git](http://swcarpentry.github.io/git-novice/07-github/index.html), or -the example below: + doing so will be shown in the dialog on Github, or you can also see [Software Carpentry lesson on Version + Control with Git](http://swcarpentry.github.io/git-novice/07-github/index.html), or + the example below: -~~~bash +```bash git clone cd -~~~ +``` ## Using Pandoc to Create a Website @@ -116,11 +119,11 @@ it to generate static websites from Markdown. First, let's download a container with pandoc installed and run it to see what the pandoc version is. -~~~bash +```bash docker run pandoc/core --version -~~~ +``` -~~~ +```text Unable to find image 'pandoc/core:latest' locally latest: Pulling from pandoc/core f84cab65f19f: Pull complete @@ -136,19 +139,19 @@ User data directory: /root/.local/share/pandoc Copyright (C) 2006-2021 John MacFarlane. Web: https://pandoc.org This is free software; see the source for copying conditions. There is no warranty, not even for merchantability or fitness for a particular purpose. -~~~ +``` Now, we can run pandoc on our `README.md` file by including our current directory and the `README.md` file as part of the `docker run` command: -~~~bash +```bash docker run --mount type=bind,source=${PWD},target=/tmp pandoc/core /tmp/README.md -~~~ +``` -~~~ +```html

<h1 id="readme-pages">readme-pages</h1>

<p>Example for generating Github.io pages from Readme with Pandoc.</p>

-~~~
+```

Here, the `--mount type=bind,source=${PWD},target=/tmp` flag says to take the directory at `${PWD}` and make
it available inside the container as `/tmp`. Then `pandoc` can read the source
file (`README.md`) and convert it to HTML. While this HTML
@@ -156,41 +159,41 @@ is valid, it doesn't show the complete structure of a standalone HTML document.
add the `--standalone` argument to the pandoc command. Also we can redirect the output to create an HTML file in the `build` directory.

-~~~bash
+```bash
mkdir -p build
docker run --mount type=bind,source=${PWD},target=/tmp pandoc/core /tmp/README.md --standalone --output=/tmp/build/index.html
-~~~
+```

-~~~
+```text
[WARNING] This document format requires a nonempty <title> element.
Defaulting to 'README' as the title.
To specify a title, use 'title' in metadata or --metadata title="...".
-~~~
+```

To suppress the warning message we may add the following lines at the top of the `README.md` file:

-~~~
+```text
---
title: Hello, Pandoc
---
-~~~
+```

Or add the mentioned `--metadata title="..."` to the command line.

Once we've made all of these changes, and produced the output we want, we can
check it, using this command:

-~~~bash
+```bash
cat build/index.html
-~~~
+```

-~~~
+```text
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
... etc
-~~~
+```

We now have tested our website deployment workflow - given the source files from
Github, we can use a Docker container and command to generate our website. We now
@@ -208,51 +211,51 @@ are taken immediately to a menu for creating a new one. We will skip the template
The first entry is the **name** of the workflow

-~~~yaml
+```yaml
name: Deploy pages
-~~~
+```

Next we specify **when** this workflow is run. In this case: every time content is
pushed to the `main` branch

-~~~yaml
+```yaml
on:
  push:
    branches:
      - main
-~~~
+```

Now we tell Github **what** to do.
-~~~yaml +```yaml jobs: - deploy: # a free machine-readable name for this job - runs-on: ubuntu-latest # specify the base operating system + deploy: # a free machine-readable name for this job + runs-on: ubuntu-latest # specify the base operating system steps: - - name: Checkout repo content # fetch the contents of the repository + - name: Checkout repo content # fetch the contents of the repository uses: actions/checkout@v2 - name: Prepare build environment - run: | # multiple Bash commands follow + run: | # multiple Bash commands follow mkdir -p build touch build/.nojekyll -~~~ +``` Now for the Docker bit: -~~~yaml - - name: Run pandoc - uses: docker://pandoc/core:2.12 # Always specify a version! - with: - args: >- # multi-line argument - --standalone - --output=build/index.html - README.md - - name: Deploy on github pages # Use a third-party plugin to upload the content - uses: JamesIves/github-pages-deploy-action@4.1.0 - with: - branch: gh-pages - folder: build -~~~ +```yaml +- name: Run pandoc + uses: docker://pandoc/core:2.12 # Always specify a version! + with: + args: >- # multi-line argument + --standalone + --output=build/index.html + README.md +- name: Deploy on github pages # Use a third-party plugin to upload the content + uses: JamesIves/github-pages-deploy-action@4.1.0 + with: + branch: gh-pages + folder: build +``` We may recognize the command-line that we had previously. Notice that we don't need to specify the `--mount` flag. Github Actions arranges the Docker environment such that the files are in the correct @@ -263,7 +266,8 @@ Now we should enable Github Pages on this repository: go to the "Settings" tab a seconds the page should be up. 
# Reference material + - [Pandoc the universal document converter](https://pandoc.org) - [Documentation on GitHub Actions](https://docs.github.com/en/actions) - [GitHub Pages deploy action](https://github.com/marketplace/actions/deploy-to-github-pages) -- [Pandoc action example](https://github.com/pandoc/pandoc-action-example) \ No newline at end of file +- [Pandoc action example](https://github.com/pandoc/pandoc-action-example) diff --git a/technology_and_tooling/docker/e02-jekyll-lesson-example.md b/technology_and_tooling/docker/e02-jekyll-lesson-example.md index c31b2582..2334dd0b 100644 --- a/technology_and_tooling/docker/e02-jekyll-lesson-example.md +++ b/technology_and_tooling/docker/e02-jekyll-lesson-example.md @@ -2,32 +2,31 @@ name: "Using Docker with Jekyll" teaching: 20 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.docker-image-examples -] +dependsOn: [technology_and_tooling.docker.docker-image-examples] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- As previously mentioned earlier in the lesson, containers can be helpful for -using software that can be difficult to install. 
An example is the software -that generates this lesson website. The website for this lesson is generated mechanically, +using software that can be difficult to install. An example is the software +that generates this lesson website. The website for this lesson is generated mechanically, based on a set of files that specify the configuration of the site, its presentation template, -and the content to go on this page. When working on updates to this lesson, +and the content to go on this page. When working on updates to this lesson, you might want to preview those changes using a local copy of the website. This requires installing Jekyll and dependencies such as Ruby and Gemfiles to your local computer which can be difficult to achieve given complexities such as needing to match specific versions of the software components. Instead you could use Docker and a pre-built Jekyll container image. First we need to get a copy of the website source to work with on your computer. -In your shell window, in your `docker-intro` create a new directory `build-website` and `cd` into it. We will be expanding a ZIP file into this directory later. +In your shell window, in your `docker-intro` create a new directory `build-website` and `cd` into it. We will be expanding a ZIP file into this directory later. Now open a web browser window and: + 1. Navigate to the [GitHub repository](https://github.com/carpentries-incubator/docker-introduction) that contains the files for this session; 2. Click the green "Clone or download" button on the right-hand side of the page; 3. Click "Download ZIP". @@ -35,6 +34,7 @@ Now open a web browser window and: 5. Move the `docker-introduction-gh-pages` folder into the `build-website` folder you created above. :::callout + ## There are many ways to work with ZIP files Note that the last two steps can be achieved using a Mac or Windows graphical user interface. 
There are also ways to effect expanding the ZIP archive on the command line, for example, on my Mac I can achieve the effect of those last two steps through running the command `unzip ~/Downloads/docker-introduction-gh-pages.zip`. @@ -42,21 +42,21 @@ Note that the last two steps can be achieved using a Mac or Windows graphical us In your shell window, if you `cd` into the `docker-introduction-gh-pages` folder and list the files, you should see something similar to what I see: -~~~bash -$ cd docker-introduction-gh-pages -$ ls -~~~ - -~~~ -AUTHORS _episodes code -CITATION _episodes_rmd data -CODE_OF_CONDUCT.md _extras fig -CONTRIBUTING.md _includes files -LICENSE.md _layouts index.md -Makefile aio.md reference.md -README.md assets setup.md -_config.yml bin -~~~ +```bash +cd docker-introduction-gh-pages +ls +``` + +```text +AUTHORS _episodes code +CITATION _episodes_rmd data +CODE_OF_CONDUCT.md _extras fig +CONTRIBUTING.md _includes files +LICENSE.md _layouts index.md +Makefile aio.md reference.md +README.md assets setup.md +_config.yml bin +``` You can now request that a container is created that will compile the files in this set into the lesson website, and will run a simple webserver to allow you @@ -69,21 +69,21 @@ away the container. 
For macOS, Linux and PowerShell: -~~~bash -$ docker run --rm -it --mount type=bind,source=${PWD},target=/srv/jekyll -p 127.0.0.1:4000:4000 jekyll/jekyll:3 jekyll serve -~~~ +```bash +docker run --rm -it --mount type=bind,source=${PWD},target=/srv/jekyll -p 127.0.0.1:4000:4000 jekyll/jekyll:3 jekyll serve +``` When I ran the macOS command, the output was as follows: -~~~ +```text Unable to find image 'jekyll/jekyll:3' locally 3: Pulling from jekyll/jekyll -9d48c3bd43c5: Pull complete -9ce9598067e7: Pull complete -278f4c997324: Pull complete -bfca09e5fd9a: Pull complete -2612f15b9d22: Pull complete -322c093d5418: Pull complete +9d48c3bd43c5: Pull complete +9ce9598067e7: Pull complete +278f4c997324: Pull complete +bfca09e5fd9a: Pull complete +2612f15b9d22: Pull complete +322c093d5418: Pull complete Digest: sha256:9521c8aae4739fcbc7137ead19f91841b833d671542f13e91ca40280e88d6e34 Status: Downloaded newer image for jekyll/jekyll:3 @@ -95,13 +95,13 @@ To use retry middleware with Faraday v2.0+, install `faraday-retry` gem Source: /srv/jekyll Destination: /srv/jekyll/_site Incremental build: disabled. Enable with --incremental - Generating... + Generating... Remote Theme: Using theme carpentries/carpentries-theme done in 7.007 seconds. Auto-regeneration: enabled for '/srv/jekyll' Server address: http://0.0.0.0:4000 Server running... press ctrl-c to stop. -~~~ +``` In the preceding output, you see Docker downloading the container image for Jekyll, which is a tool for building websites from specification files such as @@ -131,4 +131,4 @@ You can stop the Jekyll container by clicking in its terminal window and typing `Ctrl-C` You have now achieved using a reproducible computational environment to -reproduce a lesson about reproducible computing environments. \ No newline at end of file +reproduce a lesson about reproducible computing environments. 
diff --git a/technology_and_tooling/docker/files/docker-intro.zip b/technology_and_tooling/docker/files/docker-intro.zip index 58f01e9d..9315776d 100644 Binary files a/technology_and_tooling/docker/files/docker-intro.zip and b/technology_and_tooling/docker/files/docker-intro.zip differ diff --git a/technology_and_tooling/docker/index.md b/technology_and_tooling/docker/index.md index 100475e8..407841ea 100644 --- a/technology_and_tooling/docker/index.md +++ b/technology_and_tooling/docker/index.md @@ -1,40 +1,40 @@ --- name: Containerisation with Docker id: docker -dependsOn: [ - technology_and_tooling.bash_shell -] -files: [ - introduction.md, - meet-docker.md, - managing-containers.md, - running-containers.md, - docker-hub.md, - creating-container-images.md, - advanced-containers.md, - docker-image-examples.md, - e01-github-actions.md, - e02-jekyll-lesson-example.md, - reproducibility.md, - setup.md, -] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +dependsOn: [technology_and_tooling.bash_shell] +files: + [ + introduction.md, + meet-docker.md, + managing-containers.md, + running-containers.md, + docker-hub.md, + creating-container-images.md, + advanced-containers.md, + docker-image-examples.md, + e01-github-actions.md, + e02-jekyll-lesson-example.md, + reproducibility.md, + setup.md, + ] +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. 
+ url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 summary: | - This course aims to introduce the use of Docker containers with the goal of - using them to effect reproducible computational environments. + This course aims to introduce the use of Docker containers with the goal of + using them to effect reproducible computational environments. --- This session aims to introduce the use of Docker containers with the goal of using them to effect reproducible computational environments. Such environments are useful for ensuring reproducible research outputs, for example. -## After completing this session you should: +## After completing this session you should + - Have an understanding of what Docker containers are, why they are useful and the common terminology used - Have a working Docker installation on your local system to allow you to use containers - Understand how to use existing Docker containers for common tasks @@ -44,6 +44,7 @@ are useful for ensuring reproducible research outputs, for example. - The practical work in this lesson is primarily aimed at using Docker on your own laptop. Beyond your laptop, software container technologies such as Docker can also be used in the cloud and on high performance computing (HPC) systems. Some of the material in this lesson will be applicable to those environments too. :::callout + ## A note about Docker Docker is a mature, robust and very widely used application. Nonetheless, it is @@ -54,4 +55,3 @@ While we do our best to ensure that this lesson remains up to date and the descriptions and outputs shown match what you will see on your own computer, inconsistencies can occur. 
::: - diff --git a/technology_and_tooling/docker/introduction.md b/technology_and_tooling/docker/introduction.md index 95d0151f..c0197f20 100644 --- a/technology_and_tooling/docker/introduction.md +++ b/technology_and_tooling/docker/introduction.md @@ -2,20 +2,19 @@ name: Introduction teaching: 20 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.setup -] +dependsOn: [technology_and_tooling.docker.setup] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- :::callout + ## Learning about Docker Containers The Australian Research Data Commons has produced a short introductory video @@ -24,10 +23,9 @@ or after you go through this section to reinforce your understanding! [How can software containers help your research?](https://www.youtube.com/watch?v=HelrQnm3v4g) -Australian Research Data Commons, 2021. *How can software containers help your research?*. [video] Available at: https://www.youtube.com/watch?v=HelrQnm3v4g DOI: http://doi.org/10.5281/zenodo.5091260 +Australian Research Data Commons, 2021. _How can software containers help your research?_. 
[video] Available at: <https://www.youtube.com/watch?v=HelrQnm3v4g> DOI: <http://doi.org/10.5281/zenodo.5091260> ::: - ## Scientific Software Challenges :::challenge{id=what-is-your-experience title="What's Your Experience?"} @@ -41,7 +39,7 @@ challenges. You may have come up with some of the following: - you want to use software that doesn't exist for the operating system (Mac, Windows, Linux) you'd prefer. -- you struggle with installing a software tool because you have to install a number of other dependencies first. Those dependencies, in turn, require *other* things, and so on (i.e. combinatoric explosion). +- you struggle with installing a software tool because you have to install a number of other dependencies first. Those dependencies, in turn, require _other_ things, and so on (i.e. combinatoric explosion). - the software you're setting up involves many dependencies and only a subset of all possible versions of those dependencies actually works as desired. - you're not actually sure what version of the software you're using because the install process was so circuitous. - you and a colleague are using the same software but get different results because you have installed different versions and/or are using different operating systems. @@ -66,6 +64,7 @@ doesn't work? Unsurprisingly, software installation and configuration challenges can have negative consequences for research: + - you can't use a specific tool at all, because it's not available or installable. - you can't reproduce your results because you're not sure what tools you're actually using. - you can't access extra/newer resources because you're not able to replicate your software set up. @@ -79,7 +78,7 @@ and access to resources such as files and communications networks in a uniform m [Docker](https://www.docker.com/) is a tool that allows you to build what are called "containers." It's not the only tool that can create containers, but is the one we've chosen for -this workshop. 
But what *is* a container? +this workshop. But what _is_ a container? To understand containers, let's first talk briefly about your computer. @@ -100,7 +99,7 @@ of making a mess of your existing system by installing a bunch of additional stu You don't want to buy a whole new computer because it's too expensive. What if, instead, you could have another independent filesystem and running operating system that you could access from your main computer, and that is actually stored within this existing computer? -Or, imagine you have two tools you want to use in your groundbreaking research on cat memes: `PurrLOLing`, a tool that does AMAZINGLY well at predicting the best text for a meme based on the cat species and `WhiskerSpot`, the only tool available for identifying cat species from images. You want to send cat pictures to `WhiskerSpot`, and then send the species output to `PurrLOLing`. But there's a problem: `PurrLOLing` only works on Ubuntu and `WhiskerSpot` is only supported for OpenSUSE so you can't have them on the same system! Again, we really want another filesystem (or two) on our computer that we could use to chain together `WhiskerSpot` and `PurrLOLing` in a "pipeline"... +Or, imagine you have two tools you want to use in your groundbreaking research on cat memes: `PurrLOLing`, a tool that does AMAZINGLY well at predicting the best text for a meme based on the cat species and `WhiskerSpot`, the only tool available for identifying cat species from images. You want to send cat pictures to `WhiskerSpot`, and then send the species output to `PurrLOLing`. But there's a problem: `PurrLOLing` only works on Ubuntu and `WhiskerSpot` is only supported for OpenSUSE so you can't have them on the same system! Again, we really want another filesystem (or two) on our computer that we could use to chain together `WhiskerSpot` and `PurrLOLing` in a "pipeline"... Container systems, like Docker, are special programs on your computer that make it possible! 
The term "container" can be usefully considered with reference to shipping @@ -112,6 +111,7 @@ creation of a complete software system: you can drop a container into any comput the container software installed (the 'container host'), and it should "just work". :::callout + ## Virtualization Containers are an example of what's called **virtualization** -- having a @@ -135,7 +135,6 @@ can be used to create multiple copies of the same shape (or container) and is relatively unchanging, where cookies come and go. If you want a different type of container (cookie) you need a different container image (cookie cutter). - ## Putting the Pieces Together Think back to some of the challenges we described at the beginning. The many layers @@ -173,11 +172,12 @@ a research context include: - Archiving the container images so you can repeat analysis/modelling using the same software and configuration in the future -- capturing your workflow. -## Key Points: +## Key Points + - Almost all software depends on other software components to function, but these components have independent evolutionary paths. - Small environments that contain only the software that is needed for a given task are easier to replicate and maintain. - Critical systems that cannot be upgraded, due to cost, difficulty, etc. need to be reproduced on newer systems in a maintainable and self-documented way. - Virtualization allows multiple environments to run on a single computer. - Containerization improves upon the virtualization of whole computers by allowing efficient management of the host computer’s memory and storage resources. - Containers are built from ‘recipes’ that define the required set of software components and the instructions necessary to build/install them within a container image. -- Docker is just one software platform that can create containers and the resources they use. \ No newline at end of file +- Docker is just one software platform that can create containers and the resources they use. 
diff --git a/technology_and_tooling/docker/managing-containers.md b/technology_and_tooling/docker/managing-containers.md index fca65903..2f6ec43f 100644 --- a/technology_and_tooling/docker/managing-containers.md +++ b/technology_and_tooling/docker/managing-containers.md @@ -2,17 +2,15 @@ name: "Cleaning Up Containers" teaching: 10 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.running-containers -] +dependsOn: [technology_and_tooling.docker.running-containers] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- ## Removing images @@ -22,32 +20,32 @@ The container images and their corresponding containers can start to take up a l In order to remove a specific container image, you need to find out details about the container image, specifically, the "Image ID". 
For example, say my laptop contained the following container image: -~~~bash -$ docker image ls -~~~ +```bash +docker image ls +``` -~~~ +```text REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest fce289e99eb9 15 months ago 1.84kB -~~~ +``` -You can remove the container image with a `docker image rm` command that includes the *Image ID*, such as: +You can remove the container image with a `docker image rm` command that includes the _Image ID_, such as: -~~~bash -$ docker image rm fce289e99eb9 -~~~ +```bash +docker image rm fce289e99eb9 +``` or use the container image name, like so: -~~~bash -$ docker image rm hello-world -~~~ +```bash +docker image rm hello-world +``` However, you may see this output: -~~~ +```text Error response from daemon: conflict: unable to remove repository reference "hello-world" (must force) - container e7d3b76b00f4 is using its referenced image fce289e99eb9 -~~~ +``` This happens when Docker hasn't cleaned up some of the previously running containers based on this container image. So, before removing the container image, we need to be able @@ -56,19 +54,20 @@ to remove these. ## What containers are running? -Working with containers, we are going to shift back to the command: `docker container`. Similar to `docker image`, we can list running containers by typing: +Working with containers, we are going to shift back to the command: `docker container`. Similar to `docker image`, we can list running containers by typing: -~~~bash -$ docker container ls -~~~ +```bash +docker container ls +``` -~~~ +```text CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -~~~ +``` Notice that this command didn't return any containers because our containers all exited and thus stopped running after they completed their work. :::callout + ## `docker ps` The command `docker ps` serves the same purpose as `docker container ls`, and comes @@ -79,17 +78,18 @@ from the Unix shell command `ps` which describes running processes. 
There is also a way to list running containers, and those that have completed recently, which is to add the `--all`/`-a` flag to the `docker container ls` command as shown below. -~~~bash -$ docker container ls --all -~~~ +```bash +docker container ls --all +``` -~~~ +```text CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9c698655416a hello-world "/hello" 2 minutes ago Exited (0) 2 minutes ago zen_dubinsky 6dd822cf6ca9 hello-world "/hello" 3 minutes ago Exited (0) 3 minutes ago eager_engelbart -~~~ +``` :::callout + ## Keeping it clean You might be surprised at the number of containers Docker is still keeping track of. @@ -103,13 +103,13 @@ a reference to the running container for any reason, **don't** use this flag. To delete an exited container you can run the following command, inserting the `CONTAINER ID` for the container you wish to remove. It will repeat the `CONTAINER ID` back to you, if successful. -~~~bash -$ docker container rm 9c698655416a -~~~ +```bash +docker container rm 9c698655416a +``` -~~~ +```text 9c698655416a -~~~ +``` An alternative option for deleting exited containers is the `docker container prune` command. Note that this command doesn't accept a container ID as an @@ -121,41 +121,42 @@ It will ask you if to confirm you want to remove these containers, see output be If successful it will print the full `CONTAINER ID` back to you for each container it has removed. -~~~bash -$ docker container prune -~~~ +```bash +docker container prune +``` -~~~ +```text WARNING! This will remove all stopped containers. Are you sure you want to continue? [y/N] y Deleted Containers: 9c698655416a848278d16bb1352b97e72b7ea85884bff8f106877afe0210acfc 6dd822cf6ca92f3040eaecbd26ad2af63595f30bb7e7a20eacf4554f6ccc9b2b -~~~ +``` ## Removing images, for real this time Now that we've removed any potentially running or stopped containers, we can try again to delete the `hello-world` **container image**. 
-~~~bash -$ docker image rm hello-world -~~~ +```bash +docker image rm hello-world +``` -~~~ +```text Untagged: hello-world:latest Untagged: hello-world@sha256:5f179596a7335398b805f036f7e8561b6f0e32cd30a32f5e19d17a3cda6cc33d Deleted: sha256:fce289e99eb9bca977dae136fbe2a82b6b7d4c372474c9235adc1741675f587e Deleted: sha256:af0b15c8625bb1938f1d7b17081031f649fd14e6b233688eea3c5483994a66a3 -~~~ +``` The reason that there are a few lines of output, is that a given container image -may have been formed by merging multiple underlying layers. Any layers that are -used by multiple Docker container images will only be stored once. Now the +may have been formed by merging multiple underlying layers. Any layers that are +used by multiple Docker container images will only be stored once. Now the result of `docker image ls` should no longer include the `hello-world` container image. -## Key Points: +## Key Points + - `docker container` has subcommands used to interact and manage containers. - `docker image` has subcommands used to interact and manage container images. -- `docker container ls` or `docker ps` can provide information on currently running containers. \ No newline at end of file +- `docker container ls` or `docker ps` can provide information on currently running containers. diff --git a/technology_and_tooling/docker/meet-docker.md b/technology_and_tooling/docker/meet-docker.md index 4968c9c9..50a817c0 100644 --- a/technology_and_tooling/docker/meet-docker.md +++ b/technology_and_tooling/docker/meet-docker.md @@ -2,24 +2,25 @@ name: "Docker Command Line" teaching: 10 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.introduction -] +dependsOn: [technology_and_tooling.docker.introduction] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. 
- url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- + ## Docker command line -Start the Docker application that you installed in working through the setup instructions for this session. Note that this might not be necessary if your laptop is running Linux or if the installation added the Docker application to your startup process. +Start the Docker application that you installed in working through the setup instructions for this session. Note that this might not be necessary if your laptop is running Linux or if the installation added the Docker application to your startup process. :::callout + ## You may need to login to Docker Hub + The Docker application will usually provide a way for you to log in to the Docker Hub using the application's menu (macOS) or systray icon (Windows) and it is usually convenient to do this when the application starts. This will require you to use your Docker Hub username and your password. We will not actually require access to the Docker Hub until later in the course but if you can login now, @@ -27,23 +28,27 @@ you should do so. 
::: :::callout + ## Determining your Docker Hub username + If you no longer recall your Docker Hub username, e.g., because you have been logging into the Docker Hub using your email address, you can find out what it is through the steps: + - Open <https://hub.docker.com/> in a web browser window - Sign-in using your email and password (don't tell us what it is) - In the top-right of the screen you will see your username + ::: Once your Docker application is running, open a shell (terminal) window, and run the following command to check that Docker is installed and the command line tools are working correctly. Below is the output for a Mac version, but the specific version is unlikely to matter much: it does not have to precisely match the one listed below. -~~~bash -$ docker --version -~~~ +```bash +docker --version +``` -~~~ +```text Docker version 20.10.5, build 55c4c88 -~~~ +``` The above command has not actually relied on the part of Docker that runs containers, just that Docker is installed and you can access it correctly from the command line. @@ -52,31 +57,31 @@ A command that checks that Docker is working correctly is the `docker container Without explaining the details, output on a newly installed system would likely be: -~~~bash -$ docker container ls -~~~ +```bash +docker container ls +``` -~~~ +```text CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -~~~ +``` (The command `docker system info` could also be used to verify that Docker is correctly installed and operational but it produces a larger amount of output.) However, if you instead get a message similar to the following -~~~ +```text Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? -~~~ +``` then you need to check that you have started the Docker Desktop, Docker Engine, or however else you worked through the setup instructions. ## Getting help + Often when working with a new command line tool, we need to get help. 
These tools often have some sort of subcommand or flag (usually `help`, `-h`, or `--help`) that displays a prompt describing how to use the tool. For Docker, it's no different. If we run `docker --help`, we see the following output (running `docker` also produces the help message): -~~~ - +```text Usage: docker [OPTIONS] COMMAND A self-sufficient runtime for containers @@ -158,18 +163,17 @@ Commands: wait Block until one or more containers stop, then print their exit codes Run 'docker COMMAND --help' for more information on a command. -~~~ +``` There is a list of commands and the end of the help message says: `Run 'docker COMMAND --help' for more information on a command.` For example, take the `docker container ls` command that we ran previously. We can see from the Docker help prompt that `container` is a Docker command, so to get help for that command, we run: -~~~bash +```bash docker container --help # or instead 'docker container' -~~~ - -~~~ +``` +```text Usage: docker container COMMAND Manage containers @@ -202,15 +206,15 @@ Commands: wait Block until one or more containers stop, then print their exit codes Run 'docker container COMMAND --help' for more information on a command. -~~~ +``` There's also help for the `container ls` command: -~~~bash +```bash docker container ls --help # this one actually requires the '--help' flag -~~~ +``` -~~~ +```text Usage: docker container ls [OPTIONS] List containers @@ -227,7 +231,7 @@ Options: --no-trunc Don't truncate output -q, --quiet Only display container IDs -s, --size Display total file sizes -~~~ +``` You may notice that there are many commands that stem from the `docker` command. Instead of trying to remember all possible commands and options, it's better to learn how to effectively get help from the command line. Although @@ -235,13 +239,14 @@ we can always search the web, getting the built-in help from our tool is often m right away. This applies not only to Docker, but also to most command line-based tools. 
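As a quick, Docker-free illustration of this habit, the same `--help` pattern works with almost any command-line tool. Here GNU `grep` is used purely as a stand-in example (assuming it is installed, which it is on virtually every Linux or macOS system — it is not part of this lesson):

```shell
# The built-in help habit applies to most command-line tools, not just Docker.
# GNU grep is a stand-in example here (assumed installed).
grep --help | head -n 4   # prints a usage summary, analogous to 'docker --help'
```

The first lines of that output give the usage synopsis, just as `docker --help` does, so you can discover options without leaving the terminal.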
:::callout + ## Docker Command Line Interface (CLI) syntax In this lesson we use the newest Docker CLI syntax [introduced with the Docker Engine version 1.13](https://www.docker.com/blog/whats-new-in-docker-1-13/). -This new syntax combines commands into groups you will most often +This new syntax combines commands into groups you will most often want to interact with. In the help example above you can see `image` and `container` -management commands, which can be used to interact with your images and +management commands, which can be used to interact with your images and containers respectively. With this new syntax you issue commands using the following pattern `docker [command] [subcommand] [additional options]` @@ -256,17 +261,17 @@ error-prone and is therefore recommended. ::::challenge{id=exploring-a-command title="Exploring a command"} Run `docker --help` and pick a command from the list. -Explore the help prompt for that command. Try to guess how a command would work by looking at the `Usage: ` +Explore the help prompt for that command. Try to guess how a command would work by looking at the `Usage:` section of the prompt. :::solution Suppose we pick the `docker image build` command: -~~~bash +```bash docker image build --help -~~~ +``` -~~~ +```text Usage: docker image build [OPTIONS] PATH | URL | - Build an image from a Dockerfile @@ -300,25 +305,26 @@ Options: -t, --tag list Name and optionally a tag in the 'name:tag' format --target string Set the target build stage to build. --ulimit ulimit Ulimit options (default []) -~~~ +``` We could try to guess that the command could be run like this: -~~~bash +```bash docker image build . -~~~ +``` or -~~~bash +```bash docker image build https://github.com/docker/rootfs.git -~~~ +``` Where `https://github.com/docker/rootfs.git` could be any relevant URL that supports a Docker image. ::: :::: -## Key Points: +## Key Points + - A toolbar icon indicates that Docker is ready to use (on Windows and macOS). 
- You will typically interact with Docker using the command line. -- To learn how to run a certain Docker command, we can type the command followed by the `--help` flag. \ No newline at end of file +- To learn how to run a certain Docker command, we can type the command followed by the `--help` flag. diff --git a/technology_and_tooling/docker/reproducibility.md b/technology_and_tooling/docker/reproducibility.md index 21d2625e..f39104e1 100644 --- a/technology_and_tooling/docker/reproducibility.md +++ b/technology_and_tooling/docker/reproducibility.md @@ -2,17 +2,15 @@ name: "Reproducibility and Granularity" teaching: 20 exercises: 0 -dependsOn: [ - technology_and_tooling.docker.docker-image-examples -] +dependsOn: [technology_and_tooling.docker.docker-image-examples] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- Although this workshop is titled "Reproducible computational environments using containers", @@ -20,25 +18,27 @@ so far we have mostly covered the mechanics of using Docker with only passing re the reproducibility aspects. In this section, we discuss these aspects in more detail. 
:::callout -## Work in progress... + +## Work in progress + Note that reproducibility aspects of software and containers are an active area of research, discussion and development so are subject to many changes. We will present some ideas and approaches here but best practices will likely evolve in the near future. ::: ## Reproducibility -By *reproducibility* here we mean the ability of someone else (or your future self) being able to reproduce +By _reproducibility_ here we mean the ability of someone else (or your future self) being able to reproduce what you did computationally at a particular time (be this in research, analysis or something else) as closely as possible even if they do not have access to exactly the same hardware resources that you had when you did the original work. Some examples of why containers are an attractive technology to help with reproducibility include: - - The same computational work can be run across multiple different technologies seamlessly (e.g. Windows, macOS, Linux). - - You can save the exact process that you used for your computational work (rather than relying on potentially incomplete notes). - - You can save the exact versions of software and their dependencies in the container image. - - You can access legacy versions of software and underlying dependencies which may not be generally available any more. - - Depending on their size, you can also potentially store a copy of key data within the container image. - - You can archive and share the container image as well as associating a persistent identifier with a container image to allow other researchers to reproduce and build on your work. +- The same computational work can be run across multiple different technologies seamlessly (e.g. Windows, macOS, Linux). +- You can save the exact process that you used for your computational work (rather than relying on potentially incomplete notes). 
+- You can save the exact versions of software and their dependencies in the container image. +- You can access legacy versions of software and underlying dependencies which may not be generally available any more. +- Depending on their size, you can also potentially store a copy of key data within the container image. +- You can archive and share the container image as well as associating a persistent identifier with a container image to allow other researchers to reproduce and build on your work. ## Sharing images @@ -46,30 +46,30 @@ As we have already seen, the Docker Hub provides a platform for sharing containe This is fine for working collaboratively with container images on a day-to-day basis but the Docker Hub is not a good option for long time archive of container images in support of research and publications as: - - free accounts have a limit on how long a container image will be hosted if it is not updated - - it does not support adding persistent identifiers to container images - - it is easy to overwrite tagged container images with newer versions by mistake. +- free accounts have a limit on how long a container image will be hosted if it is not updated +- it does not support adding persistent identifiers to container images +- it is easy to overwrite tagged container images with newer versions by mistake. ## Archiving and persistently identifying container images using Zenodo When you publish your work or make it publicly available in some way it is good practice to make container images that you used for computational work available in an immutable, persistent way and to have an identifier that allows people to cite and give you credit for the work you have done. [Zenodo](https://zenodo.org/) is one service that provides this functionality. -Zenodo supports the upload of *tar* archives and we can capture our Docker container images as tar archives using the `docker image save` command. 
For example, to export the container image we created earlier in this lesson: +Zenodo supports the upload of _tar_ archives and we can capture our Docker container images as tar archives using the `docker image save` command. For example, to export the container image we created earlier in this lesson: -~~~bash +```bash docker image save alice/alpine-python:v1 -o alpine-python.tar -~~~ +``` These tar container images can become quite large and Zenodo supports uploads up to 50GB so you may need to compress your archive to make it fit on Zenodo using a tool such as gzip (or zip): -~~~bash +```bash gzip alpine-python.tar -~~~ +``` Once you have your archive, you can [deposit it on Zenodo](https://zenodo.org/deposit/) and this will: - - Create a long-term archive snapshot of your Docker container image which people (including your future self) can download and reuse or reproduce your work. - - Create a persistent DOI (*Digital Object Identifier*) that you can cite in any publications or outputs to enable reproducibility and recognition of your work. +- Create a long-term archive snapshot of your Docker container image which people (including your future self) can download and reuse or reproduce your work. +- Create a persistent DOI (_Digital Object Identifier_) that you can cite in any publications or outputs to enable reproducibility and recognition of your work. In addition to the archive file itself, the deposit process will ask you to provide some basic metadata to classify the container image and the associated work. @@ -77,19 +77,19 @@ Note that Zenodo is not the only option for archiving and generating persistent ## Reproducibility good practice - - Make use of container images to capture the computational environment required for your work. - - Decide on the appropriate granularity for the container images you will use for your computational work -- this will be different for each project/area. 
Take note of accepted practice from contemporary work in the same area. What are the right building blocks for individual container images in your work?
 - Document what you have done and why -- this can be put in comments in the `Dockerfile` and the use of the container image described in associated documentation and/or publications. Make sure that references are made in both directions so that the container image and the documentation are appropriately linked.
 - When you publish work (in whatever way) use an archiving and DOI service such as Zenodo to make sure your container image is captured as it was used for the work and that is obtains a persistent DOI to allow it to be cited and referenced properly.
+- Make use of container images to capture the computational environment required for your work.
+- Decide on the appropriate granularity for the container images you will use for your computational work -- this will be different for each project/area. Take note of accepted practice from contemporary work in the same area. What are the right building blocks for individual container images in your work?
+- Document what you have done and why -- this can be put in comments in the `Dockerfile` and the use of the container image described in associated documentation and/or publications. Make sure that references are made in both directions so that the container image and the documentation are appropriately linked.
+- When you publish work (in whatever way) use an archiving and DOI service such as Zenodo to make sure your container image is captured as it was used for the work and that it obtains a persistent DOI to allow it to be cited and referenced properly.

## Container Granularity

As mentioned above, one of the decisions you may need to make when containerising your research workflows
-is what level of *granularity* you wish to employ. The two extremes of this decision could be characterized
+is what level of _granularity_ you wish to employ.
The two extremes of this decision could be characterized as: - - Create a single container image with all the tools you require for your research or analysis workflow - - Create many container images each running a single command (or step) of the workflow and use them together +- Create a single container image with all the tools you require for your research or analysis workflow +- Create many container images each running a single command (or step) of the workflow and use them together Of course, many real applications will sit somewhere between these two extremes. @@ -104,31 +104,35 @@ and write a few bullet points for advantages and disadvantages for each approach This is not an exhaustive list but some of the advantages and disadvantages could be: ### Single large container image + - Advantages: - + Simpler to document - + Full set of requirements packaged in one place - + Potentially easier to maintain (though could be opposite if working with large, distributed group) + - Simpler to document + - Full set of requirements packaged in one place + - Potentially easier to maintain (though could be opposite if working with large, distributed group) - Disadvantages: - + Could get very large in size, making it more difficult to distribute - + Could use [Docker multi-stage build](https://docs.docker.com/develop/develop-images/multistage-build) to reduce size - + May end up with same dependency issues within the container image from different software requirements - + Potentially more complex to test - + Less re-useable for different, but related, work + - Could get very large in size, making it more difficult to distribute + - Could use [Docker multi-stage build](https://docs.docker.com/develop/develop-images/multistage-build) to reduce size + - May end up with same dependency issues within the container image from different software requirements + - Potentially more complex to test + - Less re-useable for different, but related, work ### Multiple smaller container 
images + - Advantages: - + Individual components can be re-used for different, but related, work - + Individual parts are smaller in size making them easier to distribute - + Avoid dependency issues between different pieces of software - + Easier to test + - Individual components can be re-used for different, but related, work + - Individual parts are smaller in size making them easier to distribute + - Avoid dependency issues between different pieces of software + - Easier to test - Disadvantage: - + More difficult to document - + Potentially more difficult to maintain (though could be easier if working with large, distributed group) - + May end up with dependency issues between component container images if they get out of sync + - More difficult to document + - Potentially more difficult to maintain (though could be easier if working with large, distributed group) + - May end up with dependency issues between component container images if they get out of sync + ::: :::: -## Key points: +## Key points + - Container images allow us to encapsulate the computation (and data) we have used in our research. - Using a service such as Docker Hub allows us to easily share computational work we have done. -- Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility. \ No newline at end of file +- Using container images along with a DOI service such as Zenodo allows us to capture our work and enables reproducibility. 
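The archiving workflow described in this section (`docker image save`, then `gzip`) can be sketched end to end. So that the sketch runs even without a Docker daemon, it substitutes a small locally built tar file for the saved image; with Docker available you would instead start from the `docker image save alice/alpine-python:v1 -o alpine-python.tar` command shown above. The final checksum step is an optional extra (it assumes `sha256sum` from GNU coreutils) that lets anyone who downloads your deposit verify they received the exact bytes:

```bash
# Work in a scratch directory so nothing of value is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for: docker image save alice/alpine-python:v1 -o alpine-python.tar
echo "pretend image contents" > layer.txt
tar -cf alpine-python.tar layer.txt

# Compress the archive before uploading it to Zenodo.
gzip alpine-python.tar

# Record a checksum to include alongside the deposit, then verify it.
sha256sum alpine-python.tar.gz > alpine-python.tar.gz.sha256
sha256sum -c alpine-python.tar.gz.sha256
```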
diff --git a/technology_and_tooling/docker/running-containers.md b/technology_and_tooling/docker/running-containers.md index a502c27f..58693789 100644 --- a/technology_and_tooling/docker/running-containers.md +++ b/technology_and_tooling/docker/running-containers.md @@ -2,22 +2,22 @@ title: "Exploring and Running Containers" teaching: 20 exercises: 10 -dependsOn: [ - technology_and_tooling.docker.meet-docker -] +dependsOn: [technology_and_tooling.docker.meet-docker] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- :::callout + ## Reminder of terminology: container images and containers -Recall that a *container image* is the template from which particular instances of *containers* will be created. + +Recall that a _container image_ is the template from which particular instances of _containers_ will be created. ::: Let's explore our first Docker container. The Docker team provides a simple container @@ -28,31 +28,32 @@ image online called `hello-world`. We'll start with that one. The `docker image` command is used to interact with Docker container images. 
You can find out what container images you have on your computer by using the following command ("ls" is short for "list"): -~~~bash -$ docker image ls -~~~ +```bash +docker image ls +``` If you've just installed Docker, you won't see any container images listed. To get a copy of the `hello-world` Docker container image from the internet, run this command: -~~~bash -$ docker image pull hello-world -~~~ +```bash +docker image pull hello-world +``` You should see output like this: -~~~ +```text Using default tag: latest latest: Pulling from library/hello-world 1b930d010525: Pull complete Digest: sha256:f9dfddf63636d84ef479d645ab5885156ae030f611a56f3a7ac7f2fdd86d7e4e Status: Downloaded newer image for hello-world:latest docker.io/library/hello-world:latest -~~~ +``` :::callout + ## Docker Hub Where did the `hello-world` container image come from? It came from the Docker Hub @@ -70,9 +71,10 @@ Give it a try before checking the solution. To see if the `hello-world` container image is now on your computer, run: -~~~bash -$ docker image ls -~~~ +```bash +docker image ls +``` + ::: :::: @@ -86,11 +88,11 @@ computer. To create and run containers from named Docker container images you use the `docker container run` command. Try the following `docker container run` invocation. Note that it does not matter what your current working directory is. -~~~bash -$ docker container run hello-world -~~~ +```bash +docker container run hello-world +``` -~~~ +```text Hello from Docker! This message shows that your installation appears to be working correctly. @@ -111,18 +113,19 @@ Share images, automate workflows, and more with a free Docker ID: For more examples and ideas, visit: https://docs.docker.com/get-started/ -~~~ +``` What just happened? When we use the `docker container run` command, Docker does three things: -| 1. Starts a Running Container | 2. Performs Default Action | 3. Shuts Down the Container | -| --------------------|-----------------|----------------| +| 1. 
Starts a Running Container | 2. Performs Default Action | 3. Shuts Down the Container | +| --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- | | Starts a running container, based on the container image. Think of this as the "alive" or "inflated" version of the container -- it's actually doing something. | If the container has a default action set, it will perform that default action. This could be as simple as printing a message (as above) or running a whole analysis pipeline! | Once the default action is complete, the container stops running (or exits). The container image is still there, but nothing is actively running. | The `hello-world` container is set up to run an action by default -- namely to print this message. :::callout + ## Using `docker container run` to get the image We could have skipped the `docker image pull` step; if you use the `docker container run` @@ -146,9 +149,9 @@ two steps, or one. What are they? What happened when you ran the Alpine Docker container? -~~~bash -$ docker container run alpine -~~~ +```bash +docker container run alpine +``` If you have never used the `alpine` Docker container image on your computer, Docker probably printed a message that it couldn't find the container image and had to download it. @@ -156,9 +159,9 @@ If you used the `alpine` container image before, the command will probably show because this particular container is designed for you to provide commands yourself. 
Try running this instead: -~~~bash -$ docker container run alpine cat /etc/os-release -~~~ +```bash +docker container run alpine cat /etc/os-release +``` You should see the output of the `cat /etc/os-release` command, which prints out the version of Alpine Linux that this container is using and a few additional bits of information. @@ -173,9 +176,10 @@ Give it a try before checking the solution. Use the same command as above, but with the `echo` command to print a message. -~~~bash -$ docker container run alpine echo 'Hello World' -~~~ +```bash +docker container run alpine echo 'Hello World' +``` + ::: :::: @@ -192,12 +196,13 @@ to the `docker container run` command and provide a shell (`bash`,`sh`, etc.) as our command. The `alpine` Docker container image doesn't include `bash` so we need to use `sh`. -~~~bash -$ docker container run -it alpine sh -~~~ +```bash +docker container run -it alpine sh +``` :::callout -## Technically... + +## Technically Technically, the interactive flag is just `-i` -- the extra `-t` (combined as `-it` above) is the "pseudo-TTY" option, a fancy term that means a text interface. @@ -205,28 +210,27 @@ This allows you to connect to a shell, like `sh`, using a command line. Since yo want to have a command line when running interactively, it makes sense to use the two together. ::: - Your prompt should change significantly to look like this: -~~~bash +```bash / # -~~~ +``` That's because you're now inside the running container! Try these commands: -* `pwd` -* `ls` -* `whoami` -* `echo $PATH` -* `cat /etc/os-release` +- `pwd` +- `ls` +- `whoami` +- `echo $PATH` +- `cat /etc/os-release` All of these are being run from inside the running container, so you'll get information about the container itself, instead of your computer. To finish using the container, type `exit`. 
-~~~bash +```bash / # exit -~~~ +``` ::::challenge{id=practice-makes-perfect title="Practice Makes Perfect"} @@ -236,7 +240,6 @@ Can you find out the version of Ubuntu installed on the `ubuntu` container image Can you also find the `apt-get` program? What does it do? (Hint: try passing `--help` to almost any command will give you more information.) - :::solution ## Solution 1 -- Interactive @@ -244,61 +247,64 @@ to almost any command will give you more information.) Run an interactive busybox container -- you can use `docker image pull` first, or just run it with this command: -~~~bash -$ docker container run -it ubuntu sh -~~~ +```bash +docker container run -it ubuntu sh +``` OR you can get the bash shell instead -~~~bash -$ docker container run -it ubuntu bash -~~~ +```bash +docker container run -it ubuntu bash +``` Then try, running these commands -~~~bash +```bash /# cat /etc/os-release /# apt-get --help -~~~ +``` Exit when you're done. -~~~bash +```bash /# exit -~~~ +``` ## Solution 2 -- Run commands Run a ubuntu container, first with a command to read out the Linux version: -~~~bash -$ docker container run ubuntu cat /etc/os-release -~~~ +```bash +docker container run ubuntu cat /etc/os-release +``` Then run a container with a command to print out the apt-get help: -~~~bash -$ docker container run ubuntu apt-get --help -~~~ +```bash +docker container run ubuntu apt-get --help +``` + ::: :::: :::callout + ## Even More Options There are many more options, besides `-it` that can be used with the `docker container run` -command! A few of them will be covered in [later episodes](advanced-containers) +command! A few of them will be covered in [later episodes](advanced-containers) and we'll share two more common ones here: -* `--rm`: this option guarantees that any running container is completely -removed from your computer after the container is stopped. 
Without this option, -Docker actually keeps the "stopped" container around, which you'll see in a later -episode. Note that this option doesn't impact the *container images* that you've pulled, -just running instances of containers. +- `--rm`: this option guarantees that any running container is completely + removed from your computer after the container is stopped. Without this option, + Docker actually keeps the "stopped" container around, which you'll see in a later + episode. Note that this option doesn't impact the _container images_ that you've pulled, + just running instances of containers. + +- `--name=`: By default, Docker assigns a random name and ID number to each container + instance that you run on your computer. If you want to be able to more easily refer + to a specific running container, you can assign it a name using this option. -* `--name=`: By default, Docker assigns a random name and ID number to each container -instance that you run on your computer. If you want to be able to more easily refer -to a specific running container, you can assign it a name using this option. ::: ## Conclusion @@ -307,8 +313,9 @@ So far, we've seen how to download Docker container images, use them to run comm running containers, and even how to explore a running container from the inside. Next, we'll take a closer look at all the different kinds of Docker container images that are out there. -## Key Points: +## Key Points + - The `docker image pull` command downloads Docker container images from the internet. - The `docker image ls` command lists Docker container images that are (now) on your computer. - The `docker container run` command creates running containers from container images and can run commands inside them. -- When using the docker container run command, a container can run a default action (if it has one), a user specified action, or a shell to be used interactively. 
\ No newline at end of file +- When using the docker container run command, a container can run a default action (if it has one), a user specified action, or a shell to be used interactively. diff --git a/technology_and_tooling/docker/setup.md b/technology_and_tooling/docker/setup.md index 6126db45..9ade710e 100644 --- a/technology_and_tooling/docker/setup.md +++ b/technology_and_tooling/docker/setup.md @@ -1,41 +1,41 @@ --- name: Setup -dependsOn: [ -] +dependsOn: [] tags: [docker] -attribution: - - citation: > - D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". - Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. - url: https://github.com/carpentries-incubator/docker-introduction - image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - +attribution: + - citation: > + D. M. Eyers, S. L. R. Stevens, A. Turner, C. Koch and J. Cohen. "Reproducible computational environments using containers: Introduction to Docker". + Version 2020.09a (4a93bd67aa), September 2020. Carpentries Incubator. + url: https://github.com/carpentries-incubator/docker-introduction + image: https://carpentries-incubator.github.io/docker-introduction/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 --- + ### Website accounts to create + Please seek help at the start of the lesson if you have not been able to establish a website account on: + - The [Docker Hub](http://hub.docker.com). We will use the Docker Hub to download pre-built container images, and for you to upload and download container images that you create, as explained in the relevant lesson episodes. ### Files to download Download the [`docker-intro.zip`](files/docker-intro.zip) file. -Move the downloaded file to your Desktop and unzip it. It should unzip to a folder called `docker-intro`. 
+Move the downloaded file to your Desktop and unzip it. It should unzip to a folder called `docker-intro`. ### Software to install Docker's installation experience has steadily improved, however situations will arise in which installing Docker on your computer may not be straightforward -unless you have a large amount of technical experience. Workshops try to have +unless you have a large amount of technical experience. Workshops try to have helpers on hand that have worked their way through the install process, but do be prepared for some troubleshooting. In most cases, you will need to have administrator rights on the computer in order to install the Docker software. If you are using a computer managed by -your organisation and do not have administrator rights, you *may* be able to get +your organisation and do not have administrator rights, you _may_ be able to get your organisation's IT staff to install Docker for you. Alternatively your IT -support staff *may* be able to give you remote access to a server that can run +support staff _may_ be able to give you remote access to a server that can run Docker commands. Please try to install the appropriate software from the list below depending on @@ -67,8 +67,10 @@ final release of Docker Toolbox includes an old version of Docker and you are strongly advised not to attempt to use this for any production use. It will, however, enable you to follow along with the lesson material._ -:::callout -## Warning: Git Bash +:::callout{variant="warning"} + +## Git Bash + If you are using Git Bash as your terminal on Windows then you should be aware that you may run into issues running some of the commands in this lesson as Git Bash will automatically re-write any paths you specify at the command line into Windows versions of the paths and this will confuse @@ -90,7 +92,7 @@ docker run alpine cat //etc/os-release This should suppress the path translation functionality in Git Bash. 
::: -#### Apple macOS +### Apple macOS Ideally, you will be able to install the Docker Desktop software, following the [Docker website's documentation](https://docs.docker.com/docker-for-mac/install/). @@ -105,7 +107,7 @@ The MacPorts Docker port should support older, as well as the most recent, opera versions (see the [port details](https://ports.macports.org/port/docker/details/)), but note that we have not recently tested the Docker installation process via MacPorts. -#### Linux +### Linux There are too many varieties of Linux to give precise instructions here, but hopefully you can locate documentation for getting Docker installed on your @@ -114,20 +116,20 @@ on your system, the [Install Docker Engine](https://docs.docker.com/engine/insta supported Linux distributions and pointers to relevant installation information. Alternatively, see: - - [Docker Engine on CentOS](https://docs.docker.com/install/linux/docker-ce/centos/) - - [Docker Engine on Debian](https://docs.docker.com/install/linux/docker-ce/debian/) - - [Docker Engine on Fedora](https://docs.docker.com/install/linux/docker-ce/fedora/) - - [Docker Engine on Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/) +- [Docker Engine on CentOS](https://docs.docker.com/install/linux/docker-ce/centos/) +- [Docker Engine on Debian](https://docs.docker.com/install/linux/docker-ce/debian/) +- [Docker Engine on Fedora](https://docs.docker.com/install/linux/docker-ce/fedora/) +- [Docker Engine on Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/) ### Verify Installation To quickly check if the Docker and client and server are working run the following command in a new terminal or ssh session: -~~~bash -$ docker version -~~~ +```bash +docker version +``` -~~~ +```text Client: Version: 20.10.2 API version: 1.41 @@ -149,17 +151,17 @@ Server: Experimental: false containerd: Version: 1.4.4-0ubuntu1 - GitCommit: + GitCommit: runc: Version: 1.0.0~rc95-0ubuntu1~21.04.1 - GitCommit: + GitCommit: 
docker-init: Version: 0.19.0 - GitCommit: -~~~ + GitCommit: +``` The above output shows a successful installation and will vary based on your -system. The important part is that the "Client" and the "Server" parts are both +system. The important part is that the "Client" and the "Server" parts are both working and returns information. It is beyond the scope of this document to debug installation problems but common errors include the user not belonging to -the `docker` group and forgetting to start a new terminal or ssh session. \ No newline at end of file +the `docker` group and forgetting to start a new terminal or ssh session. diff --git a/technology_and_tooling/ide/cpp.md b/technology_and_tooling/ide/cpp.md index d8e8cfd1..acfd2e36 100644 --- a/technology_and_tooling/ide/cpp.md +++ b/technology_and_tooling/ide/cpp.md @@ -1,114 +1,109 @@ --- name: VSCode -dependsOn: [ -] +dependsOn: [] tags: [cpp] attribution: - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 url: https://www.universe-hpc.ac.uk image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png license: CC-BY-4.0 - - --- ## Introduction to VSCode -Microsoft's VSCode is a lightweight IDE which is great when starting out developing -programs. It not only supports Python, but also C++, C#, JavaScript, CSS, and Java, -amongst others. It's also available for Mac OS, Linux, and Windows. Whilst lightweight, -it's features can be readily extended for a variety of languages via installation of -plugins to suit your needs, and you can even develop your own plugins for VSCode. 
As
-well as features like live debugging and context-sensitive code autocompletion, other
+Microsoft's VSCode is a lightweight IDE which is great when starting out developing
+programs. It not only supports Python, but also C++, C#, JavaScript, CSS, and Java,
+amongst others. It's also available for macOS, Linux, and Windows. Whilst lightweight,
+its features can be readily extended for a variety of languages via installation of
+plugins to suit your needs, and you can even develop your own plugins for VSCode. As
+well as features like live debugging and context-sensitive code autocompletion, other
notable features include:

-- Revision/version control support: ability to work with Git source code repositories,
-  uploading and synchronising changes to/from such repositories on e.g. GitHub. We'll be
+- Revision/version control support: ability to work with Git source code repositories,
+  uploading and synchronising changes to/from such repositories on e.g. GitHub. We'll be
  covering Git version control later in the course
-- Live code development sharing: without exchanging changes using version control, you
-  can view live changes being made by another team member collaboratively within your
+- Live code development sharing: without exchanging changes using version control, you
+  can view live changes being made by another team member collaboratively within your
  own VSCode editor

-
## Running VSCode for the First Time

-If you haven't run VSCode yet, do this now. The first thing we need to do is open a
-folder to work in. Select `Open Folder` from the bar on the left (or from the `File`
-drop down menu at the top of VSCode), and a dialogue window will appear. Choose a new
+If you haven't run VSCode yet, do this now. The first thing we need to do is open a
+folder to work in. Select `Open Folder` from the bar on the left (or from the `File`
+drop-down menu at the top of VSCode), and a dialogue window will appear. Choose a new
(empty) folder to work in.
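The opened folder becomes the working directory for anything launched from VSCode, so relative paths in your scripts resolve against it. A minimal sketch (not part of the original course material; the filename `data.txt` is purely hypothetical) that shows this from Python:

```python
from pathlib import Path

# The folder opened with `Open Folder` is the working directory for
# scripts run from VSCode, so Path.cwd() reports that folder.
cwd = Path.cwd()
print(cwd)

# A relative filename (hypothetical "data.txt") would be looked up here:
print(cwd / "data.txt")
```

Running the same script from a terminal prints whatever directory the terminal happens to be in, which is why opening the right folder matters.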
:::callout + ## Trusting Code -You may be asked whether you trust the authors of the files in this folder. Select the -checkbox, and click 'Yes, I trust the authors' (although in general use some caution is +You may be asked whether you trust the authors of the files in this folder. Select the +checkbox, and click 'Yes, I trust the authors' (although in general use some caution is recommended!) ::: -This directory is now the current working directory for VSCode, so when we run scripts +This directory is now the current working directory for VSCode, so when we run scripts from VSCode, this will be the working directory they'll run from. -If you'd like to explore VSCode in more depth than this course offers, see the [VSCode +If you'd like to explore VSCode in more depth than this course offers, see the [VSCode documentation](https://code.visualstudio.com/docs). ## Adding C++ VSCode extensions -VSCode has a few C++ specific extensions to help us write and compile our C++ code. -Select the "Extensions" view icon on the Activity bar (far LHS of your screen, the -vertical list of icons), and search for "C++". Make sure that the "C/C++" and "C/C++ -Extension Pack" (by Microsoft) are installed and enabled before you proceed. There are a -number of other C++ extensions also available, feel free to look through the list and +VSCode has a few C++ specific extensions to help us write and compile our C++ code. +Select the "Extensions" view icon on the Activity bar (far LHS of your screen, the +vertical list of icons), and search for "C++". Make sure that the "C/C++" and "C/C++ +Extension Pack" (by Microsoft) are installed and enabled before you proceed. There are a +number of other C++ extensions also available, feel free to look through the list and try out any that you might find useful. -## Ensuring you have a compiler installed - +## Ensuring you have a compiler installed -C++ source code is converted into executables that you can run using a *compiler*. 
There -are a number of different C++ compilers available, and two of the most common are -[*Clang*](https://clang.llvm.org/), based on the [LLVM](https://llvm.org/) framework or -[*GCC*](https://gcc.gnu.org/), the GNU Compiler Collection. +C++ source code is converted into executables that you can run using a _compiler_. There +are a number of different C++ compilers available, and two of the most common are +[_Clang_](https://clang.llvm.org/), based on the [LLVM](https://llvm.org/) framework or +[_GCC_](https://gcc.gnu.org/), the GNU Compiler Collection. Ensure that you have either `clang` or `g++` (part of GCC) compiler installed using: -~~~bash +```bash clang++ --version -~~~ +``` -or +or -~~~bash +```bash g++ --version -~~~ +``` You should see something like: -~~~ +```text Homebrew clang version 15.0.3 Target: x86_64-apple-darwin22.1.0 Thread model: posix InstalledDir: /usr/local/opt/llvm/bin -~~~ +``` Check where the compiler executable is located on your machine -~~~bash +```bash which clang++ which g++ -~~~ +``` You should see something like: -~~~ +```text /usr/local/opt/llvm/bin/clang++ -~~~ +``` Make a note of the location of the compiler that you wish to use. - ## Compiling and Running a C++ executable -Let's create our first C++ program in VSCode. Select `File` -> `New Text File` from the -menu. A new file will appear. Copy or type in the following contents and save the file +Let's create our first C++ program in VSCode. Select `File` -> `New Text File` from the +menu. A new file will appear. Copy or type in the following contents and save the file as `hello.cpp`. ```cpp @@ -127,7 +122,6 @@ configuration used to build and run the currently active file The file should compile successfully and output the text "hello world" in the debug console. 
-

Open the `vscode/tasks.json` file created earlier, it should look something like this:

```json
@@ -137,19 +131,11 @@ Open the `vscode/tasks.json` file created earlier, it should look something like
      "type": "cppbuild",
      "label": "C/C++: g++ build active file",
      "command": "/usr/bin/g++",
-      "args": [
-        "-fdiagnostics-color=always",
-        "-g",
-        "${file}",
-        "-o",
-        "${fileDirname}/${fileBasenameNoExtension}"
-      ],
+      "args": ["-fdiagnostics-color=always", "-g", "${file}", "-o", "${fileDirname}/${fileBasenameNoExtension}"],
      "options": {
        "cwd": "${fileDirname}"
      },
-      "problemMatcher": [
-        "$gcc"
-      ],
+      "problemMatcher": ["$gcc"],
      "group": {
        "kind": "build",
        "isDefault": true
@@ -161,31 +147,31 @@ Open the `vscode/tasks.json` file created earlier, it should look something like
    }
```

-In "command" and "args, this file describes the command that VSCode runs to compile your
-code. In this case `${file}` is the name of our source file `hello.cpp`,
-`${fileDirname}` is the name of the directory that `hello.cpp` is contained in (this is
-the root directory of our project), and `${fileBasenameNoextension}` is simply `hello`.
+In "command" and "args", this file describes the command that VSCode runs to compile your
+code. In this case `${file}` is the name of our source file `hello.cpp`,
+`${fileDirname}` is the name of the directory that `hello.cpp` is contained in (this is
+the root directory of our project), and `${fileBasenameNoExtension}` is simply `hello`.
So the command that is run for the `tasks.json` above would be:

```bash
/usr/bin/g++ -fdiagnostics-color=always -g hello.cpp -o hello
```

-This command uses the `g++` compiler to compile `hello.cpp` and writes the output
-exectable program to the file `hello`. You can now run this executable via the
-command-line and see the "Hello, world!" text written to the screen. This is just
+This command uses the `g++` compiler to compile `hello.cpp` and writes the output
+executable program to the file `hello`. You can now run this executable via the
+command-line and see the "Hello, world!" text written to the screen. This is just
manually doing what VSCode does automatically via the `tasks.json` file.

-Note that this command also adds debug symbols to the executable using the `-g` flags.
-This allows us to use a debugger (such as `gdb` or `lldb`) to step through our code on
-execution.
+Note that this command also adds debug symbols to the executable using the `-g` flag.
+This allows us to use a debugger (such as `gdb` or `lldb`) to step through our code on
+execution.

## C++ standards

-There are a number of different C++ standards with names based on the year in which they
-were released: C++98, C++11, C++14, C++17, C++20, and C++22. Depending on your compiler
-and its version, your compiler will by default use one of these. Lets make sure we are
-using a certain standard (in this case we will choose C++20). We can tell either `g++`
+There are a number of different C++ standards with names based on the year in which they
+were released: C++98, C++11, C++14, C++17, C++20, and C++23. Depending on your compiler
+and its version, your compiler will by default use one of these. Let's make sure we are
+using a certain standard (in this case we will choose C++20).
We can tell either `g++` or `clang++` to use the C++20 standard using the `-std` flag: ```bash @@ -212,9 +198,7 @@ And we can add this to the VSCode `tasks.json` file like so: "options": { "cwd": "${fileDirname}" }, - "problemMatcher": [ - "$gcc" - ], + "problemMatcher": ["$gcc"], "group": { "kind": "build", "isDefault": true diff --git a/technology_and_tooling/ide/index.md b/technology_and_tooling/ide/index.md index 0966ef3d..e50eff6e 100644 --- a/technology_and_tooling/ide/index.md +++ b/technology_and_tooling/ide/index.md @@ -1,29 +1,22 @@ --- name: IDEs id: ide -dependsOn: [ - technology_and_tooling.bash_shell -] -files: [ - python.md, - cpp.md, -] +dependsOn: [technology_and_tooling.bash_shell] +files: [python.md, cpp.md] summary: | - Integrated Development Environments (IDEs)provide programmers with a complete development environment to - write, edit, debug, and deploy their code. This course introduces the popular VSCode IDE, both for Python - and C++ development. + Integrated Development Environments (IDEs)provide programmers with a complete development environment to + write, edit, debug, and deploy their code. This course introduces the popular VSCode IDE, both for Python + and C++ development. attribution: - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 url: https://www.universe-hpc.ac.uk image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png license: CC-BY-4.0 - - --- Integrated Development Environments, commonly known as IDEs, are software applications that provide programmers with a complete development environment to -write, edit, debug, and deploy their code. IDEs are designed +write, edit, debug, and deploy their code. 
IDEs are designed to enhance productivity and simplify the development process by providing a unified and streamlined interface for all stages of the development cycle. @@ -36,4 +29,4 @@ provided via third-party plugins. This course provides an introduction to the popular [VSCode](https://code.visualstudio.com) IDE, both for Python and C++ development. -## Installing VSCode \ No newline at end of file +## Installing VSCode diff --git a/technology_and_tooling/ide/python.md b/technology_and_tooling/ide/python.md index bda56d90..d6bb90be 100644 --- a/technology_and_tooling/ide/python.md +++ b/technology_and_tooling/ide/python.md @@ -1,37 +1,34 @@ --- name: VSCode -dependsOn: [ -] +dependsOn: [] tags: [python] attribution: - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 url: https://www.universe-hpc.ac.uk image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png license: CC-BY-4.0 - - --- ## Introduction to VSCode -Microsoft's VSCode is a lightweight IDE which is great when starting out developing -programs. It not only supports Python, but also C++, C#, JavaScript, CSS, and Java, -amongst others. It's also available for macOS, Linux, and Windows. Whilst lightweight, -it's features can be readily extended for a variety of languages via installation of -plugins to suit your needs, and you can even develop your own plugins for VSCode. As -well as features like live debugging and context-sensitive code autocompletion, other +Microsoft's VSCode is a lightweight IDE which is great when starting out developing +programs. It not only supports Python, but also C++, C#, JavaScript, CSS, and Java, +amongst others. It's also available for macOS, Linux, and Windows. 
Whilst lightweight,
+its features can be readily extended for a variety of languages via installation of
+plugins to suit your needs, and you can even develop your own plugins for VSCode. As
+well as features like live debugging and context-sensitive code autocompletion, other
notable features include:

-- Revision/version control support: ability to work with Git source code repositories,
-  uploading and synchronising changes to/from such repositories on e.g. GitHub. We'll be
+- Revision/version control support: ability to work with Git source code repositories,
+  uploading and synchronising changes to/from such repositories on e.g. GitHub. We'll be
  covering Git version control later in the course
- Remote development: enabling users to work with source code and projects that are
  stored on remote machines or in virtual environments. This functionality provides
  greater flexibility for developers who need to work with code that is not stored
  locally on their machine.
-- Live code development sharing: without exchanging changes using version control, you
-  can view live changes being made by another team member collaboratively within your
+- Live code development sharing: without exchanging changes using version control, you
+  can view live changes being made by another team member collaboratively within your
  own VSCode editor

## Installation instructions for VSCode

@@ -46,7 +43,6 @@ To install Visual Studio Code, follow these steps:

For more detailed installation instructions, see the [official
documentation](https://code.visualstudio.com/docs/setup/setup-overview).

-
## Running VSCode for the First Time

A current project in VSCode is defined by a folder, either on your local
@@ -55,176 +51,176 @@ time it will prompt you to select a folder. Create a new, empty folder on your
computer and open this folder in VSCode.

:::callout
+
### Trusting Code

-You may be asked whether you trust the authors of the files in this folder.
Select the
-checkbox, and click 'Yes, I trust the authors' (although in general use some caution is
+checkbox, and click 'Yes, I trust the authors' (although in general use some caution is
recommended!)
:::

-This directory is now the current working directory for VSCode, so when we run scripts
+This directory is now the current working directory for VSCode, so when we run scripts
from VSCode, this will be the working directory they'll run from.

-If you'd like to explore VSCode in more depth than this course offers, see the [VSCode
+If you'd like to explore VSCode in more depth than this course offers, see the [VSCode
documentation](https://code.visualstudio.com/docs).

-
## Our First Python Standalone Script

Let's create our first Python script in VSCode and save it:

- Select `File` -> `New Text File` from the menu. A new file will appear.
-- On line 1, you'll see a message about selecting a language. Click on `Select a
-  language`, and type in `python`, and select `Python (python) Built-In`
-- Select `File` > `Save As...`. You'll find yourself in your root project directory.
+- On line 1, you'll see a message about selecting a language. Click on `Select a
+  language`, and type in `python`, and select `Python (python) Built-In`
+- Select `File` > `Save As...`. You'll find yourself in your root project directory.
  Enter the filename `hello_world.py` at the top and select `Save`.

Let's start with a classic 'Hello world' script. Enter this into the editor:

-~~~python
+```python
print('Hello world!')
-~~~
-
-VSCode comes with Python support built-in. You'll notice that as you type, the editor is
-suggesting possible statements, functions (and also variables and other Python
-artifacts) that match what you've typed so far.
When you write a function it fully -recognises and understands, it will also pop-up some context-sensitive help about the -function itself, including any documentation associated with it and a breakdown of the -function's arguments. This is very helpful when dealing with libraries with many modules -and functions. Also, for convenience, if you've only half-typed a function, variable, -statement, etc. that it recognises as the only option, you can press `Tab` and it will +``` + +VSCode comes with Python support built-in. You'll notice that as you type, the editor is +suggesting possible statements, functions (and also variables and other Python +artifacts) that match what you've typed so far. When you write a function it fully +recognises and understands, it will also pop-up some context-sensitive help about the +function itself, including any documentation associated with it and a breakdown of the +function's arguments. This is very helpful when dealing with libraries with many modules +and functions. Also, for convenience, if you've only half-typed a function, variable, +statement, etc. that it recognises as the only option, you can press `Tab` and it will autocomplete the rest of the typing for you. -You may find you see a `Python - Get Started` window tab pop up that gives you some -things to do next in Python. But for now, we'll keep editing our file in the +You may find you see a `Python - Get Started` window tab pop up that gives you some +things to do next in Python. But for now, we'll keep editing our file in the `hello_world.py` tab, so select that. Once you've finished, select `File` -> `Save`. -Now let's try running our script from within VSCode. Select the `Run` icon on the far -left navigation bar (it looks like an arrow pointing right with a bug in it), then -select `Run and Debug`. It will ask you to `Select a debug configuration`, so select -`Python File`. 
It will now run our script, and you should see a terminal window pop-up +Now let's try running our script from within VSCode. Select the `Run` icon on the far +left navigation bar (it looks like an arrow pointing right with a bug in it), then +select `Run and Debug`. It will ask you to `Select a debug configuration`, so select +`Python File`. It will now run our script, and you should see a terminal window pop-up at the bottom, with something like the following text in it: -~~~bash +```bash user@mycomputer:~/my/project/dir$ /usr/bin/env /usr/bin/python3 /home/user/.vscode/extensions/ms-python.python-2022.14.0/pythonFiles/lib/python/debugpy/launcher 38613 -- /home/user/my/project/dir/hello_world.py Hello world! -~~~ +``` -Here, we can see that the interpreter `/usr/bin/python3` has been used to run the VSCode +Here, we can see that the interpreter `/usr/bin/python3` has been used to run the VSCode debugger on our `hello_world.py` script, which produces the shown 'Hello world!' output. - ## Setting up a Virtual Environment -Before we start using VSCode beyond a 'Hello world' example, we should set up a new -*virtual environment* for running our Python scripts. We are currently using the -*global* installation of Python 3, and this is not considered good development practice. +Before we start using VSCode beyond a 'Hello world' example, we should set up a new +_virtual environment_ for running our Python scripts. We are currently using the +_global_ installation of Python 3, and this is not considered good development practice. :::callout + ## Why use a Virtual Environment, and what are they? -Consider developing a number of different Python scripts that each have their own -package dependencies (and versions of those dependencies) on the same machine. It could -quickly become confusing as to which packages and package versions are required by each -script, making it difficult for others to run your script themselves (or yourself on -another machine!). 
Additionally, different scripts may need to use different versions of +Consider developing a number of different Python scripts that each have their own +package dependencies (and versions of those dependencies) on the same machine. It could +quickly become confusing as to which packages and package versions are required by each +script, making it difficult for others to run your script themselves (or yourself on +another machine!). Additionally, different scripts may need to use different versions of a given package. -A virtual environment is a self-contained directory tree that houses a specific Python -interpreter and specific versions of a number of Python packages, so as package -dependencies are added to a script (or set of scripts), you can add them to this -specific virtual environment. So, you can avoid a great deal of confusion by having +A virtual environment is a self-contained directory tree that houses a specific Python +interpreter and specific versions of a number of Python packages, so as package +dependencies are added to a script (or set of scripts), you can add them to this +specific virtual environment. So, you can avoid a great deal of confusion by having separate virtual environments for each script. ::: -Go back to the terminal window, and exit the Python interpreter (either by typing +Go back to the terminal window, and exit the Python interpreter (either by typing `exit()` or pressing `Ctrl` and `D` at the same time). In the Bash shell, type the following (whilst in the root project directory): -~~~bash +```bash python3 -m venv venv -~~~ +``` -This instructs Python to construct a new Python virtual environment for us. Within our -`code` directory now, you should see a new `venv` directory. This will contain a -localised copy of the Python3 interpreter, and any associated libraries we wish to -install. 
But this local environment is particular to our current work; if we were to +This instructs Python to construct a new Python virtual environment for us. Within our +`code` directory now, you should see a new `venv` directory. This will contain a +localised copy of the Python3 interpreter, and any associated libraries we wish to +install. But this local environment is particular to our current work; if we were to start a new project, we'd create another virtual environment for that one, and so on. -The first `venv` is the name of the tool we need to use to create a virtual environment, -while the second `venv` is the name of the directory that the virtual environment will -be put in. Most people use either `venv` or `env` as the name for their virtual +The first `venv` is the name of the tool we need to use to create a virtual environment, +while the second `venv` is the name of the directory that the virtual environment will +be put in. Most people use either `venv` or `env` as the name for their virtual environment. We can activate this virtual environment, and see what it contains, by doing: -~~~bash +```bash source venv/bin/activate pip3 list -~~~ +``` -`source` runs a script that activates our virtual environment. `pip` is the de-facto -Python package installer; in this case we're using the version for Python 3 specifically -and asking it to list the packages that are currently resident in the virtual +`source` runs a script that activates our virtual environment. 
`pip` is the de-facto
+Python package installer; in this case we're using the version for Python 3 specifically
+and asking it to list the packages that are currently resident in the virtual
environment:

-~~~
+```text
Package       Version
------------- -------
pip           22.0.2
setuptools    59.6.0
-~~~
+```

-In addition to Python which is also installed, as we can see, we don't have any other
-packages installed yet, aside from `pip` itself, and `setuptools` (which contains
+As we can see, in addition to Python itself we don't have any other
+packages installed yet, aside from `pip` and `setuptools` (which contains
functionality for building and distributing Python packages).

-Note that this virtual environment is only active within our current terminal. If we
-start another terminal and want to use this virtual environment, we'd have to activate
-it there as well. Also, if we were to close the terminal, the activation of this
-environment (not the environment itself) will be forgotten. When we want to use this
-virtual environment we have to remember to start it using the `source venv/bin/activate`
-command above from within `se-day1/code` directory each time we open a new terminal.
-Otherwise, by default, we will the using the global Python interpreter and not the
+Note that this virtual environment is only active within our current terminal. If we
+start another terminal and want to use this virtual environment, we'd have to activate
+it there as well. Also, if we were to close the terminal, the activation of this
+environment (not the environment itself) will be forgotten. When we want to use this
+virtual environment we have to remember to start it using the `source venv/bin/activate`
+command above from within the `se-day1/code` directory each time we open a new terminal.
+Otherwise, by default, we will be using the global Python interpreter and not the
specific environment we have created.
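Since activation is easy to forget, it can also be checked from inside Python itself. A minimal sketch (not part of the original course text) using only the standard library; it relies on the fact that a `venv` interpreter points `sys.prefix` at the environment directory while `sys.base_prefix` still points at the base installation:

```python
import sys

def in_virtualenv() -> bool:
    # In an active venv, sys.prefix is the environment directory;
    # sys.base_prefix remains the global Python installation.
    return sys.prefix != sys.base_prefix

print(sys.prefix)
print(in_virtualenv())
```

This prints `True` only when the interpreter running the script came from a virtual environment, which makes it a handy one-liner for debugging "wrong interpreter" problems.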
-If we wanted to deactivate our virtual environment, and return to the globally available
-set of Python packages, we'd use `deactivate` on the command line (although don't do
+If we wanted to deactivate our virtual environment, and return to the globally available
+set of Python packages, we'd use `deactivate` on the command line (although don't do
this now!).

-Other languages make use of virtual environments, such as Ruby, JavaScript, and Go -
-it's a great way to keep your environments separate and avoid confusion over which
+Other languages make use of virtual environments, such as Ruby, JavaScript, and Go -
+it's a great way to keep your environments separate and avoid confusion over which
dependencies belong with which project.

### Running the Script from the Command Line

-You'll remember that we were originally running the Python interpreter directly from the
+You'll remember that we were originally running the Python interpreter directly from the
command line earlier. From within the same terminal, type:

-~~~bash
+```bash
which python3
-~~~
+```

And you should see something like:

-~~~
+```text
/home/user/my/project/dir/venv/bin/python3
-~~~
+```

Which confirms that we are using the Python 3 interpreter from within our virtual
environment at `/home/user/my/project/dir/venv`.

Now let's run our new script using our virtual environment from the command line:

-~~~bash
+```bash
python3 hello_world.py
-~~~
+```

-~~~
+```text
Hello world!
-~~~ +``` -So here, we're doing a very similar thing to what VSCode was doing when running our -script: we give the command line the Python interpreter to run (which will use the one -in the environment we created in this case) and our script, which resides in the local +So here, we're doing a very similar thing to what VSCode was doing when running our +script: we give the command line the Python interpreter to run (which will use the one +in the environment we created in this case) and our script, which resides in the local directory. diff --git a/technology_and_tooling/packaging_dependency_management/cmake.md b/technology_and_tooling/packaging_dependency_management/cmake.md index 11ca3ce3..4238cf6c 100644 --- a/technology_and_tooling/packaging_dependency_management/cmake.md +++ b/technology_and_tooling/packaging_dependency_management/cmake.md @@ -1,33 +1,31 @@ --- name: Introduction to CMake -dependsOn: [ - technology_and_tooling.ide.cpp -] +dependsOn: [technology_and_tooling.ide.cpp] tags: [cpp] --- :::callout -This has been edited from the [introductory course in +This has been edited from the [introductory course in CMake](https://github.com/OxfordRSE/IntroCMakeCourse) from Oxford RSE. ::: # Getting started -Clone the material repository and change your current working directory to the project +Clone the material repository and change your current working directory to the project root: -~~~bash +```bash git clone https://github.com/OxfordRSE/IntroCMakeCourse cd IntroCMakeCourse -~~~ +``` # Problem Statement You want your C++ code to compile on other computers, not just your laptop. - - group workstation - - HPC compile node - - collaborator laptops +- group workstation +- HPC compile node +- collaborator laptops Everyone should end up with a program that behaves the same way, wherever they build. @@ -44,7 +42,7 @@ CMake works on Linux, Windows, macOS and more. Checkpoint 0 is a simple "hello, world" program written in C++. Let's use CMake to build it. 
```bash -$ cd checkpoint_0 +cd checkpoint_0 ``` # `CMakeLists.txt` @@ -110,7 +108,7 @@ Checkpoint 0 Hello, World! ``` -# Breakout time +## Breakout time Verify that we can all configure, compile and run the executable in Checkpoint 0. @@ -126,8 +124,6 @@ build$ ninja [2/2] Linking CXX executable main_executable ``` -# Choosing a generator - You can build uniformly, regardless of the generator: ```bash @@ -153,8 +149,6 @@ CMAKE_CXX_COMPILER= /usr/local/bin/g++-10 [...] ``` -# Setting configuration - You can switch between Debug, Release, RelWithDebInfo and MinSizeRel, by default: ```bash @@ -171,13 +165,12 @@ CMAKE_CXX_FLAGS_RELEASE -O3 -DNDEBUG CMAKE_CXX_FLAGS_RELWITHDEBINFO -O2 -g -DNDEBUG ``` -# Breakout time +## Breakout time Try using the Ninja generator, compiling in Release mode, and using another compiler if you have one installed. Remember that you might have to clean your build directory when, e.g., changing generator. - # Adding subdirectories ```bash @@ -238,7 +231,7 @@ Our project has grown! In addition to the code in `main.cpp`, some new functiona This code is now contained in a specific directory `src/`, inside the project directory. -# Breakout time +## Breakout time Look through the files in Checkpoint 1. @@ -248,11 +241,10 @@ Add a new pair of hpp/cpp files that defines a new function. - Add the files to `src/CMakeLists.txt` - Configure, compile and run: check that your new function has been executed - # Target properties CMake allows for a very fine-grained control of target builds, through -*properties*. +_properties_. For example, the property `INCLUDE_DIRECTORIES`{.cmake} specifies the list of directories to be specified with the compiler switch `-I`{.cmake} (or `/I`{.cmake}). @@ -267,8 +259,7 @@ target_include_directories(main_executable ) ``` -*Properties are different from variables!* - +_Properties are different from variables!_ # Creating a library @@ -295,9 +286,9 @@ implementation. 
Programs using `another_target`{.cmake} don't need to know about

Picture another dependency scenario:

-- `another_target`{.cmake} uses `my_lib`{.cmake} in its internal implementation.
-- **and** `another_target`{.cmake} defines some function that take parameters of a type defined
-  in `my_lib`{.cmake}.
+- `another_target`{.cmake} uses `my_lib`{.cmake} in its internal implementation.
+- **and** `another_target`{.cmake} defines some functions that take parameters of a type defined
+  in `my_lib`{.cmake}.

Programs using `another_target`{.cmake} also must link against `my_lib`{.cmake}:

@@ -321,24 +312,28 @@ target_link_libraries(another_target INTERFACE my_lib)

Target properties are paired with another property `INTERFACE_<PROPERTY>`{.cmake}.

For instance

-    INTERFACE_INCLUDE_DIRECTORIES
+```cmake
+INTERFACE_INCLUDE_DIRECTORIES
+```

These properties are inherited by depending targets (such as executables and
other libraries).

Example:

-    target_include_directories(my_lib INTERFACE ${CMAKE_CURRENT_SOURCE_DIR})
+```cmake
+target_include_directories(my_lib INTERFACE ${CMAKE_CURRENT_SOURCE_DIR})
+```

-- `PRIVATE`{.cmake}: sets `INCLUDE_DIRECTORIES`{.cmake}.
-- `INTERFACE`{.cmake}: sets `INTERFACE_INCLUDE_DIRECTORIES`{.cmake}.
-- `PUBLIC`{.cmake}: sets both.
+- `PRIVATE`{.cmake}: sets `INCLUDE_DIRECTORIES`{.cmake}.
+- `INTERFACE`{.cmake}: sets `INTERFACE_INCLUDE_DIRECTORIES`{.cmake}.
+- `PUBLIC`{.cmake}: sets both.

-# Breakout time
+## Breakout time

Let's separate the functionality from the executable itself:

-```bash
+```text
CMakeLists.txt
src/
  <library>
@@ -360,7 +355,8 @@ Tasks:

set(name "Jane Doe")
message(STATUS "Hello ${name}")
```
-```
+
+```text
-- The C compiler identification is GNU 8.3.0
...
-- Hello Jane Doe @@ -374,20 +370,20 @@ message(STATUS "Hello ${name}") message(STATUS "A simple message") ``` -`STATUS`{.cmake} can be replaced by *e.g.* `WARNING`{.cmake}, `SEND_ERROR`{.cmake}, `FATAL_ERROR`{.cmake} +`STATUS`{.cmake} can be replaced by _e.g._ `WARNING`{.cmake}, `SEND_ERROR`{.cmake}, `FATAL_ERROR`{.cmake} depending on the situation. ```cmake message(SEND_ERROR "An error occurred but configure step continues") ``` -``` + +```text CMake Error at CMakeLists.txt:2 (message): An error occurred but configure step continues -- Configuring incomplete, errors occurred! ``` - # Finding dependencies Libraries can be installed in various locations on your system. @@ -416,14 +412,14 @@ the library is installed). This is usually given by the library vendor. -# Breakout time +## Breakout time Look at Checkpoint 3. A new file `src/functionality_eigen.cpp` depends on the [Eigen](http://eigen.tuxfamily.org/index.php?title=Main_Page) library for linear algebra. Task: Using `find_package`{.cmake}, modify the `CMakeLists.txt` in directory `src/` to link target `cmake_course_lib`{.cmake} against Eigen. -*Hint: Useful instructions can be found at [Using Eigen in CMake Projects](http://eigen.tuxfamily.org/dox/TopicCMakeGuide.html).* +_Hint: Useful instructions can be found at [Using Eigen in CMake Projects](http://eigen.tuxfamily.org/dox/TopicCMakeGuide.html)._ Note that keyword `NO_MODULE`{.cmake} is equivalent to `CONFIG`{.cmake}. @@ -439,7 +435,7 @@ This behaviour corresponds to using `find_package`{.cmake} with the keyword `MOD find_package(library_name MODULE REQUIRED) ``` -Such *module files* are typically provided by CMake itself. +Such _module files_ are typically provided by CMake itself. They can also be written for a particular use case if required. @@ -457,17 +453,15 @@ find_package(Boost MODULE REQUIRED COMPONENTS ${boost_components}) ``` The CMake target for a component is `<PackageName>::<ComponentName>`{.cmake} -(*e.g.* `Boost::filesystem`{.cmake}). 
- +(_e.g._ `Boost::filesystem`{.cmake}). -# Breakout time +## Breakout time Look at Checkpoint 4. The executable `exe/main.cpp` depends on the [Boost Program Options](https://www.boost.org/doc/libs/1_74_0/doc/html/program_options.html) library for handling command line arguments. Task: Using `find_package`{.cmake} in `MODULE`{.cmake} mode, modify the `CMakeLists.txt` in directory `exe/` to find and link target `main_executable`{.cmake} against `Boost::program_options`{.cmake}. - # Adding CMake functionality using `include` Any file containing valid CMake syntax can be "included" in the @@ -487,9 +481,7 @@ set(name "Jane Doe") message(STATUS "Hello ${name}") ``` -# Adding CMake functionality using `include` - -``` +```text -- Hello Jane Doe -- Hello Foo Bar -- Configuring done @@ -532,8 +524,7 @@ Functions cannot return a value. Functions introduce a new scope. -A similar notion is CMake *macros*, which does **not** introduce a new scope. - +A similar notion is CMake _macros_, which do **not** introduce a new scope. # Setting options with `option()` @@ -555,7 +546,7 @@ between CMake runs. # Built-in CMake variables -CMake provides *a lot* of pre-defined variables which values describe the system. +CMake provides _a lot_ of pre-defined variables whose values describe the system. For instance, the value of `CMAKE_CXX_COMPILER_ID`{.cmake} can be queried to determine which C++ compiler is used. @@ -577,7 +568,7 @@ A useful technique for adding options to targets, for instance adding compiler f Let's see how that works, in Checkpoint 5... -# Breakout time +## Breakout time Look at Checkpoint 5. The compiler should now warn us about bad C++. This is encouraged! @@ -592,11 +583,11 @@ Do you get a compiler warning? An error? Try configuring `WARNINGS_AS_ERRORS`{.c ```bash cmake -DWARNINGS_AS_ERRORS=ON .. ``` + ```bash cmake -DWARNINGS_AS_ERRORS=OFF .. ``` - # That's all, folks This was only the tiniest tip of the modern CMake iceberg.
There are so many great resources available, and here are just a few of them: diff --git a/technology_and_tooling/packaging_dependency_management/index.md b/technology_and_tooling/packaging_dependency_management/index.md index 4fccb326..ed4da4b1 100644 --- a/technology_and_tooling/packaging_dependency_management/index.md +++ b/technology_and_tooling/packaging_dependency_management/index.md @@ -1,34 +1,30 @@ --- name: Packaging and Dependency Management id: packaging_dependency_management -dependsOn: [ - technology_and_tooling.ide, - software_architecture_and_design.procedural -] -files: [ +dependsOn: [technology_and_tooling.ide, software_architecture_and_design.procedural] +files: + [ pack_python_01intro.md, pack_python_02making.md, pack_python_03reusing.md, pack_python_04sharing.md, virtual_environments_python.md, cmake.md, -] -attribution: - - citation: > - "Python Packaging" course developed by Thibault Lestang and the Oxford Research - Software Engineering group - url: https://github.com/OxfordRSE/python-packaging-course - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - + ] +attribution: + - citation: > + "Python Packaging" course developed by Thibault Lestang and the Oxford Research + Software Engineering group + url: https://github.com/OxfordRSE/python-packaging-course + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 summary: | - 
This course introduces the basics of packaging and dependency management in Python and C++. - For Python, we introduce `venv` for virtual envronments, and `pip` for package management and how to structure a modern Python package and publish it to PyPI. - For C++, we introduce the CMake build system and how use it to manage dependencies and the build process. + This course introduces the basics of packaging and dependency management in Python and C++. + For Python, we introduce `venv` for virtual environments, and `pip` for package management and how to structure a modern Python package and publish it to PyPI. + For C++, we introduce the CMake build system and how to use it to manage dependencies and the build process. --- - diff --git a/technology_and_tooling/packaging_dependency_management/pack_python_01intro.md b/technology_and_tooling/packaging_dependency_management/pack_python_01intro.md index 19f4e576..9889fc55 100644 --- a/technology_and_tooling/packaging_dependency_management/pack_python_01intro.md +++ b/technology_and_tooling/packaging_dependency_management/pack_python_01intro.md @@ -1,49 +1,44 @@ --- name: Packaging Python Projects -dependsOn: [ - software_architecture_and_design.procedural.arrays_python -] +dependsOn: [software_architecture_and_design.procedural.arrays_python] tags: [python, setuptools] -attribution: - - citation: > - "Python Packaging" course developed by Thibault Lestang and the Oxford Research - Software Engineering group - url: https://github.com/OxfordRSE/python-packaging-course - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: > + "Python Packaging" course developed by Thibault Lestang and the
Oxford Research + Software Engineering group + url: https://github.com/OxfordRSE/python-packaging-course + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- In this workshop, you are going to learn how to organise your Python software into _packages_. Doing so, you will be able to -- Have your software clearly organised in a way that is standard among Python developpers, making - your software easier to understand, test and debug. -- Reuse your code across your research projects and analyses. No more copying and pasting - blocks of code around: implement and test things once. -- Easily share your software, making everybody (including yourself) able to `pip install` - your package! +- Have your software clearly organised in a way that is standard among Python developers, making + your software easier to understand, test and debug. +- Reuse your code across your research projects and analyses. No more copying and pasting + blocks of code around: implement and test things once. +- Easily share your software, making everybody (including yourself) able to `pip install` + your package! The **plan** is the following: we are going to start from a couple of rather messy python scripts and gradually transform them into a full-blown python package. At the end of this workshop, you'll know: -- What is a Python package and how to create one (and why!). -- How to share your packages across several of your projects. -- Maintain packages independantly from your research projects and analyses. -- What are virtual environments and how to use them to install different versions of a package - for different analyses.
-- How to share your package on the [Python Package Index](https://pypi.org/) (PyPI), effectively making it straightforward - for anyone to install your package with the `pip` package manager (and much more!). -- Where to go next. +- What is a Python package and how to create one (and why!). +- How to share your packages across several of your projects. +- Maintain packages independently from your research projects and analyses. +- What are virtual environments and how to use them to install different versions of a package + for different analyses. +- How to share your package on the [Python Package Index](https://pypi.org/) (PyPI), effectively making it straightforward + for anyone to install your package with the `pip` package manager (and much more!). +- Where to go next. Sounds interesting? Good! Get a cup of your favorite beverage and let's get started. - ## Materials for this course This course assumes that you have a local copy of the materials repository. @@ -56,7 +51,6 @@ git clone https://github.com/OxfordRSE/python-packaging-course ``` For non-git users, you can visit <https://github.com/OxfordRSE/python-packaging-course> and download the materials as a ZIP archive ("code" green button on the top right corner). - ## Two scripts to analyse a timeseries Our starting point for this workshop is the script `analysis.py`. You'll find it in the `course/initial_scripts/` directory at the root of the repository. @@ -78,27 +72,26 @@ The first column contains the various times when the particle's position was rec the second column the corresponding position. Let's have a quick overview of these scripts, but **don't try to understand the -details**, it is irrelevant to the present workshop. Instead, let's briefly +details**; they are irrelevant to the present workshop. Instead, let's briefly describe their structure.
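The two-column layout described above (time in the first column, the particle's position in the second) can be read in a few lines. The sketch below uses only the standard library, with a small inline sample standing in for `data/brownian.csv` — an assumption for illustration; the course scripts themselves read the file with `numpy`:

```python
import csv
import io

# Inline sample standing in for data/brownian.csv: one row per record,
# first column the time, second column the particle's position.
sample = "0.0,0.12\n0.1,0.09\n0.2,-0.03\n"

times, positions = [], []
for time, position in csv.reader(io.StringIO(sample)):
    times.append(float(time))
    positions.append(float(position))

print(times, positions)
```

For a real file you would pass `open("data/brownian.csv")` to `csv.reader` instead of the in-memory sample.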
- ### Overview of `analysis.py` After reading the timeseries from the file `brownian.csv`, this script `base.py` does three things: -- It computes the average value of the particle's position over time and the standard - deviation, which gives a measure of the spread around the average value. -- It plots the particle's position as a function of time from the initial time until - 50 time units. -- Lastly, it computes and plots the histogram of the particle's position over the entirety - of the timeseries. In addition, the theoritical histogram is computed and drawn as a - continuous line on top of the measured histogram. For this, a function `get_theoritical_histogram` - is defined, resembling the `numpy` function `histogram`. +- It computes the average value of the particle's position over time and the standard - deviation, which gives a measure of the spread around the average value. +- It plots the particle's position as a function of time from the initial time until + 50 time units. +- Lastly, it computes and plots the histogram of the particle's position over the entirety + of the timeseries. In addition, the theoretical histogram is computed and drawn as a + continuous line on top of the measured histogram. For this, a function `get_theoritical_histogram` + is defined, resembling the `numpy` function `histogram`. You're probably familiar with this kind of script, in which several independent -operations are performed on a single dataset. It is the typical output of some -"back of the envelope", exploratory work that is common in research. Taking a step +operations are performed on a single dataset. It is the typical output of some +"back of the envelope", exploratory work that is common in research. Taking a step back, these scripts are the reason why high-level languages like Python are so popular among scientists and researchers: got some data and want to quickly get some insight into it?
Let's just jot down a few lines of code and get some @@ -108,31 +101,30 @@ Whilst great for short early research phases, this "back of the envelope scripti backfire if maintained over longer periods of time, perhaps even over your whole research project. Going back to `analysis.py`, consider the following questions: -- What would you do if you wanted to plot the timeseries over the last 50 time units instead of the first 50? -- What would you do if you wanted to visualise the _Probablity Density Function_ (PDF) instead of the histogram (effectively passing the optional argument `density=true` - to `numpy.histogram`). -- What would you do if you were given a similar dataset to `brownian.csv` and asked to compute the mean, compute the histogram along with other things not implemented in `analysis.py` ? +- What would you do if you wanted to plot the timeseries over the last 50 time units instead of the first 50? +- What would you do if you wanted to visualise the _Probability Density Function_ (PDF) instead of the histogram (effectively passing the optional argument `density=True` + to `numpy.histogram`)? +- What would you do if you were given a similar dataset to `brownian.csv` and asked to compute the mean, compute the histogram along with other things not implemented in `analysis.py`? In the interest of time, you are likely to end up modifying some specific lines (to compute the PDF instead of the histogram for example), and/or copy and paste -of lot of code. Whilst convenience on a short term basis, is it going to be +a lot of code. Whilst convenient in the short term, it is going to become increasingly difficult to understand your script, track its purpose, and test -that its results are correct. Three months later, facing a smilar dataset, +that its results are correct. Three months later, facing a similar dataset, would you not be tempted to rewrite things from scratch? It doesn't have to be this way!
As you're going to learn in this course, organising your Python software into _packages_ alleviates most of these issues. - ## Separating methods from parameters and data Roughly speaking, a numerical experiment is made of three components: -- The data (dataset, or parameters of simulation). -- The operations performed on this data. -- The output (numbers, plots). +- The data (dataset, or parameters of simulation). +- The operations performed on this data. +- The output (numbers, plots). As we saw, `analysis.py` mixes the three above components into a single `.py` -file, making the analysis difficult (sometimes even risky!) to modify and test. +file, making the analysis difficult (sometimes even risky!) to modify and test. Re-using part of the code means copying and pasting blocks of code out of their original context, which is a dangerous practice. You might be thinking (and you would be right) that this statement is an exaggeration for a script of this @@ -173,12 +165,12 @@ All that remains are the actual steps of the analysis. If we were to make changes to the way some operations are implemented, we would simply make changes to the package, leaving the scripts unmodified. This reduces the risk of introducing errors into your analysis, when all you -want to do is modyfying some opearation of data. The changes are then made +want to do is modify some operation or the data. The changes are then made available to all the programs that use the package: no more copying and pasting code around. :::callout -Taking a step back, the idea of separating different components is pervasive in software +Taking a step back, the idea of separating different components is pervasive in software development and software design. It goes by different names depending on the field (encapsulation, separation of concerns, bounded contexts...).
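To make the separation concrete: once an operation lives behind a function, a change of requirement becomes a change of argument rather than an edit to the analysis code. The sketch below is a pure-Python stand-in for the idea — the course code uses `numpy.histogram`, and the name `get_histogram` here is hypothetical:

```python
def get_histogram(values, nbins=10, density=False):
    """Count values into nbins equal-width bins; normalise to a PDF if density=True."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0  # guard against all-equal values
    counts = [0] * nbins
    for v in values:
        # Clamp the top edge so the maximum value falls in the last bin.
        index = min(int((v - lo) / width), nbins - 1)
        counts[index] += 1
    if density:
        # Divide by total area so the bins integrate to 1, like density=True
        # in numpy.histogram.
        total_area = sum(counts) * width
        counts = [c / total_area for c in counts]
    return counts, width
```

Switching from raw counts to a PDF is then the one-argument change `density=True` — no copying and pasting of analysis code involved.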
diff --git a/technology_and_tooling/packaging_dependency_management/pack_python_02making.md b/technology_and_tooling/packaging_dependency_management/pack_python_02making.md index fd5898a2..fa3de80a 100644 --- a/technology_and_tooling/packaging_dependency_management/pack_python_02making.md +++ b/technology_and_tooling/packaging_dependency_management/pack_python_02making.md @@ -1,22 +1,18 @@ --- name: Making Packages -dependsOn: [ - technology_and_tooling.packaging_dependency_management.pack_python_01intro -] +dependsOn: [technology_and_tooling.packaging_dependency_management.pack_python_01intro] tags: [python, setuptools] -attribution: - - citation: > - "Python Packaging" course developed by Thibault Lestang and the Oxford Research - Software Engineering group - url: https://github.com/OxfordRSE/python-packaging-course - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +attribution: + - citation: > + "Python Packaging" course developed by Thibault Lestang and the Oxford Research + Software Engineering group + url: https://github.com/OxfordRSE/python-packaging-course + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## The `tstools` package @@ -85,7 +81,6 @@ In the following section we add a couple of `import` statements in the `__init__.py` so that all our functions (in both modules) are available under the single namespace `tstools`.
- ## init dot pie Whenever you import a directory, Python will look for a file `__init__.py` at the root of this @@ -94,7 +89,7 @@ It is the presence of this initialization file that truly makes the `tstools` di package. :::callout -Since Python 3.3, this isn't technically true. Directories without a __init__.py +Since Python 3.3, this isn't technically true. Directories without a `__init__.py` file are called namespace packages (see Packaging namespace packages on the Python Packaging User Guide). However, their discussion is beyond the scope of this course. @@ -123,8 +118,7 @@ __init__.py The lesson here is that any object (variable, function, class) defined in the `__init__.py` file is available under the package's namespace. - -::::challenge{id=single_namespace title="Bringing all functions under a single namespace"} +::::challenge{id=single_namespace title="Bringing all functions under a single namespace"} Our package isn't very big, and the internal structure with 2 different modules isn't very relevant for a user. @@ -137,12 +131,12 @@ under the `tstools` namespace, that is import tstools # instead of mean, var = tstools.moments.get_mean_and_var(...) -mean, var = tstools.get_mean_and_var(timeseries) +mean, var = tstools.get_mean_and_var(timeseries) # instead of fig, ax = tstools.vis.plot_histogram(...) -fig, ax = tstools.plot_histogram(timeseries, 4*np.sqrt(var)) +fig, ax = tstools.plot_histogram(timeseries, 4*np.sqrt(var)) ``` - + :::callout By default python looks for modules in the current directory and some other locations (more about that later).
When using `import`, @@ -153,6 +147,7 @@ you can refer to modules in the current package using the _dot notation_: # in the current package (next to the __init__.py) from .module import something ``` + ::: :::solution @@ -162,11 +157,11 @@ from .module import something from .moments import get_mean_and_var from .vis import plot_histogram ``` + ::: :::: - ### Using the package Our package is now ready to be used in our analysis, and an analysis script could look like this: @@ -184,5 +179,5 @@ mean, var = tstools.get_mean_and_var(timeseries) fig, ax = tstools.plot_histogram(timeseries, nbins=100) ``` -Note that the above does the exact same amount of work job as +Note that the above does the exact same work as `initial_scripts/analysis.py`... but is much shorter and easier to read! diff --git a/technology_and_tooling/packaging_dependency_management/pack_python_03reusing.md b/technology_and_tooling/packaging_dependency_management/pack_python_03reusing.md index d553109c..477d2829 100644 --- a/technology_and_tooling/packaging_dependency_management/pack_python_03reusing.md +++ b/technology_and_tooling/packaging_dependency_management/pack_python_03reusing.md @@ -1,25 +1,20 @@ --- name: Reusing Packages -dependsOn: [ - technology_and_tooling.packaging_dependency_management.pack_python_02making -] +dependsOn: [technology_and_tooling.packaging_dependency_management.pack_python_02making] tags: [python, setuptools] -attribution: - - citation: > - "Python Packaging" course developed by Thibault Lestang and the Oxford Research - Software Engineering group - url: https://github.com/OxfordRSE/python-packaging-course - image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png -
license: CC-BY-4.0 - - +attribution: + - citation: > + "Python Packaging" course developed by Thibault Lestang and the Oxford Research + Software Engineering group + url: https://github.com/OxfordRSE/python-packaging-course + image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - ## Another analysis Let's say that we have another directory `analysis2`, that contains another @@ -33,7 +28,7 @@ In the directory `analysis2/`, let's write a script `analysis2.py`, that imports analysis2/ analysis2.py data/ - hotwire.csv + hotwire.csv ``` ```python @@ -72,23 +67,22 @@ $ python The order of this list matters: it is the order in which python looks into the directories that constitute the python path. To begin with, Python first looks in the current directory. -If the package/module isn't found there, the python intepreter looks in the following directories +If the package/module isn't found there, the python interpreter looks in the following directories (in this order): -- `/usr/lib/python38.zip` -- `/usr/lib/python3.8` -- `/usr/lib/python3.8/lib-dynload` +- `/usr/lib/python38.zip` +- `/usr/lib/python3.8` +- `/usr/lib/python3.8/lib-dynload` The above contain the modules and packages in the _standard library_, _i.e_ the -packages and modules that come "pre-installed" with Python. Finally, the python +packages and modules that come "pre-installed" with Python. Finally, the python interpreter looks inside the directory `/home/thibault/python-workshop-venv/lib/python3.8/site-packages/`, which is our currently active virtual environment. -:::callout -The output of `sys.path` is probaby different on your machine. 
It depends on many -factors, -like your operating system, your version of Python, the location of your current active Python +:::callout{variant="info"} +The output of `sys.path` is probably different on your machine. It depends on many factors, +such as your operating system, your version of Python, and the location of your current active Python environment. ::: @@ -99,52 +93,51 @@ Looking back at the example in the previous section, let's list some potential ways we can make the `tstools` package importable from the `analysis2/` directory: -1. **Copy (`analysis1/tstools/`) in `analysis2/`**. - You end up with two independant packages. If you make changes to one, you have to remember to make the same - changes to the other. It's the usual copy and paste problems: inefficient and error-prone. -2. **Add `analysis1/` to `sys.path`**. - At the beginning of `analysis2.py`, you could just add +1. **Copy `analysis1/tstools/` into `analysis2/`**. + You end up with two independent packages. If you make changes to one, you have to remember to make the same + changes to the other. It's the usual copy and paste problems: inefficient and error-prone. +2. **Add `analysis1/` to `sys.path`**. + At the beginning of `analysis2.py`, you could just add + + ```python + import sys + sys.path.append("../analysis1/") + ``` - ```python - import sys - sys.path.append("../analysis1/") - ``` + This approach can be sufficient in some situations, but is generally not recommended. What if the package directory is relocated? - This approach can be sufficient in some situations, but generally not recommended. What if the package directory is relocated? -3. **Copy `analysis1/tstools` directory to the `site-packages/` directory.** - You have to know where the `site-packages` is. This depends on your current system and python environment (see below). - The location on your macine may very well be differnt from the location on your colleague's machine. +3.
**Copy the `analysis1/tstools` directory to the `site-packages/` directory.** + You have to know where the `site-packages` directory is. This depends on your current system and python environment (see below). + The location on your machine may very well be different from the location on your colleague's machine. More generally, the three above approaches overlook a very important -point: **dependencies**. Our package has two: numpy and matplotlib. +point: **dependencies**. Our package has two: numpy and matplotlib. If you were to give your package to a colleague, nothing guarantees -that they have both packages installed. This is a pedagogical +that they have both packages installed. This is a pedagogical example, as it is likely that they would have both installed, given -the popularity of these packages. However, if your package relies on +the popularity of these packages. However, if your package relies on less widespread packages, specific versions of them or maybe a long list of packages, it is important to make sure that they are available. -Note that all three above approaches work. However, unless you have a +Note that all three above approaches work. However, unless you have a good reason to use one of them, these are not recommended for the reasons above. In the next section, we look at the recommended way to install a package, using `setuptools` and `pip`. ## setuptools, pyproject dot toml, setup dot pie and pip - The recommended way to install a package is to use the `setuptools` library in -conjunction with `pip`, the official python _package manager_. Effectively, +conjunction with `pip`, the official python _package manager_. Effectively, this approach is roughly equivalent to copying the package to the `site-packages` directory, but the process is **automated**. - ### pip -Pip is the de facto package manager for Python packages. It's main +Pip is the de facto package manager for Python packages.
Its main job is to install, remove, upgrade, configure and manage Python packages, both available locally on your machine but also hosted on -the [Python Package Index (PyPI)](https://pypi.org/). Pip is +the [Python Package Index (PyPI)](https://pypi.org/). Pip is maintained by the [Python Packaging Authority](https://www.pypa.io/en/latest/). @@ -165,14 +158,13 @@ pip install ./tstools ERROR: Directory './tstools' is not installable. Neither 'setup.py' nor 'pyproject.toml' found. ``` -The above doesn't really look like our package got installed properly. For +The above doesn't really look like our package got installed properly. For `pip` to be able to install our package, we must first give it some information about it. In fact, `pip` expects to find either a `pyproject.toml` configuration file or a python file named `setup.py` in the directory that it is given as an argument. These files will contain some metadata about the package and tell `pip` the location of the actual source of the package. - ### `setup.py` (setup dot pie) The `setup.py` file is a regular Python file that makes a call to the `setup` @@ -191,14 +183,14 @@ setup(name='tstools', url='myfancywebsite.com', author='Spam Eggs', packages=['tstools'], - install_requires = ["numpy", "matplotlib", "scipy"], + install_requires=["numpy", "matplotlib", "scipy"], license='GPLv3') ``` The above gives `pip` some metadata about our package: its version, a short description, its authors, and its license. It also provides information regarding the dependencies of our package, _i.e._ `numpy` -and `matplotlib`. In addition, it gives `setup` the location of the +and `matplotlib`. In addition, it gives `setup` the location of the package to be installed, in this case the directory `tstools`. :::callout @@ -226,7 +218,7 @@ tool-agnostic and easily extensible, allowing for the coexistence and cooperation of different tools in the same project, which makes the packaging process more uniform across different tools.
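Whichever file declares it, this metadata becomes queryable once a distribution is installed. A small sketch using the standard library's `importlib.metadata` — the output depends on whatever distributions happen to be installed in the active environment:

```python
from importlib.metadata import distributions

# Each installed distribution exposes the name and version that were
# declared in its setup.py / pyproject.toml.
for dist in list(distributions())[:5]:
    print(dist.metadata["Name"], dist.version)
```

Running this inside a virtual environment is a quick way to confirm what `pip` has actually installed there.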
-Here is an equivilant `pyproject.toml` file for our `tstools` package: +Here is an equivalent `pyproject.toml` file for our `tstools` package: ```toml [build-system] @@ -240,26 +232,21 @@ authors = [ {name = "Spam Eggs", email = "spam.eggs@email.com"} ] readme = "README.md" -homepage = "myfancywebsite.com" -license = "GPLv3" +license = {text = "MIT"} +dependencies = ["numpy", "matplotlib", "scipy"] [project.urls] -Source = "myfancywebsite.com" +Source = "example.com" [project.scripts] # Define scripts here if you have any -[project.dependencies] -numpy = "*" -matplotlib = "*" -scipy = "*" - [project.optional-dependencies] # Define optional dependencies here if you have any ``` -Note that both `setup.py` and `pyproject.toml` can be used in conjection, and in +Note that both `setup.py` and `pyproject.toml` can be used in conjunction, and in the transition period it is common for Python packages to include both these files, as we will do in this workshop. In this case, we can create a minimal `pyproject.toml` file that just -specificies the use of `setuptools` and links to the `setup.py` file: +specifies the use of `setuptools` and links to the `setup.py` file: ```toml [build-system] @@ -278,91 +265,86 @@ tools that can be used to install Python packages, such as [`flit`](https://flit After writing a `setup.py` and `pyproject.toml` file, our directory structure looks like this: - ```text python-workshop/ analysis1/ - data/ - analysis1.py - setup.py + data/ + analysis1.py + setup.py pyproject.toml - tstools/ + tstools/ ``` Actually, there are no reasons for our `tstools` package to be located -in the `analysis1/` directory. Indeed, the package is independant +in the `analysis1/` directory. Indeed, the package is independent from this specific analysis, and we want to share it among multiple analyses. 
To reflect this, let's move the `tstools` package into a new directory
-`tstools-dist` located next to the `anaylis1` and `analysis2`
+`tstools-dist` located next to the `analysis1` and `analysis2`
directories:

```text
python-workshop/
    analysis1/
-       data/
-       analysis1.py
+        data/
+        analysis1.py
    analysis2/
-       data/
-       analysis2.py
+        data/
+        analysis2.py
    tstools-dist/
-       setup.py
+        setup.py
        pyproject.toml
-       tstools/
+        tstools/
```

The directory `tstools-dist` is a _distribution package_, containing the
`setup.py` file and the package itself - the `tstools` directory.
These are the two minimal ingredients required to _distribute_ a package.

-::::challenge{id=installing-tstools title="Installing `tsools` with pip"}
-
-1. Write a stand-alone `pyproject.toml` file, or use a combination of
-   `setup.py` and `pyproject.toml` files in directory `tstools-dist`. Include the
-   following metadata:
-   - The name of the package (could be `tstools` but also could be anything else)
-   - The version of the package (for example 0.1)
-   - A one-line description
-   - Your name as the author
-   - Your email
-   - The GPLv3 license
-
-2. *Un*install numpy and matplotlib
-
-   ```shell
-   pip uninstall numpy matplotlib
-   ```
-
-:::callout
-Make sure `pip` points to your current virtual environment (you can check this by typing
-`pip --version`. Particularly, if admin rights are necessary to uninstall and install
-packages, you're probably using `pip` in your global Python environment. To make sure
-that you run the correct `pip` for your correct Python environment, run `python -m pip
-<pip command>` instead of `pip <pip command>`.)
-:::
-
-
-3. Install the `tstools` package with `pip`.
-   Remember: `pip install <location of setup file>`
-   Notice how `numpy` and `matplotlib` are automatically downloaded (can you find from where?) even though your just uninstalled them.
-4. Move to the directory `analysis2/` and check that you can import
-   your package from there. Where is this package located?
Hint:
-   You can check the location a package using the `__file__`
-   attribute.
-5. The directory `analysis2` contains a timeseries under
-   `data/`. What is the average value of the timeseries?
+::::challenge{id=installing-tstools title="Installing `tstools` with pip"}
+
+1. Write a stand-alone `pyproject.toml` file, or use a combination of
+   `setup.py` and `pyproject.toml` files in directory `tstools-dist`. Include the
+   following metadata:
+
+   - The name of the package (could be `tstools` but also could be anything else)
+   - The version of the package (for example 0.1)
+   - A one-line description
+   - Your name as the author
+   - Your email
+   - The GPLv3 license
+
+2. *Un*install numpy and matplotlib
+
+   ```shell
+   pip uninstall numpy matplotlib
+   ```
+
+   :::callout{variant="tip"}
+   Make sure `pip` points to your current virtual environment (you can check this by typing `pip --version`).
+   Particularly, if admin rights become necessary to uninstall and install packages, you're probably using `pip` in your global Python environment.
+   To ensure that you run the correct `pip` for the correct Python environment, run `python -m pip <pip command>` instead of `pip <pip command>`.
+   :::
+
+3. Install the `tstools` package with `pip`.
+   Remember: `pip install <location of setup file>`
+   Notice how `numpy` and `matplotlib` are automatically downloaded (can you find from where?) even though you just uninstalled them.
+4. Move to the directory `analysis2/` and check that you can import
+   your package from there. Where is this package located? Hint:
+   You can check the location of a package using the `__file__`
+   attribute.
+5. The directory `analysis2` contains a timeseries under
+   `data/`. What is the average value of the timeseries?

::::

-
Congratulations! Your `tstools` package is now installed and can be reused
-across your analyses... no more dangerous copying and pasting!
+across your analyses... no more dangerous copying and pasting!
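The challenge's hint about `__file__` works for any importable module, not just `tstools`. A quick sketch with a standard-library module (substitute `tstools` once it is installed) showing where a package lives on disk:

```python
import os
import json  # stand-in for tstools: any importable package works

# A module's __file__ attribute is the path it was loaded from;
# its directory tells you where the package is installed.
location = os.path.dirname(json.__file__)
print(location)
```

Comparing this path before and after `pip install` is a reliable way to check which copy of a package Python is actually importing.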
## Maintaining your package

-
In the previous section you made your package "pip installable" by
-creating a `setup.py` file. You then installed the package,
+creating a `setup.py` file. You then installed the package,
effectively making it accessible from different analysis
directories.

However, a package is never set in stone: as you work on your
@@ -371,11 +353,10 @@ instance to add functionalities or to fix bugs.

You could just reinstall the package each time you make a
modification to it, but this obviously becomes tedious if you are constantly making
-changes (maybe to hunt down a bug) and/or testing your package.  In
+changes (maybe to hunt down a bug) and/or testing your package. In
addition, you may simply forget to reinstall your package, leading to
potentially very frustrating and time-consuming errors.

-
### Editable installs

`pip` has the ability to install the package in a so-called "editable" mode.
@@ -391,19 +372,18 @@ To install your package in editable mode, use the `-e` option for the `install`

pip install -e .
```

-
::::challenge{id=editable-install title="Editable install"}

-1. Uninstall the package with `pip uninstall tstools`
-2. List all the installed packages and check that `tstools` is not among them
-   Hint: Use `pip --help` to get alist of available `pip` commands.
-3. re-install `tstools` in editable mode.
-4. Modify the `tstools.vis.plot_trajectory_subset` so that it returns the maximum value
-   over the trajectory subset, in addition to `figure` and `axis`.
-   Hint: You can use the numpy function `amax` to find the maximum of an array.
-5. Edit and run the script `analysis2/analysis2.py` to print the
-   maximum value of the timeseries `analysis2/data/hotwire.csv` between t=0
-   and t = 0.25.
+1. Uninstall the package with `pip uninstall tstools`
+2. List all the installed packages and check that `tstools` is not among them
+   Hint: Use `pip --help` to get a list of available `pip` commands.
+3. Re-install `tstools` in editable mode.
+4.
Modify the `tstools.vis.plot_trajectory_subset` so that it returns the maximum value + over the trajectory subset, in addition to `figure` and `axis`. + Hint: You can use the numpy function `amax` to find the maximum of an array. +5. Edit and run the script `analysis2/analysis2.py` to print the + maximum value of the timeseries `analysis2/data/hotwire.csv` between t=0 + and t = 0.25. In editable mode, `pip install` creates a file, `<package-name>.egg-link`, at the package installation location in @@ -414,22 +394,23 @@ package in your package project directory: cat ~/python-workshop-venv/lib/python3.8/site-packages/tstools.egg-link /home/thibault/python-packaging-workshop/tstools ``` + :::: ## Summary -- In order to reuse our package across different analyses, we must _install_ it. - In effect, this means copying the package into a directory that is in the python path. - This shouldn't be done manually, but instead using a `pyproject.toml` (or `setup.py`) - configuration file that a tool like `pip` can process using the `pip install` command. -- It would be both cumbersome and error-prone to have to reinstall the package each time - we make a change to it (to fix a bug for instance). Instead, the package can be installed - in "editable" mode using the `pip install -e` command. This just redirects the python - interpreter to your project directory. -- The main value of packaging software is to facilitate its reuse across different projects. - One you have extracted the right operations into a package that is independent of your - analysis, you can easily "share" it between projects. In this way you avoid inefficient - and dangerous duplication of code. +- In order to reuse our package across different analyses, we must _install_ it. + In effect, this means copying the package into a directory that is in the python path. 
+  This shouldn't be done manually, but instead using a `pyproject.toml` (or `setup.py`)
+  configuration file that a tool like `pip` can process using the `pip install` command.
+- It would be both cumbersome and error-prone to have to reinstall the package each time
+  we make a change to it (to fix a bug for instance). Instead, the package can be installed
+  in "editable" mode using the `pip install -e` command. This just redirects the python
+  interpreter to your project directory.
+- The main value of packaging software is to facilitate its reuse across different projects.
+  Once you have extracted the right operations into a package that is independent of your
+  analysis, you can easily "share" it between projects. In this way you avoid inefficient
+  and dangerous duplication of code.

Beyond greatly facilitating code reuse, writing a python package (as opposed
to a loosely organised collection of modules) enables a clear organisation of
your software into modules
@@ -439,4 +420,4 @@ understand the structure of your software, _i.e_ what-does-what.
Moreover, organising your python software into a package gives you access to a
myriad of fantastic tools used by thousands of python developers every day. Examples
include pytest for automated testing, sphinx for building your documentation, tox for automation
-of project-level tasks.
\ No newline at end of file
+of project-level tasks.
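The summary's first point, that installing means copying the package into "a directory that is in the python path", can be checked from a Python session. A sketch (the exact directory name varies by platform and environment):

```python
import sys
import sysconfig

# "purelib" is the site-packages directory that pip installs pure-Python
# packages into; imports work because directories like this one appear
# on sys.path, the interpreter's list of import search locations.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
print(sys.path[:3])
```

An editable install simply makes this lookup resolve to your project directory instead of a copied snapshot in `site-packages`.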
diff --git a/technology_and_tooling/packaging_dependency_management/pack_python_04sharing.md b/technology_and_tooling/packaging_dependency_management/pack_python_04sharing.md
index 03c913d9..b25a93bc 100644
--- a/technology_and_tooling/packaging_dependency_management/pack_python_04sharing.md
+++ b/technology_and_tooling/packaging_dependency_management/pack_python_04sharing.md
@@ -1,26 +1,20 @@
---
name: Sharing Packages
-dependsOn: [
-  technology_and_tooling.packaging_dependency_management.pack_python_03reusing
-]
+dependsOn: [technology_and_tooling.packaging_dependency_management.pack_python_03reusing]
tags: [python, setuptools]
-attribution:
-  - citation: >
-      "Python Packaging" course developed by Thibault Lestang and the Oxford Research
-      Software Engineering group
-    url: https://github.com/OxfordRSE/python-packaging-course
-    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
-    license: CC-BY-4.0
-  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
-    url: https://www.universe-hpc.ac.uk
-    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
-    license: CC-BY-4.0
-
-
-
+attribution:
+  - citation: >
+      "Python Packaging" course developed by Thibault Lestang and the Oxford Research
+      Software Engineering group
+    url: https://github.com/OxfordRSE/python-packaging-course
+    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

-
# Sharing a package

## Building Python distributions

@@ -32,7 +26,7 @@ the package - but also sometimes compile and test it.

A distribution usually takes the form of an archive (`.tar`, `.zip` or
similar).
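Because a distribution is just an archive, its contents can be inspected with nothing more than the standard library. A sketch that builds a tiny in-memory zip, standing in for a real wheel or zip sdist (the file names inside are hypothetical), and lists what it contains:

```python
import io
import zipfile

# Create a small zip in memory as a stand-in for a distribution archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("tstools/__init__.py", "")
    archive.writestr("tstools-0.1.dist-info/METADATA", "Name: tstools\n")

# Listing the contents works the same way on a real dist/*.whl path.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
print(names)
```

Pointing `zipfile.ZipFile` at an actual file in `dist/` lets you explore a built distribution without unpacking it.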
There are several possible distribution formats, but for `pip`, only two are
really important: the _source distribution_ (sdist) and the _wheel_
-(bdist\_wheel).
+(bdist_wheel).

### Source distributions

@@ -80,9 +74,9 @@ removing 'tstools-0.1' (and everything under it)

This mainly does three things:

-- It gathers the python source files that consitute the package (incuding the `setup.py` or `pyproject.toml` if present).
-- It writes some metadata about the package in a directory `<package name>.egg-info`.
-- It bundles everyting into a tar archive.
+- It gathers the python source files that constitute the package (including the `setup.py` or `pyproject.toml` if present).
+- It writes some metadata about the package in a directory `<package name>.egg-info`.
+- It bundles everything into a tar archive.

The newly created sdist is written in a directory `dist` in the root
of the package:
@@ -152,34 +146,42 @@ rendering the latter format obsolete. For more information, refer to [Wheel vs E

::::challenge{id=python-wheel title="Building a Python wheel"}

-1. If you don't have one, create a new developement virtual environment in the
-   `tstools-dist` directory:
-
-   ```shell
-   python -m venv tstools-venv
-   source tstools-venv/bin/activate # (GNU/Linux and MacOS)
-   tstools-venv\Scripts\activate.bat # (Windows command prompt)
-   tstools-venv\Scripts\Activate.ps1 # (Windows PowerShell)
-   ```
-2. Update `pip`
-
-   ```shell
-   pip install --upgrade pip
-   ```
-3. Install `build` tool and the `wheel` extension:
-
-   ```shell
-   pip install wheel build
-   ```
-4. Build a wheel
-
-   ```shell
-   python -m build --wheel
-   ```
-5. Install the wheel using `pip`.
-   Hint: wheels are written in the `dist/` directory, just
-   like source distributions.
-6. `.whl` files are basically zip files. Unzip the wheel and explore its contents.
+1.
If you don't have one, create a new development virtual environment in the
+   `tstools-dist` directory:
+
+   ```shell
+   python -m venv tstools-venv
+   source tstools-venv/bin/activate # (GNU/Linux and MacOS)
+   tstools-venv\Scripts\activate.bat # (Windows command prompt)
+   tstools-venv\Scripts\Activate.ps1 # (Windows PowerShell)
+   ```
+
+2. Update `pip`
+
+   ```shell
+   pip install --upgrade pip
+   ```
+
+3. Install the `build` tool and the `wheel` extension:
+
+   ```shell
+   pip install wheel build
+   ```
+
+4. Build a wheel
+
+   ```shell
+   python -m build --wheel
+   ```
+
+5. Install the wheel using `pip`. Wheels are written in the `dist/` directory, just
+   like source distributions.
+
+   ```shell
+   python -m pip install ./dist/tstools*.whl
+   ```
+
+6. `.whl` files are basically zip files. Unzip the wheel and explore its contents.

:::callout
The [wheel](https://pypi.org/project/wheel/) package is a built-in extension to the `setuptools` package.
@@ -189,6 +191,7 @@ Using the `build` tool with no arguments will build both a source distribution a

```shell
python -m build
```
+
:::

::::

@@ -196,10 +199,9 @@

## Uploading distributions to PyPI

In the previous section you learned how to create distributions for your
-packages. In this section, we look at how to share them with others, so that
+packages. In this section, we look at how to share them with others, so that
other people can easily install and use your packages.

-
### Package repositories

Let's think about distributing packages for a minute.
If you both work next to each other, you could simply exchange a USB stick. If n

Although effective on a short term basis, these solutions present serious shortcomings:

-- You would have to share the distribution again each time you make a change to the package.
-- If your colleague wants a specific version (that's not the latest), you would have to check out the old version of your package and build the distribution again - unless your manually
-  keep track of all your distributions.
-- Users of your package must contact you to get the distribution, and wait for you to get back to them.
+- You would have to share the distribution again each time you make a change to the package.
+- If your colleague wants a specific version (that's not the latest), you would have to check out the old version of your package and build the distribution again - unless you manually
+  keep track of all your distributions.
+- Users of your package must contact you to get the distribution, and wait for you to get back to them.

These issues can be overcome by using _package repositories_. A package repository is just an index of packages hosted on remote servers, available to download for installation.
If you're using GNU/Linux, you use a package repository each time you install new software: `apt install libreoffice` is nothing but a request for the package `libreoffice` to one of
@@ -222,8 +224,7 @@ The main repository for Python is the [Python Package Index](https://pypi.org/)

Whenever you install a package with `pip install package`, `pip` first checks that `package` isn't a directory on your machine (in which case `pip` tries to install it as a package).
If not, `pip` makes a request to PyPI and, if it exists, downloads and installs the package `package`.

-
-### Publishing distributions to the test PyPI index
+### Publishing distributions to the test PyPI index

Once a package is uploaded to PyPI, it cannot easily be removed.
This is to prevent packages from disappearing without warning while other software depends on it.
@@ -250,81 +251,89 @@ without entering your username and password every time. Note that you might want

::::challenge{id=publishing-distributions title="Publishing distributions to TestPyPI"}

-1.
On PyPI (or TestPyPI), there cannot be two package with the same name. Therefore, before you upload your `tstools` package,
-   you must give the project a unique name. To do so, open the `tstools-dist/setup.py` or `tstools-dist/pyproject.toml` file and change the `name` entry
-   to something unique to you, for instance:
-
-   ```python
-   name='tstools-<yourname>'
-   ```
-2. Install `twine` in your `python-packaging-venv` environment
-
-   ```shell
-   pip install twine
-   ```
-3. If you created some distributions in the previous sections, remove everything inside your `dist/` directory
-
-   ```shell
-   rm dist/*
-   ```
-4. Create a source distribution and a wheel for your `tstools` package
-
-   ```shell
-   python -m build
-   ```
-5. If you don't have one, create an account on the Test PyPI index by visiting <https://test.pypi.org/account/register/>.
-6. Lastly, publish your distributions to the test PyPI index:
-
-   ```shell
-   twine upload --repository testpypi dist/*
-   ```
-
-   Can you find your package on [test.pypi.org](https://test.pypi.org) ?
-7. Create a new virtual environment and install your `tstools` package from the test PyPI index
-
-   ```shell
-   pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple your-package
-   ```
-
-   The above command is a bit lengthy, but it's just because we are installing from the test
-   index instead of the regular one. `--index-url https://test.pypi.org/simple/` tells `pip`
-   to look for the package at `test.pypi.org` instead of `pypi.org` (which is the default).
-   In addition, `--extra-index-url https://pypi.org/simple` tells `pip` to looks for dependencies
-   in the regular index, instead of the test one. In our case dependencies are `numpy` and `matplotlib`.
+1. On PyPI (or TestPyPI), there cannot be two packages with the same name. Therefore, before you upload your `tstools` package,
+   you must give the project a unique name.
To do so, open the `tstools-dist/setup.py` or `tstools-dist/pyproject.toml` file and change the `name` entry
+   to something unique to you, for instance:
+
+   ```python
+   name='tstools-<yourname>'
+   ```
+
+2. Install `twine` in your `python-packaging-venv` environment
+
+   ```shell
+   pip install twine
+   ```
+
+3. If you created some distributions in the previous sections, remove everything inside your `dist/` directory
+
+   ```shell
+   rm dist/*
+   ```
+
+4. Create a source distribution and a wheel for your `tstools` package
+
+   ```shell
+   python -m build
+   ```
+
+5. If you don't have one, create an account on the Test PyPI index by visiting <https://test.pypi.org/account/register/>.
+6. Lastly, publish your distributions to the test PyPI index:
+
+   ```shell
+   twine upload --repository testpypi dist/*
+   ```
+
+   Can you find your package on [test.pypi.org](https://test.pypi.org)?
+
+7. Create a new virtual environment and install your `tstools` package from the test PyPI index
+
+   ```shell
+   pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple your-package
+   ```
+
+   The above command is a bit lengthy, but it's just because we are installing from the test
+   index instead of the regular one. `--index-url https://test.pypi.org/simple/` tells `pip`
+   to look for the package at `test.pypi.org` instead of `pypi.org` (which is the default).
+   In addition, `--extra-index-url https://pypi.org/simple` tells `pip` to look for dependencies
+   in the regular index, instead of the test one. In our case, the dependencies are `numpy` and `matplotlib`.
+
::::

Congratulations! You just published your first Python package.

Remarks:

-- It's always a good idea to first publish your package on the test index, before
-  you publish it to the real index.
-- `twine` and `pip` defaut to the real index <https://pypi.org>, so commands are really simple:
-
-  ```shell
-  twine upload <distributions> # Publish package
-  pip install <package name> # Install package from pypi.org
-  ```
-- You can, and _should_ publish your package each time you make a new version of it.
-  All versions are stored on PyPI, and are accessible from pip.
-  See the [release history for numpy](https://pypi.org/project/numpy/#history) for example.
-  You could just install a specific version of numpy with:
-
-  ```shell
-  pip install numpy==1.17.5
-  ```
-- Note that _you cannot_ erase a published version of your package.
-  If you discover a bug in a version of your package that already has been published and want to fix it without changing the version number,
-  what is known as a _post-release_, _i.e_ adding `.postX` add the end of the faulty version number.
-  For instance:
-
-  ```python
-  setup(name='tstools',
-        version='0.1.post1',
-        ...)
-  ```
-
-  and upload your fixed package.
-  This will still be considered version `0.1`, but `pip install tstools==0.1` will download
-  the `0.1.post1` version.
-  Note that you could publish subsequent post-releases, _i.e_ `.post2`, `.post3`...
\ No newline at end of file
+- It's always a good idea to first publish your package on the test index, before
+  you publish it to the real index.
+- `twine` and `pip` default to the real index <https://pypi.org>, so commands are really simple:
+
+  ```shell
+  twine upload <distributions> # Publish package
+  pip install <package name> # Install package from pypi.org
+  ```
+
+- You can, and _should_, publish your package each time you make a new version of it.
+  All versions are stored on PyPI, and are accessible from pip.
+  See the [release history for numpy](https://pypi.org/project/numpy/#history) for example.
+  You could just install a specific version of numpy with:
+
+  ```shell
+  pip install numpy==1.17.5
+  ```
+
+- Note that _you cannot_ erase a published version of your package.
+  If you discover a bug in a version of your package that has already been published and want to fix it without changing the version number,
+  you can publish what is known as a _post-release_, _i.e_ add `.postX` at the end of the faulty version number.
+  For instance:
+
+  ```python
+  from setuptools import setup
+  setup(name='tstools',
+        version='0.1.post1')
+  ```
+
+  and upload your fixed package.
+  This will still be considered version `0.1`, but `pip install tstools==0.1` will download
+  the `0.1.post1` version.
+  Note that you could publish subsequent post-releases, _i.e_ `.post2`, `.post3`...
diff --git a/technology_and_tooling/packaging_dependency_management/virtual_environments_python.md b/technology_and_tooling/packaging_dependency_management/virtual_environments_python.md
index 4883a04c..e2ce28a3 100644
--- a/technology_and_tooling/packaging_dependency_management/virtual_environments_python.md
+++ b/technology_and_tooling/packaging_dependency_management/virtual_environments_python.md
@@ -1,67 +1,62 @@
---
name: Virtual Environments
-dependsOn: [
-  technology_and_tooling.ide.python
-]
+dependsOn: [technology_and_tooling.ide.python]
tags: [python, venv]
-attribution:
-  - citation: >
-      "Python Packaging" course developed by Thibault
Lestang and the Oxford Research
+      Software Engineering group
+    url: https://github.com/OxfordRSE/python-packaging-course
+    image: https://www.rse.ox.ac.uk/images/banner_ox_rse.svg
+    license: CC-BY-4.0
+  - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1
+    url: https://www.universe-hpc.ac.uk
+    image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png
+    license: CC-BY-4.0
---

-
-
:::callout
-This material was edited from the original in "Intermediate Research Software
+This material was edited from the original in "Intermediate Research Software
Development Skills" hosted by the Software Carpentries
:::

## Introduction

-Consider the following lines of Python that you might find at the top of a file that
+Consider the following lines of Python that you might find at the top of a file that
does some plotting and data analysis:

-~~~python
+```python
from matplotlib import pyplot as plt
import numpy as np
-~~~
+```

-This means that our code requires two *external libraries* (also called third-party packages or dependencies) -
+This means that our code requires two _external libraries_ (also called third-party packages or dependencies) -
`numpy` and `matplotlib`. Python applications often use external
libraries that don’t come as part of the standard Python distribution. This means
-that you will have to use a *package manager* tool to install them on your system.
+that you will have to use a _package manager_ tool to install them on your system.
Applications will also sometimes need a
specific version of an external library (e.g. because they require that a particular
bug has been fixed in a newer version of the library), or a specific version of the Python interpreter.
This means that each Python application you work with may require a different setup and a set of dependencies so it is important to be able to keep these configurations separate to avoid confusion between projects. -The solution for this problem is to create a self-contained *virtual -environment* per project, which contains a particular version of Python installation plus a number of +The solution for this problem is to create a self-contained _virtual +environment_ per project, which contains a particular version of Python installation plus a number of additional external libraries. Virtual environments are not just a feature of Python - most modern programming languages use them to isolate libraries -for a specific project and make it easier to develop, run, test and share code with others. -Even languages that don't explicitly have virtual environments have other mechanisms that promote per-project library collections. +for a specific project and make it easier to develop, run, test and share code with others. +Even languages that don't explicitly have virtual environments have other mechanisms that promote per-project library collections. In this episode, we learn how to set up a virtual environment to develop our code and manage our external dependencies. ## Virtual Environments + So what exactly are virtual environments, and why use them? A Python virtual environment is an **isolated working copy** of a specific version of Python interpreter together with specific versions of a number of external libraries installed into that -virtual environment. A virtual environment is simply a *directory with a particular -structure* which includes links to and enables multiple side-by-side installations of +virtual environment. 
A virtual environment is simply a _directory with a particular +structure_ which includes links to and enables multiple side-by-side installations of different Python interpreters or different versions of the same external library to coexist on your machine and only one to be selected for each of our projects. This allows you to work on a particular project without worrying about affecting other projects on your machine. @@ -70,33 +65,35 @@ its specific virtual environment and avoid a great deal of confusion by having s for each project rather than one huge global environment with potential package version clashes. Another big motivator for using virtual environments is that they make sharing your code with others much easier (as we will see shortly). Here are some typical scenarios where the usage of virtual environments is highly recommended (almost unavoidable): + - You have an older project that only works under Python 2. You do not have the time to migrate the project to Python 3 -or it may not even be possible as some of the third party dependencies are not available under Python 3. You have to -start another project under Python 3. The best way to do this on a single machine is to set up two separate Python virtual -environments. + or it may not even be possible as some of the third party dependencies are not available under Python 3. You have to + start another project under Python 3. The best way to do this on a single machine is to set up two separate Python virtual + environments. - One of your Python 3 projects is locked to use a particular older version of a third party dependency. You cannot use the -latest version of the -dependency as it breaks things in your project. In a separate branch of your project, you want to try and fix problems -introduced by the new version of the dependency without affecting the working version of your project. 
You need to set up -a separate virtual environment for your branch to 'isolate' your code while testing the new feature. + latest version of the + dependency as it breaks things in your project. In a separate branch of your project, you want to try and fix problems + introduced by the new version of the dependency without affecting the working version of your project. You need to set up + a separate virtual environment for your branch to 'isolate' your code while testing the new feature. You do not have to worry too much about specific versions of external libraries that your project depends on most of the time. Virtual environments enable you to always use the latest available version without specifying it explicitly. They also enable you to use a specific older version of a package for your project, should you need to. :::callout + ## A Specific Python or Package Version is Only Ever Installed Once -Note that you will not have a separate Python or package installations for each of your +Note that you will not have a separate Python or package installations for each of your projects - they will only -ever be installed once on your system but will be referenced from different virtual +ever be installed once on your system but will be referenced from different virtual environments. ::: - ### Managing Python Virtual Environments There are several commonly used command line tools for managing Python virtual environments: + - `venv`, available by default from the standard `Python` distribution from `Python 3.3+` - `virtualenv`, needs to be installed separately but supports both `Python 2.7+` and `Python 3.3+`versions - `pipenv`, created to fix certain shortcomings of `virtualenv` @@ -113,14 +110,16 @@ also recognised and picked up automatically by PyCharm IDE, as we will see in th Part of managing your (virtual) working environment involves installing, updating and removing external packages on your system. 
The Python package manager tool `pip` is most commonly used for this - it interacts - and obtains the packages from the central repository called [Python Package Index (PyPI)](https://pypi.org/). +and obtains the packages from the central repository called [Python Package Index (PyPI)](https://pypi.org/). `pip` can now be used with all Python distributions (including Anaconda). :::callout + ## A Note on Anaconda and `conda` + Anaconda is an open source Python -distribution commonly used for scientific programming - it conveniently installs Python, package and environment management `conda`, and a -number of commonly used scientific computing packages so you do not have to obtain them separately. +distribution commonly used for scientific programming - it conveniently installs Python, package and environment management `conda`, and a +number of commonly used scientific computing packages so you do not have to obtain them separately. `conda` is an independent command line tool (available separately from the Anaconda distribution too) with dual functionality: (1) it is a package manager that helps you find Python packages from remote package repositories and install them on your system, and (2) it is also a virtual environment manager. So, you can use `conda` for both tasks instead of using `venv` and `pip`. ::: @@ -134,7 +133,7 @@ Note that each Python distribution comes with its own version of `pip` - and if you have several Python versions installed you have to be extra careful to use the correct `pip` to manage external packages for that Python version. -`venv` and `pip` are considered the *de facto* standards for virtual environment and package management for Python 3. +`venv` and `pip` are considered the _de facto_ standards for virtual environment and package management for Python 3. However, the advantages of using Anaconda and `conda` are that you get (most of the) packages needed for scientific code development included with the distribution. 
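One practical way to sidestep the "which `pip`?" problem described above is to always go through a specific interpreter: on the command line, `python -m pip ...` guarantees that `pip` manages packages for exactly that Python. From inside Python, `sys.executable` shows which interpreter you would be pinning to. A minimal sketch:

```python
import sys

# The interpreter currently running; invoking
#   <this path> -m pip install <package>
# is guaranteed to install into this Python's environment.
print(sys.executable)
print(sys.version_info[:3])
```

This is especially useful when several Python distributions (system Python, Anaconda, a venv) coexist on the same machine.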
If you are only collaborating with others who are also using Anaconda, you may find that `conda` satisfies all your needs. It is good, however, to be aware of all these tools, @@ -147,11 +146,12 @@ too to which your knowledge can be ported). Let us have a look at how we can create and manage virtual environments from the command line using `venv` and manage packages using `pip`. ### Creating Virtual Environments Using `venv` + Creating a virtual environment with `venv` is done by executing the following command: -~~~bash +```bash python3 -m venv /path/to/new/virtual/environment -~~~ +``` where `/path/to/new/virtual/environment` is a path to a directory where you want to place it - conventionally within your software project so they are co-located. @@ -159,36 +159,38 @@ This will create the target directory for the virtual environment (and any paren For our project let's create a virtual environment called "venv". First, ensure you are within the project root directory, then: -~~~bash +```bash python3 -m venv venv -~~~ +``` -If you list the contents of the newly created directory "venv", on a Mac or Linux system +If you list the contents of the newly created directory "venv", on a Mac or Linux system (slightly different on Windows as explained below) you should see something like: -~~~bash +```bash ls -l venv -~~~ +``` -~~~ +```text total 8 drwxr-xr-x 12 alex staff 384 5 Oct 11:47 bin drwxr-xr-x 2 alex staff 64 5 Oct 11:47 include drwxr-xr-x 3 alex staff 96 5 Oct 11:47 lib -rw-r--r-- 1 alex staff 90 5 Oct 11:47 pyvenv.cfg -~~~ +``` So, running the `python3 -m venv venv` command created the target directory called "venv" containing: - `pyvenv.cfg` configuration file with a home key pointing to the Python installation from which the command was run, - `bin` subdirectory (called `Scripts` on Windows) containing a symlink of the Python interpreter binary used to create the -environment and the standard Python library, + environment and the standard Python library, - 
`lib/pythonX.Y/site-packages` subdirectory (called `Lib\site-packages` on Windows) to contain its own independent set of installed Python packages isolated from other projects, - various other configuration and supporting files and subdirectories. :::callout + ## Naming Virtual Environments + What is a good name to use for a virtual environment? Using "venv" or ".venv" as the name for an environment and storing it within the project's directory seems to be the recommended way - this way when you come across such a subdirectory within a software project, @@ -197,20 +199,22 @@ A slight downside is that all different virtual environments on your machine then use the same name and the current one is determined by the context of the path you are currently located in. A (non-conventional) alternative is to use your project name for the name of the virtual environment, with the downside that there is nothing to indicate -that such a directory contains a virtual environment. In our case, we have settled to use the name "venv" since it is -not a hidden directory and we want it to be displayed by the command line when listing directory contents (hence, +that such a directory contains a virtual environment. In our case, we have settled to use the name "venv" since it is +not a hidden directory and we want it to be displayed by the command line when listing directory contents (hence, no need for the "." in its name that would, by convention, make it hidden). In the future, you will decide what naming convention works best for you. 
Here are some references for each of the naming conventions: + - [The Hitchhiker's Guide to Python](https://docs.python-guide.org/dev/virtualenvs/) notes that "venv" is the general convention used globally - [The Python Documentation](https://docs.python.org/3/library/venv.html) indicates that ".venv" is common - ["venv" vs ".venv" discussion](https://discuss.python.org/t/trying-to-come-up-with-a-default-directory-name-for-virtual-environments/3750) + ::: Once you’ve created a virtual environment, you will need to activate it: -~~~bash +```bash source venv/bin/activate -~~~ +``` Activating the virtual environment will change your command line’s prompt to show what virtual environment you are currently using (indicated by its name in round brackets at the start of the prompt), @@ -218,65 +222,69 @@ and modify the environment so that running Python will get you the particular version of Python configured in your virtual environment. You can verify you are using your virtual environment's version of Python by checking the path using the command `which`: -~~~bash + +```bash which python3 -~~~ +``` -~~~ +```text /<your-current-directory>/venv/bin/python3 -~~~ +``` When you’re done working on your project, you can exit the environment with: -~~~bash +```bash deactivate -~~~ +``` If you've just done the `deactivate`, ensure you reactivate the environment ready for the next part: -~~~bash +```bash source venv/bin/activate -~~~ +``` :::callout + ## Python Within A Virtual Environment -Within a virtual environment, commands `python` and `pip` will refer to the version of Python you created the environment with. If you create a virtual environment with `python3 -m venv venv`, `python` will refer to `python3` and `pip` will refer to `pip3`. +Within a virtual environment, commands `python` and `pip` will refer to the version of Python you created the environment with. 
If you create a virtual environment with `python3 -m venv venv`, `python` will refer to `python3` and `pip` will refer to `pip3`. On some machines with Python 2 installed, `python` command may refer to the copy of Python 2 installed outside of the virtual environment instead, which can cause confusion. You can always check which version of Python you are using in your virtual environment with the command `which python` to be absolutely sure. We continue using `python3` and `pip3` in this material to avoid confusion for those users, but commands `python` and `pip` may work for you as expected. ::: -Note that, since our software project is being tracked by Git, the newly created virtual environment will show up +Note that, since our software project is being tracked by Git, the newly created virtual environment will show up in version control - we will see how to handle it using Git in one of the subsequent episodes. ### Installing External Packages Using `pip` -We noticed earlier that our code depends on two *external packages/libraries* - `numpy` and `matplotlib`. In order +We noticed earlier that our code depends on two _external packages/libraries_ - `numpy` and `matplotlib`. In order for the code to run on your machine, you need to install these two dependencies into your virtual environment. To install the latest version of a package with `pip` you use pip's `install` command and specify the package’s name, e.g.: -~~~bash +```bash pip3 install numpy pip3 install matplotlib -~~~ +``` or like this to install multiple packages at once for short: -~~~bash -$ pip3 install numpy matplotlib -~~~ +```bash +pip3 install numpy matplotlib +``` :::callout + ## How About `python3 -m pip install`? -Why are we not using `pip` as an argument to `python3` command, in the same way we did with `venv` -(i.e. `python3 -m venv`)? 
`python3 -m pip install` should be used according to the + +Why are we not using `pip` as an argument to `python3` command, in the same way we did with `venv` +(i.e. `python3 -m venv`)? `python3 -m pip install` should be used according to the [official Pip documentation](https://pip.pypa.io/en/stable/user_guide/#running-pip); other official documentation -still seems to have a mixture of usages. Core Python developer Brett Cannon offers a -[more detailed explanation](https://snarky.ca/why-you-should-use-python-m-pip/) of edge cases when the two options may produce -different results and recommends `python3 -m pip install`. We kept the old-style command (`pip3 install`) as it seems more -prevalent among developers at the moment - but it may be a convention that will soon change and certainly something you should consider. +still seems to have a mixture of usages. Core Python developer Brett Cannon offers a +[more detailed explanation](https://snarky.ca/why-you-should-use-python-m-pip/) of edge cases when the two options may produce +different results and recommends `python3 -m pip install`. We kept the old-style command (`pip3 install`) as it seems more +prevalent among developers at the moment - but it may be a convention that will soon change and certainly something you should consider. ::: If you run the `pip3 install` command on a package that is already installed, `pip` will notice this and do nothing. @@ -291,11 +299,11 @@ To upgrade a package to the latest version, e.g. `pip3 install --upgrade numpy`. To display information about a particular installed package do: -~~~bash +```bash pip3 show numpy -~~~ +``` -~~~ +```text Name: numpy Version: 1.21.2 Summary: NumPy is the fundamental package for array computing with Python. 
@@ -306,15 +314,15 @@ License: BSD Location: /Users/alex/work/SSI/Carpentries/python-intermediate-inflammation/inflammation/lib/python3.9/site-packages Requires: Required-by: matplotlib -~~~ +``` To list all packages installed with `pip` (in your current virtual environment): -~~~bash +```bash pip3 list -~~~ +``` -~~~ +```text Package Version --------------- ------- cycler 0.11.0 @@ -331,7 +339,7 @@ setuptools 57.0.0 setuptools-scm 6.3.2 six 1.16.0 tomli 1.2.2 -~~~ +``` To uninstall a package installed in the virtual environment do: `pip3 uninstall package-name`. You can also supply a list of packages to uninstall at the same time. @@ -347,12 +355,12 @@ To export your active environment - use `pip3 freeze` command to produce a list of packages installed in the virtual environment. A common convention is to put this list in a `requirements.txt` file: -~~~bash +```bash pip3 freeze > requirements.txt cat requirements.txt -~~~ +``` -~~~ +```text cycler==0.11.0 fonttools==4.28.1 kiwisolver==1.3.2 @@ -365,18 +373,18 @@ python-dateutil==2.8.2 setuptools-scm==6.3.2 six==1.16.0 tomli==1.2.2 -~~~ +``` The first of the above commands will create a `requirements.txt` file in your current directory. Yours may look a little different, depending on the version of the packages you have installed, as well as any differences in the packages that they themselves use. -The `requirements.txt` file can then be committed to a version control system (we will see how to do this using Git in +The `requirements.txt` file can then be committed to a version control system (we will see how to do this using Git in one of the following episodes) and get shipped as part of your software and shared with collaborators and/or users. They can then replicate your environment and install all the necessary packages from the project root as follows: -~~~bash +```bash pip3 install -r requirements.txt -~~~ +``` As your project grows - you may need to update your environment for a variety of reasons. 
For example, one of your project's dependencies has just released a new version (dependency version number update), you need an additional package for data analysis @@ -387,7 +395,9 @@ accordingly by re-issuing `pip freeze` command and propagate the updated `requir via your code sharing platform (e.g. GitHub). :::callout + ## Official Documentation + For a full list of options and commands, consult the [official `venv` documentation](https://docs.python.org/3/library/venv.html) and the [Installing Python Modules with `pip` guide](https://docs.python.org/3/installing/index.html#installing-index). Also check out the guide ["Installing packages using `pip` and virtual environments"](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/#installing-packages-using-pip-and-virtual-environments). ::: diff --git a/technology_and_tooling/snakemake/additional_features.md b/technology_and_tooling/snakemake/additional_features.md index 7d55dba7..36477188 100644 --- a/technology_and_tooling/snakemake/additional_features.md +++ b/technology_and_tooling/snakemake/additional_features.md @@ -2,22 +2,18 @@ name: "Additional features" teaching: 30 exercises: 30 -dependsOn: [ - technology_and_tooling.snakemake.advanced -] +dependsOn: [technology_and_tooling.snakemake.advanced] tags: [snakemake] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. 
- url: https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license - - +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. + url: https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html + image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg + license: MIT license --- # Additional features @@ -33,7 +29,7 @@ With the `benchmark` directive, Snakemake can be instructed to **measure the wall clock time of a job**. We activate benchmarking for the rule `bwa_map`: -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -71,7 +67,7 @@ workflows, it is sometimes reasonable to **split a workflow into modules**. For this, Snakemake provides the `include` directive to include another Snakefile into the current one, e.g.: -``` python +```yaml include: "path/to/other.snakefile" ``` @@ -107,7 +103,7 @@ software versions (e.g. combine Python 2 with Python 3). In our example, instead of using an external environment we can specify environments per rule, e.g.: -``` python +```snakemake rule samtools_index: input: "sorted_reads/{sample}.bam" @@ -121,7 +117,7 @@ rule samtools_index: with `envs/samtools.yaml` defined as -``` yaml +```yaml channels: - bioconda - conda-forge @@ -139,7 +135,7 @@ snakefile. When Snakemake is executed with -``` console +```console snakemake --use-conda --cores 1 ``` @@ -163,7 +159,7 @@ provides the `wrapper` directive that can be used instead of `shell`, `script`, or `run`. 
For example, the rule `bwa_map` could alternatively look like this: -``` python +```snakemake rule bwa_mem: input: ref="data/genome.fa", @@ -207,8 +203,8 @@ In cluster environments, compute jobs are usually submitted as shell scripts via commands like `qsub`. Snakemake provides a **generic mode** to execute on such clusters. By invoking Snakemake with -``` console -$ snakemake --cluster qsub --jobs 100 +```console +snakemake --cluster qsub --jobs 100 ``` each job will be compiled into a shell script that is submitted with the @@ -219,8 +215,8 @@ clusters allow to run the submission command in **synchronous mode**, such that it waits until the job has been executed. In such cases, we can invoke e.g. -``` console -$ snakemake --cluster-sync "qsub -sync yes" --jobs 100 +```console +snakemake --cluster-sync "qsub -sync yes" --jobs 100 ``` The specified submission command can also be **decorated with additional @@ -228,8 +224,8 @@ parameters taken from the submitted job**. For example, the number of used threads can be accessed in braces similarly to the formatting of shell commands, e.g. -``` console -$ snakemake --cluster "qsub -pe threaded {threads}" --jobs 100 +```console +snakemake --cluster "qsub -pe threaded {threads}" --jobs 100 ``` Alternatively, Snakemake can use the Distributed Resource Management @@ -237,8 +233,8 @@ Application API ([DRMAA](https://www.drmaa.org)). This API provides a common interface to control various resource management systems. The **DRMAA support** can be activated by invoking Snakemake as follows: -``` console -$ snakemake --drmaa --jobs 100 +```console +snakemake --drmaa --jobs 100 ``` If available, **DRMAA is preferable over the generic cluster modes** @@ -271,8 +267,7 @@ use that option. For sge this would look like The following (simplified) script detects the job status on a given SLURM cluster (\>= 14.03.0rc1 is required for `--parsable`). 
-``` python -#!/usr/bin/env python +```python import subprocess import sys @@ -282,18 +277,18 @@ output = str(subprocess.check_output("sacct -j %s --format State --noheader | he running_status=["PENDING", "CONFIGURING", "COMPLETING", "RUNNING", "SUSPENDED"] if "COMPLETED" in output: - print("success") + print("success") elif any(r in output for r in running_status): - print("running") + print("running") else: - print("failed") + print("failed") ``` To use this script call snakemake similar to below, where `status.py` is the script above. -``` console -$ snakemake all --jobs 100 --cluster "sbatch --cpus-per-task=1 --parsable" --cluster-status ./status.py +```console +snakemake all --jobs 100 --cluster "sbatch --cpus-per-task=1 --parsable" --cluster-status ./status.py ``` ## Using \--cluster-cancel @@ -339,17 +334,17 @@ Constraints may be defined per rule or globally using the `snakefiles-wildcards`{.interpreted-text role="ref"}. This mechanism helps to solve two kinds of ambiguity. -- It can help to avoid ambiguous rules, i.e. two or more rules that - can be applied to generate the same output file. Other ways of - handling ambiguous rules are described in the Section - `snakefiles-ambiguous-rules`{.interpreted-text role="ref"}. -- It can help to guide the regular expression based matching so that - wildcards are assigned to the right parts of a file name. Consider - the output file `{sample}.{group}.txt` and assume that the target - file is `A.1.normal.txt`. It is not clear whether `dataset="A.1"` - and `group="normal"` or `dataset="A"` and `group="1.normal"` is the - right assignment. Here, constraining the dataset wildcard by - `{sample,[A-Z]+}.{group}` solves the problem. +- It can help to avoid ambiguous rules, i.e. two or more rules that + can be applied to generate the same output file. Other ways of + handling ambiguous rules are described in the Section + `snakefiles-ambiguous-rules`{.interpreted-text role="ref"}. 
+- It can help to guide the regular expression based matching so that + wildcards are assigned to the right parts of a file name. Consider + the output file `{sample}.{group}.txt` and assume that the target + file is `A.1.normal.txt`. It is not clear whether `dataset="A.1"` + and `group="normal"` or `dataset="A"` and `group="1.normal"` is the + right assignment. Here, constraining the dataset wildcard by + `{sample,[A-Z]+}.{group}` solves the problem. When dealing with ambiguous rules, it is best practice to first try to solve the ambiguity by using a proper file structure, for example, by diff --git a/technology_and_tooling/snakemake/advanced.md b/technology_and_tooling/snakemake/advanced.md index 1c0619ac..4536291a 100644 --- a/technology_and_tooling/snakemake/advanced.md +++ b/technology_and_tooling/snakemake/advanced.md @@ -2,22 +2,18 @@ name: "Advanced: Decorating" teaching: 30 exercises: 30 -dependsOn: [ - technology_and_tooling.snakemake.basics -] +dependsOn: [technology_and_tooling.snakemake.basics] tags: [snakemake] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. - url: https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license - - +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. 
+ url: https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html
+ image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg
+ license: MIT license
---

# Advanced: Decorating the example workflow
@@ -32,7 +28,7 @@ speed up the computation.

**Snakemake can be made aware of the threads a rule needs** with the
`threads` directive. In our example workflow, it makes sense to use
multiple threads for the rule `bwa_map`:

-``` python
+```snakemake
rule bwa_map:
    input:
        "data/genome.fa",
@@ -55,8 +51,8 @@ does not exceed a given number of available CPU cores. This number is
given with the `--cores` command line argument, which is mandatory for
`snakemake` calls that actually run the workflow. For example

-``` console
-$ snakemake --cores 10
+```console
+snakemake --cores 10
```

:::callout
@@ -100,7 +96,7 @@ Config files can be written in [JSON](https://json.org) or
[YAML](https://yaml.org), and are used with the `configfile` directive.
In our example workflow, we add the line

-``` python
+```yaml
configfile: "config.yaml"
```

@@ -110,16 +106,16 @@ store its contents into a globally available named `config`.

In our case, it makes sense to specify the samples in `config.yaml` as

-``` yaml
+```yaml
samples:
-    A: data/samples/A.fastq
-    B: data/samples/B.fastq
+  A: data/samples/A.fastq
+  B: data/samples/B.fastq
```

Now, we can remove the statement defining `SAMPLES` from the Snakefile
and change the rule `bcftools_call` to

-``` python
+```snakemake
rule bcftools_call:
    input:
        fa="data/genome.fa",
@@ -140,13 +136,13 @@ different to the rule `bcftools_call` we modified above.

To understand this, it is important to know that Snakemake workflows are
executed in three phases.

-1. In the **initialization** phase, the files defining the workflow are
-   parsed and all rules are instantiated.
-2. In the **DAG** phase, the directed acyclic dependency graph of all
-   jobs is built by filling wildcards and matching input files to
-   output files.
-3.
In the **scheduling** phase, the DAG of jobs is executed, with jobs - started according to the available resources. +1. In the **initialization** phase, the files defining the workflow are + parsed and all rules are instantiated. +2. In the **DAG** phase, the directed acyclic dependency graph of all + jobs is built by filling wildcards and matching input files to + output files. +3. In the **scheduling** phase, the DAG of jobs is executed, with jobs + started according to the available resources. The expand functions in the list of input files of the rule `bcftools_call` are executed during the initialization phase. In this @@ -158,7 +154,7 @@ of input files to the DAG phase. This can be achieved by specifying an **input function** instead of a string as inside of the input directive. For the rule `bwa_map` this works as follows: -``` python +```snakemake def get_bwa_map_input_fastqs(wildcards): return config["samples"][wildcards.sample] @@ -181,7 +177,7 @@ files that are affected by such changes with `snakemake --list-input-changes`. To trigger a rerun, this bit of bash magic helps: -``` console +```console snakemake -n --forcerun $(snakemake --list-input-changes) ``` @@ -214,7 +210,7 @@ for rules with the `params` directive. In our workflow, it is reasonable to annotate aligned reads with so-called read groups, that contain metadata like the sample name. We modify the rule `bwa_map` accordingly: -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -263,7 +259,7 @@ defined via the `log` directive and handled similarly to output files, but they are not subject of rule matching and are not cleaned up when a job fails. We modify our rule `bwa_map` as follows: -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -295,20 +291,20 @@ avoid file name clashes between different jobs of the same rule. ::::challenge{id=add_logging title="Exercise"} -- Add a log directive to the `bcftools_call` rule as well. 
-- Time to re-run the whole workflow (remember the command line flags - to force re-execution). See how log files are created for variant - calling and read mapping. -- The ability to track the provenance of each generated result is an - important step towards reproducible analyses. Apart from the - `report` functionality discussed before, Snakemake can summarize - various provenance information for all output files of the workflow. - The flag `--summary` prints a table associating each output file - with the rule used to generate it, the creation date and optionally - the version of the tool used for creation is provided. Further, the - table informs about updated input files and changes to the source - code of the rule after creation of the output file. Invoke Snakemake - with `--summary` to examine the information for our example. +- Add a log directive to the `bcftools_call` rule as well. +- Time to re-run the whole workflow (remember the command line flags + to force re-execution). See how log files are created for variant + calling and read mapping. +- The ability to track the provenance of each generated result is an + important step towards reproducible analyses. Apart from the + `report` functionality discussed before, Snakemake can summarize + various provenance information for all output files of the workflow. + The flag `--summary` prints a table associating each output file + with the rule used to generate it, the creation date and optionally + the version of the tool used for creation is provided. Further, the + table informs about updated input files and changes to the source + code of the rule after creation of the output file. Invoke Snakemake + with `--summary` to examine the information for our example. :::: @@ -323,7 +319,7 @@ will delete the marked files for you, once all the consuming jobs (that need it as input) have been executed. 
We use this mechanism for the output file of the rule `bwa_map`: -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -347,7 +343,7 @@ reasonable to **protect** the final BAM file **from accidental deletion or modification**. We modify the rule `samtools_sort` to mark its output file as `protected`: -``` python +```snakemake rule samtools_sort: input: "mapped_reads/{sample}.bam" @@ -364,14 +360,14 @@ deleted by accident. ::::challenge{id=add_temporaries title="Exercise"} -- Re-execute the whole workflow and observe how Snakemake handles the - temporary and protected files. -- Run Snakemake with the target `mapped_reads/A.bam`. Although the - file is marked as temporary, you will see that Snakemake does not - delete it because it is specified as a target file. -- Try to re-execute the whole workflow again with the dry-run option. - You will see that it fails (as intended) because Snakemake cannot - overwrite the protected output files. +- Re-execute the whole workflow and observe how Snakemake handles the + temporary and protected files. +- Run Snakemake with the target `mapped_reads/A.bam`. Although the + file is marked as temporary, you will see that Snakemake does not + delete it because it is specified as a target file. +- Try to re-execute the whole workflow again with the dry-run option. + You will see that it fails (as intended) because Snakemake cannot + overwrite the protected output files. :::: @@ -380,10 +376,10 @@ deleted by accident. 
For this advanced part of the tutorial, we have now created a `config.yaml` configuration file: -``` yaml +```yaml samples: - A: data/samples/A.fastq - B: data/samples/B.fastq + A: data/samples/A.fastq + B: data/samples/B.fastq prior_mutation_rate: 0.001 ``` @@ -391,7 +387,7 @@ prior_mutation_rate: 0.001 With this, the final version of our workflow in the `Snakefile` looks like this: -``` python +```snakemake configfile: "config.yaml" diff --git a/technology_and_tooling/snakemake/basics.md b/technology_and_tooling/snakemake/basics.md index 9e4e2b55..214ee25b 100644 --- a/technology_and_tooling/snakemake/basics.md +++ b/technology_and_tooling/snakemake/basics.md @@ -2,20 +2,18 @@ name: "Basics: An example workflow" teaching: 30 exercises: 30 -dependsOn: [ - technology_and_tooling.snakemake.setup -] +dependsOn: [technology_and_tooling.snakemake.setup] tags: [snakemake] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. - url: https://snakemake.readthedocs.io/en/stable/tutorial/basics.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. 
+ url: https://snakemake.readthedocs.io/en/stable/tutorial/basics.html + image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg + license: MIT license --- # Basics: An example workflow @@ -81,7 +79,7 @@ reference genome (see `tutorial-background`). For this, we will use the tool [Atom](https://atom.io) editor, since it provides out-of-the-box syntax highlighting for Snakemake. In the Snakefile, define the following rule: -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -132,8 +130,8 @@ When a workflow is executed, Snakemake tries to generate given **target** files. Target files can be specified via the command line. By executing -``` console -$ snakemake -np mapped_reads/A.bam +```console +snakemake -np mapped_reads/A.bam ``` in the working directory containing the Snakefile, we tell Snakemake to @@ -151,8 +149,8 @@ where the edges represent dependencies. So far, we only have a single rule, and the DAG of jobs consists of a single node. Nevertheless, we can **execute our workflow** with -``` console -$ snakemake --cores 1 mapped_reads/A.bam +```console +snakemake --cores 1 mapped_reads/A.bam ``` Whenever executing a workflow, you need to specify the number of cores @@ -171,7 +169,7 @@ rules by using named wildcards**. Simply replace the `A` in the second input file and in the output file with the wildcard `{sample}`, leading to -``` python +```snakemake rule bwa_map: input: "data/genome.fa", @@ -202,8 +200,8 @@ wildcards**. When executing -``` console -$ snakemake -np mapped_reads/B.bam +```console +snakemake -np mapped_reads/B.bam ``` Snakemake will determine that the rule `bwa_map` can be applied to @@ -212,16 +210,16 @@ value `B`. In the output of the dry-run, you will see how the wildcard value is propagated to the input files and all filenames in the shell command. 
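Conceptually, each wildcard in an output pattern behaves like a named capture group in a regular expression (by default a wildcard matches the regex `.+`). The following standalone Python sketch is an illustration of that idea only, not Snakemake's internal API:

```python
import re

# Illustration: the output pattern "mapped_reads/{sample}.bam" behaves
# roughly like a regex in which each wildcard is a named group.
pattern = re.compile(r"mapped_reads/(?P<sample>.+)\.bam")

# Requesting the target file mapped_reads/B.bam pins the wildcard to "B".
match = pattern.fullmatch("mapped_reads/B.bam")
print(match.group("sample"))  # B
```

The captured value is what Snakemake substitutes for `{sample}` in the rule's input files and shell command.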
You can also **specify multiple targets**, for example: -``` console -$ snakemake -np mapped_reads/A.bam mapped_reads/B.bam +```console +snakemake -np mapped_reads/A.bam mapped_reads/B.bam ``` Some [Bash](https://www.tldp.org/LDP/Bash-Beginners-Guide/html) magic can make this particularly handy. For example, you can alternatively compose our multiple targets in a single pass via -``` console -$ snakemake -np mapped_reads/{A,B}.bam +```console +snakemake -np mapped_reads/{A,B}.bam ``` Note that this is not a special Snakemake syntax. @@ -237,15 +235,15 @@ the workflow before (see the previous step) and no input file is newer than the output file `mapped_reads/A.bam`. You can update the file modification date of the input file `data/samples/A.fastq` via -``` console -$ touch data/samples/A.fastq +```console +touch data/samples/A.fastq ``` and see how Snakemake wants to re-run the job to create the file `mapped_reads/A.bam` by executing -``` console -$ snakemake -np mapped_reads/A.bam mapped_reads/B.bam +```console +snakemake -np mapped_reads/A.bam mapped_reads/B.bam ``` ## Step 3: Sorting read alignments @@ -254,7 +252,7 @@ For later steps, we need the read alignments in the BAM files to be sorted. This can be achieved with the [samtools](https://www.htslib.org) `sort` command. We add the following rule beneath the `bwa_map` rule: -``` python +```snakemake rule samtools_sort: input: "mapped_reads/{sample}.bam" @@ -285,8 +283,8 @@ object that has an attribute with the value for each wildcard. When issuing -``` console -$ snakemake -np sorted_reads/B.bam +```console +snakemake -np sorted_reads/B.bam ``` you will see how Snakemake wants to run first the rule `bwa_map` and @@ -312,7 +310,7 @@ the sorted read alignments so that we can quickly access reads by the genomic location they were mapped to. 
This can be done with the following rule: -``` python +```snakemake rule samtools_index: input: "sorted_reads/{sample}.bam" @@ -325,8 +323,8 @@ rule samtools_index: Having three steps already, it is a good time to take a closer look at the resulting directed acyclic graph (DAG) of jobs. By executing -``` console -$ snakemake --dag sorted_reads/{A,B}.bam.bai | dot -Tsvg > dag.svg +```console +snakemake --dag sorted_reads/{A,B}.bam.bai | dot -Tsvg > dag.svg ``` :::callout @@ -371,7 +369,7 @@ calling, we will combine the two utilities function for collecting input files** that helps us to describe the aggregation in this step. With -``` python +```snakemake expand("sorted_reads/{sample}.bam", sample=SAMPLES) ``` @@ -379,21 +377,21 @@ we obtain a list of files where the given pattern `"sorted_reads/{sample}.bam"` was formatted with the values in a given list of samples `SAMPLES`, i.e. -``` python +```snakemake ["sorted_reads/A.bam", "sorted_reads/B.bam"] ``` The function is particularly useful when the pattern contains multiple wildcards. For example, -``` python +```sn akemake expand("sorted_reads/{sample}.{replicate}.bam", sample=SAMPLES, replicate=[0, 1]) ``` would create the product of all elements of `SAMPLES` and the list `[0, 1]`, yielding -``` python +```snakemake ["sorted_reads/A.0.bam", "sorted_reads/A.1.bam", "sorted_reads/B.0.bam", "sorted_reads/B.1.bam"] ``` @@ -406,7 +404,7 @@ principle Python code enhanced by some declarative statements to define workflows. Hence, we can define the list of samples ad-hoc in plain Python at the top of the Snakefile: -``` python +```yaml SAMPLES = ["A", "B"] ``` @@ -424,7 +422,7 @@ Later, we will learn about more sophisticated ways like **config files**. 
But for now, this is enough so that we can add the following rule to our Snakefile: -``` python +```snakemake rule bcftools_call: input: fa="data/genome.fa", @@ -467,7 +465,7 @@ write Python code inside a rule, it is usually reasonable to move such logic into separate scripts. For this purpose, Snakemake offers the `script` directive. Add the following rule to your Snakefile: -``` python +```snakemake rule plot_quals: input: "calls/all.vcf" @@ -496,11 +494,12 @@ like `input`, `output`, `wildcards`, etc. are available as attributes of a global `snakemake` object. Create the file `scripts/plot-quals.py`, with the following content: -``` python +```python import matplotlib matplotlib.use("Agg") import matplotlib.pyplot as plt from pysam import VariantFile +import snakemake quals = [record.qual for record in VariantFile(snakemake.input[0])] plt.hist(quals) @@ -549,7 +548,7 @@ as input files. Here, this means that we add a rule -``` python +```snakemake rule all: input: "plots/quals.svg" @@ -557,8 +556,8 @@ rule all: to the top of our workflow. When executing Snakemake with -``` console -$ snakemake -n +```console +snakemake -n ``` :::callout @@ -578,18 +577,18 @@ influence the DAG of jobs**. ::::challenge{id=dag_complete title="Exercise"} -- Create the DAG of jobs for the complete workflow. -- Execute the complete workflow and have a look at the resulting - `plots/quals.svg`. -- Snakemake provides handy flags for forcing re-execution of parts of - the workflow. Have a look at the command line help with - `snakemake --help` and search for the flag `--forcerun`. Then, use - this flag to re-execute the rule `samtools_sort` and see what - happens. -- Snakemake displays the reason for each job (under `reason:`). - Perform a dry-run that forces some rules to be reexecuted (using the - `--forcerun` flag in combination with some rulename) to understand - the decisions of Snakemake. +- Create the DAG of jobs for the complete workflow. 
+- Execute the complete workflow and have a look at the resulting + `plots/quals.svg`. +- Snakemake provides handy flags for forcing re-execution of parts of + the workflow. Have a look at the command line help with + `snakemake --help` and search for the flag `--forcerun`. Then, use + this flag to re-execute the rule `samtools_sort` and see what + happens. +- Snakemake displays the reason for each job (under `reason:`). + Perform a dry-run that forces some rules to be reexecuted (using the + `--forcerun` flag in combination with some rulename) to understand + the decisions of Snakemake. :::: @@ -597,7 +596,7 @@ influence the DAG of jobs**. In total, the resulting workflow looks like this: -``` python +```snakemake SAMPLES = ["A", "B"] diff --git a/technology_and_tooling/snakemake/index.md b/technology_and_tooling/snakemake/index.md index c964126d..99a5760f 100644 --- a/technology_and_tooling/snakemake/index.md +++ b/technology_and_tooling/snakemake/index.md @@ -1,29 +1,21 @@ --- name: "Snakemake Tutorial" id: snakemake -dependsOn: [ - technology_and_tooling.bash_shell -] -files: [ - setup.md, - basics.md, - advanced.md, - additional_features.md, - short.md, -] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. 
- url: https://snakemake.readthedocs.io/en/stable/tutorial/basics.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license +dependsOn: [technology_and_tooling.bash_shell] +files: [setup.md, basics.md, advanced.md, additional_features.md, short.md] +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. + url: https://snakemake.readthedocs.io/en/stable/tutorial/basics.html + image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg + license: MIT license summary: | - This tutorial introduces the text-based workflow system - Snakemake. + This tutorial introduces the text-based workflow system + Snakemake. --- This tutorial introduces the text-based workflow system diff --git a/technology_and_tooling/snakemake/setup.md b/technology_and_tooling/snakemake/setup.md index 9f635255..fc1e6a17 100644 --- a/technology_and_tooling/snakemake/setup.md +++ b/technology_and_tooling/snakemake/setup.md @@ -1,20 +1,17 @@ --- name: "Setup" -dependsOn: [ -] +dependsOn: [] tags: [snakemake] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. 
- url: https://snakemake.readthedocs.io/en/stable/tutorial/setup.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license - - +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. + url: https://snakemake.readthedocs.io/en/stable/tutorial/setup.html + image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg + license: MIT license --- # Setup @@ -104,7 +101,7 @@ you want to share with your Linux VM, for example, create a folder named `vagrant-linux` somewhere. Open a command line prompt, and change into that directory. Here, you create a 64-bit Ubuntu Linux environment with -``` console +```shell > vagrant init hashicorp/precise64 > vagrant up ``` @@ -114,7 +111,7 @@ If you decide to use a 32-bit image, you will need to download the `vagrant-linux` folder will be shared with the virtual machine that is set up by vagrant. You can log into the virtual machine via -``` console +```shell > vagrant ssh ``` @@ -129,48 +126,42 @@ First, please **open a terminal** or make sure you are logged into your Vagrant Linux VM. 
Assuming that you have a 64-bit system, on Linux, download and install Miniconda 3 with -``` console -$ curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh -o Mambaforge-Linux-x86_64.sh -$ bash Mambaforge-Linux-x86_64.sh +```shell +curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh -o Mambaforge-Linux-x86_64.sh +bash Mambaforge-Linux-x86_64.sh ``` On MacOS with x86_64 architecture, download and install with -``` console -$ curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-MacOSX-x86_64.sh -o Mambaforge-MacOSX-x86_64.sh -$ bash Mambaforge-MacOSX-x86_64.sh +```shell +curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-MacOSX-x86_64.sh -o Mambaforge-MacOSX-x86_64.sh +bash Mambaforge-MacOSX-x86_64.sh ``` On MacOS with ARM/M1 architecture, download and install with -``` console -$ curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-MacOSX-arm64.sh -o Mambaforge-MacOSX-arm64.sh -$ bash Mambaforge-MacOSX-arm64.sh +```shell +curl -L https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-MacOSX-arm64.sh -o Mambaforge-MacOSX-arm64.sh +bash Mambaforge-MacOSX-arm64.sh ``` When you are asked the question -``` +```text Do you wish the installer to prepend the install location to PATH ...? [yes|no] ``` -answer with **yes**. Along with a minimal Python 3 environment, -Mambaforge contains the package manager -[Mamba](https://github.com/mamba-org/mamba). After closing your current -terminal and opening a **new terminal**, you can use the new `conda` -command to install software packages and create isolated environments -to, for example, use different versions of the same package. We will -later use [Conda](https://conda.pydata.org) to create an isolated -environment with all the required software for this tutorial. +answer with **yes**. 
Along with a minimal Python 3 environment, Mambaforge contains the package manager [Mamba](https://github.com/mamba-org/mamba). +After closing your current terminal and opening a **new terminal**, you can use the new `conda` command to install software packages and create isolated environments to, for example, use different versions of the same package. +We will later use [Conda](https://conda.pydata.org) to create an isolated environment with all the required software for this tutorial. ## Step 2: Preparing a working directory -First, **create a new directory** `snakemake-tutorial` at a **place you -can easily remember** and change into that directory in your terminal: +First, **create a new directory** `snakemake-tutorial` at a **place you can easily remember** and change into that directory in your terminal: -``` console -$ mkdir snakemake-tutorial -$ cd snakemake-tutorial +```shell +mkdir snakemake-tutorial +cd snakemake-tutorial ``` If you use a Vagrant Linux VM from Windows as described above, create @@ -182,20 +173,20 @@ workflow that illustrates the Snakemake syntax and execution environment. First, we download some example data on which the workflow shall be executed: -``` console -$ curl -L https://api.github.com/repos/snakemake/snakemake-tutorial-data/tarball -o snakemake-tutorial-data.tar.gz +```shell +curl -L https://api.github.com/repos/snakemake/snakemake-tutorial-data/tarball -o snakemake-tutorial-data.tar.gz ``` Next we extract the data. 
On Linux, run -``` console -$ tar --wildcards -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml" +```shell +tar --wildcards -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml" ``` On MacOS, run -``` console -$ tar -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml" +```shell +tar -xf snakemake-tutorial-data.tar.gz --strip 1 "*/data" "*/environment.yaml" ``` This will create a folder `data` and a file `environment.yaml` in the @@ -205,16 +196,16 @@ working directory. First, make sure to activate the conda base environment with -``` console -$ conda activate base +```shell +conda activate base ``` The `environment.yaml` file that you have obtained with the previous step (Step 2) can be used to install all required software into an isolated Conda environment with the name `snakemake-tutorial` via -``` console -$ mamba env create --name snakemake-tutorial --file environment.yaml +```shell +mamba env create --name snakemake-tutorial --file environment.yaml ``` :::callout @@ -225,8 +216,8 @@ In this case we can force conda/mamba to create a virtual environment that corresponds to another chipset by prepending `CONDA_SUBDIR=osx-64` (for x64) or `CONDA_SUBDIR=osx-arm64` (for arm64) to the `mamba create` command, like so: -``` console -$ CONDA_SUBDIR=osx-64 mamba env create --name snakemake-tutorial --file environment.yaml +```shell +CONDA_SUBDIR=osx-64 mamba env create --name snakemake-tutorial --file environment.yaml ``` ::: @@ -238,8 +229,8 @@ can also first install [Mamba](https://github.com/mamba-org/mamba) (which is a faster and more robust replacement for [Conda](https://conda.pydata.org)) in your base environment with -``` console -$ conda install -n base -c conda-forge mamba +```shell +conda install -n base -c conda-forge mamba ``` and then run the `mamba env create` command shown above. @@ -248,14 +239,14 @@ and then run the `mamba env create` command shown above. 
To activate the `snakemake-tutorial` environment, execute -``` console -$ conda activate snakemake-tutorial +```shell +conda activate snakemake-tutorial ``` Now you can use the installed tools. Execute -``` console -$ snakemake --help +```shell +snakemake --help ``` to test this and get information about the command-line interface of @@ -264,7 +255,6 @@ Snakemake. To exit the environment, you can execute the following command (but **don\'t do that now**, since we finally want to start working with Snakemake :-)). -``` console -$ conda deactivate +```shell +conda deactivate ``` - diff --git a/technology_and_tooling/snakemake/short.md b/technology_and_tooling/snakemake/short.md index bf1159dc..d5485b8a 100644 --- a/technology_and_tooling/snakemake/short.md +++ b/technology_and_tooling/snakemake/short.md @@ -2,21 +2,18 @@ name: "Short tutorial" teaching: 30 exercises: 30 -dependsOn: [ -] +dependsOn: [] tags: [snakemake] -attribution: - - citation: > - Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, - C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., - Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. - Sustainable data analysis with Snakemake. F1000Res 10, 33. - Revision c7ae161c. - url: https://snakemake.readthedocs.io/en/stable/tutorial/short.html - image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg - license: MIT license - - +attribution: + - citation: > + Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, + C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., + Wilm, A., Holtgrewe, M., Rahmann, S., Nahnsen, S., Köster, J., 2021. + Sustainable data analysis with Snakemake. F1000Res 10, 33. + Revision c7ae161c. 
+ url: https://snakemake.readthedocs.io/en/stable/tutorial/short.html + image: https://raw.githubusercontent.com/snakemake/snakemake/main/snakemake/report/template/logo.svg + license: MIT license --- # Short tutorial @@ -47,22 +44,28 @@ Snakemake is sufficient for this demo. Second, download and unpack the test data needed for this example from [here](https://github.com/snakemake/snakemake-tutorial-data), e.g., via - mkdir snakemake-demo - cd snakemake-demo - wget https://github.com/snakemake/snakemake-tutorial-data/archive/v5.4.5.tar.gz - tar --wildcards -xf v5.4.5.tar.gz --strip 1 "*/data" +```bash +mkdir snakemake-demo +cd snakemake-demo +wget https://github.com/snakemake/snakemake-tutorial-data/archive/v5.4.5.tar.gz +tar --wildcards -xf v5.4.5.tar.gz --strip 1 "*/data" +``` ## Step 1 First, create an empty workflow in the current directory with: - mkdir workflow - touch workflow/Snakefile +```bash +mkdir workflow +touch workflow/Snakefile +``` Once a Snakefile is present, you can perform a dry run of Snakemake with: - snakemake -n +```bash +snakemake -n +``` Since the Snakefile is empty, it will report that nothing has to be done. In the next steps, we will gradually fill the Snakefile with an @@ -72,18 +75,20 @@ example analysis workflow. The data folder in your working directory looks as follows: - data - ├── genome.fa - ├── genome.fa.amb - ├── genome.fa.ann - ├── genome.fa.bwt - ├── genome.fa.fai - ├── genome.fa.pac - ├── genome.fa.sa - └── samples - ├── A.fastq - ├── B.fastq - └── C.fastq +```text +data +├── genome.fa +├── genome.fa.amb +├── genome.fa.ann +├── genome.fa.bwt +├── genome.fa.fai +├── genome.fa.pac +├── genome.fa.sa +└── samples + ├── A.fastq + ├── B.fastq + └── C.fastq +``` You will create a workflow that maps the sequencing samples in the `data/samples` folder to the reference genome `data/genome.fa`. Then, @@ -92,16 +97,16 @@ example plot. 
First, create a rule called `map_reads`, with input files -- `data/genome.fa` -- `data/samples/A.fastq` +- `data/genome.fa` +- `data/samples/A.fastq` and output file -- `results/mapped/A.bam` +- `results/mapped/A.bam` To generate output from input, use the shell command -``` python +```shell "bwa mem {input} | samtools view -Sb - > {output}" ``` @@ -114,7 +119,7 @@ that points to a [Conda environment definition](https://conda.io/docs/user-guide/tasks/manage-environments.html?highlight=environment#creating-an-environment-file-manually), with the following content -``` yaml +```yaml channels: - bioconda - conda-forge @@ -129,11 +134,15 @@ and execute the shell command within. Now, test your workflow by simulating the creation of the file `results/mapped/A.bam` via - snakemake --use-conda -n results/mapped/A.bam +```bash +snakemake --use-conda -n results/mapped/A.bam +``` to perform a dry-run and - snakemake --use-conda results/mapped/A.bam --cores 1 +```bash +snakemake --use-conda results/mapped/A.bam --cores 1 +``` to perform the actual execution. @@ -151,33 +160,39 @@ Test this by creating the file `results/mapped/B.bam`. Next, create a rule `sort_alignments` that sorts the obtained `.bam` file by genomic coordinate. The rule should have the input file -- `results/mapped/{sample}.bam` +- `results/mapped/{sample}.bam` and the output file -- `results/mapped/{sample}.sorted.bam` +- `results/mapped/{sample}.sorted.bam` and uses the shell command - samtools sort -o {output} {input} +```bash +samtools sort -o {output} {input} +``` to perform the sorting. Moreover, use the same `conda:` directive as for the previous rule. 
Test your workflow with

- snakemake --use-conda -n results/mapped/A.sorted.bam
+```bash
+snakemake --use-conda -n results/mapped/A.sorted.bam
+```

and

- snakemake --use-conda results/mapped/A.sorted.bam --cores 1
+```bash
+snakemake --use-conda results/mapped/A.sorted.bam --cores 1
+```

## Step 5

Now, we aggregate over all samples to perform a joint calling of
genomic variants. First, we define a variable

-``` python
+```python
samples = ["A", "B", "C"]
```

@@ -192,20 +207,22 @@ For aggregation over many files, Snakemake provides the helper function
docs](https://snakemake.readthedocs.io/en/stable/tutorial/basics.html#step-5-calling-genomic-variants)).
Create a rule `call` with input files

-- `fa="data/genome.fa"`
-- `bam=expand("results/mapped/{sample}.sorted.bam", sample=samples)`
+- `fa="data/genome.fa"`
+- `bam=expand("results/mapped/{sample}.sorted.bam", sample=samples)`

output file

-- `"results/calls/all.vcf"`
+- `"results/calls/all.vcf"`

and shell command

- bcftools mpileup -f {input.fa} {input.bam} | bcftools call -mv - > {output}
+```shell
+bcftools mpileup -f {input.fa} {input.bam} | bcftools call -mv - > {output}
+```

Further, define a new conda environment file with the following
content:

-``` yaml
+```yaml
channels:
  - bioconda
  - conda-forge
@@ -222,18 +239,17 @@ notebooks.
First, we create a rule `plot_quals` with input file

-- `"results/calls/all.vcf"`
+- `"results/calls/all.vcf"`

and output file

-- `"results/plots/quals.svg"`.
+- `"results/plots/quals.svg"`.

Instead of a shell command, we use Snakemake\'s Jupyter notebook
integration by specifying

-``` python
-notebook:
-    "notebooks/plot-quals.py"
+```snakemake
+notebook: "notebooks/plot-quals.py"
```

instead of using the `shell` directive as before.
@@ -242,7 +258,7 @@ Next, we have to define a conda environment for the rule, say
`workflow/envs/stats.yaml`, that provides the required Python packages
to execute the script:

-``` yaml
+```yaml
channels:
  - bioconda
  - conda-forge
@@ -256,7 +272,7 @@ dependencies:
Then, we let Snakemake generate a skeleton notebook for us with

-``` console
+```shell
snakemake --draft-notebook results/plots/quals.svg --cores 1 --use-conda
```

@@ -265,10 +281,11 @@ notebook.
We open the notebook in the editor and add the following content

-``` python
+```python
import pandas as pd
import altair as alt
from pysam import VariantFile
+# the `snakemake` object is injected into the notebook by Snakemake; do not import it

quals = pd.DataFrame({"qual": [record.qual for record in VariantFile(snakemake.input[0])]})
@@ -287,7 +304,9 @@ automatically inserts into the notebook before executing the rule.
Make sure to test your workflow with

- snakemake --use-conda --force results/plots/quals.svg --cores 1
+```bash
+snakemake --use-conda --force results/plots/quals.svg --cores 1
+```

Here, the force ensures that the readily drafted notebook is re-executed
even if you had already generated the output plot in the interactive
@@ -302,8 +321,8 @@ define default target files.
At the top of your `Snakefile` define a rule `all`, with input files

-- `"results/calls/all.vcf"`
-- `"results/plots/quals.svg"`
+- `"results/calls/all.vcf"`
+- `"results/plots/quals.svg"`

and neither a shell command nor output files. This rule simply serves as
an indicator of what shall be collected as results.
@@ -317,7 +336,9 @@ information.
Snakemake can automatically create HTML reports with

- snakemake --report report.html
+```bash
+snakemake --report report.html
+```

Such a report contains runtime statistics, a visualization of the
workflow topology, used software and data provenance information.
@@ -341,7 +362,9 @@ make Snakemake aware of this, such that the information can be used for
scheduling.
Add a directive `threads: 8` to the rule and alter the shell command to - bwa mem -t {threads} {input} | samtools view -Sb - > {output} +```bash +bwa mem -t {threads} {input} | samtools view -Sb - > {output} +``` This passes the threads defined in the rule as a command line argument to the `bwa` process. @@ -365,7 +388,7 @@ Only read this if you have a problem with one of the steps. The rule should look like this: -``` python +```snakemake rule map_reads: input: "data/genome.fa", @@ -386,7 +409,7 @@ rule map_reads: The rule should look like this: -``` python +```snakemake rule map_reads: input: "data/genome.fa", @@ -407,7 +430,7 @@ rule map_reads: The rule should look like this: -``` python +```snakemake rule sort_alignments: input: "results/mapped/{sample}.bam" @@ -427,7 +450,7 @@ rule sort_alignments: The rule should look like this: -``` python +```snakemake samples = ["A", "B", "C"] rule call_variants: @@ -450,7 +473,7 @@ rule call_variants: The rule should look like this: -``` python +```snakemake rule plot_quals: input: "results/calls/all.vcf" @@ -470,7 +493,7 @@ rule plot_quals: The rule should look like this: -``` python +```snakemake rule all: input: "results/calls/all.vcf", @@ -487,7 +510,7 @@ It has to appear as first rule in the `Snakefile`. The complete workflow should look like this: -``` python +```snakemake SAMPLES = ["A", "B", "C"] rule all: diff --git a/technology_and_tooling/testing/automated_testing.md b/technology_and_tooling/testing/automated_testing.md index 6a15fa66..1af9d4f0 100644 --- a/technology_and_tooling/testing/automated_testing.md +++ b/technology_and_tooling/testing/automated_testing.md @@ -1,24 +1,24 @@ --- name: Automated Testing id: automated_testing -dependsOn: [ -] +dependsOn: [] tags: [pytest] -attribution: - - citation: > - "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. 
https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Explain the reasons why testing is important. + - Describe the three main types of tests and what each are used for. + - Implement and run unit tests to verify the correct behaviour of program functions. +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- - ## Introduction Being able to demonstrate that a process generates the right results is @@ -42,7 +42,6 @@ techniques of automated testing to improve the predictability of a software change, make development more productive, and help us produce code that works as expected and produces desired results. - ## What Is Software Testing? For the sake of argument, if each line we write has a 99% chance of being right, @@ -72,31 +71,31 @@ tests too. We will be using a simple inflammation data analysis python package to demonstrate the use of automated testing. 
Let's download this now. First, create a new directory inflammation and `cd` to it: -~~~bash -$ mkdir inflammation -$ cd inflammation -~~~ +```bash +mkdir inflammation +cd inflammation +``` If on WSL or Linux (e.g. Ubuntu or the Ubuntu VM), then do: -~~~bash -$ wget https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip -~~~ +```bash +wget https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip +``` Or, if on a Mac, do: -~~~bash -$ curl -O https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip -~~~ +```bash +curl -O https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation.zip +``` Once done, you can unzip this file using the `unzip` command in Bash, which will unpack all the files in this zip archive into the current directory: -~~~bash -$ unzip inflammation.zip -~~~ +```bash +unzip inflammation.zip +``` -This will unpack the zip file's contents into the new `inflammation` directory. The file structure should look like this: +This will unpack the zip file's contents into the new `inflammation` directory. The file structure should look like this: ```text inflammation/ @@ -111,33 +110,36 @@ inflammation/ │   ├── test_patient.py │   └── test_models.py ├── .gitignore -├── LICENSE +├── LICENSE ├── README.md └── requirements.txt ``` The only files we'll be working with in this course are the following, so you can ignore the rest for now: + 1. `inflammation/models.py` - contains the functions we'll be testing 2. `tests/test_models.py` - contains the tests we'll be writing 3. `data/inflammation-*.csv` - contains the data we'll be using to test our functions :::callout + ## What Does the Patient Inflammation Data Contain? 
Each dataset records inflammation measurements from a separate clinical trial of the drug, and each dataset contains information for 60 patients, who had their inflammation levels recorded for 40 days whilst participating in the trial. ![Snapshot of the inflammation dataset](fig/inflammation-study-pipeline.png) -*Inflammation study pipeline from the [Software Carpentry Python novice lesson](https://swcarpentry.github.io/python-novice-inflammation/fig/lesson-overview.svg)* +_Inflammation study pipeline from the [Software Carpentry Python novice lesson](https://swcarpentry.github.io/python-novice-inflammation/fig/lesson-overview.svg)_ Each of the data files uses the popular [comma-separated (CSV) format](https://en.wikipedia.org/wiki/Comma-separated_values) to represent the data, where: - Each row holds inflammation measurements for a single patient, - Each column represents a successive day in the trial, - Each cell represents an inflammation reading on a given day for a patient (in some arbitrary units of inflammation measurement). + ::: The data is based on a clinical trial of inflammation in patients who have -been given a new treatment for arthritis. There are a number of datasets in the +been given a new treatment for arthritis. There are a number of datasets in the `data` directory recording inflammation information in patients (each file representing a different trial), and are each stored in comma-separated values (CSV) format: each row holds information for a single patient, and the columns @@ -149,25 +151,25 @@ activate it. 
Install the required dependencies (`numpy` and `matplotlib`) and then start the Python console by invoking the Python interpreter without any parameters, e.g.: -~~~bash -$ python3 -m venv venv -$ source venv/bin/activate -$ pip install numpy matplotlib -$ python -~~~ +```bash +python3 -m venv venv +source venv/bin/activate +pip install numpy matplotlib +python +``` The last command will start the Python console within your shell, which enables us to execute Python commands interactively. Inside the console enter the following: -~~~python +```python import numpy as np data = np.loadtxt(fname='data/inflammation-01.csv', delimiter=',') data.shape -~~~ +``` -~~~ +```text (60, 40) -~~~ +``` The data in this case is two-dimensional - it has 60 rows (one for each patient) and 40 columns (one for each day). Each cell in the data represents an @@ -179,7 +181,7 @@ for calculating the mean average, the maximum, and the minimum values for a given number of rows in our data. For example, the `daily_mean()` function looks like this: -~~~python +```python def daily_mean(data): """Calculate the daily mean of a 2D inflammation data array for each day. @@ -187,28 +189,28 @@ def daily_mean(data): :returns: An array of mean values of measurements for each day. """ return np.mean(data, axis=0) -~~~ +``` -Here, we use NumPy's `np.mean()` function to calculate the mean *vertically* +Here, we use NumPy's `np.mean()` function to calculate the mean _vertically_ across the data (denoted by `axis=0`), which is then returned from the function. So, if `data` was a NumPy array of three rows like... -~~~python +```python [[1, 2], [3, 4], [5, 6]] -~~~ +``` ...the function would return a 1D NumPy array of `[3, 4]` - each value representing the mean of each column (which are, coincidentally, the same values as the second row in the above data array). 
-To show this working with our patient data, we can use the function like this, passing the first four patient rows to the +To show this working with our patient data, we can use the function like this, passing the first four patient rows to the function in the Python console: -~~~python +```python from inflammation.models import daily_mean daily_mean(data[0:4]) -~~~ +``` Note we use a different form of `import` here - only importing the `daily_mean` function from our `models` instead of everything. This also has the effect that @@ -218,13 +220,13 @@ module name too (i.e. `inflammation.models.daily_mean()`). The above code will return the mean inflammation for each day column across the first four patients (as a 1D NumPy array of shape (40, 0)): -~~~ +```text array([ 0. , 0.5 , 1.5 , 1.75, 2.5 , 1.75, 3.75, 3. , 5.25, 6.25, 7. , 7. , 7. , 8. , 5.75, 7.75, 8.5 , 11. , 9.75, 10.25, 15. , 8.75, 9.75, 10. , 8. , 10.25, 8. , 5.5 , 8. , 6. , 5. , 4.75, 4.75, 4. , 3.25, 4. , 1.75, 2.25, 0.75, 0.75]) -~~~ +``` The other statistical functions are similar. Note that in real situations functions we write are often likely to be more complicated than these, but @@ -234,7 +236,6 @@ test - more easily. Let's now look into how we can test each of our application's statistical functions to ensure they are functioning correctly. - ## Writing Tests to Verify Correct Behaviour ### One Way to Do It? @@ -246,13 +247,13 @@ referring back to our simple `daily_mean()` example above, we could use `[[1, 2], [3, 4], [5, 6]]` as an input to that function and check whether the result equals `[3, 4]`: -~~~python +```python import numpy.testing as npt test_input = np.array([[1, 2], [3, 4], [5, 6]]) test_result = np.array([3, 4]) npt.assert_array_equal(daily_mean(test_input), test_result) -~~~ +``` So we use the `assert_array_equal()` function - part of NumPy's testing library - to test that our calculated result is the same as our expected result. 
This function explicitly checks the array's shape and elements are the same, and @@ -263,7 +264,7 @@ with NumPy arrays in all cases. We could then add to this with other tests that use and test against other values, and end up with something like: -~~~python +```python test_input = np.array([[2, 0], [4, 0]]) test_result = np.array([2, 0]) npt.assert_array_equal(daily_mean(test_input), test_result) @@ -275,11 +276,11 @@ npt.assert_array_equal(daily_mean(test_input), test_result) test_input = np.array([[1, 2], [3, 4], [5, 6]]) test_result = np.array([3, 4]) npt.assert_array_equal(daily_mean(test_input), test_result) -~~~ +``` However, if we were to enter these in this order, we'll find we get the following after the first test: -~~~ +```text ... AssertionError: Arrays are not equal @@ -289,7 +290,7 @@ Max absolute difference: 1. Max relative difference: 0.5 x: array([3., 0.]) y: array([2, 0]) -~~~ +``` This tells us that one element between our generated and expected arrays doesn't match, and shows us the different arrays. @@ -307,16 +308,16 @@ failed. Going back to our failed first test, what was the issue? As it turns out, the test itself was incorrect, and should have read: -~~~python +```python test_input = np.array([[2, 0], [4, 0]]) test_result = np.array([3, 0]) npt.assert_array_equal(daily_mean(test_input), test_result) -~~~ +``` Which highlights an important point: as well as making sure our code is returning correct answers, we also need to ensure the tests themselves are also correct. Otherwise, we may go on to fix our code only to return an incorrect -result that *appears* to be correct. So a good rule is to make tests simple +result that _appears_ to be correct. So a good rule is to make tests simple enough to understand so we can reason about both the correctness of our tests as well as our code. Otherwise, our tests hold little value. @@ -342,7 +343,7 @@ lose faith in it and stop using it. 
Look at `tests/test_models.py`: -~~~python +```python """Tests for statistics functions within the Model layer.""" import numpy as np @@ -374,7 +375,7 @@ def test_daily_mean_integers(): # Need to use NumPy testing functions to compare arrays npt.assert_array_equal(daily_mean(test_input), test_result) ... -~~~ +``` Here, although we have specified two of our previous manual tests as separate functions, they run the same assertions. Each of these test functions, in a @@ -397,6 +398,7 @@ be using Pytest to write unit tests, but what you learn can scale to more complex functional testing for applications or libraries. :::callout + ## What About Unit Testing in Other Languages? Other unit testing frameworks exist for Python, including Nose2 and Unittest, @@ -406,52 +408,51 @@ Catch for C++, etc. ::: - ### Installing Pytest You can install `pytest` using `pip` - exit the Python console first (either with `Ctrl-D` or by typing `exit()`), then do: -~~~bash -$ pip install pytest -~~~ +```bash +pip install pytest +``` Whether we do this via VsCode or the command line, the results are exactly the same: our virtual environment will now have the `pytest` package installed for use. - ### Running Tests Now we can run these tests using `pytest`: -~~~python -$ python -m pytest tests/test_models.py -~~~ +```bash +python -m pytest tests/test_models.py +``` Here, we use `-m` to invoke the `pytest` installed module, and specify the `tests/test_models.py` file to run the tests in that file -explicitly. +explicitly. :::callout + ## Why Run Pytest Using `python -m` and Not `pytest` ? -Another way to run `pytest` is via its own command, so we *could* try to use `pytest tests/test_models.py` on the +Another way to run `pytest` is via its own command, so we _could_ try to use `pytest tests/test_models.py` on the command line instead, but this would lead to a `ModuleNotFoundError: No module named 'inflammation'`. 
This is because using the `python -m pytest` method adds the current directory to its list of directories to search for modules, whilst using `pytest` does not - the `inflammation` subdirectory's contents are not 'seen', hence the `ModuleNotFoundError`. There are ways to get around this with [various methods](https://stackoverflow.com/questions/71297697/modulenotfounderror-when-running-a-simple-pytest), but we've used `python -m` for simplicity. ::: -~~~ +```text ============================================== test session starts ===================================================== platform darwin -- Python 3.9.6, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 rootdir: /Users/alex/python-intermediate-inflammation plugins: anyio-3.3.4 -collected 2 items - +collected 2 items + tests/test_models.py .. [100%] =============================================== 2 passed in 0.79s ====================================================== -~~~ +``` -Pytest looks for functions whose names also start with the letters 'test_' and +Pytest looks for functions whose names also start with the letters 'test\_' and runs each one. Notice the `..` after our test script: - If the function completes without an assertion being triggered, we count the test as a success (indicated as `.`). @@ -469,13 +470,13 @@ new test cases that test the `daily_max()` and `daily_min()` functions, adding them to `test/test_models.py`. Here are some hints: - You could choose to format your functions very similarly to `daily_mean()`, defining test input and expected result arrays followed by the equality assertion. -- Try to choose cases that are suitably different, and remember that these functions take a 2D array and return a 1D array with each element the result of analysing each *column* of the data. +- Try to choose cases that are suitably different, and remember that these functions take a 2D array and return a 1D array with each element the result of analysing each _column_ of the data. 
Once added, run all the tests again with `python -m pytest tests/test_models.py`, and you should also see your new tests pass. - :::solution -~~~python + +```python ... def test_daily_max(): """Test that max function works for an array of positive integers.""" @@ -500,11 +501,11 @@ def test_daily_min(): npt.assert_array_equal(daily_min(test_input), test_result) ... -~~~ +``` + ::: :::: - The big advantage is that as our code develops we can update our test cases and commit them back, ensuring that ourselves (and others) always have a set of tests to verify our code at each step of development. This way, when we @@ -518,7 +519,7 @@ There are some cases where seeing an error is actually the correct behaviour, and Python allows us to test for exceptions. Add this test in `tests/test_models.py`: -~~~python +```python import pytest ... def test_daily_min_string(): @@ -527,7 +528,7 @@ def test_daily_min_string(): with pytest.raises(TypeError): error_expected = daily_min([['Hello', 'there'], ['General', 'Kenobi']]) -~~~ +``` Note that you need to import the `pytest` library at the top of our `test_models.py` file with `import pytest` so that we can use `pytest`'s @@ -536,10 +537,11 @@ Note that you need to import the `pytest` library at the top of our Run all your tests as before. :::callout + ## Why Should We Test Invalid Input Data? Testing the behaviour of inputs, both valid and invalid, is a really good idea -and is known as *data validation*. Even if you are developing command line +and is known as _data validation_. 
Even if you are developing command line software that cannot be exploited by malicious data entry, testing behaviour against invalid inputs prevents generation of erroneous results that could lead to serious misinterpretation (as well as saving time and compute cycles which diff --git a/technology_and_tooling/testing/diagnosing_issues.md b/technology_and_tooling/testing/diagnosing_issues.md index d4f5aba7..b3618d21 100644 --- a/technology_and_tooling/testing/diagnosing_issues.md +++ b/technology_and_tooling/testing/diagnosing_issues.md @@ -1,22 +1,23 @@ --- name: Diagnosing Issues id: diagnosing_issues -dependsOn: [ - technology_and_tooling.testing.scaling_up -] +dependsOn: [technology_and_tooling.testing.scaling_up] tags: [pytest] -attribution: - - citation: > - "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Use a debugger to explore behaviour of a running program. + - Describe and identify edge and corner test cases and explain why they are important. + - Apply error handling and defensive programming techniques to improve robustness of a program. + - Integrate linting tool style checking into a continuous integration job. +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. 
https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Introduction @@ -24,38 +25,41 @@ attribution: Unit testing can tell us something is wrong in our code and give a rough idea of where the error is by which test(s) are failing. But it does not tell us exactly where the problem is (i.e. what line of code), or how it came about. To give us a better idea of what is going on, we can: - - output program state at various points, e.g. by using print statements to output the contents of -variables, + +- output program state at various points, e.g. by using print statements to output the contents of + variables, - use a logging capability to output the state of everything as the program progresses, or - look at intermediately generated files. But such approaches are often time consuming and sometimes not enough to fully -pinpoint the issue. In complex programs, like simulation codes, we often need -to get inside the code while it is running and explore. This is where using a +pinpoint the issue. In complex programs, like simulation codes, we often need +to get inside the code while it is running and explore. This is where using a **debugger** can be useful. ## Setting the Scene Let us add a new function called `patient_normalise()` to our inflammation example to normalise a given inflammation data array so that all entries fall between 0 and 1. 
-To normalise each patient's inflammation data we need to divide it by the maximum inflammation
+To normalise each patient's inflammation data we need to divide it by the maximum inflammation
 experienced by that patient. To do so, we can add the following code to
 `inflammation/models.py`:

-~~~python
+```python
+import numpy as np
+
 def patient_normalise(data):
     """Normalise patient data from a 2D inflammation data array."""
     max = np.max(data, axis=0)
     return data / max[:, np.newaxis]
-~~~
+```

-**Note:** *there are intentional mistakes in the above code, which will be
-detected by further testing and code style checking below so bear with us for the moment!*
+**Note:** _there are intentional mistakes in the above code, which will be
+detected by further testing and code style checking below so bear with us for the moment!_

 In the code above, we first go row by row and find the maximum inflammation value
 for each patient and store these values in a 1-dimensional NumPy array `max`.
 We then want to use NumPy's element-wise division, to divide each value in every
 row of inflammation data (belonging to the same patient) by the maximum
-value for that patient stored in the 1D array `max`. However, we cannot do that
+value for that patient stored in the 1D array `max`. However, we cannot do that
 division automatically as `data` is a 2D array (of shape `(60, 40)`) and `max`
 is a 1D array (of shape `(60, )`), which means that their shapes are not
 compatible.
@@ -78,10 +82,11 @@ performed.

 ![NumPy arrays' shapes after broadcasting](fig/numpy-shapes-after-broadcasting.png)

 :::callout
+
 ## Broadcasting

 The term broadcasting describes how NumPy treats arrays with different shapes
-during arithmetic operations. Subject to certain constraints, the smaller array
+during arithmetic operations. Subject to certain constraints, the smaller array
 is “broadcast” across the larger array so that they have compatible shapes.
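To make broadcasting concrete, here is a small self-contained sketch (the array is invented for illustration):

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])      # shape (2, 3)
row_max = np.max(data, axis=1)    # shape (2,): the maximum of each row

# row_max[:, np.newaxis] has shape (2, 1); broadcasting stretches it across
# the three columns, so each element is divided by its own row's maximum
normalised = data / row_max[:, np.newaxis]
print(normalised.shape)  # (2, 3)
```

Each row of `normalised` now ends in `1.0`, since every row's maximum divides to one.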
Be careful, though, to understand how the arrays get stretched to avoid
getting unexpected results.
@@ -96,7 +101,7 @@ inflammation on a particular day.

Let us now add a new test in `tests/test_models.py` to check that the
normalisation function is correct for some test data.

-~~~python
+```python
@pytest.mark.parametrize(
    "test, expected",
    [
@@ -106,12 +111,12 @@ def test_patient_normalise(test, expected):
    """Test normalisation works for arrays of one and positive integers.
       Assumption that test accuracy of two decimal places is sufficient."""
    from inflammation.models import patient_normalise
-    npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
-~~~
+    npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
+```

Note that we are using the `assert_almost_equal()` Numpy testing function
instead of `assert_array_equal()`, since it allows us to test against values
-that are *almost* equal. This is very useful when we have numbers with arbitrary
+that are _almost_ equal. This is very useful when we have numbers with arbitrary
decimal places and are only concerned with a certain degree of precision, like
the test case above, where we make the assumption that a test accuracy of two
decimal places is sufficient.

@@ -120,7 +125,7 @@ Run the tests again using `python -m pytest tests/test_models.py` and you will
note that the new test is failing, with an error message that does not give
many clues as to what went wrong.

-~~~
+```text
E       AssertionError:
E       Arrays are not almost equal to 2 decimals
E
@@ -135,7 +140,7 @@ E              [0.67, 0.83, 1.  ],
E              [0.78, 0.89, 1.  ]])

tests/test_models.py:53: AssertionError
-~~~
+```

Let us use a debugger at this point to see what is going on and why the
function failed.

@@ -149,26 +154,25 @@ performs its functions.

### Setup testing in VSCode

-First we will set up VSCode to run and debug our tests.
If you haven't done so already, -you will first need to enable the PyTest framework in VSCode. You can do this by -selecting the `Python: Configure Tests` command in the Command Palette (Ctrl+Shift+P). -This will then prompt you to select a test framework (`Pytest`), and a directory -containing the tests (`tests`). You should then see the Test view, shown as a beaker, in -the left hand activity sidebar. Select this and you should see the list of tests, along -with our new test `test_patient_normalise`. If you select this test you should see some -icons to the right that either run, debug or open the `test_patient_normalise` test. You +First we will set up VSCode to run and debug our tests. If you haven't done so already, +you will first need to enable the PyTest framework in VSCode. You can do this by +selecting the `Python: Configure Tests` command in the Command Palette (Ctrl+Shift+P). +This will then prompt you to select a test framework (`Pytest`), and a directory +containing the tests (`tests`). You should then see the Test view, shown as a beaker, in +the left hand activity sidebar. Select this and you should see the list of tests, along +with our new test `test_patient_normalise`. If you select this test you should see some +icons to the right that either run, debug or open the `test_patient_normalise` test. You can see what this looks like in the screenshot below. - ![Patient normalise tests in VSCode](fig/testsInVSCode.jpg) -Click on the "run" button next to `test_patient_normalise`, and you will be able to see -that VSCode runs the function, and the same `AssertionError` that we saw before. +Click on the "run" button next to `test_patient_normalise`, and you will be able to see +that VSCode runs the function, and the same `AssertionError` that we saw before. ### Running the Debugger Now we want to use the debugger to investigate what is happening inside the -`patient_normalise` function. To do this we will add a *breakpoint* in the code. 
+`patient_normalise` function. To do this we will add a _breakpoint_ in the code. A breakpoint will pause execution at that point allowing us to explore the state of the program. @@ -177,86 +181,84 @@ To set a breakpoint, navigate to the `models.py` file and move your mouse to the left of the line number for that line and a small red dot will appear, indicating that you have placed a breakpoint on that line. - -Now if you debug `test_patient_normalise`, you will notice that execution will be paused -at the return statement of `patient_normalise`, and we can investigate the exact state -of the program as it is executing this line of code. Navigate to the Run view, and you -will be able to see the local and global variables currently in memory, the call stack -(i.e. what functions are currently running), and the current list of breakpoints. In the -local variables section you will be able to see the `data` array that is input to the -`patient_normalise` function, as well as the `max` local array that was created to hold +Now if you debug `test_patient_normalise`, you will notice that execution will be paused +at the return statement of `patient_normalise`, and we can investigate the exact state +of the program as it is executing this line of code. Navigate to the Run view, and you +will be able to see the local and global variables currently in memory, the call stack +(i.e. what functions are currently running), and the current list of breakpoints. In the +local variables section you will be able to see the `data` array that is input to the +`patient_normalise` function, as well as the `max` local array that was created to hold the maximum inflammation values for each patient. See below for a screenshot. 
![Debugging function in VSCode](fig/debugInVSCode.jpg) -In the Watch section of the Run view you can write any expression you want the debugger -to calculate, this is useful if you want to view a particular combination of variables, -or perhaps a single element or slice of an array. Try putting in the expression `max[:, -np.newaxis]` into the Watch section, and you will be able to see the column vector that -we are dividing `data` by in the return line of the function. You can also open the +In the Watch section of the Run view you can write any expression you want the debugger +to calculate, this is useful if you want to view a particular combination of variables, +or perhaps a single element or slice of an array. Try putting in the expression `max[:, +np.newaxis]` into the Watch section, and you will be able to see the column vector that +we are dividing `data` by in the return line of the function. You can also open the Debug Console and type in `max[:, np.newaxis]` to see the same result. -Looking at the `max` variable, we can see that something looks wrong, as the maximum -values for each patient do not correspond to the `data` array. Recall that the input +Looking at the `max` variable, we can see that something looks wrong, as the maximum +values for each patient do not correspond to the `data` array. Recall that the input `data` array we are using for the function is -~~~python +```text [[1, 2, 3], [4, 5, 6], [7, 8, 9]] -~~~ - +``` -So the maximum inflammation for each patient should be `[3, 6, 9]`, whereas the debugger -shows `[7, 8, 9]`. You can see that the latter corresponds exactly to the last column of -`data`, and we can immediately conclude that we took the maximum along the wrong axis of -`data`. So to fix the function we can change `axis=0` in the first line to `axis=1`. 
-With this fix in place, running the tests again will result in a passing test, and a +So the maximum inflammation for each patient should be `[3, 6, 9]`, whereas the debugger +shows `[7, 8, 9]`. You can see that the latter corresponds exactly to the last column of +`data`, and we can immediately conclude that we took the maximum along the wrong axis of +`data`. So to fix the function we can change `axis=0` in the first line to `axis=1`. +With this fix in place, running the tests again will result in a passing test, and a nice green tick next to the test in the VSCode IDE. :::callout + ## NumPy Axis -Getting the axes right in NumPy is not trivial - the -[following tutorial](https://www.sharpsightlabs.com/blog/numpy-axes-explained/#:~:text=NumPy%20axes%20are%20the%20directions,along%20the%20rows%20and%20columns.) +Getting the axes right in NumPy is not trivial - the +[following tutorial](https://www.sharpsightlabs.com/blog/numpy-axes-explained/#:~:text=NumPy%20axes%20are%20the%20directions,along%20the%20rows%20and%20columns.) offers a good explanation on how axes work when applying NumPy functions to arrays. ::: - ## Corner or Edge Cases The test case that we have currently written for `patient_normalise` is parameterised with a fairly standard data array. However, when writing your test cases, it is important to consider parameterising them by unusual or extreme values, in order to test all the edge or corner cases that your code could be -exposed to in practice. Generally speaking, it is at these extreme cases that +exposed to in practice. Generally speaking, it is at these extreme cases that you will find your code failing, so it's beneficial to test them beforehand. What is considered an "edge case" for a given component depends on what that -component is meant to do. In the case of `patient_normalise` function, the goal -is to normalise a numeric array of numbers. For numerical values, extreme cases +component is meant to do. 
In the case of `patient_normalise` function, the goal +is to normalise a numeric array of numbers. For numerical values, extreme cases could be zeros, very large or small values, not-a-number (`NaN`) or infinity -values. Since we are specifically considering an *array* of values, an edge +values. Since we are specifically considering an _array_ of values, an edge case could be that all the numbers of the array are equal. For all the given edge cases you might come up with, you should also consider -their likelihood of occurrence. It is often too much effort to exhaustively +their likelihood of occurrence. It is often too much effort to exhaustively test a given function against every possible input, so you should prioritise edge cases that are likely to occur. For our `patient_normalise` function, some common edge cases might be the occurrence of zeros, and the case where all the values of the array are the same. When you are considering edge cases to test for, try also to think about what -might break your code. For `patient_normalise` we can see that there is a +might break your code. For `patient_normalise` we can see that there is a division by the maximum inflammation value for each patient, so this will clearly break if we are dividing by zero here, resulting in `NaN` values in the normalised array. With all this in mind, let us add a few edge cases to our parametrisation of -`test_patient_normalise`. We will add two extra tests, corresponding to an +`test_patient_normalise`. We will add two extra tests, corresponding to an input array of all 0, and an input array of all 1. -~~~python +```python nolint @pytest.mark.parametrize( "test, expected", [ @@ -264,11 +266,11 @@ input array of all 0, and an input array of all 1. 
([[1, 1, 1], [1, 1, 1], [1, 1, 1]], [[1, 1, 1], [1, 1, 1], [1, 1, 1]]),
        ([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[0.33, 0.67, 1], [0.67, 0.83, 1], [0.78, 0.89, 1]]),
    ])
-~~~
+```

Running the tests now from the command line results in the following assertion
error, due to the division by zero as we predicted.

-~~~
+```text
E           AssertionError:
E           Arrays are not almost equal to 2 decimals
E
@@ -281,7 +283,7 @@ E                  [0, 0, 0],
E                  [0, 0, 0]])

tests/test_models.py:88: AssertionError
-~~~
+```

How can we fix this? Luckily, there is a NumPy function that is useful here,
[`np.isnan()`](https://numpy.org/doc/stable/reference/generated/numpy.isnan.html),
@@ -289,8 +291,7 @@ which we can use to replace all the NaN's with our desired result, which is 0.
We can also silence the run-time warning using
[`np.errstate`](https://numpy.org/doc/stable/reference/generated/numpy.errstate.html):

-~~~python
-...
+```python
 def patient_normalise(data):
     """
     Normalise patient data from a 2D inflammation data array.
@@ -306,15 +307,15 @@ def patient_normalise(data):
     normalised[normalised < 0] = 0
     return normalised
 ...
-~~~
-
+```

::::challenge{id=edge-cases title="Exploring Tests for Edge Cases"}

Think of some more suitable edge cases to test our `patient_normalise()` function and add them to the parametrised tests. After you have finished remember to commit your changes.

:::solution
-~~~python
+
+```python
@pytest.mark.parametrize(
    "test, expected",
    [
@@ -346,9 +347,9 @@ def test_patient_normalise(test, expected):
     """Test normalisation works for arrays of one and positive integers."""
     from inflammation.models import patient_normalise
-    npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
+    npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
 ...
-~~~
+```

You could also, for example, test and handle the case of a whole row of NaNs.
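Such a case might look like the sketch below; the local `patient_normalise` here is an illustrative copy of the fixed function above (its exact final form may differ), with the `max` variable renamed to the hypothetical `max_data`:

```python
import numpy as np
import numpy.testing as npt

def patient_normalise(data):
    """Illustrative copy of the fixed normalisation function."""
    max_data = np.max(data, axis=1)
    with np.errstate(invalid='ignore', divide='ignore'):
        normalised = data / max_data[:, np.newaxis]
    normalised[np.isnan(normalised)] = 0
    normalised[normalised < 0] = 0
    return normalised

# a whole row of NaNs normalises to a row of zeros
npt.assert_almost_equal(
    patient_normalise(np.array([[np.nan, np.nan, np.nan], [1, 2, 3]])),
    np.array([[0, 0, 0], [0.33, 0.67, 1]]),
    decimal=2)
```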
@@ -368,31 +369,30 @@ This could have be handled differently. We might decide that we do not want to s
Checking that input to a function is valid via a set of preconditions is one of the simplest forms of **defensive programming** which is used as a way of avoiding potential errors.
Preconditions are checked at the beginning of the function to make sure that all assumptions are satisfied.
-These assumptions are often based on the *value* of the arguments, like we have already discussed.
+These assumptions are often based on the _value_ of the arguments, like we have already discussed.
However, in a dynamic language like Python one of the more common preconditions is to check that the arguments of a
-function are of the correct *type*. Currently there is nothing stopping someone from calling `patient_normalise`
+function are of the correct _type_. Currently there is nothing stopping someone from calling `patient_normalise`
with a string, a dictionary, or another object that is not an `ndarray`.

As an example, let us change the behaviour of the `patient_normalise()` function to raise an error on negative
inflammation values. Edit the `inflammation/models.py` file, and add a precondition check to the beginning of the `patient_normalise()` function like so:

-~~~python
-...
-    if np.any(data < 0):
-        raise ValueError('Inflammation values should not be negative')
+```python nolint
 ...
-~~~
+    if np.any(data < 0):
+        raise ValueError('Inflammation values should not be negative')
+```

We can then modify our test function in `tests/test_models.py` to check that the function raises the correct exception - a `ValueError` - when input to the test contains negative values (i.e. input case `[[-1, 2, 3], [4, 5, 6], [7, 8, 9]]`).
The [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError) exception is part of the standard Python library and is used to indicate that the function received an argument of the right type, but of an inappropriate value.
-~~~python
+```python
@pytest.mark.parametrize(
    "test, expected, expect_raises",
    [
-        ... # previous test cases here, with None for expect_raises, except for the next one - add ValueError
-        ... # as an expected exception (since it has a negative input value)
+        # previous test cases here, with None for expect_raises, except for the next one - add ValueError
+        # as an expected exception (since it has a negative input value)
        (
            [[-1, 2, 3], [4, 5, 6], [7, 8, 9]],
            [[0, 0.67, 1], [0.67, 0.83, 1], [0.78, 0.89, 1]],
@@ -409,10 +409,10 @@ def test_patient_normalise(test, expected, expect_raises):
    from inflammation.models import patient_normalise
    if expect_raises is not None:
        with pytest.raises(expect_raises):
-            npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
+            npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
    else:
-        npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
+        npt.assert_almost_equal(patient_normalise(np.array(test)), np.array(expected), decimal=2)
-~~~
+```

Be sure to commit your changes so far and push them to GitHub.

@@ -427,7 +427,7 @@ useful here, as well as the Python exception [`TypeError`](https://docs.python.o

In `inflammation/models.py`:

-~~~python
+```python
 ...
 def patient_normalise(data):
     """
@@ -451,11 +451,11 @@ def patient_normalise(data):
     normalised[np.isnan(normalised)] = 0
     return normalised
 ...
-~~~
+```

In `test/test_models.py`:

-~~~python
+```python
 ...
@pytest.mark.parametrize(
    "test, expected, expect_raises",
@@ -484,13 +484,13 @@ def test_patient_normalise(test, expected, expect_raises):
     test = np.array(test)
     if expect_raises is not None:
         with pytest.raises(expect_raises):
-            npt.assert_almost_equal(patient_normalise(test), np.array(expected), decimal=2)
+            npt.assert_almost_equal(patient_normalise(test), np.array(expected), decimal=2)
     else:
-        npt.assert_almost_equal(patient_normalise(test), np.array(expected), decimal=2)
+        npt.assert_almost_equal(patient_normalise(test), np.array(expected), decimal=2)
 ...
-~~~
+```

-Note the conversion from `list` to `np.array` has been moved out of the call to `npt.assert_almost_equal()` within the test function, and is now only applied to list items (rather than all items). This allows for greater flexibility with our test inputs, since this wouldn't work in the test case that uses a string.
+Note the conversion from `list` to `np.array` has been moved out of the call to `npt.assert_almost_equal()` within the test function, and is now only applied to list items (rather than all items). This allows for greater flexibility with our test inputs, since this wouldn't work in the test case that uses a string.
:::
::::

@@ -504,7 +504,7 @@ You can also decide against adding explicit preconditions in your code, and inst
limitations of your code for users of your code in the docstring and rely on
them to invoke your code correctly. This approach is useful when explicitly
checking the precondition is too costly.

-## Improving Robustness with Automated Code Style Checks
+## Improving Robustness with Automated Code Style Checks

Linters are tools that analyze source code to detect and report errors,
inconsistencies, and stylistic issues.
They are widely used in software @@ -520,9 +520,9 @@ pylint --version We should see the version of Pylint, something like: -~~~ +```text pylint 2.13.3 -~~~ +``` Pylint is a command-line tool that can help our code in many ways: @@ -534,10 +534,11 @@ Pylint is a command-line tool that can help our code in many ways: Pylint can also identify **code smells**. :::callout + ## How Does Code Smell? There are many ways that code can exhibit bad design whilst not breaking any -rules and working correctly. A *code smell* is a characteristic that indicates +rules and working correctly. A _code smell_ is a characteristic that indicates that there is an underlying problem with source code, e.g. large classes or methods, methods with too many parameters, duplicated statements in both if and else blocks of conditionals, etc. They aren't functional errors in the code, but @@ -545,26 +546,25 @@ rather are certain structures that violate principles of good design and impact design quality. They can also indicate that code is in need of maintenance and refactoring. -The phrase has its origins in Chapter 3 "Bad smells in code" by Kent Beck and Martin Fowler in +The phrase has its origins in Chapter 3 "Bad smells in code" by Kent Beck and Martin Fowler in [Fowler, Martin (1999). Refactoring. Improving the Design of Existing Code. Addison-Wesley. ISBN 0-201-48567-2](https://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672/). ::: - Let's run Pylint over our project after having added some more code to it. From the project root do: -~~~bash -$ pylint inflammation -~~~ +```bash +pylint inflammation +``` You may see something like the following in Pylint's output: -~~~bash +```bash ************* Module inflammation.models ... inflammation/models.py:60:4: W0622: Redefining built-in 'max' (redefined-builtin) ... 
-~~~ +``` The above output indicates that by using the local variable called `max` in the `patient_normalise` function, we have redefined a built-in Python function @@ -589,4 +589,4 @@ rerun your tests. - A debugger allows us to pause code execution and examine its state by adding **breakpoints** to lines in code. - Use **preconditions** to ensure correct behaviour of code. - Ensure that unit tests check for **edge** and **corner cases** too. -- Using **linting** tools to automatically flag suspicious programming language constructs and stylistic errors can help improve code robustness. \ No newline at end of file +- Using **linting** tools to automatically flag suspicious programming language constructs and stylistic errors can help improve code robustness. diff --git a/technology_and_tooling/testing/index.md b/technology_and_tooling/testing/index.md index d069d280..6c319c53 100644 --- a/technology_and_tooling/testing/index.md +++ b/technology_and_tooling/testing/index.md @@ -1,31 +1,21 @@ --- name: Testing id: testing -dependsOn: [ - technology_and_tooling.ide, - software_architecture_and_design.procedural -] -files: [ - automated_testing.md, - diagnosing_issues.md, - scaling_up.md, - testable_code_fixtures.md, - mocking.md -] -attribution: - - citation: > - "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo.
https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +dependsOn: [technology_and_tooling.ide, software_architecture_and_design.procedural] +files: [automated_testing.md, diagnosing_issues.md, scaling_up.md, testable_code_fixtures.md, mocking.md] +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 summary: | - This course introduces the basics of automated testing and debugging in Python, using the Pytest framework. + This course introduces the basics of automated testing and debugging in Python, using the Pytest framework. --- In this section, we'll look at testing approaches that can help us ensure that the software we write is behaving as intended, and how we can diagnose and fix issues once faults are found. Using such approaches requires us to change our practice of development.
This can take time, but potentially saves us considerable time in the medium to long term by allowing us to more comprehensively and rapidly find such faults, as well as giving us greater confidence in the correctness of our code - so we should try and employ such practices early on. We will also make use of techniques and infrastructure that allow us to do this in a scalable, automated and more performant way as our codebase grows. diff --git a/technology_and_tooling/testing/mocking.md b/technology_and_tooling/testing/mocking.md index 9ac7e061..e5e0dd6d 100644 --- a/technology_and_tooling/testing/mocking.md +++ b/technology_and_tooling/testing/mocking.md @@ -1,16 +1,13 @@ --- name: Using Mocks in Tests id: mocking -dependsOn: [ - technology_and_tooling.testing.testable_code_fixtures -] +dependsOn: [technology_and_tooling.testing.testable_code_fixtures] tags: [pytest] -attribution: - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - +attribution: + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Why use mocking? @@ -19,9 +16,9 @@ Sometimes we may not want to use "real" objects or functions in our tests, such ### Using the `unittest.mock` library -Let us continue with previous example from the [testable code and fixtures section](testable_code_fixtures), where we were testing functions to connect to and query a SQLite database. We will be testing the functionality of code in the `sqlite_example.py` file that we created previously. 
This time, instead of actually connecting to the database, we can mock an object to replace it with one that we can control and monitor. We will need to import a library in order to create our mocks. Rather than `pytest`, another library `unittest`, which is the testing library that comes as standard with Python, will be used. We can use the `unittest.mock.Mock` class to create a mock object. As a simple example, we can replace our `query_database` function with this `Mock` object. Then we are able to replace the value returned from `query_database` with whatever we want. Here is the contents of a new file `test_mocks.py`. +Let us continue with the previous example from the [testable code and fixtures section](testable_code_fixtures), where we were testing functions to connect to and query a SQLite database. We will be testing the functionality of code in the `sqlite_example.py` file that we created previously. This time, instead of actually connecting to the database, we can mock an object to replace it with one that we can control and monitor. We will need to import a library in order to create our mocks. Rather than `pytest`, another library `unittest`, which is the testing library that comes as standard with Python, will be used. We can use the `unittest.mock.Mock` class to create a mock object. As a simple example, we can replace our `query_database` function with this `Mock` object. Then we are able to replace the value returned from `query_database` with whatever we want. Here is the contents of a new file `test_mocks.py`.
-~~~python +```python import pytest import sqlite3 from pathlib import Path @@ -46,19 +43,20 @@ def test_query_database_mock(database_fn_fixture): assert query_database(sql, connection=conn) == ("Jerry", "Mouse", 1) query_database.assert_called_once_with(sql, connection=conn) -~~~ +``` In the example above, we do not require a database connection, a database file, or to perform any query on a database at all, since we have replaced the entire `query_database` function. The test is not especially useful, however, since we are now simply testing that the `Mock` object returns the value that we asked it to return. Note that we also test that the function was called with the correct arguments (although in this case we could call `query_database` with any arguments we liked since it is actually a `Mock` object). :::callout{variant="note"} + ## The difference between `Mock` and `MagicMock` -In the examples in this lesson, we will use the `Mock` object from the `unittest` library. When looking elsewhere for information you may find examples that use the `MagicMock` object. The difference between the two is that `MagicMock` objects have default implementations of Python "magic" methods. These are also sometimes referred to as *dunder methods* (double underscore methods), officially however, they are known as [*special methods*](https://docs.python.org/3/reference/datamodel.html#specialnames). Since we will not be relying on any of these methods for our examples, we will stick with the more simple object that does not risk bringing any unexpected behaviour to our mocks. +In the examples in this lesson, we will use the `Mock` object from the `unittest` library. When looking elsewhere for information you may find examples that use the `MagicMock` object. The difference between the two is that `MagicMock` objects have default implementations of Python "magic" methods.
These are also sometimes referred to as _dunder methods_ (double underscore methods); officially, however, they are known as [_special methods_](https://docs.python.org/3/reference/datamodel.html#specialnames). Since we will not be relying on any of these methods for our examples, we will stick with the simpler object that does not risk bringing any unexpected behaviour to our mocks. ::: For a more useful (and interesting) example, we could mock the `sqlite3` connection itself. Once we have done this, we will also need to add the cursor that is associated with the connection to the mocked connection and add a return value for the `cursor.fetchall()` method that we call. The example below shows how we might do this: -~~~python +```python import pytest import sqlite3 from sqlite_example import query_database @@ -85,13 +83,13 @@ def test_query_db_mocked_connection(): # check that query_database closes the connection conn.close.assert_called_once() -~~~ +``` -#### Patching functions +### Patching functions -If we add the test above to our `test_mocks.py` file and run `python -m pytest tests/test_mocks.py ` we find that the tests pass. If we run this file along with the `test_sqlite.py` file that we created in the [previous lesson](testable_code_fixtures), however, we may find that we start to get test failures with errors similar to this: +If we add the test above to our `test_mocks.py` file and run `python -m pytest tests/test_mocks.py` we find that the tests pass.
If we run this file along with the `test_sqlite.py` file that we created in the [previous lesson](testable_code_fixtures), however, we may find that we start to get test failures with errors similar to this: -~~~bash +```bash def test_connect_to_db_type(database_fn_fixture): """ Test that connect_to_db function returns sqlite3.Connection @@ -117,20 +115,20 @@ tests/test_sqlite.py:34: AssertionError E TypeError: cannot unpack non-iterable Mock object tests/test_sqlite.py:68: TypeError -~~~ +``` We can see that in the first case, the test is failing because `assert isinstance(conn, sqlite3.Connection)` is actually receiving a `Mock` object instead of an `sqlite3.Connection` object. In the second case, a `Mock` object is received instead of the `tuple` we would expect from the `cursor.fetchone()` function, so we get an error when trying to unpack it. -It appears that our mocked `sqlite.connection` has created issues in other test functions where we did not intend to use it. To overcome this behaviour, we will need to use a *patch* which will only affect the scope of the function. There are two ways of using a patch, a *context manager* or a *function decorator*. +It appears that our mocked `sqlite.connection` has created issues in other test functions where we did not intend to use it. To overcome this behaviour, we will need to use a _patch_ which will only affect the scope of the function. There are two ways of using a patch, a _context manager_ or a _function decorator_. ::::challenge{id=patch-context-manager title="Using the `with patch` context manager."} Rewrite `test_query_db_mocked_connection` to use a context manager. You can view the [unittest documentation](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.patch) as a guide. :::solution -Below is an example of using a context manager to patch the test. When we use a patch we are actually receiving a `Mock` object that behaves in exactly the same way as in the previous examples. 
First we `import patch` from the `unitest.mock` library, then we create a patch called `mock_connection` which only exists within the context of the `with` statement. After this statement, the context will be cleaned up automatically. +Below is an example of using a context manager to patch the test. When we use a patch we are actually receiving a `Mock` object that behaves in exactly the same way as in the previous examples. First we `import patch` from the `unittest.mock` library, then we create a patch called `mock_connection` which only exists within the context of the `with` statement. After this statement, the context will be cleaned up automatically. -~~~python +```python import pytest import sqlite3 from sqlite_example import query_database @@ -155,7 +153,7 @@ def test_query_db_mocked_connection(): mock_cursor.fetchall.assert_called_once() # check that query_database closes the connection conn.close.assert_called_once() -~~~ +``` ::: :::: @@ -167,7 +165,8 @@ Now lets look at using a function decorator. Rewrite `test_query_db_mocked_connection` to use a function decorator instead of a context manager. You can view the [unittest documentation](https://docs.python.org/3/library/unittest.mock.html#unittest.mock.patch) as a guide. :::solution -~~~python + +```python @patch('sqlite3.connect') def test_query_db_mocked_connection(mock_connection): mock_cursor = mock_connection.return_value.cursor.return_value @@ -184,7 +183,7 @@ def test_query_db_mocked_connection(mock_connection): mock_cursor.fetchall.assert_called_once() # check that query_database closes the connection conn.close.assert_called_once() -~~~ +``` ::: :::: @@ -193,7 +192,7 @@ def test_query_db_mocked_connection(mock_connection): #### Monkeypatching in pytest -As an alternative to using the `unitest.mock` library, its possible to use a version of mocking from within `pytest`, termed *monkeypatching*.
This may be a simpler alternative in cases when a full mock to replace an object is not required, such as when you wish to just replace a single method or attribute. A built-in fixture called `monkeypatch` allows modifying attributes, functions or classes within the scope of the test function. Some example methods are: +As an alternative to using the `unittest.mock` library, it's possible to use a version of mocking from within `pytest`, termed _monkeypatching_. This may be a simpler alternative in cases when a full mock to replace an object is not required, such as when you wish to just replace a single method or attribute. A built-in fixture called `monkeypatch` allows modifying attributes, functions or classes within the scope of the test function. Some example methods are: - `monkeypatch.setattr()` - used to set an attribute to a new value or replace it with a new function - `monkeypatch.delattr()` - used to delete an attribute @@ -207,7 +206,8 @@ It makes sense to use both the pytest `monkeypatch` fixture and `unittest.mock` Rewrite `test_query_db_mocked_connection` to use the pytest `monkeypatch` fixture alongside `unittest.mock`. You can view the [pytest monkeypatch documentation](https://docs.pytest.org/en/7.1.x/how-to/monkeypatch.html) if needed. :::solution -~~~python + +```python import pytest import sqlite3 from sqlite_example import query_database @@ -241,42 +241,43 @@ def test_query_db_mocked_connection(monkeypatch): mock_cursor.fetchall.assert_called_once() # check that query_database closes the connection conn.close.assert_called_once() - -~~~ + +``` ::: :::: :::callout + #### Using the `mocker` fixture from `pytest-mock` An alternative to using the `unittest.mock` library is to install `pytest-mock` alongside `pytest`. This will give you access to a fixture called `mocker`. This fixture provides access to `unittest.mock.patch` functionalities as well as mocks. There is no need to `import unittest` and no `monkeypatch` functions are required.
For more information see the [pytest-mock documentation](https://pytest-mock.readthedocs.io/en/latest/index.html). ::: -Well done for making it this far, mocking is often a confusing subject due to the many ways in which it can be done and the abstract nature of temporarily replacing parts of the thing you are testing. After this introduction, you can now solidify your learning by practicing the techniques here on your own code whilst using the documentation as a reference. +Well done for making it this far! Mocking is often a confusing subject due to the many ways in which it can be done and the abstract nature of temporarily replacing parts of the thing you are testing. After this introduction, you can now solidify your learning by practising the techniques here on your own code whilst using the documentation as a reference. ## Putting it all together - adding a database as a data source -Finally, we can come back to our `Trial` object and integrate the functions to connect to and query an SQLite database. We have provided a file `inflammation_data.db` that contains all of the data from the 12 csv files in one table called `data`. +Finally, we can come back to our `Trial` object and integrate the functions to connect to and query an SQLite database. We have provided a file `inflammation_data.db` that contains all of the data from the 12 csv files in one table called `data`. To get this file, if on WSL or Linux (e.g.
Ubuntu or the Ubuntu VM), then do: -~~~bash -$ wget https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation_data.db -~~~ +```bash +wget https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation_data.db +``` Or, if on a Mac, do: -~~~bash -$ curl -O https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation_data.db -~~~ +```bash +curl -O https://train.oxrse.uk/material/HPCu/software_architecture_and_design/procedural/inflammation/inflammation_data.db +``` -Save the file into the `inflammation/data` directory of your project. +Save the file into the `inflammation/data` directory of your project. The `data` table has 43 columns, `patient_id`, `trial_id`, `filename` and `day01` to `day40` that record the number of inflammation flare-ups for these days. The `patient_id` field is in the form of `pxx` where patient 1 is `p01`, for `trial_id` the format is `txx` where trial 1 is `t01`. Now we can add a new method `from_database` to our class: -~~~python +```python import numpy as np from sqlite_example import connect_to_database, query_database @@ -299,7 +300,7 @@ class Trial: """ data = cls.load_csv(filename) return cls(data, id) - + @classmethod def from_database(cls, db_filepath, trial_id): """ @@ -333,15 +334,15 @@ class Trial: return np.loadtxt(fname=filename, delimiter=',') ... -~~~ +``` Using our new method, an instance of the `Trial` class can now be created in the following way: -~~~python +```python from inflammation.models import Trial trial_group01 = Trial.from_database("inflammation_data.db", "t01") -~~~ +``` Our existing tests for the statistical methods from the `Trial` object do not need to be altered even if the underlying data storage has changed, as long as the data is loaded into a numpy array of the same format as we had previously. @@ -352,7 +353,7 @@ Write some more tests for the `Trial` class. 
These should check that the data lo :::solution Here we give some example tests using the `mocker` fixture from `pytest-mock` as well as a real test database. Your tests do not need to be identical to these ones. At this stage, you should know that testing the functionality can be done in a number of ways! -~~~python +```python import numpy as np import pytest from inflammation.models import Trial @@ -398,7 +399,7 @@ def setup_database(database_connection): ('p05', 't03', 'filename4', 3, 2) ''') conn.commit() - yield conn + yield conn cur.execute("DROP TABLE data") def test_trial_from_database(database_fn_fixture, setup_database): @@ -406,7 +407,7 @@ trial = Trial.from_database(database_fn_fixture, "t02") assert isinstance(trial.data, np.ndarray) # Check that the data attribute is correct (the first three columns should be skipped) - npt.assert_array_equal(trial.data, np.array([[4, 5], [2, 1]])) + npt.assert_array_equal(trial.data, np.array([[4, 5], [2, 1]])) assert trial.id == "t02" def test_trial_from_database_no_data(database_fn_fixture, setup_database): @@ -436,10 +437,10 @@ def test_trial_from_mock_database(mocker): trial = Trial.from_database('test_db.db', 1) assert isinstance(trial.data, np.ndarray) # Check that the data attribute is correct (the first three columns should be skipped) - npt.assert_array_equal(trial.data, np.array([[4, 5], [2, 1]])) + npt.assert_array_equal(trial.data, np.array([[4, 5], [2, 1]])) assert trial.id == 1 -~~~ +``` ::: :::: @@ -447,10 +448,12 @@ def test_trial_from_mock_database(mocker): When combined with the previous course, we have now covered a number of more advanced topics: Designing testable code, using fixtures and mocking. These should help you to ensure you write reliable and maintainable software. Happy testing!
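As a compact recap, the core pattern used throughout this section fits in a few lines. The sketch below is self-contained (the name `query_database` and its return value echo the lesson's examples, but nothing from the project is imported); it shows the two things a `Mock` gives us: a canned return value and a record of how it was called.

```python
from unittest.mock import Mock

# Replace the real function with a mock that returns a canned database row.
query_database = Mock(return_value=("Jerry", "Mouse", 1))

result = query_database("SELECT * FROM data", connection=None)

# The mock returns what we configured, and remembers the call it received.
assert result == ("Jerry", "Mouse", 1)
query_database.assert_called_once_with("SELECT * FROM data", connection=None)
```

Everything else in this section - patching, monkeypatching, the `mocker` fixture - is a variation on where and for how long such a mock replaces the real object.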
:::callout{variant="keypoints"} + - Mocking allows objects and functions to be **replaced** during testing with fake versions that **simulate attributes and behaviours** - Examples of mocked classes and methods are those that **write to a production database**, those that **read data from external services** or simply **those parts that take a long time to run** - Mocking allows checking if a **specific function is called, how many times it was called** and **if the arguments passed to the call were correct** - Mocking is available through the `unittest.mock` library, the `monkeypatch` fixture in `pytest` and the `mocker` fixture in `pytest-mock` - Using a **context manager** or a **function decorator** to **patch** a method ensures that a `unittest.mock.Mock` object will only affect the scope of the test function -- Mocking can be used alongside **fixtures** and **writing testable code** to **isolate components** for effective testing +- Mocking can be used alongside **fixtures** and **writing testable code** to **isolate components** for effective testing + ::: diff --git a/technology_and_tooling/testing/scaling_up.md b/technology_and_tooling/testing/scaling_up.md index 63782e77..26bdd66f 100644 --- a/technology_and_tooling/testing/scaling_up.md +++ b/technology_and_tooling/testing/scaling_up.md @@ -1,22 +1,21 @@ --- name: Scaling Up Unit Testing id: scaling_up -dependsOn: [ - technology_and_tooling.testing.automated_testing -] +dependsOn: [technology_and_tooling.testing.automated_testing] tags: [pytest] -attribution: - - citation: >
https://doi.org/10.5281/zenodo.6532057" - url: https://doi.org/10.5281/zenodo.6532057 - image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - +learningOutcomes: + - Use parameterisation to automatically run tests over a set of inputs. + - Use code coverage to understand how much of our code is being tested using unit tests. +attribution: + - citation: > + "Aleksandra Nenadic, Steve Crouch, James Graham, et al. (2022). carpentries-incubator/python-intermediate-development: beta (beta). Zenodo. https://doi.org/10.5281/zenodo.6532057" + url: https://doi.org/10.5281/zenodo.6532057 + image: https://carpentries-incubator.github.io/python-intermediate-development/assets/img/incubator-logo-blue.svg + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Introduction @@ -40,7 +39,9 @@ So instead of writing a separate function for each different test, we can `tests/test_models.py` let us rewrite the `test_daily_mean_zeros()` and `test_daily_mean_integers()` into a single test function: -~~~python +```python +import pytest +import numpy as np @pytest.mark.parametrize( "test, expected", [ @@ -50,8 +51,8 @@ So instead of writing a separate function for each different test, we can def test_daily_mean(test, expected): """Test mean function works for array of zeroes and positive integers.""" from inflammation.models import daily_mean - 
npt.assert_array_equal(daily_mean(np.array(test)), np.array(expected)) -~~~ + npt.assert_array_equal(daily_mean(np.array(test)), np.array(expected)) +``` Here, we use Pytest's **mark** capability to add metadata to this specific test - in this case, marking that it's a parameterised test. The `parametrize()` function is actually a [Python **decorator**](https://www.programiz.com/python-programming/decorator). A @@ -76,13 +77,13 @@ The big plus here is that we don't need to write separate functions for each of the tests - our test code can remain compact and readable as we write more tests and adding more tests scales better as our code becomes more complex. -::::challenge{id="write-parameterised-unit-tests" title="Write Parameterised Unit Tests"} +::::challenge{id="write-parameterised-unit-tests" title="Write Parameterised Unit Tests"} Rewrite your test functions for `daily_max()` and `daily_min()` to be parameterised, adding in new test cases for each of them. :::solution -~~~python +```python ... @pytest.mark.parametrize( "test, expected", [ @@ -94,7 +95,7 @@ Rewrite your test functions for `daily_max()` and `daily_min()` to be parameteri def test_daily_max(test, expected): """Test max function works for zeroes, positive integers, mix of positive/negative integers.""" from inflammation.models import daily_max - npt.assert_array_equal(daily_max(np.array(test)), np.array(expected)) + npt.assert_array_equal(daily_max(np.array(test)), np.array(expected)) @pytest.mark.parametrize( @@ -107,13 +108,13 @@ def test_daily_max(test, expected): def test_daily_min(test, expected): """Test min function works for zeroes, positive integers, mix of positive/negative integers.""" from inflammation.models import daily_min - npt.assert_array_equal(daily_min(np.array(test)), np.array(expected)) + npt.assert_array_equal(daily_min(np.array(test)), np.array(expected)) ... -~~~ +``` + ::: :::: - ## Code Coverage - How Much of Our Code is Tested? Pytest can't think of test cases for us.
We still have to decide what to test @@ -133,19 +134,19 @@ tell us how many statements in our code are being tested. By installing a Python package to our virtual environment called `pytest-cov` that is used by Pytest and using that, we can find this out: -~~~bash -$ pip install pytest-cov -$ python -m pytest --cov=inflammation.models tests/test_models.py -~~~ +```bash +pip install pytest-cov +python -m pytest --cov=inflammation.models tests/test_models.py +``` So here, we specify the additional named argument `--cov` to `pytest` specifying the code to analyse for test coverage. -~~~ +```text ============================= test session starts ============================== platform darwin -- Python 3.9.6, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 rootdir: /Users/alex/python-intermediate-inflammation plugins: anyio-3.3.4, cov-3.0.0 -collected 9 items +collected 9 items tests/test_models.py ......... [100%] @@ -157,17 +158,17 @@ inflammation/models.py 9 1 89% TOTAL 9 1 89% ============================== 9 passed in 0.26s =============================== -~~~ +``` Here we can see that our tests are doing very well - 89% of statements in `inflammation/models.py` have been executed. But which statements are not being tested? The additional argument `--cov-report term-missing` can tell us: -~~~bash -$ python -m pytest --cov=inflammation.models --cov-report term-missing tests/test_models.py -~~~ +```bash +python -m pytest --cov=inflammation.models --cov-report term-missing tests/test_models.py +``` -~~~ +```text ... Name Stmts Miss Cover Missing ------------------------------------------------------ @@ -175,7 +176,7 @@ inflammation/models.py 9 1 89% 18 ------------------------------------------------------ TOTAL 9 1 89% ... -~~~ +``` So there's still one statement not being tested at line 18, and it turns out it's in the function `load_csv()`. 
Here we should consider whether or not to @@ -187,6 +188,7 @@ used, how complex they are, and importantly, the extent to which they affect our program's results. :::callout + ## What about Testing Against Indeterminate Output? What if your implementation depends on a degree of random behaviour? This can be @@ -195,7 +197,7 @@ example, molecular simulations) or other stochastic behavioural models of complex systems. So how can you test against such systems if the outputs are different when given the same inputs? -One way is to *remove the randomness* during testing. For those portions of your +One way is to _remove the randomness_ during testing. For those portions of your code that use a language feature or library to generate a random number, you can instead produce a known sequence of numbers when testing, to make the results deterministic and hence easier to test against. You could encapsulate @@ -204,7 +206,7 @@ appropriate one depending on whether you are testing or not. This is essentially a type of **mocking**, where you are creating a "mock" version that mimics some behaviour for the purposes of testing. -Another way is to *control the randomness* during testing to provide results +Another way is to _control the randomness_ during testing to provide results that are deterministic - the same each time. Implementations of randomness in computing languages, including Python, are actually never truly random - they are **pseudorandom**: the sequence of 'random' numbers are typically generated @@ -215,33 +217,33 @@ as the default seed, but you can set your own. By doing so, the generated sequence of numbers is the same, e.g.
using Python's `random` library to randomly select a sample of ten numbers from a sequence between 0-99: -~~~python +```python import random random.seed(1) print(random.sample(range(0, 100), 10)) random.seed(1) print(random.sample(range(0, 100), 10)) -~~~ +``` Will produce: -~~~ +```text [17, 72, 97, 8, 32, 15, 63, 57, 60, 83] [17, 72, 97, 8, 32, 15, 63, 57, 60, 83] -~~~ +``` So since your program's randomness is essentially eliminated, your tests can be written to test against the known output. The trick, of course, is to ensure that the output being tested against is definitively correct! -The other thing you can do while keeping the random behaviour, is to *test the -output data against expected constraints* of that output. For example, if you +The other thing you can do while keeping the random behaviour, is to _test the +output data against expected constraints_ of that output. For example, if you know that all data should be within particular ranges, or within a particular statistical distribution type (e.g. normal distribution over time), you can test against that, conducting multiple test runs that take advantage of the randomness to fill the known "space" of expected results. Note that this isn't -as precise or complete, and bear in mind this could mean you need to run *a lot* +as precise or complete, and bear in mind this could mean you need to run _a lot_ of tests which may take considerable time. ::: @@ -268,9 +270,8 @@ Our software will inevitably increase in complexity as it develops. Using automated testing where appropriate can save us considerable time, especially in the long term, and allows others to verify against correct behaviour. - ## Key Points - We can assign multiple inputs to tests using parametrisation. - It’s important to understand the **coverage** of our tests across our code. - Writing unit tests takes time, so apply them where it makes the most sense.
\ No newline at end of file +- Writing unit tests takes time, so apply them where it makes the most sense. diff --git a/technology_and_tooling/testing/testable_code_fixtures.md b/technology_and_tooling/testing/testable_code_fixtures.md index ac09b168..10a12a39 100644 --- a/technology_and_tooling/testing/testable_code_fixtures.md +++ b/technology_and_tooling/testing/testable_code_fixtures.md @@ -1,32 +1,28 @@ --- name: Testable Code and Fixtures id: testable_code_fixtures -dependsOn: [ - technology_and_tooling.testing.diagnosing_issues -] +dependsOn: [technology_and_tooling.testing.diagnosing_issues] tags: [pytest] -attribution: - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 - url: https://www.universe-hpc.ac.uk - image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png - license: CC-BY-4.0 - - - +attribution: + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 + - citation: This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1 + url: https://www.universe-hpc.ac.uk + image: https://www.universe-hpc.ac.uk/assets/images/universe-hpc.png + license: CC-BY-4.0 --- ## Introduction Having completed the previous sections of the course, we now know: + - how to write tests -- how to debug problems with our code +- how to debug problems with our code - 
how to code in a defensive manner in order to prevent invalid inputs from causing problems.

-Up to this point, our examples have focused on unit testing of simple functions that formed part of larger program to analyse inflammation data from patients in a drug trial. If we were to develop this program further, a larger number of tests would need to written and organised in order to cater for the increased complexity of the codebase. In the future, we may also want to integrate our program with other entities that are external to the application itself, such as a database, a web-based service or even analysis components that are written in another language such as C++.
+Up to this point, our examples have focused on unit testing of simple functions that formed part of a larger program to analyse inflammation data from patients in a drug trial. If we were to develop this program further, a larger number of tests would need to be written and organised in order to cater for the increased complexity of the codebase. In the future, we may also want to integrate our program with other entities that are external to the application itself, such as a database, a web-based service or even analysis components that are written in another language such as C++.

In this section of the course, we will cover two topics that will help us to deal with testing software that is more complex whilst ensuring that it functions as we would expect:

@@ -41,7 +37,7 @@ Increasing the ease of writing tests can result in increased test coverage, and

### Separation of concerns

-It is good practice to organise code into modular function or classes that each have a single, well-defined responsibility. By doing this, not only will it be more readable, but also it will be more straightforward to isolate and test individual components of your system. Another way to ensure separate concerns is to use *dependency injection*.
This involves passing an object or function to our routines rather than creating such objects internally.
+It is good practice to organise code into modular functions or classes that each have a single, well-defined responsibility. By doing this, not only will it be more readable, but also it will be more straightforward to isolate and test individual components of your system. Another way to ensure separate concerns is to use _dependency injection_. This involves passing an object or function to our routines rather than creating such objects internally.

### Avoid duplication

@@ -56,16 +52,18 @@ Using pure functions that have no side effects, can result in more testable soft

Test-driven development (TDD) is a software development approach that consists of short development cycles, where tests are written before actually writing the classes and functions. The tests are run and initially fail, then the minimum amount of code is written in order to make the tests pass. TDD ensures that the design for testability is in mind from the outset and that requirements are thought about before diving in and starting to implement algorithms.

## Refactoring our code and our tests
+
We are going to refactor our code to incorporate some of the ideas above in order to make it more testable. Firstly, we are going to change our procedural inflammation project into an object-orientated version. Let's start by creating classes to represent the different types of data in our study. This was already investigated in the [object-orientated programming](https://train.oxrse.uk/material/HPCu/software_architecture_and_design/object_orientated) part of the course, where we created different subclasses of the `Person` class and added them to a `Trial` object. We use similar concepts, but will be doing things slightly differently here.

We would ideally like to have models that represent individual patients and their associated data. It is going to be up to you to write them!
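As a brief aside before the exercises, the dependency injection idea described earlier can be sketched in a few lines. The function and data below are invented purely for illustration and are not part of the inflammation codebase:

```python
# Hypothetical example of dependency injection: the data source is passed
# in as an argument, rather than being created inside the function.

def mean_of_first_row(load_data):
    """Compute the mean of the first row of a 2D dataset.

    `load_data` is any callable returning a list of rows, so a test can
    inject a cheap stub instead of a real CSV reader.
    """
    data = load_data()
    first_row = data[0]
    return sum(first_row) / len(first_row)

# In production we might pass a real CSV loader; in a test we inject a stub:
def fake_loader():
    return [[1, 2, 3], [4, 5, 6]]

print(mean_of_first_row(fake_loader))  # 2.0
```

Because the data source is just an argument, the routine can be tested in isolation without any files on disk.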
::::challenge{id=patient-class title="Creating a `Patient` class."}

- Write a class `Patient`. For now, the only attributes a `Patient` has is an `id` and a list of numbers containing their inflammation scores (flare-ups per day) as recorded in a row of one of the CSV files. We would also like to add some useful methods to the `Patient` class that will return the mean, max and min of the data for that patient. Call these `data_mean`, `data_max` and `data_min`.
+Write a class `Patient`. For now, the only attributes a `Patient` has are an `id` and a list of numbers containing their inflammation scores (flare-ups per day) as recorded in a row of one of the CSV files. We would also like to add some useful methods to the `Patient` class that will return the mean, max and min of the data for that patient. Call these `data_mean`, `data_max` and `data_min`.

:::solution
-~~~python
+
+```python
import numpy as np

class Patient:
@@ -85,7 +83,7 @@ class Patient:
        """Calculate the min of patient's inflammation data."""
        return np.min(self.data)

-~~~
+```

:::
::::

@@ -94,10 +92,11 @@ Now we have a class that represents a patient in the study, we can also create a

::::challenge{id=trial-class title="Creating a `Trial` class."}

- Write a class `Trial` that represents a trial. For now, the only attributes a `Trial` has are an `id` and `data`, which is a 2D numpy array with the data from one CSV file. The data from the CSV should be read in by calling a method `load_csv` which can be called from the class constructor (`__init__`). You can also add all the functions from our `models.py` file to this class: `daily_mean` and `daily_max`, `daily_min` and `patient_normalise`, they will need to be modified slightly to work as methods of the `Trial` class.
+Write a class `Trial` that represents a trial. For now, the only attributes a `Trial` has are an `id` and `data`, which is a 2D numpy array with the data from one CSV file.
The data from the CSV should be read in by calling a method `load_csv` which can be called from the class constructor (`__init__`). You can also add all the functions from our `models.py` file to this class: `daily_mean` and `daily_max`, `daily_min` and `patient_normalise`; they will need to be modified slightly to work as methods of the `Trial` class.

:::solution
-~~~python
+
+```python
class Trial:
    def __init__(self, filename, id):
        self.data = self.load_csv(filename)
@@ -144,59 +143,61 @@ class Trial:
        normalised[normalised < 0] = 0
        return normalised

-~~~
+```

:::
::::

Now we can create `Trial` objects, with associated `data` attributes, but how can we create `Patient` objects? We could do that by creating them in the standard way:

-~~~python
+```python
filename = "inflammation-01.csv"
data = np.loadtxt(fname=filename, delimiter=',')
row = data[0, :] # The first row of the 2D data array
patient_0 = Patient(0, row) # Create a Patient with id 0
-~~~
+```

Alternatively we could create a `Patient` using a method in the `Trial` class, since all the required data is already there:

-~~~python
+```python
filename = "inflammation-01.csv"
trial_group_01 = Trial(filename, "Group01")
patient_0 = trial_group_01.get_patient(0) # Create a Patient with id 0
-~~~
+```

::::challenge{id=get-patient title="Get a Patient from a Trial."}

- Add a method `get_patient` to the `Trial` class that returns an instance of a `Patient`.
+Add a method `get_patient` to the `Trial` class that returns an instance of a `Patient`.

:::solution
-~~~python
+
+```python
class Trial:
    def __init__(self, filename, id):
        self.data = self.load_csv(filename)
        self.id = id

    def get_patient(self, row):
-        """Get a Patient object by data row. The id of the object is the
+        """Get a Patient object by data row. The id of the object is the
        same as the row number."""
        return Patient(row, self.data[row, :])

...
-~~~
+```

:::
::::

-We should now adjust and extend our existing tests from the previous lesson in order to fit with these changes.
+We should now adjust and extend our existing tests from the previous lesson in order to fit with these changes.

::::challenge{id=test-patient title="Testing the `Patient` class."}

- Write some tests for the `Patient` class that cover the functions `data_mean`, `data_max` and `data_min` as well as a test that checks that the attributes of the class are created correctly. You do not need to write extensive parametrised tests at this stage, this is more an exercise to practice testing class methods as opposed to standard procedural functions.
+Write some tests for the `Patient` class that cover the functions `data_mean`, `data_max` and `data_min` as well as a test that checks that the attributes of the class are created correctly. You do not need to write extensive parametrised tests at this stage; this is more an exercise to practice testing class methods as opposed to standard procedural functions.

:::solution
-~~~python
+
+```python
import pytest
from inflammation.models import Patient

@@ -216,20 +217,20 @@ def test_patient_attributes():
    patient = Patient(id=2, data=[10, 20, 30, 40, 50])
    assert patient.id == 2
    assert patient.data == [10, 20, 30, 40, 50]
-~~~
+```

:::
::::

In the exercise above, we found ourselves having to create the same or similar `Patient` objects multiple times. To prevent this repetition, we could encapsulate these tests in their own class, `TestPatient`. Writing tests in this manner helps to organise similar tests into groups and also allows sharing of data between tests. The `pytest` library defines methods that you can add to your class, such as `setup_class`, which will be run before running all of the tests in that class, or `setup_method`, which will be run before each test within the class. These methods can be used for creating data or opening files, for example.
An additional method called `teardown_class` could also be added, if needed, and `pytest` will run this method after the tests in the class have completed. Alternatively `teardown_method` will run after each test. These methods can be useful for cleaning up in cases where files were created on your system or connections were opened. For more information you can [view the documentation here](https://docs.pytest.org/en/latest/how-to/xunit_setup.html).

-
::::challenge{id=test-patient-class title="Grouping tests in a class."}

- Encapsulate the tests for the `Patient` class inside a class named `TestPatient`. Include a method `setup_class` where the two `Patient` objects (with `id` of 1 and 2) will be created rather than creating an object within each test.
+Encapsulate the tests for the `Patient` class inside a class named `TestPatient`. Include a method `setup_class` where the two `Patient` objects (with `id` of 1 and 2) will be created rather than creating an object within each test.

:::solution
-~~~python
+
+```python
import pytest
from inflammation.models import Patient

@@ -251,18 +252,18 @@ class TestPatient:
        assert self.patient2.id == 2
        assert self.patient2.data == [10, 20, 30, 40, 50]

-~~~
+```

:::
::::

## Fixtures

-As an alternative to encapsulating test methods in a class and using `setup` and `teardown` methods, we can use *fixtures*. Fixtures are defined by using the `@pytest.fixture` decorator on a function. This function will then become available to be passed as an argument to your tests and used within them.
+As an alternative to encapsulating test methods in a class and using `setup` and `teardown` methods, we can use _fixtures_. Fixtures are defined by using the `@pytest.fixture` decorator on a function. This function will then become available to be passed as an argument to your tests and used within them.
Here is how we can write our tests for the `Patient` class using fixtures instead of a `setup_class` method:

-~~~python
+```python
import pytest
from inflammation.models import Patient

@@ -275,7 +276,7 @@ def patient_2():
    return Patient(id=2, data=[10, 20, 30, 40, 50])

class TestPatient:
-
+
    def test_patient_data_mean(self, patient_1):
        assert patient_1.data_mean() == 3.0

@@ -288,21 +289,21 @@ class TestPatient:
    def test_patient_attributes(self, patient_2):
        assert patient_2.id == 2
        assert patient_2.data == [10, 20, 30, 40, 50]
-~~~
+```

-By default, fixtures will be created when first requested by a test and will be destroyed at the end of the test. We can change this behaviour by defining the *scope* of the fixture. If we want to use the decorator `@pytest.fixture(scope="session")` for example, the fixture will only be destroyed at the end of the entire test session. Modifying this behaviour is especially useful if the fixture is expensive to create (such as a large file) and we do not need to recreate it for each test.
+By default, fixtures will be created when first requested by a test and will be destroyed at the end of the test. We can change this behaviour by defining the _scope_ of the fixture. If we use the decorator `@pytest.fixture(scope="session")`, for example, the fixture will only be destroyed at the end of the entire test session. Modifying this behaviour is especially useful if the fixture is expensive to create (such as a large file) and we do not need to recreate it for each test.

-Next we can adapt our tests from the previous lesson that test the analysis functions that are now methods in the `Trial` class.
+Next we can adapt our tests from the previous lesson that test the analysis functions that are now methods in the `Trial` class.

::::challenge{id=test-trial title="Testing the `Trial` class."}

- Write some tests for the `Trial` class and the associated methods.
You can adapt the tests that you wrote in your `test_models.py` file from the previous lesson. You can use fixtures to help with creating instances of the class for testing.
+Write some tests for the `Trial` class and the associated methods. You can adapt the tests that you wrote in your `test_models.py` file from the previous lesson. You can use fixtures to help with creating instances of the class for testing.

:::solution

Here is the solution for the first three of the tests; the others should have been refactored in a similar fashion.

-~~~python
+```python
@pytest.fixture()
def trial_instance():
    return Trial("test_data.csv", 1)

@@ -311,25 +312,27 @@ def trial_instance():
class TestTrial:
    def test_daily_mean_zeros(self, trial_instance):
        """Test that mean function works for an array of zeros."""
-        trial_instance.data = np.array([[0, 0],
-                                        [0, 0],
-                                        [0, 0]])
+        trial_instance.data = np.array([
+            [0, 0],
+            [0, 0],
+            [0, 0]])
        test_result = np.array([0, 0])

        # Need to use Numpy testing functions to compare arrays
-        npt.assert_array_equal(trial_instance.daily_mean(), test_result)
+        npt.assert_array_equal(trial_instance.daily_mean(), test_result)

    def test_daily_mean_integers(self, trial_instance):
        """Test that mean function works for an array of positive integers."""
-        trial_instance.data = np.array([[1, 2],
-                                        [3, 4],
-                                        [5, 6]])
+        trial_instance.data = np.array([
+            [1, 2],
+            [3, 4],
+            [5, 6]])
        test_result = np.array([3, 4])

        # Need to use Numpy testing functions to compare arrays
-        npt.assert_array_equal(trial_instance.daily_mean(), test_result)
+        npt.assert_array_equal(trial_instance.daily_mean(), test_result)

@pytest.mark.parametrize(
@@ -342,16 +345,15 @@ class TestTrial:
    def test_daily_max(self, test, expected, trial_instance):
        """Test max function works for zeroes, positive integers, mix of positive/negative integers."""
        trial_instance.data = np.array(test)
-        npt.assert_array_equal(trial_instance.daily_max(), np.array(expected))
+
npt.assert_array_equal(trial_instance.daily_max(), np.array(expected))

...
-~~~
+```

:::
::::

-In our tests for the `Trial` class, we have to initialise the class using a CSV file in order to create an instance, even if we do not use that particular data in our tests. How can we simplify this? One thing that can be changed is the `__init__` method, if we just needed the data as an argument, rather than the path to a CSV file, that would make testing easier. After this change, a separate method is going to be needed to allow creating a `Trial` from a CSV filepath, this can be achieved using a class method.
-
+In our tests for the `Trial` class, we have to initialise the class using a CSV file in order to create an instance, even if we do not use that particular data in our tests. How can we simplify this? One thing that can be changed is the `__init__` method: if it just needed the data as an argument, rather than the path to a CSV file, that would make testing easier. After this change, a separate method is going to be needed to allow creating a `Trial` from a CSV filepath; this can be achieved using a class method.

::::challenge{id=load-from-csv title="Refactor the `Trial` class."}

@@ -359,13 +361,14 @@ As described above, refactor the `__init__` method of the `Trial` class to take

:::solution

Here is the first section of our adjusted object code:
-~~~python
+
+```python
class Trial:
    def __init__(self, data, id):
        self.data = data
        self.id = id

-    @classmethod
+    @classmethod
    def from_csv(cls, filename, id):
        """
        Class method to create a Trial instance from data in a CSV file.
@@ -384,20 +387,20 @@ class Trial:
    def load_csv(filename):
        """Load a Numpy array from a CSV

-        Parameters:
+        Parameters:
            filename (str). Filename of CSV to load
        """
        return np.loadtxt(fname=filename, delimiter=',')

...
-~~~
+```

:::
::::

Now, a `Trial` object can be created in two ways:

-~~~python
+```python
import numpy as np
from inflammation.models import Trial

@@ -406,19 +409,19 @@ data = np.loadtxt(fname=filename, delimiter=',')
trial_group_01 = Trial(data, "Group01")
trial_group_02 = Trial.from_csv("inflammation-02.csv", "Group02")

-~~~
+```

For our tests, we no longer need a CSV file in order to ensure that the statistical methods from the class give the expected results and we can replace our `trial_instance` fixture:

-~~~python
+```python
@pytest.fixture()
def trial_instance():
    return Trial(np.array([[0, 0],[0, 0]]), 1)
-~~~
+```

Alternatively, we can create objects within test methods, if we prefer to do things that way:

-~~~python
+```python
class TestTrial:
    def test_daily_mean_zeros(self):
        """Test that mean function works for an array of zeros."""
@@ -426,10 +429,10 @@ class TestTrial:
        test_result = np.array([0, 0])

        # Need to use Numpy testing functions to compare arrays
-        npt.assert_array_equal(trial_instance.daily_mean(), test_result)
+        npt.assert_array_equal(trial_instance.daily_mean(), test_result)

...
-~~~
+```

### Using a database rather than CSV files

@@ -437,7 +440,7 @@ Our alterations to the `Trial` class to make it easier to test have also paved t

In the following example, we have a function `query_database` that utilises a connection to a [SQLite](https://www.sqlite.org/) database. In a similar fashion to how a CSV file was needed for a `Trial` object, this function is going to be difficult to test without connecting to the `example.db` database. The contents of our file, named `sqlite_example.py`, are shown here. You can create the file alongside the rest of the inflammation code in your working directory. The `sqlite3` module is part of Python's standard library, so it should not need to be installed separately.
-~~~python +```python # Original code: Function that performs a database query import sqlite3 @@ -453,22 +456,22 @@ def query_database(sql): conn.close() return result -~~~ - -If we refactor the function to inject the database connection dependency, we can then easily replace that connection during testing with one that is connected to a test database. This also means we can test the two distinct tasks, connecting to the database and querying the database, separately. Additionally, we have the option to replace the connection with a fake (*mocked*) object, meaning that we do not have to connect to an actual database at all in order to test the function. +``` +If we refactor the function to inject the database connection dependency, we can then easily replace that connection during testing with one that is connected to a test database. This also means we can test the two distinct tasks, connecting to the database and querying the database, separately. Additionally, we have the option to replace the connection with a fake (_mocked_) object, meaning that we do not have to connect to an actual database at all in order to test the function. ::::challenge{id=dependency-inject title="Using dependency injection."} Create a separate function `connect_to_database` that returns the database connection. Refactor `query_database` to accept the database connection as a named argument. Programming defensively, raise an error if no connection is given. 
:::solution
-~~~python
+
+```python
# Rewritten code: Performs a database query with dependency injection
import sqlite3

def connect_to_database(filename):
-    return sqlite3.connect(filename)
+    return sqlite3.connect(filename)

def query_database(sql, connection=None):
    if connection is None:
@@ -479,14 +482,14 @@ def query_database(sql, connection=None):
    connection.close()
    return result

-~~~
+```

:::
::::

-Now let write some tests for these functions, these can be created in a new file named `test_sqlite.py` within the `/tests` directory. Here are some initial tests that check `connect_to_database` returns a connection of the correct type that refers to correct database file as well as checking that `query_database` returns the correct data. If you would like to learn more about the Structured Query Language (SQL) expressions in this example that are used to interact with the database see the [SQL Zoo](https://sqlzoo.net/wiki/SQL_Tutorial) site.
+Now let's write some tests for these functions; these can be created in a new file named `test_sqlite.py` within the `/tests` directory. Here are some initial tests that check `connect_to_database` returns a connection of the correct type that refers to the correct database file, as well as checking that `query_database` returns the correct data. If you would like to learn more about the Structured Query Language (SQL) expressions in this example that are used to interact with the database, see the [SQL Zoo](https://sqlzoo.net/wiki/SQL_Tutorial) site.
-~~~python +```python import pytest import sqlite3 from pathlib import Path @@ -508,10 +511,10 @@ def test_connect_to_db_name(): cur = conn.cursor() # List current databases https://www.sqlite.org/pragma.html#pragma_database_list cur.execute('PRAGMA database_list;') - # Unpack the three parameters returned + # Unpack the three parameters returned db_index, db_type, db_filepath = cur.fetchone() # Extract just the filename from the full filepath - db_filename = Path(db_filepath).name + db_filename = Path(db_filepath).name assert db_filename == 'test.db' conn.close() @@ -546,17 +549,18 @@ def test_query_database_without_connection(): with pytest.raises(TypeError): query_database(sql) -~~~ +``` As you can see, the tests are becoming complex, especially the one for `query_database`. Next we can look at how fixtures can help us to reduce this complexity, especially when we want to reuse resources such as a test database. ### More about Fixtures -Our `test_query_database` function can be simplified by separating the processes of creating the database and populating it with data from the test itself. We can create a fixture to do this which can then be passed to the `test_query_database` function. The fixture can also be responsible for removing the database after the tests have run. +Our `test_query_database` function can be simplified by separating the processes of creating the database and populating it with data from the test itself. We can create a fixture to do this which can then be passed to the `test_query_database` function. The fixture can also be responsible for removing the database after the tests have run. -in order to In the example below, we can use a fixture named `setup_database` to create our test database, add data and also remove the database file once the tests have finished running. 
As a result, our `test_query_database` function can be simplified and if we want to use the test database in other tests, we simply need to add `setup_database` as an argument to those tests.
+In the example below, we can use a fixture named `setup_database` to create our test database, add data and also remove the database file once the tests have finished running. As a result, our `test_query_database` function can be simplified, and if we want to use the test database in other tests, we simply need to add `setup_database` as an argument to those tests.

#### Using `yield` instead of `return`
+
If there is a cleanup part to the fixture code, then the fixture function should use a `yield` statement rather than a `return` statement. Anything up to the `yield` statement is setup code, and anything after the statement will be run post-testing in order to clean up (teardown code).

::::challenge{id=database-fixture title="Adding a fixture to setup the database."}

@@ -564,7 +568,8 @@ If there is a cleanup part to the fixture code, then the fixture function should

Add a fixture named `setup_database` to create our test database, add data and also remove the database file once the tests have finished running. Pass the fixture as an argument to `test_query_database`.

:::solution
-~~~python
+
+```python
import pytest
import sqlite3
from pathlib import Path

@@ -600,15 +605,16 @@ def test_query_database(setup_database):
    # That record should be the data we added
    assert result[0] == ("Bugs", "Rabbit", 6)
It is also often said that each test should fail for only one reason alone.
+According to the book, The Art of Unit Testing by Roy Osherove, a unit test, by definition, should test a _unit of work_. What this means exactly is itself a point for discussion, but generally it refers to actions that take place between an entry point (e.g. a declaration of a function) and an exit point (e.g. the output of a function). It is also often said that each test should fail for only one reason alone.

Does using multiple `assert` statements in one test contravene these guidelines?

@@ -622,7 +628,8 @@ Are there any disadvantages to enforcing a rule of one `assert` per test?

The `setup_database` fixture does several things including initiating the connection as well as creating and populating the database table. In order to separate out these functionalities, split this fixture into two, with one fixture `database_connection` for providing the database connection and another `setup_database` that uses the first fixture and then populates the database. You can view the [pytest fixtures documentation](https://docs.pytest.org/en/7.1.x/how-to/fixtures.html) as a guide.

:::solution
-~~~python
+
+```python
import pytest
import sqlite3
from pathlib import Path

@@ -665,23 +672,23 @@ def test_query_database(setup_database):
    assert len(result) == 1
    # That record should be the data we added
    assert result[0] == ("Bugs", "Rabbit", 6)
-~~~
+```

:::
::::

#### Using Built-in Fixtures

-As well as writing our own fixtures, we can use those that are [predefined/(built-in)](https://docs.pytest.org/en/latest/reference/fixtures.html). For example we may want to use a temporary directory for our files during testing, rather than creating files in the directory that we are working from (this is what currently happens when we run our database tests). The built-in fixture `temp_path_factory` allows us to to do this.
We can refactor our code to add an extra fixture that uses feature and then it can be used by all the tests that we have written as well as by the `setup_database` fixture.
+As well as writing our own fixtures, we can use those that are [predefined/(built-in)](https://docs.pytest.org/en/latest/reference/fixtures.html). For example, we may want to use a temporary directory for our files during testing, rather than creating files in the directory that we are working from (this is what currently happens when we run our database tests). The built-in fixture `tmp_path_factory` allows us to do this. We can refactor our code to add an extra fixture that uses this feature, which can then be used by all the tests that we have written as well as by the `setup_database` fixture.

::::challenge{id=builtin-fixtures title="Using built-in fixtures."}

-Add another fixture `database_filename` that uses the built-in `temp_path_factory` fixture to create a temporary directory for storing our `test.db` database file. This fixture can then be passed into the `database_connection` fixture.
+Add another fixture `database_filename` that uses the built-in `tmp_path_factory` fixture to create a temporary directory for storing our `test.db` database file. This fixture can then be passed into the `database_connection` fixture.

:::solution

The contents of our `test_sqlite.py` file are now:

-~~~python
+```python
import pytest
import sqlite3
from pathlib import Path

@@ -763,22 +770,23 @@ def test_query_database_without_connection():
    with pytest.raises(TypeError):
        query_database(sql)

-~~~
+```

:::
::::

-
For more details on what you can do with fixtures, please refer to the [pytest fixtures documentation](https://docs.pytest.org/en/7.1.x/how-to/fixtures.html).

#### Next steps

Now we know about testable code and fixtures. Before we add the functionality to create a `Trial` object using data stored in a database, we will look at how to mock objects for testing.
This is covered in the [next lesson](mocking).
+Now we know about testable code and fixtures. Before we add the functionality to create a `Trial` object using data stored in a database, we will look at how to mock objects for testing. This is covered in the [next lesson](mocking).

:::callout{variant="keypoints"}
+
- **Separation of concerns** using methods such as **dependency injection** can make it easier to isolate and test components of your code
- Refactoring code to make it **more testable** can make it **less complex** and **more extensible**
- Tests can be grouped into classes in order to organise them
- **Fixtures** allow **setup** and **teardown** of objects and data that are going to be reused in tests
- There is a set of **built-in fixtures** available in `pytest` that can help you create temporary directories or access logging or other outputs during testing
-::: \ No newline at end of file
+
+:::
diff --git a/technology_and_tooling/unit_testing/continuous_integration.md b/technology_and_tooling/unit_testing/continuous_integration.md
index 5c0ebdde..16ed1e21 100644
--- a/technology_and_tooling/unit_testing/continuous_integration.md
+++ b/technology_and_tooling/unit_testing/continuous_integration.md
@@ -1,8 +1,6 @@
---
name: Continuous Integration with GitHub Actions
-dependsOn: [
-  technology_and_tooling.unit_testing.unit_testing_python
-]
+dependsOn: [technology_and_tooling.unit_testing.unit_testing_python]
tags: [python, unit-testing, ci, github]
---

@@ -11,7 +9,7 @@ tags: [python, unit-testing, ci, github]

Continuous Integration (CI) is the process where, when code changes are merged
into the `main` branch, automated builds and tests are run by the CI platform,
and developers are notified of any errors. This process encourages developers to
-have a standardised tooling for running tests and can help prevent code with
+have standardised tooling for running tests and can help prevent code with
errors from getting merged into the codebase.
In this episode we will look at how GitHub works as a CI platform, and how to @@ -37,16 +35,14 @@ on: - main pull_request: branches: - - '**' + - "**" jobs: - build-and-test: name: Unit tests runs-on: ubuntu-20.04 steps: - - name: checkout repository uses: actions/checkout@v3 @@ -68,9 +64,9 @@ jobs: The three sections of a workflow file are: -* **Metadata**: This is the first block and specifies global action metadata. +- **Metadata**: This is the first block and specifies global action metadata. Only the `name` key is generally specified. -* **Triggers**: This is specified using the `on` block. Usual triggers are +- **Triggers**: This is specified using the `on` block. Usual triggers are pushes to the `main` branch and all pull requests. There are a variety of triggers available, including [triggering on tags](https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#example-including-branches-and-tags), @@ -79,7 +75,7 @@ The three sections of a workflow file are: (useful in a large codebase, when you don't want to run all tests), and [scheduled triggers](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule). -* **Jobs**: Multiple jobs can be run in parallel within a workflow. Each job has +- **Jobs**: Multiple jobs can be run in parallel within a workflow. Each job has at least a `runs-on` key which specifies the [operating system image it runs on](https://github.com/actions/runner-images#available-images), a readable `name` key and a list of `steps`. @@ -139,7 +135,7 @@ Status badges are a quick way to see whether tests in your repository are passing or not. These are usually added to the top of your README.md and automatically update. You can find the status badge corresponding to a particular workflow by going to Actions > Your Workflow, clicking the [...] icon -next to *Filter workflow runs* and clicking *Create status badge*. +next to _Filter workflow runs_ and clicking _Create status badge_. 
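For reference, the generated badge is ordinary Markdown image syntax wrapped in a link. A sketch with placeholder owner, repository and workflow file names (yours will differ) might look like:

```markdown
[![Unit tests](https://github.com/yourname/yourrepo/actions/workflows/main.yml/badge.svg)](https://github.com/yourname/yourrepo/actions/workflows/main.yml)
```

The image updates automatically with the latest run of that workflow, and the link takes readers to the workflow's run history.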
::: exercise Add the unit tests status badge to README @@ -147,10 +143,10 @@ Add the unit tests status badge to README ## Next steps -* Write some unit tests! -* Once you have set up the infrastructure it’s very easy, especially for the +- Write some unit tests! +- Once you have set up the infrastructure it’s very easy, especially for the majority of testing. -* There is complexity: how to test random functions, for instance. -* Don’t fall into the trap of thinking “I could be using this time to write more +- There is complexity: how to test random functions, for instance. +- Don’t fall into the trap of thinking “I could be using this time to write more features”. Code is worthless if you aren’t certain it’s correct! -* Encourage others in your research group to start testing their code. +- Encourage others in your research group to start testing their code. diff --git a/technology_and_tooling/unit_testing/index.md b/technology_and_tooling/unit_testing/index.md index ced1aa53..f1d69b4c 100644 --- a/technology_and_tooling/unit_testing/index.md +++ b/technology_and_tooling/unit_testing/index.md @@ -1,14 +1,6 @@ --- name: Unit testing with Python and GitHub id: unit_testing -dependsOn: [ - technology_and_tooling.ide -] -files: [ - setup.md, - introduction.md, - unit_testing_python.md, - continuous_integration.md, -] +dependsOn: [technology_and_tooling.ide] +files: [setup.md, introduction.md, unit_testing_python.md, continuous_integration.md] --- - diff --git a/technology_and_tooling/unit_testing/introduction.md b/technology_and_tooling/unit_testing/introduction.md index 5529e8b4..976f49e0 100644 --- a/technology_and_tooling/unit_testing/introduction.md +++ b/technology_and_tooling/unit_testing/introduction.md @@ -1,12 +1,9 @@ --- name: Introduction to Unit Testing -dependsOn: [ - technology_and_tooling.unit_testing.setup -] +dependsOn: [technology_and_tooling.unit_testing.setup] tags: [python, unit-testing] --- - ## Introduction Being able to demonstrate that a process
generates the right results is @@ -42,7 +39,6 @@ does it still behave the way you expect? ::: - ## What Is Software Testing? For the sake of argument, if each line we write has a 99% chance of being right, @@ -67,4 +63,3 @@ There are three main types of automated tests: For the purposes of this course, we'll focus on unit tests. But the principles and practices we'll talk about can be built on and applied to the other types of tests too. - diff --git a/technology_and_tooling/unit_testing/setup.md b/technology_and_tooling/unit_testing/setup.md index 5d297b6f..bd403eed 100644 --- a/technology_and_tooling/unit_testing/setup.md +++ b/technology_and_tooling/unit_testing/setup.md @@ -3,6 +3,7 @@ name: Unit Testing with Python -- Setup dependsOn: [] tags: [python, unit-testing] --- + # Setup These are the setup instructions for the Introduction to Unit Testing course. diff --git a/technology_and_tooling/unit_testing/unit_testing_python.md b/technology_and_tooling/unit_testing/unit_testing_python.md index 82d14bc2..36135a26 100644 --- a/technology_and_tooling/unit_testing/unit_testing_python.md +++ b/technology_and_tooling/unit_testing/unit_testing_python.md @@ -1,8 +1,6 @@ --- name: Unit Testing with Python -dependsOn: [ - technology_and_tooling.unit_testing.introduction -] +dependsOn: [technology_and_tooling.unit_testing.introduction] tags: [python, unit-testing] --- @@ -24,6 +22,7 @@ As an example, we will test the `daily_mean` function defined in Create a file `sandbox.py` and paste the following code: ```python +import pytest from inflammation.models import daily_mean import numpy as np @@ -63,31 +62,32 @@ execution never reaches that point. Most people don’t enjoy writing tests, so if we want them to actually do it, it must be easy to: -* Add or change tests. -* Understand the tests that have already been written. -* Run those tests. -* Understand those tests’ results. +- Add or change tests. +- Understand the tests that have already been written. +- Run those tests. 
+- Understand those tests’ results. There are many unit testing frameworks in different languages: -* Python: pytest, unittest, nose2 -* C++: Catch2, GoogleTest, ... -* Java: JUnit -* Fortran: FRUIT +- Python: pytest, unittest, nose2 +- C++: Catch2, GoogleTest, ... +- Java: JUnit +- Fortran: FRUIT Let’s add some tests for our library function, `daily_mean`: ```python +import numpy as np +import numpy.testing as npt def test_daily_mean_zeros(): -    """Test that mean function works for an array of zeros.""" -    from inflammation.models import daily_mean +    """Test that mean function works for an array of zeros.""" +    from inflammation.models import daily_mean -    test_array = np.array([[0, 0], -                           [0, 0], -                           [0, 0]]) +    test_array = np.array([[0, 0], +                           [0, 0], +                           [0, 0]]) -    # Need to use Numpy testing functions to compare arrays -    npt.assert_array_equal(np.array([0, 0]), daily_mean(test_array)) +    # Need to use Numpy testing functions to compare arrays +    npt.assert_array_equal(np.array([0, 0]), daily_mean(test_array)) ``` Run `pytest tests/test_models.py` @@ -106,12 +106,12 @@ functionality built in.
```python def test_daily_min_string(): -    """Test for TypeError when passing strings""" -    from inflammation.models import daily_min -    from pytest import raises +    """Test for TypeError when passing strings""" +    from inflammation.models import daily_min +    from pytest import raises -    with raises(TypeError): -        daily_min([['Cannot', 'min'], ['string', 'arguments']]) +    with raises(TypeError): +        daily_min([['Cannot', 'min'], ['string', 'arguments']]) ``` This code uses the `raises` function defined in pytest to create a block that @@ -141,15 +141,15 @@ example, we could rewrite the `test_daily_mean_zeros()` and ```python @pytest.mark.parametrize( -    "test, expected", -    [ -        ([[0, 0], [0, 0], [0, 0]], [0, 0]), -        ([[1, 2], [3, 4], [5, 6]], [3, 4]), -    ]) +    "test, expected", +    [ +        ([[0, 0], [0, 0], [0, 0]], [0, 0]), +        ([[1, 2], [3, 4], [5, 6]], [3, 4]), +    ]) def test_daily_mean(test, expected): -    """Test mean function works for array of zeroes and positive integers.""" -    from inflammation.models import daily_mean -    npt.assert_array_equal(np.array(expected), daily_mean(np.array(test))) +    """Test mean function works for array of zeroes and positive integers.""" +    from inflammation.models import daily_mean +    npt.assert_array_equal(np.array(expected), daily_mean(np.array(test))) ``` ::: exercise @@ -174,10 +174,10 @@ module in the curriculum. ## Recap -* We are making use of a unit testing framework (pytest). -* We have written tests that verify normal functional behaviour of three specific units. -* We have tested that some common cases of function misuse fail in the way we expect them to fail. -* We have parameterised tests to cut down on code duplication. +- We are making use of a unit testing framework (pytest). +- We have written tests that verify normal functional behaviour of three specific units. +- We have tested that some common cases of function misuse fail in the way we expect them to fail. +- We have parameterised tests to cut down on code duplication.
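As an aside, the array assertions in these tests rely on `numpy.testing` (imported as `npt` in the course code), not a method on `np` itself. A self-contained sketch, with a stand-in `daily_mean` since the real one lives in `inflammation.models`:

```python
import numpy as np
import numpy.testing as npt


def daily_mean(data):
    # Stand-in for inflammation.models.daily_mean:
    # the mean across days (axis 0), one value per patient
    return np.mean(data, axis=0)


# assert_array_equal compares shape and every element,
# raising an AssertionError on any mismatch
npt.assert_array_equal(np.array([3.0, 4.0]),
                       daily_mean(np.array([[1, 2], [3, 4], [5, 6]])))
```

Because plain `==` on arrays returns an element-wise array rather than a single boolean, these dedicated testing helpers give clearer failures than a bare `assert`.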
In the next episode, we will walk through how to run unit tests automatically when we push code changes to GitHub. diff --git a/technology_and_tooling/version_control/01-background.md b/technology_and_tooling/version_control/01-background.md index e62f70ea..1f5706a7 100644 --- a/technology_and_tooling/version_control/01-background.md +++ b/technology_and_tooling/version_control/01-background.md @@ -4,6 +4,10 @@ dependsOn: [ technology_and_tooling.bash_shell.bash ] tags: [git] +learningOutcomes: + - Describe the benefits of an automated version control system. + - Explain the basics of how automated version control systems work. + attribution: - citation: > This material was originally taken from training materials developed by the @@ -32,24 +36,23 @@ Using **version control** means **we don't keep dozens of different versions** o ### 2. Reproducibility -When you use **version control**, at any point in the future, you can retrieve the **correct versions** of your documents, scripts or code. So, for example, a year after **publication**, you can get hold of the **precise combination** of scripts and data that you used to assemble a paper. +When you use **version control**, at any point in the future, you can retrieve the **correct versions** of your documents, scripts or code. So, for example, a year after **publication**, you can get hold of the **precise combination** of scripts and data that you used to assemble a paper. Version control makes **reproducibility** simpler. Without using version control it's very hard to say that your research is truly reproducible... - ### 3. To Aid Collaboration As well as maintaining a revision history, VC tools also help multiple authors **collaborate** on the **same file** or set of files. - **Professional software developers** use VC to work in large **teams** and to keep track of what they've done. If you know what changes have been made to each file, you can easily combine multiple people's changes to a single file.
You can also track down where and when (and by who!) bugs in the code were introduced. +**Professional software developers** use VC to work in large **teams** and to keep track of what they've done. If you know what changes have been made to each file, you can easily combine multiple people's changes to a single file. You can also track down where and when (and by whom!) bugs in the code were introduced. **Every** large software development project relies on VC, and most programmers use it for their small jobs as well. -**VC is not just for software**: papers, small data sets - anything that changes over time, or needs to be shared **can**, and **probably should** be stored in a version control system. +**VC is not just for software**: papers, small data sets - anything that changes over time, or needs to be shared **can**, and **probably should** be stored in a version control system. We'll look at both the backup and collaboration scenarios, but first it's useful to understand what's going on **under the hood**. -## How do Version Control Tools Work? ## ![Changes are tracked sequentially](fig/01-background/track_changes.svg) @@ -63,7 +66,6 @@ Once you think of **changes as separate from the document** itself, you can then If there aren't conflicts, you can even try to combine two different sets of changes together onto the same base document, a process called **merging**. - ## Version Control Alternatives **Git** is overwhelmingly the most popular version control system in academia, and beyond. @@ -81,34 +83,41 @@ Because Git is so popular, and making a GitHub account is so easy, we're going t ## Graphical User Interfaces -We're going to teach you how to use Git on the *command line*, as it's the same on every single platform (Mac, Linux & Windows) - and it's the only way to use it on high-performance clusters like Iridis. This isn't the only way to use it, however.
There are many different graphical user interfaces for Git, like: +We're going to teach you how to use Git on the _command line_, as it's the same on every single platform (Mac, Linux & Windows) - and it's the only way to use it on high-performance clusters like Iridis. This isn't the only way to use it, however. There are many different graphical user interfaces for Git, like: ### [SourceTree](https://www.sourcetreeapp.com/) + ![SourceTree](fig/01-background/sourcetree.png) ### [Git Kraken](https://www.gitkraken.com/) + ![Git Kraken](fig/01-background/kraken.png) ### [GitHub Desktop](https://desktop.github.com/) + ![GitHub Desktop](fig/01-background/desktop.png) Fundamentally, though, these are all just 'wrappers' around the command line version of Git. If you understand what they're doing under the hood, you can easily switch between versions. You can, for example, manage your code on Iridis using command-line git and GitHub Desktop on your desktop workstation. :::callout + ## Git GUI Integrations Most code editors and Integrated Development Environments (or IDEs) integrate Git into their UI, so you can easily see the state of your files and work with your repository. Examples include: ### [VS Code](https://code.visualstudio.com) + ![VS Code](fig/01-background/integration-vscode.png) ### [PyCharm & CLion](https://www.jetbrains.com/pycharm/) + ![PyCharm](fig/01-background/integration-pycharm.png) ### [RStudio/Posit](https://posit.co) + ![RStudio](fig/01-background/integration-rstudio.png) -Others include MatLab, Atom, Sublime Text and Notepad++. The only common IDE with poor Git support is Spyder! +Others include MatLab, Atom, Sublime Text and Notepad++. The only common IDE with poor Git support is Spyder! 
::: diff --git a/technology_and_tooling/version_control/02-setup.md b/technology_and_tooling/version_control/02-setup.md index 13eb2ccb..e02a5604 100644 --- a/technology_and_tooling/version_control/02-setup.md +++ b/technology_and_tooling/version_control/02-setup.md @@ -1,26 +1,28 @@ --- name: Setting Up Git -dependsOn: [ - technology_and_tooling.version_control.01-background -] +dependsOn: [technology_and_tooling.version_control.01-background] tags: [git] +learningOutcomes: + - Demonstrate the process of configuring Git for the first time on a computer. + - Describe the --global configuration flag. + - Add an SSH key to a GitHub account. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- - :::callout + ## Prerequisites In this lesson we use Git from the Bash Shell. Some previous experience with the shell is expected, -*but isn't mandatory*. +_but isn't mandatory_. ::: ## Get Started @@ -33,53 +35,53 @@ We’ll start by exploring how version control can be used to keep track of what ### Setting Up Git The first time we use Git on a new machine, we need to configure it. We're going to set some global options, so when Git starts tracking changes to files it records who made them and how to contact them.
-~~~bash -$ git config --global user.name "Firstname Surname" -$ git config --global user.email "fsurname@university.ac.uk" -~~~ +```bash +git config --global user.name "Firstname Surname" +git config --global user.email "fsurname@university.ac.uk" +``` (Please use your own name and the email address you used to sign up to GitHub!) We're going to set **Nano**, a simple, minimal command-line text editor to be the default for when you need to edit messages. -~~~bash -$ git config --global core.editor "nano -w" -~~~ +```bash +git config --global core.editor "nano -w" +``` If you're already comfortable with another command-line editor, feel free to select that! Git commands are written `git action`, where `action` is what we actually want it to do. In this case, we're telling Git: -* our **name** and **email address**, -* what our favorite **text editor** is, and -* that we want to use these settings **globally** (i.e., for every project), +- our **name** and **email address**, +- what our favorite **text editor** is, and +- that we want to use these settings **globally** (i.e., for every project), The three commands above only need to be **run once**: the flag `--global` tells Git to use the settings for every project on this machine. You can check your settings at any time: -~~~bash -$ git config --list -~~~ +```bash +git config --list +``` :::callout + ## Git Help and Manual If you forget a `git` command, you can access the list of commands by using `-h` and access the Git manual by using `--help` : -~~~bash -$ git config -h -$ git config --help -~~~ +```bash +git config -h +git config --help +``` - While viewing the manual, remember the `:` is a prompt waiting for commands and you can press <kbd>Q</kbd> to exit the manual. +While viewing the manual, remember the `:` is a prompt waiting for commands and you can press <kbd>Q</kbd> to exit the manual. 
::: - ## Setting Up GitHub -In order to make sure all our work is backed up online, as well as making it easy to share with collaborators, we're going to link our version control content to [GitHub](https://github.com/). You'll need to [create an account there](https://github.com/signup). As your GitHub +In order to make sure all our work is backed up online, as well as making it easy to share with collaborators, we're going to link our version control content to [GitHub](https://github.com/). You'll need to [create an account there](https://github.com/signup). As your GitHub username will appear in the URLs of your projects there, it's best to use a short, clear version of your name if you can. ### Creating an SSH Key @@ -87,11 +89,12 @@ username will appear in the URLs of your projects there, it's best to use a shor We'll need to set up SSH access to GitHub from your computer. This is how GitHub checks your identity when you try to access it - and is more secure than a password. To set up SSH access, we generate a pair of keys - one public, one private. We want to add the public key to GitHub, whilst the private one stays on our computer. :::callout + ## More Detail -There are full guides in the GitHub documentation for how to -[Make an SSH Key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent) and -[Add an SSH key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account). +There are full guides in the GitHub documentation for how to +[Make an SSH Key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent) and +[Add an SSH key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account). We're going to simplify them for today. 
If you already have your own SSH key, feel free to skip to **Add an SSH Key**. @@ -99,15 +102,15 @@ If you already have your own SSH key, feel free to skip to **Add an SSH Key**. We can run a simple command to generate a new SSH key. It'll ask you for some settings, but you should just hit enter to use the defaults for everything: -~~~bash -$ ssh-keygen -t ed25519 -~~~ +```bash +ssh-keygen -t ed25519 +``` -~~~ +```text Generating public/private ed25519 key pair. -Enter file in which to save the key (/home/smangham/.ssh/id_ed25519): -Enter passphrase (empty for no passphrase): -Enter same passphrase again: +Enter file in which to save the key (/home/smangham/.ssh/id_ed25519): +Enter passphrase (empty for no passphrase): +Enter same passphrase again: Your identification has been saved in id_ed25519 Your public key has been saved in id_ed25519.pub The key fingerprint is: @@ -124,7 +127,7 @@ The key's randomart image is: | Eo . . o O oo| | oo. .o B+.| +----[SHA256]-----+ -~~~ +``` ### Add an SSH Key @@ -134,22 +137,22 @@ Now we've generated a key, we can add this to GitHub and register the key there. We need to fill in the details. Give the key a title like "Laptop SSH key", and then paste your **public key** into the key box - we can find it in our `~/.ssh` folder: -~~~bash -$ ls ~/.ssh -~~~ +```bash +ls ~/.ssh +``` -~~~ +```text id_ed25519 id_ed25519.pub known_hosts -~~~ +``` You want to copy the contents of the `.pub` file, which you can display with: -~~~bash -$ cat ~/.ssh/id_ed25519.pub -~~~ +```bash +cat ~/.ssh/id_ed25519.pub +``` -~~~ +```text ssh-ed25519 <SNIPPED FOR SECURITY> user-name@computer-name -~~~ +``` **Make sure you copy the `.pub` file and not the private key!** Your private key lives on your machine and is never shared with anyone else. Then click **Add key**, and you're done! 
diff --git a/technology_and_tooling/version_control/03-create.md b/technology_and_tooling/version_control/03-create.md index 2bcdf850..705b042a 100644 --- a/technology_and_tooling/version_control/03-create.md +++ b/technology_and_tooling/version_control/03-create.md @@ -1,17 +1,19 @@ --- name: Creating a Repository -dependsOn: [ - technology_and_tooling.version_control.02-setup -] +dependsOn: [technology_and_tooling.version_control.02-setup] tags: [git] +learningOutcomes: + - Create a repository from a template. + - Clone and use a Git repository. + - Demonstrate the process of cloning and using a Git repository. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- ## Creating a Repository @@ -29,6 +31,7 @@ We should get prompted to give details for what we'd like our copy of the templa ![Repository Details](fig/03-create/template-details.png) :::callout + ## Public or Private? GitHub will allow you to create private repositories, so only people you specify can access the code, but it's always best to keep your code public - especially if you're going to use it in a paper! @@ -40,7 +43,6 @@ A major advantage of this is if you leave academia, or you switch institution an After a brief wait, GitHub will have created a **remote repository** - a copy of the files and their history stored on GitHub's servers. 
- ## Cloning the Repository Next, from the new GitHub repository click on the **code** button, and you should have a choice of ways to copy the code. Select **SSH**, then click the copy button to copy the repository's URL: @@ -50,71 +52,75 @@ Next, from the new GitHub repository click on the **code** button, and you shoul Now we'll download a copy of the repository to our server. :::callout + ## SSH vs HTTPS -**Make sure you select SSH!** Whilst Git supports both **HTTPS** and **SSH**, **GitHub** will only let you *download* with **HTTPS**, as it's less secure. +**Make sure you select SSH!** Whilst Git supports both **HTTPS** and **SSH**, **GitHub** will only let you _download_ with **HTTPS**, as it's less secure. ::: We have our SSH key in place and have created our new repository from the template, so we can finally clone the repository to our machine: -~~~bash -$ git clone git@github.com:yourname/climate-analysis.git -~~~ +```bash +git clone git@github.com:yourname/climate-analysis.git +``` After you enter the `git clone` command, you should see: -~~~ +```text Cloning into 'climate-analysis'... The authenticity of host 'github.com (140.82.121.4)' can't be established. ECDSA key fingerprint is SHA256:p2QAMXNIC1TJYWeIOttrVc98/R1BUFWu3/LiyKgUfQM. ECDSA key fingerprint is MD5:7b:99:81:1e:4c:91:a5:0d:5a:2e:2e:80:13:3f:24:ca. Are you sure you want to continue connecting (yes/no)? yes -~~~ +``` Then, when you're prompted, continue the connection with `yes` and it will finish downloading: -~~~ +```text remote: Enumerating objects: 4, done. remote: Counting objects: 100% (4/4), done. remote: Compressing objects: 100% (4/4), done. remote: Total 4 (delta 0), reused 3 (delta 0), pack-reused 0 Receiving objects: 100% (4/4), done. -~~~ +``` Now, if we use `ls` to list the contents of the directory, we should see we have a new directory, called `climate-analysis`, that's a **local repository** containing the code from our **remote repository**. 
This is linked up automatically - making it easy for us to download updates to the remote repository, or to send our changes back up to it. :::callout + ## What if I Accidentally Cloned the Repository using HTTPS? As a note, if you've already cloned a repository you can check if you selected **HTTPS** as the access method using, e.g.: -~~~bash -$ cd climate-analysis -$ git remote -v -~~~ +```bash +cd climate-analysis +git remote -v +``` -~~~ -origin git@github.com:yourname/climate-analysis (fetch) -origin git@github.com:yourname/climate-analysis (push) -~~~ +```text +origin git@github.com:yourname/climate-analysis (fetch) +origin git@github.com:yourname/climate-analysis (push) +``` In this case, we're using SSH. If you see **HTTPS**, you can fix this with the following: -~~~bash -$ git remote set-url origin git@github.com:yourname/climate-analysis -~~~ +```bash +git remote set-url origin git@github.com:yourname/climate-analysis +``` + ::: :::callout + ## Creating Repositories Locally - + We've shown you how to create a repository on GitHub then download it via `git clone`, but you don't have to do it that way. If you want, you can create a repository locally by entering any directory and using `git init`. This turns any directory into a **git repository**, one stored entirely locally on your computer. After you've used `git init` to turn a directory into a repository, you can use the other commands we introduce in this section to add files to it. We still want to make sure our **local repository** is linked to a **remote repository** on GitHub though! To do that, you can [make an empty repository on GitHub](https://github.com/new) and name it. Once you've got that, you can then connect your **local repository** to it using `git remote add origin git@github.com:yourname/repositoryname`. 
- + `git remote add` tells your local repository to link up to a remote one, and `origin git@github.com:yourname/repositoryname` tells it that the remote is at `git@github.com:yourname/repositoryname`, and can be referred to as `origin`. You can link a **local repository** to many **remote repositories** if you want, but the main one is always called `origin`. ::: @@ -122,14 +128,14 @@ We still want to make sure our **local repository** is linked to a **remote repo Now, let's **change to our code directory** and look at the files we just downloaded. -~~~bash -$ cd ~/climate-analysis -$ ls -~~~ +```bash +cd ~/climate-analysis +ls +``` -~~~ +```text climate_analysis.py temp_conversion.py -~~~ +``` These are some Python files for analysing climate data- you'll recognise them if you've done some of our earlier lessons. @@ -138,40 +144,41 @@ Don't worry, you don't need to know Python to follow along. You'll notice that even though this directory is a **version control repository**, nothing actually looks special about it. But, if we add the `-a` flag to show everything, we can see that there's a hidden directory called `.git`: -~~~bash -$ ls -a -~~~ +```bash +ls -a +``` -~~~ +```text . .. climate_analysis.py .git temp_conversion.py -~~~ +``` Git stores information about the project in here. If we ever delete it, we will lose the project's history. -### Check Status +## Check Status We can check that everything is set up correctly by asking Git to tell us the status of our project with the **status** command: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main nothing to commit, working tree clean -~~~ +``` -A **branch** is an independent line of development. We have only one, and the default name is **main**. +A **branch** is an independent line of development. We have only one, and the default name is **main**. 
Our **local repository** is connected to a **remote repository** (called **origin** by default), and is currently up-to-date; we haven't made any changes to the code yet. -Git works on **commits** - snapshots of the current state of the repository. *"nothing to commit, working tree clean"* means that the directory currently looks exactly the same as the last snapshot we took of it, with no changes or edits. +Git works on **commits** - snapshots of the current state of the repository. _"nothing to commit, working tree clean"_ means that the directory currently looks exactly the same as the last snapshot we took of it, with no changes or edits. :::callout + ## Branch names - + In this workshop, we have a **default branch** called **main**. In older versions of Git, if you create a new repository on the command line, it'll have a default branch called **master**, and a lot of examples online will show **master** instead of **main**. Don't worry - branches work the same, regardless of what they're called! -::: \ No newline at end of file +::: diff --git a/technology_and_tooling/version_control/04-changes.md b/technology_and_tooling/version_control/04-changes.md index b719312e..710fd14e 100644 --- a/technology_and_tooling/version_control/04-changes.md +++ b/technology_and_tooling/version_control/04-changes.md @@ -1,82 +1,84 @@ --- name: Tracking Changes -dependsOn: [ - technology_and_tooling.version_control.03-create -] +dependsOn: [technology_and_tooling.version_control.03-create] tags: [git] +learningOutcomes: + - Perform the modify-add-commit cycle for one or more files. + - Explain the storage of changes at each stage in the modify-add-commit cycle. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". 
- url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- ## Tracking Changes We've got a repository now containing a few pre-existing files - so let's add one more. You might remember seeing GitHub suggest we added a README.md to let people know what our code is about, so let's do that on the command line. We'll use the text editor `nano`, as: -~~~bash -$ nano README.md -~~~ +```bash +nano README.md +``` -Then type an example description: +Then type an example description: -~~~ +```text # Climate Analysis Toolkit This is a set of python scripts designed to analyse climate datafiles. -~~~ +``` -We can save our file using `Control-O` (`Control` and `O` at the same time), then `Enter`, and quit out of nano using `Control-X`. +We can save our file using `Control-O` (`Control` and `O` at the same time), then `Enter`, and quit out of nano using `Control-X`. Our description is a bit brief, but it's enough for now! Let's try `git status` again: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Untracked files: # (use "git add <file>..." to include in what will be committed) # -# README.md +# README.md nothing added to commit but untracked files present (use "git add" to track) -~~~ +``` Now, whilst our current snapshot of the repository is up-to-date, we've added a new file that we're not tracking yet. We can tell Git to track the file we've just created using `git add`: -~~~bash -$ git add README.md -~~~ +```bash +git add README.md +``` and then check that the right thing happened: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Changes to be committed: # (use "git reset HEAD <file>..." 
to unstage) -# -# new file: README.md +# new file: README.md # -~~~ +``` -Git now knows that it's supposed to **keep track** of `README.md`, just like `climate_analysis.py` and `temp_conversion.py` but it **hasn't recorded that as a commit** yet. We dont have a snapshot of the repository with all the existing files *and* `README.md`. +Git now knows that it's supposed to **keep track** of `README.md`, just like `climate_analysis.py` and `temp_conversion.py`, but it **hasn't recorded that as a commit** yet. We don't have a snapshot of the repository with all the existing files _and_ `README.md`. ### Initial Commit + To get it to do that, we need to run one more command: -~~~bash -$ git commit -m "Added a basic readme file." -~~~ +```bash +git commit -m "Added a basic readme file." +``` We use the `-m` flag (for "**message**") to record a short, **descriptive comment** that will help us remember later on what we did and why. @@ -91,11 +93,11 @@ changes made in the commit, **NOT "Bug Fixes"** or **"Changes"**! If you want to go into more detail, add a blank line between the summary line and your additional notes. -~~~ +```text [main fa90884] Added a basic readme file. 1 file changed, 3 insertions(+) create mode 100644 README.md -~~~ +``` When we run `git commit`, Git takes everything we have told it to save by using `git add` @@ -106,17 +108,17 @@ and its short **identifier** is `fa90884`. If we run `git status` now: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Your branch is ahead of 'origin/main' by 1 commit. # (use "git push" to publish your local commits) # nothing to commit, working directory clean -~~~ +``` it tells us our local repository is up-to-date, although now we have edits to it that the remote version of it doesn't (we'll get to that later!). @@ -130,12 +132,13 @@ but **not yet committed**. and `git commit` then copies them to long-term storage (as a commit) :::callout + ## What's the Point of the Staging Area?
Why do we have this two-stage process, where we **add** files to the staging area, then create a **commit** from them? Among other reasons, it allows you to easily bundle together a lot of changes in one go. If you changed the name of a variable used in multiple files (e.g. from `t` to `temperature`), you'd need to change it in all your files in one go in order for it to make sense. -If you stored a copy of each file one-by-one you'd end up with a lot of versions of the code that didn't work - variables with different names everywhere. The **staging area** lets you bundle together all those small changes that don't work in isolation into one big change that's coherent. +If you stored a copy of each file one-by-one you'd end up with a lot of versions of the code that didn't work - variables with different names everywhere. The **staging area** lets you bundle together all those small changes that don't work in isolation into one big change that's coherent. Git does give you shortcuts to reduce **add -> commit** to a single step, but when you're starting out it's always better to make sure you know what's going into each commit! ::: @@ -145,11 +148,11 @@ Git does give you shortcuts to reduce **add -> commit** to a single step, but wh If we want to know what we've done recently, we can ask Git to show us the **project's history** using `git log`: -~~~bash -$ git log -~~~ +```bash +git log +``` -~~~ +```text commit fa90884ca03dcefb97e415a374ac1aacaaa94c91 (HEAD -> main) Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 16 15:22:29 2022 +0000 @@ -161,46 +164,47 @@ Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 16 14:19:13 2022 +0000 Initial commit -~~~ +``` `git log` lists all **revisions committed** to a repository in reverse chronological order (most recent at the top).
The listing for each revision includes -* the **revision's full identifier** (which starts with the same characters as the short identifier printed by the `git commit` command earlier), -* the **branch** it was created on (including whether or not it's up-to-date with any **remote versions of that branch** - in this case, our last README commit hasn't been pushed to the remote repo yet), -* the revision's **author**, -* **when** it was created, -* the **log message** Git was given when the revision was committed. - +- the **revision's full identifier** (which starts with the same characters as the short identifier printed by the `git commit` command earlier), +- the **branch** it was created on (including whether or not it's up-to-date with any **remote versions of that branch** - in this case, our last README commit hasn't been pushed to the remote repo yet), +- the revision's **author**, +- **when** it was created, +- the **log message** Git was given when the revision was committed. :::callout + ## Compatibility Notice If you don't see information on the **remote branches**, try `git log --decorate`. -This ensures output will indicate, for each commit revision, whether it is up-to-date with its *remote* repository, if one exists. +This ensures output will indicate, for each commit revision, whether it is up-to-date with its _remote_ repository, if one exists. Older versions of git don't show this information by default. ::: ### Modify a file (1) + Now suppose we modify an existing file, for example by adding a **Docstring** to the **top** of one of the files: -~~~bash -$ nano climate_analysis.py -~~~ +```bash +nano climate_analysis.py +``` -~~~ +```text """ Climate Analysis Tools """ -~~~ +``` When we run `git status` now, it tells us that a file it already knows about has been modified: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Your branch is ahead of 'origin/main' by 1 commit. 
# (use "git push" to publish your local commits) @@ -209,33 +213,35 @@ $ git status # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # -# modified: climate_analysis.py +# modified: climate_analysis.py # no changes added to commit (use "git add" and/or "git commit -a") -~~~ +``` The last line is the key phrase: "no changes added to **commit**". - So, while we have changed this file, we haven't yet told Git we want to save those changes (which we do with `git add`), much less actually saved them (which we do with `git commit`). -**It's important to remember that git only stores changes when you make a commit** +::::callout{variant="warning"} +It's important to remember that git only stores changes when you make a commit +:::: ### Review Changes and Commit + It is good practice to always **review our changes** before saving them. We do this using `git diff`. This shows us the differences between the current state of the file and the most recently committed version: -~~~bash -$ git diff -~~~ +```bash +git diff +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index 277d6c7..d5b442d 100644 --- a/climate_analysis.py @@ -245,7 +251,7 @@ index 277d6c7..d5b442d 100644 import sys import temp_conversion import signal -~~~ +``` The output is **cryptic** because it is actually a series of **commands** for tools like editors and `patch` @@ -253,18 +259,18 @@ telling them how **to reconstruct one file given the other**. The key things to note are: - 1. Line 1: The **files** that are being **compared** (a/ and b/ are labels, not paths) - 2. Line 2: The two **hex strings** on the second line which parts of the **hashes** of the files being compares - 3. Line 5: The **lines** that have changed. (It's complex) - 4. Below that, the changes - note the '**+**' marker which shows an addtion +1.
Line 1: The **files** that are being **compared** (a/ and b/ are labels, not paths) +2. Line 2: The two **hex strings** on the second line, which are parts of the **hashes** of the files being compared +3. Line 5: The **lines** that have changed (the format here is a little cryptic) +4. Below that, the changes - note the '**+**' marker which shows an addition After reviewing our change, it's time to commit it: -~~~bash -$ git commit -m "Add Docstring" -~~~ +```bash +git commit -m "Add Docstring" +``` -~~~ +```text # On branch main # Your branch is ahead of 'origin/main' by 1 commit. # (use "git push" to publish your local commits) @@ -273,24 +279,24 @@ $ git commit -m "Add Docstring" # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # -# modified: climate_analysis.py +# modified: climate_analysis.py # no changes added to commit (use "git add" and/or "git commit -a") -~~~ +``` **Whoops**: Git won't commit because we didn't use `git add` first. Let's fix that: -~~~bash -$ git add climate_analysis.py -$ git commit -m "Add Docstring" -~~~ +```bash +git add climate_analysis.py +git commit -m "Add Docstring" +``` -~~~ +```text [main 55d3f56] Add Docstring 1 file changed, 1 insertion(+) -~~~ +``` Git insists that we **add** files to the set we want to commit before actually committing anything @@ -299,38 +305,37 @@ because we may not want to commit **everything at once**. For example, suppose we've **fixed a bug** in some existing code, but also added new code that's **not ready to share**. - ### One more addition What if we've made some edits, added them, and then forgotten what they were?
Let's add another line to the end of the file: -~~~bash -$ nano climate_analysis.py -~~~ +```bash +nano climate_analysis.py +``` -~~~ +```text # TODO(smangham): Add call to process rainfall -~~~ +``` Check what's changed with **diff**: -~~~bash -$ git diff -~~~ +```bash +git diff +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index d5b442d..6f8ed8a 100644 --- a/climate_analysis.py +++ b/climate_analysis.py @@ -26,3 +26,5 @@ for line in climate_data: kelvin = temp_conversion.fahr_to_kelvin(fahr) - + print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` So far, so good: we've added one line to the end of the file @@ -338,69 +343,70 @@ we've added one line to the end of the file Now let's put that change in the staging area (or **add it to the change set**), then go away for the weekend. When we come back, we can't remember what we added, so we see what `git diff` reports: -~~~bash -$ git add climate_analysis.py -$ git diff -~~~ +```bash +git add climate_analysis.py +git diff +``` -~~~ -~~~ +```text + +``` **There is no output**! This is because **git diff** shows us the differences between the **working copy** and what's been added to the **change set** in the staging area. However, if we add the `--staged` flag to the command: -~~~bash -$ git diff --staged -~~~ +```bash +git diff --staged +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index d5b442d..6f8ed8a 100644 --- a/climate_analysis.py +++ b/climate_analysis.py @@ -26,3 +26,5 @@ for line in climate_data: kelvin = temp_conversion.fahr_to_kelvin(fahr) - + print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` it shows us the difference between the last **committed change** and what's in the **staging area**. You might not use this often, but it's very useful when you come back to a project you've left for a while!
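The working-copy / staging-area distinction above can be tried end-to-end in a throwaway repository. This is only a sketch: the scratch directory, file contents, and commit messages below are illustrative, not part of the lesson's repository.

```bash
# Sketch: git diff vs git diff --staged, in a scratch repository
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

echo "print('hello')" > climate_analysis.py
git add climate_analysis.py
git commit -q -m "Initial commit"

echo "# TODO: process rainfall" >> climate_analysis.py
git add climate_analysis.py   # stage the edit

git diff                      # no output: working copy matches the staging area
git diff --staged             # shows the staged edit against the last commit
```

Because the edit has already been staged, plain `git diff` has nothing to report; only the `--staged` form compares against the last commit.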
Let's **commit** our changes: -~~~bash -$ git commit -m "Add rainfall processing placeholder" -~~~ +```bash +git commit -m "Add rainfall processing placeholder" +``` -~~~ +```text [main 6f60ad6] Add rainfall processing placeholder 1 file changed, 2 insertions(+) -~~~ +``` Let's now check our status: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Your branch is ahead of 'origin/main' by 3 commits. # (use "git push" to publish your local commits) # nothing to commit, working directory clean -~~~ +``` And now look at the history of what we've done so far: -~~~bash -$ git log -~~~ +```bash +git log +``` -~~~ +```text commit 6f60ad638f344fbb5fdf81f05a804f7417984eec (HEAD -> main) Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 16 15:40:30 2022 +0000 @@ -424,7 +430,7 @@ Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 16 14:19:13 2022 +0000 Initial commit -~~~ +``` ![Differences](fig/04-changes/diff.svg) diff --git a/technology_and_tooling/version_control/05-history.md b/technology_and_tooling/version_control/05-history.md index 2202db38..ad489ce1 100644 --- a/technology_and_tooling/version_control/05-history.md +++ b/technology_and_tooling/version_control/05-history.md @@ -1,28 +1,30 @@ --- name: Exploring History -dependsOn: [ - technology_and_tooling.version_control.04-changes -] +dependsOn: [technology_and_tooling.version_control.04-changes] tags: [git] +learningOutcomes: + - Identify and use Git revision numbers. + - Analyse files by comparing them with previous versions. + - Describe the process of restoring previous versions of files. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". 
- url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- ## Exploring History We've seen that `git log` gives us some information on what commits were made when, but let's look a bit deeper at the specifics: -~~~bash -$ git log -~~~ +```bash +git log +``` -~~~ +```text commit f15ad111042cee7492f40ad6ff0ec18588fce753 (HEAD -> main) Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 30 17:15:47 2022 +0100 @@ -46,7 +48,7 @@ Author: Sam Mangham <mangham@gmail.com> Date: Wed Mar 16 14:19:13 2022 +0000 Initial commit -~~~ +``` We can see commits identified by long IDs, but also **HEAD** at the top of the log. **HEAD** is the name used to refer to the **most recent** end of the chain of commits to our **local repository**. @@ -60,11 +62,11 @@ so `HEAD~1` (pronounced "head minus one") means "the previous commit", while `HEAD~123` goes back 123 commits from the latest one. -~~~bash -$ git diff HEAD~1 climate_analysis.py -~~~ +```bash +git diff HEAD~1 climate_analysis.py +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index d5b442d..c463f71 100644 --- a/climate_analysis.py @@ -75,15 +77,15 @@ index d5b442d..c463f71 100644 print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` So we see the difference between the file as it is now, and as it was **the commit before the latest one**.
-~~~bash -$ git diff HEAD~2 climate_analysis.py -~~~ +```bash +git diff HEAD~2 climate_analysis.py +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index 277d6c7..c463f71 100644 --- a/climate_analysis.py @@ -99,14 +101,14 @@ index 277d6c7..c463f71 100644 print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` And here we see the state **before the last two commits**, HEAD minus 2. ### Absolute History -What about if we want to compare our version of the code to the version from last month, or from the version we used to make a paper last year? -Calculating the number of commits is wildly impractical. +What about if we want to compare our version of the code to the version from last month, or from the version we used to make a paper last year? +Calculating the number of commits is wildly impractical. Instead, we can refer to **specific revisions** using those long strings of digits and letters that `git log` displays. These are unique IDs for the changes, @@ -116,11 +118,11 @@ has a unique 40-character identifier. (A SHA-1 hash of the new, post-commit stat If we scroll down to the bottom of the `git log` output, we can see the ID for our first commit - in the example above, it's `499b6d18b36a25d3f5ab9be1b708ea48fef1dd65` (but **yours will be different!**). Try this, substituting your first commit's ID: -~~~bash -$ git diff 499b6d18b36a25d3f5ab9be1b708ea48fef1dd65 climate_analysis.py -~~~ +```bash +git diff 499b6d18b36a25d3f5ab9be1b708ea48fef1dd65 climate_analysis.py +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index 277d6c7..6f8ed8a 100644 --- a/climate_analysis.py @@ -132,20 +134,20 @@ index 277d6c7..6f8ed8a 100644 import signal @@ -25,3 +26,5 @@ for line in climate_data: kelvin = temp_conversion.fahr_to_kelvin(fahr) - + print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` We can now see all the changes since a specific commit! 
However, typing random 40-character strings is annoying and incredibly easy to typo, so Git lets us use just the first **seven**: -~~~bash -$ git diff 499b6d1 climate_analysis.py -~~~ +```bash +git diff 499b6d1 climate_analysis.py +``` -~~~ +```text diff --git a/climate_analysis.py b/climate_analysis.py index 277d6c7..6f8ed8a 100644 --- a/climate_analysis.py @@ -157,17 +159,18 @@ index 277d6c7..6f8ed8a 100644 import signal @@ -25,3 +26,5 @@ for line in climate_data: kelvin = temp_conversion.fahr_to_kelvin(fahr) - + print(str(celsius)+", "+str(kelvin)) + +# TODO(smangham): Add call to process rainfall -~~~ +``` This is particularly handy as you can **exactly identify specific versions of the code**, for example the one you used to write your first paper, and the different, newer version you used to write your second paper. ![Differencing](fig/05-history/diff.svg) :::callout + ## Other Ways To Reference Commits Newer versions of Git have some more advanced ways of referencing past commits. In place of `HEAD~1` you can use `HEAD~` or `HEAD@{1}`, @@ -181,26 +184,26 @@ we can **save changes** to files and **see what we've changed** — suppose Let's suppose we **accidentally** overwrite or delete our file: -~~~bash -$ rm climate_analysis.py -$ ls -~~~ +```bash +rm climate_analysis.py +ls +``` -~~~ +```text README.md temp_conversion.py -~~~ +``` **Whoops!** `git status` now tells us that the file has been changed, but those changes haven't been staged: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch main # Your branch is ahead of 'origin/main' by 3 commits. # (use "git push" to publish your local commits) @@ -209,48 +212,49 @@ $ git status # (use "git add/rm <file>..." to update what will be committed) # (use "git restore <file>..." 
to discard changes in working directory) # -# deleted: climate_analysis.py +# deleted: climate_analysis.py # no changes added to commit (use "git add" and/or "git commit -a") -~~~ +``` Following the helpful hint in that output, we can put things back the way they were by using `git restore`: -~~~bash -$ git restore climate_analysis.py -$ cat climate_analysis.py -~~~ +```bash +git restore climate_analysis.py +cat climate_analysis.py +``` -~~~ +```text [SNIPPED - but changes rolled back] -~~~ +``` -By default, `restore` replaces the file with the version of it in the *staging area*. If you haven't used `git add`, that should be the same as the version in the last commit. But what if we already used `git add` on our incorrect version of a file, or we broke the file more than one commit ago? +By default, `restore` replaces the file with the version of it in the _staging area_. If you haven't used `git add`, that should be the same as the version in the last commit. But what if we already used `git add` on our incorrect version of a file, or we broke the file more than one commit ago? We can use `git checkout`, e.g.: -~~~bash -$ git checkout <HEAD or commit ID> climate_analysis.py -~~~ - +```bash +git checkout <HEAD or commit ID> climate_analysis.py +``` :::callout + ## Compatibility Notice Older versions of Git don't include the `git restore` command - fortunately, it's just a shortcut for `git checkout --`. If `git restore` doesn't work, try `git checkout -- temp_conversion.py`. -`checkout` has a *lot* of functions, and newer versions of Git simplify things by giving them new names. +`checkout` has a _lot_ of functions, and newer versions of Git simplify things by giving them new names. ::: :::callout + ## Double Whoops -What if you accidentally did `git rm climate_analysis.py`? That command tells Git to *delete the file and remove it from the repository* - so it will record that the file has been deleted, then stop tracking further changes. 
+What if you accidentally did `git rm climate_analysis.py`? That command tells Git to _delete the file and remove it from the repository_ - so it will record that the file has been deleted, then stop tracking further changes. Even if you re-make the file, it won't be tracked until you use `git add` on it again. -The file still exists in the *history*, though so if you want to undo this you can do `git checkout HEAD climate_analysis.py`, to get the file back and start tracking it again. -Since you can retrieve any file that existed in *a* previous commit, even if you removed it from future ones, this makes it important to not commit files containing passwords or sensitive information! +The file still exists in the _history_, though, so if you want to undo this you can do `git checkout HEAD climate_analysis.py` to get the file back and start tracking it again. +Since you can retrieve any file that existed in _a_ previous commit, even if you removed it from future ones, it's important not to commit files containing passwords or sensitive information! ::: ![Restoring Files](fig/05-history/restore.svg) @@ -258,8 +262,8 @@ Since you can retrieve any file that existed in *a* previous commit, even if you The fact that files can be reverted one by one tends to change the way people organize their work. -Consider a situation where all your code is in one file, -and you fixed a bug in one section but accidentally introduced one elsewhere. +Consider a situation where all your code is in one file, +and you fixed a bug in one section but accidentally introduced one elsewhere. -You can't just roll back to fix one bug without un-fixing the other. +You can't just roll back to fix one bug without un-fixing the other. However, if each section is in its own file, you can just roll back the section you broke!
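The recovery workflow above can be reproduced as a self-contained sketch in a scratch repository; the file name and contents here are illustrative, not the lesson's actual files.

```bash
# Sketch: retrieving an older version of a file with git checkout
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email "you@example.com"
git config user.name "Example User"

echo "good version" > notes.txt
git add notes.txt
git commit -q -m "Good version"

echo "broken version" > notes.txt
git commit -q -am "Accidentally broke it"

git checkout HEAD~1 notes.txt   # bring back the file as it was one commit ago
cat notes.txt                   # prints: good version
```

`git checkout <commit> <file>` updates both the working copy and the staging area, so the restored version is ready to commit straight away.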
diff --git a/technology_and_tooling/version_control/06-remote.md b/technology_and_tooling/version_control/06-remote.md index c5ff5088..75c71d27 100644 --- a/technology_and_tooling/version_control/06-remote.md +++ b/technology_and_tooling/version_control/06-remote.md @@ -1,17 +1,18 @@ --- name: Remote Repositories -dependsOn: [ - technology_and_tooling.version_control.05-history -] +dependsOn: [technology_and_tooling.version_control.05-history] tags: [git] +learningOutcomes: + - Apply git push and git pull. + - Resolve conflicts encountered during remote repository operations. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- We've learned how to use a **local repository** to store our code and view changes: @@ -21,10 +22,11 @@ We've learned how to use a **local repository** to store our code and view chang Now, however, we'd like to share the changes we've made to our code with others, as well as making sure we have an off-site backup in case things go wrong. We need to upload our changes in our **local repository** to a **remote repository**. :::callout + ## Why Have an Off-site Backup? You might wonder why having an off-site backup (i.e. a copy not stored at your University) is so important. -In 2005, [a fire destroyed a building at the University of Southampton](http://news.bbc.co.uk/1/hi/england/hampshire/4390048.stm). Some people's *entire PhD projects* were wiped out in the blaze. 
+In 2005, [a fire destroyed a building at the University of Southampton](http://news.bbc.co.uk/1/hi/england/hampshire/4390048.stm). Some people's _entire PhD projects_ were wiped out in the blaze. To ensure your PhD only involves a normal level of suffering, please make sure you have off-site backups of as much of your work as possible! ![Mountbatten Fire](fig/06-remote/mountbatten-fire.jpg) @@ -34,15 +36,15 @@ To do that, we'll use the **remote repository** we set up on GitHub at the start ![Remote Repositories](fig/06-remote/remote.svg) -So we're finally going to address all those *"Your branch is ahead of 'origin/main' by 3 commits"* messages we got from `git status`! However, GitHub doesn't let just anyone push to your repository - you need to prove you're the owner (or have been given access). Fortunately, we already set up an SSH key earlier. +So we're finally going to address all those _"Your branch is ahead of 'origin/main' by 3 commits"_ messages we got from `git status`! However, GitHub doesn't let just anyone push to your repository - you need to prove you're the owner (or have been given access). Fortunately, we already set up an SSH key earlier. Now we can synchronise our code to the remote repository, with `git push`: -~~~bash -$ git push -~~~ +```bash +git push +``` -~~~ +```text Counting objects: 11, done. Delta compression using up to 32 threads. Compressing objects: 100% (9/9), done. @@ -51,12 +53,13 @@ Total 9 (delta 2), reused 0 (delta 0) remote: Resolving deltas: 100% (2/2), completed with 1 local object. To git@github.com:smangham/climate-analysis 70bf8f3..501e88f main -> main -~~~ +``` And we're done! This bit was easy as when we used `git clone` earlier, it set up our **local repository** to **track** the **remote repository**. The `main -> main` line shows we're sending our local branch called `main` to the remote repository as a branch called `main`. :::callout -## What *is* a Branch, Though? + +## What _is_ a Branch, Though? 
Branches allow you to have alternate versions of the code 'branching off' from another branch (e.g. `main`). You can try out new features in these branches without disrupting your `main` version of the code, then **merge them in** once you've finished. We have a **Stretch Episode** that gives you a brief introduction to them! @@ -69,7 +72,9 @@ If we go back to the repository on GitHub, we can refresh the page and see our u Conveniently, the contents of `README.md` are shown on the main page, with formatting. [You can also add links, tables and more](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax). Your code should always have a descriptive `README.md` file, so anyone visiting the repo can easily get started with it. :::callout + ## How often should I push? + Every day. You can never predict when your hard disk will fail or your building will be destroyed! ![In case of fire, git commit, git push, leave building](fig/06-remote/incaseoffire.jpg) [Credit: Mitch Altman, CC BY-SA 2.0](https://www.flickr.com/photos/maltman23/38138235276) @@ -81,33 +86,33 @@ Now we know how to **push** our work from our local repository to a remote one, We want to invite other people to collaborate on our code, so we'll update the `README.md` with a request for potential collaborators to email us at our University email address. -~~~bash -$ nano README.md -$ cat README.md -~~~ +```bash +nano README.md +cat README.md +``` -~~~ +```text # Climate Analysis Toolkit This is a set of python scripts designed to analyse climate datafiles. If you're interested in collaborating, email me at s.w.mangham@soton.ac.uk. 
-~~~ +``` -~~~bash -$ git commit -am "Added collaboration info" -~~~ +```bash +git commit -am "Added collaboration info" +``` -~~~ +```text [main 39a2c8f] Added collaboration info 1 file changed, 2 insertions(+) -~~~ +``` -In this case, we use `git commit -am` where the `-a` means **commit all modified files we've previously used `git add` on**, and the `-m` bit means 'and here's the commit message' as usual. It's a handy shortcut. +In this case, we use `git commit -am` where the `-a` means **commit all modified files we've previously used `git add` on**, and the `-m` bit means 'and here's the commit message' as usual. It's a handy shortcut. -But **don't push to GitHub** just yet! We're going to set up a small conflict, of the kind you might see when working with a remote repository. What happens if you change a file at the same time as one of your collaborators does, and you *both* commit those changes? How does GitHub know which version of the file is 'correct'? +But **don't push to GitHub** just yet! We're going to set up a small conflict, of the kind you might see when working with a remote repository. What happens if you change a file at the same time as one of your collaborators does, and you _both_ commit those changes? How does GitHub know which version of the file is 'correct'? -Pretending to be an existing collaborator, we'll go and add those installation instructions by editing our `README.md` file directly on GitHub. This isn't *common*, but if you want to quickly make some small changes to a single file it can be useful. We edit it as: +Pretending to be an existing collaborator, we'll go and add those installation instructions by editing our `README.md` file directly on GitHub. This isn't _common_, but if you want to quickly make some small changes to a single file it can be useful. 
We edit it as: ![GitHub edit button](fig/06-remote/edit-button.png) @@ -125,11 +130,11 @@ Then commit the changes directly to our `main` branch with a descriptive commit Great. Now let's go back to the terminal and try pushing our local changes to the remote repository. This is going to cause problems, just as we expected: -~~~bash -$ git push -~~~ +```bash +git push +``` -~~~ +```text To git@github.com:smangham/climate-analysis ! [rejected] main -> main (fetch first) error: failed to push some refs to 'git@github.com:smangham/climate-analysis' @@ -138,19 +143,19 @@ hint: not have locally. This is usually caused by another repository pushing hint: to the same ref. You may want to first merge the remote changes (e.g., hint: 'git pull') before pushing again. hint: See the 'Note about fast-forwards' in 'git push --help' for details. -~~~ +``` -Git helpfully tells us that actually, there are commits present in the **remote repository** that we don't have in our **local repository**. +Git helpfully tells us that actually, there are commits present in the **remote repository** that we don't have in our **local repository**. ### Merge Conflicts We'll need to **pull** those commits into our local repository before we can push our own updates back! -~~~bash -$ git pull -~~~ +```bash +git pull +``` -~~~ +```text remote: Enumerating objects: 5, done. remote: Counting objects: 100% (5/5), done. remote: Compressing objects: 100% (3/3), done. @@ -161,74 +166,72 @@ From github.com:smangham/climate-analysis Auto-merging README.md CONFLICT (content): Merge conflict in README.md Automatic merge failed; fix conflicts and then commit the result. -~~~ +``` :::callout + ## Compatibility Notice Newer versions of git will default to attempting to merge conflicting 'histories'. Older versions might not - and they'll give you a message like: -~~~ +```text hint: You have divergent branches and need to specify how to reconcile them. 
hint: You can do so by running one of the following commands sometime before hint: your next pull: -hint: +hint: hint: git config pull.rebase false # merge hint: git config pull.rebase true # rebase hint: git config pull.ff only # fast-forward only -hint: +hint: hint: You can replace "git config" with "git config --global" to set a default hint: preference for all repositories. You can also pass --rebase, --no-rebase, hint: or --ff-only on the command line to override the configured default per hint: invocation. -fatal: Need to specity how to reconcile divergent branches -~~~ +fatal: Need to specify how to reconcile divergent branches +``` We want to default to **merging**. **Fast forward** and **rebase** are advanced options you'd typically only see used in large teams in industry. So as git suggests, we can fix our problem with: -~~~bash -$ git config --global pull.rebase false -$ git pull -~~~ +```bash +git config --global pull.rebase false +git pull +``` + Now we'll get the same behaviour as newer versions of git. ::: We have created a conflict! Both we and our remote collaborator edited `README.md`. Let's take a look at the file: -~~~bash -$ cat README.md -~~~ +```bash +cat README.md +``` -~~~ +```text # Climate Analysis Toolkit This is a set of python scripts designed to analyse climate datafiles. <<<<<<< HEAD If you're interested in collaborating, email me at s.w.mangham@soton.ac.uk. ======= To install a copy of the toolkit, open a terminal and run: git clone git@github.com:smangham/climate-analysis.git **This code is currently in development and not all features will work** >>>>>>> 493dd81b5d5b34211ccff4b5d0daf8efb3147755 -~~~ +``` - -Git has tried to auto-merge the files, but unfortunately failed.
It can handle most conflicts by itself, but if two commits edit the _exact same_ part of a file it will need you to help it. We can see the two different edits we made to the end of the `README.md` file, in a block defined by `<<<`, `===` and `>>>`. The top block is labelled `HEAD` (the changes in our latest local commit), whilst the bottom block is labelled with the commit ID of the commit we made on GitHub. We can easily fix this using `nano`, by deleting all the markers and keeping the text we want: -~~~bash -$ nano README.md -$ cat README.md -~~~ +```bash +nano README.md +cat README.md +``` -~~~ +```text # Climate Analysis Toolkit This is a set of python scripts designed to analyse climate datafiles. @@ -241,23 +244,23 @@ To install a copy of the toolkit, open a terminal and run: **This code is currently in development and not all features will work** -~~~ +``` Now we've got a fixed and finished `README.md` file, we can commit our changes, and push them up to our remote repository: -~~~bash -$ git commit -am "Fixed merge conflict" -~~~ +```bash +git commit -am "Fixed merge conflict" +``` -~~~ +```text [main 6f4df16] Fixed merge conflict -~~~ +``` -~~~bash -$ git push -~~~ +```bash +git push +``` -~~~ +```text Counting objects: 10, done. Delta compression using up to 32 threads. Compressing objects: 100% (6/6), done. @@ -266,7 +269,7 @@ Total 6 (delta 2), reused 0 (delta 0) remote: Resolving deltas: 100% (2/2), completed with 1 local object. To git@github.com:smangham/climate-analysis 023f8f6..09f5151 main -> main -~~~ +``` Now back on GitHub we can see that our `README.md` shows the text from both commits, and our conflict is resolved: @@ -275,6 +278,7 @@ Now back on GitHub we can see that our `README.md` shows the text from both comm Now we can successfully collaboratively develop our research code with others. 
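As an aside (not covered in the lesson above): if a conflicted merge goes wrong part-way through and you'd rather start again than fix it by hand, git can abandon the half-finished merge entirely:

```bash
# Abandon an in-progress, conflicted merge and restore the working
# tree to the state it was in before the merge began:
git merge --abort
```

After this, `git status` reports a clean working tree again, and you can re-run `git pull` whenever you're ready to tackle the conflict.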
:::callout + ## Conflict Mitigation If you've got multiple different people working on a code at once, diff --git a/technology_and_tooling/version_control/07-branches.md b/technology_and_tooling/version_control/07-branches.md index 8a4784c9..c7cfdacc 100644 --- a/technology_and_tooling/version_control/07-branches.md +++ b/technology_and_tooling/version_control/07-branches.md @@ -1,20 +1,21 @@ --- name: Branches -dependsOn: [ - technology_and_tooling.version_control.06-remote -] +dependsOn: [technology_and_tooling.version_control.06-remote] tags: [git] +learningOutcomes: + - Describe the use of branching in version control. + - Use git branch and git merge commands effectively. attribution: -- citation: > + - citation: > This material was originally taken from training materials developed by the University of Southampton Research Software Group, which are based on the Software Carpentries course "Version Control with Git". - url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- -We've seen branches mentioned a *lot* so far - mostly `main`. So what are they? +We've seen branches mentioned a _lot_ so far - mostly `main`. So what are they? A branch is a **parallel version of a repository**. It can **branch off** from a commit, contain its own set of extra commits and edits to files, then easily **merge back** into the branch it came off (or even another!). 
We can visualise this flow of splitting and merging branches like this: @@ -28,13 +29,13 @@ However, if you plan on **making changes to an existing code**, **collaborating ### Sharing Your Code: `main` and `dev` branches -As mentioned, if you're using an existing code written by somebody else, you'll typically just download the `main` branch and use that. What if, though, the author(s) of the code want to continue working on it without the potential users downloading half-finished or untested code? They could keep all their changes local and only commit and push once a new feature has been completed and rigorously tested, but that's not particularly sustainable for large features. It could potentially take months to add a new feature (a long time to go without a backup!), and you might want to share the work-in-progress version with others to test. +As mentioned, if you're using an existing code written by somebody else, you'll typically just download the `main` branch and use that. What if, though, the author(s) of the code want to continue working on it without the potential users downloading half-finished or untested code? They could keep all their changes local and only commit and push once a new feature has been completed and rigorously tested, but that's not particularly sustainable for large features. It could potentially take months to add a new feature (a long time to go without a backup!), and you might want to share the work-in-progress version with others to test. The traditional way to do this is to create a **development branch (`dev` or `develop`) coming off the main branch (`main` or `master`)**. The **main branch** contains tested, finished code that can be shared with others, whilst the **development branch** contains work-in-progress code. Typically you **merge** your development branch into your master branch when your work on it has been tested and is ready to share - for example, when you release a paper using it. 
Then you can continue working on your development branch and sharing your development code with other members of your group.
### Making Changes to an Existing Code: Feature branches
-Once you have a working code, particularly one that's being shared, you'll inevitably want to add new features. You could add them directly to your development branch - however, what happens if, mid-way through, you need to pause the feature and switch to something else as you wait for simulations to finish, new data to arrive, or similar? Instead of ending up with a mess of multiple half-finished modifications, which are impossible to evaluate independently of the other, you can instead create a new **feature branch coming off of your development branch** for each new feature. You work on each new feature or bugfix in their own **feature branch**, and merge them back into your **development branch** once they're tested and complete. Then, as before, once you're ready to publish a paper using your new functionality you merge it all back into the **main branch**.
+Once you have a working code, particularly one that's being shared, you'll inevitably want to add new features. You could add them directly to your development branch - however, what happens if, mid-way through, you need to pause the feature and switch to something else as you wait for simulations to finish, new data to arrive, or similar? Instead of ending up with a mess of multiple half-finished modifications, which are impossible to evaluate independently of each other, you can instead create a new **feature branch coming off of your development branch** for each new feature. You work on each new feature or bugfix in its own **feature branch**, and merge them back into your **development branch** once they're tested and complete. Then, as before, once you're ready to publish a paper using your new functionality you merge it all back into the **main branch**.
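That feature-branch workflow can be sketched with the commands covered in this episode (the branch name `feature/rainfall` here is only an illustration):

```bash
git switch dev                # start from the development branch
git branch feature/rainfall   # create a branch for the new feature...
git switch feature/rainfall   # ...and switch to it
# ...edit, add and commit the feature as usual...
git switch dev                # once it's tested and complete,
git merge feature/rainfall    # merge it back into dev
```

When the feature is merged, the branch can be tidied away with `git branch -d feature/rainfall`.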
### Collaborating With Others: Feature branches @@ -42,7 +43,7 @@ Feature branches also make collaborating with others far easier! Instead of step ## Merging -We've mentioned **merges** repeatedly; as Git tracks the *changes* made to each file in each commit, it can easily determine whether or not the changes made in two branches **conflict** with each other. It can intelligently merge together two modified versions of a file where their changes don't overlap, and highlight sections where they do for you to resolve, showing both versions of the code. +We've mentioned **merges** repeatedly; as Git tracks the _changes_ made to each file in each commit, it can easily determine whether or not the changes made in two branches **conflict** with each other. It can intelligently merge together two modified versions of a file where their changes don't overlap, and highlight sections where they do for you to resolve, showing both versions of the code. These use the same conflict resolution we saw earlier - new files are added seamlessly, whilst modified files use smart conflict resolution and might need your intervention if there's a clash! @@ -50,110 +51,110 @@ These use the same conflict resolution we saw earlier - new files are added seam We can use the `git branch` command to list the branches in our local repository, and let us know which we're on: -~~~bash -$ git branch -~~~ +```bash +git branch +``` -~~~ +```text * main -~~~ +``` At the moment, we only have one - `main` - and the asterisk tells us it's the one we're currently on. We can check this by creating a new branch using `git branch new_branch_name`, and listing them again: -~~~bash -$ git branch dev -$ git branch -~~~ +```bash +git branch dev +git branch +``` -~~~ +```text dev * main -~~~ +``` Now we've got a `dev` branch set up! 
- ### Working with a `dev` branch We'll try a quick example of using the `main` and `dev` branches to have a work-in-progress version of the code that we only share when we've completed and tested it. We can switch to our new branch with: -~~~bash -$ git switch dev -~~~ +```bash +git switch dev +``` -~~~ +```text Switched to branch 'dev' -~~~ +``` :::callout + ## Compatibility Notice -Older versions of Git don't have `git switch` - instead, you have to use `git checkout dev`. As we've already seen, `checkout` has a *lot* of functions, and newer versions of Git simplify things by giving them new names. +Older versions of Git don't have `git switch` - instead, you have to use `git checkout dev`. As we've already seen, `checkout` has a _lot_ of functions, and newer versions of Git simplify things by giving them new names. ::: -Any commits we make on this branch will exist *only* on this branch - when you use `git switch main` to switch back to your **main branch**, they won't show up in your `git log` results! +Any commits we make on this branch will exist _only_ on this branch - when you use `git switch main` to switch back to your **main branch**, they won't show up in your `git log` results! We'll give it a try. In one of our earlier edits to `climate_analysis.py`, we mentioned we wanted to process rainfall measurements in our climate data. Let's imagine these are historic values, in imperial measurements, that we'll need to convert. We'll make a new file, and write a simple function to handle it: -~~~bash -$ nano rainfall_conversion.py -$ cat rainfall_conversion.py -~~~ +```bash +nano rainfall_conversion.py +cat rainfall_conversion.py +``` -~~~ +```text def inches_to_mm(inches): mm = inches * 25.4 return mm -~~~ +``` Now we've made the file, we want to **commit it** to our `dev` branch. 
Make sure you're on the `dev` branch with `git switch dev` if you haven't already, and then add it like we added our changes before: -~~~bash -$ git add rainfall_conversion.py -$ git commit -m "Add rainfall module" -~~~ +```bash +git add rainfall_conversion.py +git commit -m "Add rainfall module" +``` -~~~ +```text [dev b402781] Add rainfall module 1 file changed, 4 insertions(+) create mode 100644 rainfall_conversion.py -~~~ +``` So we've successfully made a new file, and committed it to our repository, on the `dev` branch. Let's take a look at the directory now using `ls`: -~~~bash -$ ls -~~~ +```bash +ls +``` -~~~ +```text README.md climate_analysis.py rainfall_conversion.py temp_conversion.py -~~~ +``` We can see that the `rainfall_conversion.py` file is all present and correct. But we told git that we made it on the `dev` branch - what happens if we switch back to `main` with `git switch` again?: -~~~bash -$ git switch main -~~~ +```bash +git switch main +``` -~~~ +```text Switched to branch 'main' Your branch is up to date with 'origin/main'. -~~~ +``` -~~~bash -$ ls -~~~ +```bash +ls +``` -~~~ +```text README.md climate_analysis.py temp_conversion.py -~~~ +``` -The `rainfall_conversion.py` file isn't present, as the **commit** that created it was made on the `dev` branch. It still exists, and if we use `git switch dev` it'll re-appear. However, whilst we're on `main`, it's tidied away into our hidden `.git` directory. +The `rainfall_conversion.py` file isn't present, as the **commit** that created it was made on the `dev` branch. It still exists, and if we use `git switch dev` it'll re-appear. However, whilst we're on `main`, it's tidied away into our hidden `.git` directory. This doesn't just work on new files. If you edit an existing file on `dev`, then when you switch back to `main` you'll see the old version. @@ -161,21 +162,21 @@ This doesn't just work on new files. 
If you edit an existing file on `dev`, then
Now we've made changes to our `dev` branch, we want to send them up to GitHub, to make sure that we don't lose any of our development work! Let's switch back to `dev` with `git switch`:
-~~~bash
-$ git switch dev
-~~~
+```bash
+git switch dev
+```
-~~~
+```text
Switched to branch 'dev'
-~~~
+```
And use `git push` to synchronise our branch with GitHub, just like we did earlier. However, this time we'll get an error:
-~~~bash
-$ git push
-~~~
+```bash
+git push
+```
-~~~
+```text
fatal: The current branch dev has no upstream branch.
To push the current branch and set the remote as upstream, use
@@ -183,7 +184,7 @@ To push the current branch and set the remote as upstream, use
To have this happen automatically for branches without a tracking upstream, see 'push.autoSetupRemote' in 'git help config'.
-~~~
+```
When we used `git clone` it linked up our `main` branch with the `main` branch on our GitHub repository automatically. Our `dev` branch is new, though, and git doesn't yet know where it should be pushing it to. Fortunately, git has told us what we need to do to tell it (git is good about this!).
@@ -191,25 +192,25 @@ The `origin` argument to `git push` tells it which remote repository we're pushi
We'll use a shortcut for `--set-upstream` - `-u`:
-~~~bash
-$ git push -u origin dev
-~~~
+```bash
+git push -u origin dev
+```
-~~~
+```text
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 4 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 415 bytes | 415.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
-remote:
+remote:
remote: Create a pull request for 'dev' on GitHub by visiting:
remote: https://github.com/smangham/climate-analysis/pull/new/dev
-remote:
+remote:
To github.com:smangham/climate-analysis.git
* [new branch] dev -> dev
branch 'dev' set up to track 'origin/dev'.
-~~~
+```
Now we've got it up on GitHub successfully!
Let's go check on the site: @@ -219,88 +220,91 @@ It defaults to showing the `main` branch, but lets us know there's been a recent ![Viewing dev branch on GitHub](fig/07-branches/push-dev-selected.png) -We can see the `rainfall_conversion.py` file has been uploaded! This makes it easy for us to share work-in-progress versions of our code that others can easily look at. +We can see the `rainfall_conversion.py` file has been uploaded! This makes it easy for us to share work-in-progress versions of our code that others can easily look at. :::callout + ## Linking Remotes It's always worth double-checking before you run `git push origin dev` for the first time - if you're accidentally still on the `main` branch, you can end up pushing it to GitHub as a new branch called `dev`, and having two copies! To avoid this, we can set the 'upstream' for a branch when we make it, using: -~~~bash -$ git branch --track branchname origin/branchname -~~~ - +```bash +git branch --track branchname origin/branchname +``` + But this functionality isn't available on older versions of git. Alternatively, if your git is new enough to suggest it, you can make it automatically link branches to their remote equivalents with: -~~~bash -$ git config --global push.autoSetupRemote true -~~~ +```bash +git config --global push.autoSetupRemote true +``` + ::: :::callout + ## Downloading Branches - + It's easy to share a branch with a collaborator so they can test out a different version of the code. If they `clone` the repository, like we did back at the start, it defaults to `main` but they can download the other branches and try them out too, using: -~~~bash -$ git clone git@github.com:yourname/climate-analysis.git -$ git fetch -$ git switch dev -~~~ +```bash +git clone git@github.com:yourname/climate-analysis.git +git fetch +git switch dev +``` -Where `git fetch` downloads *all* the branches on the remote repository, not just the `main` one. 
+Where `git fetch` downloads _all_ the branches on the remote repository, not just the `main` one. ::: - ## Merging Branches If we're happy with the way our work on the `dev` branch has gone, and we've tested it, we can merge the content back in! Let's switch back to our `main` branch: -~~~bash -$ git switch main -~~~ +```bash +git switch main +``` -~~~ +```text Switched to branch 'main' Your branch is up to date with 'origin/main'. -~~~ +``` Now, to merge the changes from our `dev` branch into the current (`main`) branch, we just need to do: -~~~bash -$ git merge dev -~~~ +```bash +git merge dev +``` -~~~ +```text Updating fd30d36..b402781 Fast-forward rainfall_conversion.py | 4 ++++ 1 file changed, 4 insertions(+) create mode 100644 rainfall_conversion.py -~~~ +``` Now, let's push our updated `main` branch to GitHub: -~~~bash -$ git push -~~~ +```bash +git push +``` -~~~ +```text Total 0 (delta 0), reused 0 (delta 0), pack-reused 0 To github.com:smangham/climate-analysis.git fd30d36..b402781 main -> main -~~~ +``` And we can see on GitHub that the two branches are up-to-date: ![Main up-to-date on GitHub](fig/07-branches/push-main.png) :::callout + ## Pull Requests When we looked at GitHub earlier, we saw a banner letting us know we could compare our branches, make a **Pull Request**: @@ -308,12 +312,12 @@ When we looked at GitHub earlier, we saw a banner letting us know we could compa ![Main up-to-date on GitHub](fig/07-branches/push-dev-selected.png) A **Pull Request** is another way of merging branches, that works better when you're part of a team. 
-There's an interface for discussing the changes you've made with your colleagues,
+There's an interface for discussing the changes you've made with your colleagues,
requesting others peer-review your code, and it shows all your changes in detail:
![Pull request on GitHub](fig/07-branches/pull-request-example.png)
-
+
Then, once you've taken a proper look and you're happy with your changes, you can merge the branches
-through the GitHub web interface.
+through the GitHub web interface.
If you're working as part of a team, it's better to make a **Pull Request** than use `git merge`.
-::: \ No newline at end of file
+:::
diff --git a/technology_and_tooling/version_control/08-ignore.md b/technology_and_tooling/version_control/08-ignore.md
index dc3eea9c..f9f93d36 100644
--- a/technology_and_tooling/version_control/08-ignore.md
+++ b/technology_and_tooling/version_control/08-ignore.md
@@ -1,17 +1,18 @@
---
name: Ignoring Things
-dependsOn: [
- technology_and_tooling.version_control.07-branches
-]
+dependsOn: [technology_and_tooling.version_control.07-branches]
tags: [git]
+learningOutcomes:
+ - Use a .gitignore file to exclude specific files from version control.
+ - Explain the importance and benefits of using .gitignore files.
attribution:
-- citation: >
+ - citation: >
This material was originally taken from training materials developed by the
University of Southampton Research Software Group, which are based on
the Software Carpentries course "Version Control with Git".
- url: https://github.com/Southampton-RSG-Training/git-novice/ - image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png - license: CC-BY-4.0 + url: https://github.com/Southampton-RSG-Training/git-novice/ + image: https://southampton-rsg-training.github.io/git-novice/assets/img/home-logo.png + license: CC-BY-4.0 --- What if we have files that we **do not** want Git to track for us, @@ -19,29 +20,29 @@ like **backup files** created by our editor or **intermediate** files created during data analysis. Let's switch to our dev branch, and create a few dummy files: -~~~bash -$ git checkout dev -$ mkdir results -$ touch a.dat b.dat c.dat results/a.out results/b.out -~~~ +```bash +git checkout dev +mkdir results +touch a.dat b.dat c.dat results/a.out results/b.out +``` and see what Git says: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch dev # Untracked files: # (use "git add <file>..." to include in what will be committed) # -# a.dat -# b.dat -# c.dat -# results/ +# a.dat +# b.dat +# c.dat +# results/ nothing added to commit but untracked files present (use "git add" to track) -~~~ +``` Putting these files under version control would be a **waste of disk space**. What's worse, @@ -50,15 +51,15 @@ so let's tell Git to **ignore** them. We do this by creating a file in the root directory of our project called `.gitignore`. -~~~bash -$ nano .gitignore -$ cat .gitignore -~~~ +```bash +nano .gitignore +cat .gitignore +``` -~~~ +```text *.dat results/ -~~~ +``` These patterns tell Git to **ignore** any file whose name ends in **`.dat`** and everything in the **`results`** directory. @@ -68,18 +69,18 @@ Git would **continue** to track them.) Once we have created this file, the output of `git status` is much cleaner: -~~~bash -$ git status -~~~ +```bash +git status +``` -~~~ +```text # On branch dev # Untracked files: # (use "git add <file>..." 
to include in what will be committed)
#
-# .gitignore
+# .gitignore
nothing added to commit but untracked files present (use "git add" to track)
-~~~
+```
The only thing Git notices now is the newly-created `.gitignore` file.
You might think we wouldn't want to track it,
@@ -87,40 +88,40 @@ but everyone we're **sharing** our repository with will probably **want to ignor
the same** things that we're ignoring. Let's add and commit `.gitignore`:
-~~~bash
-$ git add .gitignore
-$ git commit -m "Add the ignore file"
-$ git status
-~~~
+```bash
+git add .gitignore
+git commit -m "Add the ignore file"
+git status
+```
-~~~
+```text
# On branch dev
nothing to commit, working directory clean
-~~~
+```
As a bonus, using `.gitignore` helps us **avoid accidentally adding files** to the repository that we don't want.
-~~~bash
-$ git add a.dat
-~~~
+```bash
+git add a.dat
+```
-~~~
+```text
The following paths are ignored by one of your .gitignore files:
a.dat
Use -f if you really want to add them.
fatal: no files added
-~~~
+```
If we really want to override our ignore settings, we can use `git add -f` to force Git to add something. We can also always see the status of ignored files if we want:
-~~~bash
-$ git status --ignored
-~~~
+```bash
+git status --ignored
+```
-~~~
+```text
# On branch dev
# Ignored files:
# (use "git add -f <file>..." to include in what will be committed)
@@ -131,7 +132,7 @@ $ git status --ignored
# results/
nothing to commit, working directory clean
-~~~
+```
Force adding can be useful for adding a `.gitkeep` file. You can't add empty directories to a repository - they have to have some files within them. But if your code expects there to be a `results/` directory to output to, for example, this can be a problem. Users will run your code, and have it error out at a missing directory and have to create it themselves.
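A sketch of that pattern (the name `.gitkeep` is only a convention - git attaches no special meaning to it):

```bash
# If the directory doesn't already exist:
mkdir -p results
touch results/.gitkeep
# -f is needed here because results/ is listed in .gitignore:
git add -f results/.gitkeep
git commit -m "Keep the results directory in the repository"
```

Now the `results/` directory exists for everyone who clones the repository, while the output files generated inside it stay ignored.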
diff --git a/technology_and_tooling/version_control/slides/index.md b/technology_and_tooling/version_control/slides/index.md index 72123298..57391357 100644 --- a/technology_and_tooling/version_control/slides/index.md +++ b/technology_and_tooling/version_control/slides/index.md @@ -8,12 +8,13 @@ title: Introduction to Version Control - A tool that tracks changes to files - Records the changes you made, and the order in which you made them -- Like the history of a page on Wikipedia +- Like the history of a page on Wikipedia - Or - like turning on ‘Track Changes’ in Word, but for code. ![](images/track_changes.svg) :::{.notes} + - Version control is a tool that tracks changes to files. - Version control records the changes you made, and the order in which you made them. - It’s like the history of a page on wikipedia or like turning on “Track Changes” in Word or Google Docs, but for code. @@ -35,6 +36,7 @@ title: Introduction to Version Control ::: :::{.notes} + - You may have experienced the common problem of having multiple nearly-identical versions of the same file with no meaningful explanation of what the differences are, just incremental changes in filename (thesis.doc, thesis_final.doc, thesis_final2.doc…). - If we’re just dealing with text documents, some word processors let us deal with this a little better, like Microsoft Word’s “Track Changes” or Google Docs’ version history. - However, research isn’t just Word docs, it’s code and data and diagrams too, and a single paper or project can involve a whole constellation of files, all of which need backing up! @@ -50,6 +52,7 @@ title: Introduction to Version Control - Easy to share full copy of any version :::{.notes} + - When you use version control, at any point in the future, you can retrieve the correct versions of your documents, scripts or code. So, for example, a year after publication, you can get hold of the precise combination of scripts and data that you used to assemble a paper. 
- Version control makes reproducibility simpler. Without using version control it’s very hard to say that your research is truly reproducible…
:::
@@ -70,6 +73,7 @@ title: Introduction to Version Control
:::
:::{.notes}
+
- As well as maintaining a revision history, VC tools also help multiple authors collaborate on the same file or set of files.
- Professional software developers use VC to work in large teams and to keep track of what they’ve done.
- If you know what changes have been made to each file, you can easily combine multiple people’s changes to a single file.
@@ -85,6 +89,7 @@ title: Introduction to Version Control
![](images/track_changes.svg)
:::{.notes}
+
- Version control systems start by storing the base version of the file that you save and then store just the changes you made at each step on the way.
- You can think of it like storing Lego bricks and the instructions for putting them together - if you start with the first piece, then add each other in turn, you end up with your final document.
:::
@@ -93,7 +98,8 @@ title: Introduction to Version Control
:::{.columns}
::::{.column width="50%"}
-- Changes are separate from the document itself
+
+- Changes are separate from the document itself
- Two users can make independent sets of changes
- Creates two different versions of the document
::::
@@ -104,6 +110,7 @@ title: Introduction to Version Control
:::
:::{.notes}
+
- Once you think of changes as separate from the document itself, you can then think about taking the same document and adding different changes to it, getting different versions of the document.
- For example, two users can make independent sets of changes based on the same document.
:::
@@ -116,12 +123,14 @@ title: Introduction to Version Control
::::
::::{.column width="50%"}
+
- If no conflicts - merge changes
- Two different sets of changes can be merged together onto the same base document
::::
:::
:::{.notes}
+
- If there aren’t conflicts, you can even try to combine two different sets of changes together onto the same base document, a process called merging.
:::
@@ -133,6 +142,7 @@
- Subversion
:::{.notes}
+
- Git is overwhelmingly the most popular version control system in academia, and beyond.
- It’s a distributed version control system, where every developer in a team has their own full copy of a repository, and can synchronise between them.
- It’s partly become such a success thanks to sites like GitHub and GitLab, which make it easy to collaborate on a Git repository, and provide all kinds of extra tools to manage software projects.
@@ -142,6 +152,7 @@
:::{.columns}
::::{.column width="50%"}
+
- Command-line `git`
- Standalone graphical user interface
- GitHub Desktop
@@ -159,6 +170,7 @@
:::
:::{.notes}
+
- We’re going to teach you how to use Git on the command line, as it’s the same on every single platform (Mac, Linux & Windows) - and it’s the only way to use it on high-performance clusters like Iridis.
- However, if you’re not working on a high-performance system you could use a graphical user interface rather than the command line.
- There are many different graphical user interfaces for Git, like GitHub Desktop, GitKraken and Sourcetree.