Commit

Merge pull request #761 from mkstoyanov/doc_updates
* doc updates
mkstoyanov authored Mar 21, 2024
2 parents 1de7210 + 6207002 commit 957e9cb
Showing 2 changed files with 11 additions and 2 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,10 @@
Changelog for version 8.1
--------------

+* added more multicore cpu support
+* parallelized setting surplus refinement
+* compatibility with gcc parallel STL algorithms
+
* implemented a new algorithm for global sparse Kronecker
* significant speedup when loading needed values

9 changes: 7 additions & 2 deletions Doxygen/Installation.md
@@ -25,9 +25,10 @@ Recommended additional features:
* [OpenMP](https://en.wikipedia.org/wiki/OpenMP) implementation (usually included with the compiler)

Optional features:
+* Acceleration using [OpenMP](https://www.openmp.org/) multicore algorithms (CPU only), the OpenMP standard is supported on most major compilers.
* Acceleration using Nvidia [linear algebra libraries](https://developer.nvidia.com/cublas) and custom [CUDA kernels](https://developer.nvidia.com/cuda-zone)
* Acceleration using AMD ROCm [linear algebra libraries](https://rocsparse.readthedocs.io/en/master/) and custom [HIP kernels](https://rocmdocs.amd.com/en/latest/ROCm_API_References/HIP-API.html)
-* Acceleration using Intel OneAPI [oneMKL](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html) and custom [DPC++ kernels](https://software.intel.com/content/www/us/en/develop/tools/oneapi.html)
+* Acceleration using Intel OneAPI [oneMKL](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html) and custom [SYCL kernels](https://software.intel.com/content/www/us/en/develop/tools/oneapi.html)
* GPU out-of-core algorithms using the [UTK MAGMA library](http://icl.cs.utk.edu/magma/)
* Basic [Python matplotlib](https://matplotlib.org/) support
* Fully featured [MATLAB/Octave](https://www.gnu.org/software/octave/) interface via wrappers around the command-line tool
@@ -100,7 +101,11 @@ ROCm capabilities require CMake 3.21.
```

* Acceleration options:
-* OpenMP allows Tasmanian to use more than one CPU core, which greatly increases the performance
+* OpenMP allows Tasmanian to use more than one CPU core, which greatly increases the performance.
+While many of the Tasmanian algorithms have been parallelized, the built-in C++ algorithms are usually sequential.
+This is most notable in the case of `std::sort` but affects others as well.
+Some compilers support parallel standard algorithms, but those in turn require additional compiler flags.
+* for GCC add `-D_GLIBCXX_PARALLEL` to the `CMAKE_CXX_FLAGS`
* Basic Linear Algebra Subroutines (BLAS) is a standard with many implementations,
e.g., [https://www.openblas.net/](https://www.openblas.net/); optimized BLAS improves the
performance when using evaluate commands on grids with many points or working with models with many outputs
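
To illustrate what the `-D_GLIBCXX_PARALLEL` note refers to, here is a minimal sketch that is not taken from the Tasmanian sources. It assumes GCC with libstdc++, where defining `_GLIBCXX_PARALLEL` (together with OpenMP support, e.g. `-fopenmp`) switches standard algorithms such as `std::sort` to the libstdc++ parallel mode without any change to the calling code. The helper names `sorted_copy`, `parallel_mode_enabled`, and `make_random_values` are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: sorts a copy of the input with plain std::sort.
// Assumption: when this file is compiled by GCC with
// -D_GLIBCXX_PARALLEL -fopenmp, libstdc++ parallel mode may run this
// same call on multiple threads for large inputs; otherwise it is sequential.
std::vector<double> sorted_copy(std::vector<double> values) {
    std::sort(values.begin(), values.end());
    assert(std::is_sorted(values.begin(), values.end()));
    return values;
}

// Hypothetical helper: reports whether the parallel-mode macro was
// defined when this translation unit was compiled.
constexpr bool parallel_mode_enabled() {
#ifdef _GLIBCXX_PARALLEL
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: generates reproducible pseudo-random test data.
std::vector<double> make_random_values(std::size_t count, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> values(count);
    for (auto &v : values) v = dist(gen);
    return values;
}
```

A file containing these functions could be built standalone with something like `g++ -O2 -fopenmp -D_GLIBCXX_PARALLEL example.cpp`; the sorted result is the same with or without the flag, only the execution strategy may differ.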