Use CUDA Rosenbrock parameters #659

Merged
sjsprecious merged 2 commits into main from use_cuda_rosenbrock_parameters on Sep 16, 2024

Conversation

sjsprecious
Collaborator

Fixes #657.

@sjsprecious sjsprecious self-assigned this Sep 15, 2024
@sjsprecious sjsprecious added the bug Something isn't working label Sep 15, 2024
@sjsprecious sjsprecious added this to the CUDA Rosenbrock Solver milestone Sep 15, 2024
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.69%. Comparing base (6c8875d) to head (4ba249e).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #659   +/-   ##
=======================================
  Coverage   92.69%   92.69%           
=======================================
  Files          53       53           
  Lines        3585     3585           
=======================================
  Hits         3323     3323           
  Misses        262      262           


@sjsprecious sjsprecious merged commit 84b4b40 into main Sep 16, 2024
29 checks passed
@sjsprecious sjsprecious deleted the use_cuda_rosenbrock_parameters branch September 16, 2024 14:49
K20shores added a commit that referenced this pull request Sep 26, 2024
* Add CUDA Rosenbrock tests (#579)

* add sync functions to state variable
add cuda rosenbrock tests

* fix all the compilation errors
analytical tests do not work for CUDA rosenbrock

* fix call to the base class function;
bug fix for CuLudecompose and add
singularity check

* fix the compilation error for CUDA decomposition class

* remove unnecessary calls to the base class functions

* fix all the compilation errors

* add crtp to allow calls to function from either base or derived class

* fix more compilation errors about abstract rosenbrock solver
now the cuda test passes for Troe case

* add lambda functions as arguments for CPU/JIT/CUDA tests

* initialize Yerror on the GPU every time and pass all the analytical
tests

* turn off the cuda memory check for the integration tests

* revert back to the original process class

* clean up unused header

* update JIT test interface

* extend state class to cudastate class

* remove unnecessary cuda device sync

* add cuda state class and address compilation errors

* fix broken CI tests

* more bug fix for CI tests

* fix the compiler warning for cuda code

* more fix for broken CI tests

* resolve the cuda compiler warnings

* address Matt's PR comments

---------

Co-authored-by: Jian Sun <sunjian@ucar.edu>

* Auto-format code changes (#586)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Use Fill to reset the L and U matrices in Rosenbrock solve (#588)

use Fill in Rosenbrock solve

* In-place linear solve (#585)

* removing condensing x and b in nonvectorizable matrix code for linear solve

* adding alias back

* adding back comment

* spacing

* adding back comment

* moving comment

* vectorize version no longer segfaults but something is wrong

* vectorized passes

* removing b from jit linear solver

* removing b from cuda linear solver

* using function pointer alias

* adding a comment

* fix conflict resolve typo

---------

Co-authored-by: Jian Sun <sjsprecious@gmail.com>
Co-authored-by: Jian Sun <sunjian@ucar.edu>

* Auto-format code changes (#589)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* 498 mimic camchem substep convergence failure integration acceptance (#582)

* trying to continue on with current solution

* mimicking camchem

* testing backward euler against hires, e5

* updating citations

* oregonator is too stiff for backward euler

* addressing PR comments

* collecting solver stats

* Update include/micm/solver/backward_euler.inl

Co-authored-by: Matt Dawson <mattdawson@ucar.edu>

* removing backward euler for oregonator test

* removing cerr in favor of a solver state

---------

Co-authored-by: Matt Dawson <mattdawson@ucar.edu>

* Auto-format code changes (#590)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* 304 reorganize include folder (#591)

* reorganizing files

* correcting cuda imports

* Auto-format code changes (#592)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* 577 test all parameter types of the dense matrix cpu rosenbrock on the analytical policy tests (#593)

Converts HIRES, Oregonator, E5 to chemical equations so that they can be tested on the GPU

All analytical tests are tested with CPU and GPU rosenbrock. Backward euler as well (except oregonator). Renaming to match naming schemes for test files

* Auto-format code changes (#597)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Fix GPU memory leak for the CUDA unit tests (#600)

* fix most GPU memory leak

* allocate a device pointer in the device struct

* remove unused cuda mem copy

* use swap in the move constructor and assignment of CUDA class
initialize the null pointer in the struct definition
pass the cuda memory check for all the unit tests

* remove unnecessary nullptr

* fix the broken CI tests

* more bug fixes

---------

Co-authored-by: Jian Sun <sunjian@ucar.edu>

* Auto-format code changes (#601)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Backward Euler with vectorizable matrix types (#596)

* starting to test all solver parameter types

* saving progress

* saving progress

* testing all stages analytically

* updating all interfaces

* correcting cuda build I hope

* testing jit against hires, e5, oregonator

* adding cuda solver builder test

* removing hires, e5, oregonator from cuda tests; they need their own kernels

* testing e5 from a configuration

* testing e5 jit integration

* testing e5 properly

* removing reset of L and U matrices (#594)

* oregonator from a configuration

* renaming things

* using different tolerances?

* moving state onto and off of host

* saving gpu changes

* updating cuda tests

* adding some better tolerances for cuda tests

* adding different tolerances for e5

* adding citation to e5

* thing

* formed hires equations

* using passing tolerances for cpu tests

* jit tolerances

* backward euler tests

* configuration for hires

* add AddToDiagonal function on sparse matrix

* use ForEach in Backward Euler

* add convergence check function to backward euler

* fix merge problems

* add vector matrix to analytical solver tests

* update JIT analytical tests

* set up general use analytical test function

* add general function for stiff analytical tests

* fix jit analytical tests

* update remaining analytical tests

* address review comments

* update cuda analytical tests

* update tolerances for cuda analytical tests

---------

Co-authored-by: Kyle Shores <kyle.shores44@gmail.com>
Co-authored-by: Kyle Shores <kshores@ucar.edu>

* Auto-format code changes (#605)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* 572 check for singularity when the solver parameters flag is turned on (#603)

Add tests to check for singularity in the U matrix after the LU decomposition. If the check for singularity flag is turned on, decrease the timestep and try again. Fixes a bug where a zero in the bottom right of the U matrix would not have been detected

* Auto-format code changes (#606)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Provide a way to access the processes_ data member (#607)

return the process_ member

Co-authored-by: Jian Sun <sunjian@ucar.edu>

* Auto-format code changes (#608)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* adding headers

* Auto-format code changes (#609)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Add missing CUDA tests and fix broken path (#611)

add missing cuda tests and fix broken path

* throwing error on mismatched size (#610)

* throwing error on mismatched size

* using a copy of the parameters so that a builder can be repeatedly used

* adding const

* correcting number of tolerances for robertson

* Auto-format code changes (#612)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Correct usage of third body species (#614)

using the species map to grab the exact same species for reactants and products

* Auto-format code changes (#615)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* correcting solver builder constructor (#616)

* correcting solver builder constructor

* fix a bug

---------

Co-authored-by: Jiwon Gim <jiwongim@ucar.edu>

* Relax the criteria to pass the GPU test with nvhpc/24.7 on Derecho (#618)

relax the criteria to pass the GPU test with nvhpc/24.7 on Derecho

* Auto-format code changes (#623)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Update fill function for CUDA matrix (#626)

* update the fill function for cuda matrix to avoid data transfer

* fix compilation errors

* add a comment about template function

* update fill function for cuda sparse matrix

* remove gcc11 CI test and add gcc14 CI test

* Auto-format code changes (#627)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Remove data transfer in cuda matrix constructor and template some CUDA functions (#630)

* remove data transfer in the cuda dense matrix constructor

* template many cuda functions for cuda dense and sparse matrix

* Auto-format code changes (#633)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Remove redundant variable and optimize the copy assignment for the CUDA matrix (#636)

* test to remove forcing variables

* fix broken unit tests

* fix the bug of calculating forcing term when substepping happens

* update the copy assignment operator for CUDA matrix

* fix the broken unit tests again

* Remove local copy of state in solver functions (#639)

remove local copy of state in solver functions

* Auto-format code changes (#640)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Add CUDA stream for asynchronous kernel launch (#641)

* add the functions to create & get cuda stream

* simplify the CUDA dense matrix destructor

* add cuda stream to cuda matrix functions

* add cuda stream to process_set.cu

* add cuda stream to CudaLuDecomposition

* add cuda stream to CudaLinearSolver

* set cuda stream in the cublas handle
add cuda stream to rosenbrock.cu

* switch to singleton class for cuda stream manager

* update the method to get the cuda stream

* revise the Gtest main function to cleanup the CUDA resources explicitly

* fix broken cuda analytical test

* fix GPU memory leak in the unit test

* clean up unused files

* fix Kyle's review comment

* make cudamemset asynchronous

* Auto-format code changes (#645)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Remove the local copy of Jacobian matrix when doing LU decomposition (#646)

remove the local copy of Jacobian matrix in the LinFactor function

* Auto-format code changes (#647)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Add const to solver functions (#642)

add const to solver functions

* Replace json to yaml 619 (#649)

* replace

* json to yaml

* yaml to JSON

* test

* added .string to yaml file

* added string to loadFile

* changes based on the PR. modified the code to use YAML file

* Auto-format code changes (#650)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Add const qualifiers (#651)

add const qualifiers

* Move Yerror construction outside of the inner solve loop for rosenbrock (#652)

* added error outside of the loop

* moved the code to all the way to outer while loop

* Auto-format code changes (#653)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Move temporary variables to the State class (#655)

* add temporary variables in the solver class

* declare temporary variable in the State class; initialize temporary variable in the solver

* fix broken units test build

* rename base class for temporary variables

* make destructor of base class virtual so that the GPU memory is freed correctly

* remove unnecessary data member from the solver class

* add the copy assignment and constructor for the state class

* add JIT rosenbrock parameter type

* maybe this fixes the broken JIT tests

* try is_convertible instead

* Auto-format code changes (#656)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Use CUDA Rosenbrock parameters (#659)

* use cuda rosenbrock parameters instead

* use 0 for fill function

* Added license and copyright (#661)

added copyright

* Auto-format code changes (#660)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* Misc updates (#665)

* add back the getnumberofreactions function

* update cuda thread count to 512

* Set LU matrices to zero when jacobian is a zero element (#666)

* pushing

* pushing fix

* removing unnecessary logic check

* adding cuda stuff

* lowering tolerance

* lowering tolerance

* modified jit ludecomp

* raising tolerance

* testing jit and cuda properly

* raising tolerance

* raising again

* again

* raising again

* lowering tolerance

* adding prints to matrices

* copy LU to host

* printing A

* sparsity

* bernoulli again

* manual engine

* double

* thing

* printing values

* larger matrix

* 2 cells

* now

* dense

* 20

* 4000

* things

* uncomment

* uncomment

* print

* 9

* 8

* 6

* 5

* 2e-6

* lu decomp

* 10

* 8

* 0

* comment

* checking

* uncomment

* 7

* 1

* 10

* print

* print

* again

* 100

* data check

* remove check results

* 13

* 16

* eq

* equal

* uncomment

* 1 block

* 5

* print

* 1

* testing LU decomp specifically

* trying to correct cuda test

* lowering

* lowering tolerance

* lowering again

* thing

* variable

* all tests pass on derecho

* setting values to zero for lu decomp

* defaulting LU to 0 instead of 1e-30

* copying block values to other blocks

* removing small value initialization

* correcting version copyright

* using absolute error

* making index once

* camel case

* Auto-format code changes (#667)

Auto-format code using Clang-Format

Co-authored-by: GitHub Actions <actions@github.com>

* bumping version

---------

Co-authored-by: Jian Sun <sjsprecious@gmail.com>
Co-authored-by: Jian Sun <sunjian@ucar.edu>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Actions <actions@github.com>
Co-authored-by: Matt Dawson <mattdawson@ucar.edu>
Co-authored-by: Jiwon Gim <jiwongim@ucar.edu>
Co-authored-by: Montek Thind <mthind@ucar.edu>
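
The singularity check described under #603 in the commit log above (inspect U after the LU decomposition, and if the flag is on, decrease the timestep and try again) could be sketched roughly as follows. This is a minimal illustration, not micm's code: `DenseMatrix`, `IsSingular`, `FactorWithRetry`, and the halving factor of 0.5 are all hypothetical names and choices made for this example. The point it shows is that the diagonal scan has to cover every index, including the bottom-right entry of U, which is the case the commit message says was previously missed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense, row-major matrix used only for this sketch;
// it is not the matrix type used in micm.
struct DenseMatrix
{
  std::size_t n;
  std::vector<double> data;
  double operator()(std::size_t i, std::size_t j) const
  {
    return data[i * n + j];
  }
};

// Returns true if any diagonal entry of U is exactly zero.
// The loop runs over all i in [0, n), so U(n-1, n-1) is checked too.
bool IsSingular(const DenseMatrix& U)
{
  for (std::size_t i = 0; i < U.n; ++i)
  {
    if (U(i, i) == 0.0)
      return true;
  }
  return false;
}

// Calls factor(h) to obtain U for step size h. While U is singular,
// shrinks h (assumed factor: 0.5) and retries, up to max_attempts times.
// Returns true once a non-singular U is produced; h holds the accepted step.
template<typename Factor>
bool FactorWithRetry(Factor factor, double& h, int max_attempts)
{
  for (int attempt = 0; attempt < max_attempts; ++attempt)
  {
    if (!IsSingular(factor(h)))
      return true;
    h *= 0.5;
  }
  return false;
}
```

Here `factor` stands in for whatever routine forms and decomposes the linear system for a given step size; a caller would pass a lambda that returns the resulting U and read the accepted step back from `h`.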
Labels
bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Change the parameter type for the Cuda Rosenbrock solver
3 participants