Add in-place transform support for inv() #736

Merged
1 commit merged into main from add-in-place-inv-transform-support
Aug 26, 2024

Conversation

tbensonatl
Collaborator

Change the behavior of the inv() transform as follows:

  • No longer unconditionally overwrite the input with factorized data. Previously, (Ainv = inv(A)).run() wrote the inverse to Ainv and left the LU factorization in A.
  • Support in-place transforms such as (A = inv(A)).run(). Previously, this would run, but the results were incorrect because the underlying cuBLAS calls only support out-of-place solves.
  • Both changes are achieved by always creating a temporary work buffer and copying the input into it.
  • Add support for input operators (i.e., not just tensors). The operator is evaluated when populating the temporary input work buffer.
  • Use host-pinned memory and asynchronous memcpys to check whether the factorization/inversion succeeded. This still synchronizes the provided stream, but no longer synchronizes the default stream.
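
The work-buffer idea in the list above can be illustrated with a small host-only C++ sketch. This is not the MatX implementation and does not use cuBLAS: the function name invert_via_work_buffer, the callable standing in for an input operator, and the use of Gauss-Jordan elimination in place of an LU-based solve are all assumptions made for illustration. It only shows why materializing the input into a temporary buffer leaves the input untouched in the out-of-place case and makes the in-place case safe.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MatX): inverts an n x n row-major matrix.
// `in(i, j)` is any callable returning element (i, j); it stands in for a
// tensor or an input operator. `out` may be the same storage that `in` reads,
// because the input is fully copied into a private work buffer first.
template <typename InputOp>
bool invert_via_work_buffer(InputOp in, std::size_t n, std::vector<double>& out) {
  // Populate the temporary work buffer; an input operator would run here.
  std::vector<double> work(n * n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      work[i * n + j] = in(i, j);

  // The result starts as the identity and lives in its own buffer.
  std::vector<double> inv(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i) inv[i * n + i] = 1.0;

  // Gauss-Jordan elimination with partial pivoting (assumed stand-in for
  // the LU-based factorization and solve).
  for (std::size_t col = 0; col < n; ++col) {
    std::size_t pivot = col;
    for (std::size_t r = col + 1; r < n; ++r)
      if (std::fabs(work[r * n + col]) > std::fabs(work[pivot * n + col])) pivot = r;
    if (std::fabs(work[pivot * n + col]) < 1e-12) return false;  // treated as singular
    if (pivot != col) {
      for (std::size_t j = 0; j < n; ++j) {
        std::swap(work[col * n + j], work[pivot * n + j]);
        std::swap(inv[col * n + j], inv[pivot * n + j]);
      }
    }
    const double d = work[col * n + col];
    for (std::size_t j = 0; j < n; ++j) {
      work[col * n + j] /= d;
      inv[col * n + j] /= d;
    }
    for (std::size_t r = 0; r < n; ++r) {
      if (r == col) continue;
      const double f = work[r * n + col];
      for (std::size_t j = 0; j < n; ++j) {
        work[r * n + j] -= f * work[col * n + j];
        inv[r * n + j] -= f * inv[col * n + j];
      }
    }
  }

  // Only now is the destination written, so aliasing with the input is safe.
  out = std::move(inv);
  return true;
}

// Usage: invert A in place. The lambda reads from A and the result is
// written back to A.
inline std::vector<double> demo_in_place() {
  std::vector<double> A = {4.0, 7.0, 2.0, 6.0};  // 2 x 2, row-major
  const bool ok = invert_via_work_buffer(
      [&A](std::size_t i, std::size_t j) { return A[i * 2 + j]; }, 2, A);
  assert(ok);
  return A;  // approximately {0.6, -0.7, -0.2, 0.4}
}
```

The cost of this design is one extra n x n allocation and copy per call. In exchange, the caller's input is never clobbered, and the destination is only written after all reads of the input have finished.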

@tbensonatl
Collaborator Author

/build

@cliffburdick
Collaborator

/build

@coveralls

Coverage Status

coverage: 93.39% (+0.07%) from 93.323%
when pulling 3df9230 on add-in-place-inv-transform-support
into 459cffb on main.

@cliffburdick cliffburdick merged commit 4468820 into main Aug 26, 2024
1 check passed
@cliffburdick cliffburdick deleted the add-in-place-inv-transform-support branch August 26, 2024 18:34
@cliffburdick
Collaborator

/build
