scikit-hep / awkward Public


CUDA kernels that are implemented but not optimal #2987

Open

1 of 16 tasks

jpivarski opened this issue Jan 25, 2024 · 0 comments



jpivarski commented Jan 25, 2024 (edited by ManasviGoyal)

This is primarily for record-keeping, so that we don't forget about CUDA kernels that should be revisited someday. To be in this list, a kernel must be implemented correctly (in main or an impending PR), but have some reason to be rewritten. The list is to help us stick to the policy that existence is the first priority and optimization is second, without the temptation to go down a rabbit-hole of optimizing every kernel before moving on to the next one.

awkward_ListArray_min_range is a reduce-to-scalar algorithm, but it's implemented with atomicMin instead of tree-reduction.
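For reference, here is a minimal CPU-side sketch in plain Python of the two reduction styles named above. It assumes the kernel returns the minimum of `stops[i] - starts[i]` over all lists (an assumption, since the signature isn't given here), and the function names are hypothetical; this is a model of the access pattern, not the CUDA implementation. The first function mimics the atomicMin style, where every element updates one shared accumulator. The second mimics tree-reduction, where each round combines pairs and halves the active range, so each round could run as one parallel step.

```python
def min_range_atomic_style(starts, stops):
    """Model of the atomicMin pattern: all elements contend for one accumulator.

    Assumes non-empty inputs of equal length (hypothetical helper).
    """
    result = stops[0] - starts[0]
    for start, stop in zip(starts, stops):
        # On a GPU this line would be an atomicMin on a single memory location,
        # so concurrent threads serialize on it.
        result = min(result, stop - start)
    return result


def min_range_tree(starts, stops):
    """Model of tree-reduction: about log2(n) rounds of pairwise minima.

    Assumes non-empty inputs of equal length (hypothetical helper).
    """
    buf = [stop - start for start, stop in zip(starts, stops)]
    n = len(buf)
    while n > 1:
        half = (n + 1) // 2  # rounds up so an odd leftover element is carried along
        # Every iteration of this inner loop is independent of the others,
        # which is what lets a GPU run one round as a single parallel step.
        for i in range(n // 2):
            buf[i] = min(buf[i], buf[i + half])
        n = half
    return buf[0]
```

Both functions return the same value; the difference is that the tree version has no single location that every element must update.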

Variable-length inner loop:

awkward_IndexedArray_ranges_next_64
awkward_IndexedArray_ranges_carry_next_64
awkward_ListArray_getitem_jagged_numvalid
awkward_ListArray_getitem_next_range_spreadadvanced
awkward_ListArray_broadcast_tooffsets
awkward_ListArray_localindex
awkward_ListOffsetArray_drop_none_indexes
awkward_ListOffsetArray_reduce_local_nextparents_64
awkward_ListArray_rpad_axis1
awkward_ListOffsetArray_rpad_axis1
awkward_ListArray_combinations_length
awkward_NumpyArray_pad_zero_to_length
awkward_NumpyArray_rearrange_shifted
awkward_UnionArray_flatten_combine
awkward_UnionArray_nestedfill_tags_index
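To illustrate what "variable-length inner loop" means for the kernels listed above, here is a hedged sketch in plain Python using a local-index computation as the example. It assumes the input is an `offsets` array that starts at 0 and that the output holds each element's position within its own list; both the semantics and the function names are assumptions for illustration, not the library's code. The nested version does one unit of outer work per list, so the inner loop length varies with list size. The flat version is one possible alternative (not necessarily the rewrite intended here): one unit of work per output element, with a binary search to find the owning list.

```python
from bisect import bisect_right


def localindex_nested(offsets):
    """One outer iteration per list; inner loop length depends on the list."""
    out = [0] * offsets[-1]
    for i in range(len(offsets) - 1):
        # If one GPU thread handled each i, a thread with a long list
        # would keep working while threads with short lists sit idle.
        for j in range(offsets[i], offsets[i + 1]):
            out[j] = j - offsets[i]
    return out


def localindex_flat(offsets):
    """One iteration per output element; every iteration does similar work."""
    total = offsets[-1]
    out = [0] * total
    for k in range(total):
        # Find the list that owns element k: the last i with offsets[i] <= k.
        # Empty lists are skipped automatically because their offsets repeat.
        i = bisect_right(offsets, k) - 1
        out[k] = k - offsets[i]
    return out
```

With `offsets = [0, 3, 3, 5]` (lists of length 3, 0, and 2), both functions fill the same output; the flat form trades the uneven inner loop for a logarithmic search per element.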


jpivarski added the "performance" label (Works, but not fast enough or uses too much memory)

jpivarski assigned ManasviGoyal

jpivarski mentioned this issue

feat: turn on CUDA unit tests for working kernels and add some CUDA kernels #2930

Merged

ManasviGoyal mentioned this issue

feat: add CUDA kernels that calculate length/sum #2992

Merged

ManasviGoyal mentioned this issue

feat: add variable length loop kernels #3003

Merged
