rocm: fix bug in intercept mode path #106

Merged

Conversation

gcongiu
Contributor

@gcongiu gcongiu commented Oct 27, 2023

Pull Request Description

The intercept mode path keeps track of the events of intercepted kernels using the same hash table that maps event names to entries in the native event table (the hash table normally used for ntv_name_to_code conversions). The event names in the hash table don't collide because intercept mode tracks only the base name of the event (discarding device and instance number), while native event table entries are referenced through name:device=N:instance=M keys.

The reason for storing only the name of the event in intercept mode is that events are set up on all devices' dispatch queues, regardless of the device specified by the user (this approach follows the rocprof strategy). However, using only the event name without the instance number causes problems: instances represent separate events and should not be treated as a single event (XXXX:instance=2 != XXXX:instance=3).
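The collision described above can be illustrated with a minimal sketch. The helper `event_base_name` is hypothetical and is not taken from the PAPI sources; it only assumes that the base name is whatever precedes the first `:` qualifier in a `name:device=N:instance=M` string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PAPI): copy the portion of a native
 * event name that precedes the first ':' qualifier into `out`,
 * truncating if the buffer is too small. */
static void event_base_name(const char *event, char *out, size_t out_len)
{
    assert(event != NULL && out != NULL && out_len > 0);

    size_t n = strcspn(event, ":");   /* length up to the first ':' or end */
    if (n >= out_len) {
        n = out_len - 1;
    }
    memcpy(out, event, n);
    out[n] = '\0';
}
```

With this helper, `XXXX:device=0:instance=2` and `XXXX:device=0:instance=3` both reduce to `XXXX`, so a table keyed on the base name alone cannot tell the two instances apart.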

The proposed patch uses a separate hash table for intercept mode and inserts the feature name rather than the event base name. This means that events with more than one instance will have a hash table key of the form name[M], where M represents the instance number. If the event only has one instance the key will be name.
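As a rough sketch of the key scheme described above, the function below builds `name[M]` for multi-instance events and plain `name` otherwise. The name `intercept_key` and its parameters are assumptions made for illustration; the actual patch may derive the feature name differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of PAPI): build the intercept mode hash
 * table key for an event. Events with more than one instance get
 * "name[M]"; single-instance events get "name".
 * Returns 0 on success, -1 if the key does not fit in `out`. */
static int intercept_key(const char *name, unsigned instance,
                         unsigned instance_count,
                         char *out, size_t out_len)
{
    assert(name != NULL && out != NULL && out_len > 0);

    int n;
    if (instance_count > 1) {
        n = snprintf(out, out_len, "%s[%u]", name, instance);
    } else {
        n = snprintf(out, out_len, "%s", name);
    }
    return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
}
```

Under these assumptions, instances 2 and 3 of a multi-instance event `XXXX` yield the distinct keys `XXXX[2]` and `XXXX[3]`, while a single-instance event keeps the key `XXXX`.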

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@gcongiu gcongiu force-pushed the 2023.10.27_fix-intercept-mode-bug branch from b3af898 to 1c4cc1c Compare October 27, 2023 11:02
@gcongiu gcongiu marked this pull request as ready for review November 7, 2023 10:52
@gcongiu gcongiu added this to the PAPI 7.1.0 release milestone Nov 7, 2023
@gcongiu gcongiu requested a review from dbarry9 November 8, 2023 17:54
@gcongiu gcongiu force-pushed the 2023.10.27_fix-intercept-mode-bug branch from 1c4cc1c to c1c8b7f Compare November 9, 2023 10:53
@gcongiu gcongiu force-pushed the 2023.10.27_fix-intercept-mode-bug branch from c1c8b7f to b1912af Compare November 9, 2023 16:57
The intercept mode path keeps track of intercepted events using the same
hash table used to map event names to entries in the native event table.
The event names don't collide because intercept mode keeps track of the
base name of the event (discarding device id and instance number), while
native event table entries are referenced as "name:device=N:instance=M".
The reason is that events are intercepted on all devices' dispatch
queues regardless of the device id specified by the user (this approach
follows the rocprof strategy). However, using only the event name
without the instance number will cause problems. Instances represent
separate events and should not be treated as a single event.

The proposed patch uses a separate hash table for intercept mode and
inserts the feature name rather than the event base name. This means
that events with more than one instance will have a hash table key of
the form "name[M]", where M represents the instance. If the event only
has one instance the key will be "name".
@gcongiu gcongiu force-pushed the 2023.10.27_fix-intercept-mode-bug branch from b1912af to 01eae41 Compare November 13, 2023 11:04
@gcongiu gcongiu merged commit 0a565ec into icl-utk-edu:master Nov 13, 2023