update BARs and Telemetry #248

Merged 4 commits on Sep 10, 2024
29 changes: 24 additions & 5 deletions src/driver/doc/amdnpu.rst
@@ -76,12 +76,28 @@ instance of ERT. Each user channel is bound to its own dedicated mailbox.
PCIe EP
-------

-NPU is visible to the x86 as a PCIe device with 3 BARS and an MSI-X interrupt
-vector. NPU uses a dedicated high bandwidth SoC level fabric for reading
+NPU is visible to the x86 as a PCIe device with multiple BARs and some MSI-X interrupt
+vectors. NPU uses a dedicated high bandwidth SoC level fabric for reading
writing into host memory. Each instance of ERT gets its own dedicated MSI-X
interrupt. MERT gets a single instance of MSI-X interrupt.

-TODO, briefly describe the BARs
+The number of PCIe BARs varies depending on the specific device.
+Based on their functions, PCIe BARs can generally be categorized into the
+following types.
+
+* PSP BAR: Expose the AMD PSP (Platform Security Processor) function
+* SMU BAR: Expose the AMD SMU (System Management Unit) function
+* SRAM BAR: Expose ring buffers for the mailbox
+* Mailbox BAR: Expose the mailbox control registers (head, tail and ISR registers etc.)
+* Public Register BAR: Expose public registers
+
+On specific devices, the above-mentioned BAR type might be combined into a single physical PCIe BAR.
+Or a module might require two physical PCIe BARs to be fully functional.
+For example,
+
+* On AMD Phoenix device, PSP, SMU, Public Register BARs are on PCIe BAR index 0.
+* On AMD Strix Point device, Mailbox and Public Register BARs are on PCIe BAR index 0.
+  The PSP has some registers in PCIe BAR index 0 (Public Register BAR) and PCIe BAR index 4 (PSP BAR).
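
The logical-to-physical BAR mapping described in the added text can be pictured as a small per-device table. The following is only a sketch under stated assumptions: every identifier (`bar_type`, `bar_layout`, `bar_layouts`, `find_layout`, `bar_index`, `shares_physical_bar`, `BAR_UNSPECIFIED`) is hypothetical and is not taken from the driver source. Only the indices stated in the text are filled in; anything the text does not give is left as `BAR_UNSPECIFIED`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Logical BAR types named in the documentation (names are hypothetical). */
enum bar_type {
    BAR_PSP,
    BAR_SMU,
    BAR_SRAM,
    BAR_MBOX,
    BAR_PUBREG,
    BAR_TYPE_COUNT
};

/* Marker for "the documentation does not state a PCIe BAR index". */
#define BAR_UNSPECIFIED (-1)

struct bar_layout {
    const char *device;
    int pcie_bar[BAR_TYPE_COUNT]; /* logical type -> physical PCIe BAR index */
};

static const struct bar_layout bar_layouts[] = {
    /* Phoenix: PSP, SMU and Public Register BARs share PCIe BAR 0. */
    { "Phoenix", {
        [BAR_PSP] = 0, [BAR_SMU] = 0, [BAR_PUBREG] = 0,
        [BAR_SRAM] = BAR_UNSPECIFIED, [BAR_MBOX] = BAR_UNSPECIFIED } },
    /* Strix Point: Mailbox and Public Register BARs share PCIe BAR 0.
     * The PSP BAR is PCIe BAR 4; some PSP registers also live in BAR 0. */
    { "Strix Point", {
        [BAR_MBOX] = 0, [BAR_PUBREG] = 0, [BAR_PSP] = 4,
        [BAR_SMU] = BAR_UNSPECIFIED, [BAR_SRAM] = BAR_UNSPECIFIED } },
};

/* Look up a layout by device name; returns NULL if the name is unknown. */
static const struct bar_layout *find_layout(const char *device)
{
    size_t n = sizeof(bar_layouts) / sizeof(bar_layouts[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(bar_layouts[i].device, device) == 0)
            return &bar_layouts[i];
    return NULL;
}

/* Physical PCIe BAR index for a logical BAR type, or BAR_UNSPECIFIED. */
static int bar_index(const struct bar_layout *l, enum bar_type t)
{
    assert(l != NULL && (int)t >= 0 && t < BAR_TYPE_COUNT);
    return l->pcie_bar[t];
}

/* True when two logical BAR types are known to sit on the same PCIe BAR. */
static int shares_physical_bar(const struct bar_layout *l,
                               enum bar_type a, enum bar_type b)
{
    int ia = bar_index(l, a);
    int ib = bar_index(l, b);

    return ia != BAR_UNSPECIFIED && ia == ib;
}
```

With this table, `shares_physical_bar(find_layout("Phoenix"), BAR_SMU, BAR_PUBREG)` is true, while `bar_index(find_layout("Strix Point"), BAR_PSP)` yields 4. A single index per type cannot express the Strix Point case where PSP registers span two physical BARs; a real layout description would need a list of ranges per module.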

Process Isolation Hardware
--------------------------
@@ -244,8 +260,11 @@ driver then decodes the error by reading the contents of the buffer pointer.
Telemetry
=========

-MERT can report various kinds of telemetry information like
-TODO, list the key ones
+MERT can report various kinds of telemetry information like the following:
+* L1 interrupt counter
+* DMA counter
+* Deep Sleep counter
+* etc.


References