Increase core0 stack to 8K and paint/measure stack usage. #67

Merged: 2 commits into develop from stack-paint, Jul 6, 2023

thejpster
Member

I observed in probe-run output that we were using all 4096 bytes of our allocated stack region. Adding the painting and measurement code shows that:

  • Core 0 uses 4372 bytes peak
  • Core 1 uses 312 bytes peak

So I have increased Core 0's stack to 8 KiB by taking the top 4 KiB of the striped region, moving all the BIOS global variables down, and reducing the TPA (Transient Program Area) by 4 KiB.

Technically we weren't damaging any globals: the overflow was 276 bytes (4372 - 4096), and the 24 KiB block allocated for the globals had about 900 bytes unused. But we were asking for trouble.

This also led me to find a bug in probe-run: it doesn't understand that a stack can span multiple contiguous memory regions. I raised that as knurling-rs/probe-run#415.
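For readers unfamiliar with the technique, here is a minimal host-side sketch of stack painting and measurement. It is not the code from this PR: the paint value and the function names are hypothetical, and the stack is modelled as a plain slice where index 0 is the lowest address. On real hardware the painting would need to happen early in start-up, below the current stack pointer.

```rust
/// Pattern written over the stack before use (hypothetical choice of value).
const PAINT_WORD: u32 = 0xCCCC_CCCC;

/// Fill every word of the stack region with the paint pattern.
fn paint_stack(stack: &mut [u32]) {
    for word in stack.iter_mut() {
        *word = PAINT_WORD;
    }
}

/// Peak stack usage in bytes.
///
/// The stack grows downwards, so the words that were never touched form a
/// run of paint starting at the lowest address (index 0).
fn peak_usage_bytes(stack: &[u32]) -> usize {
    let untouched = stack
        .iter()
        .take_while(|&&word| word == PAINT_WORD)
        .count();
    (stack.len() - untouched) * core::mem::size_of::<u32>()
}
```

Note that a measurement like this can never report more than the size of the painted region, so a reading equal to the full region size (such as the 4096 bytes seen above) suggests an overflow rather than an exact fit.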

@thejpster
Member Author

I could also reduce Core 1's stack to, say, 2048 bytes (or 1024 bytes, or even 512 bytes), but that would make Core 0 and Core 1 contend for access to SRAM_REGION_4. There are registers we can read to check how many cycles were stalled waiting for contention, but I figured it was easier for now to just steal some of the striped RAM.

@thejpster
Member Author

Idea. Move the Core 1 stuff like the text buffer into SRAM4 and leave the 256K for Core 0.

Now there's 32K of RAM at the top of the SRAM region (24K from the striped block, plus SRAM_BLOCK4 and SRAM_BLOCK5). It holds the Core 0 stack, the Core 1 stack (a static mut array) and all the other global variables.

Tested OK.
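
The 32K figure can be checked with a little address arithmetic. This is only a sketch: the constant names are made up for illustration, and the base address and bank sizes (256 KiB striped, then two 4 KiB banks) are taken from the RP2040 address map rather than from this PR's linker script.

```rust
// RP2040 SRAM layout: 256 KiB of striped SRAM followed by two 4 KiB banks.
const SRAM_BASE: u32 = 0x2000_0000;
const STRIPED_LEN: u32 = 256 * 1024;
const SRAM4_LEN: u32 = 4 * 1024;
const SRAM5_LEN: u32 = 4 * 1024;

/// One past the last valid SRAM address.
const SRAM_END: u32 = SRAM_BASE + STRIPED_LEN + SRAM4_LEN + SRAM5_LEN;

// From the comment above: 24K of striped memory plus both small banks.
const RAM_LEN: u32 = 24 * 1024 + SRAM4_LEN + SRAM5_LEN;

/// Start of the region, assuming it ends at the top of SRAM.
const RAM_ORIGIN: u32 = SRAM_END - RAM_LEN;
```

Under those assumptions the region would run from 0x2003A000 up to 0x20042000.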
@thejpster
Member Author

Re-arranged things somewhat.

Now there is just a RAM memory region (plus the RAM_OS region which is memory for the OS).

The RAM memory region sits at the top of the SRAM address space, using some of the striped memory, plus all of SRAM_BLOCK4 and SRAM_BLOCK5. Within this region, the .data and .bss sit at the bottom, and the Core 0 stack sits at the top. The Core 1 stack is now just a static mut array of 1024 bytes, located within the .bss section.
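
As an illustration of the last point, a Core 1 stack held in a zero-initialised static might look roughly like the sketch below. The type and function names are hypothetical and this is not the code merged here; the 8-byte alignment is an assumption based on the usual Arm stack alignment requirement.

```rust
/// 1024 bytes, expressed as 32-bit words.
const CORE1_STACK_WORDS: usize = 1024 / 4;

/// Hypothetical wrapper so the array can carry an alignment attribute.
#[repr(C, align(8))]
struct Core1Stack([u32; CORE1_STACK_WORDS]);

/// Zero-initialised, so a typical linker script places it in .bss.
static mut CORE1_STACK: Core1Stack = Core1Stack([0; CORE1_STACK_WORDS]);

/// Returns (bottom, top) addresses of the Core 1 stack.
/// The initial stack pointer for Core 1 would be `top`, since the stack grows down.
#[allow(unused_unsafe)]
fn core1_stack_bounds() -> (usize, usize) {
    // Only the address is taken; the static is never read or written here.
    let bottom = unsafe { core::ptr::addr_of_mut!(CORE1_STACK) } as usize;
    (bottom, bottom + core::mem::size_of::<Core1Stack>())
}
```

Keeping the stack in a static means its size is fixed at compile time and it shares the region with the other globals, instead of needing a dedicated memory region in the linker script.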

I benchmarked the performance and it went from 497,040 chars/sec to 497,205 chars/sec. That is a change of about 0.03%, so basically no difference, and certainly not worse than it was.

@thejpster thejpster merged commit 537f844 into develop Jul 6, 2023
@thejpster thejpster deleted the stack-paint branch July 6, 2023 20:54