Enable D-Cache for Cortex-M7 #485

Closed
salkinium opened this issue Sep 29, 2020 · 17 comments · Fixed by #1222

Comments

@salkinium
Member

The Data Cache for Cortex-M7 devices is currently disabled, due to a lack of in-depth understanding of cache policies during porting. The Instruction Cache, however, is enabled, since it only gets read.

See: https://github.com/modm-io/modm/blob/develop/src/modm/platform/core/cortex/startup.c.in#L95-L98

In addition to enabling the cache, the caches must be invalidated manually on certain operations (writing Flash for example).
This is however not just an issue on Cortex-M7, since most Cortex-M devices have some sort of vendor-specific cache implementation for their Flash reads, which must also be manually invalidated.

cc @mikewolfram

@mikewolfram
Contributor

I had this already on my list since I was wondering why the I-Cache is enabled, but not the D-Cache.

I first saw this when porting the FreeRTOS HAL-based network interface, but at least for the M7 the macro does nothing to invalidate the cache.

@salkinium
Member Author

Yes, I just remembered this due to getting confused about DTCM with D-Cache while fixing the DTCM size bug. I wasn't sure during porting what the invalidation required from an modm API perspective, so I just disabled it. Not great, not terrible.

@salkinium
Member Author

Also good: https://alexkalmuk.medium.com/cpu-caches-with-examples-for-arm-cortex-m-2c05a339246e

We should enable the D-Cache at least with write-through. Write-back can then be enabled by the user if required.

@salkinium
Member Author

Disadvantages of the ‘write-through’ mode are the following:

  • Sequential and frequent access to the same memory address can degrade performance.
  • You still need to do a cache invalidate after the end of DMA operations.
  • There is the “Data corruption in a sequence of Write-Through stores and loads” bug in some versions of Cortex-M7

Well… maybe not. Sounds like we would need support from the DMA API.
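
One reason the DMA API would have to be involved: cache maintenance on the Cortex-M7 works on whole 32-byte cache lines, so a DMA buffer that shares a line with unrelated data cannot be invalidated safely. The following is only a sketch of the address arithmetic such an API might need. The names `CacheRange`, `cache_line_range` and `is_cache_line_exclusive` are hypothetical and not part of modm. On a target, the resulting range would presumably be handed to the CMSIS functions `SCB_CleanDCache_by_Addr()` (before a DMA read from memory) and `SCB_InvalidateDCache_by_Addr()` (after a DMA write to memory).

```cpp
#include <cstddef>
#include <cstdint>

// Cortex-M7 data cache line size in bytes.
constexpr std::uintptr_t kCacheLine = 32;

// Hypothetical helper type: an address range widened to whole cache lines.
struct CacheRange
{
    std::uintptr_t start;  // rounded down to a cache line boundary
    std::size_t length;    // multiple of kCacheLine that covers the whole buffer
};

// Widen [address, address + size) so that it starts and ends on a line boundary.
constexpr CacheRange
cache_line_range(std::uintptr_t address, std::size_t size)
{
    const std::uintptr_t start = address & ~(kCacheLine - 1);
    const std::uintptr_t end = (address + size + kCacheLine - 1) & ~(kCacheLine - 1);
    return {start, static_cast<std::size_t>(end - start)};
}

// True if the buffer owns all of its cache lines, i.e. invalidating it
// cannot discard dirty data belonging to a neighbouring variable.
constexpr bool
is_cache_line_exclusive(std::uintptr_t address, std::size_t size)
{
    return (address % kCacheLine == 0) && (size % kCacheLine == 0);
}

// A 40-byte buffer starting 16 bytes into a line touches two full lines.
static_assert(cache_line_range(0x38000010, 40).start == 0x38000000, "start is rounded down");
static_assert(cache_line_range(0x38000010, 40).length == 64, "length covers two lines");
static_assert(!is_cache_line_exclusive(0x38000010, 40), "buffer shares its lines");
```

A DMA driver could refuse (or assert on) buffers for which `is_cache_line_exclusive` is false, which is one way to avoid the silent corruption described in the list above.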

@mikewolfram
Contributor

I recently had an issue on an F765, where I enabled the D-Cache and Ethernet stopped working. Probably related to the DMA.

@salkinium
Member Author

Yeah, it seems to require application support to manually invalidate the cache when required (ie. after (during?) DMA transfers), and tbh the user can still enable the DCache in main() so there's no need for modm to enable it (wrongly).

I'm instead going to add some docs on this fact to the modm:platform:cortex-m module and just delegate this to the future.

@ghost
ghost commented Feb 3, 2022

> Yeah, it seems to require application support to manually invalidate the cache when required (ie. after (during?) DMA transfers), and tbh the user can still enable the DCache in main() so there's no need for modm to enable it (wrongly).

There is a better solution: the MPU is really good for defining cache policies. It gives better performance since you don't have to invalidate the cache; it only costs a new section in the linker script. You can see an example below using a helper class I have made to build MPU configurations. It's written in Boost-style syntax but is pretty easy to port. If you wish, I can send it to you.

void mpu_setup()
{
    using namespace msl;

    auto dma_buffers = [] {
        mpu::region_builder<1, 0x38000000, 2048, mpu::region_type::normal> reg;
        reg.set_cache_policy(mpu::cache_policy::non_cacheable);
        reg.set_access_policy(mpu::access_policy::privileged_only);
        reg.update_rnr();
        reg.enable();
        return reg.build();
    };

    constexpr auto rdb = dma_buffers();
    mpu::update(rdb);

    mpu::enable(mpu::mode::default_memory_map);
}
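
For readers without the helper class at hand, here is a rough sketch of what such a region description boils down to on an ARMv7-M MPU: one value for `MPU_RBAR` and one for `MPU_RASR`. The enum and function names (`CachePolicy`, `Access`, `make_rbar`, `make_rasr`) are hypothetical and are not the API of the `msl` helper above. Only the bit positions are taken from the ARMv7-M register layout, and the shareable bit and subregion disable bits are left at zero for brevity.

```cpp
#include <cstdint>

// Memory attributes as TEX (bits 21:19), C (bit 17) and B (bit 16) of MPU_RASR.
enum class CachePolicy : std::uint32_t
{
    NonCacheable = (0b001u << 19),                        // TEX=001, C=0, B=0
    WriteThrough = (0b000u << 19) | (1u << 17),           // TEX=000, C=1, B=0
    WriteBack = (0b000u << 19) | (1u << 17) | (1u << 16)  // TEX=000, C=1, B=1
};

// Access permissions as the AP field (bits 26:24) of MPU_RASR.
enum class Access : std::uint32_t
{
    NoAccess = 0b000u << 24,
    PrivilegedOnly = 0b001u << 24,
    Full = 0b011u << 24
};

constexpr std::uint32_t
log2_floor(std::uint32_t value)
{
    return value <= 1 ? 0 : 1 + log2_floor(value >> 1);
}

// MPU_RBAR: base address in bits 31:5, VALID (bit 4) selects the region in bits 3:0.
constexpr std::uint32_t
make_rbar(std::uint32_t base, std::uint32_t region)
{
    return (base & 0xFFFFFFE0u) | (1u << 4) | (region & 0xFu);
}

// MPU_RASR: the SIZE field (bits 5:1) encodes a region of 2^(SIZE+1) bytes, bit 0 enables it.
// size_bytes is assumed to be a power of two and at least 32.
constexpr std::uint32_t
make_rasr(std::uint32_t size_bytes, CachePolicy policy, Access access)
{
    return static_cast<std::uint32_t>(access) | static_cast<std::uint32_t>(policy) |
           ((log2_floor(size_bytes) - 1u) << 1) | 1u;
}

// Region 1: 2048 bytes at 0x38000000, non-cacheable, privileged access only.
static_assert(make_rbar(0x38000000u, 1) == 0x38000011u, "RBAR for the DMA buffer region");
static_assert(make_rasr(2048, CachePolicy::NonCacheable, Access::PrivilegedOnly) == 0x01080015u,
              "RASR for the DMA buffer region");
```

On a target, the two values would be written to `MPU->RBAR` and `MPU->RASR` (CMSIS register names) before enabling the MPU with the default memory map as background region.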

@salkinium
Member Author

Oh, very nice, I hadn't thought about using the MPU that way!
I want to use the MPU to guard against fiber stack overflows, but this is also a very good use-case, so I'll happily have a look at your helper class please!

@ghost
ghost commented Feb 4, 2022

The header is posted here. I have modified it so it compiles in modm. 😃

The example below is tested on an stm32h743 and should work on any Cortex-M4 or Cortex-M7 device; only the number of regions needs to be changed.

#include <modm/board.hpp>
#include <modm/processing.hpp>
#include <modm/platform/clock/rcc.hpp>

#include "mpu.hpp"

using namespace Board;

void mpu_setup()
{
    constexpr auto const stack_sentinal = [] {
        mpu::region_builder<0x24070000, 32, mpu::region_type::normal> reg;
        reg.set_cache_policy(mpu::cache_policy::write_back);
        reg.set_access_policy(mpu::access_policy::no_access);
        reg.update_rnr(0);
        reg.enable();
        return reg.build();
    }();
    mpu::update(stack_sentinal);

    constexpr auto const dma_buffers = [] {
        mpu::region_builder<0x38000000, 2048, mpu::region_type::normal> reg;
        reg.set_cache_policy(mpu::cache_policy::non_cacheable);
        reg.set_access_policy(mpu::access_policy::privileged_only);
        reg.update_rnr(1);
        reg.enable();
        return reg.build();
    }();
    mpu::update(dma_buffers);

    mpu::enable(mpu::mode::default_memory_map);
}

int
main()
{
	Board::initialize();
	Led::setOutput();

	RCC->AHB2ENR |= RCC_AHB2ENR_SRAM1EN_Msk | RCC_AHB2ENR_SRAM2EN_Msk | RCC_AHB2ENR_SRAM3EN_Msk;

	mpu_setup();

	[[maybe_unused]] volatile auto stack_crash = reinterpret_cast<std::uint32_t*>(0x24070000U);
	*stack_crash = 0;

@ghost
ghost commented Feb 4, 2022

I just discovered a bug that should be fixed in the region_builder class. The MPU is very strict about correct alignment and can cause some hairy situations if it's wrong.

static_assert((Address & 0x1f) == 0, "Invalid alignment");

// Should be changed to:
static_assert((Address & (Size - 1)) == 0, "Invalid alignment");
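
To illustrate why the second check is the stricter one: an ARMv7-M MPU region must be a power of two in size, at least 32 bytes, and its base address must be a multiple of that size. The mask `Size - 1` only tests the alignment correctly when `Size` is a power of two, so a sketch of a complete check needs both conditions. The function names below are hypothetical and not taken from the posted header.

```cpp
#include <cstdint>

constexpr bool
is_power_of_two(std::uint32_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

// Sketch of a region check: size is a power of two of at least 32 bytes and
// the base address is aligned to the size. (A 4 GiB region does not fit into
// a 32-bit size and is ignored here.)
constexpr bool
is_valid_mpu_region(std::uint32_t address, std::uint32_t size)
{
    return size >= 32 && is_power_of_two(size) && (address & (size - 1)) == 0;
}

// Both regions from the example above are fine.
static_assert(is_valid_mpu_region(0x24070000u, 32), "stack sentinel region");
static_assert(is_valid_mpu_region(0x38000000u, 2048), "DMA buffer region");

// 32-byte aligned, so the old 0x1f check accepts it, but not 64-byte aligned.
static_assert((0x20000020u & 0x1fu) == 0, "passes the old check");
static_assert(!is_valid_mpu_region(0x20000020u, 64), "rejected by the size-based check");
```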

@salkinium
Member Author

I love it, this is great, do you want to add a modm:platform:mpu module?

We already have a modm_faststack section that we can align(32) to add the space for the guard, so this could be very handy for that. We will have to see if the region builder requires a simpler runtime version, since the specific stack object address is only known at link time, so the constexpr won't work there, but I'm sure we can still make it efficient.
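
A runtime variant might only need a little address arithmetic. The sketch below assumes a descending stack whose lowest address comes from a linker symbol (not shown), a 32-byte guard region, and hypothetical function names. It only computes where the guard would sit and where the usable stack would begin, and does not touch any MPU register.

```cpp
#include <cstdint>

// Smallest region size an ARMv7-M MPU can describe.
constexpr std::uintptr_t kGuardSize = 32;

// Round the lowest stack address up to the next 32-byte boundary,
// since the guard region must be aligned to its own size.
inline std::uintptr_t
guard_base(std::uintptr_t stack_bottom)
{
    return (stack_bottom + kGuardSize - 1) & ~(kGuardSize - 1);
}

// The usable stack starts directly above the guard region.
inline std::uintptr_t
usable_stack_start(std::uintptr_t stack_bottom)
{
    return guard_base(stack_bottom) + kGuardSize;
}
```

If the section is already aligned to 32 bytes by the linker script, `guard_base` returns its argument unchanged and the guard costs exactly 32 bytes of stack.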

@ghost
ghost commented Feb 6, 2022

> I love it, this is great, do you want to add a modm:platform:mpu module?

Yes, later... Right now missing SPI/DMA support is more important.

@rleh
Member

rleh commented Feb 6, 2022

DMA and SPI with DMA support exist for all STM32 chips, see #371, #608, #629 and #772.

@ghost
ghost commented Feb 6, 2022

> DMA and SPI with DMA support exist for all STM32 chips, see #371, #608, #629 and #772.

I can't find it. It's the advanced SPI on the stm32h743vit6: 4 to 32-bit data frames, FIFO buffers, etc.

@rleh
Member

rleh commented Feb 6, 2022

Oh, I see. The STM32H7 family is currently missing an SPI driver completely, see the peripheral matrix. (Not sure if it is just not tested/enabled but identical to the SPI peripherals in other STM32 families, or if ST put a new SPI IP into the STM32H7 controllers, which would require a new driver.)

For all other targets, the SpiMasterN and SpiMasterN_Dma HALs will be generated once you include the lbuild modules modm:platform:spi:N and modm:platform:dma.

e.g.: https://docs.modm.io/develop/api/stm32f745zgt7/

@chris-durand
Member

> Oh, I see. The STM32H7 family is currently missing an SPI driver completely, see the peripheral matrix. (Not sure if it is just not tested/enabled but identical to the SPI peripherals in other STM32 families, or if ST put a new SPI IP into the STM32H7 controllers, which would require a new driver.)

The H7 SPI has a completely different register map and needs a new driver.

vishwamartur added a commit to vishwamartur/modm that referenced this issue Nov 9, 2024
Related to modm-io#485

Enable D-Cache for Cortex-M7 devices.

* Enable the D-Cache in `src/modm/platform/core/cortex/startup.c.in` by adding `SCB_EnableDCache()` after `SCB_EnableICache()`.
* Add a comment explaining the D-Cache enablement and the need for manual invalidation on certain operations.
* Update the documentation in `docs/src/reference/build-systems.md` to reflect the D-Cache enablement for Cortex-M7 devices.
* Add a note in the documentation about the need for manual invalidation on certain operations.

Signed-off-by: Vishwanath Martur <64204611+vishwamartur@users.noreply.github.com>
4 participants