Disabling optimizations in bootloader code. Why? #26

codewitch-honey-crisis · 2024-02-13T17:47:32Z

I didn't feel like this belonged as an issue, but it's a curiosity for me. I enabled the I-Cache and D-Cache on my H7 and got significant gains in CPU-bound operations, so seeing them disabled here surprised me. Also disabling branch prediction? That has me scratching my head.

I found this in the bootloader assembly code for the multicore example (startup.s), but I've seen similar code elsewhere in this repo.

I can't help but wonder what the reason is for doing these things. Disabling the cache I could potentially see if you don't want to tie up valuable SRAM, but what if you do? Are there any gotchas I should be aware of?

And the branch prediction disabling, that just floors me. Does it break something?

(I know I said I would take a break, but my curiosity got the better of me) :)

	mrc     p15, 0, r0, c1, c0, 0       // Read System Control register (SCTLR)
	bic     r0, r0, #(0x1 << 12)        // Clear I bit 12 to disable I Cache
	bic     r0, r0, #(0x1 <<  2)        // Clear C bit  2 to disable D Cache
	bic     r0, r0, #0x1                // Clear M bit  0 to disable MMU
	bic     r0, r0, #(0x1 << 11)        // Clear Z bit 11 to disable branch prediction

Answered by danngreen

danngreen · 2024-02-13T20:18:41Z

Maintainer

The caches and branch prediction are enabled in SystemInit().

They are just disabled in startup so we can set up the stack and other things without having to worry about flushing the cache or anything complex.
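As a rough illustration of the two steps (clear in startup, set again later), here is a host-side C sketch of the bit arithmetic only. The bit positions come from the comments in the startup.s excerpt above; the helper names are hypothetical, and the real SystemInit() may enable these features in a different order or with extra cache and TLB maintenance in between.

```c
#include <stdint.h>

/* SCTLR bit positions, as given in the startup.s comments. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* D-cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* I-cache enable */

#define SCTLR_FEATURE_BITS (SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I)

/* Hypothetical helper: the value the four bic instructions compute. */
static inline uint32_t sctlr_after_startup(uint32_t sctlr)
{
    return sctlr & ~SCTLR_FEATURE_BITS;
}

/* Hypothetical helper: the value a later init step would need in order
 * to have all four features on again. This models only the final
 * register value, not the maintenance steps real code performs. */
static inline uint32_t sctlr_after_init(uint32_t sctlr)
{
    return sctlr | SCTLR_FEATURE_BITS;
}
```

On the target, either value only takes effect once it is written back to SCTLR; the functions above just model the masks.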

Also, the cache does not use SRAM, so disabling it doesn't give you more available RAM (same as on the H7).

The A7 is hardly usable with the caches off. The MMU can be used to set up regions that are non-cacheable, which you need to do for DMA transfers (or else keep the cache on and make sure to flush/invalidate at the right times).
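If you take the second route (cache left on, maintenance done by hand), the operations work on whole cache lines, so the buffer address and length have to be rounded out to line boundaries first. Below is a minimal host-side sketch of that rounding. It assumes a 64-byte data cache line (check the documentation for your core), and the type and function names are hypothetical. The actual clean/invalidate calls are left out because they depend on the HAL or CMSIS layer in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte data cache lines. */
#define CACHE_LINE 64u

/* Hypothetical type: a line-aligned address range. */
typedef struct {
    uintptr_t start;
    size_t    len;
} cache_range_t;

/* Expand [addr, addr + len) outward to whole cache lines. */
static inline cache_range_t cache_range_for(uintptr_t addr, size_t len)
{
    uintptr_t mask  = (uintptr_t)CACHE_LINE - 1u;
    uintptr_t start = addr & ~mask;
    uintptr_t end   = (addr + len + mask) & ~mask;
    cache_range_t r = { start, (size_t)(end - start) };
    return r;
}

/* A DMA buffer that starts on a line boundary and occupies whole lines
 * shares no cache line with unrelated data. */
static inline int dma_buffer_is_line_aligned(uintptr_t addr, size_t len)
{
    return (addr % CACHE_LINE == 0u) && (len % CACHE_LINE == 0u);
}
```

For a TX-only transfer, the usual pattern is to clean (write back) the range before starting the DMA so it reads what the CPU wrote. For RX, the range is invalidated after the transfer completes so the CPU does not read stale lines.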

codewitch-honey-crisis · 2024-02-13T20:26:47Z

Author

Thanks. That clarifies a lot for me. I guess I made a poor assumption about how the cache works, based on some lesser Espressif devices I'm more familiar with. In my first real use of the MCU I will be using one of the DMA controllers, with five of its channels, to drive TX-only SPI lines as fast as the devices will allow. They are rated for 10 MHz according to the datasheet but will accept up to 40 MHz, although I've found no real improvement beyond 20 MHz for some reason (that was on different hardware). The main challenge for me is getting LVGL divvied up across the cores. I don't want them to collide, and I don't want a scheduler. I know, I want my cake, and I want to eat it. Very demanding!
