Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] PR-15073 PR-15086 broke High Priority Zero Latency Interrupts #15100

Closed
1 task done
hartmannathan opened this issue Dec 9, 2024 · 13 comments · Fixed by #15102
Closed
1 task done

[BUG] PR-15073 PR-15086 broke High Priority Zero Latency Interrupts #15100

hartmannathan opened this issue Dec 9, 2024 · 13 comments · Fixed by #15102
Assignees
Labels
Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) Type: Bug Something isn't working

Comments

@hartmannathan
Copy link
Contributor

hartmannathan commented Dec 9, 2024

Description / Steps to reproduce the issue

PR-15073 and PR-15086 removed config ARMV7M_USEBASEPRI and many things that depend on it.

This breaks High Priority, Zero Latency Interrupts! See: https://nuttx.apache.org/docs/latest/guides/zerolatencyinterrupts.html

Also the help text of config ARMV7M_USEBASEPRI explains that "If CONFIG_ARCH_HIPRI_INTERRUPT is selected, then you MUST select CONFIG_ARMV7M_USEBASEPRI" -- see below:

config ARMV7M_USEBASEPRI
	bool "Use BASEPRI Register"
	default ARCH_HIPRI_INTERRUPT
	depends on ARCH_CORTEXM3 || ARCH_CORTEXM4 || ARCH_CORTEXM7
	---help---
		Use the BASEPRI register to enable and disable interrupts. By
		default, the PRIMASK register is used for this purpose. This
		usually results in hardfaults when supervisor calls are made.
		Though, these hardfaults are properly handled by the RTOS, the
		hardfaults can confuse some debuggers. With the BASEPRI
		register, these hardfaults, will be avoided. For more details see
		https://cwiki.apache.org/confluence/display/NUTTX/ARMv7-M+Hardfaults%2C+SVCALL%2C+and+Debuggers
		WARNING:  If CONFIG_ARCH_HIPRI_INTERRUPT is selected, then you
		MUST select CONFIG_ARMV7M_USEBASEPRI.  The Kconfig dependencies
		here will permit to select an invalid configuration because it
		cannot enforce that requirement.  If you create this invalid
		configuration, you will encounter some problems that may be
		very difficult to debug.

IMPORTANT: Without the High Priority Zero Latency Interrupt feature, NuttX is completely useless to me and my company. We depend strongly on this feature to handle critical tasks that cannot be done without this feature!

On which OS does this issue occur?

[OS: Linux]

What is the version of your OS?

All

NuttX Version

master

Issue Architecture

[Arch: arm]

Issue Area

[Area: Other]

Verification

  • I have verified before submitting the report.
@hartmannathan hartmannathan added the Type: Bug Something isn't working label Dec 9, 2024
@github-actions github-actions bot added Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) labels Dec 9, 2024
@hartmannathan
Copy link
Contributor Author

hartmannathan commented Dec 9, 2024

Please see:

#PR-15073 was meant to fix a regression in #PR-14881. #PR-14881 attempted to optimize handling of g_current_regs. The commit log of #PR-15073 explains this.

I suggest to revert #PR-15073, and possibly revert #PR-14881, and find another way to implement #PR-14881 that doesn't cause this problem.

@hartmannathan
Copy link
Contributor Author

Looks like #PR-15086 is also removing Zero Latency Interrupts support.

@hartmannathan hartmannathan changed the title [BUG] PR-15073 broke High Priority Zero Latency Interrupts [BUG] PR-15073 PR-15086 broke High Priority Zero Latency Interrupts Dec 9, 2024
@cederom
Copy link
Contributor

cederom commented Dec 9, 2024

+1 to revert all breaking changes to critical and important features.. and search for a better solution :-)

@acassis
Copy link
Contributor

acassis commented Dec 9, 2024

@hartmannathan when they started modifying these thing I asked them to test the highpri example to confirm it still working in the viewtool-stm32f107 board. As I remember it was told that everything was working, see: #13606

I think they are not using High Priority / Zero Latency yet, this is a nice feature for applications requiring high speed control (few microseconds) and jitter as low as 27nS.

@GUIDINGLI in the thread you said your team will get a STM32F4Discover to test highpri and include in the automated test CI, did they get the board included?

A simple test could be controlling a stepper motor in high speed and verifying it is stopped at right position. @xiaoxiang781216 could you please help here if they didn't get it yet.

@hartmannathan
Copy link
Contributor Author

@acassis I am also preparing to do a test of zero latency interrupts with oscilloscope. It will take me a little time to get the hardware setup to do the test.

@acassis
Copy link
Contributor

acassis commented Dec 9, 2024

@hartmannathan do you have a logic analyzer? It is ideal for this kind of test because it will show the worst jitter and you can get all the pulses saved easily.

@xiaoxiang781216 for an automatic test, I think using another STM32 board with Capture timer to measure the pulses will work fine and will avoid external equipament

@hartmannathan
Copy link
Contributor Author

@acassis yes I have a logic analyzer.

More importantly, I like the idea of building an automatic test by using a second board with capture timer. I might have hardware here to do that. This will be a valuable test to do with a distributed build farm :-)

@hartmannathan
Copy link
Contributor Author

By the way, you don't need a stepper motor. It is enough to make a test that runs in a loop and disables OS interrupts for 10 ms at a time; meanwhile, a high priority interrupt can output a waveform by bit-banging a GPIO at 0.5 ms. If the high priority interrupt works correctly, the waveform will be output while (all other) OS interrupts are disabled.

@xiaoxiang781216
Copy link
Contributor

@hartmannathan #15073 doesn't remove ARMV7M_USEBASEPRI, but enable it unconditionally, that's why ARMV7M_USEBASEPRI is removed from Kconfig.

@hartmannathan
Copy link
Contributor Author

@hartmannathan #15073 doesn't remove ARMV7M_USEBASEPRI, but enable it unconditionally, that's why ARMV7M_USEBASEPRI is removed from Kconfig.

Correct. I misunderstood the change earlier. BASEPRI is always used on ARM targets that support it. There is no need to support PRIMASK on these targets because BASEPRI is a superset of PRIMASK's functionality. Better to remove the ARMV7M_USEBASEPRI config and avoid maintaining 2 alternative implementations.

Thank you!

@cederom
Copy link
Contributor

cederom commented Dec 10, 2024

Good to see all works fine as expected :-)

Guys do you think an ostest could be added to detect proper zero-latency handling on platforms supported? This way we could catch those problems at runtime with a generic test utility :-)

@xiaoxiang781216
Copy link
Contributor

already add here: https://github.com/apache/nuttx-apps/pull/2877/files. but need enable on ci

@hartmannathan
Copy link
Contributor Author

Good to see all works fine as expected :-)

Guys do you think an ostest could be added to detect proper zero-latency handling on platforms supported? This way we could catch those problems at runtime with a generic test utility :-)

I am working on an idea to check for this in hardware.

If we build a distributed hardware test farm, this can be one of the tests that will run to measure jitter in ISR response.

Jitter is how much interrupt latency can vary from one interrupt to another.

With normal interrupts, branching to the ISR may be delayed if interrupts are disabled during critical sections or while servicing another interrupt. The amount of delay can be more one time, less another time. In some applications, this can be significant.

With High Priority, Zero Latency Interrupts, the jitter is limited to the time it takes the CPU to finish the current instruction and branch to the ISR. Although it is never exactly zero, the length of time it takes the CPU to execute an instruction is very small. So the jitter is minimized to the absolute minimum.

A test could work like this:

  1. Use a hardware timer to toggle a GPIO pin at some fixed rate.
  2. Jumper this GPIO to another GPIO, which is configured as External Interrupt.
  3. Setup this interrupt to use High Priority, Zero Latency Interrupts.
  4. In the interrupt, toggle a second GPIO pin.
  5. Measure the latency between toggling of the two GPIOs using a hardware timer/capture peripheral.
  6. Collect the values and compare best-case to worse-case numbers are within some known threshold.
  7. During the test, disable and enable normal OS-level interrupts for varying amounts of time.

If the Zero Latency Interrupts are ever broken due to a software regression, a test like this could detect it automatically. Also if it's not detected immediately, we can use git bisect to find the commit without too much difficulty.

I'm still trying to work out the details. If you have ideas, please share!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: arm Issues related to ARM (32-bit) architecture OS: Linux Issues related to Linux (building system, etc) Type: Bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants