The work in #20589 uncovered a lot of possible improvements for driver/mtd_spi_nor.
In its current form especially, it is not very resilient against failing operations or non-responsive devices, which leads to endless loops and essentially a hanging program.
When mtd_spi_nor_power fails to read the JEDEC ID and no timers are used, it retries the read 560,000,000 times. This takes a considerable amount of time and essentially blocks the system from starting up. The variable "retries" has to be set to a sensible value.
There is no timeout in the functions wait_for_write_enable_cleared and wait_for_write_enable_set, so the thread gets stuck if the chip does not answer. The MX25R6435F datasheet has a block diagram on page 34: https://www.macronix.com/Lists/Datasheet/Attachments/8868/MX25R6435F,%20Wide%20Range,%2064Mb,%20v1.6.pdf Macronix says to issue the WREN command again if the flag has not been set. So the right thing to do is not to wait for the flag to be set, but to retry the command (a finite number of times).
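The retry approach could look roughly like the following sketch. It is not the driver's code: `flash_bus_t`, `write_enable_with_retry` and the `fake_*` helpers are hypothetical names used only so the example runs without hardware. The only chip-related assumption is that the write enable latch (WEL) is bit 1 of the status register, as is usual for SPI NOR flash.

```c
#include <errno.h>
#include <stdint.h>

#define SR_WEL 0x02u /* write enable latch bit of the status register */

/* Hypothetical bus abstraction so the sketch can run without hardware. */
typedef struct {
    void (*send_wren)(void *ctx);      /* issue the WREN command */
    uint8_t (*read_status)(void *ctx); /* read the status register */
    void *ctx;
} flash_bus_t;

/* Issue WREN and check WEL; reissue the command at most max_attempts times
 * instead of polling the flag forever. */
static int write_enable_with_retry(const flash_bus_t *bus, unsigned max_attempts)
{
    for (unsigned i = 0; i < max_attempts; i++) {
        bus->send_wren(bus->ctx);
        if (bus->read_status(bus->ctx) & SR_WEL) {
            return 0;
        }
    }
    return -EIO; /* the chip never latched the write enable */
}

/* Simulated chip that ignores the first `ignore` WREN commands. */
typedef struct {
    unsigned ignore;
    unsigned wren_count;
} fake_chip_t;

static void fake_send_wren(void *ctx)
{
    ((fake_chip_t *)ctx)->wren_count++;
}

static uint8_t fake_read_status(void *ctx)
{
    fake_chip_t *chip = ctx;
    return (chip->wren_count > chip->ignore) ? SR_WEL : 0;
}
```

With the fake chip, a device that ignores two commands succeeds on the third attempt, while a device that never answers makes the call return `-EIO` after `max_attempts` tries instead of blocking the thread.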
The function wait_for_write_complete does not have a timeout either, but it counts the attempts and how many times it yielded the thread. This can be used as a basis for a timeout. The timeout should depend on the operation: a chip erase can take a very long time (in the range of multiple minutes), whereas other operations shouldn't take very long.
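An operation-dependent timeout might be sketched as below. All names (`flash_op_t`, `op_timeout_ms`, `wait_write_complete`, the `fake_*` helpers) are hypothetical, and the millisecond budgets are placeholders, not datasheet values. The sketch assumes the write-in-progress (WIP) flag is bit 0 of the status register and that some millisecond clock is available; here the clock is injected so the example runs on a host.

```c
#include <errno.h>
#include <stdint.h>

#define SR_WIP 0x01u /* write-in-progress bit of the status register */

typedef enum { OP_PAGE_PROGRAM, OP_SECTOR_ERASE, OP_CHIP_ERASE } flash_op_t;

/* Placeholder budgets in milliseconds; real limits must be taken from the
 * datasheet of the chip in use. */
static uint32_t op_timeout_ms(flash_op_t op)
{
    switch (op) {
    case OP_PAGE_PROGRAM: return 10;
    case OP_SECTOR_ERASE: return 500;
    case OP_CHIP_ERASE:   return 300000;
    }
    return 0;
}

/* Hypothetical hooks for status reads and a millisecond clock. */
typedef struct {
    uint8_t (*read_status)(void *ctx);
    uint32_t (*now_ms)(void *ctx);
    void *ctx;
} flash_poll_t;

/* Poll WIP until it clears or the budget for this operation is used up. */
static int wait_write_complete(const flash_poll_t *p, flash_op_t op)
{
    uint32_t start = p->now_ms(p->ctx);
    uint32_t limit = op_timeout_ms(op);

    do {
        if (!(p->read_status(p->ctx) & SR_WIP)) {
            return 0;
        }
        /* unsigned subtraction stays correct when the clock wraps around */
    } while ((uint32_t)(p->now_ms(p->ctx) - start) < limit);

    return -ETIMEDOUT;
}

/* Simulated flash: the clock advances 1 ms per query, and the chip reports
 * busy until the clock reaches busy_until. */
typedef struct {
    uint32_t t;
    uint32_t busy_until;
} fake_flash_t;

static uint32_t fake_now_ms(void *ctx)
{
    return ((fake_flash_t *)ctx)->t++;
}

static uint8_t fake_status(void *ctx)
{
    fake_flash_t *f = ctx;
    return (f->t < f->busy_until) ? SR_WIP : 0;
}
```

In a real driver the loop would also yield or sleep between polls, as wait_for_write_complete already does; that part is left out here to keep the sketch short.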
Add a software fallback for the data integrity functions (to check whether a program or erase operation was successful). Depending on the length of the operation, maybe only do some random checks, to avoid a full blank check of, for example, a 128MB chip.
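A random blank check could be sketched like this. `flash_read_fn`, `blank_spot_check` and `fake_read` are hypothetical names, not part of the MTD API, and the xorshift generator is only there to make the sampling reproducible. The sketch assumes that erased NOR flash reads back as 0xFF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read hook: returns 0 on success, a negative value on error. */
typedef int (*flash_read_fn)(void *ctx, uint32_t addr, uint8_t *buf, size_t len);

/* Small xorshift PRNG; the state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Sample `samples` random bytes from [addr, addr + len) and check that each
 * one reads back as 0xFF.
 * Returns 1 if every sample is erased, 0 if a programmed byte was found,
 * or the negative error code of the read hook. */
static int blank_spot_check(flash_read_fn rd, void *ctx, uint32_t addr,
                            uint32_t len, unsigned samples, uint32_t seed)
{
    uint32_t state = seed ? seed : 1u;

    if (len == 0) {
        return 1;
    }
    for (unsigned i = 0; i < samples; i++) {
        uint8_t byte;
        uint32_t offset = xorshift32(&state) % len;
        int res = rd(ctx, addr + offset, &byte, 1);
        if (res < 0) {
            return res;
        }
        if (byte != 0xFF) {
            return 0;
        }
    }
    return 1;
}

/* Simulated flash backed by a RAM buffer passed in via ctx. */
static int fake_read(void *ctx, uint32_t addr, uint8_t *buf, size_t len)
{
    memcpy(buf, (const uint8_t *)ctx + addr, len);
    return 0;
}
```

A spot check like this trades certainty for speed: it catches an erase that did not happen at all, but it can miss isolated bad bytes, so the number of samples would have to scale with how much confidence is needed.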
When a chip reset is issued and the microcontroller is reset, reading out the JEDEC ID will fail because the Flash is still busy with the reset. Therefore the JEDEC check should check the WIP flag to see if there's still something going on. The function should then return -EBUSY (but I don't know how that would be handled by the rest of the MTD subsystem?)
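The WIP check before the JEDEC read might look like the sketch below. `flash_dev_t`, `read_jedec_checked` and the `fake_*` helpers are hypothetical, as is the choice of `-ENODEV` for an ID of all zeros or all ones; how the MTD subsystem would treat `-EBUSY` remains the open question raised above. The sketch assumes WIP is bit 0 of the status register.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SR_WIP 0x01u /* write-in-progress bit of the status register */

/* Hypothetical device hooks so the sketch can run without hardware. */
typedef struct {
    uint8_t (*read_status)(void *ctx);
    void (*read_jedec)(void *ctx, uint8_t id[3]);
    void *ctx;
} flash_dev_t;

/* Read the JEDEC ID only when the chip reports that it is idle. */
static int read_jedec_checked(const flash_dev_t *dev, uint8_t id[3])
{
    if (dev->read_status(dev->ctx) & SR_WIP) {
        return -EBUSY; /* e.g. a reset or erase is still in progress */
    }
    dev->read_jedec(dev->ctx, id);

    /* All zeros or all ones usually means that nothing drove the bus. */
    if ((id[0] == 0x00 && id[1] == 0x00 && id[2] == 0x00) ||
        (id[0] == 0xFF && id[1] == 0xFF && id[2] == 0xFF)) {
        return -ENODEV;
    }
    return 0;
}

/* Simulated chip with a configurable busy flag and ID. */
typedef struct {
    int busy;
    uint8_t id[3];
} fake_dev_t;

static uint8_t fake_dev_status(void *ctx)
{
    return ((fake_dev_t *)ctx)->busy ? SR_WIP : 0;
}

static void fake_dev_jedec(void *ctx, uint8_t id[3])
{
    memcpy(id, ((fake_dev_t *)ctx)->id, 3);
}
```

Note that a completely unresponsive chip may read back a status of 0xFF, which also has WIP set, so the caller would still need a bounded number of retries on `-EBUSY`.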
The first three points are probably a single PR, the fourth one is a single PR, the fifth one too and the last one might become a can of worms again.
Useful links