v2.7: long delay in open/close for SPI NOR #965
Hi @matthieu-c-tagheuer, thanks for creating an issue. All of these reads look like they're in the same block, so this is probably related to #783. Long story short, littlefs's metadata compaction algorithm is expensive, and it gets worse as a directory's metadata log grows. There are some workarounds you can try:
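For instance (an assumption on my part, not necessarily the workaround meant here), capping the metadata log size with the `metadata_max` config option can shorten compaction scans; a minimal sketch with illustrative values:

```c
#include "lfs.h"

// illustrative geometry only; block device callbacks omitted
const struct lfs_config cfg = {
    .read_size      = 16,
    .prog_size      = 16,
    .block_size     = 4096,
    .block_count    = 1024,
    .block_cycles   = 500,
    .cache_size     = 256,
    .lookahead_size = 16,

    // cap how full a metadata block may get before compaction, so each
    // compaction scans less data (trades storage efficiency for latency)
    .metadata_max   = 2048,
};
```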
Hi @geky, I will try your suggestion. I managed to reproduce it on a PC with littlefs-2.9.1 and rambd. What I see is that we have lots of calls with hint=4, which don't use the cache.
There are also some cases where we parse the flash in reverse order (the hint is also small there).
When iterating over tags, are big tags (tags containing tags) and small tags mixed together? And when traversing in reverse order, we could also use the cache. For example, pass a negative hint that tells the cache not to read block:offset to block:offset + hint, but instead something like block:offset + size - |hint| to block:offset + size, so that the next call hits the cache.
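A rough sketch of the semantics being proposed, as I read them (this is not littlefs code; the helper and types are hypothetical): a positive hint prefetches forward from the read offset, while a negative hint asks the cache to be filled backwards from the end of the read, so a reverse traversal hits the cache on its next, lower-offset read.

```c
#include <stdint.h>

// Hypothetical sketch of the proposed signed-hint cache window. Given a read
// of `size` bytes at `off` and a signed `hint`, compute the
// [*cache_off, *cache_off + *cache_size) window the read cache should be
// filled with.
static void cache_window(uint32_t off, uint32_t size, int32_t hint,
        uint32_t *cache_off, uint32_t *cache_size) {
    if (hint >= 0) {
        // forward hint: cache [off, off + max(hint, size))
        uint32_t ahead = (uint32_t)hint > size ? (uint32_t)hint : size;
        *cache_off = off;
        *cache_size = ahead;
    } else {
        // reverse hint: cache [off + size - |hint|, off + size), clamped so
        // the window still covers the requested read itself
        uint32_t back = (uint32_t)(-hint);
        uint32_t end = off + size;
        uint32_t start = back > end ? 0 : end - back;
        if (start > off) {
            start = off;
        }
        *cache_off = start;
        *cache_size = end - start;
    }
}
```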
Sorry if I don't quite understand. I think you're asking if our tags are very different in size? (oh, that is hard to phrase). The problem is that we don't really know; it depends on exactly what metadata gets stored. Worst case, files can be inlined, which results in rather large tags:
This leads to a tricky tradeoff when it comes to how much to read. On one hand, reading more than we need wastes time on a slow bus; on the other, some of the problem is that if we don't read ahead enough, we end up going back to the flash for every small tag. But this is an interesting area to explore, maybe by exposing a knob for how much to prefetch.
This can be controlled by the cache configuration (read_size / cache_size). Actually, one question remains open for the hypothetical knob mentioned above.
This is quite a neat idea. Unfortunately there's quite a bit of work in progress on this data structure in experimental branches. The good news is that hopefully we will eventually be able to reduce the amount of backwards traversal: instead of unconditionally traversing backwards, the planned design involves conditional branches backwards. This is where a perfect hint system starts to look like a perfect branch predictor...
Hi, thanks. I did some experiments. First, the metadata size should fit in the cache, to avoid too many cache misses.
Improved caching: I implemented the reverse caching, and it works for some cases. Also, read_size should be small (4) in order to avoid trashing the cache:
If read_size is not 4 bytes, we use the cache buffer as an intermediate buffer and trash it. I will provide more info and patches tomorrow.
We already have the same problem in other places of the code. I have a patch to print some cache stats.
Yes, it could be interesting to have both cache_size and cache_fill_size.
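To make the tuning concrete, here is a hedged example config along the lines described above: read_size, prog_size, and cache_size are real lfs_config fields, while cache_fill_size is only the proposed knob and does not exist in mainline littlefs.

```c
#include "lfs.h"

const struct lfs_config cfg = {
    // block device callbacks omitted for brevity
    .read_size      = 4,    // small read granularity, so 4-byte tag reads
                            // don't clobber useful cached data
    .prog_size      = 256,
    .block_size     = 4096,
    .block_count    = 1024,
    .cache_size     = 1024, // big enough to hold a good chunk of a metadata
                            // log, cutting down repeated flash reads
    .lookahead_size = 16,
    // .cache_fill_size = 64, // proposed: how much to fetch per cache miss,
                              // decoupled from the cache buffer size
};
```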
It should be bypassing the cache if the incoming buffer is larger than the cache. Though there may be opportunities where this is missed. Feel free to fork and create a pull request.
Really just because the config system needs work. Validating is a slightly better default because it allows us to detect/recover from write errors. There's nothing stopping us from making validation optional, but it would require either an additional config option or a compile-time flag. There are also plans for additional validation options, such as metadata/data checksumming, so making write validation optional has been low-priority, since it will probably be easier to introduce it together with other related API changes.
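For context, a minimal sketch of what the validation being discussed amounts to (an illustration at the block-device level, not littlefs's actual internals; flash_prog/flash_read are assumed hardware hooks): program, then read back and compare, returning a corrupt error so the caller can relocate the block.

```c
#include <stdint.h>
#include <string.h>

// assumed hardware hooks, provided elsewhere
int flash_prog(uint32_t block, uint32_t off, const void *buf, uint32_t size);
int flash_read(uint32_t block, uint32_t off, void *buf, uint32_t size);

enum { BD_ERR_CORRUPT = -84 }; // mirrors littlefs's LFS_ERR_CORRUPT

static int prog_with_verify(uint32_t block, uint32_t off,
        const void *buf, uint32_t size) {
    int err = flash_prog(block, off, buf, size);
    if (err) {
        return err;
    }

    // read back and compare; skipping this step saves one read per prog,
    // which is the performance-vs-write-error-detection tradeoff above
    uint8_t check[64];
    for (uint32_t i = 0; i < size; i += sizeof(check)) {
        uint32_t chunk = size - i < sizeof(check) ? size - i : sizeof(check);
        err = flash_read(block, off + i, check, chunk);
        if (err) {
            return err;
        }
        if (memcmp(check, (const uint8_t *)buf + i, chunk) != 0) {
            return BD_ERR_CORRUPT;
        }
    }
    return 0;
}
```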
This has been requested a couple times. It makes sense to separate these cache sizes. The main issue is backwards compatibility. It would be tricky to implement separate cache sizes without breaking the current API. There are also other API-breaking changes in the works right now, and breaking the API once is better than breaking the API twice.
Is this significantly better than just increasing cache_size? Something else that should be mentioned is that these caching tradeoffs are all being made with code/RAM size in mind. Caching logic can get very complicated, and even if it results in performance benefits, it might not be the right tradeoff for littlefs. Just something to keep in mind. The cost can be ignored if we introduce the additional caching logic behind an opt-in option.
Hi, here is my PR with my current patches: #977. Some may need some cleaning and improvement; others may be specialized for our case. Before patches:
After patches (LFS_NO_CHECK_DATA enabled):
read_size = 8 vs read_size = 16:
With read_size = 16, there is less caching (40965 direct flash reads vs 40923) and the volume of raw flash reads is bigger (15094160 vs 14740616). With read_size = 256 and no hint modification, in the original code:
With read_size = 4 and the [lfs_dir_traverse: better try to use cache] / [lfs_dir_getslice: reverse caching] patches:
Hi, I have updated my PR, and now the tests pass. I removed the patch to bypass validation. Even as an option, it fails the badblock tests when enabled: silent write/erase failures are not handled well. Read errors also fail the tests; I think this is because the tests' read errors used to be triggered during the write, and now they are triggered in other places of the code, which breaks the tests.

I also removed the patch to have different sizes for the caches. That patch added 2 new entries in lfs_config, but if they are 0, they default to the old behaviour. These 2 patches are on https://github.com/matthieu-c-tagheuer/littlefs/commits/performance2/

For the size comparison, are ASSERTs used?
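A minimal sketch of the "default to old behaviour when 0" rule described above; the new field names here are made up (the real ones are whatever the branch uses), only the fallback logic is the point.

```c
#include <stdint.h>

// hypothetical shape of the two new entries; 0 means "use the old behaviour"
struct cache_cfg_sketch {
    uint32_t cache_size;      // existing lfs_config field
    uint32_t read_cache_size; // new, 0 = fall back to cache_size
    uint32_t prog_cache_size; // new, 0 = fall back to cache_size
};

static uint32_t resolved_size(uint32_t new_size, uint32_t cache_size) {
    return new_size != 0 ? new_size : cache_size;
}
```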
Hi @matthieu-c-tagheuer, sorry for the late response.
Is this still with LFS_NO_CHECK_DATA enabled? I'm also curious about the lfs_dir_traverse changes vs the reverse-caching changes. I don't think the reverse-caching changes will survive other ongoing work, unfortunately...
The validation step is how we detect write errors, which is what test_badblocks tests, so that makes sense. This test could be made optional if validation is disabled. It would just be up to users to decide if the performance-vs-write-error-detection tradeoff is worth it.
It sounds like this may be an actual bug. Currently littlefs assumes all caches are the same size, and there is some hackiness where it relies on this (mainly here). The test_compat tests check for backwards compatibility with the previous version. It's possible that, because of the difference in cache sizes between the two versions, different code is being executed.
Sorry, I'm not sure I understood this question. Is there an assert triggering that is confusing?

I was thinking about how best to add additional caching features to littlefs. One thought is: could this be introduced as an optional block device? So if a user wants to use it:

my_nor_flash_t my_nor_bd;
lfs_cachingbd_t cachingbd;
struct lfs_cachingbd_config cachingbd_cfg = {
    .context = &my_nor_bd,
    .read = my_nor_flash_read,
    .prog = my_nor_flash_prog,
    ...
};
lfs_cachingbd_create(&cachingbd, &cachingbd_cfg) => 0;

lfs_t lfs;
struct lfs_config lfs_cfg = {
    .context = &cachingbd,
    .read = lfs_cachingbd_read,
    .prog = lfs_cachingbd_prog,
    ...
};
lfs_mount(&lfs, &lfs_cfg) => 0;

I realize this would add quite a bit of work, but the result would be quite flexible, allowing users to decide if they want this level of caching, or to implement their own caching strategies. Unfortunately the current block device layer will probably make this more annoying than it needs to be. The bd API needs to be revisited... Thoughts?
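To flesh out the idea a little, here's a rough sketch of what the read path of such a caching block device could look like (every name here is hypothetical, mirroring the pseudocode above): serve hits from a small RAM line, and fall through to the underlying device on a miss.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 256

typedef struct lfs_cachingbd {
    void *bd;                            // underlying block device
    int (*bd_read)(void *bd, uint32_t block,
            uint32_t off, void *buf, uint32_t size);
    uint32_t block, off;                 // location of the cached line
    uint8_t line[CACHE_LINE];
    int valid;
} lfs_cachingbd_t;

int lfs_cachingbd_read(lfs_cachingbd_t *c, uint32_t block,
        uint32_t off, void *buffer, uint32_t size) {
    // simplification: assume reads never straddle a cache line
    uint32_t line_off = off - (off % CACHE_LINE);
    if (!c->valid || c->block != block || c->off != line_off) {
        // miss: refill the line from the underlying device
        int err = c->bd_read(c->bd, block, line_off, c->line, CACHE_LINE);
        if (err) {
            return err;
        }
        c->block = block;
        c->off = line_off;
        c->valid = 1;
    }
    memcpy(buffer, &c->line[off - line_off], size);
    return 0;
}
```

In a real version the line size would come from the config, and prog/erase would need to invalidate or write through the cached line.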
Hi,
I started to do some tests on littlefs. I have a simple test that does (roughly; a littlefs-API sketch follows below):
mkdir ("flash")
for i = 0 ; i < 100; i++
remove("flash/test%d", i)
open("flash/test%d", i)
write(4k)
close()
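Here is roughly the same test expressed with the littlefs API (a sketch only: error checks are trimmed, `lfs` is already mounted, and `buf` holds the 4 KiB payload):

```c
#include <stdint.h>
#include <stdio.h>
#include "lfs.h"

static void run_test(lfs_t *lfs, const uint8_t *buf) {
    lfs_mkdir(lfs, "flash"); // may return LFS_ERR_EXIST on reruns

    for (int i = 0; i < 100; i++) {
        char path[32];
        snprintf(path, sizeof(path), "flash/test%d", i);

        lfs_remove(lfs, path); // ignore LFS_ERR_NOENT on the first run

        lfs_file_t file;
        lfs_file_open(lfs, &file, path, LFS_O_WRONLY | LFS_O_CREAT);
        lfs_file_write(lfs, &file, buf, 4096);
        lfs_file_close(lfs, &file); // the multi-second delays show up here
    }
}
```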
For some open/close calls, I see a delay of several seconds.
I enabled tracing in the read/write/erase callbacks (see attached trace: log4).
log4.txt
In the case of a long delay, there are lots of 16 B reads: around 26780 reads (see long_close.txt)!
long_close.txt
For example:
Where do these reads come from?
Is it because there are too many files in a directory?
Config:
This is a SPI NOR flash. The 26780 reads are not really fast. Would it not be possible to merge contiguous reads into one SPI transaction?