Regression in 2.0.5 -> 2.1.0: Getting Error: failed to open cache: timeout #478

terlar · 2024-11-11T11:14:01Z

Describe the bug

I just tried to update to 2.1.0 from 2.0.5. But now I start getting

Error: failed to open cache: timeout

I get this issue every time I try to run treefmt via pre-commit hook, if I don't have any cache. I have not been able to reproduce it running treefmt directly. It seems to be because pre-commit is batching calls to treefmt and there is some issue running multiple treefmt in parallel and accessing this database?

Of course it is fixable with --no-cache. But would be nice if I could use the cache when it is available.

To Reproduce

Steps to reproduce the behavior:

Git clone https://github.com/terlar/pre-commit-treefmt-bug.git
Enter project directory
Run nix develop
Run rm -r ~/.cache/treefmt
Run pre-commit run --all-files

Expected behavior

Running without failing with a timeout. Perhaps retry? Or add possibility to extend timeout or find another clever solution to the problem.

System information

Linux

Additional context

Only happens together with pre-commit or potentially also if you batch treefmt calls in other ways. I triggered this error with 100 files on one machine, but had to bump it on another one to trigger it. So it could potentially be that you cannot reproduce this depending on your hardware.

The text was updated successfully, but these errors were encountered:

terlar · 2024-11-11T11:20:39Z

I'm pretty sure this was introduced by #409.

terlar · 2024-11-11T12:19:28Z

After some research, perhaps the pre-commit config require_serial: true is what should be used instead. This will leave the parallelization to treefmt. However, could potentially happen in multiple calls, if the filenames cannot fit in one invocation. But perhaps that is not a big issue.

Testing with this configuration the formatting was also faster. So seems relying on pre-commits parallelization slowed things down slightly.

terlar · 2024-11-11T12:45:51Z

Using --no-cache is also considerable faster. So leaving out the cache seems to work well both with serial and non-serial execution. At least in this case. So perhaps when running as a pre-commit hook cache should be disabled?

brianmcgee · 2024-11-12T08:44:58Z

In this instance, --no-cache is the simplest way of guaranteeing that multiple instances of treefmt running at the same time do not conflict. However, on larger repos, the loss of the cache could have quite an impact.

require_serial is a good idea, but as you said, there can still be concurrent invocations.

We could make the connection timeout for the db configurable. It's currently set to 1s. In a pre-commit hook, this could be increased to 10s for example, and would ensure any concurrent invocations are queued.

jfly · 2024-11-14T13:59:14Z

Is it worth discussing how to make parallel usage of treefmt work better? A couple ideas:

I see that bbolt has a sharable, read-only mode: https://github.com/etcd-io/bbolt?tab=readme-ov-file#read-only-mode. We could rework things so treefmt grabs a shared connection to check if files are formatted, then releases the connection, formats, and finally grabs a write connection just to update the db.
My preference: the filesystem is already a good enough db for us! Just create a file with the right "signature" if a given file has been formatted. If it exists, there's nothing to do. If it doesn't exist, then format the file and create the "signature" file.

Both these ideas would create scenarios where parallel treefmt processes could end up doing duplicate work (i.e.: formatting the same file), but I'm pretty sure the use case described above is one where people are "intelligently" using treefmt in parallel (not giving the various treefmt incantations overlapping work).

brianmcgee · 2024-11-14T14:12:49Z

I don't know if parallel execution is a path we should go down 🤔

terlar added the bug Something isn't working label Nov 11, 2024

terlar mentioned this issue Nov 11, 2024

feat(treefmt): add options to settings and disable cache by default cachix/git-hooks.nix#525

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Regression in 2.0.5 -> 2.1.0: Getting Error: failed to open cache: timeout #478

Regression in 2.0.5 -> 2.1.0: Getting Error: failed to open cache: timeout #478

terlar commented Nov 11, 2024 •

edited

Loading

terlar commented Nov 11, 2024

terlar commented Nov 11, 2024

terlar commented Nov 11, 2024

brianmcgee commented Nov 12, 2024

jfly commented Nov 14, 2024 •

edited

Loading

brianmcgee commented Nov 14, 2024

Regression in 2.0.5 -> 2.1.0: Getting Error: failed to open cache: timeout #478

Regression in 2.0.5 -> 2.1.0: Getting Error: failed to open cache: timeout #478

Comments

terlar commented Nov 11, 2024 • edited Loading

terlar commented Nov 11, 2024

terlar commented Nov 11, 2024

terlar commented Nov 11, 2024

brianmcgee commented Nov 12, 2024

jfly commented Nov 14, 2024 • edited Loading

brianmcgee commented Nov 14, 2024

terlar commented Nov 11, 2024 •

edited

Loading

jfly commented Nov 14, 2024 •

edited

Loading