lingui-extract-experimental.ts extractFromFiles concurrently? #1798

Open
yunsii opened this issue Oct 30, 2023 · 8 comments
Comments

@yunsii
Contributor

yunsii commented Oct 30, 2023

Is your feature request related to a problem? Please describe.

For big projects, extractFromFiles is too slow. After some investigation, how about making lingui-extract-experimental.ts run extractFromFiles concurrently?

for (const outFile of Object.keys(bundleResult.metafile.outputs)) {

Describe proposed solution

p-limit?
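
A minimal sketch of what the proposal might look like, assuming a hypothetical `extractFromBundledFile` helper standing in for the body of the loop above (this is not actual Lingui code):

```ts
import os from "node:os";
import pLimit from "p-limit";

// `outputs` is the `bundleResult.metafile.outputs` object from the esbuild
// step; `extractFromBundledFile` is a hypothetical stand-in for the body of
// the existing for-loop.
async function extractConcurrently(
  outputs: Record<string, unknown>,
  extractFromBundledFile: (outFile: string) => Promise<void>,
) {
  const limit = pLimit(os.cpus().length);
  await Promise.all(
    Object.keys(outputs).map((outFile) =>
      limit(() => extractFromBundledFile(outFile)),
    ),
  );
}
```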

@timofei-iatsenko
Collaborator

I considered implementing a worker thread pool for this, but for the first iteration I stopped with it as it is now. You are probably the first user who has really started experimenting with it, so we are starting to get feedback.

If you have the capacity to implement worker threads, I would be happy to help. For now I don't have the capacity to do it on my own.

@timofei-iatsenko
Collaborator

BTW, p-limit doesn't really help here, because while extracting we invoke Babel on bundles, each of which is a single big file. Babel is very CPU-bound and synchronous by nature. The surrounding code also has few async operations, so you won't benefit from running them concurrently in one Node process.
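
To illustrate the point: p-limit only interleaves tasks while they are awaiting something asynchronous. If each task is a synchronous, CPU-bound transform (as the Babel pass is), the tasks still run back to back on the single event-loop thread. A small demonstration:

```ts
import pLimit from "p-limit";

// Synchronous busy work standing in for a CPU-bound Babel transform.
function cpuHeavy(iterations: number): number {
  let acc = 0;
  for (let i = 0; i < iterations; i++) acc += Math.sqrt(i);
  return acc;
}

const limit = pLimit(4);

// Despite the concurrency limit of 4, each task blocks the event loop until
// it returns, so the eight calls execute strictly one after another and the
// total wall time is the same as a plain for-loop.
await Promise.all(
  Array.from({ length: 8 }, () => limit(async () => cpuHeavy(50_000_000))),
);
```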

@yunsii
Contributor Author

yunsii commented Oct 30, 2023

Thanks for your patient explanation. I'm not familiar with worker threads, but I find them very interesting. I'll study the theory first; I'd be glad to join the work if possible.

@timofei-iatsenko
Collaborator

The caveat of working with workers is that you don't have shared memory between them. Treat them as a few standalone Node.js programs started by another one.

So if you want to expose something to all workers, you can't just store it in a global variable. Passing data between the main and child processes is usually done by serializing it somewhere, then reading and deserializing it on the other side. So you can't pass anything non-serializable from the main process to a child, say a function or a class instance.

In Lingui there are a few places where this is needed, and they would have to be re-designed in a different way:

  • Passing the Lingui config to child workers (the config is not serializable to JSON, as it may contain custom formatters / extractors as functions). So you'd instead need to read the config in each thread on its own (this might bring significant overhead!)
  • Passing a Catalog instance object; this should just be designed in a different way.
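
A rough sketch of that pattern with node:worker_threads, assuming hypothetical `loadLinguiConfig` and `extractFromBundles` helpers (these are not existing Lingui APIs). Only structured-cloneable data crosses the thread boundary, and each worker reads the config on its own:

```ts
// main thread
import { Worker } from "node:worker_threads";

function runExtractWorker(configPath: string, outFiles: string[]): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(new URL("./extract-worker.js", import.meta.url), {
      // workerData must be structured-cloneable: paths and strings are fine,
      // functions or class instances (formatters, Catalog) are not.
      workerData: { configPath, outFiles },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}
```

The worker entry would then rebuild everything non-serializable on its own side:

```ts
// extract-worker.ts (hypothetical worker entry, ESM)
import { parentPort, workerData } from "node:worker_threads";

const { configPath, outFiles } = workerData as {
  configPath: string;
  outFiles: string[];
};

// Each worker re-reads and re-parses the config itself, since formatters /
// extractors (functions) can't be passed over the boundary; this is the
// overhead mentioned above.
const config = await loadLinguiConfig(configPath); // hypothetical helper
const messages = await extractFromBundles(config, outFiles); // hypothetical helper

parentPort!.postMessage(messages); // plain, serializable data only
```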

@yunsii
Contributor Author

yunsii commented Oct 30, 2023

Got it. How about making each worker extract one entry? That seems isolated.

@timofei-iatsenko
Collaborator

In your very first message you pointed to the right place in the source code that should be parallelized. Start from there.

@semoal
Contributor

semoal commented Nov 7, 2023

I know that Vitest, instead of using jest-worker, uses Piscina (https://www.npmjs.com/package/piscina), which is far more robust than jest-worker. It could probably be a good addition here.
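
For reference, a rough sketch of what a Piscina-based pool could look like here; the `extract-worker.js` filename and the payload shape are assumptions, not existing Lingui code:

```ts
import path from "node:path";
import { Piscina } from "piscina";

// The worker file would export a function that extracts messages from one
// bundle output file and returns plain, serializable data.
const pool = new Piscina({
  filename: path.resolve(__dirname, "extract-worker.js"),
});

export async function extractAll(outFiles: string[], configPath: string) {
  // Each run() call is queued onto the pool's worker threads.
  return Promise.all(
    outFiles.map((outFile) => pool.run({ outFile, configPath })),
  );
}
```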
