
Add batching to Solo #28

Open
njbernstein opened this issue May 20, 2020 · 2 comments

Comments

@njbernstein
Copy link
Contributor

njbernstein commented May 20, 2020

Currently, users must manually break up their dataset if it contains multiple samples. We can help them by running Solo per batch on their behalf. The main issue is that this will be slower if the user has multiple GPUs available, since the batches would run sequentially.
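The per-batch behavior described above could be sketched roughly as follows. This is a minimal illustration, not Solo's actual API: `run_solo` is a hypothetical callable standing in for a single-batch Solo invocation, and the grouping mimics splitting a dataset on a metadata column.

```python
from collections import defaultdict


def split_by_batch(cell_ids, batch_labels):
    """Group cell identifiers by their batch label.

    Mimics splitting a dataset on a metadata column so that a
    doublet caller like Solo can be run once per batch.
    """
    groups = defaultdict(list)
    for cell, batch in zip(cell_ids, batch_labels):
        groups[batch].append(cell)
    return dict(groups)


def run_per_batch(cell_ids, batch_labels, run_solo):
    # `run_solo` is a hypothetical stand-in for one Solo run; it
    # receives the cells of a single batch and returns per-cell calls.
    results = {}
    for batch, cells in split_by_batch(cell_ids, batch_labels).items():
        results[batch] = run_solo(cells)
    return results
```

On a single GPU the batches run one after another, which is exactly the sequential slowdown noted above when more GPUs are available.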

@cnk113
Copy link

cnk113 commented Jul 19, 2020

I was wondering about the status of this feature. I've been using a little wrapper script that runs Solo on multiple batches, using the Ray library to distribute the work over multiple GPUs; it can also partition how much memory each task uses. My script is pretty clunky, since it requires some manual input and is mainly a wrapper. It would be ideal if Solo could just take a metadata column and run with it.

@njbernstein
Copy link
Contributor Author

Hi @cnk113, I haven't started implementing it yet. That said, if you already have a script that distributes work across multiple GPUs, I would stick with that, because I don't plan on having code in Solo to spread the computation across multiple GPUs. Compute environments are simply too diverse, and engineering code within Solo to handle them would not be worth the effort.

This issue is just for simplifying things for users working on a single GPU.
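For reference, the kind of external multi-GPU wrapper being discussed (whether built on Ray or anything else) ultimately boils down to pinning each per-batch run to one GPU. A minimal sketch of that idea, assuming Solo is invoked as a `solo` CLI on one single-batch input file at a time (the command shape here is hypothetical), is to round-robin batches over GPUs via `CUDA_VISIBLE_DEVICES`:

```python
import itertools
import os


def gpu_commands(batch_files, n_gpus):
    """Build one Solo invocation per batch file, round-robining over GPUs.

    Each command gets an environment that pins a single GPU through
    CUDA_VISIBLE_DEVICES, so the commands can be launched in parallel
    (e.g. with subprocess.Popen) without contending for the same device.
    The ["solo", path] command shape is a hypothetical placeholder.
    """
    gpus = itertools.cycle(range(n_gpus))
    commands = []
    for path, gpu in zip(batch_files, gpus):
        env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu)}
        commands.append((["solo", path], env))
    return commands
```

Keeping this logic outside Solo, as suggested above, lets each site adapt the scheduling to its own compute environment.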
