
Molnsutil state #6

Open
wants to merge 17 commits into master
Conversation

briandrawert

No description provided.

@@ -69,6 +72,7 @@ class MolnsUtilStorageException(Exception):

import os
def get_s3config():


Not that it is so important, but we could change the name to "get_persistent_storage_config".

Author

Agreed
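For illustration, a minimal sketch of what the rename could look like, keeping the old name as an alias so existing callers keep working (the config path, environment variable, and file format below are placeholders, not the actual molnsutil settings):

import json
import os

def get_persistent_storage_config():
    # Read the persistent (S3/Swift) storage settings.
    # The config path and file format here are illustrative placeholders,
    # not the actual molnsutil configuration.
    config_file = os.environ.get('MOLNS_STORAGE_CONFIG',
                                 os.path.expanduser('~/.molns/storage.json'))
    with open(config_file) as fd:
        return json.load(fd)

# Keep the old name as an alias so existing callers are not broken.
get_s3config = get_persistent_storage_config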

@ahellander

Great improvements! I tested "coming back" to the DistributedEnsemble after having had the laptop closed.

Bug: It seems that it always uses one engine fewer than the number of available engines when it calculates the chunk_size. I.e. I had 2 engines and got chunk_size=1. Setting 'num_engines=1' as an argument causes an error.
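For reference, a chunk-size calculation along these lines would spread the work over all available engines rather than one fewer (the function name and signature are illustrative, not the actual molnsutil code):

def compute_chunk_size(number_of_trajectories, num_engines):
    # Spread the trajectories over all available engines (ceiling division),
    # and guard against a zero or negative engine count.
    if num_engines < 1:
        raise ValueError("num_engines must be at least 1")
    return max(1, -(-number_of_trajectories // num_engines))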

Enhancement: This might be what you were mentioning before, but if 'storage_mode' is set to "Persistent", not only the results but also the ensemble state file should be written to S3/Swift, not to the controller fs. This of course introduces the complication that ensemble names become global, so we might need to add an API function to list the names of available ensembles and sweeps. Also, since the S3 bucket name is unique to a controller and private by default, sharing data between different users/controllers will require being able to set the S3/Swift bucket to public, right?
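A rough sketch of what such a listing API could look like, assuming boto-style S3 access and a hypothetical 'ensembles/' key prefix for the state files (neither of which exists in the current code):

import boto

def list_ensembles(bucket_name, access_key, secret_key):
    # List the ensemble/sweep state files stored under a shared prefix in the
    # persistent-storage bucket. The 'ensembles/' prefix is a hypothetical
    # naming convention, not something molnsutil defines today.
    conn = boto.connect_s3(access_key, secret_key)
    bucket = conn.get_bucket(bucket_name)
    return [key.name for key in bucket.list(prefix='ensembles/')]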

I also wonder if we should make the async behavior the default for both ensembles and sweeps, and only use the progress bar after setting the flag "synchronous=True". Also, it would be very nice to mimic the behavior of IPython parallel's map_async, i.e. that we can iterate over trajectories as they arrive:

ensemble = DistributedEnsemble(...)
res = ensemble.run(mapper=g)
for r in res.results:
    print r

This increases the complexity, but it will make it more useful for building more complex workflows while still getting the benefit of all the book-keeping in the DE and PS classes.
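A small sketch of how that could sit on top of IPython.parallel's map_async, with run_single_trajectory as a placeholder mapper (not an existing molnsutil function):

from IPython.parallel import Client

def run_async(mapper, parameters, number_of_trajectories):
    # Submit the trajectories with map_async and hand back the AsyncMapResult,
    # which the caller can iterate over as individual results become ready.
    client = Client()
    view = client.load_balanced_view()
    return view.map_async(mapper, [parameters] * number_of_trajectories)

# Usage:
# res = run_async(run_single_trajectory, params, 100)
# for r in res:
#     print r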
