Implement locking for dataset creation in Globus uploader #446

Open
1 of 2 tasks
craig-willis opened this issue May 17, 2018 · 9 comments

@craig-willis (Contributor) commented May 17, 2018

Because Clowder allows multiple datasets to have the same name, we've frequently run into a problem with duplicate datasets during the Globus upload process. This results in downstream problems, such as extractors not triggering because of incomplete data if files are split across datasets.

We've discussed implementing a method in the Clowder API -- getOrCreateDataset or similar -- that would return the existing dataset ID or create one if it didn't exist, but we've had pushback from the Clowder team since it would require locking in Mongo.

An alternative is to implement locking in the uploader itself, either via Postgres or via another package such as https://github.com/vaidik/sherlock.
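
For reference, a minimal sketch of what the uploader-side option could look like with sherlock backed by Redis; the lock name, expire/timeout values, and the `find_dataset`/`create_dataset` helpers are placeholders, not existing uploader code:

```python
import redis
import sherlock
from sherlock import Lock

# Assumed Redis backend; expire/timeout values are illustrative only.
sherlock.configure(
    backend=sherlock.backends.REDIS,
    client=redis.StrictRedis(host="localhost"),
    expire=60,    # seconds before a stale lock is dropped
    timeout=30,   # seconds to wait when acquiring
)

def get_or_create_dataset(clowder, dataset_name):
    """Serialize dataset creation across uploader processes.

    `clowder.find_dataset` and `clowder.create_dataset` stand in for the
    Clowder API calls the uploader already makes.
    """
    with Lock("globus-dataset-%s" % dataset_name):
        dataset_id = clowder.find_dataset(dataset_name)
        if dataset_id is None:
            dataset_id = clowder.create_dataset(dataset_name)
        return dataset_id
```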

Completion criteria:

  • Implement distributed locking mechanism in uploader
  • Update documentation

@robkooper (Member) commented:

The other option is to add a get_or_create_dataset() endpoint to Clowder. We should be able to use: https://stackoverflow.com/a/16362833
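
For illustration, the SO answer boils down to findAndModify with upsert. A pymongo sketch of the semantics (Clowder itself is a Scala/Play application, and the database, collection, and field names here are assumptions):

```python
from pymongo import MongoClient, ReturnDocument

# Illustrative only: connection string, db/collection, and fields are assumed.
datasets = MongoClient("mongodb://localhost:27017")["clowder"]["datasets"]

def get_or_create_dataset(name, space_id):
    # findAndModify with upsert: returns the matching document if it exists,
    # otherwise inserts one, in a single server-side operation.
    return datasets.find_one_and_update(
        {"name": name, "spaces": space_id},
        {"$setOnInsert": {"name": name, "spaces": space_id}},
        upsert=True,
        return_document=ReturnDocument.AFTER,
    )
```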


@craig-willis (Contributor, Author) commented:

Implementing this in Clowder would obviously be the best option, if this is an acceptable approach.


@craig-willis (Contributor, Author) commented:

As noted on Slack, the Mongo docs (and the SO post above) indicate that this isn't a reliable mechanism unless there's a unique index (https://docs.mongodb.com/manual/reference/method/db.collection.findAndModify/#upsert-and-unique-index). I think we run into the same problem --- with 20+ uploaders running, the dataset creation collision usually happens within a very short window (~ms). We're seeing this problem on < 1% of data right now, so I'd rather go with a solution that definitively fixes the problem.
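
For reference, the guard the Mongo docs call for would be a unique index on the fields used in the upsert query, something like the sketch below (field names assumed; note that Clowder currently allows duplicate dataset names, so this would be a schema-level change):

```python
# Without a unique index, two concurrent upserts that match no existing
# document can both insert, which is exactly the duplicate-dataset race above.
datasets.create_index([("name", 1), ("spaces", 1)], unique=True)
```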

Max and I discussed sherlock backed by Redis, but after reading an exchange about the algorithm I'm not convinced it's any better. I'm now looking at the python-etcd client, which seems (along with Zookeeper) to have a better approach. If this fails, we can discuss the Mongo approach above.
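
Roughly what the python-etcd route might look like; the lock name, TTL, and `create_fn` hook are placeholders, not existing uploader code:

```python
import etcd

client = etcd.Client(host="127.0.0.1", port=2379)  # assumed etcd endpoint

def create_dataset_locked(dataset_name, create_fn):
    """Hold a distributed etcd lock while checking/creating the dataset."""
    lock = etcd.Lock(client, "globus-dataset-%s" % dataset_name)
    # lock_ttl guards against a crashed uploader holding the lock forever;
    # the 60-second value is illustrative only.
    lock.acquire(blocking=True, lock_ttl=60)
    try:
        return create_fn(dataset_name)
    finally:
        lock.release()
```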

craig-willis self-assigned this May 24, 2018

@craig-willis (Contributor, Author) commented:

Per discussion with @robkooper and @max-zilla, we will move ahead with creating a new endpoint in Clowder that locks the collection using an intent exclusive (IX) lock in combination with the findAndModify call above. See also https://docs.mongodb.com/manual/faq/concurrency/#which-operations-lock-the-database.
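
Purely as a sketch of what the uploader side might call once such an endpoint exists; the route and payload below are hypothetical, since they weren't specified in this issue:

```python
import requests

# Hypothetical route and payload for the proposed get-or-create endpoint.
def get_or_create_dataset(clowder_url, api_key, name, space_id):
    resp = requests.post(
        "%s/api/datasets/getOrCreate" % clowder_url,
        params={"key": api_key},
        json={"name": name, "space": space_id},
    )
    resp.raise_for_status()
    return resp.json()["id"]
```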

Assigning to @max-zilla


@craig-willis (Contributor, Author) commented:

After further discussion, back on me to try the etcd approach.


@max-zilla (Contributor) commented:

Waiting on Nebula.


@max-zilla (Contributor) commented:

@craig-willis any updates here?


@craig-willis (Contributor, Author) commented:

#491


@max-zilla (Contributor) commented:

@craig-willis hasn't revisited this after the etcd timeout issues.
