stronger explicit coupling of code and data #86

Open
tiborsimko opened this issue Feb 29, 2016 · 2 comments

@tiborsimko

Nice proposal! Many things in the pitch are exactly what we try to achieve within the context of the CERN Open Data service and the CERN Analysis Preservation pilot.

One suggestion: the proposal seems to address running code at greater length than its relation to data. It may be useful to promote a closer coupling of code and data, e.g. via tools such as git-annex or git-lfs, which let researchers version both software and data in the same place, even when the data itself lives on a remote storage service because of its size.
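
To make the coupling idea concrete, here is a minimal sketch of how git-lfs ties a large file to a repository: the data stays on remote storage, and Git versions only a small text pointer holding the file's SHA-256 and size. The three-line layout follows the Git LFS v1 pointer format; the helper name `lfs_pointer` and the sample payload are hypothetical and only for illustration, not part of the proposal or of any tool's API.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build the text of a Git LFS (spec v1) pointer for `data`.

    Hypothetical helper for illustration: the real git-lfs client
    generates this pointer itself when a tracked file is added.
    """
    oid = hashlib.sha256(data).hexdigest()  # content hash identifies the data
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"  # size in bytes of the real file
    )


if __name__ == "__main__":
    # Stand-in for a dataset; in practice this would be a large file on disk.
    dataset = b"run,event,energy\n1,42,13.0\n"
    print(lfs_pointer(dataset), end="")
```

Because the pointer changes whenever the data changes, each commit pins an exact version of the dataset next to the code that uses it, regardless of where the bytes are actually stored.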

For services like Zenodo, for example, this would make it easy to archive not only software but also (reasonably sized) datasets at release time.

@khinsen
Collaborator

khinsen commented Feb 29, 2016

@tiborsimko That's indeed an important issue, but it is difficult to deal with in our proposal, for two reasons: (1) executability and linking with data are nearly orthogonal issues, and (2) depending on the size and nature of the data, very different technical solutions are required.

What we could do is mention the issue in some kind of outlook, as something we'd look at in phase II.

@lukasheinrich
Contributor

Yes, I'm also interested in this.

@tiborsimko do you know whether the EOS people at CERN have looked into using EOS as a git-lfs backend? (For non-CERNies: EOS is CERN's multi-PB storage solution.)

3 participants