Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration with data catalog API #59

Open
dgarijo opened this issue May 6, 2020 · 15 comments
Open

Integration with data catalog API #59

dgarijo opened this issue May 6, 2020 · 15 comments
Labels
enhancement New feature or request

Comments

@dgarijo
Copy link
Contributor

dgarijo commented May 6, 2020

We would like to search for relevant datasets for those model configurations that do not have any files available (e.g., TopoFlow). This can be done by asking the data catalog to list datasets that are available through their API. @dnfeldman will refine the client in https://github.com/mintproject/data-catalog-client to make sure it is updated and we can use it for querying.

@dnfeldman
Copy link

@dgarijo is the client not currently working correctly? There haven't been any breaking changes to the data catalog api in over a year now. Maybe you could provide a specific example of how the client isn't working?

@dgarijo
Copy link
Contributor Author

dgarijo commented May 8, 2020

@sirspock please answer @dnfeldman
We want to make sure the Python client is the latest and reliable.

@mosoriob
Copy link
Collaborator

mosoriob commented May 8, 2020

The question is: Is the openapi.yaml updated? Do you have all the paths and authorization in the spec?

@mosoriob
Copy link
Collaborator

mosoriob commented May 8, 2020

In october, the openapi.yaml doesn't have information the authentication.

@dnfeldman
Copy link

I'm adding a few more paths that deal with updating records (datasets, resources, variables), but those don't affect searching.

As for authentication, it shouldn't be necessary to interact with the data catalog. Is the lack of authentication information breaking the client?

@dgarijo
Copy link
Contributor Author

dgarijo commented May 8, 2020

@dnfeldman I think it would be great to have a release that you consider stable and (if possible) make package pip installable without the GitHub URL, which may cause instabilities if there are further updates. We want to make sure that when we integrate a version, we can identify it properly. Thanks!
If authentication is not required, then great, less work for using it.

@mosoriob
Copy link
Collaborator

mosoriob commented May 8, 2020

Then, Can I trust the GET methods are ok?

@dnfeldman
Copy link

@sirspock - Yes, GET and POST methods have not changed so they should be ok
@dgarijo - I'll look into publishing this to PyPy although with gihub it's also possible to point to specific version/release to ensure future updates don't break things

@dgarijo
Copy link
Contributor Author

dgarijo commented May 11, 2020

@dnfeldman, the project has 0 releases. We need a release to have a stable version of the software.

@mosoriob
Copy link
Collaborator

@sirspock - Yes, GET and POST methods have not changed so they should be ok
@dgarijo - I'll look into publishing this to PyPy although with gihub it's also possible to point to specific version/release to ensure future updates don't break things

A good practice is to put the version in the API
Docs: https://swagger.io/docs/specification/api-host-and-base-path/
Example: https://api.solutions.mint.isi.edu/v0.0.1/ui/

@dnfeldman
Copy link

I've create a release of the (automatically generated) python client: https://github.com/mintproject/data-catalog-client/releases/tag/v1.0

Please let me know if the current version isn't working as you expected!

Adding versioning to api endpoints might make sense if it ever changes, but as things are right now, I don't think it's going to.

@mosoriob
Copy link
Collaborator

mosoriob commented Jul 9, 2020

At this moment, this is not required because the dt service searches the resource of the dataset id.

@mosoriob mosoriob closed this as completed Jul 9, 2020
@dgarijo
Copy link
Contributor Author

dgarijo commented Jul 9, 2020

@sirspock correct me if I am wrong (and close the issue then), but if I want to search a dataset for a dataset for a model configuration in DAME, we can't. This issue was to represent that use case. Users have asked us this.

@dgarijo dgarijo reopened this Jul 9, 2020
@dnfeldman
Copy link

Sorry, I'm a bit confused - are there any issues with the data catalog client? Or is there a use case that is not handled by the currently API?

@dgarijo
Copy link
Contributor Author

dgarijo commented Jul 10, 2020

@dnfeldman we are trying to prioritize issues when integrating with the data catalog. We have integrated the transformation service, which requires a dataset id to operate. However, I feel like we need to also integrate the search for datasets based on variable for those model configurations that require it.

We haven't had the time to test yet the client, I am afraid. We will let you know when we have done it.

@dgarijo dgarijo removed this from the End-Of-June Release milestone Oct 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants