Adding earthaccess catalog in Intake 2 #352

martindurant · 2023-11-10T21:05:49Z

I have written a little code that enables calling the earthaccess functions from within Intake. The point is that certain queries and dataset results could then be persisted in catalogs without having to keep code snippets around. Users still need to register and understand what the query parameters mean.

Do people here think this is a useful thing to do, and does the implementation look OK? Am I right in assuming that the DOI is the best unique identifier of a data product?

MattF-NSIDC · 2023-11-10T21:23:48Z

Nice! I haven't used Intake before, but excited to see more integrations :) What would using this look like?

> Am I right in assuming that the DOI is the best unique identifier of a data product?

I think collection_concept_id is going to be the "best" unique identifier (as intended by the CMR API, not necessarily easiest-to-use). Under the hood, earthaccess is translating the doi query to a concept_id query by doing a collection search to get the concept_id.

earthaccess/earthaccess/search.py, lines 699 to 702 in 7db2e59:

    collection = DataCollections().doi(doi).get()
    if len(collection) > 0:
        concept_id = collection[0].concept_id()
        self.params["concept_id"] = concept_id
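For illustration, the same DOI-to-concept_id lookup could be sketched from the user side roughly as follows. This is a minimal sketch, not earthaccess's internal code: `resolve_concept_id` and its `search` parameter are hypothetical names introduced here, and it assumes that `earthaccess.search_datasets` accepts a `doi` keyword and returns objects with a `concept_id()` method.

```python
from typing import Callable, Optional, Sequence


def resolve_concept_id(
    doi: str,
    search: Optional[Callable[..., Sequence]] = None,
) -> Optional[str]:
    """Return the concept_id of the first collection matching `doi`, or None.

    `search` is a hypothetical injection point so the lookup can be
    exercised with a stub; by default it uses earthaccess.search_datasets.
    """
    if search is None:
        # Imported lazily so the sketch can be tested without a network call.
        import earthaccess

        search = earthaccess.search_datasets

    # Collection search by DOI, mirroring the snippet above.
    collections = search(doi=doi)
    if len(collections) > 0:
        return collections[0].concept_id()
    return None
```

Calling `resolve_concept_id(doi)` with a real DOI would query CMR; if no collection matches, the function returns `None` instead of setting a query parameter.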

martindurant · 2023-11-10T21:38:04Z

> collection_concept_id is going to be the "best" unique identifier

Thanks, I'll use that.

The use pattern would be like

import intake.readers.catalogs
spec = intake.readers.catalogs.EarthdataCatalogReader(temporal=("2002-01-01", "2002-01-02"), ....)
cat = spec.read()
list(cat) # shows available identifiers, which all have metadata
reader = cat[<identifier>]
ds = reader.read()  # outputs an xr.Dataset

Of course, the flow is nearly identical to what you already have, but the point is that spec and reader, with their parameters, can be saved in catalogs.
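To make the idea of saving a query's parameters more concrete, here is a rough sketch using only the standard library. It is not Intake's actual serialization format: `SavedQuery` and its fields are hypothetical names, and the only thing taken from the snippet above is the reader class path and the `temporal` argument.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class SavedQuery:
    """Hypothetical record of a reader class and its keyword arguments."""

    reader: str
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Sorted keys keep the stored text stable between runs.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "SavedQuery":
        data = json.loads(text)
        return cls(reader=data["reader"], kwargs=data["kwargs"])


# Store the query parameters instead of a code snippet.
query = SavedQuery(
    reader="intake.readers.catalogs.EarthdataCatalogReader",
    kwargs={"temporal": ["2002-01-01", "2002-01-02"]},
)
text = query.to_json()

# Later, or on another machine: recover the same parameters.
restored = SavedQuery.from_json(text)
```

The round trip shows why persisting parameters is enough: anyone with the stored text can rebuild the same query without the original script.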

ebo · 2023-12-06T07:21:06Z

I am working with provisional ATL07/10 data, and would like to set up some access to our local repositories. These are pre-decisional data, and cannot be added for general access. I have been looking for instructions and/or tutorials on how to set up intake/earthaccess to access local files/repositories, but have not figured it out yet, so I thought I would ask here.

As a note, it has been 5+ years since I worked on setting up any intake catalogs, so pointers to instructions on setting this up would be helpful. I will be glad to post tutorials and instructions once I get this worked out, but I will first have to get permission for the public release.

martindurant · 2023-12-06T14:21:12Z

The general Earth catalog maker for Intake 2 is here: https://github.com/intake/intake/blob/745ebd42db371aa7d0f5d7d2ca8744103532819d/intake/readers/catalogs.py#L623

This calls earthaccess.search_datasets - so I don't know how you would change that to point to local resources.

ebo · 2023-12-06T15:48:51Z

Thanks! This gives me a place to start. I'll post something here if I find a workable solution.

