Add support for loading from pystac objects #282

Open
hrodmn opened this issue Jun 26, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@hrodmn
Collaborator

hrodmn commented Jun 26, 2024

Would it make sense to add support for loading Collection, Item, and ItemCollection objects from pystac? It could be useful in Python workflows that generate STAC metadata and load it directly into pgstac.

Right now, I do something like this:

import io
import json

from pypgstac.db import PgstacDB
from pypgstac.load import Loader, Methods
from pystac import ItemCollection
from stactools.package import create_item

# fire up pgstac loader
db = PgstacDB()
loader = Loader(db=db)

# create item collection
item_collection = ItemCollection(
    create_item(href) for href in hrefs
)

# write to ndjson and load into pgstac
buffer = io.BytesIO()
for item in item_collection:
    item.collection_id = COLLECTION_ID_FORMAT.format(
        region=region.value, product=product.value
    )

    buffer.write((json.dumps(item.to_dict()) + "\n").encode("utf-8"))

buffer.seek(0)

loader.load_items(buffer, insert_mode=Methods.upsert)

It isn't that hard to write to ndjson and pass that to load_items, but it would be nice if that operation was handled by pypgstac!
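
For illustration, the ndjson step above could be wrapped in a small helper along these lines. This is only a sketch: `to_ndjson_buffer`, `HasToDict`, and `FakeItem` are hypothetical names and not part of pypgstac or pystac, and `FakeItem` stands in for a pystac Item so the snippet runs without either library installed.

```python
import io
import json
from typing import Any, Dict, Iterable, Protocol


class HasToDict(Protocol):
    """Anything exposing to_dict(), e.g. a pystac Item or Collection."""

    def to_dict(self) -> Dict[str, Any]: ...


def to_ndjson_buffer(objects: Iterable[HasToDict]) -> io.BytesIO:
    """Hypothetical helper (not part of pypgstac): write one JSON document per line."""
    buffer = io.BytesIO()
    for obj in objects:
        buffer.write((json.dumps(obj.to_dict()) + "\n").encode("utf-8"))
    # Rewind so a consumer can read from the start.
    buffer.seek(0)
    return buffer


class FakeItem:
    """Stand-in for a pystac Item so the sketch runs without pystac installed."""

    def __init__(self, item_id: str) -> None:
        self.id = item_id

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "Feature", "id": self.id}


buffer = to_ndjson_buffer([FakeItem("a"), FakeItem("b")])
lines = buffer.read().decode("utf-8").splitlines()
```

With real pystac objects, the returned buffer would presumably be passed to loader.load_items in the same way as in the snippet above.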

@hrodmn hrodmn added the enhancement New feature or request label Jun 26, 2024
@keewis

keewis commented Aug 16, 2024

Looks like you can pass a sequence of dicts to load_items and load_collections, so this could work already:

items = [item.to_dict() for item in item_collection]
loader.load_items(items, insert_mode=Methods.upsert)

@hrodmn
Collaborator Author

hrodmn commented Aug 16, 2024

Yeah, that does work. I think I was thrown off by a bad type hint. Passing a list does in fact work, but my type checker complains about list[Unknown] being incompatible with Iterator[Any]. Maybe this signature should use Iterable[Any] instead of Iterator[Any]:

def read_json(file: Union[Path, str, Iterator[Any]] = "stdin") -> Iterable:
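
To make the distinction concrete, here is a minimal, stdlib-only sketch of why a list fails an Iterator check but passes an Iterable one. `collect_records` is a hypothetical stand-in for a reader that only loops over its input; it is not the actual read_json implementation.

```python
import collections.abc as abc
from typing import Any, Iterable, List

items: List[Any] = [{"id": "a"}, {"id": "b"}]

# A list defines __iter__ but not __next__, so it is an Iterable
# and not an Iterator.
list_is_iterable = isinstance(items, abc.Iterable)
list_is_iterator = isinstance(items, abc.Iterator)


def collect_records(source: Iterable[Any]) -> List[Any]:
    """Hypothetical reader that only loops over its input."""
    return [record for record in source]


# Both a plain list and an explicit iterator are accepted.
from_list = collect_records(items)
from_iterator = collect_records(iter(items))
```

Because the function only uses a for loop, annotating the parameter as Iterable[Any] covers lists, generators, and iterators alike.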

@keewis

keewis commented Aug 16, 2024

True, for read_json that could be Iterable: the code checks for Iterable and iterates over the variable, but never calls next() on it.

However, Iterator inherits from Iterable, and load_items at least needs the parameter to be an Iterator, so unless you call read_json manually you won't be able to pass the wider Iterable type (I have no idea how to deal with list[Unknown] vs Iterator[Any], though).
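
One possible way around the list vs Iterator[Any] mismatch, assuming the annotation stays as it is, is to hand over an actual iterator: either wrap the list in iter() or use a generator. The sketch below uses `load_items_stub` and `iter_dicts`, both hypothetical names standing in for a function annotated with Iterator[Any]; whether a given type checker accepts this for the real load_items is untested here.

```python
from typing import Any, Dict, Iterator, List


def load_items_stub(file: Iterator[Any]) -> List[Any]:
    """Hypothetical stand-in for a function annotated with Iterator[Any]."""
    return list(file)


def iter_dicts(objects: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Generator function: calling it returns an Iterator."""
    for obj in objects:
        yield obj


items = [{"id": "a"}, {"id": "b"}]

# iter() turns the list into a list iterator, which is an Iterator.
loaded_via_iter = load_items_stub(iter(items))

# A generator is also an Iterator, so it satisfies the same annotation.
loaded_via_generator = load_items_stub(iter_dicts(items))
```

Either form keeps the call site a one-liner, at the cost of the iterator being consumable only once.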
