Spatial Data Uploads
Chad Burt edited this page Jul 19, 2023 · 9 revisions
SeaSketch uses a system similar to Felt's to enable spatial data uploads, converting them to PMTiles for storage and hosting. The upload system:
- Accepts uploaded spatial data, informing the user of progress towards ingest into their SeaSketch project
- Transforms it into cloud-native visualization formats such as PMTiles or plain geojson if small enough
- Stores both the original upload and a "canonical" representation that can be converted to other forms to support data export and download
- Extracts metadata from files to support metadata viewing (markdown) and styling (mapbox-geostats/tilejson)
- Assigns a default cartographic style, ideally using metadata to pick from a set of appropriate templates
- Monitors the size of uploaded datasets and enforces per-project limits on both individual upload size and total uploaded bytes
- Keeps data private and only accessible from www.seasketch.org
- Deletes data representations from cloud storage if deleted from a project
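The per-project limits in the list above can be illustrated with a minimal sketch. The `ProjectQuota` shape, the `checkUploadQuota` function, and the reason codes are all hypothetical names chosen for illustration; this page does not say where SeaSketch stores its limits or how it reports a rejected upload.

```typescript
// Hypothetical quota shape; the real limits and their storage are not described on this page.
interface ProjectQuota {
  maxUploadBytes: number; // largest single upload allowed
  maxTotalBytes: number; // total uploaded bytes allowed for the whole project
}

// Hypothetical result type: either the upload is allowed, or it is rejected with a reason.
type QuotaResult =
  | { ok: true }
  | { ok: false; reason: "UPLOAD_TOO_LARGE" | "PROJECT_QUOTA_EXCEEDED" };

// Checks a proposed upload against both the per-upload and the per-project limit.
function checkUploadQuota(
  uploadBytes: number,
  projectBytesUsed: number,
  quota: ProjectQuota
): QuotaResult {
  if (uploadBytes > quota.maxUploadBytes) {
    return { ok: false, reason: "UPLOAD_TOO_LARGE" };
  }
  if (projectBytesUsed + uploadBytes > quota.maxTotalBytes) {
    return { ok: false, reason: "PROJECT_QUOTA_EXCEEDED" };
  }
  return { ok: true };
}
```

A check like this would most naturally run before a presigned upload URL is issued, so that an over-quota file is never transferred, but that placement is an assumption rather than something this page states.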
sequenceDiagram
participant Client
participant G as GraphQL API
participant D as Database
participant W as Graphile Worker
participant Lambda as AWS Lambda
participant R2 as Cloudflare R2
Client->>G: createDataUpload mutation
G->>D: creates record in data_uploads
G-->>Client: DataUpload with presignedUploadUrl
Client->>R2: Uploads spatial file directly to cloud storage using presigned url
Client->>G: Calls submitDataUpload mutation
G->>D: calls submit_data_upload fn
D->>W: submit_data_upload triggers processDataUpload task
W->>Lambda: processDataUpload updates progress and triggers the lambda
Lambda->>R2: Lambda fetches uploaded data from cloud storage
loop
Lambda-->>D: Updates db with progress while processing
end
Lambda->>R2: Stores outputs (pmtiles, etc)
Lambda->>W: lambda calls processDataUploadOutputs worker task
W->>D: Creates data_layer, data_source, and table_of_contents_items
loop
Client-->>G: DataUploadManager uses a GraphQL subscription to monitor task state
end
G-->>Client: receives upload status
Client->>G: Fetches new table of contents & displays layers
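The client's side of the sequence above (create the upload record, send the file straight to cloud storage, then submit) can be sketched as follows. The `UploadApi` interface and its method signatures are hypothetical stand-ins for the real GraphQL client and HTTP calls; only the mutation names `createDataUpload` and `submitDataUpload` and the `presignedUploadUrl` field come from the diagram.

```typescript
// Hypothetical abstraction over the GraphQL mutations and the HTTP upload.
// The argument lists are assumptions, not the real SeaSketch schema.
interface UploadApi {
  createDataUpload(
    filename: string,
    byteLength: number
  ): Promise<{ id: string; presignedUploadUrl: string }>;
  putToPresignedUrl(url: string, body: Uint8Array): Promise<void>;
  submitDataUpload(id: string): Promise<void>;
}

// Runs the three client-initiated steps in order and returns the upload id,
// which the caller could then use to watch processing progress.
async function uploadSpatialFile(
  api: UploadApi,
  filename: string,
  body: Uint8Array
): Promise<string> {
  // 1. Ask the API for an upload record and a presigned URL.
  const upload = await api.createDataUpload(filename, body.byteLength);
  // 2. Send the bytes directly to cloud storage, bypassing the API server.
  await api.putToPresignedUrl(upload.presignedUploadUrl, body);
  // 3. Tell the API the file is in place so processing can be queued.
  await api.submitDataUpload(upload.id);
  return upload.id;
}
```

Passing the API in as an argument keeps the ordering logic separate from any particular GraphQL or HTTP library, which also makes the flow easy to exercise with a fake implementation.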
The upload system stores spatial data in Cloudflare R2 (previously AWS CloudFront). It stores the original upload and several derivative resources, depending on the data type. These include:
- The original upload (supported types are currently shapefile, geojson, flatgeobuf, and geotiff)
- A canonical form that can be processed into data visualization products and other exports.
  - For vector this is flatgeobuf
  - For raster this is geotiff
- The visualization product. PMTiles for large vector and raster data, and geojson for < 1MB geojson files.
Each resource should go in a data_source_resources table which includes its location, file size, and type. These "resources" are the primary output of the tiling lambda process.
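The rules above for choosing the canonical form and the visualization product can be written out as a small sketch. The `PlannedResource` type, its `role` field, and the `planResources` function are hypothetical and are not the actual data_source_resources schema. The 1 MB threshold is taken as 1,000,000 bytes here, which is also an assumption.

```typescript
// Supported upload types, as listed above.
type UploadType = "shapefile" | "geojson" | "flatgeobuf" | "geotiff";
type ResourceType = UploadType | "pmtiles";

// Hypothetical record describing one resource the pipeline is expected to produce.
interface PlannedResource {
  role: "original" | "canonical" | "visualization";
  type: ResourceType;
}

// Assumption: "< 1MB" means fewer than 1,000,000 bytes.
const SMALL_GEOJSON_BYTES = 1000000;

// Lists the resources expected for an upload of the given type and size.
function planResources(uploadType: UploadType, byteLength: number): PlannedResource[] {
  // geotiff is the only raster type in the supported list; everything else is vector.
  const canonical = uploadType === "geotiff" ? "geotiff" : "flatgeobuf";
  // Small geojson uploads are served as-is; everything else is tiled to PMTiles.
  const visualization =
    uploadType === "geojson" && byteLength < SMALL_GEOJSON_BYTES ? "geojson" : "pmtiles";
  return [
    { role: "original", type: uploadType },
    { role: "canonical", type: canonical },
    { role: "visualization", type: visualization },
  ];
}
```

For example, `planResources("shapefile", 5000000)` yields a shapefile original, a flatgeobuf canonical form, and a PMTiles visualization. When the original is already in the canonical format, this sketch still lists both entries; whether the real system stores one file or two in that case is not stated here.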