Skip to content

wri/gfw-data-api

Folders and files

NameName
Last commit message
Last commit date
Jun 28, 2024
Oct 24, 2024
Aug 28, 2024
Aug 5, 2024
Oct 15, 2024
Sep 24, 2024
Sep 6, 2024
Feb 16, 2020
Oct 1, 2024
Jan 29, 2021
Feb 27, 2024
Aug 7, 2024
Dec 29, 2022
Aug 10, 2022
Oct 30, 2024
Jul 2, 2024
Jul 2, 2024
Jan 28, 2024
Apr 15, 2020
Sep 24, 2024
Sep 24, 2024
Oct 30, 2024
Mar 4, 2024
Nov 7, 2024
Jun 6, 2020
Nov 7, 2024
Jun 6, 2020

Repository files navigation

GFW Data API

High-performance Async REST API, in Python. FastAPI + GINO + Uvicorn (powered by PostgreSQL).

Get Started

Run Locally with Docker

  1. Clone this Repository. git clone https://github.com/wri/gfw-data-api.git
  2. Run ./scripts/setup from the root directory. (Run pip install pipenv first, if necessary.)
  3. Run locally using docker-compose. ./scripts/develop

Developing

  • Generate a DB Migration: ./scripts/migrate (note app/settings/prestart.sh will run migrations automatically when running /scripts/develop)
  • Run tests: ./scripts/test
    • --no_build - don't rebuild the containers
    • --moto-port=<port_number> - explicitly sets the motoserver port (default 50000)
  • Run specific tests: ./scripts/test tasks/test_vector_source_assets.py::test_vector_source_asset
  • Each development branch app instance gets its isolated database in AWS dev account that's cloned from geostore database. This database is named with the branch suffix (like geostore_<branch_name>). If a PR includes a database migration, once the change is merged to higher environments, the geostore database needs to also be updated with the migration. This can be done by manually replacing the existing database by a copy of a cleaned up version of the branch database (see ./prestart.sh script for cloning command).
  • Debug memory usage of Batch jobs with memory_profiler:
    1. Install memory_profiler in the job's Dockerfile
    2. Modify the job's script to run with memory_profiler. Ex: pixetl "${ARG_ARRAY[@]}" -> mprof run -M -C -T 1 --python /usr/local/app/gfw_pixetl/pixetl.py "${ARG_ARRAY[@]}"
    3. scp memory_profiler's .dat files off of the Batch instance (found in /tmp by default) while the instance is still up

Features

Core Dependencies

  • FastAPI: touts performance on-par with NodeJS & Go + automatic Swagger + ReDoc generation.
  • GINO: built on SQLAlchemy core. Lightweight, simple, asynchronous ORM for PostgreSQL.
  • Uvicorn: Lightning-fast, asynchronous ASGI server.
  • Optimized Dockerfile: Optimized Dockerfile for ASGI applications, from https://github.com/tiangolo/uvicorn-gunicorn-docker.

Additional Dependencies

  • Pydantic: Core to FastAPI. Define how data should be in pure, canonical python; validate it with pydantic.
  • Alembic: Handles database migrations. Compatible with GINO.
  • SQLAlchemy_Utils: Provides essential handles & datatypes. Compatible with GINO.
  • PostgreSQL: Robust, fully-featured, scalable, open-source.