diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
deleted file mode 100644
index ecc26730..00000000
--- a/ARCHITECTURE.md
+++ /dev/null
@@ -1,75 +0,0 @@
-# Architecture
-
-This document describes the high level architecture of the nwp-consumer project.
-
-## Birds-eye view
-
-```mermaid
-flowchart
- subgraph "Hexagonal Architecture"
-
- subgraph "NWP Consumer"
- subgraph "Ports"
- portFI(FetcherInterface) --- core
- core --- portSI(StorageInterface)
-
- subgraph "Core"
- core{{Domain Logic}}
- end
- end
- end
-
- subgraph "Driving Adaptors"
- i1{ICON} --implements--> portFI
- i2{ECMWF} --implements--> portFI
- i3{MetOffice} --implements--> portFI
- end
-
- subgraph "Driven Adaptors"
- portSI --- o1{S3}
- portSI --- o2{Huggingface}
- portSI --- o3{LocalFS}
- end
-
- end
-```
-
-At the top level, the consumer downloads raw NWP data, processes it into zarr format, and saves it to a storage backend.
-
-It is built following the hexagonal architecture pattern.
-This pattern is used to separate the core business logic from the driving and driven adaptors.
-The core business logic is the `service` module, which contains the domain logic.
-This logic is agnostic to the driving and driven actors,
-instead relying on abstract classes as the ports to interact with them.
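-
-As a minimal sketch of this pattern (the method names here are illustrative
-assumptions, not the real interface), the ports are abstract base classes
-along these lines:
-
-```python
-import abc
-import datetime as dt
-import pathlib
-
-
-class FetcherInterface(abc.ABC):
-    """Port through which the core requests raw NWP data (illustrative)."""
-
-    @abc.abstractmethod
-    def downloadToCache(self, it: dt.datetime) -> list[pathlib.Path]:
-        """Download raw files for the given init time into the local cache."""
-
-
-class StorageInterface(abc.ABC):
-    """Port through which the core persists processed data (illustrative)."""
-
-    @abc.abstractmethod
-    def store(self, src: pathlib.Path, dst: str) -> None:
-        """Move a file from the cache to the storage backend."""
-```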
-
-
-## Entry Points
-
-`src/nwp_consumer/cmd/main.py` contains the main function which runs the consumer.
-
-`src/nwp_consumer/internal/service/consumer.py` contains the `NWPConsumer` class,
-the methods of which are the business use cases of the consumer.
-
-`StorageInterface` and `FetcherInterface` classes define the ports used by driving and driven actors.
-
-`src/nwp_consumer/internal/inputs` contains the adaptors for the driving actors.
-
-`src/nwp_consumer/internal/outputs` contains the adaptors for the driven actors.
-
-## Core
-
-The core business logic is contained in the `service` module.
-According to the hexagonal pattern, the core logic is agnostic to the driving and driven actors.
-As such, there is an internal data representation of the NWP data that the core logic acts upon.
-Because NWP data is multidimensional, it is hard to define a rigid schema for it.
-
-Internal data is stored as an xarray dataset.
-This dataset effectively acts as a collection of `DataArray`s, one for each parameter or variable.
-It should have the following dimensions and coordinates:
-
-- `time` dimension
-- `step` dimension
-- `latitude` or `y` dimension
-- `longitude` or `x` dimension
-
-Parameters should be stored as DataArrays in the dataset.
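-
-As a minimal sketch of such a dataset (shapes, values, and parameter names
-are illustrative only):
-
-```python
-import datetime as dt
-
-import numpy as np
-import xarray as xr
-
-# Two parameters on a tiny grid, with one init time and two forecast steps
-ds = xr.Dataset(
-    data_vars={
-        "t": (("time", "step", "latitude", "longitude"), np.zeros((1, 2, 4, 4))),
-        "prate": (("time", "step", "latitude", "longitude"), np.zeros((1, 2, 4, 4))),
-    },
-    coords={
-        "time": [dt.datetime(2021, 1, 1)],
-        "step": np.array([0, 1], dtype="timedelta64[h]"),
-        "latitude": np.linspace(61.0, 49.0, 4),
-        "longitude": np.linspace(-8.0, 2.0, 4),
-    },
-)
-```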
\ No newline at end of file
diff --git a/README.md b/README.md
index abfacfe2..4c15d748 100644
--- a/README.md
+++ b/README.md
@@ -42,26 +42,64 @@ $ docker pull ghcr.io/openclimatefix/nwp-consumer
## Example usage
-**To create an archive of GFS data:**
+**To download the latest available day of GFS data:**
-TODO
+```bash
+$ nwp-consumer consume
+```
-## Documentation
+**To create an archive of a month of GFS data:**
+
+> [!NOTE]
+> This will download several gigabytes of data to your home partition.
+> Make sure you have plenty of free space (and time!).
-TODO: link to built documentation
+```bash
+$ nwp-consumer archive --year 2024 --month 1
+```
-Documentation is generated via [pydoctor](https://pydoctor.readthedocs.io/en/latest/).
+## Documentation
+
+Documentation is generated via [pdoc](https://pdoc.dev/docs/pdoc.html).
To build the documentation, run the following command in the repository root:
```bash
-$ python -m pydoctor
+$ PDOC_ALLOW_EXEC=1 python -m pdoc -o docs --docformat=google src/nwp_consumer
```
+> [!NOTE]
+> The `PDOC_ALLOW_EXEC=1` environment variable is required because the
+> `ocf_blosc2` library performs work at import time, so pdoc must be
+> allowed to execute modules in order to document the package.
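+
+To preview the documentation on a local web server instead of writing static
+files, the output directory can be omitted:
+
+```bash
+$ PDOC_ALLOW_EXEC=1 python -m pdoc --docformat=google src/nwp_consumer
+```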
+
## FAQ
### How do I authenticate with model repositories that require accounts?
+Authentication, as well as model repository selection, is handled via environment variables.
+Choose a repository via the `MODEL_REPOSITORY` environment variable. The environment
+variables each repository requires can be found in its metadata function. Missing variables
+are warned about at runtime.
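+
+For example, to target the ECMWF realtime repository (the credential variable
+below is a placeholder, not a real name; check the metadata function for the
+actual list):
+
+```bash
+$ export MODEL_REPOSITORY=ecmwf-realtime
+$ export SOME_REQUIRED_CREDENTIAL=...  # placeholder; see the repository metadata
+$ nwp-consumer consume
+```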
+
+### How do I use an S3 bucket for created stores?
+
+The `ZARRDIR` environment variable can be set to an S3 URL
+(e.g. `s3://some-bucket-name/some-prefix`). Valid credentials for accessing the bucket
+must be discoverable in the environment, as per
+[Botocore's documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).
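+
+For example, using static credentials (one of several sources Botocore can
+discover):
+
+```bash
+$ export ZARRDIR=s3://some-bucket-name/some-prefix
+$ export AWS_ACCESS_KEY_ID=...
+$ export AWS_SECRET_ACCESS_KEY=...
+$ nwp-consumer consume
+```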
+
+### How do I change what variables are pulled?
+
+With difficulty! This package pulls data specifically tailored to Open Climate Fix's needs,
+and as such, the data it pulls (and the schema with which that data is surfaced)
+is a fixed part of the package. A large part of the value proposition of this consumer is
+that the data it produces is consistent and comparable between different sources, so pull
+requests that add or change this for a specific model are unlikely to be approved.
+However, desired changes can be made by cloning the repo and modifying the expected
+coordinates in the metadata of the desired model repository.
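+
+As a rough sketch of that workflow (the metadata path is an assumption based
+on the package layout; check the source tree for the exact location):
+
+```bash
+$ git clone https://github.com/openclimatefix/nwp-consumer
+$ cd nwp-consumer
+# modify the expected coordinates in the metadata of the chosen model
+# repository, e.g. under src/nwp_consumer/internal/repositories/
+$ pip install -e .
+```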
## Development
@@ -77,7 +115,8 @@ $ python -m ruff check .
```
Be sure to do this periodically while developing to catch any errors early
-and prevent headaches with the CI pipeline.
+and prevent headaches with the CI pipeline. It may seem like a hassle at first,
+but it heads off the accidental creation of a whole suite of bugs.
### Running the test suite
diff --git a/src/nwp_consumer/cmd/main.py b/src/nwp_consumer/cmd/main.py
index eddb3dcf..92aab700 100644
--- a/src/nwp_consumer/cmd/main.py
+++ b/src/nwp_consumer/cmd/main.py
@@ -31,8 +31,11 @@ def parse_env() -> Adaptors:
case "metoffice-datahub":
model_repository_adaptor = \
repositories.model_repositories.MetOfficeDatahubModelRepository
- case _ as model:
- log.error(f"Unknown model: {model}")
+ case _ as mr:
+ log.error(
+ f"Unknown model repository '{mr}'. Expected one of "
+ f"['gfs', 'ceda', 'ecmwf-realtime', 'metoffice-datahub']"
+ )
sys.exit(1)
notification_repository_adaptor: type[ports.NotificationRepository]
diff --git a/src/nwp_consumer/cmd/test_main.py b/src/nwp_consumer/cmd/test_main.py
deleted file mode 100644
index 73ee7683..00000000
--- a/src/nwp_consumer/cmd/test_main.py
+++ /dev/null
@@ -1,56 +0,0 @@
-import datetime as dt
-import os
-import unittest
-from unittest import mock
-
-from nwp_consumer.internal import FetcherInterface
-
-from .main import _parse_from_to
-
-
-class TestParseFromTo(unittest.TestCase):
- def test_today(self) -> None:
- # Test that today is processed correctly
- start, end = _parse_from_to("today", None)
- self.assertEqual(
- start,
- dt.datetime.now(tz=dt.UTC).replace(hour=0, minute=0, second=0, microsecond=0),
- )
- self.assertEqual(
- end,
- dt.datetime.now(tz=dt.UTC).replace(hour=0, minute=0, second=0, microsecond=0)
- + dt.timedelta(days=1),
- )
-
- def test_from_date(self) -> None:
- # Test that a date is processed correctly
- start, end = _parse_from_to("2021-01-01", None)
- self.assertEqual(start, dt.datetime(2021, 1, 1, tzinfo=dt.UTC))
- self.assertEqual(end, dt.datetime(2021, 1, 2, tzinfo=dt.UTC))
-
- def test_from_datetime(self) -> None:
- # Test that a datetime is processed correctly
- start, end = _parse_from_to("2021-01-01T12:00", None)
- self.assertEqual(start, dt.datetime(2021, 1, 1, 12, 0, tzinfo=dt.UTC))
- self.assertEqual(end, dt.datetime(2021, 1, 1, 12, 0, tzinfo=dt.UTC))
-
- def test_from_datetime_to_date(self) -> None:
- # Test that a datetime is processed correctly
- start, end = _parse_from_to("2021-01-01T12:00", "2021-01-02")
- self.assertEqual(start, dt.datetime(2021, 1, 1, 12, 0, tzinfo=dt.UTC))
- self.assertEqual(end, dt.datetime(2021, 1, 2, 0, tzinfo=dt.UTC))
-
- def test_from_datetime_to_datetime(self) -> None:
- # Test that a datetime is processed correctly
- start, end = _parse_from_to("2021-01-01T12:00", "2021-01-02T12:00")
- self.assertEqual(start, dt.datetime(2021, 1, 1, 12, 0, tzinfo=dt.UTC))
- self.assertEqual(end, dt.datetime(2021, 1, 2, 12, 0, tzinfo=dt.UTC))
-
- def test_invalid_datetime(self) -> None:
- # Test that an invalid datetime is processed correctly
- with self.assertRaises(ValueError):
- _parse_from_to("2021-01-01T12:00:00", None)
-
- with self.assertRaises(ValueError):
- _parse_from_to("2021010100", None)
diff --git a/src/nwp_consumer/internal/cache.py b/src/nwp_consumer/internal/cache.py
deleted file mode 100644
index 4bdfd34b..00000000
--- a/src/nwp_consumer/internal/cache.py
+++ /dev/null
@@ -1,91 +0,0 @@
-"""Defines the cache for the application.
-
-Many sources of data do not give any option for accessing their files
-via e.g. a BytesIO object. If they did, a generic local filesystem
-adaptor could handle all incoming data. Since they instead often
-require a pre-existing file object to push data into, a cache is
-needed to store the data temporarily.
-
-The cache is a simple directory structure that stores files in a
-hierarchical format; with the top level directory being the source of
-the data, followed by a subdirectory for the type of data (raw or
-zarr), then further subdirectories according to the init time
-associated with the file.
-
-Driven actors are then responsible for mapping the cached data to the
-desired storage location.
-
-Example:
-|--- /tmp/nwpc
-| |--- source1
-| | |--- raw
-| | | |--- 2021
-| | | |--- 01
-| | | |--- 01
-| | | |--- 0000
-| | | |--- parameter1.grib
-| | | |--- parameter2.grib
-| | | |--- 1200
-| | | |--- parameter1.grib
-| | | |--- parameter2.grib
-| | |--- zarr
-| | |--- 2021
-| | |--- 01
-| | |--- 01
-| | |--- 20210101T0000.zarr.zip
-| | |--- 20210101T1200.zarr.zip
-"""
-
-import datetime as dt
-import pathlib
-
-# --- Constants --- #
-
-# Define the location of the consumer's cache directory
-CACHE_DIR = pathlib.Path("/tmp/nwpc") # noqa: S108
-CACHE_DIR_RAW = CACHE_DIR / "raw"
-CACHE_DIR_ZARR = CACHE_DIR / "zarr"
-
-# Define the datetime format strings for creating a folder
-# structure from a datetime object for raw and zarr files
-IT_FOLDER_STRUCTURE_RAW = "%Y/%m/%d/%H%M"
-IT_FOLDER_GLOBSTR_RAW = "*/*/*/*"
-IT_FOLDER_STRUCTURE_ZARR = "%Y/%m/%d"
-IT_FOLDER_GLOBSTR_ZARR = "*/*/*"
-
-# Define the datetime format string for a zarr filename
-IT_FILENAME_ZARR = "%Y%m%dT%H%M.zarr"
-IT_FULLPATH_ZARR = f"{IT_FOLDER_STRUCTURE_ZARR}/{IT_FILENAME_ZARR}"
-
-# --- Functions --- #
-
-
-def rawCachePath(it: dt.datetime, filename: str) -> pathlib.Path:
- """Create a filepath to cache a raw file.
-
- Args:
- it: The initialisation time of the file to cache.
- filename: The name of the file (including extension).
-
- Returns:
- The path to the cached file.
- """
- # Build the directory structure according to the file's datetime
- parent: pathlib.Path = CACHE_DIR_RAW / it.strftime(IT_FOLDER_STRUCTURE_RAW)
- parent.mkdir(parents=True, exist_ok=True)
- return parent / filename
-
-
-def zarrCachePath(it: dt.datetime) -> pathlib.Path:
- """Create a filepath to cache a zarr file.
-
- Args:
- it: The initialisation time of the file to cache.
-
- Returns:
- The path to the cache file.
- """
- # Build the directory structure according to the file's datetime
- parent: pathlib.Path = CACHE_DIR_ZARR / it.strftime(IT_FOLDER_STRUCTURE_ZARR)
- parent.mkdir(parents=True, exist_ok=True)
- return parent / it.strftime(IT_FILENAME_ZARR)
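-
-
-# Usage sketch: given the constants above, the helpers produce paths like
-#
-#   rawCachePath(dt.datetime(2021, 1, 1, tzinfo=dt.UTC), "parameter1.grib")
-#   -> /tmp/nwpc/raw/2021/01/01/0000/parameter1.grib
-#
-#   zarrCachePath(dt.datetime(2021, 1, 1, tzinfo=dt.UTC))
-#   -> /tmp/nwpc/zarr/2021/01/01/20210101T0000.zarr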
diff --git a/src/nwp_consumer/internal/config/__init__.py b/src/nwp_consumer/internal/config/__init__.py
deleted file mode 100644
index 84b9e414..00000000
--- a/src/nwp_consumer/internal/config/__init__.py
+++ /dev/null
@@ -1,31 +0,0 @@
-"""Configuration for the service."""
-
-__all__ = [
- "EnvParser",
- "CEDAEnv",
- "ConsumerEnv",
- "CMCEnv",
- "ECMWFMARSEnv",
- "ECMWFS3Env",
- "ICONEnv",
- "GFSEnv",
- "HuggingFaceEnv",
- "MetOfficeEnv",
- "S3Env",
- "LocalEnv",
-]
-
-from .env import (
- CEDAEnv,
- CMCEnv,
- ConsumerEnv,
- ECMWFMARSEnv,
- ECMWFS3Env,
- EnvParser,
- GFSEnv,
- HuggingFaceEnv,
- ICONEnv,
- LocalEnv,
- MetOfficeEnv,
- S3Env,
-)
diff --git a/src/nwp_consumer/internal/config/env.py b/src/nwp_consumer/internal/config/env.py
deleted file mode 100644
index 2a7e80ad..00000000
--- a/src/nwp_consumer/internal/config/env.py
+++ /dev/null
@@ -1,248 +0,0 @@
-"""Config struct for application running."""
-import os
-from distutils.util import strtobool
-from typing import get_type_hints
-
-import structlog
-
-from nwp_consumer import internal
-from nwp_consumer.internal import inputs, outputs
-
-log = structlog.getLogger()
-
-
-class EnvParser:
- """Mixin to parse environment variables into class fields.
-
- Whilst this could be done with Pydantic, it's nice to avoid the
- extra dependency if possible, and pydantic would be overkill for
- this small use case.
- """
-
- def __init__(self) -> None:
- """Parse environment variables into class fields.
-
- If the class field is upper case, parse it into the indicated
- type from the environment. Required fields are those set in
- the child class without a default value.
-
- Examples:
-            >>> class MyEnv(EnvParser):
-            ...     REQUIRED_ENV_VAR: str
-            ...     OPTIONAL_ENV_VAR: str = "default value"
-            ...     ignored_var: str = "ignored"
- """
- for field, t in get_type_hints(self).items():
- # Skip item if not upper case
- if not field.isupper():
- continue
-
- # Log Error if required field not supplied
- default_value = getattr(self, field, None)
- match (default_value, os.environ.get(field)):
- case (None, None):
- # No default value, and field not in env
- raise OSError(f"Required field {field} not supplied")
- case (_, None):
- # A default value is set and field not in env
- pass
- case (_, _):
- # Field is in env
- env_value: str | bool = os.environ[field]
-                    # Handle bools separately as bool("False") == True
- if t == bool:
- env_value = bool(strtobool(os.environ[field]))
- # Cast to desired type
- self.__setattr__(field, t(env_value))
-
- @classmethod
- def print_env(cls) -> None:
- """Print the required environment variables."""
-        message: str = f"Environment variables for {cls.__name__}:\n"
- for field, _ in get_type_hints(cls).items():
- if not field.isupper():
- continue
- default_value = getattr(cls, field, None)
-            message += f"\t{field}{'(default: ' + str(default_value) + ')' if default_value else ''}\n"
- log.info(message)
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Configure the associated fetcher."""
- raise NotImplementedError(
- "Fetcher not implemented for this environment. Check the available inputs.",
- )
-
- def configure_storer(self) -> internal.StorageInterface:
- """Configure the associated storer."""
- raise NotImplementedError(
- "Storer not implemented for this environment. Check the available outputs.",
- )
-
-
-# --- Configuration environment variables --- #
-
-
-class ConsumerEnv(EnvParser):
- """Config for Consumer."""
-
- DASK_SCHEDULER_ADDRESS: str = ""
-
-
-# --- Inputs environment variables --- #
-
-
-class CEDAEnv(EnvParser):
- """Config for CEDA FTP server."""
-
- CEDA_FTP_USER: str
- CEDA_FTP_PASS: str
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.ceda.Client(ftpUsername=self.CEDA_FTP_USER, ftpPassword=self.CEDA_FTP_PASS)
-
-
-class MetOfficeEnv(EnvParser):
- """Config for Met Office API."""
-
- METOFFICE_ORDER_ID: str
- METOFFICE_API_KEY: str
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.metoffice.Client(
- apiKey=self.METOFFICE_API_KEY,
- orderID=self.METOFFICE_ORDER_ID,
- )
-
-
-class ECMWFMARSEnv(EnvParser):
- """Config for ECMWF MARS API."""
-
- ECMWF_API_KEY: str
- ECMWF_API_URL: str
- ECMWF_API_EMAIL: str
- ECMWF_AREA: str = "uk"
- ECMWF_HOURS: int = 48
- ECMWF_PARAMETER_GROUP: str = "default"
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.ecmwf.MARSClient(
- area=self.ECMWF_AREA,
- hours=self.ECMWF_HOURS,
- param_group=self.ECMWF_PARAMETER_GROUP,
- )
-
-
-class ECMWFS3Env(EnvParser):
- """Config for ECMWF S3."""
-
- ECMWF_AWS_S3_BUCKET: str
- ECMWF_AWS_ACCESS_KEY: str = ""
- ECMWF_AWS_ACCESS_SECRET: str = ""
- ECMWF_AWS_REGION: str
- ECMWF_AREA: str = "uk"
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.ecmwf.S3Client(
- bucket=self.ECMWF_AWS_S3_BUCKET,
- area=self.ECMWF_AREA,
- region=self.ECMWF_AWS_REGION,
- key=self.ECMWF_AWS_ACCESS_KEY,
- secret=self.ECMWF_AWS_ACCESS_SECRET,
- )
-
-
-class ICONEnv(EnvParser):
- """Config for ICON API."""
-
- ICON_MODEL: str = "europe"
- ICON_HOURS: int = 48
- ICON_PARAMETER_GROUP: str = "default"
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.icon.Client(
- model=self.ICON_MODEL,
- hours=self.ICON_HOURS,
- param_group=self.ICON_PARAMETER_GROUP,
- )
-
-
-class CMCEnv(EnvParser):
- """Config for CMC API."""
-
- CMC_MODEL: str = "gdps"
- CMC_HOURS: int = 240
- CMC_PARAMETER_GROUP: str = "full"
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.cmc.Client(
- model=self.CMC_MODEL,
- hours=self.CMC_HOURS,
- param_group=self.CMC_PARAMETER_GROUP,
- )
-
-
-class GFSEnv(EnvParser):
- """Config for GFS API."""
-
- GFS_MODEL: str = "global"
- GFS_HOURS: int = 48
- GFS_PARAMETER_GROUP: str = "default"
-
- def configure_fetcher(self) -> internal.FetcherInterface:
- """Overrides the corresponding method in the parent class."""
- return inputs.noaa.AWSClient(
- model=self.GFS_MODEL,
- param_group=self.GFS_PARAMETER_GROUP,
- hours=self.GFS_HOURS,
- )
-
-
-# --- Outputs environment variables --- #
-
-
-class LocalEnv(EnvParser):
- """Config for local storage."""
-
- # Required for EnvParser to believe it's a valid class
- dummy_field: str = ""
-
- def configure_storer(self) -> internal.StorageInterface:
- """Overrides the corresponding method in the parent class."""
- return outputs.localfs.Client()
-
-
-class S3Env(EnvParser):
- """Config for S3."""
-
- AWS_S3_BUCKET: str
- AWS_ACCESS_KEY: str = ""
- AWS_ACCESS_SECRET: str = ""
- AWS_REGION: str
-
- def configure_storer(self) -> internal.StorageInterface:
- """Overrides the corresponding method in the parent class."""
- return outputs.s3.Client(
- bucket=self.AWS_S3_BUCKET,
- region=self.AWS_REGION,
- key=self.AWS_ACCESS_KEY,
- secret=self.AWS_ACCESS_SECRET,
- )
-
-
-class HuggingFaceEnv(EnvParser):
- """Config for HuggingFace API."""
-
- HUGGINGFACE_TOKEN: str
- HUGGINGFACE_REPO_ID: str
-
- def configure_storer(self) -> internal.StorageInterface:
- """Overrides the corresponding method in the parent class."""
- return outputs.huggingface.Client(
- token=self.HUGGINGFACE_TOKEN,
- repoID=self.HUGGINGFACE_REPO_ID,
- )
diff --git a/src/nwp_consumer/internal/config/test_env.py b/src/nwp_consumer/internal/config/test_env.py
deleted file mode 100644
index fc720140..00000000
--- a/src/nwp_consumer/internal/config/test_env.py
+++ /dev/null
@@ -1,63 +0,0 @@
-"""Tests for the config module."""
-
-import unittest.mock
-
-from .env import EnvParser, ICONEnv
-
-
-class TestConfig(EnvParser):
- """Test config class."""
-
- REQUIRED_STR: str
- REQUIRED_BOOL: bool
- REQUIRED_INT: int
- OPTIONAL_STR: str = "default"
- OPTIONAL_BOOL: bool = True
- OPTIONAL_INT: int = 4
-
-
-class Test_EnvParser(unittest.TestCase):
-    """Tests for the EnvParser class."""
-
- @unittest.mock.patch.dict(
- "os.environ",
- {
- "REQUIRED_STR": "required_str",
- "REQUIRED_BOOL": "false",
- "REQUIRED_INT": "5",
- },
- )
- def test_parsesEnvVars(self) -> None:
- tc = TestConfig()
-
- self.assertEqual("required_str", tc.REQUIRED_STR)
- self.assertFalse(tc.REQUIRED_BOOL)
- self.assertEqual(5, tc.REQUIRED_INT)
- self.assertEqual("default", tc.OPTIONAL_STR)
- self.assertTrue(tc.OPTIONAL_BOOL)
- self.assertEqual(4, tc.OPTIONAL_INT)
-
- @unittest.mock.patch.dict(
- "os.environ",
- {
- "REQUIRED_STR": "required_str",
- "REQUIRED_BOOL": "not a bool",
- "REQUIRED_INT": "5.7",
- },
- )
- def test_errorsIfCantCastType(self) -> None:
- with self.assertRaises(ValueError):
- TestConfig()
-
- def test_errorsIfRequiredFieldNotSet(self) -> None:
- with self.assertRaises(OSError):
- TestConfig()
-
- @unittest.mock.patch.dict(
- "os.environ", {"ICON_HOURS": "3", "ICON_PARAMETER_GROUP": "basic"}
- )
- def test_parsesIconConfig(self) -> None:
- tc = ICONEnv()
-
- self.assertEqual(3, tc.ICON_HOURS)
- self.assertEqual("basic", tc.ICON_PARAMETER_GROUP)
diff --git a/src/nwp_consumer/internal/inputs/__init__.py b/src/nwp_consumer/internal/inputs/__init__.py
deleted file mode 100644
index b8d4905f..00000000
--- a/src/nwp_consumer/internal/inputs/__init__.py
+++ /dev/null
@@ -1,22 +0,0 @@
-"""Available inputs to source data from."""
-
-__all__ = [
- "ceda",
- "metoffice",
- "ecmwf",
- "icon",
- "cmc",
- "meteofrance",
- "noaa",
-]
-
-from . import (
- ceda,
- cmc,
- ecmwf,
- icon,
- meteofrance,
- metoffice,
- noaa,
-)
-
diff --git a/src/nwp_consumer/internal/inputs/ceda/README.md b/src/nwp_consumer/internal/inputs/ceda/README.md
deleted file mode 100644
index 04f28d3c..00000000
--- a/src/nwp_consumer/internal/inputs/ceda/README.md
+++ /dev/null
@@ -1,273 +0,0 @@
-# CEDA
-
----
-
-## Data
-
-See
-- https://artefacts.ceda.ac.uk/formats/grib/
-- https://dap.ceda.ac.uk/badc/ukmo-nwp/doc/NWP_UKV_Information.pdf
-
-Investigate files via eccodes:
-
-```shell
-$ conda install -c conda-forge eccodes
-```
-
-More info on eccodes: https://confluence.ecmwf.int/display/ECC/grib_ls
-
-For example:
-
-```shell
-$ grib_ls -n parameter -w stepRange=1 201901010000_u1096_ng_umqv_Wholesale1.grib
-```
-
-## Files
-
-Sourced from https://zenodo.org/record/7357056. There are two files per
-`init_time` (model run time) that contain surface-level parameters of interest.
-
-The contents of those files differ somewhat from what is presented in the
-document above.
-
-#### Un-split File 1 `yyyymmddhhmm_u1096_ng_umqv_Wholesale1.grib`
-
-Full domain, 35 time steps and the following surface level parameters.
-
-| paramId | shortName | units | name |
-|---------|-----------|----------------|-------------------------|
-| 130 | t | K | Temperature |
-| 3017 | dpt | K | Dew point temperature |
-| 3020 | vis | m | Visibility |
-| 157 | r | % | Relative humidity |
-| 260074 | prmsl | Pa | Pressure reduced to MSL |
-| 207 | 10si | m s**-1 | 10 metre wind speed |
-| 260260 | 10wdir | Degree true | 10 metre wind direction |
-| 3059 | prate | kg m**-2 s**-1 | Precipitation rate |
-| | unknown | unknown | unknown |
-
-View via pasting the output of the following to this
-[online table converter](https://tableconvert.com/json-to-markdown):
-
-```shell
-$ grib_ls -n parameter -w stepRange=0 -j 201901010000_u1096_ng_umqv_Wholesale1.grib
-```
-
-When loaded using *cfgrib*, this file produces 5 distinct xarray datasets.
-
-Wholesale1 Datasets
-
- --- Dataset 1 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 ... 1 days 12:00:00
- heightAboveGround float64 1.0
- valid_time (step) datetime64[ns] 2019-01-01 ... 2019-01-02T12:00:00
- Dimensions without coordinates: values
- Data variables:
- t (step, values) float32 ... (1.5m temperature)
- r (step, values) float32 ... (1.5m relative humidity)
- dpt (step, values) float32 ... (1.5m dew point)
- vis (step, values) float32 ... (1.5m visibility)
-
- --- Dataset 2 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 ... 1 days 12:00:00
- heightAboveGround float64 10.0
- valid_time (step) datetime64[ns] 2019-01-01 ... 2019-01-02T12:00:00
- Dimensions without coordinates: values
- Data variables:
- si10 (step, values) float32 ... (10m wind speed)
- wdir10 (step, values) float32 ... (10m wind direction)
-
- --- Dataset 3 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- meanSea float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- prmsl (step, values) float32 ... (mean sea level pressure)
-
- --- Dataset 4 ---
- Dimensions: (step: 36, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 01:00:00 02:00:00 ... 1 days 12:00:00
- surface float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- unknown (step, values) float32 ... (?)
-
- --- Dataset 5 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- surface float64 0.0
- valid_time (step) datetime64[ns] 2019-01-01 ... 2019-01-02T12:00:00
- Dimensions without coordinates: values
- Data variables:
- unknown (step, values) float32 ... (?)
- prate (step, values) float32 ... (total precipitation rate)
-
-Wholesale2 Datasets
-
- --- Dataset 1 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- atmosphere float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- unknown (step, values) float32 ... (?)
-
- --- Dataset 2 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- cloudBase float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- cdcb (step, values) float32 ... (convective cloud base height)
-
- --- Dataset 3 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 ... 1 days 12:00:00
- heightAboveGroundLayer float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- lcc (step, values) float32 ... (low cloud amount)
-
- --- Dataset 4 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 ... 1 days 12:00:00
- heightAboveGroundLayer float64 1.524e+03
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- mcc (step, values) float32 ... (medium cloud amount)
-
- --- Dataset 5 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 ... 1 days 12:00:00
- heightAboveGroundLayer float64 4.572e+03
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- hcc (step, values) float32 ... (high cloud amount)
-
- --- Dataset 6 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- surface float64 0.0
- valid_time (step) datetime64[ns] 2019-01-01 ... 2019-01-02T12:00:00
- Dimensions without coordinates: values
- Data variables:
- unknown (step, values) float32 ...
- sde (step, values) float32 ... (snow depth water equivalent)
- hcct (step, values) float32 ... (height of convective cloud top)
- dswrf (step, values) float32 ... (downward short-wave radiation flux)
- dlwrf (step, values) float32 ... (downward long-wave radiation flux)
-
- --- Dataset 7 ---
- Dimensions: (step: 37, values: 385792)
- Coordinates:
- time datetime64[ns] 2019-01-01
- * step (step) timedelta64[ns] 00:00:00 01:00:00 ... 1 days 12:00:00
- level float64 0.0
- valid_time (step) datetime64[ns] ...
- Dimensions without coordinates: values
- Data variables:
- h (step, values) float32 ... (geometrical height)
-
-