From 488fd960bac02cb4e11d51d5693ee8061ff3585e Mon Sep 17 00:00:00 2001
From: Kian-Tat Lim
Date: Fri, 25 Oct 2024 01:44:05 -0700
Subject: [PATCH] Flesh out contribution information.

---
 doc/contributor-guide/adding-columns.rst      | 21 +++++++++++++++----
 doc/contributor-guide/adding-tables.rst       |  9 ++++----
 .../inserting-information.rst                 | 15 ++++++++-----
 3 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/doc/contributor-guide/adding-columns.rst b/doc/contributor-guide/adding-columns.rst
index fa2551aa..d688bdb6 100644
--- a/doc/contributor-guide/adding-columns.rst
+++ b/doc/contributor-guide/adding-columns.rst
@@ -1,6 +1,19 @@
-###############
+##############
 Adding Columns
-###############
+##############
 
-* Values should be usefully summarized
-* Try to make everything into some kind of scalar
+Structure:
+* ConsDB content must relate to exposures or visits or observations structured like exposures. General time series should go in the Engineering and Facilities Database (EFD).
+* ConsDB content should generally be scalar values. Large amounts of data, especially arrays or images or cubes, should generally go into the Large File Annex (LFA).
+* Avoid arrays expressed as individual columns (e.g. ``something0``, ``something1``, ``something2``) where possible, as this increases the number of columns drastically (and there is `a limit `_), makes it hard to query (``SELECT`` clauses need to list all of these individually, and ``WHERE`` clauses may need to include large ``OR`` or ``AND`` conditions), and potentially requires a lot of database storage space.
+* Columns should be named in all lowercase with underscore (``_``) separators, also known as "snake_case".
+
+Data sources:
+* Columns added to the ``exposure`` and ``ccdexposure`` tables must be derived from the Header Service running at the Summit for a given instrument, which extracts information from the EFD in real time and is designed to provide information critical for Alert Production.
(This service also populates the ``visit1`` and ``ccdvisit1`` views.) Changes must typically be coordinated with both the Header Service and the ConsDB teams, in addition to being added to `sdm_schemas `_.
+* The source for the ``exposure_efd*`` tables is the EFD Transformation service running at the US Data Facility, which extracts information from the EFD in batches and is designed for all other EFD data. It has its own configuration.
+* Ensure that the data source for the table to which the column is being added will in fact produce that column.
+
+Column descriptions:
+* Make sure the description is understandable to a non-staff scientist, and try to avoid internal jargon.
+* Include `units `_ for measurements. Note that these should follow IVOA standards, not Astropy unit standards.
+* Include a `Unified Content Descriptor (UCD) `_ indicating the meaning of the column.
diff --git a/doc/contributor-guide/adding-tables.rst b/doc/contributor-guide/adding-tables.rst
index 271b4b50..c077351e 100644
--- a/doc/contributor-guide/adding-tables.rst
+++ b/doc/contributor-guide/adding-tables.rst
@@ -2,7 +2,8 @@
 Adding Tables
 ##############
 
-* Each source of data should have its own table(s)
-* Each dimension combination (exposure, visit, exposure+detector, visit+detector, etc.) should have its own table(s)
-* Normalize when possible
-* De-normalize via views to make querying easier
+* Each source of data should have its own table(s).
+* Conversely, each new table being added should have its data source identified.
+* Each dimension combination (exposure, visit, exposure+detector, visit+detector, etc.) should have its own table(s).
+* Normalize when possible. Try not to repeat non-key columns between tables with the same dimensions.
+* De-normalize via views to make querying easier.
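Reviewer note on the adding-tables hunk: the "one table per source, de-normalize via views" guidance can be illustrated with a small sketch. The example below uses an in-memory SQLite database purely for illustration. The table names, column names, and values are hypothetical and are not taken from the ConsDB schema, and the database engine behind ConsDB may behave differently.

```python
import sqlite3

# Hypothetical tables and columns, for illustration only; none of these
# names come from the ConsDB schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One table per data source, both keyed on the same dimension (exposure).
    -- Non-key columns are not repeated between the two tables.
    CREATE TABLE exposure_source_a (
        exposure_id INTEGER PRIMARY KEY,
        air_temp REAL
    );
    CREATE TABLE exposure_source_b (
        exposure_id INTEGER PRIMARY KEY,
        seeing_fwhm REAL
    );
    -- De-normalized view so a user can query both sources in one SELECT.
    CREATE VIEW exposure_summary AS
    SELECT a.exposure_id, a.air_temp, b.seeing_fwhm
    FROM exposure_source_a AS a
    LEFT JOIN exposure_source_b AS b USING (exposure_id);
    """
)
conn.execute("INSERT INTO exposure_source_a VALUES (1, 12.5)")
conn.execute("INSERT INTO exposure_source_b VALUES (1, 0.8)")

# The view hides the join from the person writing the query.
rows = conn.execute("SELECT * FROM exposure_summary").fetchall()
print(rows)
```

Keeping each source in its own table means a late or failed source leaves only its own columns empty in the view, which is why the sketch uses a ``LEFT JOIN`` rather than an inner join.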
diff --git a/doc/contributor-guide/inserting-information.rst b/doc/contributor-guide/inserting-information.rst
index dc81f882..769b202e 100644
--- a/doc/contributor-guide/inserting-information.rst
+++ b/doc/contributor-guide/inserting-information.rst
@@ -2,9 +2,14 @@
 Inserting Information
 #####################
 
-* Sasquatch
-  * REST API
-  * Direct Kafka messages
+Four tools can be used to insert information into ConsDB.
 
-* ConsDB client library in summit_utils
-* ConsDB REST API
+* `Sasquatch `_
+  * Sasquatch will be configured to write via a Kafka Connector to tables in ConsDB. This should become the preferred interface for data sources to insert information. It provides isolation from SQL details (and does not require a SQL client library), and it can be used from any programming language. The Kafka messaging system provides resiliency.
+  * `REST Proxy `_
+  * `Direct Kafka messages `_
+* ConsDB Python client library in summit_utils
+  * This library is currently implemented using the Web service API, but it can be changed in the future to use Sasquatch.
+* ConsDB Web service API
+  * The Web service API (pqserver) provides some of the same advantages as Sasquatch, but it does not provide any buffering, retries, or resiliency. We hope to phase out its usage when Sasquatch becomes available.
+* Direct SQL ``INSERT``. This is discouraged. Appropriate credentials would have to be arranged.
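Reviewer note on the inserting-information hunk: because the Web service API is described as providing no buffering or retries, a caller that uses it in the meantime may want its own retry loop. The following is a minimal sketch of that idea under stated assumptions. ``call_with_retries`` and ``flaky_insert`` are hypothetical names invented here; they are not part of summit_utils, pqserver, or Sasquatch, and no real endpoint is contacted.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying with exponential backoff if it raises.

    Hypothetical helper: `action` stands in for whatever function performs
    the actual insert. The last exception is re-raised once all attempts
    have been used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2**attempt)
    raise AssertionError("unreachable")


# Demonstration with a stand-in insert that fails twice before succeeding.
calls = {"count": 0}
delays: List[float] = []


def flaky_insert() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "inserted"


# A fake sleep records the delays instead of actually waiting.
result = call_with_retries(flaky_insert, attempts=3, sleep=delays.append)
print(result, delays)
```

A loop like this only covers transient failures within a single process; it does not give the durability that the patch attributes to the Kafka-based path.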