diff --git a/docs/source/api/auth.rst b/docs/source/api-reference/auth.rst similarity index 100% rename from docs/source/api/auth.rst rename to docs/source/api-reference/auth.rst diff --git a/docs/source/api/bigquery.rst b/docs/source/api-reference/bigquery.rst similarity index 100% rename from docs/source/api/bigquery.rst rename to docs/source/api-reference/bigquery.rst diff --git a/docs/source/api-reference/exceptions.rst b/docs/source/api-reference/exceptions.rst new file mode 100644 index 0000000..00bfd19 --- /dev/null +++ b/docs/source/api-reference/exceptions.rst @@ -0,0 +1,7 @@ +pittgoogle.exceptions +===================== + +.. automodule:: pittgoogle.exceptions + :members: + :private-members: + :member-order: bysource diff --git a/docs/source/api/pubsub.rst b/docs/source/api-reference/pubsub.rst similarity index 100% rename from docs/source/api/pubsub.rst rename to docs/source/api-reference/pubsub.rst diff --git a/docs/source/api-reference/registry.rst b/docs/source/api-reference/registry.rst new file mode 100644 index 0000000..91c0572 --- /dev/null +++ b/docs/source/api-reference/registry.rst @@ -0,0 +1,7 @@ +pittgoogle.registry +=================== + +.. automodule:: pittgoogle.registry + :members: + :private-members: + :member-order: bysource diff --git a/docs/source/api/utils.rst b/docs/source/api-reference/utils.rst similarity index 100% rename from docs/source/api/utils.rst rename to docs/source/api-reference/utils.rst diff --git a/docs/source/for-developers/manage-dependencies-poetry.md b/docs/source/for-developers/manage-dependencies-poetry.md new file mode 100644 index 0000000..2d0b7da --- /dev/null +++ b/docs/source/for-developers/manage-dependencies-poetry.md @@ -0,0 +1,58 @@ +# Managing Dependencies with Poetry + +This page contains instructions for managing the `pittgoogle` package dependencies using [Poetry](https://python-poetry.org/). +Poetry was implemented in this repo in [pull #7](https://github.com/mwvgroup/pittgoogle-client/pull/7). + +## Setup your environment + +Create a new conda environment for poetry and install it ([Poetry installation](https://python-poetry.org/docs/#installation)). + +```bash +conda create --name poetry-py311 python=3.11 +conda activate poetry-py311 + +# pipx is recommended, but it requires a brew install on MacOS and I (Raen) avoid brew whenever possible. +# pip seems to work fine. +pip install poetry +``` + +## Install existing dependencies + +This repo already contains a poetry.lock file, so running `poetry install` will give you +the exact versions specified there ([Poetry install dependencies](https://python-poetry.org/docs/basic-usage/#installing-dependencies)). + +If you would rather start over completely, skip ahead to the next section. + +```bash +poetry install +``` + +## Update Dependency Versions + +To upgrade to the latest versions compatible with the pyproject.toml file, you have two options below +([Poetry update dependencies](https://python-poetry.org/docs/basic-usage/#updating-dependencies-to-their-latest-versions)): + +```bash +# Option 1: Start over completely by deleting the lock file and re-installing. +rm poetry.lock +poetry install + +# Option 2: Update dependencies starting from the existing lock file (assumes you've run poetry install). +poetry update +``` + +Now commit the updated poetry.lock file to the repo. 
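+
+For example, a minimal sketch of that commit (the commit message is just a placeholder):
+
+```bash
+# Review the changes to the lock file, then commit it.
+git diff poetry.lock
+git add poetry.lock
+git commit -m "Update poetry.lock"
+```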
+ +## Add a Dependency + +Here are two examples +([Poetry add dependencies](https://python-poetry.org/docs/managing-dependencies/#adding-a-dependency-to-a-group), +see also: [Poetry version-constraint syntax](https://python-poetry.org/docs/dependency-specification/)): + +```bash +# This example adds pandas to the main dependencies. +poetry add pandas + +# This example adds sphinx to the docs dependencies. +poetry add sphinx --group docs.dependencies +``` diff --git a/docs/source/for-developers/release-new-version.md b/docs/source/for-developers/release-new-version.md new file mode 100644 index 0000000..ddc60ed --- /dev/null +++ b/docs/source/for-developers/release-new-version.md @@ -0,0 +1,28 @@ +# Release a New Version of pittgoogle-client + +When you are ready to release a new version of `pittgoogle-client`, publish to PyPI using the following steps: + +1. Make sure the code in the main branch is ready for release. + +2. Make sure the CHANGELOG.md file has been updated to reflect the changes being released. + +3. On the repo's GitHub [releases](https://github.com/mwvgroup/pittgoogle-client/releases) page: + - Click "Draft a new release". + - Under "Choose a tag", enter the version tag as "v" followed by the release version + ([semantic versioning](https://semver.org/) MAJOR.MINOR.PATCH). + - Enter the same tag for the release title. + - Click "Publish release". + +Completing step 3 will: + +- Execute the test suite. +- Publish the documentation to GitHub pages. +- Publish the package to PyPI.org. + +You will now be able to install the new version using: + +```bash +pip install --upgrade pittgoogle-client +``` + +This release process was implemented and described in [pull #7](https://github.com/mwvgroup/pittgoogle-client/pull/7). diff --git a/docs/source/for-developers/setup-development-mode.md b/docs/source/for-developers/setup-environment.md similarity index 67% rename from docs/source/for-developers/setup-development-mode.md rename to docs/source/for-developers/setup-environment.md index e60d049..97b6048 100644 --- a/docs/source/for-developers/setup-development-mode.md +++ b/docs/source/for-developers/setup-environment.md @@ -1,12 +1,10 @@ -# Development Mode +# Set up and Use a Developer Environment -Instructions for setting up development or "editable" mode are given below. -This is a method of pip-installing pointed at your local repository so you can iterate code and import changes for testing. +Instructions for setting up an environment with `pittgoogle` installed in development or "editable" mode are given below. +This is a method of pip-installing the package from your local files so that you have quick access to +your changes as you develop code. -See also: [Python Packaging User Guide](https://packaging.python.org/en/latest/). - -When you are ready to release a new version of `pittgoogle-client`, publish to PyPI using the release -process described in [issues #7](https://github.com/mwvgroup/pittgoogle-client/pull/7). +See also: [Working in “development mode”](https://packaging.python.org/guides/distributing-packages-using-setuptools/#working-in-development-mode). ## Setup @@ -38,5 +36,3 @@ importlib.reload(pittgoogle) # if you don't have access to the new changes at this point, try reloading again # if that doesn't work, restart your python interpreter ``` - -See also: [Working in “development mode”](https://packaging.python.org/guides/distributing-packages-using-setuptools/#working-in-development-mode). 
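+
+If you want to confirm that the editable install is actually being picked up, here is a quick, optional sanity check (just a sketch, not required):
+
+```python
+import pittgoogle
+
+# This should print a path inside your local clone of pittgoogle-client,
+# not a path inside your environment's site-packages directory.
+print(pittgoogle.__file__)
+```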
diff --git a/docs/source/index.rst b/docs/source/index.rst index c5e383c..509e564 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -7,38 +7,31 @@ :maxdepth: 3 :hidden: - Install - overview/authentication - overview/project - overview/env-vars - overview/cost - overview/adv-setup - -.. toctree:: - :caption: Tutorials - :maxdepth: 3 - :hidden: - - tutorials/bigquery - tutorials/cloud-storage - tutorials/ztf-figures + main/listings + Install
+ main/one-time-setup/index + main/faq/index .. toctree:: :caption: For Developers :maxdepth: 3 :hidden: - for-developers/setup-development-mode + for-developers/setup-environment + for-developers/manage-dependencies-poetry + for-developers/release-new-version .. toctree:: :caption: API Reference :maxdepth: 3 :hidden: - api/auth - api/bigquery - api/pubsub - api/utils + api-reference/auth + api-reference/bigquery + api-reference/exceptions + api-reference/pubsub + api-reference/registry + api-reference/utils pittgoogle-client ============================================== @@ -54,98 +47,3 @@ It is being developed the `Pitt-Google alert broker `__. - -**Data overview** - -.. _data pubsub: - -Pub/Sub Message Streams -======================= - -.. list-table:: Streams - :class: tight-table - :widths: 25 75 - :header-rows: 1 - - * - Topic - - Description - - * - ztf-alerts - - Full ZTF alert stream - - * - ztf-lite - - Lite version of ztf-alerts (every alert, subset of fields) - - * - ztf-tagged - - ztf-lite with basic categorizations such as “is pure” and “is likely extragalactic - transient” added to the message metadata. - - * - ztf-SuperNNova - - ztf-tagged plus SuperNNova classification results (Ia vs non-Ia). - - * - ztf-alert_avros - - Notification stream from the ztf-alert_avros Cloud Storage bucket indicating - that a new alert packet is in file storage. - These messages contain no data, only attributes. - The file name is in the attribute "objectId", - and the bucket name is in the attribute "bucketId". - - * - ztf-BigQuery - - Notification stream indicating that alert data is available in BigQuery tables. - - * - **ztf-loop** - - Use this stream for testing. Recent ZTF alerts are published to this topic - at a roughly constant rate of 1 per second. - -.. _data bigquery: - -BigQuery Catalogs -================== - -.. list-table:: Datasets and Tables - :class: tight-table - :widths: 15 15 70 - :header-rows: 1 - - * - Dataset - - Table - - Description - - * - ztf_alerts - - alerts - - Complete alert packet, excluding image cutouts. - Same schema as the original alert, including nested and repeated fields. - - * - ztf_alerts - - DIASource - - Alert packet data for the triggering source only. Including the object ID and a - list of source IDs for the previous sources included in the alert, - excluding cutouts and data for previous sources. - Flat schema. - - * - ztf_alerts - - SuperNNova - - Results from a SuperNNova (Möller \& de Boissière, 2019) - Type Ia supernova classification (binary). - - * - ztf_alerts - - metadata - - Information recording Pitt-Google processing (e.g., message publish times, - bucket name and filename, etc.). - -.. _data cloud storage: - -Cloud Storage -==================== - -.. list-table:: Buckets - :class: tight-table - :widths: 40 60 - :header-rows: 1 - - * - Bucket Name - - Description - - * - ardent-cycling-243415-ztf-alert_avros - - Contains the complete, original alert packets as Avro files. - Filename syntax is: `//.avro` diff --git a/docs/source/overview/cost.rst b/docs/source/main/faq/cost.rst similarity index 54% rename from docs/source/overview/cost.rst rename to docs/source/main/faq/cost.rst index 319f236..8ea4546 100644 --- a/docs/source/overview/cost.rst +++ b/docs/source/main/faq/cost.rst @@ -3,13 +3,20 @@ Costs -------------- -Pitt-Google Broker makes astronomy data available in Google Cloud services like Pub/Sub, BigQuery, and Cloud Storage. -Google provides a baseline level of access for free with no credit card or billing account required. 
-The free tier is structured as a usage quota that renews monthly. -Access beyond the free tier is "pay-as-you-go". -Some examples are given in the table below. -If you exceed the free-tier quota and have not set up billing, your access will be restricted until -the quota renews. +The Pitt-Google Alert Broker makes data available in Google Cloud repositories. +The data are public and user-pays, meaning that anyone can access as much or little as they want, and everyone pays for what *they* use. +Making the data available in this way can allow us to support a very large number of users. +Payment goes to Google (not Pitt-Google Broker). +All authentication and billing is managed through Google Cloud projects. + +Compared to more traditional computing costs, cloud charges are much smaller but more frequent. +Some example charges are given in the table below. +Small projects can run for free. +Google provides a baseline level of "free tier" access, structured as a usage quota that renews monthly. +No credit card or billing account is required. +Other cost-offset options include $300 in free credits available to everyone, and $5000 in research credits available to many academics (see links below). +Large projects can use as much as they want to pay for. +Google's structure is "pay-as-you-go" with a monthly billing cycle, cancel at any time. .. list-table:: Pricing Examples (as of Aug. 2021) :class: tight-table diff --git a/docs/source/main/faq/find-project-id.rst b/docs/source/main/faq/find-project-id.rst new file mode 100644 index 0000000..f247073 --- /dev/null +++ b/docs/source/main/faq/find-project-id.rst @@ -0,0 +1,10 @@ +.. _find-project-id: + +Find the Project ID +=================== + +If you've created a :ref:`project ` and need to know the ID: + +- Go to the `Google Cloud Console `__. +- Click on the name of the project in the menu bar at the top of the page. +- From there you can see the names and IDs of all the projects you are connected to. diff --git a/docs/source/main/faq/index.rst b/docs/source/main/faq/index.rst new file mode 100644 index 0000000..0745f3d --- /dev/null +++ b/docs/source/main/faq/index.rst @@ -0,0 +1,8 @@ +Frequently Asked Questions +============================================== + +.. toctree:: + :maxdepth: 3 + + cost + find-project-id diff --git a/docs/source/main/listings.rst b/docs/source/main/listings.rst new file mode 100644 index 0000000..5e06aa8 --- /dev/null +++ b/docs/source/main/listings.rst @@ -0,0 +1,97 @@ +Data Listings +============= + +This page contains a listing of the data resources served by Pitt-Google Alert Broker. + +.. _data pubsub: + +Pub/Sub Message Streams +------------------------ + +.. list-table:: ZTF Streams + :class: tight-table + :widths: 25 75 + :header-rows: 1 + + * - Topic + - Description + + * - ztf-alerts + - Full ZTF alert stream + + * - ztf-lite + - Lite version of ztf-alerts (every alert, subset of fields) + + * - ztf-tagged + - ztf-lite with basic categorizations such as “is pure” and “is likely extragalactic + transient” added to the message metadata. + + * - ztf-SuperNNova + - ztf-tagged plus SuperNNova classification results (Ia vs non-Ia). + + * - ztf-alert_avros + - Notification stream from the ztf-alert_avros Cloud Storage bucket indicating + that a new alert packet is in file storage. + These messages contain no data, only attributes. + The file name is in the attribute "objectId", + and the bucket name is in the attribute "bucketId". 
+ + * - ztf-BigQuery + - Notification stream indicating that alert data is available in BigQuery tables. + + * - **ztf-loop** + - Use this stream for testing. Recent ZTF alerts are published to this topic + at a roughly constant rate of 1 per second. + +.. _data bigquery: + +BigQuery Catalogs +------------------------ + +.. list-table:: ZTF Datasets and Tables + :class: tight-table + :widths: 15 15 70 + :header-rows: 1 + + * - Dataset + - Table + - Description + + * - ztf_alerts + - alerts + - Complete alert packet, excluding image cutouts. + Same schema as the original alert, including nested and repeated fields. + + * - ztf_alerts + - DIASource + - Alert packet data for the triggering source only. Including the object ID and a + list of source IDs for the previous sources included in the alert, + excluding cutouts and data for previous sources. + Flat schema. + + * - ztf_alerts + - SuperNNova + - Results from a SuperNNova (Möller \& de Boissière, 2019) + Type Ia supernova classification (binary). + + * - ztf_alerts + - metadata + - Information recording Pitt-Google processing (e.g., message publish times, + bucket name and filename, etc.). + +.. _data cloud storage: + +Cloud Storage +------------------------ + +.. list-table:: ZTF Buckets + :class: tight-table + :widths: 40 60 + :header-rows: 1 + + * - Bucket Name + - Description + + * - ardent-cycling-243415-ztf-alert_avros + - Contains the complete, original alert packets as Avro files. + Filename syntax is: `//.avro` diff --git a/docs/source/overview/authentication.rst b/docs/source/main/one-time-setup/authentication-oauth.rst similarity index 50% rename from docs/source/overview/authentication.rst rename to docs/source/main/one-time-setup/authentication-oauth.rst index f0e3723..e7b65ec 100644 --- a/docs/source/overview/authentication.rst +++ b/docs/source/main/one-time-setup/authentication-oauth.rst @@ -1,48 +1,5 @@ -.. _authentication: - -Authentication -================ - -Authentication for API calls is obtained directly from Google Cloud. -Two options are implemented in pittgoogle. Complete at least one: - -.. contents:: - :depth: 1 - :local: - -.. _service account: - -Service Account (Recommended) --------------------------------- - -These are instructions to create a service account and download a key file that can be -used for authentication. - -#. Prerequisite: Access to a Google Cloud :ref:`project `. - -#. Follow Google's instructions to - `create a service account `__. - You will: - - - Create a service account with the **Project > Owner** role. - - - Download a key file that contains authorization credentials. - **Keep this file secret!** - -#. Take note of the path to the key file you downloaded. Then, - :ref:`set both environment variables `. - -.. note:: - - The **Project > Owner** role gives the service account permission to do anything and - everything, within the project. - It is the simplest option and allows you to avoid the headache of tracking down - "permission denied" errors. - However, this role is excessively permissive in essentially all cases. - If you want to restrict the permissions granted to the service account, - assign a different role(s). - A good place to look is: - `Predefined roles `__. +Authentication: OAuth2 +====================== .. _oauth2: diff --git a/docs/source/main/one-time-setup/authentication.rst b/docs/source/main/one-time-setup/authentication.rst new file mode 100644 index 0000000..eba32f5 --- /dev/null +++ b/docs/source/main/one-time-setup/authentication.rst @@ -0,0 +1,47 @@ +.. 
_authentication: + +Authentication +================ + +Authentication for API calls is obtained directly from Google Cloud. +Two options are implemented in pittgoogle. Complete at least one: + +.. toctree:: + :maxdepth: 3 + + Service Account (recommended) + OAuth2 + +.. _service account: + +Service Account +--------------- + +These are instructions to create a service account and download a key file that can be +used for authentication. + +#. Prerequisite: Access to a Google Cloud :ref:`project `. + +#. Follow Google's instructions to + `create a service account `__. + You will: + + - Create a service account with the **Project > Owner** role. + + - Download a key file that contains authorization credentials. + **Keep this file secret!** + +#. Take note of the path to the key file you downloaded. Use it in the next step, + :ref:`Set environment variables `. + +.. note:: + + The **Project > Owner** role gives the service account permission to do anything and + everything, within the project. + It is the simplest option and allows you to avoid the headache of tracking down + "permission denied" errors. + However, this role is excessively permissive in essentially all cases. + If you want to restrict the permissions granted to the service account, + assign a different role(s). + A good place to look is: + `Predefined roles `__. diff --git a/docs/source/main/one-time-setup/enable-apis.rst b/docs/source/main/one-time-setup/enable-apis.rst new file mode 100644 index 0000000..cdc021b --- /dev/null +++ b/docs/source/main/one-time-setup/enable-apis.rst @@ -0,0 +1,28 @@ +.. _enable apis: + +Enable APIs for a Google Cloud Project +======================================= + +.. contents:: + :depth: 2 + :local: + +Every Google service has at least one API that can be used to access it. +These are disabled by default (since there are hundreds -- Gmail, Maps, Pub/Sub, ...) +They must be manually enabled before anyone in your project can use them. + +Enabling the following 3 will be enough for most interactions with +Pitt-Google resources. +Follow the links, make sure you've landed in the right project +(there's a dropdown in the blue menu bar), then click "Enable": + +- `Pub/Sub `__ + +- `BigQuery `__ + +- `Cloud Storage `__ + +If/when you attempt a call to an API you have not enabled, +the error message provides instructions to enable it. +You can also search the +`API Library `__. diff --git a/docs/source/main/one-time-setup/env-vars.rst b/docs/source/main/one-time-setup/env-vars.rst new file mode 100644 index 0000000..54c7042 --- /dev/null +++ b/docs/source/main/one-time-setup/env-vars.rst @@ -0,0 +1,50 @@ +.. _set env vars: + +Set Environment Variables +========================= + +(Note: If you are using OAuth2, you will also need the environment variables described :ref:`here +`.) + +The following environment variables will be used to authenticate you to your Google Cloud project. + +- ``GOOGLE_CLOUD_PROJECT`` -- :ref:`Project ID ` of the project that you will authenticate to. + +- ``GOOGLE_APPLICATION_CREDENTIALS`` -- Path to a key file containing your :ref:`service + account ` credentials. + +Set the environment variables using: + +.. code-block:: bash + + # Replace everything between angle brackets with your values. + export GOOGLE_CLOUD_PROJECT="" + export GOOGLE_APPLICATION_CREDENTIALS="" + +If you are using pittgoogle-client (python) only, setting the environment variables is sufficient. 
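+
+As an optional sanity check (a minimal sketch, not part of the setup itself), you can confirm that a Python session sees the variables:
+
+.. code-block:: python
+
+    import os
+
+    # Both of these should print the values you exported above, not None.
+    print(os.getenv("GOOGLE_CLOUD_PROJECT"))
+    print(os.getenv("GOOGLE_APPLICATION_CREDENTIALS"))
+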
+If you are also using the command-line tools, you may need to re-run the ``gcloud auth`` command (see :ref:`install gcp cli`). + +---- + +If you are using a conda environment, you can configure it to automatically set/unset the variables when you activate/deactivate the environment. +Otherwise, you will need to set the variables again every time you start a new shell. + +First, activate your conda environment and make sure the variables +``GOOGLE_CLOUD_PROJECT`` and ``GOOGLE_APPLICATION_CREDENTIALS`` are set. +Then run the following commands. + +.. code-block:: bash + + # Create the activate/deactivate files if they don't already exist. + mkdir -p "${CONDA_PREFIX}/etc/conda/activate.d" + mkdir -p "${CONDA_PREFIX}/etc/conda/deactivate.d" + touch "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" + touch "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" + + # Set the environment variables when the environment is activated. + echo "export GOOGLE_CLOUD_PROJECT=${GOOGLE_CLOUD_PROJECT}" >> "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" + echo "export GOOGLE_APPLICATION_CREDENTIALS=${GOOGLE_APPLICATION_CREDENTIALS}" >> "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" + + # Unset the environment variables when the environment is deactivated. + echo "unset GOOGLE_CLOUD_PROJECT" >> "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" + echo "unset GOOGLE_APPLICATION_CREDENTIALS" >> "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" diff --git a/docs/source/overview/adv-setup.rst b/docs/source/main/one-time-setup/google-sdk.rst similarity index 55% rename from docs/source/overview/adv-setup.rst rename to docs/source/main/one-time-setup/google-sdk.rst index 1bf1276..c7af5d7 100644 --- a/docs/source/overview/adv-setup.rst +++ b/docs/source/main/one-time-setup/google-sdk.rst @@ -1,37 +1,16 @@ -Advanced Setup -=============== - -.. note:: - - Nothing on this page is required for standard access. - In most cases, you should just :ref:`install pittgoogle ` instead. - -Install Libraries for Google Cloud APIs ----------------------------------------- - -.. _install gcp python: - -Python -~~~~~~~~~~~~~~~~ - -You can pip install any of the Google Cloud python libraries. -Here are the 3 we use most. - -.. code-block:: bash - - pip install google-cloud-bigquery - pip install google-cloud-pubsub - pip install google-cloud-storage +.. _install gcp cli: -Here is a complete list: -`Python Cloud Client Libraries `__. +Google Cloud SDK +======================== -.. _install gcp cli: +.. note:: -Command Line -~~~~~~~~~~~~~~~~ + This page contains instructions for installing command-line tools. + This is not required in order to use ``pittgoogle-client`` itself but + is helpful for some use cases. If you don't know whether you need this, + skip it for now. -The Google Cloud SDK includes the 3 command line tools: gcloud, bq, and gsutil (see +The Google Cloud command-line tools include: gcloud, bq, and gsutil (see `default components `__ ). @@ -46,7 +25,7 @@ For Linux and Mac, use: In either case, follow the instructions to complete the installation. Then open a new terminal or restart your shell. Make sure your :ref:`environment variables ` are set, reset them if needed. -Then initialize gcloud using +Then initialize gcloud using: .. code-block:: bash @@ -55,7 +34,7 @@ Then initialize gcloud using and follow the directions. Note that this may open a browser and ask you to complete the setup there. -The remaining steps are optional, but recommended for the smoothest experience. 
+The remaining steps are recommended but optional. Set your new project as the default: @@ -71,3 +50,7 @@ Instruct gcloud to authenticate using your key file containing gcloud auth activate-service-account \ --project="$GOOGLE_CLOUD_PROJECT" \ --key-file="$GOOGLE_APPLICATION_CREDENTIALS" + +You may want to `create a configuration `__ if you use multiple projects or want to control settings like the default region. + +# [TODO] give instructions to add the ``gcloud auth`` command to the conda activation file and/or to create a configuration and activate it with the conda env. diff --git a/docs/source/main/one-time-setup/index.rst b/docs/source/main/one-time-setup/index.rst new file mode 100644 index 0000000..db15c58 --- /dev/null +++ b/docs/source/main/one-time-setup/index.rst @@ -0,0 +1,16 @@ +.. _one-time-setup: + +One-Time Setup +============================================== + +Using pittgoogle-client to interact with cloud resources requires the following one-time setup tasks: + +.. toctree:: + :maxdepth: 3 + + Install pittgoogle-client + Setup a Google Cloud Project + Setup Authentication + Set Environment Variables + Enable APIs + Install Google Cloud command-line tools (optional) diff --git a/docs/source/main/one-time-setup/install.rst b/docs/source/main/one-time-setup/install.rst new file mode 100644 index 0000000..7ef78e3 --- /dev/null +++ b/docs/source/main/one-time-setup/install.rst @@ -0,0 +1,21 @@ +.. _install: + +Install pittgoogle-client +---------------------------- + +.. automodule:: pittgoogle + +The basic install command is: + +.. code-block:: bash + + pip install pittgoogle-client + +This is imported in python as: + +.. code-block:: python + + import pittgoogle + +You will need to complete the rest of the :ref:`one-time-setup` before you can and obtain authentication +credentials before you will be able to access data. diff --git a/docs/source/overview/project-setup.png b/docs/source/main/one-time-setup/project-setup.png similarity index 100% rename from docs/source/overview/project-setup.png rename to docs/source/main/one-time-setup/project-setup.png diff --git a/docs/source/main/one-time-setup/project.rst b/docs/source/main/one-time-setup/project.rst new file mode 100644 index 0000000..b949ddd --- /dev/null +++ b/docs/source/main/one-time-setup/project.rst @@ -0,0 +1,51 @@ +.. _projects: + +Google Cloud Projects +====================== + +.. contents:: + :depth: 2 + :local: + +You will need to be authenticated to a Google Cloud project in order to access data served by Pitt-Google Broker. +Projects are free. +They are easy to create and delete. +One user can have many projects, and users can share projects. +Access is usually managed through the Google Console using a Gmail account, as shown below. + +If you already have access to a Google Cloud project, you can skip this step. + +.. _setup project: + +Setup a Google Cloud project +-------------------------------- + +**Create a project** + +- Go to the + `Cloud Resource Manager `__ + and login with a Google or Gmail account (go + `here `__ if you need to create one). + +- Click "Create Project" (A, in the screenshot below). + +- Enter a project name. + +- Write down the project ID (B). + You will need it in a future step, :ref:`set env vars`. + (You can also :ref:`find-project-id` again later.) + +- Click "Create". + +.. figure:: project-setup.png + :alt: Google Cloud project setup + +.. 
_delete-project: + +Cleanup: Delete a project +------------------------------- + +If/when you are done with a Google Cloud project you can permanently delete it. +Go to the +`Cloud Resource Manager `__, +select your project, and click "DELETE". diff --git a/docs/source/overview/env-vars.rst b/docs/source/overview/env-vars.rst deleted file mode 100644 index 5058e50..0000000 --- a/docs/source/overview/env-vars.rst +++ /dev/null @@ -1,49 +0,0 @@ -.. _set env vars: - -Set Environment Variables -========================== - -(Note: If you are using OAuth2, you will also need the environment variables described :ref:`here -`.) - -Setting these two environment variables will support a smooth authentication process -that occurs in the background: - -- `GOOGLE_CLOUD_PROJECT` -- Project ID of the :ref:`project ` that you - will authenticate to. - -- `GOOGLE_APPLICATION_CREDENTIALS` -- Path to a key file containing your :ref:`service - account ` credentials. - -To set these, replace the following angle brackets (``<>``) and everything between them with your -values. - -.. code-block:: bash - - export GOOGLE_CLOUD_PROJECT="" - export GOOGLE_APPLICATION_CREDENTIALS="" - -**If you open a new terminal, you will need to set the variables again.** -Conda can simplify this. -The following commands will configure it to automatically set these -variables when your environment is activated, and erase them when it is deactivated. - -Activate your Conda environment and make sure the variables -``GOOGLE_CLOUD_PROJECT`` and ``GOOGLE_APPLICATION_CREDENTIALS`` are set. -Then: - -.. code-block:: bash - - # create the activate/deactivate files if they don't already exist - mkdir -p "${CONDA_PREFIX}/etc/conda/activate.d" - mkdir -p "${CONDA_PREFIX}/etc/conda/deactivate.d" - touch "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" - touch "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" - - # store the variables to export them automatically when the environment is activated - echo "export GOOGLE_CLOUD_PROJECT=${GOOGLE_CLOUD_PROJECT}" >> "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" - echo "export GOOGLE_APPLICATION_CREDENTIALS=${GOOGLE_APPLICATION_CREDENTIALS}" >> "${CONDA_PREFIX}/etc/conda/activate.d/env_vars.sh" - - # remove the variables automatically when the environment is deactivated - echo "unset GOOGLE_CLOUD_PROJECT" >> "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" - echo "unset GOOGLE_APPLICATION_CREDENTIALS" >> "${CONDA_PREFIX}/etc/conda/deactivate.d/env_vars.sh" diff --git a/docs/source/overview/install.rst b/docs/source/overview/install.rst deleted file mode 100644 index ad75f4f..0000000 --- a/docs/source/overview/install.rst +++ /dev/null @@ -1,29 +0,0 @@ -.. _install: - -Install pittgoogle-client ----------------------------- - -.. automodule:: pittgoogle - -The basic install command is: - -.. code-block:: bash - - pip install pittgoogle-client - -This is imported as: - -.. code-block:: python - - import pittgoogle - -If you have trouble with dependencies, you may want to try creating a -`Conda `__ environment -using an environment file that you can download from the repo. - -.. 
code-block:: bash - - # download the file, create, and activate the environment - wget https://raw.githubusercontent.com/mwvgroup/pittgoogle-client/main/pittgoogle_env.yml - conda env create --file pittgoogle_env.yml - conda activate pittgoogle diff --git a/docs/source/overview/project.rst b/docs/source/overview/project.rst deleted file mode 100644 index db1fd93..0000000 --- a/docs/source/overview/project.rst +++ /dev/null @@ -1,78 +0,0 @@ -.. _projects: - -Google Cloud Projects -====================== - -.. contents:: - :depth: 2 - :local: - -In order to make API calls accessing data from Pitt-Google's cloud resources you will need to be authenticated to a Google Cloud project. -Projects are free. -They are easy to create and delete. -Each user can have many projects and users can share projects. - -.. _setup project: - -Setup a Google Cloud project --------------------------------- - -**Create a project** - -- Go to the - `Cloud Resource Manager `__ - and login with a Google or Gmail account (go - `here `__ - if you need to create one). - -- Click "Create Project" (A, in the screenshot below). - -- Enter a project name and **write down the project ID (B)** as you will need it to - :ref:`set env vars`, among other things. - -- Click "Create". - -.. figure:: project-setup.png - :alt: Google Cloud project setup - - -**Enable the APIs** - -Every Google service has at least one API that can be used to access it. -These are disabled by default (since there are hundreds -- Gmail, Maps, Pub/Sub, ...) -They must be manually enabled before anyone in your project can use them. - -Enabling the following 3 will be enough for most interactions with -Pitt-Google resources. -Follow the links, make sure you've landed in the right project -(there's a dropdown in the blue menu bar), then click "Enable": - -- `Pub/Sub `__ - -- `BigQuery `__ - -- `Cloud Storage `__ - -If/when you attempt a call to an API you have not enabled, -the error message provides instructions to enable it. -You can also search the -`API Library `__. - -.. _find-project-id: - -Where to find the project ID ------------------------------ - -Click on the name of the project in the blue menu bar on any page in the -`Google Cloud Console `__. -From there you can see the names and IDs of all the projects you are connected to. - -.. _delete-project: - -Cleanup: Delete a project -------------------------------- - -If/when you are done with a Google Cloud project you can permanently delete it. -Go to the `Cloud Resource -Manager `__, -select your project, and click "DELETE". diff --git a/docs/source/tutorials/bigquery.rst b/docs/source/tutorials/bigquery.rst deleted file mode 100644 index ff21e73..0000000 --- a/docs/source/tutorials/bigquery.rst +++ /dev/null @@ -1,402 +0,0 @@ -.. _bigquery: - -BigQuery Tutorial -================== - -.. contents:: Table of Contents - :depth: 1 - :local: - -This tutorial covers access via two methods: pittgoogle-client and the bq CLI. - -Prerequisites -------------- - -1. Complete the initial setup. In particular, be sure to: - - - :ref:`install` and/or :ref:`Install the command-line tools `. - - :ref:`service account` - - :ref:`Set your environment variables ` - -Python ------- - -Setup and basics -~~~~~~~~~~~~~~~~ - -Imports - -.. code:: python - - import pittgoogle - import os - -Create a Client for the BigQuery connections below - -.. code:: python - - my_project_id = os.getenv('GOOGLE_CLOUD_PROJECT') - pittgoogle.bigquery.create_client(my_project_id) - -View the available tables and their schemas - -.. 
code:: python - - # see which tables are available - pittgoogle.bigquery.get_dataset_table_names() - - # look at the schema and basic info of a table - table = 'DIASource' - pittgoogle.bigquery.get_table_info(table) - -Query lightcurves and other history -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Setup - -.. code:: python - - # Choose the history data you want returned - columns = ['jd', 'fid', 'magpsf', 'sigmapsf'] - # 'objectId' and 'candid' will be included automatically - # options are columns in the 'DIASource' table - # pittgoogle.bigquery.get_table_info('DIASource') - - # Optional - # choose specific objects - objectIds = ['ZTF18aczuwfe', 'ZTF18aczvqcr', 'ZTF20acqgklx', 'ZTF18acexdlh'] - # limit to a sample of the table - # limit = 1000 # add this keyword to query_objects() below - -To retrieve lightcurves and other history, we must query for objects' -"DIASource" observations and aggregate the results by ``objectId``. - -``pittgoogle.bigquery.query_objects()`` is a convenience wrapper that let's you -grab all the results at once, or step through them using a generator. -It's options are demonstrated below. - -.. code:: python - - # Option 1: Get a single DataFrame of all results - - lightcurves_df = pittgoogle.bigquery.query_objects(columns, objectIds=objectIds) - # This will execute a dry run and tell you how much data will be processed. - # You will be asked to confirm before proceeding. - # In the future we'll skip this using - dry_run = False - - lightcurves_df.sample(10) - # cleaned of duplicates - -Congratulations! You've now retrieved your first data from the transient -table. It is a DataFrame containing the candidate observations for every -object we requested, indexed by ``objectId`` and ``candid`` (candidate -ID). It includes the columns we requested in the query. - -``fid`` is the filter, mapped to an integer. You can see the filter's -common name in the table schema we looked at earlier, or you can use -``pittgoogle.utils.ztf_fid_names()`` which returns a dictionary of the mapping. - -.. code:: python - - # map fid column to the filter's common name - fid_names = pittgoogle.utils.ztf_fid_names() # dict - print(fid_names) - - lightcurves_df['filter'] = lightcurves_df['fid'].map(fid_names) - lightcurves_df.head() - -Queries can return large datasets. You may want to use a generator to -step through objects individually, and avoid loading the entire dataset -into memory at once. ``query_objects()`` can return one for you: - -.. code:: python - - # Option 2: Get a generator that yields a DataFrame for each objectId - - iterator = True - objects = pittgoogle.bigquery.query_objects( - columns, objectIds=objectIds, iterator=iterator, dry_run=dry_run - ) - # cleaned of duplicates - - for lightcurve_df in objects: - print(f'\nobjectId: {lightcurve_df.objectId}') # objectId in metadata - print(lightcurve_df.sample(5)) - -Each DataFrame contains data on a single object, and is indexed by -``candid``. The ``objectId`` is in the metadata. - -``query_objects()`` can also return a json formatted string of the query -results: - -.. code:: python - - # Option 3: Get a single json string with all the results - - format = 'json' - lcsjson = pittgoogle.bigquery.query_objects( - columns, objectIds=objectIds, format=format, dry_run=dry_run - ) - # cleaned of duplicates - print(lcsjson) - - # read it back in - df = pd.read_json(lcsjson) - df.head() - -.. 
code:: python - - # Option 4: Get a generator that yields a json string for a single objectId - - format = 'json' - iterator = True - jobj = pittgoogle.bigquery.query_objects( - columns, objectIds=objectIds, format=format, iterator=iterator, dry_run=dry_run - ) - # cleaned of duplicates - - for lcjson in jobj: - print(lcjson) - # lightcurve_df = pd.read_json(lcjson) # read back to a df - -Finally, ``query_objects()`` can return the raw query job object that it -gets from its API call using ``google.cloud.bigquery``'s ``query()`` -method. - -.. code:: python - - # Option 5: Get the `query_job` object - # (see the section on using google.cloud.bigquery directly) - - query_job = pittgoogle.bigquery.query_objects( - columns, objectIds=objectIds, format="query_job", dry_run=dry_run - ) - # query_job is iterable - # each element contains the aggregated history for a single objectId - # Beware: this has not been cleaned of duplicate entries - -.. code:: python - - # Option 5 continued: parse query_job results row by row - - for row in query_job: - # values can be accessed by field name or index - print(f"objectId={row[0]}, magpsf={row['magpsf']}") - - # pgb can cast to a DataFrame or json string - # this option also cleans the duplicates - lightcurve_df = pittgoogle.bigquery.format_history_query_results(row=row) - print(f'\nobjectId: {lightcurve_df.objectId}') # objectId in metadata - print(lightcurve_df.head(1)) - lcjson = pittgoogle.bigquery.format_history_query_results(row=row, format='json') - print('\n', lcjson) - - break - -Plot a lightcurve -^^^^^^^^^^^^^^^^^ - -The following DataFrame can be used with the code in :ref:`ztf figures` to plot the object's light curves. - -.. code:: python - - # Get an object's lightcurve DataFrame with the minimum required columns - columns = ['jd','fid','magpsf','sigmapsf','diffmaglim'] - objectId = 'ZTF20acqgklx' - lightcurve_df = pittgoogle.bigquery.query_objects(columns, objectIds=[objectId], dry_run=False) - -Cone search -~~~~~~~~~~~ - -To perform a cone search, we query for object histories and then check -whether they are within the cone. ``pittgoogle.bigquery.cone_search()`` is a -convenience wrapper provided -for demonstration, but note that it is very inefficient. - -First we set the search parameters. - -.. code:: python - - center = coord.SkyCoord(76.91, 6.02, frame='icrs', unit='deg') - radius = coord.Angle(2, unit=u.deg) - - columns = ['jd', 'fid', 'magpsf', 'sigmapsf'] - # 'objectId' and 'candid' will be included automatically - # options are in the 'DIASource' table - # pittgoogle.bigquery.get_table_info('DIASource') - dry_run = False - - # we'll restrict to a handful of objects to reduce runtime, but this is optional - objectIds = ['ZTF18aczuwfe', 'ZTF18aczvqcr', 'ZTF20acqgklx', 'ZTF18acexdlh'] - -``cone_search()`` has similar options to ``query_objects()``. -Here we demonstrate one. - -.. code:: python - - # Option 1: Get a single df of all objects in the cone - - objects_in_cone = pittgoogle.bigquery.cone_search( - center, radius, columns, objectIds=objectIds, dry_run=dry_run - ) - objects_in_cone.sample(5) - - --------------- - -Using google.cloud.bigquery -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The previous sections demonstrated convenience wrappers for querying -with ``google.cloud.bigquery``. Here we demonstrate using these tools -directly with some basic examples. View the pgb\_utils source code for -more examples. 
- -Links to more information: - -- `Query syntax in Standard - SQL `__ -- `google.cloud.bigquery - docs `__ - -Query setup: - -.. code:: python - - # Create a BigQuery Client to handle the connections - bq_client = bigquery.Client(project=my_project_id) - -.. code:: python - - # Write the standard SQL query statement - - # pittgoogle.bigquery.get_dataset_table_names() # view available tables - # pittgoogle.bigquery.get_table_info('') # view available column names - - # construct the full table name - pgb_project_id = 'ardent-cycling-243415' - table = 'salt2' - dataset = 'ztf_alerts' - full_table_name = f'{pgb_project_id}.{dataset}.{table}' - - # construct the query - query = ( - f'SELECT objectId, candid, t0, x0, x1, c, chisq, ndof ' - f'FROM `{full_table_name}` ' - f'WHERE ndof>0 and chisq/ndof<2 ' - ) - - # note: if you want to query object histories you can get the - # query statement using `pittgoogle.bigquery.object_history_sql_statement()` - -.. code:: python - - # Let's create a function to execute a "dry run" - # and tell us how much data will be processed. - # This is essentially `pittgoogle.bigquery.dry_run()` - def dry_run(query): - job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False) - query_job = bq_client.query(query, job_config=job_config) - nbytes, TiB = query_job.total_bytes_processed, 2**40 - pTiB = nbytes/TiB*100 # nbytes as a percent of 1 TiB - print(f'\nQuery statement:') - print(f'\n"{query}"\n') - print(f'will process {nbytes} bytes of data.') - print(f'({pTiB:.3}% of your 1 TiB Free Tier monthly allotment.)') - -.. code:: python - - # Find out how much data will be processed - dry_run(query) - -Query: - -.. code:: python - - # Make the API request - query_job = bq_client.query(query) - # Beware: the results may contain duplicate entries - -Format and view results: - -.. code:: python - - # Option 1: dump results to a pandas.DataFrame - df = query_job.to_dataframe() - - # some things you might want to do with it - df = df.drop_duplicates() - df = df.set_index(['objectId','candid']).sort_index() - - df.hist() - df.head() - -.. code:: python - - # Option 2: parse results row by row - for r, row in enumerate(query_job): - - # row values can be accessed by field name or index - print(f"objectId={row[0]}, t0={row['t0']}") - - if r>5: break - --------------- - -Command line ------------- - -Links to more information: - -- `Quickstart using the bq command-line - tool `__ -- `Reference of all bq commands and - flags `__ -- `Query syntax in Standard - SQL `__ - -.. code:: bash - - # Get help - bq help query - -.. code:: bash - - # view the schema of a table - bq show --schema --format=prettyjson ardent-cycling-243415:ztf_alerts.DIASource - # bq show --schema --format=prettyjson ardent-cycling-243415:ztf_alerts.alerts - - # Note: The first time you make a call with `bq` you will ask you to - # initialize a .bigqueryrc configuration file. Follow the directions. - -.. code:: bash - - # Query: dry run - - # first we do a dry_run by including the flag --dry_run - bq query \ - --dry_run \ - --use_legacy_sql=false \ - 'SELECT - objectId, candid, t0, x0, x1, c, chisq, ndof - FROM - `ardent-cycling-243415.ztf_alerts.salt2` - WHERE - ndof>0 and chisq/ndof<2 - LIMIT - 10' - -.. 
code:: bash - - # execute the Query - bq query \ - --use_legacy_sql=false \ - "SELECT - objectId, candid, t0, x0, x1, c, chisq, ndof - FROM - `ardent-cycling-243415.ztf_alerts.salt2` - WHERE - ndof>0 and chisq/ndof<2 - LIMIT - 10" diff --git a/docs/source/tutorials/cloud-storage.rst b/docs/source/tutorials/cloud-storage.rst deleted file mode 100644 index 5078804..0000000 --- a/docs/source/tutorials/cloud-storage.rst +++ /dev/null @@ -1,125 +0,0 @@ -.. _cloud storage: - -Cloud Storage Tutorial -============================== - -.. contents:: Table of Contents - :depth: 1 - :local: - -This tutorial covers access via two methods: pittgoogle-client (with some direct use -of the Google Cloud API), and the gsutil CLI. - -Prerequisites -------------- - -1. Complete the initial setup. In particular, be sure to: - - - :ref:`install` and/or :ref:`Install the command-line tools `. - - :ref:`service account` - - :ref:`Set your environment variables ` - -Python ------- - -Setup -~~~~~ - -Imports - -.. code:: python - - import fastavro - from google.cloud import storage - from matplotlib import pyplot as plt - import os - from pathlib import Path - import pittgoogle - -Name some things - -.. code:: python - - # fill in the path to the local directory to which you want to download files - local_dir = '' - - my_project_id = os.getenv('GOOGLE_CLOUD_PROJECT') - pgb_project_id = pittgoogle.utils.ProjectIds.pittgoogle - -Download files -~~~~~~~~~~~~~~ - -Download alerts for a given ``objectId`` - -.. code:: python - - objectId = 'ZTF17aaackje' - bucket_name = f'{pgb_project_id}-ztf-alert_avros' - - # Create a client and request a list of files - storage_client = storage.Client(my_project_id) - bucket = storage_client.get_bucket(bucket_name) - blobs = bucket.list_blobs(prefix=objectId) - - # download the files - for blob in blobs: - local_path = f'{local_dir}/{blob.name}' - blob.download_to_filename(local_path) - print(f'Downloaded {local_path}') - -Open a file -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Load to a dict: - -.. code:: python - - paths = Path(local_dir).glob('*.avro') - for path in paths: - with open(path, 'rb') as fin: - alert_list = [r for r in fastavro.reader(fin)] - break - alert_dict = alert_list[0] # extract the single alert packet - - print(alert_dict.keys()) - -Load to a pandas DataFrame: - -.. code:: python - - lightcurve_df = pittgoogle.utils.Cast.alert_dict_to_dataframe(alert_dict) - - -Plot light curves and cutouts -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -See :ref:`ztf figures` - -Command line ------------- - -See also: - -- `Quickstart: Using the gsutil - tool `__ -- `gsutil cp `__ - -Get help - -.. code:: bash - - gsutil help - gsutil help cp - -Download a single file - -.. code:: bash - - # fill in the path to the local directory to which you want to download files - local_dir= - # fill in the name of the file you want. see above for the syntax - file_name= - # file_name=ZTF17aaackje.1563161493315010012.ztf_20210413_programid1.avro - avro_bucket="${pgb_project_id}-ztf-alert_avros" - - gsutil cp "gs://${avro_bucket}/${file_name}" ${local_dir}/. diff --git a/docs/source/tutorials/ztf-figures.rst b/docs/source/tutorials/ztf-figures.rst deleted file mode 100644 index 0ec8695..0000000 --- a/docs/source/tutorials/ztf-figures.rst +++ /dev/null @@ -1,137 +0,0 @@ -.. _ztf figures: - -ZTF Figures Tutorial -============================== - -.. contents:: Table of Contents - :depth: 1 - :local: - -This tutorial demonstrates plotting ZTF cutouts and light curves. 
-It is based heavily on https://github.com/ZwickyTransientFacility/ztf-avro-alert/blob/master/notebooks/Filtering_alerts.ipynb. - -Prerequisites -------------- - -1. Load a ZTF alert to a dict or a pandas DataFrame. For examples, see: - - - :ref:`cloud storage` - - :ref:`bigquery` - -Imports ---------- - -.. code:: python - - import gzip - import io - from typing import Optional - - import aplpy - import matplotlib as mpl - import numpy as np - import pandas as pd - from astropy.io import fits - from astropy.time import Time - from matplotlib import pyplot as plt - - import pittgoogle - -Plot a Light Curve ------------------- - -.. code:: python - - def plot_lightcurve(lightcurve_df: pd.DataFrame, days_ago: bool = True): - """Plot the per-band light curve of a single ZTF object. - Adapted from: - https://github.com/ZwickyTransientFacility/ztf-avro-alert/blob/master/notebooks/Filtering_alerts.ipynb - - Parameters - ---------- - lightcurve_df - Lightcurve history of a ZTF object. Must contain columns - ['jd','fid','magpsf','sigmapsf','diffmaglim'] - days_ago - If True, x-axis will be number of days in the past. - Else x-axis will be Julian date. - """ - - filter_code = pittgoogle.utils.ztf_fid_names() # dict - filter_color = {1: "green", 2: "red", 3: "pink"} - - # set the x-axis (time) details - if days_ago: - now = Time.now().jd - t = lightcurve_df.jd - now - xlabel = "Days Ago" - else: - t = lightcurve_df.jd - xlabel = "Time (JD)" - - # plot lightcurves by band - for fid, color in filter_color.items(): - # plot detections in this filter: - w = (lightcurve_df.fid == fid) & ~lightcurve_df.magpsf.isnull() - if np.sum(w): - label = f"{fid}: {filter_code[fid]}" - kwargs = {"fmt": ".", "color": color, "label": label} - plt.errorbar(t[w], lightcurve_df.loc[w, "magpsf"], lightcurve_df.loc[w, "sigmapsf"], **kwargs) - # plot nondetections in this filter - wnodet = (lightcurve_df.fid == fid) & lightcurve_df.magpsf.isnull() - if np.sum(wnodet): - plt.scatter( - t[wnodet], - lightcurve_df.loc[wnodet, "diffmaglim"], - marker="v", - color=color, - alpha=0.25, - ) - - plt.gca().invert_yaxis() - plt.xlabel(xlabel) - plt.ylabel("Magnitude") - plt.legend() - -.. code:: python - - plot_lightcurve(lightcurve_df) - -Plot Cutouts ------------- - -.. code:: python - - def plot_stamp(stamp, fig=None, subplot=None, **kwargs): - """Adapted from: - https://github.com/ZwickyTransientFacility/ztf-avro-alert/blob/master/notebooks/Filtering_alerts.ipynb - """ - - with gzip.open(io.BytesIO(stamp), "rb") as f: - with fits.open(io.BytesIO(f.read())) as hdul: - if fig is None: - fig = plt.figure(figsize=(4, 4)) - if subplot is None: - subplot = (1, 1, 1) - ffig = aplpy.FITSFigure(hdul[0], figure=fig, subplot=subplot, **kwargs) - ffig.show_grayscale(stretch="arcsinh") - return ffig - - - def plot_cutouts(alert_dict): - """Adapted from: - https://github.com/ZwickyTransientFacility/ztf-avro-alert/blob/master/notebooks/Filtering_alerts.ipynb - """ - - # fig, axes = plt.subplots(1,3, figsize=(12,4)) - fig = plt.figure(figsize=(12, 4)) - for i, cutout in enumerate(["Science", "Template", "Difference"]): - stamp = alert_dict["cutout{}".format(cutout)]["stampData"] - ffig = plot_stamp(stamp, fig=fig, subplot=(1, 3, i + 1)) - ffig.set_title(cutout) - - -.. 
code:: python - - plot_cutouts(alert_dict) - plt.show(block=False) diff --git a/pittgoogle/alert.py b/pittgoogle/alert.py index 4c23238..0f19fa3 100644 --- a/pittgoogle/alert.py +++ b/pittgoogle/alert.py @@ -27,7 +27,7 @@ import logging from datetime import datetime from pathlib import Path -from typing import TYPE_CHECKING, Any, Dict, Optional, Union +from typing import TYPE_CHECKING, Any, Dict, Mapping, Optional, Union import fastavro from attrs import define, field @@ -36,7 +36,6 @@ from .exceptions import BadRequest, OpenAlertError, SchemaNotFoundError if TYPE_CHECKING: - import google._upb._message import google.cloud.pubsub_v1 import pandas as pd # always lazy-load pandas. it hogs memory on cloud functions and run @@ -79,9 +78,7 @@ class Alert: Union["google.cloud.pubsub_v1.types.PubsubMessage", types_.PubsubMessageLike] ] = field(default=None) """Incoming Pub/Sub message object.""" - _attributes: Optional[Union[Dict, "google._upb._message.ScalarMapContainer"]] = field( - default=None - ) + _attributes: Optional[Mapping[str, str]] = field(default=None) _dict: Optional[Dict] = field(default=None) _dataframe: Optional["pd.DataFrame"] = field(default=None) schema_name: Optional[str] = field(default=None) @@ -168,7 +165,7 @@ def index(): def from_dict( cls, payload: Dict, - attributes: Optional[Union[Dict, "google._upb._message.ScalarMapContainer"]] = None, + attributes: Optional[Mapping[str, str]] = None, schema_name: Optional[str] = None, ) -> "Alert": """Create an `Alert` object from the given `payload` dictionary. @@ -177,7 +174,7 @@ def from_dict( ---------- payload : dict The dictionary containing the data for the `Alert` object. - attributes : dict or google._upb._message.ScalarMapContainer (optional) + attributes : dict-like (optional) Additional attributes for the `Alert` object. Defaults to None. schema_name : str (optional) The name of the schema. Defaults to None. @@ -189,9 +186,12 @@ def from_dict( return cls(dict=payload, attributes=attributes, schema_name=schema_name) @classmethod - def from_msg( - cls, msg: "google.cloud.pubsub_v1.types.PubsubMessage", schema_name: Optional[str] = None - ) -> "Alert": + def from_msg(cls, msg, schema_name: Optional[str] = None) -> "Alert": + # [FIXME] This type hint is causing an error when building docs. + # Warning, treated as error: + # Cannot resolve forward reference in type annotations of "pittgoogle.alert.Alert.from_msg": + # name 'google' is not defined + # cls, msg: "google.cloud.pubsub_v1.types.PubsubMessage", schema_name: Optional[str] = None """ Create an `Alert` object from a `google.cloud.pubsub_v1.types.PubsubMessage`. diff --git a/pittgoogle/auth.py b/pittgoogle/auth.py index 0534a04..e436b43 100644 --- a/pittgoogle/auth.py +++ b/pittgoogle/auth.py @@ -8,7 +8,7 @@ .. note:: To authenticate, you must have completed one of the setup options described in - :doc:`/overview/authentication`. The recommended workflow is to use a + :doc:`/main/one-time-setup/authentication`. The recommended workflow is to use a :ref:`service account ` and :ref:`set environment variables `. In that case, you will not need to call this module directly.