From 3d087498a3be2255a7280af2445a42d108183f67 Mon Sep 17 00:00:00 2001 From: Isabela Sobral <35778239+belasobral93@users.noreply.github.com> Date: Tue, 15 Aug 2023 11:43:15 -0500 Subject: [PATCH 1/5] Update deployment-tools.md adjusting docs link to take you to dbt cloud docs for astronome --- website/docs/docs/deploy/deployment-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deployment-tools.md b/website/docs/docs/deploy/deployment-tools.md index e642e4b95e2..b707d131e6b 100644 --- a/website/docs/docs/deploy/deployment-tools.md +++ b/website/docs/docs/deploy/deployment-tools.md @@ -30,7 +30,7 @@ Invoking dbt Core jobs through the [BashOperator](https://registry.astronomer.io -For more details on both of these methods, including example implementations, check out [this guide](https://www.astronomer.io/guides/airflow-dbt). +For more details on both of these methods, including example implementations, check out [this guide](https://www.astronomer.io/guides/airflow-dbt](https://docs.astronomer.io/learn/airflow-dbt-cloud). ## Azure Data Factory From 768ac19f34e96e4bf4f91c4cfe1df32e2bb51f28 Mon Sep 17 00:00:00 2001 From: Isabela Sobral <35778239+belasobral93@users.noreply.github.com> Date: Tue, 15 Aug 2023 11:45:42 -0500 Subject: [PATCH 2/5] Update deployment-tools.md updating broken link for airflow's dbt cloud provider --- website/docs/docs/deploy/deployment-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deployment-tools.md b/website/docs/docs/deploy/deployment-tools.md index e642e4b95e2..990474d9c97 100644 --- a/website/docs/docs/deploy/deployment-tools.md +++ b/website/docs/docs/deploy/deployment-tools.md @@ -16,7 +16,7 @@ If your organization is using [Airflow](https://airflow.apache.org/), there are -Installing the [dbt Cloud Provider](https://registry.astronomer.io/providers/dbt-cloud) to orchestrate dbt Cloud jobs. This package contains multiple Hooks, Operators, and Sensors to complete various actions within dbt Cloud. +Installing the [dbt Cloud Provider](https://registry.astronomer.io/providers/dbt-cloud](https://airflow.apache.org/docs/apache-airflow-providers-dbt-cloud/stable/index.html) to orchestrate dbt Cloud jobs. This package contains multiple Hooks, Operators, and Sensors to complete various actions within dbt Cloud. From d3c5cab8a3e84459baec2d424bc4fec4b28d5d1f Mon Sep 17 00:00:00 2001 From: mirnawong1 <89008547+mirnawong1@users.noreply.github.com> Date: Tue, 15 Aug 2023 14:19:02 -0400 Subject: [PATCH 3/5] Update website/docs/docs/deploy/deployment-tools.md --- website/docs/docs/deploy/deployment-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deployment-tools.md b/website/docs/docs/deploy/deployment-tools.md index b707d131e6b..d171a86a3a2 100644 --- a/website/docs/docs/deploy/deployment-tools.md +++ b/website/docs/docs/deploy/deployment-tools.md @@ -30,7 +30,7 @@ Invoking dbt Core jobs through the [BashOperator](https://registry.astronomer.io -For more details on both of these methods, including example implementations, check out [this guide](https://www.astronomer.io/guides/airflow-dbt](https://docs.astronomer.io/learn/airflow-dbt-cloud). +For more details on both of these methods, including example implementations, check out [this guide](https://docs.astronomer.io/learn/airflow-dbt-cloud). 
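As a concrete illustration of the dbt Cloud provider mentioned in these link fixes, a minimal Airflow DAG might trigger a dbt Cloud job roughly as sketched below; the connection ID, job ID, and schedule are placeholder assumptions, not values taken from this project.

```python
# Hypothetical sketch: trigger a dbt Cloud job from Airflow using the
# apache-airflow-providers-dbt-cloud package. The connection ID and job ID
# below are placeholders and must be replaced with real values.
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

with DAG(
    dag_id="trigger_dbt_cloud_job",
    start_date=datetime(2023, 8, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_dbt_cloud_job = DbtCloudRunJobOperator(
        task_id="run_dbt_cloud_job",
        dbt_cloud_conn_id="dbt_cloud_default",  # Airflow connection to dbt Cloud
        job_id=12345,                           # placeholder dbt Cloud job ID
        check_interval=60,                      # poll the run status every 60 seconds
        timeout=3600,                           # fail if the run exceeds one hour
    )
```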
## Azure Data Factory From fb4ce8cafd287b41e7f2759552c3ab30d2a2126c Mon Sep 17 00:00:00 2001 From: mirnawong1 <89008547+mirnawong1@users.noreply.github.com> Date: Tue, 15 Aug 2023 15:14:40 -0400 Subject: [PATCH 4/5] Update website/docs/docs/deploy/deployment-tools.md --- website/docs/docs/deploy/deployment-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/deploy/deployment-tools.md b/website/docs/docs/deploy/deployment-tools.md index c41894ce6e9..b9ab14e1c0c 100644 --- a/website/docs/docs/deploy/deployment-tools.md +++ b/website/docs/docs/deploy/deployment-tools.md @@ -16,7 +16,7 @@ If your organization is using [Airflow](https://airflow.apache.org/), there are -Installing the [dbt Cloud Provider](https://registry.astronomer.io/providers/dbt-cloud](https://airflow.apache.org/docs/apache-airflow-providers-dbt-cloud/stable/index.html) to orchestrate dbt Cloud jobs. This package contains multiple Hooks, Operators, and Sensors to complete various actions within dbt Cloud. +Installing the [dbt Cloud Provider](https://airflow.apache.org/docs/apache-airflow-providers-dbt-cloud/stable/index.html) to orchestrate dbt Cloud jobs. This package contains multiple Hooks, Operators, and Sensors to complete various actions within dbt Cloud. From 795d743f5741cc8f84e49e283998d644a9c97e76 Mon Sep 17 00:00:00 2001 From: "Eddo W. Hintoso" Date: Tue, 15 Aug 2023 12:37:10 -0700 Subject: [PATCH 5/5] Discovery API: update docs (#3894) ## What are you changing in this pull request and why? I'm updating the docs of Discovery API because we have breaking changes to the API and we need to update the docs to better reflect the current state of our API, since the docs point to legacy endpoints we plan to deprecate soon. - [x] update use case queries to use bigints and new apis - [x] update other queries to use bigints and new apis - [x] docs parity with legacy vs new endpoints - [x] fix bug where schemas and types weren't retrieved properly - [x] update or remove screenshots (we're keeping it) - [ ] verify that Python code works (no bandwidth -- if it breaks it breaks, not the most urgent for getting docs out) Some common mistakes I found while going through each query and executing it to make sure it's valid. - Fields were `snake_case` when they should be `camelCase` - Typos in filters and fields - Nested nodes were not updated so `... on` syntaxes were invalid - Querying non-nested-allowed fields of a nested node, like `children`, `tests`, etc. ## Checklist - [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) and [About versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) so my content adheres to these guidelines. - [x] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." 
(Below might be a separate PR, but leaving it for now) Adding new pages (delete if not applicable): - [x] Add page to `website/sidebars.js` - [x] Provide a unique filename for the new page Removing or renaming existing pages (delete if not applicable): - [ ] Remove page from `website/sidebars.js` - [ ] Add an entry `website/static/_redirects` - [ ] [Ran link testing](https://github.com/dbt-labs/docs.getdbt.com#running-the-cypress-tests-locally) to update the links that point to the deleted page --------- Co-authored-by: Ly Nguyen <107218380+nghi-ly@users.noreply.github.com> Co-authored-by: Ly Nguyen Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Leona B. Campbell <3880403+runleonarun@users.noreply.github.com> --- contributing/single-sourcing-content.md | 44 +- .../docs/dbt-cloud-apis/discovery-querying.md | 175 +-- .../discovery-use-cases-and-examples.md | 1206 +++++++++-------- ...nvironment-applied-modelHistoricalRuns.mdx | 50 + .../schema-discovery-environment.mdx | 64 +- .../schema-discovery-exposure.mdx | 19 +- .../schema-discovery-exposures.mdx | 19 +- .../schema-discovery-job-exposure.mdx | 64 + .../schema-discovery-job-exposures.mdx | 65 + .../schema-discovery-job-metric.mdx | 58 + .../schema-discovery-job-metrics.mdx | 60 + .../schema-discovery-job-model.mdx | 91 ++ .../schema-discovery-job-models.mdx | 59 + .../schema-discovery-job-seed.mdx | 42 + .../schema-discovery-job-seeds.mdx | 40 + .../schema-discovery-job-snapshots.mdx | 49 + .../schema-discovery-job-source.mdx | 52 + .../schema-discovery-job-sources.mdx | 65 + .../schema-discovery-job-test.mdx | 43 + .../schema-discovery-job-tests.mdx | 43 + .../dbt-cloud-apis/schema-discovery-job.mdx | 62 + .../schema-discovery-metric.mdx | 17 +- .../schema-discovery-metrics.mdx | 19 +- .../dbt-cloud-apis/schema-discovery-model.mdx | 26 +- .../schema-discovery-modelByEnv.mdx | 22 +- .../schema-discovery-models.mdx | 24 +- .../dbt-cloud-apis/schema-discovery-seed.mdx | 18 +- .../dbt-cloud-apis/schema-discovery-seeds.mdx | 17 +- .../schema-discovery-snapshots.mdx | 20 +- .../schema-discovery-source.mdx | 18 +- .../schema-discovery-sources.mdx | 24 +- .../dbt-cloud-apis/schema-discovery-test.mdx | 17 +- .../dbt-cloud-apis/schema-discovery-tests.mdx | 17 +- website/docs/docs/dbt-cloud-apis/schema.jsx | 238 +++- website/sidebars.js | 48 +- .../_discovery_api_job_deprecation_notice.md | 7 + 36 files changed, 2043 insertions(+), 859 deletions(-) create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-environment-applied-modelHistoricalRuns.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposure.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposures.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-metric.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-metrics.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-model.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-models.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-seed.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-seeds.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-snapshots.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-source.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-sources.mdx create mode 
100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-test.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job-tests.mdx create mode 100644 website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx create mode 100644 website/snippets/_discovery_api_job_deprecation_notice.md diff --git a/contributing/single-sourcing-content.md b/contributing/single-sourcing-content.md index ca27372e5bc..5b87d494c94 100644 --- a/contributing/single-sourcing-content.md +++ b/contributing/single-sourcing-content.md @@ -15,9 +15,9 @@ Versions are managed in the `versions` array located in the `website/dbt-version ### Adding a new version -To add a new version to the site, a new object must be added to the `versions` array in the same format as existing versions. This object holds two properties: **version** and **EOLDate (See End of Life Dates below)**. +To add a new version to the site, a new object must be added to the `versions` array in the same format as existing versions. This object holds two properties: **version** and **EOLDate (See End of Life Dates below)**. -Example Version: +Example Version: ```jsx exports.versions = [ @@ -36,7 +36,7 @@ The **EOLDate** property determines when a version is no longer supported. A ver When a documentation page is viewed, the **EOLDate** property for the active version is compared to today’s date. If the current version has reached or is nearing the end of support, a banner will show atop the page, notifying the visitor of the end-of-life status. -Two different versions of the banner will show depending on the end-of-life date: +Two different versions of the banner will show depending on the end-of-life date: - When the version is within 3 months of the **EOLDate.** - When the version has passed the **EOLDate.** @@ -76,7 +76,7 @@ exports.versionedPages = [ ## Versioning blocks of content -The **VersionBlock** component provides the ability to version a specific piece of content on a docs page. +The **VersionBlock** component provides the ability to version a specific piece of content on a docs page. This component can be added directly to a markdown file in a similar way as other components (FAQ, File, Lightbox). @@ -99,7 +99,7 @@ Both properties can be used together to set a range where the content should sho ### Example for versioning entire pages -On the [Docs Defer page](https://docs.getdbt.com/reference/node-selection/defer), tabs are used to show different versions of a piece of code. **v0.21.0 and later** shows `--select`, while **v-.20.x and earlier** changes this to `--models`. +On the [Docs Defer page](https://docs.getdbt.com/reference/node-selection/defer), tabs are used to show different versions of a piece of code. **v0.21.0 and later** shows `--select`, while **v-.20.x and earlier** changes this to `--models`. ![oldway](https://user-images.githubusercontent.com/3880403/163254165-dea23266-2eea-4e65-b3f0-c7b6d3e51fc3.png) @@ -149,7 +149,7 @@ Using a global variable requires two steps: exports.dbtVariables = { dbtCore: { name: "dbt Core" - } + } } ``` @@ -198,13 +198,13 @@ In the above example, the **dbtCloud** property has a default name of “dbt Clo ### Global variables example -The global `` component can be used inline, for example: +The global `` component can be used inline, for example: ```markdown This piece of markdown content explains why is awesome. ``` -However, a Var component cannot start a new line of content. Fortunately, a workaround exists to use the Var component at the beginning of a line of content. 
+However, a Var component cannot start a new line of content. Fortunately, a workaround exists to use the Var component at the beginning of a line of content. To use the component at the beginning of a sentence, add a non-breaking space character before the component: @@ -231,7 +231,7 @@ A partial file allows you to reuse content throughout the docs. Here are the ste 2. Go back to the docs file that will pull content from the partial file. 3. Add the following import file: `import ComponentName from '/snippets/_this-is-your-partial-file-name.md';` * You must always add an import file in that format. Note you can name `ComponentName` (a partial component) can be whatever makes sense for your purpose. - * `.md` needs to be added to the end of the filename. + * `.md` needs to be added to the end of the filename. 4. To use the partial component, go to the next line and add ``. This fetches the reusable content in the partial file * Note `anyname` can be whatever makes sense for your purpose. @@ -258,15 +258,15 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam fermentum portti ```markdown Docs content here. -`import SetUpPages from '/snippets/_partial-name.md';` - - +import SetUpPages from '/snippets/_partial-name.md'; + + Docs content here. ``` - `import SetUpPages from '/snippets/_partial-name.md';` — A partial file that will be imported by other files -- `` — A component that imports content from the partial file. You can also use it to pass in data into the partial using props (See 'How to use props to pass different content on multiple pages?' below). +- `` — A component that imports content from the partial file. You can also use it to pass in data into the partial using props (See 'How to use props to pass different content on multiple pages?' below). 4. This will then render the content of the docs in the partial file. @@ -276,32 +276,32 @@ Docs content here.
How to use props to pass different content on multiple pages?
- + You can add props on the component only if you want to pass in data from the component into the partial file. This is useful for using the same partial component on multiple docs pages and displaying different values for each. For example, if we wanted to use a partial on multiple pages and pass in a different 'feature' for each docs page, you can write it as: -``` +```markdown import SetUpPages from '/snippets/_available-enterprise-only.md'; - -` + + ``` - + Then in the `/snippets/_available-enterprise-only.md file`, you can display that feature prop with: - + >This feature: `{props.feature}` other content etc... This will then translate to: - + >This feature: A really cool feature other content etc... In this example, the component ` ### Snippets -The Snippet component allows for content to be reusable throughout the Docs. This is very similar to the existing FAQ component. Using partial files, which is a built-in Docusaurus feature, is recommended over snippets. +The Snippet component allows for content to be reusable throughout the Docs. This is very similar to the existing FAQ component. Using partial files, which is a built-in Docusaurus feature, is recommended over snippets. Creating and using a snippet requires two steps: diff --git a/website/docs/docs/dbt-cloud-apis/discovery-querying.md b/website/docs/docs/dbt-cloud-apis/discovery-querying.md index 40836203faa..eaa30c36dfb 100644 --- a/website/docs/docs/dbt-cloud-apis/discovery-querying.md +++ b/website/docs/docs/dbt-cloud-apis/discovery-querying.md @@ -1,14 +1,14 @@ --- title: "Query the Discovery API" id: "discovery-querying" -sidebar_label: "Query the Discovery API" +sidebar_label: "Query the Discovery API" --- -The Discovery API supports ad-hoc queries and integrations.. If you are new to the API, read the [Discovery API overview](/docs/dbt-cloud-apis/discovery-api) for an introduction. +The Discovery API supports ad-hoc queries and integrations. If you are new to the API, refer to [About the Discovery API](/docs/dbt-cloud-apis/discovery-api) for an introduction. -Use the Discovery API to evaluate data pipeline health and project state across runs or at a moment in time. dbt Labs provide a [GraphQL explorer](https://metadata.cloud.getdbt.com/graphql) for this API, enabling you to run queries and browse the schema. +Use the Discovery API to evaluate data pipeline health and project state across runs or at a moment in time. dbt Labs provide a [GraphQL explorer](https://metadata.cloud.getdbt.com/graphql) for this API, enabling you to run queries and browse the schema. -Since GraphQL describes the data in the API, the schema displayed in the GraphQL explorer accurately represents the graph and fields available to query. +Since GraphQL describes the data in the API, the schema displayed in the GraphQL explorer accurately represents the graph and fields available to query. @@ -16,17 +16,17 @@ Since GraphQL describes the data in the API, the schema displayed in the GraphQL Currently, authorization of requests takes place [using a service token](/docs/dbt-cloud-apis/service-tokens). dbt Cloud admin users can generate a Metadata Only service token that is authorized to execute a specific query against the Discovery API. -Once you've created a token, you can use it in the Authorization header of requests to the dbt Cloud Discovery API. Be sure to include the Token prefix in the Authorization header, or the request will fail with a `401 Unauthorized` error. 
Note that `Bearer` can be used instead of `Token` in the Authorization header. Both syntaxes are equivalent. +Once you've created a token, you can use it in the Authorization header of requests to the dbt Cloud Discovery API. Be sure to include the Token prefix in the Authorization header, or the request will fail with a `401 Unauthorized` error. Note that `Bearer` can be used instead of `Token` in the Authorization header. Both syntaxes are equivalent. -## Access the Discovery API +## Access the Discovery API 1. Create a [service account token](/docs/dbt-cloud-apis/service-tokens) to authorize requests. dbt Cloud Admin users can generate a _Metadata Only_ service token, which can be used to execute a specific query against the Discovery API to authorize requests. -2. Find your API URL using the endpoint `https://metadata.{YOUR_ACCESS_URL}/graphql`. +2. Find your API URL using the endpoint `https://metadata.{YOUR_ACCESS_URL}/graphql`. * Replace `{YOUR_ACCESS_URL}` with the appropriate [Access URL](/docs/cloud/about-cloud/regions-ip-addresses) for your region and plan. For example, if your multi-tenant region is North America, your endpoint is `https://metadata.cloud.getdbt.com/graphql`. If your multi-tenant region is EMEA, your endpoint is `https://metadata.emea.dbt.com/graphql`. -3. For specific query points, refer to the [schema documentation](/docs/dbt-cloud-apis/discovery-schema-model). +3. For specific query points, refer to the [schema documentation](/docs/dbt-cloud-apis/discovery-schema-model). ## Run queries using HTTP requests @@ -36,7 +36,7 @@ You can run queries by sending a `POST` request to the `https://metadata.YOUR_AC * `YOUR_TOKEN` in the Authorization header with your actual API token. Be sure to include the Token prefix. * `QUERY_BODY` with a GraphQL query, for example `{ "query": "" }` * `VARIABLES` with a dictionary of your GraphQL query variables, such as a job ID or a filter. -* `ENDPOINT` with the endpoint you're querying, such as environment. +* `ENDPOINT` with the endpoint you're querying, such as environment. ```shell curl 'https://metadata.YOUR_ACCESS_URL/graphql' \ @@ -48,10 +48,13 @@ You can run queries by sending a `POST` request to the `https://metadata.YOUR_AC Python example: -```py -response = requests.post('YOUR_ACCESS_URL', -headers={"authorization": "Bearer "+YOUR_TOKEN, "content-type": "application/json"}, -json={"query": QUERY_BODY, "variables": VARIABLES}) +```python +response = requests.post( + 'YOUR_ACCESS_URL', + headers={"authorization": "Bearer "+YOUR_TOKEN, "content-type": "application/json"}, + json={"query": QUERY_BODY, "variables": VARIABLES} +) + metadata = response.json()['data'][ENDPOINT] ``` @@ -71,66 +74,74 @@ You can use the Discovery API to query data from the previous three months. For ## Run queries with the GraphQL explorer -You can run ad-hoc queries directly in the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and use the document explorer on the left-hand side, where you can see all possible nodes and fields. +You can run ad-hoc queries directly in the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and use the document explorer on the left-hand side to see all possible nodes and fields. + +Refer to the [Apollo explorer documentation](https://www.apollographql.com/docs/graphos/explorer/explorer) for setup and authorization info. + +1. Access the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and select fields you want to query. 
-Refer to the [Apollo explorer documentation](https://www.apollographql.com/docs/graphos/explorer/explorer) for setup and authorization info. +2. Select **Variables** at the bottom of the explorer and replace any `null` fields with your unique values. -1. Access the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and select fields you'd like query. +3. [Authenticate](https://www.apollographql.com/docs/graphos/explorer/connecting-authenticating#authentication) using Bearer auth with `YOUR_TOKEN`. Select **Headers** at the bottom of the explorer and select **+New header**. -2. Go to **Variables** at the bottom of the explorer and replace any `null` fields with your unique values. +4. Select **Authorization** in the **header key** dropdown list and enter your Bearer auth token in the **value** field. Remember to include the Token prefix. Your header key should be in this format: `{"Authorization": "Bearer }`. -3. [Authenticate](https://www.apollographql.com/docs/graphos/explorer/connecting-authenticating#authentication) via Bearer auth with `YOUR_TOKEN`. Go to **Headers** at the bottom of the explorer and select **+New header**. + + -4. Select **Authorization** in the **header key** drop-down list and enter your Bearer auth token in the **value** field. Remember to include the Token prefix. Your header key should look like this `{"Authorization": "Bearer }`.
-5. Run your query by pressing the blue query button in the top-right of the Operation editor (to the right of the query). You should see a successful query response on the right side of the explorer. +1. Run your query by clicking the blue query button in the top right of the **Operation** editor (to the right of the query). You should see a successful query response on the right side of the explorer. + + + ### Fragments -Use the [`..on`](https://www.apollographql.com/docs/react/data/fragments/) notation to query across lineage and retrieve results from specific node types. +Use the [`... on`](https://www.apollographql.com/docs/react/data/fragments/) notation to query across lineage and retrieve results from specific node types. ```graphql - -environment(id: $environmentId) { - applied { - models(first: $first,filter:{uniqueIds:"MODEL.PROJECT.MODEL_NAME"}) { - edges { - node { - name - ancestors(types:[Model, Source, Seed, Snapshot]) { - ... on ModelAppliedStateNode { - name - resourceType - materializedType - executionInfo { - executeCompletedAt +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first, filter: { uniqueIds: "MODEL.PROJECT.MODEL_NAME" }) { + edges { + node { + name + ancestors(types: [Model, Source, Seed, Snapshot]) { + ... on ModelAppliedStateNestedNode { + name + resourceType + materializedType + executionInfo { + executeCompletedAt + } } - } - ... on SourceAppliedStateNode { - sourceName - name - resourceType - freshness { - maxLoadedAt + ... on SourceAppliedStateNestedNode { + sourceName + name + resourceType + freshness { + maxLoadedAt + } } - } - ... on SnapshotAppliedStateNode { - name - resourceType - executionInfo { - executeCompletedAt + ... on SnapshotAppliedStateNestedNode { + name + resourceType + executionInfo { + executeCompletedAt + } } - } - ... on SeedAppliedStateNode { - name - resourceType - executionInfo { - executeCompletedAt + ... on SeedAppliedStateNestedNode { + name + resourceType + executionInfo { + executeCompletedAt + } } } } @@ -139,56 +150,59 @@ environment(id: $environmentId) { } } } - ``` ### Pagination -Querying large datasets can impact performance on multiple functions in the API pipeline. Pagination eases the burden by returning smaller data sets one page at a time. This is useful for returning a particular portion of the dataset or the entire dataset piece-by-piece to enhance performance. dbt Cloud utilizes cursor-based pagination, which makes it easy to return pages of constantly changing data. +Querying large datasets can impact performance on multiple functions in the API pipeline. Pagination eases the burden by returning smaller data sets one page at a time. This is useful for returning a particular portion of the dataset or the entire dataset piece-by-piece to enhance performance. dbt Cloud utilizes cursor-based pagination, which makes it easy to return pages of constantly changing data. -Use the `PageInfo` object to return information about the page. The following fields are available: +Use the `PageInfo` object to return information about the page. The available fields are: -- `startCursor` string type - corresponds to the first `node` in the `edge`. -- `endCursor` string type - corresponds to the last `node` in the `edge`. -- `hasNextPage` boolean type - whether there are more `nodes` after the returned results. -- `hasPreviousPage` boolean type - whether `nodes` exist before the returned results. 
+- `startCursor` string type — Corresponds to the first `node` in the `edge`. +- `endCursor` string type — Corresponds to the last `node` in the `edge`. +- `hasNextPage` boolean type — Whether or not there are more `nodes` after the returned results. There are connection variables available when making the query: -- `first` integer type - will return the first 'n' `nodes` for each page, up to 500. -- `after` string type sets the cursor to retrieve `nodes` after. It's best practice to set the `after` variable with the object ID defined in the `endcursor` of the previous page. +- `first` integer type — Returns the first n `nodes` for each page, up to 500. +- `after` string type — Sets the cursor to retrieve `nodes` after. It's best practice to set the `after` variable with the object ID defined in the `endCursor` of the previous page. + +Below is an example that returns the `first` 500 models `after` the specified Object ID in the variables. The `PageInfo` object returns where the object ID where the cursor starts, where it ends, and whether there is a next page. -The following example shows that we're returning the `first` 500 models `after` the specified Object ID in the variables. The `PageInfo` object will return where the object ID where the cursor starts, where it ends, and whether there is a next page. + + -Here is a code example of the `PageInfo` object: +Below is a code example of the `PageInfo` object: ```graphql pageInfo { - startCursor - endCursor - hasNextPage - } - totalCount # Total number of pages - + startCursor + endCursor + hasNextPage +} +totalCount # Total number of records across all pages ``` ### Filters -Filtering helps to narrow down the results of an API query. Want to query and return only models and tests that are failing? Or find models that are taking too long to run? You can fetch execution details such as [`executionTime`](/docs/dbt-cloud-apis/discovery-schema-models#fields), [`runElapsedTime`](/docs/dbt-cloud-apis/discovery-schema-models#fields), or [`status`](/docs/dbt-cloud-apis/discovery-schema-models#fields). This helps data teams monitor the performance of their models, identify bottlenecks, and optimize the overall data pipeline. +Filtering helps to narrow down the results of an API query. If you want to query and return only models and tests that are failing or find models that are taking too long to run, you can fetch execution details such as [`executionTime`](/docs/dbt-cloud-apis/discovery-schema-models#fields), [`runElapsedTime`](/docs/dbt-cloud-apis/discovery-schema-models#fields), or [`status`](/docs/dbt-cloud-apis/discovery-schema-models#fields). This helps data teams monitor the performance of their models, identify bottlenecks, and optimize the overall data pipeline. -In the following example, we can see that we're filtering results to models that have succeeded on their `lastRunStatus`: +Below is an example that filters for results of models that have succeeded on their `lastRunStatus`: -Here is a code example that filters for models that have an error on their last run and tests that have failed: +Below is an example that filters for models that have an error on their last run and tests that have failed: -```graphql + + -environment(id: $environmentId) { +```graphql +query ModelsAndTests($environmentId: BigInt!, $first: Int!) 
{ + environment(id: $environmentId) { applied { - models(first: $first, filter: {lastRunStatus:error}) { + models(first: $first, filter: { lastRunStatus: error }) { edges { node { name @@ -198,7 +212,7 @@ environment(id: $environmentId) { } } } - tests(first: $first, filter: {status:"fail"}) { + tests(first: $first, filter: { status: "fail" }) { edges { node { name @@ -207,9 +221,10 @@ environment(id: $environmentId) { } } } - } + } + } + } } - ``` ## Related content diff --git a/website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examples.md b/website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examples.md index 030688d9aeb..8efb1ec0d37 100644 --- a/website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examples.md +++ b/website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examples.md @@ -3,9 +3,9 @@ title: "Use cases and examples for the Discovery API" sidebar_label: "Uses and examples" --- -With the Discovery API, you can query the metadata in dbt Cloud to learn more about your dbt deployments and the data it generates to analyze them and make improvements. +With the Discovery API, you can query the metadata in dbt Cloud to learn more about your dbt deployments and the data it generates to analyze them and make improvements. -You can use the API in a variety of ways to get answers to your business questions. Below describes some of the uses of the API and is meant to give you an idea of the questions this API can help you answer. +You can use the API in a variety of ways to get answers to your business questions. Below describes some of the uses of the API and is meant to give you an idea of the questions this API can help you answer. | Use Case | Outcome | Example Questions | | --- | --- | --- | @@ -17,13 +17,13 @@ You can use the API in a variety of ways to get answers to your business questio ## Performance -You can use the Discovery API to identify inefficiencies in pipeline execution to reduce infrastructure costs and improve timeliness. Below are example questions and queries you can run. +You can use the Discovery API to identify inefficiencies in pipeline execution to reduce infrastructure costs and improve timeliness. Below are example questions and queries you can run. For performance use cases, people typically query the historical or latest applied state across any part of the DAG (for example, models) using the `environment`, `modelByEnvironment`, or job-level endpoints. ### How long did each model take to run? -It’s helpful to understand how long it takes to build models (tables) and tests to execute during a dbt run. Longer model build times result in higher infrastructure costs and fresh data arriving later to stakeholders. Analyses like these can be in observability tools or ad-hoc queries, like in a notebook. +It’s helpful to understand how long it takes to build models (tables) and tests to execute during a dbt run. Longer model build times result in higher infrastructure costs and fresh data arriving later to stakeholders. Analyses like these can be in observability tools or ad-hoc queries, like in a notebook. @@ -35,33 +35,42 @@ Data teams can monitor the performance of their models, identify bottlenecks, an 1. Use latest state environment-level API to get a list of all executed models and their execution time. Then, sort the models by `executionTime` in descending order. 
```graphql -query Query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first) { - edges { - node { - name - uniqueId - materializedType - executionInfo { - lastSuccessRunId - executionTime - executeStartedAt - } - } - } +query AppliedModels($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first) { + edges { + node { + name + uniqueId + materializedType + executionInfo { + lastSuccessRunId + executionTime + executeStartedAt } + } } + } } + } } ``` -2. Get the most recent 20 run results for the longest running model. Review the results of the model across runs, or you can go to the job/run or commit itself to investigate further. +2. Get the most recent 20 run results for the longest running model. Review the results of the model across runs or you can go to the job/run or commit itself to investigate further. ```graphql -query($environmentId: Int!, $uniqueId: String!, $lastRunCount: Int!) { - modelByEnvironment(environmentId: $environmentId, uniqueId: $uniqueId, lastRunCount: $lastRunCount) { +query ModelHistoricalRuns( + $environmentId: BigInt! + $uniqueId: String + $lastRunCount: Int +) { + environment(id: $environmentId) { + applied { + modelHistoricalRuns( + uniqueId: $uniqueId + lastRunCount: $lastRunCount + ) { name runId runElapsedTime @@ -70,12 +79,15 @@ query($environmentId: Int!, $uniqueId: String!, $lastRunCount: Int!) { executeStartedAt executeCompletedAt status + } } + } } ``` 3. Use the query results to plot a graph of the longest running model’s historical run time and execution time trends. + ```python # Import libraries import os @@ -88,11 +100,11 @@ auth_token = *[SERVICE_TOKEN_HERE]* # Query the API def query_discovery_api(auth_token, gql_query, variables): - response = requests.post('https://metadata.cloud.getdbt.com/graphql', + response = requests.post('https://metadata.cloud.getdbt.com/graphql', headers={"authorization": "Bearer "+auth_token, "content-type": "application/json"}, json={"query": gql_query, "variables": variables}) data = response.json()['data'] - + return data # Get the latest run metadata for all models @@ -120,7 +132,7 @@ variables_query_two = { } # Get the historical run metadata for the longest running model -model_historical_metadata = query_discovery_api(auth_token, query_two, variables_query_two)['modelByEnvironment'] +model_historical_metadata = query_discovery_api(auth_token, query_two, variables_query_two)['environment']['applied']['modelHistoricalRuns'] # Convert to dataframe model_df = pd.DataFrame(model_historical_metadata) @@ -143,7 +155,8 @@ plt.plot(model_df['executeStartedAt'], model_df['executionTime']) plt.title(model_df['name'].iloc[0]+" Execution Time") plt.show() ``` -Plotting examples: + +Plotting examples: @@ -152,70 +165,91 @@ Plotting examples:
-### What’s the latest state of each model? +### What’s the latest state of each model? The Discovery API provides information about the applied state of models and how they arrived in that state. You can retrieve the status information from the most recent run and most recent successful run (execution) from the `environment` endpoint and dive into historical runs using job-based and `modelByEnvironment` endpoints.
Example query -The API returns full identifier information (`database.schema.alias`) and the `executionInfo` for both the most recent run and most recent successful run from the database: - - - ```graphql - query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first) { - edges { - node { - uniqueId - compiledCode - database - schema - alias - materializedType - executionInfo { - executeCompletedAt - lastJobDefinitionId - lastRunGeneratedAt - lastRunId - lastRunStatus - lastRunError - lastSuccessJobDefinitionId - runGeneratedAt - lastSuccessRunId - } - } - } - } - } - } - } - ``` +The API returns full identifier information (`database.schema.alias`) and the `executionInfo` for both the most recent run and most recent successful run from the database: + +```graphql +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first) { + edges { + node { + uniqueId + compiledCode + database + schema + alias + materializedType + executionInfo { + executeCompletedAt + lastJobDefinitionId + lastRunGeneratedAt + lastRunId + lastRunStatus + lastRunError + lastSuccessJobDefinitionId + runGeneratedAt + lastSuccessRunId + } + } + } + } + } + } +} +```
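When an environment contains more models than fit in one page, the `first`/`after` cursor pattern described in the querying guide applies to queries like the one above. Below is a rough Python sketch of paging through every model; the host, token, and environment ID are placeholder assumptions.

```python
# Hypothetical sketch: page through all models in an environment using the
# first/after cursor variables and the pageInfo object. Token, environment ID,
# and host are placeholders.
import requests

URL = "https://metadata.cloud.getdbt.com/graphql"
TOKEN = "YOUR_TOKEN"
ENVIRONMENT_ID = 123

QUERY = """
query ($environmentId: BigInt!, $first: Int!, $after: String) {
  environment(id: $environmentId) {
    applied {
      models(first: $first, after: $after) {
        pageInfo { endCursor hasNextPage }
        edges { node { uniqueId executionInfo { lastRunStatus } } }
      }
    }
  }
}
"""

def fetch_all_models():
    models, cursor = [], None
    while True:
        response = requests.post(
            URL,
            headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
            json={"query": QUERY, "variables": {"environmentId": ENVIRONMENT_ID, "first": 500, "after": cursor}},
        )
        response.raise_for_status()
        page = response.json()["data"]["environment"]["applied"]["models"]
        models.extend(edge["node"] for edge in page["edges"])
        if not page["pageInfo"]["hasNextPage"]:
            return models
        cursor = page["pageInfo"]["endCursor"]

print(len(fetch_all_models()))
```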
### What happened with my job run? -You can query the metadata at the job level to review results for specific runs. This is helpful for historical analysis of deployment performance or optimizing particular jobs. +You can query the metadata at the job level to review results for specific runs. This is helpful for historical analysis of deployment performance or optimizing particular jobs. + +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + +
Example query +Deprecated example: ```graphql -query($jobId: Int!, $runId: Int!){ - models(jobId: $jobId, runId: $runId) { - name - status - tests { - name - status - } - } +query ($jobId: Int!, $runId: Int!) { + models(jobId: $jobId, runId: $runId) { + name + status + tests { + name + status + } + } +} +``` + +New example: + +```graphql +query ($jobId: BigInt!, $runId: BigInt!) { + job(id: $jobId, runId: $runId) { + models { + name + status + tests { + name + status + } + } + } } ``` - +
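To show how the newer `job` endpoint above might be consumed from Python (for example, to summarize a CI run), here is a small hedged sketch; the token, job ID, run ID, and the non-"pass" status check are placeholder assumptions.

```python
# Hypothetical sketch: summarize model and test statuses for one run using the
# job(id, runId) endpoint shown above. Token, job ID, and run ID are placeholders.
import requests

QUERY = """
query ($jobId: BigInt!, $runId: BigInt!) {
  job(id: $jobId, runId: $runId) {
    models { name status tests { name status } }
  }
}
"""

response = requests.post(
    "https://metadata.cloud.getdbt.com/graphql",
    headers={"Authorization": "Bearer YOUR_TOKEN", "Content-Type": "application/json"},
    json={"query": QUERY, "variables": {"jobId": 1234, "runId": 5678}},
)
response.raise_for_status()

for model in response.json()["data"]["job"]["models"]:
    # Treat anything that is not "pass" as a failure to surface (assumed status values).
    failed_tests = [t["name"] for t in model["tests"] if t["status"] != "pass"]
    print(model["name"], model["status"], "failed tests:", failed_tests or "none")
```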
### What’s changed since the last run? @@ -228,41 +262,47 @@ With the API, you can compare the `rawCode` between the definition and applied s ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first, filter: {uniqueIds:"MODEL.PROJECT.MODEL_NAME"}) { - edges { - node { - rawCode - ancestors(types: [Source]){ - ...on SourceAppliedStateNode { - freshness { - maxLoadedAt - } - } - } - executionInfo { - runGeneratedAt - executeCompletedAt - } - materializedType - } - } - } - } - definition { - models(first: $first, filter: {uniqueIds:"MODEL.PROJECT.MODEL_NAME"}) { - edges { - node { - rawCode - runGeneratedAt - materializedType - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models( + first: $first + filter: { uniqueIds: "MODEL.PROJECT.MODEL_NAME" } + ) { + edges { + node { + rawCode + ancestors(types: [Source]) { + ... on SourceAppliedStateNestedNode { + freshness { + maxLoadedAt + } + } + } + executionInfo { + runGeneratedAt + executeCompletedAt + } + materializedType + } + } + } + } + definition { + models( + first: $first + filter: { uniqueIds: "MODEL.PROJECT.MODEL_NAME" } + ) { + edges { + node { + rawCode + runGeneratedAt + materializedType + } + } + } + } + } } ``` @@ -270,45 +310,46 @@ query($environmentId: Int!, $first: Int!){ ## Quality -You can use the Discovery API to monitor data source freshness and test results to diagnose and resolve issues and drive trust in data. When used with [webhooks](/docs/deploy/webhooks), can also help with detecting, investigating, and alerting issues. Below lists example questions the API can help you answer. Below are example questions and queries you can run. +You can use the Discovery API to monitor data source freshness and test results to diagnose and resolve issues and drive trust in data. When used with [webhooks](/docs/deploy/webhooks), can also help with detecting, investigating, and alerting issues. Below lists example questions the API can help you answer. Below are example questions and queries you can run. -For quality use cases, people typically query the historical or latest applied state, often in the upstream part of the DAG (for example, sources), using the `environment` or `modelByEnvironment` endpoints. +For quality use cases, people typically query the historical or latest applied state, often in the upstream part of the DAG (for example, sources), using the `environment` or `environment { applied { modelHistoricalRuns } }` endpoints. ### Which models and tests failed to run? + By filtering on the latest status, you can get lists of models that failed to build and tests that failed during their most recent execution. This is helpful when diagnosing issues with the deployment that result in delayed or incorrect data.
Example query with code -1. Get the latest run results across all jobs in the environment and return only the models and tests that errored/failed. +1. Get the latest run results across all jobs in the environment and return only the models and tests that errored/failed. ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first, filter: {lastRunStatus:error}) { - edges { - node { - name - executionInfo { - lastRunId - } - } - } - } - tests(first: $first, filter: {status:"fail"}) { - edges { - node { - name - executionInfo { - lastRunId - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first, filter: { lastRunStatus: error }) { + edges { + node { + name + executionInfo { + lastRunId + } + } + } + } + tests(first: $first, filter: { status: "fail" }) { + edges { + node { + name + executionInfo { + lastRunId + } + } + } + } + } + } } ``` @@ -316,14 +357,18 @@ query($environmentId: Int!, $first: Int!){ ```graphql -query($environmentId: Int!, $uniqueId: String!, $lastRunCount: Int) { - modelByEnvironment(environmentId: $environmentId, uniqueId: $uniqueId, lastRunCount: $lastRunCount) { - name - executeStartedAt - status - tests { - name - status +query ($environmentId: BigInt!, $uniqueId: String!, $lastRunCount: Int) { + environment(id: $environmentId) { + applied { + modelHistoricalRuns(uniqueId: $uniqueId, lastRunCount: $lastRunCount) { + name + executeStartedAt + status + tests { + name + status + } + } } } } @@ -337,63 +382,67 @@ query($environmentId: Int!, $uniqueId: String!, $lastRunCount: Int) { ### When was the data my model uses last refreshed? -You can get the metadata on the latest execution for a particular model or across all models in your project. For instance, investigate when each model or snapshot that's feeding into a given model was last executed or the source or seed was last loaded to gauge the _freshness_ of the data. +You can get the metadata on the latest execution for a particular model or across all models in your project. For instance, investigate when each model or snapshot that's feeding into a given model was last executed or the source or seed was last loaded to gauge the _freshness_ of the data.
Example query with code ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first,filter:{uniqueIds:"MODEL.PROJECT.MODEL_NAME"}) { - edges { - node { - name - ancestors(types:[Model, Source, Seed, Snapshot]) { - ... on ModelAppliedStateNode { - name - resourceType - materializedType - executionInfo { - executeCompletedAt - } - } - ... on SourceAppliedStateNode { - sourceName - name - resourceType - freshness { - maxLoadedAt - } - } - ... on SnapshotAppliedStateNode { - name - resourceType - executionInfo { - executeCompletedAt - } - } - ... on SeedAppliedStateNode { - name - resourceType - executionInfo { - executeCompletedAt - } - } - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models( + first: $first + filter: { uniqueIds: "MODEL.PROJECT.MODEL_NAME" } + ) { + edges { + node { + name + ancestors(types: [Model, Source, Seed, Snapshot]) { + ... on ModelAppliedStateNestedNode { + name + resourceType + materializedType + executionInfo { + executeCompletedAt + } + } + ... on SourceAppliedStateNestedNode { + sourceName + name + resourceType + freshness { + maxLoadedAt + } + } + ... on SnapshotAppliedStateNestedNode { + name + resourceType + executionInfo { + executeCompletedAt + } + } + ... on SeedAppliedStateNestedNode { + name + resourceType + executionInfo { + executeCompletedAt + } + } + } + } + } + } + } + } } ``` + ```python # Extract graph nodes from response -def extract_nodes(data): +def extract_nodes(data): models = [] sources = [] groups = [] @@ -422,9 +471,9 @@ def create_freshness_graph(models_df, sources_df): if model["executionInfo"]["executeCompletedAt"] is not None: model_freshness = current_time - pd.Timestamp(model["executionInfo"]["executeCompletedAt"]) for ancestor in model["ancestors"]: - if ancestor["resourceType"] == "SourceAppliedStateNode": + if ancestor["resourceType"] == "SourceAppliedStateNestedNode": ancestor_freshness = current_time - pd.Timestamp(ancestor["freshness"]['maxLoadedAt']) - elif ancestor["resourceType"] == "ModelAppliedStateNode": + elif ancestor["resourceType"] == "ModelAppliedStateNestedNode": ancestor_freshness = current_time - pd.Timestamp(ancestor["executionInfo"]["executeCompletedAt"]) if ancestor_freshness > max_freshness: @@ -437,11 +486,11 @@ def create_freshness_graph(models_df, sources_df): for _, model in models_df.iterrows(): for parent in model["parents"]: G.add_edge(parent["uniqueId"], model["uniqueId"]) - + return G ``` -Graph example: +Graph example: @@ -450,7 +499,7 @@ Graph example: ### Are my data sources fresh? -Checking [source freshness](/docs/build/sources#snapshotting-source-data-freshness) allows you to ensure that sources loaded and used in your dbt project are compliant with expectations. The API provides the latest metadata about source loading and information about the freshness check criteria. +Checking [source freshness](/docs/build/sources#snapshotting-source-data-freshness) allows you to ensure that sources loaded and used in your dbt project are compliant with expectations. The API provides the latest metadata about source loading and information about the freshness check criteria. 
@@ -458,47 +507,49 @@ Checking [source freshness](/docs/build/sources#snapshotting-source-data-freshne Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - sources(first: $first, filters:{freshnessChecked:true, database:"production"}) { - edges { - node { - sourceName - name - identifier - loader - freshness { - freshnessJobDefinitionId - freshnessRunId - freshnessRunGeneratedAt - freshnessStatus - freshnessChecked - maxLoadedAt - maxLoadedAtTimeAgoInS - snapshottedAt - criteria { - errorAfter { - count - period - } - warnAfter { - count - period - } - } - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + sources( + first: $first + filter: { freshnessChecked: true, database: "production" } + ) { + edges { + node { + sourceName + name + identifier + loader + freshness { + freshnessJobDefinitionId + freshnessRunId + freshnessRunGeneratedAt + freshnessStatus + freshnessChecked + maxLoadedAt + maxLoadedAtTimeAgoInS + snapshottedAt + criteria { + errorAfter { + count + period + } + warnAfter { + count + period + } + } + } + } + } + } + } + } } ```
- ### What’s the test coverage and status? [Tests](https://docs.getdbt.com/docs/build/tests) are an important way to ensure that your stakeholders are reviewing high-quality data. You can execute tests during a dbt Cloud run. The Discovery API provides complete test results for a given environment or job, which it represents as the `children` of a given node that’s been tested (for example, a `model`). @@ -506,32 +557,32 @@ query($environmentId: Int!, $first: Int!){
Example query -For the following example, the `parents` are the nodes (code) that's being tested and `executionInfo` describes the latest test results: +For the following example, the `parents` are the nodes (code) that's being tested and `executionInfo` describes the latest test results: ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - tests(first: $first) { - edges { - node { - name - columnName - parents { - name - resourceType - } - executionInfo { - lastRunStatus - lastRunError - executeCompletedAt - executionTime - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + tests(first: $first) { + edges { + node { + name + columnName + parents { + name + resourceType + } + executionInfo { + lastRunStatus + lastRunError + executeCompletedAt + executionTime + } + } + } + } + } + } } ``` @@ -541,44 +592,41 @@ query($environmentId: Int!, $first: Int!){ ### How is this model contracted and versioned? -To enforce the shape of a model's definition, you can define contracts on models and their columns. You can also specify model versions to keep track of discrete stages in its evolution and use the appropriate one. +To enforce the shape of a model's definition, you can define contracts on models and their columns. You can also specify model versions to keep track of discrete stages in its evolution and use the appropriate one. + +
Example query ```graphql -query{ - environment(id:123) { - definition { - models(first:100, filter:{access:public}) { - edges { - nodes { - name - latest_version - contract_enforced - constraints{ - name - type - expression - columns - } - catalog { - columns { - name - type - constraints { - name - type - expression - } - } - } - } - } - } - } - } +query { + environment(id: 123) { + applied { + models(first: 100, filter: { access: public }) { + edges { + node { + name + latestVersion + contractEnforced + constraints { + name + type + expression + columns + } + catalog { + columns { + name + type + } + } + } + } + } + } + } } ``` @@ -594,42 +642,50 @@ For discovery use cases, people typically query the latest applied or definition ### What does this dataset and its columns mean? -Query the Discovery API to map a table/view in the data platform to the model in the dbt project; then, retrieve metadata about its meaning, including descriptive metadata from its YAML file and catalog information from its YAML file and the schema. - +Query the Discovery API to map a table/view in the data platform to the model in the dbt project; then, retrieve metadata about its meaning, including descriptive metadata from its YAML file and catalog information from its YAML file and the schema.
Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - models(first: $first, filter: {database:"analytics", schema:"prod", identifier:"customers"}) { - edges { - node { - name - description - tags - meta - catalog { - columns { - name - description - type - } - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models( + first: $first + filter: { + database: "analytics" + schema: "prod" + identifier: "customers" + } + ) { + edges { + node { + name + description + tags + meta + catalog { + columns { + name + description + type + } + } + } + } + } + } + } } ```
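For a data catalog integration, the model and column metadata returned above can be rendered into a simple data-dictionary entry. Below is a hedged sketch; the field names follow the query above and the sample payload is invented for illustration.

```python
# Hypothetical sketch: turn the model/column metadata returned by the query
# above into a simple data-dictionary entry. `model_node` stands in for one
# `node` from `data.environment.applied.models.edges`.
def describe_model(model_node):
    lines = [f"# {model_node['name']}", model_node.get("description") or "(no description)"]
    catalog = model_node.get("catalog") or {}
    for column in catalog.get("columns") or []:
        lines.append(f"- {column['name']} ({column.get('type') or 'unknown'}): "
                     f"{column.get('description') or '(no description)'}")
    return "\n".join(lines)

example = {
    "name": "customers",
    "description": "One row per customer.",
    "catalog": {"columns": [
        {"name": "customer_id", "type": "integer", "description": "Primary key."},
        {"name": "first_order_date", "type": "date", "description": None},
    ]},
}
print(describe_model(example))
```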
+ + -### Which metrics are available? +### Which metrics are available? -Metric definitions are coming soon to the Discovery API with dbt v1.6. You’ll be able to query metrics using the dbt Semantic Layer, use them for documentation purposes (like for a data catalog), and calculate aggregations (like in a BI tool that doesn’t query the SL). +You can define and query metrics using the [dbt Semantic Layer](/docs/build/about-metricflow), use them for documentation purposes (like for a data catalog), and calculate aggregations (like in a BI tool that doesn’t query the SL). To learn more, refer to [Get started with MetricFlow](/docs/build/sl-getting-started).
Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - definition { - metrics(first: $first) { - edges { - node { - name - description - type - formula - filter - tags - parents { - name - resourceType - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + definition { + metrics(first: $first) { + edges { + node { + name + description + type + formula + filter + tags + parents { + name + resourceType + } + } + } + } + } + } } ``` @@ -912,7 +952,7 @@ query($environmentId: Int!, $first: Int!){ -## Governance +## Governance You can use the Discovery API to audit data development and facilitate collaboration within and between teams. @@ -923,95 +963,98 @@ For governance use cases, people tend to query the latest definition state, ofte You can define and surface the groups each model is associated with. Groups contain information like owner. This can help you identify which team owns certain models and who to contact about them.
-Example query +Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - model(first: $first, filter:{uniqueIds:["MODEL.PROJECT.NAME"]}) { - edges { - node { - name - description - resourceType - access - group - } - } - } - } - definition { - groups(first: $first) { - edges { - node { - name - resourceType - models { - name - } - owner_name - owner_email - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first, filter: { uniqueIds: ["MODEL.PROJECT.NAME"] }) { + edges { + node { + name + description + resourceType + access + group + } + } + } + } + definition { + groups(first: $first) { + edges { + node { + name + resourceType + models { + name + } + ownerName + ownerEmail + } + } + } + } + } } ```
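One way to act on the groups metadata above is to build a model-to-owner lookup, for example to route alerts to the right team. Below is a hedged sketch; the response shape follows the groups query above and the sample data is invented.

```python
# Hypothetical sketch: build a model-to-owner lookup from the groups query
# above. `groups_response` stands in for `data.environment.definition.groups`.
def model_owners(groups_response):
    owners = {}
    for edge in groups_response["edges"]:
        group = edge["node"]
        for model in group.get("models") or []:
            owners[model["name"]] = {
                "group": group["name"],
                "owner_name": group.get("ownerName"),
                "owner_email": group.get("ownerEmail"),
            }
    return owners

example = {"edges": [{"node": {"name": "finance", "ownerName": "Data Platform",
                               "ownerEmail": "data-platform@example.com",
                               "models": [{"name": "fct_orders"}, {"name": "dim_customers"}]}}]}
print(model_owners(example))
```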
### Who can use this model? -You can enable users the ability to specify the level of access for a given model. In the future, public models will function like APIs to unify project lineage and enable reuse of models using cross-project refs. +You can enable people the ability to specify the level of access for a given model. In the future, public models will function like APIs to unify project lineage and enable reuse of models using cross-project refs.
-Example query +Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - definition { - models(first: $first) { - edges { - node { - name - access - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + definition { + models(first: $first) { + edges { + node { + name + access + } + } + } + } + } } +``` --- -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - definition { - models(first: $first, filters:{access:public}) { - edges { - node { - name - } - } - } - } - } + +```graphql +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + definition { + models(first: $first, filter: { access: public }) { + edges { + node { + name + } + } + } + } + } } ```
-## Development +## Development You can use the Discovery API to understand dataset changes and usage and gauge impacts to inform project definition. Below are example questions and queries you can run. For development use cases, people typically query the historical or latest definition or applied state across any part of the DAG using the `environment` endpoint. ### How is this model or metric used in downstream tools? -[Exposures](/docs/build/exposures) provide a method to define how a model or metric is actually used in dashboards and other analytics tools and use cases. You can query an exposure’s definition to see how project nodes are used and query its upstream lineage results to understand the state of the data used in it, which powers use cases like a freshness and quality status tile. +[Exposures](/docs/build/exposures) provide a method to define how a model or metric is actually used in dashboards and other analytics tools and use cases. You can query an exposure’s definition to see how project nodes are used and query its upstream lineage results to understand the state of the data used in it, which powers use cases like a freshness and quality status tile. @@ -1019,47 +1062,41 @@ For development use cases, people typically query the historical or latest defin
Example query -This example reviews an exposure and the models used in it, including when they were last executed and their test results: +Below is an example that reviews an exposure and the models used in it including when they were last executed. ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - exposures(first: $first) { - edges { - node { - name - description - owner_name - url - parents { - name - resourceType - ... on ModelAppliedStateNode { - executionInfo { - executeCompletedAt - lastRunStatus - } - tests { - executionInfo { - executeCompletedAt - lastRunStatus - } - } - } - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + exposures(first: $first) { + edges { + node { + name + description + ownerName + url + parents { + name + resourceType + ... on ModelAppliedStateNestedNode { + executionInfo { + executeCompletedAt + lastRunStatus + } + } + } + } + } + } + } + } } ```
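The updated query above no longer requests test results on the exposure's parent models. If you still need them, for example to power a freshness and quality status tile, one option is a follow-up query that takes the parents' `uniqueId` values from the first response and asks for their latest execution and test results directly. This is a sketch that only uses fields shown in other examples on this page; the `uniqueIds` value is a placeholder.

```graphql
query ($environmentId: BigInt!, $first: Int!) {
  environment(id: $environmentId) {
    applied {
      models(first: $first, filter: { uniqueIds: ["MODEL.PROJECT.NAME"] }) {
        edges {
          node {
            name
            executionInfo {
              executeCompletedAt
              lastRunStatus
            }
            tests {
              name
              executionInfo {
                executeCompletedAt
                lastRunStatus
              }
            }
          }
        }
      }
    }
  }
}
```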
-### How has this model changed over time? -The Discovery API provides historical information about any resource in your project. For instance, you can view how a model has evolved over time (across recent runs) given changes to its shape and contents. +### How has this model changed over time? +The Discovery API provides historical information about any resource in your project. For instance, you can view how a model has evolved over time (across recent runs) given changes to its shape and contents.
Example query @@ -1067,54 +1104,69 @@ The Discovery API provides historical information about any resource in your pro Review the differences in `compiledCode` or `columns` between runs or plot the “Approximate Size” and “Row Count” `stats` over time: ```graphql -query(environmentId: Int!, uniqueId: String!, lastRunCount: Int!, withCatalog: Boolean!){ - modelByEnvironment(environmentId: $environmentId, uniqueId: $uniqueId, lastRunCount: $lastRunCount, withCatalog: $withCatalog) { - name - compiledCode - columns { - name - } - stats { - label - value - } - } +query ( + $environmentId: BigInt! + $uniqueId: String! + $lastRunCount: Int! + $withCatalog: Boolean! +) { + environment(id: $environmentId) { + applied { + modelHistoricalRuns( + uniqueId: $uniqueId + lastRunCount: $lastRunCount + withCatalog: $withCatalog + ) { + name + compiledCode + columns { + name + } + stats { + label + value + } + } + } + } } ```
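Alongside shape and size, you may also want to see how run health and build time have trended for the same model. A sketch like the following reuses the same `modelHistoricalRuns` endpoint, swapping the catalog fields for execution fields that appear in other examples for this endpoint; treat it as illustrative rather than exhaustive.

```graphql
query ($environmentId: BigInt!, $uniqueId: String!, $lastRunCount: Int!) {
  environment(id: $environmentId) {
    applied {
      modelHistoricalRuns(uniqueId: $uniqueId, lastRunCount: $lastRunCount) {
        runId
        runGeneratedAt
        status
        executionTime
      }
    }
  }
}
```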
### Which nodes depend on this data source? + dbt lineage begins with data sources. For a given source, you can look at which nodes are its children then iterate downstream to get the full list of dependencies. +Currently, querying beyond 1 generation (defined as a direct parent-to-child) is not supported. To see the grandchildren of a node, you need to make two queries: one to get the node and its children, and another to get the children nodes and their children.
Example query ```graphql -query($environmentId: Int!, $first: Int!){ - environment(id: $environmentId) { - applied { - sources(first: $first, filter:{uniqueIds:["SOURCE_NAME.TABLE_NAME"]}) { - edges { - node { - loader - children { - uniqueId - resourceType - ... on ModelAppliedStateNode { - database - schema - alias - children { - uniqueId - } - } - } - } - } - } - } - } +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + sources( + first: $first + filter: { uniqueIds: ["SOURCE_NAME.TABLE_NAME"] } + ) { + edges { + node { + loader + children { + uniqueId + resourceType + ... on ModelAppliedStateNestedNode { + database + schema + alias + } + } + } + } + } + } + } } ```
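The updated query above no longer asks for children of children on the nested model nodes. To keep iterating downstream, one option is to take the model `uniqueId` values returned by the first query and run a follow-up query against the applied models. This is a sketch under that assumption; replace the placeholder ID with the values returned by the first query.

```graphql
query ($environmentId: BigInt!, $first: Int!) {
  environment(id: $environmentId) {
    applied {
      models(first: $first, filter: { uniqueIds: ["MODEL.PROJECT.MODEL_NAME"] }) {
        edges {
          node {
            uniqueId
            children {
              uniqueId
              resourceType
            }
          }
        }
      }
    }
  }
}
```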
diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-environment-applied-modelHistoricalRuns.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment-applied-modelHistoricalRuns.mdx new file mode 100644 index 00000000000..d1463f9e9b7 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment-applied-modelHistoricalRuns.mdx @@ -0,0 +1,50 @@ +--- +title: "Model Historical Runs object schema" +sidebar_label: "Model historical runs" +id: "discovery-schema-environment-applied-modelHistoricalRuns" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The model historical runs object allows you to query information about a model's run history. + +The [Example query](#example-query) illustrates a few fields you can query with the `modelHistoricalRuns` object. Refer to [Fields](#fields) to view the entire schema, which provides all possible fields you can query. + +### Arguments + +When querying for `modelHistoricalRuns`, you can use the following arguments: + + + +### Example query + +You can use the `environmentId` and the model's `uniqueId` to return the model and its execution time for the last 20 times it was run, regardless of which job ran it. + +```graphql +query { + environment(id: 834) { + applied { + modelHistoricalRuns( + uniqueId: "model.marketing.customers" + lastRunCount: 20 + ) { + runId # Get historical results for a particular model + runGeneratedAt + executionTime # View build time across runs + status + tests { + name + status + executeCompletedAt + } # View test results across runs + } + } + } +} +``` + +### Fields + +When querying for `modelHistoricalRuns`, you can use the following fields: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx index 41fd5555c3f..a82bba6576d 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx @@ -4,28 +4,34 @@ sidebar_label: "Environment" id: "discovery-schema-environment" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; -This environment object allows you to query information about a particular model based on `environmentId`. +The environment object allows you to query information about a particular model based on `environmentId`. -The [example query](#example-query) illustrates a few fields you can query in this `environment` object. Refer to [Fields](#fields) to see the entire schema, which provides all possible fields you can query. +The [Example queries](#example-queries) illustrate a few fields you can query with this `environment` object. Refer to [Fields](#fields) to view the entire schema, which provides all possible fields you can query. ### Arguments When querying for `environment`, you can use the following arguments. - + +:::caution -### Example Query +dbt Labs is making changes to the Discovery API. These changes will take effect on August 15, 2023. -You can use your production environment's `id`: +The data type `Int` for `id` is being deprecated and will be replaced with `BigInt`. When the time comes, you will need to update your API call accordingly to avoid errors. 
+::: + +### Example queries + +You can use your production environment's `id`: ```graphql query Example { - environment(id: 834){ # Get the latest state of the production environment + environment(id: 834){ # Get the latest state of the production environment applied { # The state of an executed node as it exists as an object in the database models(first: 100){ # Pagination to ensure manageable response for large projects edges { node { @@ -34,8 +40,8 @@ query Example { executionInfo {executeCompletedAt, executionTime}, # Metadata from when the model was built tests {name, executionInfo{lastRunStatus, lastRunError}}, # Latest test results catalog {columns {name, description, type}, stats {label, value}}, # Catalog info - ancestors(types:[Source]) {name, ...on SourceAppliedStateNode {freshness{maxLoadedAt, freshnessStatus}}}, # Source freshness } - children {name, resourceType}}} # Immediate dependencies in lineage + ancestors(types:[Source]) {name, ...on SourceAppliedStateNode {freshness{maxLoadedAt, freshnessStatus}}}, # Source freshness } + children {name, resourceType}}} # Immediate dependencies in lineage totalCount } # Number of models in the project } definition { # The logical state of a given project node given its most recent manifest generated @@ -48,12 +54,50 @@ query Example { } ``` +With the deprecation of the data type `Int` for `id`, below is an example of replacing it with `BigInt`: + +```graphql +query ($environmentId: BigInt!, $first: Int!) { + environment(id: $environmentId) { + applied { + models(first: $first) { + edges { + node { + uniqueId + executionInfo { + lastRunId + } + } + } + } + } + } +} + +``` + +With the deprecation of `modelByEnvironment`, below is an example of replacing it with `environment`: + +```graphql +query ($environmentId: BigInt!, $uniqueId: String) { + environment(id: $environmentId) { + applied { + modelHistoricalRuns(uniqueId: $uniqueId) { + uniqueId + executionTime + executeCompletedAt + } + } + } +} +``` + ### Fields When querying an `environment`, you can use the following fields. -When querying the `applied` field of `environment`, you can use the following fields. +When querying the `applied` field of `environment`, you can use the following fields. When querying the `definition` field of `environment`, you can use the following fields. diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-exposure.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-exposure.mdx index d74f12223c5..aa1d27fd83c 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-exposure.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-exposure.mdx @@ -4,22 +4,25 @@ sidebar_label: "Exposure" id: "discovery-schema-exposure" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; -The exposure object allows you to query information about a particular exposure. You can learn more about exposures [here](/docs/build/exposures). +The exposure object allows you to query information about a particular exposure. To learn more, refer to [Add Exposures to your DAG](/docs/build/exposures). + +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + ### Arguments -When querying for an `exposure`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for an `exposure`, the following arguments are available. 
If you don't include a `runId`, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this exposure object. +Below we show some illustrative example queries and outline the schema of the exposure object. -### Example Queries -#### Exposure information +### Example query -The example query below queries information about an exposure, including the owner's name and email, the url, and information about parent sources and parent models. +The example below queries information about an exposure including the owner's name and email, the URL, and information about parent sources and parent models. ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-exposures.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-exposures.mdx index 5e3dcdd45a9..ba539c87dc8 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-exposures.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-exposures.mdx @@ -4,22 +4,25 @@ sidebar_label: "Exposures" id: "discovery-schema-exposures" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; -The exposures object allows you to query information about all exposures in a given job. You can learn more about exposures [here](/docs/build/exposures). +The exposures object allows you to query information about all exposures in a given job. To learn more, refer to [Add Exposures to your DAG](/docs/build/exposures). + +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + ### Arguments -When querying for `exposures`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `exposures`, the following arguments are available. If you don't include a `runId`, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this exposures object. +Below we show some illustrative example queries and outline the schema of the exposures object. -### Example Queries -#### Exposures information +### Example query -The example query below queries information about all exposures in a given job, including, for each exposure, the owner's name and email, the url, and information about parent sources and parent models. +The example below queries information about all exposures in a given job including the owner's name and email, the URL, and information about parent sources and parent models for each exposure. ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposure.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposure.mdx new file mode 100644 index 00000000000..58855659d05 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposure.mdx @@ -0,0 +1,64 @@ +--- +title: "Exposure object schema" +sidebar_label: "Exposure" +id: "discovery-schema-job-exposure" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The exposure object allows you to query information about a particular exposure. To learn more, refer to [Add Exposures to your DAG](/docs/build/exposures). + +### Arguments + +When querying for an `exposure`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the exposure object. 
+ +### Example query + +The example below queries information about an exposure including the owner's name and email, the URL, and information about parent sources and parent models. + +```graphql +{ + job(id: 123) { + exposure(name: "my_awesome_exposure") { + runId + projectId + name + uniqueId + resourceType + ownerName + url + ownerEmail + parentsSources { + uniqueId + sourceName + name + state + maxLoadedAt + criteria { + warnAfter { + period + count + } + errorAfter { + period + count + } + } + maxLoadedAtTimeAgoInS + } + parentsModels { + uniqueId + } + } + } +} +``` + +### Fields +When querying for an `exposure`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposures.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposures.mdx new file mode 100644 index 00000000000..b4fe027e324 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-exposures.mdx @@ -0,0 +1,65 @@ +--- +title: "Exposures object schema" +sidebar_label: "Exposures" +id: "discovery-schema-job-exposures" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The exposures object allows you to query information about all exposures in a given job. To learn more, refer to [Add Exposures to your DAG](/docs/build/exposures). + + +### Arguments + +When querying for `exposures`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the exposures object. + +### Example query + +The example below queries information about all exposures in a given job including the owner's name and email, the URL, and information about parent sources and parent models for each exposure. + +```graphql +{ + job(id: 123) { + exposures(jobId: 123) { + runId + projectId + name + uniqueId + resourceType + ownerName + url + ownerEmail + parentsSources { + uniqueId + sourceName + name + state + maxLoadedAt + criteria { + warnAfter { + period + count + } + errorAfter { + period + count + } + } + maxLoadedAtTimeAgoInS + } + parentsModels { + uniqueId + } + } + } +} +``` + +### Fields +When querying for `exposures`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metric.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metric.mdx new file mode 100644 index 00000000000..1f1f490d62f --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metric.mdx @@ -0,0 +1,58 @@ +--- +title: "Metric object schema" +sidebar_label: "Metric" +id: "discovery-schema-job-metric" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The metric object allows you to query information about [metrics](/docs/build/metrics). + +### Arguments + +When querying for a `metric`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the metric object. + +### Example query + +The example query below outputs information about a metric. You can also add any field from the model endpoint (the example simply selects name). This includes schema, database, uniqueId, columns, and more. For details, refer to [Model object schema](/docs/dbt-cloud-apis/discovery-schema-model). 
+ + +```graphql +{ + job(id: 123) { + metric(uniqueId: "metric.jaffle_shop.new_customers") { + uniqueId + name + packageName + tags + label + runId + description + type + sql + timestamp + timeGrains + dimensions + meta + resourceType + filters { + field + operator + value + } + model { + name + } + } + } +} +``` + +### Fields +When querying for a `metric`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metrics.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metrics.mdx new file mode 100644 index 00000000000..174dd5b676a --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-metrics.mdx @@ -0,0 +1,60 @@ +--- +title: "Metrics object schema" +sidebar_label: "Metrics" +id: "discovery-schema-job-metrics" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The metrics object allows you to query information about [metrics](/docs/build/metrics). + + +### Arguments + +When querying for `metrics`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the metrics object. + +### Example query + +The example query returns information about all metrics for the given job. + +```graphql +{ + job(id: 123) { + metrics { + uniqueId + name + packageName + tags + label + runId + description + type + sql + timestamp + timeGrains + dimensions + meta + resourceType + filters { + field + operator + value + } + model { + name + } + } + } +} +``` + +### Fields +The metrics object can access the _same fields_ as the [metric node](/docs/dbt-cloud-apis/discovery-schema-job-metric). The difference is that the metrics object can output a list so instead of querying for fields for one specific metric, you can query for those parameters for all metrics in a run. + +When querying for `metrics`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-model.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-model.mdx new file mode 100644 index 00000000000..abd1ca1b1d6 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-model.mdx @@ -0,0 +1,91 @@ +--- +title: "Model object schema" +sidebar_label: "Model" +id: "discovery-schema-job-model" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The model object allows you to query information about a particular model in a given job. + +### Arguments + +When querying for a `model`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the model object. + +### Example query for finding parent models and sources + +The example query below uses the `parentsModels` and `parentsSources` fields to fetch information about a model’s parent models and parent sources. The jobID and uniqueID fields are placeholders that you will need to replace with your own values. + +```graphql +{ + job(id: 123) { + model(uniqueId: "model.jaffle_shop.dim_user") { + parentsModels { + runId + uniqueId + executionTime + } + parentsSources { + runId + uniqueId + state + } + } + } +} + +``` + +### Example query for model timing + +The example query below could be useful if you want to understand information around execution timing on a given model (start, end, completion). 
+ +```graphql +{ + job(id: 123) { + model(uniqueId: "model.jaffle_shop.dim_user") { + runId + projectId + name + uniqueId + resourceType + executeStartedAt + executeCompletedAt + executionTime + } + } +} +``` + +### Example query for column-level information + +You can use the following example query to understand more about the columns of a given model. This query will only work if the job has generated documentation; that is, it will work with the command `dbt docs generate`. + +```graphql +{ + job(id: 123) { + model(uniqueId: "model.jaffle_shop.dim_user") { + columns { + name + index + type + comment + description + tags + meta + } + } + } +} +``` + + +### Fields + +When querying for a `model`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-models.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-models.mdx new file mode 100644 index 00000000000..ee512f3cd97 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-models.mdx @@ -0,0 +1,59 @@ +--- +title: "Models object schema" +sidebar_label: "Models" +id: "discovery-schema-job-models" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + + +The models object allows you to query information about all models in a given job. + +### Arguments + +When querying for `models`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the models object. + +### Example queries +The database, schema, and identifier arguments are all optional. This means that with this endpoint you can: + +- Find a specific model by providing `..` +- Find all of the models in a database and/or schema by providing `` and/or `` + +#### Find models by their database, schema, and identifier +The example query below finds a model by its unique database, schema, and identifier. + +```graphql +{ + job(id: 123) { + models(database:"analytics", schema: "analytics", identifier:"dim_customers") { + uniqueId + } + } +} +``` + +#### Find models by their schema +The example query below finds all models in this schema and their respective execution times. + +```graphql +{ + job(id: 123) { + models(schema: "analytics") { + uniqueId + executionTime + } + } +} +``` + + +### Fields +The models object can access the _same fields_ as the [Model node](/docs/dbt-cloud-apis/discovery-schema-job-model). The difference is that the models object can output a list so instead of querying for fields for one specific model, you can query for those parameters for all models within a jobID, database, and so on. + +When querying for `models`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seed.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seed.mdx new file mode 100644 index 00000000000..924e3e87e91 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seed.mdx @@ -0,0 +1,42 @@ +--- +title: "Seed object schema" +sidebar_label: "Seed" +id: "discovery-schema-job-seed" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The seed object allows you to query information about a particular seed in a given job. + +### Arguments + +When querying for a `seed`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the seed object. + +### Example query + +The example query below pulls relevant information about a given seed. For instance, you can view the load time. 
+ +```graphql +{ + job(id: 123) { + seed(uniqueId: "seed.jaffle_shop.raw_customers") { + database + schema + uniqueId + name + status + error + } + } +} +``` + +### Fields + +When querying for a `seed`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seeds.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seeds.mdx new file mode 100644 index 00000000000..6ed45216e5f --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-seeds.mdx @@ -0,0 +1,40 @@ +--- +title: "Seeds object schema" +sidebar_label: "Seeds" +id: "discovery-schema-job-seeds" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The seeds object allows you to query information about all seeds in a given job. + +### Arguments + +When querying for `seeds`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the seeds object. + +### Example query + +The example query below pulls relevant information about all seeds in a given job. For instance, you can view load times. + +```graphql +{ + job(id: 123) { + seeds { + uniqueId + name + executionTime + status + } + } +} +``` + +### Fields + +When querying for `seeds`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-snapshots.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-snapshots.mdx new file mode 100644 index 00000000000..a57163e0554 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-snapshots.mdx @@ -0,0 +1,49 @@ +--- +title: "Snapshots object schema" +sidebar_label: "Snapshots" +id: "discovery-schema-job-snapshots" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The snapshots object allows you to query information about all snapshots in a given job. + +### Arguments + +When querying for `snapshots`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the snapshots object. + +### Example query + +The database, schema, and identifier arguments are optional. This means that with this endpoint you can: + +- Find a specific snapshot by providing `..` +- Find all of the snapshots in a database and/or schema by providing `` and/or `` + +#### Find snapshots information for a job + +The example query returns information about all snapshots in this job. + +```graphql +{ + job(id: 123) { + snapshots { + uniqueId + name + executionTime + environmentId + executeStartedAt + executeCompletedAt + } + } +} +``` + +### Fields + +When querying for `snapshots`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-source.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-source.mdx new file mode 100644 index 00000000000..972e929f4cd --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-source.mdx @@ -0,0 +1,52 @@ +--- +title: "Source object schema" +sidebar_label: "Source" +id: "discovery-schema-job-source" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The source object allows you to query information about a particular source in a given job. + +### Arguments + +When querying for a `source`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the source object. + +### Example query + +The query below pulls relevant information about a given source. 
For instance, you can view the load time and the state (pass, fail, error) of that source. + +```graphql +{ + job(id: 123) { + source(uniqueId: "source.jaffle_shop.snowplow.event") { + uniqueId + sourceName + name + state + maxLoadedAt + criteria { + warnAfter { + period + count + } + errorAfter { + period + count + } + } + maxLoadedAtTimeAgoInS + } + } +} +``` + +### Fields + +When querying for a `source`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-sources.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-sources.mdx new file mode 100644 index 00000000000..97f717d269a --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-sources.mdx @@ -0,0 +1,65 @@ +--- +title: "Sources object schema" +sidebar_label: "Sources" +id: "discovery-schema-job-sources" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The sources object allows you to query information about all sources in a given job. + +### Arguments + +When querying for `sources`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema of the sources object. + +### Example queries + +The database, schema, and identifier arguments are optional. This means that with this endpoint you can: + +- Find a specific source by providing `..` +- Find all of the sources in a database and/or schema by providing `` and/or `` + +#### Finding sources by their database, schema, and identifier + +The example query below finds a source by its unique database, schema, and identifier. + +```graphql +{ + job(id: 123) { + sources( + database: "analytics" + schema: "analytics" + identifier: "dim_customers" + ) { + uniqueId + } + } +} +``` + +#### Finding sources by their schema + +The example query below finds all sources in this schema and their respective states (pass, error, fail). + +```graphql +{ + job(id: 123) { + sources(schema: "analytics") { + uniqueId + state + } + } +} +``` + +### Fields + +The sources object can access the _same fields_ as the [source node](/docs/dbt-cloud-apis/discovery-schema-job-source). The difference is that the sources object can output a list so instead of querying for fields for one specific source, you can query for those parameters for all sources within a jobID, database, and so on. + +When querying for `sources`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-test.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-test.mdx new file mode 100644 index 00000000000..c52aa49ab93 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-test.mdx @@ -0,0 +1,43 @@ +--- +title: "Test object schema" +sidebar_label: "Test" +id: "discovery-schema-job-test" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The test object allows you to query information about a particular test. + +### Arguments + +When querying for a `test`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the test object. + +### Example query + +The example query below outputs information about a test including the state of the test result. In order of severity, the result can be one of these: "error", "fail", "warn", or "pass". 
+ +```graphql +{ + job(id: 123) { + test(uniqueId: "test.internal_analytics.not_null_metrics_id") { + runId + accountId + projectId + uniqueId + name + columnName + state + } + } +} +``` + +### Fields + +When querying for a `test`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job-tests.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-tests.mdx new file mode 100644 index 00000000000..efcef674c55 --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job-tests.mdx @@ -0,0 +1,43 @@ +--- +title: "Tests object schema" +sidebar_label: "Tests" +id: "discovery-schema-job-tests" +--- + +import { NodeArgsTable, SchemaTable } from "./schema"; + +The tests object allows you to query information about all tests in a given job. + +### Arguments + +When querying for `tests`, the following arguments are available. + + + +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the tests object. + +### Example query + +The example query below finds all tests in this job and includes information about those tests. + +```graphql +{ + job(id: 123) { + tests { + runId + accountId + projectId + uniqueId + name + columnName + state + } + } +} +``` + +### Fields + +When querying for `tests`, the following fields are available: + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx new file mode 100644 index 00000000000..bb30786e19d --- /dev/null +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx @@ -0,0 +1,62 @@ +--- +title: "Job object schema" +sidebar_label: "Job" +id: "discovery-schema-job" +--- + +import { QueryArgsTable, SchemaTable } from "./schema"; + +The job object allows you to query information about a particular model based on `jobId` and, optionally, a `runId`. + +If you don't provide a `runId`, the API returns information on the latest runId of a job. + +The [example query](#example-query) illustrates a few fields you can query in this `job` object. Refer to [Fields](#fields) to see the entire schema, which provides all possible fields you can query. + +### Arguments + +When querying for `job`, you can use the following arguments. + + + + +### Example Query + +You can use your production job's `id`. + +```graphql +query JobQueryExample { + # Provide runId for looking at specific run, otherwise it defaults to latest run + job(id: 940) { + # Get all models from this job's latest run + models(schema: "analytics") { + uniqueId + executionTime + } + + # Or query a single node + source(uniqueId: "source.jaffle_shop.snowplow.event") { + uniqueId + sourceName + name + state + maxLoadedAt + criteria { + warnAfter { + period + count + } + errorAfter { + period + count + } + } + maxLoadedAtTimeAgoInS + } + } +} +``` + +### Fields +When querying an `job`, you can use the following fields. + + diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-metric.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-metric.mdx index 2280c6f7802..aee04ba2cce 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-metric.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-metric.mdx @@ -4,22 +4,25 @@ sidebar_label: "Metric" id: "discovery-schema-metric" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The metric object allows you to query information about [metrics](/docs/build/metrics). 
+import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for a `metric`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for a `metric`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema (all possible fields you can query) of this metric object. +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the metric object. -### Example Queries -#### Metric information +### Example query -The example query below outputs information about a metric. Note that you can also add any field from the Model endpoint -- here we are simply selecting name. This includes schema, database, uniqueId, columns and more -- find documentation [here](/docs/dbt-cloud-apis/discovery-schema-model). +The example query below outputs information about a metric. You can also add any field from the model endpoint (the example simply selects name). This includes schema, database, uniqueId, columns, and more. For details, refer to [Model object schema](/docs/dbt-cloud-apis/discovery-schema-model). ```graphql diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-metrics.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-metrics.mdx index 5242eb717dc..30d8d68b365 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-metrics.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-metrics.mdx @@ -4,22 +4,25 @@ sidebar_label: "Metrics" id: "discovery-schema-metrics" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The metrics object allows you to query information about [metrics](/docs/build/metrics). +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `metrics`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `metrics`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema (all possible fields you can query) of this metrics object. +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the metrics object. -### Example Queries -#### Metrics information +### Example query -The example query returns information about all metrics in this job. +The example query returns information about all metrics for the given job. ```graphql { @@ -52,7 +55,7 @@ The example query returns information about all metrics in this job. ``` ### Fields -metrics has access to the *same fields* as the [metric node](/docs/dbt-cloud-apis/discovery-schema-metric). The difference is that metrics can output a list, so instead of querying for fields for one specific metric, you can query for those parameters for all metrics in a run. +The metrics object can access the _same fields_ as the [metric node](/docs/dbt-cloud-apis/discovery-schema-metric). 
The difference is that the metrics object can output a list so instead of querying for fields for one specific metric, you can query for those parameters for all metrics in a run. When querying for `metrics`, the following fields are available: diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-model.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-model.mdx index 3fb43edaded..7206fb9a51c 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-model.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-model.mdx @@ -4,22 +4,25 @@ sidebar_label: "Model" id: "discovery-schema-model" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The model object allows you to query information about a particular model in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for a `model`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for a `model`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema (all possible fields you can query) of this model object. +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the model object. -### Example Queries -#### Finding parent models and sources +### Example query for finding parent models and sources -The example query below uses the `parentsModels` and `parentsSources` fields to fetch information about a model’s parent models and parent sources. Note that we put a placeholder jobID and uniqueID, which you will have to replace. +The example query below uses the `parentsModels` and `parentsSources` fields to fetch information about a model’s parent models and parent sources. The jobID and uniqueID fields are placeholders that you will need to replace with your own values. ```graphql { @@ -38,9 +41,9 @@ The example query below uses the `parentsModels` and `parentsSources` fields to } ``` -#### Model Timing +### Example query for model timing -The example query below could be useful if we wanted to understand information around execution timing on a given model (start, end, completion). +The example query below could be useful if you wanted to understand information around execution timing on a given model (start, end, completion). ```graphql { @@ -57,9 +60,10 @@ The example query below could be useful if we wanted to understand information a } ``` -#### Column-level information +### Example query for column-level information + +You can use the following example query to understand more about the columns of a given model. This query will only work if the job has generated documentation; that is, it will work with the command `dbt docs generate`. -You can use the following example query to understand more about the columns of a given model. Note that this will only work if the job has generated documentation. For example it will work with the command `dbt docs generate`. 
```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx index 078d2512256..dade3d32d8a 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx @@ -4,24 +4,34 @@ sidebar_label: "Model by environment" id: "discovery-schema-modelByEnv" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; - +:::caution + +dbt Labs is making changes to the Discovery API. These changes will take effect on August 15, 2023. + +The model by environment object is being deprecated and will be replaced with `environment { applied { modelHistoricalRuns } }`. When the time comes, you will need to update your API calls accordingly to avoid errors. Refer to the +[Environment object schema](/docs/dbt-cloud-apis/discovery-schema-environment) for details. +::: This model by environment object allows you to query information about a particular model based on `environmentId`. -The [example query](#example-query) illustrates a few fields you can query in this `modelByEnvironment` object. Refer to [Fields](#fields) to see the entire schema, which provides all possible fields you can query. +The [Example query](#example-query) illustrates a few fields you can query in the `modelByEnvironment` object. Refer to [Fields](#fields) to see the entire schema, which provides all possible fields you can query. + +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + ### Arguments When querying for `modelByEnvironment`, you can use the following arguments. - + -### Example Query +### Example query -You can use the `environment_id` and `model_unique_id` to return the model and its execution time for the last 20 times it was run, regardless of which job ran it. +You can use the `environmentId` and the model's `uniqueId` to return the model and its execution time for the last 20 times it was run, regardless of which job ran it. ```graphql diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-models.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-models.mdx index a3215eee039..f813d6a9ccf 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-models.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-models.mdx @@ -4,25 +4,29 @@ sidebar_label: "Models" id: "discovery-schema-models" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The models object allows you to query information about all models in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `models`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `models`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this models object. +Below we show some illustrative example queries and outline the schema of the models object. -### Example Queries -As we noted above, database, schema, and identifier are all optional arguments. 
This means that with this endpoint, you can: +### Example queries +The database, schema, and identifier arguments are optional. This means that with this endpoint you can: - Find a specific model by providing `..` - Find all of the models in a database and/or schema by providing `` and/or `` -#### Finding models by their database, schema, and identifier +#### Find models by their database, schema, and identifier The example query below finds a model by its unique database, schema, and identifier. ```graphql @@ -33,8 +37,8 @@ The example query below finds a model by its unique database, schema, and identi } ``` -#### Finding models by their schema -The example query below finds all models in this schema, and their respective execution times. +#### Find models by their schema +The example query below finds all models in this schema and their respective execution times. ```graphql { @@ -47,7 +51,7 @@ The example query below finds all models in this schema, and their respective ex ### Fields -Models has access to the *same fields* as the [Model node](/docs/dbt-cloud-apis/discovery-schema-model). The difference is that Models can output a list, so instead of querying for fields for one specific model, you can query for those parameters for all models within a jobID, database, etc. +The models object can access the _same fields_ as the [Model node](/docs/dbt-cloud-apis/discovery-schema-model). The difference is that the models object can output a list so instead of querying for fields for one specific model, you can query for those parameters for all models within a jobID, database, and so on. When querying for `models`, the following fields are available: diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-seed.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-seed.mdx index 1047545a8be..110f417769b 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-seed.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-seed.mdx @@ -4,23 +4,25 @@ sidebar_label: "Seed" id: "discovery-schema-seed" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The seed object allows you to query information about a particular seed in a given job. -### Arguments +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + -When querying for a `seed`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +### Arguments - +When querying for a `seed`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: -Below we show some illustrative example queries and outline the schema of this seed object. + -### Example Queries +Below we show some illustrative example queries and outline the schema of the seed object. -#### Seed information +### Example query -The query below pulls relevant information about a given seed. For example, we could see the load time. +The query below pulls relevant information about a given seed. For instance, you can view the load time. 
```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-seeds.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-seeds.mdx index 2cee2b8aa3f..c0a45664e38 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-seeds.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-seeds.mdx @@ -4,22 +4,25 @@ sidebar_label: "Seeds" id: "discovery-schema-seeds" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The seeds object allows you to query information about a all seeds in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `seeds`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `seeds`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this seeds object. +Below we show some illustrative example queries and outline the schema of the seeds object. -### Example Queries -#### Seeds information +### Example query -The query below pulls relevant information about all seeds in a given job. For example, we could see the load times. +The query below pulls relevant information about all seeds in a given job. For instance, you can view load times. ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-snapshots.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-snapshots.mdx index b3f7071319f..4d2316cfc6a 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-snapshots.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-snapshots.mdx @@ -4,24 +4,28 @@ sidebar_label: "Snapshots" id: "discovery-schema-snapshots" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The snapshots object allows you to query information about all snapshots in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `snapshots`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `snapshots`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this snapshots object. +Below we show some illustrative example queries and outline the schema of the snapshots object. -### Example Query -As we noted above, database, schema, and identifier are all optional arguments. This means that with this endpoint, you can: +### Example query +The database, schema, and identifier arguments are optional. This means that with this endpoint you can: - Find a specific snapshot by providing `..` - Find all of the snapshots in a database and/or schema by providing `` and/or `` -#### Finding snapshots information for a job +#### Find snapshots information for a job The example query returns information about all snapshots in this job. ```graphql @@ -39,7 +43,7 @@ The example query returns information about all snapshots in this job. 
``` ### Fields -Snapshots has access to the *same fields* as the [Snapshot node](/docs/dbt-cloud-apis/discovery-schema-snapshots). The difference is that Snapshots can output a list, so instead of querying for fields for one specific snapshot, you can query for those parameters for all snapshots within a jobID, database, etc. +The snapshots object can access the _same fields_ as the [Snapshot node](/docs/dbt-cloud-apis/discovery-schema-snapshots). The difference is that the snapshots object can output a list so instead of querying for fields for one specific snapshot, you can query for those parameters for all snapshots within a jobID, database, and so on. When querying for `snapshots`, the following fields are available: diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-source.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-source.mdx index 87d776282fe..3d632a035e3 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-source.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-source.mdx @@ -4,23 +4,25 @@ sidebar_label: "Source" id: "discovery-schema-source" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The source object allows you to query information about a particular source in a given job. -### Arguments +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + -When querying for a `source`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +### Arguments - +When querying for a `source`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: -Below we show some illustrative example queries and outline the schema of this source object. + -### Example Queries +Below we show some illustrative example queries and outline the schema of the source object. -#### Source information +### Example query -The query below pulls relevant information about a given source. For example, we could see the load time and the state (“pass”, “fail”, “error”) of that source. +The example query below pulls relevant information about a given source. For instance, you can view the load time and the state (pass, fail, error) of that source. ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-sources.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-sources.mdx index a719c5caf92..591f8e0307c 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-sources.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-sources.mdx @@ -4,25 +4,29 @@ sidebar_label: "Sources" id: "discovery-schema-sources" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The sources object allows you to query information about all sources in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `sources`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `sources`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema of this sources object. 
+Below we show some illustrative example queries and outline the schema of the sources object. -### Example Queries -As we noted above, database, schema, and identifier are all optional arguments. This means that with this endpoint, you can: +### Example queries +The database, schema, and identifier arguments are optional. This means that with this endpoint you can: - Find a specific source by providing `..` - Find all of the sources in a database and/or schema by providing `` and/or `` -#### Finding sources by their database, schema, and identifier +#### Find sources by their database, schema, and identifier The example query below finds a source by its unique database, schema, and identifier. ```graphql @@ -33,8 +37,8 @@ The example query below finds a source by its unique database, schema, and ident } ``` -#### Finding sources by their schema -The example query below finds all sources in this schema, and their respective states (pass, error, fail). +#### Find sources by their schema +The example query below finds all sources in this schema and their respective states (pass, error, fail). ```graphql { @@ -46,7 +50,7 @@ The example query below finds all sources in this schema, and their respective s ``` ### Fields -Sources has access to the *same fields* as the [Source node](/docs/dbt-cloud-apis/discovery-schema-source). The difference is that Sources can output a list, so instead of querying for fields for one specific source, you can query for those parameters for all sources within a jobID, database, etc. +The sources object can access the _same fields_ as the [Source node](/docs/dbt-cloud-apis/discovery-schema-source). The difference is that the sources object can output a list so instead of querying for fields for one specific source, you can query for those parameters for all sources within a jobID, database, and so on. When querying for `sources`, the following fields are available: diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-test.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-test.mdx index 2ee915d27c7..ea22a81fc8e 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-test.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-test.mdx @@ -4,22 +4,25 @@ sidebar_label: "Test" id: "discovery-schema-test" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The test object allows you to query information about a particular test. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for a `test`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for a `test`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema (all possible fields you can query) of this test object. +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the test object. -### Example Queries -#### Test result +### Example query -The example query below outputs information about a test, including the state of the test result. This can be one of, in order of severity, "error", "fail", "warn", "pass." +The example query below outputs information about a test including the state of the test result. 
In order of severity, the result can be one of these: "error", "fail", "warn", or "pass". ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-tests.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-tests.mdx index 7f087c85fee..250a73cea5d 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-tests.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-tests.mdx @@ -4,23 +4,26 @@ sidebar_label: "Tests" id: "discovery-schema-tests" --- -import { ArgsTable, SchemaTable } from "./schema"; +import { QueryArgsTable, SchemaTable } from "./schema"; The tests object allows you to query information about all tests in a given job. +import DiscoveryApiJobDeprecationNotice from '/snippets/_discovery_api_job_deprecation_notice.md'; + + + ### Arguments -When querying for `tests`, the following arguments are available. Note that if you do not include a runId, it will default to the most recent run of the specified job: +When querying for `tests`, the following arguments are available. If you don't include a runId, the API defaults to the most recent run of the specified job: - + -Below we show some illustrative example queries and outline the schema (all possible fields you can query) of this tests object. +Below we show some illustrative example queries and outline the schema (all possible fields you can query) of the tests object. -### Example Queries -#### Tests result +### Example query -The example query below finds all tests in this job, and includes information about those tests. +The example query below finds all tests in this job and includes information about those tests. ```graphql { diff --git a/website/docs/docs/dbt-cloud-apis/schema.jsx b/website/docs/docs/dbt-cloud-apis/schema.jsx index 8b9bbc358f0..ea6660251f6 100644 --- a/website/docs/docs/dbt-cloud-apis/schema.jsx +++ b/website/docs/docs/dbt-cloud-apis/schema.jsx @@ -1,6 +1,52 @@ -import React, { setState } from "react"; +import React from "react"; import { useState, useEffect } from 'react' -const queriesQuery = `{ + +const getTypeString = (typeStructure) => { + // Helper function to represent GraphQL type + if (!typeStructure) return '' + + if (typeStructure.kind === 'NON_NULL') { + return `${getTypeString(typeStructure.ofType)}!`; + } else if (typeStructure.kind === 'LIST') { + return `[${getTypeString(typeStructure.ofType)}]`; + } else if (['OBJECT', 'SCALAR', 'ENUM'].includes(typeStructure.kind)) { + return `${typeStructure.name}${getTypeString(typeStructure.ofType)}`; + } else { + return ''; + } +} + +export const ArgsTable = ({ data, name }) => { + return ( + + + + + + + + + + + {data.fields.find(d => d.name === name).args.map(function ({ name, description, type }) { + return ( + + + + + + + ) + })} + +
+          <th>Field</th>
+          <th>Type</th>
+          <th>Required?</th>
+          <th>Description</th>
+              <td>{name}</td>
+              <td>{getTypeString(type)}</td>
+              <td>{type.kind === 'NON_NULL' ? `Yes` : `No`}</td>
+              <td>{description || `No description provided`}</td>
+ ) +} + +const metadataUrl = 'https://metadata.cloud.getdbt.com/graphql' +const metadataBetaUrl = 'https://metadata.cloud.getdbt.com/beta/graphql' + +const queryArgsQuery = `{ __schema { queryType { fields { @@ -18,23 +64,22 @@ const queriesQuery = `{ name description kind - ofType { name description } + ofType { kind name description } } } } } } }` -const metadataUrl = 'https://metadata.cloud.getdbt.com/graphql' -const metadataBetaUrl = 'https://metadata.cloud.getdbt.com/beta/graphql' -export const ArgsTable = ({ queryName, useBetaAPI }) => { + +export const QueryArgsTable = ({ queryName, useBetaAPI }) => { const [data, setData] = useState(null) useEffect(() => { const fetchData = () => { fetch(useBetaAPI ? metadataBetaUrl : metadataUrl, { method: "POST", headers: { "Content-Type": "application/json" }, - body: JSON.stringify({ query: queriesQuery }), + body: JSON.stringify({ query: queryArgsQuery }), }) .then((result) => result.json()) .then((data) => setData(data)) @@ -45,33 +90,89 @@ export const ArgsTable = ({ queryName, useBetaAPI }) => { return

Fetching data...

} return ( - - - - - - - - - - - {data.data.__schema.queryType.fields.find(d => d.name === queryName).args.map(function ({ name, description, type }) { - return ( - - - {type.ofType ? - : - + + ) +} + +export const NodeArgsTable = ({ parent, name, useBetaAPI }) => { + const [data, setData] = useState(null) + useEffect(() => { + const fetchData = () => { + fetch(useBetaAPI ? metadataBetaUrl : metadataUrl, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ + query: ` + query { + __type(name: "${parent}") { + ...FullType + } + } + + fragment FullType on __Type { + kind + fields(includeDeprecated: true) { + name + description + args { + name + description + defaultValue + type { + ...TypeRef + } } - - - - ) - })} - -
-          <th>Field</th>
-          <th>Type</th>
-          <th>Required?</th>
-          <th>Description</th>
-              <td>{name}</td>
-                <td>{type.ofType.name}</td>
-                <td>{type.name}</td>
-              <td>{type.kind === 'NON_NULL' ? `Yes` : `No`}</td>
-              <td>{description || `No description provided`}</td>
+ } + } + + # get several levels + fragment TypeRef on __Type { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + } + } + } + } + } + } + } + } + `}) + }) + .then((result) => result.json()) + .then((data) => setData(data)) + } + fetchData() + }, []) + if (!data) { + return

Fetching data...

+ } + return ( + ) } + export const SchemaTable = ({ nodeName, useBetaAPI }) => { const [data, setData] = useState(null) useEffect(() => { @@ -80,27 +181,60 @@ export const SchemaTable = ({ nodeName, useBetaAPI }) => { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ - query: `{ - __type(name: "${nodeName}") { - fields { + query: ` + query { + __type(name: "${nodeName}") { + ...FullType + } + } + + fragment FullType on __Type { + kind + name + description + fields(includeDeprecated: true) { name description - type { - name - description - kind - ofType { - name - description - ofType { - name - description - } - } + type { + ...TypeRef } } } - }`}), + + # get several levels + fragment TypeRef on __Type { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + ofType { + kind + name + } + } + } + } + } + } + } + } + `}), }) .then((result) => result.json()) .then((data) => setData(data)) @@ -124,13 +258,7 @@ export const SchemaTable = ({ nodeName, useBetaAPI }) => { return ( {name} - {type.kind === 'LIST' ? - [{type.ofType.ofType ? type.ofType.ofType.name : type.ofType.name}] : - (type.ofType ? - {type.ofType.name} : - {type.name} - ) - } + {getTypeString(type)} {description} ) @@ -138,4 +266,4 @@ export const SchemaTable = ({ nodeName, useBetaAPI }) => { ) -} \ No newline at end of file +} diff --git a/website/sidebars.js b/website/sidebars.js index a789f27ab2e..e319f4d49bf 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -457,7 +457,51 @@ const sidebarSettings = { type: "category", label: "Schema", items: [ - "docs/dbt-cloud-apis/discovery-schema-environment", + { + type: "category", + label: "Job", + link: { type: "doc", id: "docs/dbt-cloud-apis/discovery-schema-job" }, + items: [ + "docs/dbt-cloud-apis/discovery-schema-job-model", + "docs/dbt-cloud-apis/discovery-schema-job-models", + "docs/dbt-cloud-apis/discovery-schema-job-metric", + "docs/dbt-cloud-apis/discovery-schema-job-metrics", + "docs/dbt-cloud-apis/discovery-schema-job-source", + "docs/dbt-cloud-apis/discovery-schema-job-sources", + "docs/dbt-cloud-apis/discovery-schema-job-seed", + "docs/dbt-cloud-apis/discovery-schema-job-seeds", + // "docs/dbt-cloud-apis/discovery-schema-job-snapshot", + "docs/dbt-cloud-apis/discovery-schema-job-snapshots", + "docs/dbt-cloud-apis/discovery-schema-job-test", + "docs/dbt-cloud-apis/discovery-schema-job-tests", + "docs/dbt-cloud-apis/discovery-schema-job-exposure", + "docs/dbt-cloud-apis/discovery-schema-job-exposures", + // "docs/dbt-cloud-apis/discovery-schema-job-macro", + // "docs/dbt-cloud-apis/discovery-schema-job-macros", + ], + }, + { + type: "category", + label: "Environment", + link: { type: "doc", id: "docs/dbt-cloud-apis/discovery-schema-environment" }, + items: [ + { + type: "category", + label: "Applied", + items: [ + "docs/dbt-cloud-apis/discovery-schema-environment-applied-modelHistoricalRuns", + ], + }, + // Uncomment to add Definition subpage, but need to make items non-empty + // { + // type: "category", + // label: "Definition", + // items: [ + // // insert pages here + // ], + // }, + ], + }, "docs/dbt-cloud-apis/discovery-schema-model", "docs/dbt-cloud-apis/discovery-schema-models", "docs/dbt-cloud-apis/discovery-schema-modelByEnv", @@ -932,7 +976,7 @@ const sidebarSettings = { "guides/orchestration/airflow-and-dbt-cloud/3-running-airflow-and-dbt-cloud", 
"guides/orchestration/airflow-and-dbt-cloud/4-airflow-and-dbt-cloud-faqs", ], - }, + }, { type: "category", label: "Set up Continuous Integration", diff --git a/website/snippets/_discovery_api_job_deprecation_notice.md b/website/snippets/_discovery_api_job_deprecation_notice.md new file mode 100644 index 00000000000..71e80a958b4 --- /dev/null +++ b/website/snippets/_discovery_api_job_deprecation_notice.md @@ -0,0 +1,7 @@ +:::caution +dbt Labs is making changes to the Discovery API. These changes will take effect on September 7, 2023. + +The data type `Int` for `id` is being deprecated and will be replaced with `BigInt`. Currently, both data types are supported. + +To perform job-based queries, you must do it within the `job` schema object, and move the `jobId` and `runId` arguments to `job(...)`. This is now supported so you can update your API calls accordingly. For details, refer to [Job object schema](/docs/dbt-cloud-apis/discovery-schema-job). +:::