updated according to dbt-teradata 1.8.2 #6577

Open
wants to merge 3 commits into
base: current

tallamohan
Contributor

What are you changing in this pull request and why?

Updated the setup and config pages of dbt-teradata according to the latest release dbt-teradata 1.8.2

@tallamohan requested review from dataders and a team as code owners on December 3, 2024 09:33

vercel bot commented Dec 3, 2024

@tallamohan is attempting to deploy a commit to the dbt-labs Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions bot added the `content` (Improvements or additions to content) and `size: medium` (This change will take up to a week to address) labels on Dec 3, 2024

vercel bot commented Dec 10, 2024

The latest updates on your projects.

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| docs-getdbt-com | ❌ Failed | | Dec 10, 2024 1:38am |

@mirnawong1
Contributor

looping in @amychen1776 as an fyi

@@ -210,7 +209,8 @@ For using cross-DB macros, teradata-utils as a macro namespace will not be used,

##### <a name="hash"></a>hash

`Hash` macro needs an `md5` function implementation. Teradata doesn't support `md5` natively. You need to install a User Defined Function (UDF):
`Hash` macro needs an `md5` function implementation. Teradata doesn't support `md5` natively. You need to install a User Defined Function (UDF) and optionally specify `md5_udf` [variable](/docs/build/project-variables). <br>
If not specified the code defaults to using `GLOBAL_FUNCTIONS.hash_md5`. See below instructions on how to install the custom UDF:

Contributor

Suggested change
If not specified the code defaults to using `GLOBAL_FUNCTIONS.hash_md5`. See below instructions on how to install the custom UDF:
If not specified the code defaults to using `GLOBAL_FUNCTIONS.hash_md5`. See the following instructions on how to install the custom UDF:

Collaborator

This information should be called out in the Teradata config page.

@@ -241,6 +247,14 @@ dbt-teradata 1.8.0 and later versions support unit tests, enabling you to valida

## Limitations

### Browser Authentication

Contributor

Suggested change
### Browser Authentication
### Browser authentication

@@ -241,6 +247,14 @@ dbt-teradata 1.8.0 and later versions support unit tests, enabling you to valida

## Limitations

### Browser Authentication
When running a dbt job with logmech set to "browser", the initial authentication opens a browser window where you must enter your username and password.<br>

Contributor

@tallamohan can you turn these into bullets and remove the `<br>`. so:

  • When running a dbt job with logmech set to "browser", the initial authentication opens a browser window where you must enter your username and password.
  • After authentication, this window remains open, requiring you to manually switch back to the dbt console.
  • For every subsequent connection, a new browser tab briefly opens, displaying the message "TERADATA BROWSER AUTHENTICATION COMPLETED," and silently reuses the existing session.
  • However, the focus stays on the browser window, so you’ll need to manually switch back to the dbt console each time.
  • This behavior is the default functionality of the teradatasql driver and cannot be avoided at this time.
  • To prevent session expiration and the need to re-enter credentials, ensure the authentication browser window stays open until the job is complete.

@@ -348,6 +348,18 @@ If a user sets some key-value pair with value as `'{model}'`, internally this `'
- For example, if the model the user is running is `stg_orders`, `{model}` will be replaced with `stg_orders` in runtime.
- If no `query_band` is set by the user, the default query_band used will be: ```org=teradata-internal-telem;appname=dbt;```
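
To illustrate the `{model}` substitution and the default described above, here is a minimal Python sketch. It is not the adapter's actual code: the names `render_query_band` and `effective_query_band` are hypothetical, and only the default query band string is taken from the text.

```python
# Hypothetical helpers, not part of dbt-teradata: they only illustrate
# the documented behavior of the '{model}' placeholder and the default.
DEFAULT_QUERY_BAND = "org=teradata-internal-telem;appname=dbt;"


def render_query_band(query_band, model_name):
    """Replace the literal '{model}' placeholder with the running model's name."""
    return query_band.replace("{model}", model_name)


def effective_query_band(user_query_band, model_name):
    """Return the user's query band with '{model}' filled in, or the default."""
    if not user_query_band:
        return DEFAULT_QUERY_BAND
    return render_query_band(user_query_band, model_name)
```

Calling `effective_query_band("app=dbt;model={model};", "stg_orders")` yields a query band that names `stg_orders`, while passing `None` falls back to the default string.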

## Unit testing
* Unit testing is supported in dbt-teradata, allowing users to write and execute unit tests using the dbt test command.
* For detailed guidance, refer to the dbt documentation.

Contributor

Suggested change
* For detailed guidance, refer to the dbt documentation.
* For detailed guidance, refer to the [dbt unit tests documentation](/docs/build/unit-tests).

* `history_column_in_target` &mdash; Column in the target table of **period** datatype that tracks history.
* `unique_key`: The primary key of the model (excluding the valid time components), specified as a column name or list of column names.
* `valid_period`: Name of the model column indicating the period for which the record is considered to be valid. The datatype must be `PERIOD(DATE)` or `PERIOD(TIMESTAMP)`.
* `use_valid_to_time`: Wether the end bound value of the valid period in the input is considered by the strategy when building the valid timeline. Use 'no' if you consider your record to be valid until changed (and supply any value greater to the begin bound for the end bound of the period - a typical convention is `9999-12-31` of ``9999-12-31 23:59:59.999999`). Use 'yes' if you know until when the record is valid (typically this is a correction in the history timeline).

Contributor

Suggested change
* `use_valid_to_time`: Wether the end bound value of the valid period in the input is considered by the strategy when building the valid timeline. Use 'no' if you consider your record to be valid until changed (and supply any value greater to the begin bound for the end bound of the period - a typical convention is `9999-12-31` of ``9999-12-31 23:59:59.999999`). Use 'yes' if you know until when the record is valid (typically this is a correction in the history timeline).
* `use_valid_to_time`: Whether the end bound value of the valid period in the input is considered by the strategy when building the valid timeline. Use `no` if you consider your record to be valid until changed (and supply any value greater than the begin bound for the end bound of the period; a typical convention is `9999-12-31` or `9999-12-31 23:59:59.999999`). Use `yes` if you know until when the record is valid (typically this is a correction in the history timeline).

* Identify and adjust overlapping time slices:
* Overlapping time periods in the data are detected and corrected to maintain a consistent and non-overlapping timeline.
* Manage records needing to be overwritten or split based on the source and target data:
* The process of removing primary key duplicates (ie. two or more records with the same value for the `unique_key` and BEGIN() bond of the `valid_period` fields) in the dataset produced by the model. If such duplicates exist, the row with the lowest value is retained for all non-primary-key fields (in the order specified in the model) is retained. Full-row duplicates are always de-duplicated.

Contributor

@mirnawong1 Dec 13, 2024

Suggested change
* The process of removing primary key duplicates (ie. two or more records with the same value for the `unique_key` and BEGIN() bond of the `valid_period` fields) in the dataset produced by the model. If such duplicates exist, the row with the lowest value is retained for all non-primary-key fields (in the order specified in the model) is retained. Full-row duplicates are always de-duplicated.
* The process of removing primary key duplicates (two or more records with the same value for the `unique_key` and BEGIN() bound of the `valid_period` fields) in the dataset produced by the model. If such duplicates exist, the row with the lowest value is retained for all non-primary-key fields (in the order specified in the model). Full-row duplicates are always de-duplicated.
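
For readers following the review, the de-duplication rule in this bullet can be sketched in Python. This is only an illustration under stated assumptions, not the SQL the strategy generates: rows are modeled as dictionaries, the begin bound of `valid_period` is a separate `valid_from` column, and the function name and sample columns (`order_id`, `status`) are hypothetical.

```python
from datetime import date


def dedupe_primary_key(rows, unique_key, begin_col, other_cols):
    """Keep one row per (unique_key, period begin) combination.

    Among rows that share a primary key, keep the one whose non-key
    fields, compared in the given column order, sort lowest.
    """
    best = {}
    for row in rows:
        pk = tuple(row[col] for col in unique_key) + (row[begin_col],)
        rank = tuple(row[col] for col in other_cols)
        # Full-row duplicates have equal rank, so the first copy is kept.
        if pk not in best or rank < best[pk][0]:
            best[pk] = (rank, row)
    return [row for _, row in best.values()]


# Hypothetical sample data: three rows collide on (order_id=1, 2024-01-01).
rows = [
    {"order_id": 1, "valid_from": date(2024, 1, 1), "status": "shipped"},
    {"order_id": 1, "valid_from": date(2024, 1, 1), "status": "placed"},
    {"order_id": 1, "valid_from": date(2024, 1, 1), "status": "placed"},
    {"order_id": 2, "valid_from": date(2024, 1, 1), "status": "placed"},
]
deduped = dedupe_primary_key(rows, ["order_id"], "valid_from", ["status"])
```

Here the three colliding rows for `order_id` 1 collapse to the one with the lowest `status` value, and the row for `order_id` 2 is untouched.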

* The process of removing primary key duplicates (ie. two or more records with the same value for the `unique_key` and BEGIN() bond of the `valid_period` fields) in the dataset produced by the model. If such duplicates exist, the row with the lowest value is retained for all non-primary-key fields (in the order specified in the model) is retained. Full-row duplicates are always de-duplicated.
* Identify and adjust overlapping time slices (if `use_valid_to_time='yes'`):
* Overlapping time periods in the data are corrected to maintain a consistent and non-overlapping timeline. To do so, the valid period end bound of a record is adjusted to meet the begin bound of the next record with the same `unique_key` value and overlapping `valid_period` value if any.
* Manage records needing to be adjusted, deleted or split based on the source and target data:

Contributor

Suggested change
* Manage records needing to be adjusted, deleted or split based on the source and target data:
* Manage records needing to be adjusted, deleted, or split based on the source and target data:
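
The overlap adjustment described in this hunk (the end bound of a record is moved to meet the begin bound of the next overlapping record with the same `unique_key`) can be sketched in Python as follows. This is an assumption-laden illustration rather than the adapter's implementation: records are `(key, begin, end)` tuples, periods are treated as begin-inclusive and end-exclusive, primary key duplicates are assumed to be already removed, and `adjust_overlaps` is a hypothetical name.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter


def adjust_overlaps(records):
    """Trim each period so it ends where the next overlapping period begins.

    records: list of (key, begin, end) tuples, one timeline per key.
    """
    ordered = sorted(records, key=itemgetter(0, 1))
    result = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        # Pair every record with its successor in the same key's timeline.
        for current, nxt in zip(group, group[1:] + [None]):
            key, begin, end = current
            if nxt is not None and nxt[1] < end:
                end = nxt[1]  # overlap: meet the next record's begin bound
            result.append((key, begin, end))
    return result


# Hypothetical sample: both "A" rows use the open-ended 9999-12-31 convention.
history = [
    ("A", date(2024, 1, 1), date(9999, 12, 31)),
    ("A", date(2024, 3, 1), date(9999, 12, 31)),
    ("B", date(2024, 1, 1), date(2024, 2, 1)),
]
adjusted = adjust_overlaps(history)
```

In this sample the first `A` record is cut back to end on 2024-03-01, while the last record of each key keeps its original end bound.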

* Utilize the TD_NORMALIZE_MEET function to compact history:
* This function helps to normalize and compact the history by merging adjacent time periods, improving the efficiency and performance of the database.
* Compact history:
* Normalize and compact the history by merging records of adjacent time periods withe same value, optimizing database storage and performance. We use the function TD_NORMALIZE_MEET for this purpose.

Contributor

Suggested change
* Normalize and compact the history by merging records of adjacent time periods withe same value, optimizing database storage and performance. We use the function TD_NORMALIZE_MEET for this purpose.
* Normalize and compact the history by merging records of adjacent time periods with the same value, optimizing database storage and performance. We use the function TD_NORMALIZE_MEET for this purpose.
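
As a rough picture of what compacting the history means, the following Python sketch merges records whose periods meet and whose values are equal. It only mirrors the behavior described in the bullet and says nothing about how `TD_NORMALIZE_MEET` works internally; the tuple layout `(key, value, begin, end)` and the name `compact_history` are assumptions made for the example.

```python
from datetime import date


def compact_history(records):
    """Merge adjacent periods that share the same key and value.

    records: list of (key, value, begin, end) tuples. Two periods are
    adjacent when the earlier one ends exactly where the later one begins.
    """
    ordered = sorted(records, key=lambda r: (r[0], r[2]))
    compacted = []
    for key, value, begin, end in ordered:
        if compacted:
            p_key, p_value, p_begin, p_end = compacted[-1]
            if p_key == key and p_value == value and p_end == begin:
                # Extend the previous period instead of adding a new row.
                compacted[-1] = (key, value, p_begin, end)
                continue
        compacted.append((key, value, begin, end))
    return compacted


# Hypothetical sample: the first two "A" rows meet and carry the same value.
compacted = compact_history([
    ("A", "open", date(2024, 1, 1), date(2024, 2, 1)),
    ("A", "open", date(2024, 2, 1), date(2024, 3, 1)),
    ("A", "closed", date(2024, 3, 1), date(2024, 4, 1)),
    ("B", "open", date(2024, 1, 1), date(2024, 2, 1)),
])
```

The two `open` rows for key `A` become a single row covering January through February, so the table stores three rows instead of four.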

Contributor

@mirnawong1 left a comment

hey @tallamohan, thanks for opening this pr up! i've made some suggestions and couldn't commit them myself unfortunately. could you make those changes on your end and when it's ready, give me a tag and i'll look it over again?

For every subsequent connection, a new browser tab briefly opens, displaying the message "TERADATA BROWSER AUTHENTICATION COMPLETED," and silently reuses the existing session.<br>
However, the focus stays on the browser window, so you’ll need to manually switch back to the dbt console each time.<br>
This behavior is the default functionality of the teradatasql driver and cannot be avoided at this time.<br>
To prevent session expiration and the need to re-enter credentials, ensure the authentication browser window stays open until the job is complete.

Collaborator

This is not a great user experience. Do you have plans to provide caching and if so, can it be called out?

@@ -95,7 +95,6 @@ Parameter | Default | Type | Description
`browser_tab_timeout` | `"5"` | quoted integer | Specifies the number of seconds to wait before closing the browser tab after Browser Authentication is completed. The default is 5 seconds. The behavior is under the browser's control, and not all browsers support automatic closing of browser tabs.
`browser_timeout` | `"180"` | quoted integer | Specifies the number of seconds that the driver will wait for Browser Authentication to complete. The default is 180 seconds (3 minutes).
`column_name` | `"false"` | quoted boolean | Controls the behavior of cursor `.description` sequence `name` items. Equivalent to the Teradata JDBC Driver `COLUMN_NAME` connection parameter. False specifies that a cursor `.description` sequence `name` item provides the AS-clause name if available, or the column name if available, or the column title. True specifies that a cursor `.description` sequence `name` item provides the column name if available, but has no effect when StatementInfo parcel support is unavailable.
`connect_failure_ttl` | `"0"` | quoted integer | Specifies the time-to-live in seconds to remember the most recent connection failure for each IP address/port combination. The driver subsequently skips connection attempts to that IP address/port for the duration of the time-to-live. The default value of zero disables this feature. The recommended value is half the database restart time. Equivalent to the Teradata JDBC Driver `CONNECT_FAILURE_TTL` connection parameter.

Collaborator

Did you remove this functionality? If so it might be worth calling that out to avoid confusion

Contributor

@mirnawong1 left a comment

hey @tallamohan, wanted to check in to see if you had any questions on our reviews? many thanks!

3 participants