Merge branch 'main' into ja-landing-page
ObedVega authored Feb 13, 2024
2 parents 5836627 + ec0e24c commit 5552459
Showing 91 changed files with 1,100 additions and 36 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -8,8 +8,8 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
- uses: actions/setup-node@v3
uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 17
cache: npm
2 changes: 1 addition & 1 deletion .github/workflows/fluid-topics.yml
@@ -21,7 +21,7 @@ jobs:
# Steps represent a sequence of tasks that will be executed as part of the job
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v3
- uses: actions/checkout@v4

# Runs a single command using the runners shell
- name: Archive and Send it
4 changes: 2 additions & 2 deletions .github/workflows/publish.yml
@@ -8,8 +8,8 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
- uses: actions/setup-node@v3
uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 17
cache: npm
3 changes: 0 additions & 3 deletions .vscode/settings.json

This file was deleted.

Binary file added modules/ROOT/images/csae_create_env.png
Binary file added modules/ROOT/images/csae_env_details.png
Binary file added modules/ROOT/images/csae_env_params.png
Binary file added modules/ROOT/images/csae_jupyter.png
Binary file added modules/ROOT/images/csae_register.png
Binary file added modules/ROOT/images/csae_signin.png
Binary file modified modules/ROOT/images/lake_environment_page.png
Binary file modified modules/ROOT/images/lake_settings_menu.png
12 changes: 11 additions & 1 deletion modules/ROOT/nav.adoc
@@ -1,6 +1,8 @@
* Introduction
** xref::teradata-vantage-engine-architecture-and-concepts.adoc[Teradata Vantage Engine Architecture and Concepts]
* Get access to Vantage
** On your local
*** xref::getting.started.vmware.adoc[Vantage Express on VMware]
@@ -10,6 +12,8 @@
*** xref::run-vantage-express-on-aws.adoc[Vantage Express on AWS]
*** xref::vantage.express.gcp.adoc[Vantage Express on Google Cloud]
*** xref::run-vantage-express-on-microsoft-azure.adoc[Vantage Express on Azure]
** ClearScape Analytics Experience
*** xref::getting-started-with-csae.adoc[]
* Connect to Vantage
** xref:install-teradata-studio-on-mac-m1-m2.adoc[]
@@ -20,6 +24,7 @@
* Manage data
** xref::nos.adoc[]
** xref::select-the-right-data-ingestion-tools-for-teradata-vantage.adoc[]
** xref::airflow.adoc[]
** xref::dbt.adoc[]
** xref::advanced-dbt.adoc[]
** xref:modelops:using-feast-feature-store-with-teradata-vantage.adoc[]
@@ -71,4 +76,9 @@
*** xref:ai-unlimited:ai-unlimited-magic-reference.adoc[]
* VantageCloud Lake
** xref:getting-started-with-vantagecloud-lake.adoc[]
** xref:getting-started-with-vantagecloud-lake.adoc[]
** xref:vantagecloud-lake:vantagecloud-lake-demo-jupyter-docker.adoc[]
** xref:vantagecloud-lake:vantagecloud-lake-demos-visual-studio-code.adoc[]
** xref:vantagecloud-lake:vantagecloud-lake-demo-jupyter-sagemaker.adoc[]
** xref:vantagecloud-lake:vantagecloud-lake-demo-jupyter-google-cloud-vertex-ai.adoc[]
** xref:vantagecloud-lake:vantagecloud-lake-demo-jupyter-azure.adoc[]
135 changes: 135 additions & 0 deletions modules/ROOT/pages/airflow.adoc
@@ -0,0 +1,135 @@
= Use Apache Airflow with Teradata Vantage
:experimental:
:page-author: Satish Chinthanippu
:page-email: [email protected]
:page-revdate: February 6th, 2024
:description: Use Apache Airflow with Teradata Vantage.
:keywords: data warehouses, compute storage separation, teradata, vantage, cloud data platform, object storage, business intelligence, enterprise analytics, elt, airflow, workflow.
:tabs:
:dir: airflow

== Overview

This tutorial demonstrates how to use Apache Airflow with Teradata Vantage. Airflow will be installed on an Ubuntu system.

== Prerequisites

* Ubuntu 22.x
* Access to a Teradata Vantage instance.
+
include::ROOT:partial$vantage_clearscape_analytics.adoc[]
* Python *3.8*, *3.9*, *3.10* or *3.11* installed.

== Install Apache Airflow

1. Set the AIRFLOW_HOME environment variable. Airflow requires a home directory and uses ~/airflow by default, but you can set a different location if you prefer. The AIRFLOW_HOME environment variable is used to inform Airflow of the desired location.
+
[source, bash]
----
export AIRFLOW_HOME=~/airflow
----
2. Install `apache-airflow` stable version 2.8.1 from the PyPI repository:
+
[source, bash]
----
AIRFLOW_VERSION=2.8.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
----
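+
For illustration only: the constraints URL that the shell snippet assembles can also be built in Python, which makes the version-matching logic explicit (a sketch, not part of the official install steps):
+
[source, python]
----
# Mirror of the shell logic above: pick the Airflow constraints file
# that matches the installed Python interpreter.
import sys

AIRFLOW_VERSION = "2.8.1"

# Equivalent of: python --version | cut -d " " -f 2 | cut -d "." -f 1-2
python_version = f"{sys.version_info.major}.{sys.version_info.minor}"

constraint_url = (
    "https://raw.githubusercontent.com/apache/airflow/"
    f"constraints-{AIRFLOW_VERSION}/constraints-{python_version}.txt"
)
print(constraint_url)
----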
3. Install the Airflow Teradata provider stable version 1.0.0 from the PyPI repository:
+
[source, bash]
----
pip install "apache-airflow-providers-teradata==1.0.0"
----

== Start Airflow Standalone

1. Run Airflow Standalone
+
[source, bash]
----
airflow standalone
----
2. Access the Airflow UI. Visit http://localhost:8080 in the browser and log in with the admin account details shown in the terminal.

== Define a Teradata connection in Airflow UI

1. Open the Admin -> Connections section of the UI. Click the Create link to create a new connection.
+
image::{dir}/airflow-connection.png[Airflow admin dropdown, width=75%]
2. Fill in the connection details on the New Connection page.
+
image::{dir}/airflow-newconnection.png[Airflow New Connection, width=75%]
* Connection Id: Unique ID of the Teradata connection.
* Connection Type: Type of the system. Select Teradata.
* Database Server URL (required): Teradata instance hostname to connect to.
* Database (optional): Name of the database to connect to.
* Login (required): Username used to connect.
* Password (required): Password used to connect.
* Click Test, then Save.
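Alternatively, a connection can be defined from the command line with the Airflow CLI instead of the UI. A sketch, with placeholder host, credentials, and database (substitute your own):

[source, bash]
----
# Create the Teradata connection used by the example DAG.
# Host, login, password, and database below are placeholders.
airflow connections add "Teradata_TestConn" \
    --conn-type "teradata" \
    --conn-host "myvantage.example.com" \
    --conn-login "demo_user" \
    --conn-password "my_password" \
    --conn-schema "demo_db"

# Verify the connection was registered.
airflow connections get "Teradata_TestConn"
----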

== Define a DAG in Airflow

1. In Airflow, DAGs are defined as Python code.
2. Create a DAG as a Python file, for example `sample.py`, in the DAG_FOLDER directory (`$AIRFLOW_HOME/files/dags`).
+
[source, python]
----
from datetime import datetime

from airflow import DAG
from airflow.providers.teradata.operators.teradata import TeradataOperator

CONN_ID = "Teradata_TestConn"

with DAG(
    dag_id="example_teradata_operator",
    max_active_runs=1,
    max_active_tasks=3,
    catchup=False,
    start_date=datetime(2023, 1, 1),
) as dag:
    create = TeradataOperator(
        task_id="table_create",
        conn_id=CONN_ID,
        sql="""
            CREATE TABLE my_users,
            FALLBACK (
                user_id decimal(10,0) NOT NULL GENERATED ALWAYS AS IDENTITY (
                    START WITH 1
                    INCREMENT BY 1
                    MINVALUE 1
                    MAXVALUE 2147483647
                    NO CYCLE),
                user_name VARCHAR(30)
            ) PRIMARY INDEX (user_id);
        """,
    )
----

== Load DAG

Airflow loads DAGs from Python source files, which it looks for inside its configured DAG_FOLDER (`$AIRFLOW_HOME/files/dags`).

== Run DAG

DAGs run in one of two ways:

1. When they are triggered manually or via the API
2. On a schedule, defined as part of the DAG

`example_teradata_operator` is defined to be triggered manually. To define a schedule, pass any valid link:https://en.wikipedia.org/wiki/Cron[Crontab, window="_blank"] schedule value to the `schedule` argument:
[source, python]
----
with DAG(
dag_id="my_daily_dag",
schedule="0 0 * * *"
) as dag:
----
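Once the DAG file is in the DAG folder, parsing and triggering can also be checked from the Airflow CLI. A sketch, assuming the standalone instance and the `example_teradata_operator` DAG from above:

[source, bash]
----
# Confirm the DAG file was parsed and registered.
airflow dags list | grep example_teradata_operator

# Queue a manual run, then inspect its state.
airflow dags trigger example_teradata_operator
airflow dags list-runs -d example_teradata_operator
----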

== Summary

This tutorial demonstrated how to use Airflow and the Airflow Teradata provider with a Teradata Vantage instance. The example DAG creates the `my_users` table in the Teradata Vantage instance defined in the Airflow connection.

== Further reading
* link:https://airflow.apache.org/docs/apache-airflow/stable/start.html[Airflow documentation]
* link:https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html[Airflow DAGs]


include::ROOT:partial$community_link.adoc[]
@@ -26,7 +26,7 @@ include::ROOT:partial$vantage_clearscape_analytics.adoc[]
You can find more documentation about `WRITE_NOS` functionality in the https://docs.teradata.com/r/Teradata-VantageTM-Native-Object-Store-Getting-Started-Guide/June-2022/Writing-Data-to-External-Object-Store[NOS documentation].

You will need access to a database where you can execute `WRITE_NOS` function. If you don't have such a database, run the following commands:
[source, teradata-sql]
[source, teradata-sql, id="parquet_create_user", role="emits-gtm-events"]
----
CREATE USER db AS PERM=10e7, PASSWORD=db;
10 changes: 1 addition & 9 deletions modules/ROOT/pages/dbt.adoc
@@ -74,15 +74,7 @@ We will now configure dbt to connect to your Vantage database. Create file `$HOM
[NOTE]
.Database setup
====
The following dbt profile points to a database called `jaffle_shop`. You can change `schema` value to point to an existing database in your Teradata Vantage instance or you can create `jaffle_shop` database:
[source, teradata-sql]
----
CREATE DATABASE jaffle_shop
AS PERMANENT = 110e6,
SPOOL = 220e6;
----
The following dbt profile points to a database called `jaffle_shop`. If the database doesn't exist on your Teradata Vantage instance, it will be created. You can also change `schema` value to point to an existing database in your instance.
====

[source, yaml, id="dbt_first_config", role="emits-gtm-events"]
2 changes: 1 addition & 1 deletion modules/ROOT/pages/fastload.adoc
@@ -86,7 +86,7 @@ LOGON localhost/dbc,dbc;

Now that we are logged in, I'm going to prepare the database. I'm switching to the `irs` database and making sure that the target table `irs_returns` and the error tables (more about error tables later) do not exist:

[source, teradata-sql]
[source, teradata-sql, id="fastload_drop_table", role="emits-gtm-events"]
----
DATABASE irs;
DROP TABLE irs_returns;
77 changes: 77 additions & 0 deletions modules/ROOT/pages/getting-started-with-csae.adoc
@@ -0,0 +1,77 @@
= Getting started with ClearScape Analytics Experience
:experimental:
:page-author: Vidhan Bhonsle
:page-email: [email protected]
:page-revdate: February 9th, 2024
:description: Getting started with ClearScape Analytics Experience
:keywords: data warehouses, compute storage separation, teradata, vantage, cloud data platform, business intelligence, enterprise analytics, jupyter, teradatasql, ipython-sql, clearscape, csae

== Overview

https://www.teradata.com/platform/clearscape-analytics[ClearScape Analytics^TM^] is a powerful analytics engine in https://www.teradata.com/platform/vantagecloud[Teradata VantageCloud]. It delivers breakthrough performance, value, and growth across the enterprise with the most powerful, open, and connected AI/ML capabilities on the market. You can experience ClearScape Analytics^TM^ and Teradata Vantage, in a non-production setting, through https://www.teradata.com/experience[ClearScape Analytics Experience].

In this how-to, we will go through the steps for creating an environment in ClearScape Analytics Experience and accessing the demos.

image::VantageCloud.png[VantageCloud,align="center",width=50%]

== Create a ClearScape Analytics Experience account

Head over to https://www.teradata.com/experience[ClearScape Analytics Experience] and create a free account.

image::csae_register.png[Register,align="center",width=75%]

Sign in to your https://clearscape.teradata.com/sign-in[ClearScape Analytics account] to create an environment and access demos.

image::csae_signin.png[Sign in,align="center",width=60%]

== Create an Environment

Once signed in, click *CREATE ENVIRONMENT*.

image::csae_create_env.png[Create environment,align="center",width=60%]

You will need to provide:

[cols="1,1"]
|====
| *Variable* | *Value*

| *environment name*
| A name for your environment, e.g. "demo"

| *database password*
| A password of your choice; it will be assigned to the `dbc` and `demo_user` users

| *Region*
| Select a region from the dropdown

|====

IMPORTANT: Note down the database password. You will need it to connect to the database.

image::csae_env_params.png[Environment params,align="center",width=65%]

Click the *CREATE* button to complete the creation of your environment. You can now see the details of your environment.

image::csae_env_details.png[Environment details,align="center",width=75%]

== Access demos

The ClearScape Analytics Experience environment includes a variety of demos that showcase how to use analytics to solve business problems across many industries.

To access the demos, click the *RUN DEMOS USING JUPYTER* button. This opens a Jupyter environment in a new tab of your browser.

NOTE: You can find details of all the demos on the demo index page.

image::csae_jupyter.png[Usecases folder,align="center",width=75%]


== Summary

In this quick start, we learned how to create an environment in ClearScape Analytics Experience and access demos.

== Further reading

* https://api.clearscape.teradata.com/api-docs/[ClearScape Analytics Experience API documentation]
* https://docs.teradata.com/[Teradata Documentation]

2 changes: 1 addition & 1 deletion modules/ROOT/pages/teradatasql.adoc
@@ -16,7 +16,7 @@ This how-to demonstrates how to connect to Vantage using link:https://github.com

* `teradatasql` driver installed in your system:
+
[source, bash]
[source, bash, id="teradatasql_pip_install", role="emits-gtm-events"]
----
pip install teradatasql
----
@@ -79,7 +79,7 @@ networks:
+
[source, bash, id="docker_compose_jupyter_up", role="content-editable emits-gtm-events"]
----
docker compose -f jupyter.yaml up
docker compose -f jupyter.yml up
----
+
Once the JupyterLab server is initialized and started, you can connect to JupyterLab using the URL: http://localhost:8888 and enter the token when prompted. For detailed instructions, see link:https://docs.teradata.com/r/Teradata-VantageTM-Modules-for-Jupyter-Installation-Guide/Teradata-Vantage-Modules-for-Jupyter/Teradata-Vantage-Modules-for-Jupyter[Teradata Vantage™ Modules for Jupyter Installation Guide] or link:https://quickstarts.teradata.com/jupyter.html[Use Vantage from a Jupyter Notebook].
@@ -156,7 +156,7 @@ pip install google-datacatalog-teradata-connector

==== Set environment variables

[source, bash, role="content-editable emits-gtm-events"]
[source, bash, id="gcp_env_var", role="content-editable emits-gtm-events"]
----
export GOOGLE_APPLICATION_CREDENTIALS=<google_credentials_file>
export TERADATA2DC_DATACATALOG_PROJECT_ID=<google_cloud_project_id>
@@ -10,27 +10,28 @@

== Overview

This tutorial showcases how to use Airbyte (an open-source Extract Load Transform tool) with Teradata Vantage. We work with a very simple end-to-end setup to load data from Google Sheets to Teradata Vantage using Airbyte.


image::{dir}/sample_employees_payrate_google_sheets.png[Sample Employees Payrate Google Sheets, width=75%]
This tutorial showcases how to use Airbyte to move data from sources to Teradata Vantage, detailing both the https://docs.airbyte.com/using-airbyte/getting-started/[Airbyte Open Source] and https://airbyte.com/[Airbyte Cloud options]. This specific example covers replication from Google Sheets to Teradata Vantage.

* Source: Google Sheets
* Destination: Teradata Vantage

== Prerequisites
image::{dir}/sample_employees_payrate_google_sheets.png[Sample Employees Payrate Google Sheets,align="center", width=50%]

== Prerequisites
* Access to a Teradata Vantage Instance. This will be defined as the destination of the Airbyte connection. You will need a database `Host`, `Username`, and `Password` for Airbyte’s configuration.
+
include::ROOT:partial$vantage_clearscape_analytics.adoc[]

* Docker Compose to run link:https://github.com/airbytehq/airbyte[Airbyte Open Source, window="_blank"] locally. Docker Compose comes with Docker Desktop. Please refer to link:https://docs.docker.com/compose/install/[docker docs, window="_blank"] for more details.
* link:https://support.google.com/googleapi/answer/6158841?hl=en[Google Cloud Platform API enabled for your personal or organizational account, window="_blank"]. You’ll need to authenticate your Google account via OAuth or via Service Account Key Authenticator. In this example, we use Service Account Key Authenticator.

* Data from the source system. In this case, we use a link:https://docs.google.com/spreadsheets/d/1XNBYUw3p7xG6ptfwjChqZ-dNXbTuVwPi7ToQfYKgJIE/edit#gid=0[sample spreadsheet from google sheets, window="_blank"]. The sample data is a breakdown of payrate by employee type.

* link:https://support.google.com/googleapi/answer/6158841?hl=en[Google Cloud Platform API enabled for your personal or organizational account, window="_blank"]. You’ll need to authenticate your Google account via OAuth or via Service Account Key Authenticator. In this example, we use Service Account Key Authenticator.
=== Airbyte Cloud
* Create an account on https://airbyte.com/[Airbyte Cloud] and skip to the instructions under the link:#airbyte_configuration[Airbyte Configuration] section.

=== Airbyte Open Source
* Install Docker Compose to run link:https://github.com/airbytehq/airbyte[Airbyte Open Source, window="_blank"] locally. Docker Compose comes with Docker Desktop. Please refer to link:https://docs.docker.com/compose/install/[docker docs, window="_blank"] for more details.

== Launch Airbyte Open Source
* Clone the Airbyte Open Source repository and go to the airbyte directory.
+
[source, bash]
@@ -159,7 +160,7 @@ image::{dir}/data_sync_validation_in_teradata.png[Data Sync Validation in Terada
%connect local
----

[source, bash]
[source, bash, id="airbyte_select_query", role="emits-gtm-events"]
----
SELECT DatabaseName, TableName, CreateTimeStamp, LastAlterTimeStamp
FROM DBC.TablesV