diff --git a/openmetadata-docs/content/partials/v1.1/deployment/configure-external-orchestrator-for-ingestion-service.md b/openmetadata-docs/content/partials/v1.1/deployment/configure-external-orchestrator-for-ingestion-service.md index 4be8a9df4ccb..41d0c2348509 100644 --- a/openmetadata-docs/content/partials/v1.1/deployment/configure-external-orchestrator-for-ingestion-service.md +++ b/openmetadata-docs/content/partials/v1.1/deployment/configure-external-orchestrator-for-ingestion-service.md @@ -1,5 +1,5 @@ ### Configure External Orchestrator Service (Ingestion Service) -OpenMetadata requires connectors to be scheduled to periodically fetch the metadata or you can use the OpenMetadata APIs to push the metadata as well -1. OpenMetadata Ingestion Framework is flexible to run on any orchestrator. However we built an ability to deploy and manage connectors as pipelines from the UI. This requires the Airflow container we ship. However, it is recommended to -2. If your team prefers to run on any other orchestrator such as prefect, dagster or even github workflows. Please refer to our recent webinar on [How Ingestion Framework works](https://www.youtube.com/watch?v=i7DhG_gZMmE&list=PLa1l-WDhLreslIS_96s_DT_KdcDyU_Itv&index=10) \ No newline at end of file +OpenMetadata requires connectors to be scheduled to periodically fetch the metadata, or you can use the OpenMetadata APIs to push the metadata as well +1. OpenMetadata Ingestion Framework is flexible to run on any orchestrator. However, we built an ability to deploy and manage connectors as pipelines from the UI. This requires the Airflow container we ship. +2. If your team prefers to run on any other orchestrator such as prefect, dagster or even GitHub workflows. 
Please refer to our recent webinar on [How Ingestion Framework works](https://www.youtube.com/watch?v=i7DhG_gZMmE&list=PLa1l-WDhLreslIS_96s_DT_KdcDyU_Itv&index=10) \ No newline at end of file diff --git a/openmetadata-docs/content/partials/v1.2/deployment/configure-external-orchestrator-for-ingestion-service.md b/openmetadata-docs/content/partials/v1.2/deployment/configure-external-orchestrator-for-ingestion-service.md new file mode 100644 index 000000000000..41d0c2348509 --- /dev/null +++ b/openmetadata-docs/content/partials/v1.2/deployment/configure-external-orchestrator-for-ingestion-service.md @@ -0,0 +1,5 @@ +### Configure External Orchestrator Service (Ingestion Service) + +OpenMetadata requires connectors to be scheduled to periodically fetch the metadata, or you can use the OpenMetadata APIs to push the metadata as well +1. OpenMetadata Ingestion Framework is flexible to run on any orchestrator. However, we built an ability to deploy and manage connectors as pipelines from the UI. This requires the Airflow container we ship. +2. If your team prefers to run on any other orchestrator such as prefect, dagster or even GitHub workflows. Please refer to our recent webinar on [How Ingestion Framework works](https://www.youtube.com/watch?v=i7DhG_gZMmE&list=PLa1l-WDhLreslIS_96s_DT_KdcDyU_Itv&index=10) \ No newline at end of file diff --git a/openmetadata-docs/content/partials/v1.2/deployment/minimum-sizing-requirements.md b/openmetadata-docs/content/partials/v1.2/deployment/minimum-sizing-requirements.md new file mode 100644 index 000000000000..462fb492904b --- /dev/null +++ b/openmetadata-docs/content/partials/v1.2/deployment/minimum-sizing-requirements.md @@ -0,0 +1,9 @@ +## Minimum Sizing Requirements + +We recommend you to allocate openmetadata-server with minimum of 2vCPUs and 6 GiB Memory. 
+ +For External Services that openmetadata depends on - +- For the database, minimum 2 vCPUs and 2 GiB RAM (per instance) with 30 GiB of Storage Volume Attached (dynamic expansion up to 100 GiB) +- For Elasticsearch, minimum 2 vCPUs and 2 GiB RAM (per instance) with 30 GiB of Storage volume attached + +These settings apply as well when using managed instances, such as AWS RDS or GCP CloudSQL or AWS OpenSearch. \ No newline at end of file diff --git a/openmetadata-docs/content/partials/v1.2/deployment/postgresql-issue-permission-denied-extension-pgcrypto.md b/openmetadata-docs/content/partials/v1.2/deployment/postgresql-issue-permission-denied-extension-pgcrypto.md new file mode 100644 index 000000000000..2eb98e814a9b --- /dev/null +++ b/openmetadata-docs/content/partials/v1.2/deployment/postgresql-issue-permission-denied-extension-pgcrypto.md @@ -0,0 +1,19 @@ +If you are facing the below issue with PostgreSQL as Database Backend for OpenMetadata Application, + +``` +Message: ERROR: permission denied to create extension "pgcrypto" +Hint: Must be superuser to create this extension. +``` + +It seems the Database User does not have sufficient privileges. In order to resolve the above issue, grant usage permissions to the PSQL User. + +```sql +GRANT USAGE ON SCHEMA schema_name TO ; +GRANT CREATE ON EXTENSION pgcrypto TO ; +``` + +{%note%} + +In the above command, replace `` with the sql user used by OpenMetadata Application to connect to PostgreSQL Database. + +{%/note%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/deployment/docker/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/deployment/docker/index.md index 407550223241..bec7c3a396e3 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/deployment/docker/index.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/deployment/docker/index.md @@ -4,32 +4,36 @@ slug: /deployment/docker --- # Docker Deployment -Deploying OpenMetadata in Docker is a great start! 
+ +This guide will help you set up the OpenMetadata Application using Docker Deployment. Before starting with the deployment make sure you follow all the below Prerequisites. ## Docker Deployment Architecture + {% image src="/images/v1.2/deployment/docker/om_docker_architecture.png" alt="Docker Deployment Architecture" /%} -High-level overview: +## Prerequisites -- Deploying with MySQL 3306 /PostgreSQL 5432 : Download docker-compose.yml / docker-compose-postgres.yml from the link: https://github.com/open-metadata/OpenMetadata/releases -/OpenMetadata/releases. -- We are shipping the Elasticsearch service and Ul at 9200. -- We are shipping the OpenMetadata server and Ul at 8585. -- We are shipping the ingestion container (Airflow) at 8080. -- You can change the port number's according to your requirement. +### Configure OpenMetadata to use External Database and Search Engine + +For Production Deployment using Docker, we recommend bringing your own Databases and ElasticSearch Engine and not rely on quickstart packages. + +{% partial file="/v1.2/deployment/configure-external-orchestrator-for-ingestion-service.md" /%} -## Prerequisites ### Docker (version 20.10.0 or greater) + [Docker](https://docs.docker.com/get-started/overview/) is an open-source platform for developing, shipping, and running applications. It enables you to separate your applications from your infrastructure, so you can deliver software quickly using OS-level virtualization. It helps deliver software in packages called Containers. To check what version of Docker you have, please use the following command. + ```commandline docker --version ``` If you need to install Docker, please visit [Get Docker](https://docs.docker.com/get-docker/). + ### Docker Compose (version v2.2.3 or greater) + The Docker compose package enables you to define and run multi-container Docker applications. 
The compose command integrates compose functions into the Docker platform, making them available from the Docker command-line interface ( CLI). The Python packages you will install in the procedure below use compose to deploy OpenMetadata. - **MacOS X**: Docker on MacOS X ships with compose already available in the Docker CLI. @@ -48,61 +52,138 @@ Upon running this command you should see output similar to the following. Docker Compose version v2.2.3 ``` -### Install Docker Compose Version 2 on Linux +#### Install Docker Compose Version 2 on Linux Follow the instructions [here](https://docs.docker.com/compose/cli-command/#install-on-linux) to install docker compose version 2 1. Run the following command to download the current stable release of Docker Compose - ``` - DOCKER_CONFIG=${DOCKER_CONFIG:-$HOME/.docker} - - mkdir -p $DOCKER_CONFIG/cli-plugins - curl -SL https://github.com/docker/compose/releases/download/v2.2.3/docker-compose-linux-x86_64 -o - $DOCKER_CONFIG/cli-plugins/docker-compose - ``` - + + ``` + DOCKER_CONFIG=${DOCKER_CONFIG:-$HOME/.docker} + + mkdir -p $DOCKER_CONFIG/cli-plugins + curl -SL https://github.com/docker/compose/releases/download/v2.2.3/docker-compose-linux-x86_64 -o + $DOCKER_CONFIG/cli-plugins/docker-compose + ``` + This command installs Compose V2 for the active user under $HOME directory. To install Docker Compose for all users on your system, replace` ~/.docker/cli-plugins` with `/usr/local/lib/docker/cli-plugins`. 2. Apply executable permissions to the binary - ``` - chmod +x $DOCKER_CONFIG/cli-plugins/docker-compose - ``` + ``` + chmod +x $DOCKER_CONFIG/cli-plugins/docker-compose + ``` 3. 
Test your installation - ``` - docker compose version - > Docker Compose version v2.2.3 - ``` + ``` + docker compose version + > Docker Compose version v2.2.3 + ``` -## Steps for Deploying OpenMetadata using Docker +{% partial file="/v1.2/deployment/minimum-sizing-requirements.md" /%} -- First download the docker-compose.yml file from the release page [here](https://github.com/open-metadata/OpenMetadata/releases/latest). The latest version is at the top of the page - - Deploying with MySQL: Download `docker-compose.yml` file from the above link. - - Deploying with PostgreSQL: Download `docker-compose-postgres.yml` file from the above link. +## Steps for Deploying OpenMetadata using Docker + +### 1. Create a directory for OpenMetadata + +Create a new directory for OpenMetadata and navigate into that directory. -- Create the directory for host volumes ```commandline -mkdir -p $PWD/docker-volume/db-data +mkdir openmetadata-docker && cd openmetadata-docker ``` -- Run the below command to deploy the OpenMetadata +### 2. Download Docker Compose Files from GitHub Releases -```commandline -docker compose up -d +Download the Docker Compose files from the [Latest GitHub Releases](https://github.com/open-metadata/OpenMetadata/releases/latest). + +The Docker compose file name will be `docker-compose-openmetadata.yml`. + +This docker compose file contains only the docker compose services for OpenMetadata Server. Bring up the dependencies as mentioned in the [prerequisites](#configure-openmetadata-to-use-external-database-and-search-engine) section. + +You can also run the below command to fetch the docker compose file directly from the terminal - + +```bash +wget https://github.com/open-metadata/OpenMetadata/releases/download/1.1.5-release/docker-compose-openmetadata.yml ``` -This command will pull the docker images of Openmetadata for MySQL, OpenMetadat-Server, OpenMetadata-Ingestion and Elasticsearch. -Upon running this command you should see output similar to the following. 
-```commandline -+] Running 7/8 - ⠿ Network metadata_app_net Created 0.2s - ⠿ Volume "metadata_ingestion-volume-dag-airflow" Created 0.0s - ⠿ Volume "metadata_ingestion-volume-dags" Created 0.0s - ⠿ Volume "metadata_ingestion-volume-tmp" Created 0.0s - ⠿ Container openmetadata_elasticsearch Started 5.9s - ⠿ Container openmetadata_mysql Started 38.3s - ⠿ Container openmetadata_server Started 124.8s - ⠿ Container openmetadata_ingestion Started 0.3s +### 3. Update Environment Variables required for OpenMetadata Dependencies + +In the previous [step](#2-download-docker-compose-files-from-github-releases), we downloaded the `docker-compose` file. + +Identify and update the environment variables in the file to prepare openmetadata configurations. + +For MySQL Configurations, update the below environment variables - + +```bash +... +# Database configuration for MySQL +DB_DRIVER_CLASS="com.mysql.cj.jdbc.Driver" +DB_SCHEME="mysql" +DB_USE_SSL="true" +DB_USER="" +DB_USER_PASSWORD="" +DB_HOST="" +DB_PORT="" +OM_DATABASE="" +... +``` + +For ElasticSearch Configurations, update the below environment variables - + +```bash +# ElasticSearch Configurations +SEARCH_TYPE="elasticsearch" +ELASTICSEARCH_HOST="" +ELASTICSEARCH_PORT="" +ELASTICSEARCH_SCHEME="" +ELASTICSEARCH_USER="" +ELASTICSEARCH_PASSWORD="" +... +``` + +For OpenSearch Configurations, update the below environment variables - + +```bash +# OpenSearch Configurations +SEARCH_TYPE="opensearch" +ELASTICSEARCH_HOST="" +ELASTICSEARCH_PORT="" +ELASTICSEARCH_SCHEME="" +ELASTICSEARCH_USER="" +ELASTICSEARCH_PASSWORD="" +... 
+``` + +For Ingestion Configurations, update the below environment variables - + +```bash +PIPELINE_SERVICE_CLIENT_ENDPOINT="" +PIPELINE_SERVICE_CLIENT_HEALTH_CHECK_INTERVAL="300" +SERVER_HOST_API_URL="/api" +PIPELINE_SERVICE_CLIENT_VERIFY_SSL="no-ssl" +PIPELINE_SERVICE_CLIENT_SSL_CERT_PATH="" +PIPELINE_SERVICE_CLIENT_CLASS_NAME="org.openmetadata.service.clients.pipeline.airflow.AirflowRESTClient" +PIPELINE_SERVICE_IP_INFO_ENABLED="false" +PIPELINE_SERVICE_CLIENT_HOST_IP="" +PIPELINE_SERVICE_CLIENT_SECRETS_MANAGER_LOADER="noop" +AIRFLOW_USERNAME="" +AIRFLOW_PASSWORD="" +AIRFLOW_TIMEOUT="10" +AIRFLOW_TRUST_STORE_PATH="" +AIRFLOW_TRUST_STORE_PASSWORD="" +``` + +{% note noteType="Warning" %} + +When setting up environment file if your custom password includes any special characters then make sure to follow the steps [here](https://github.com/open-metadata/OpenMetadata/issues/12110#issuecomment-1611341650). + +{% /note %} + +### 4. Start the Docker Compose Services + +Run the below command to deploy the OpenMetadata - + +```bash +docker compose --env-file ./env-mysql up --detach ``` You can validate that all containers are up by running with command `docker ps`. @@ -111,123 +192,63 @@ You can validate that all containers are up by running with command `docker ps`. 
❯ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 470cc8149826 openmetadata/server:1.0.0 "./openmetadata-star…" 45 seconds ago Up 43 seconds 3306/tcp, 9200/tcp, 9300/tcp, 0.0.0.0:8585-8586->8585-8586/tcp openmetadata_server -63578aacbff5 openmetadata/ingestion:1.0.0 "./ingestion_depende…" 45 seconds ago Up 43 seconds 0.0.0.0:8080->8080/tcp openmetadata_ingestion -9f5ee8334f4b docker.elastic.co/elasticsearch/elasticsearch:7.10.2 "/tini -- /usr/local…" 45 seconds ago Up 44 seconds 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp openmetadata_elasticsearch -08947ab3424b openmetadata/db:1.0.0 "/entrypoint.sh mysq…" 45 seconds ago Up 44 seconds (healthy) 3306/tcp, 33060-33061/tcp openmetadata_mysql ``` In a few seconds, you should be able to access the OpenMetadata UI at [http://localhost:8585](http://localhost:8585) -## Port Mapping / Port Forwarding -### For OpenMetadata-Server -We are shipping the OpenMetadata server and UI at `8585`, and the ingestion container (Airflow) at `8080`. You can -change the port number's according to your requirement. As an example, You could -update the ports to serve OpenMetadata Server and UI at port `80` +## Port Mapping / Port Forwarding + +We are shipping the OpenMetadata server and UI at container port and host port `8585`. You can change the host port number according to your requirement. +As an example, You could update the ports to serve OpenMetadata Server and UI at port `80` + +To achieve this - -To achieve this - You just have to update the ports mapping of the openmetadata-server in the `docker-compose.yml` file under `openmetadata-server` docker service section. 
```yaml +--- ports: - "80:8585" ``` -- Once the port is updated if there are any containers running remove them first using `docker compose down` command and then recreate the containers once again by below command -```commandline -docker compose up --build -d -``` -### For Ingestion-Server -We are shipping the OpenMetadata server and UI at `8585`, and the ingestion container (Airflow) at `8080`. You can -change the port number's according to your requirement. As an example, You could -update the ports to serve Ingestion Server and UI at port `80` - -To achieve this -- You just have to update the ports mapping of the openmetadata-server in the `docker-compose.yml` file under `ingestion-server` docker service section. -```yaml -ports: - - "80:8080" -``` -- Also update the Airflow environment variable in openmetadata-server section - ```commandline - AIRFLOW_HOST: '' - ``` +- Once the port is updated if there are any containers running remove them first using `docker compose down` command and then recreate the containers once again by below command -- Once the port is updated if there are any containers running remove them first using `docker compose down` command and then recreate the containers once again by below command ```commandline -docker compose up --build -d +docker compose up --detach ``` -## PROD Deployment of OpenMetadata Using Docker -If you are planning on going to PROD, we recommend to validate below points: -- MySQL and OpenSearch (ElasticSearch) are available. -- OpenMetadata-Server require the minimum configuration of 2vCPU and 6Memory (GiB) -- OpenMetadata-Ingestion require the minimum configuration of 2vCPU and 8Memory (GiB) -- We also recommend to bind Docker Volumes for data persistence. Minimum disk space required would be 128 Gib. Learn how to do so [here](/deployment/docker/volumes). 
-{% note noteType="Warning" %} +## Run OpenMetadata with a load balancer -- When setting up environment file if your custom password includes any special characters then make sure to follow the steps [here](https://github.com/open-metadata/OpenMetadata/issues/12110#issuecomment-1611341650). - -{% /note %} +You may put one or more OpenMetadata instances behind a load balancer for reverse proxying. To do this you will need to add one or more entries to the configuration file for your reverse proxy. +### Nginx -### Steps for Deploying Ingestion -- Download the docker-compose.yml file from the release page [here](https://github.com/open-metadata/OpenMetadata/releases). -- Update the environment variables below for OpenMetadata-Ingestion Docker Compose backed systems to connect with Database. -``` -# MySQL Environment Variables for ingestion service -AIRFLOW_DB_HOST='' -AIRFLOW_DB_PORT='' -AIRFLOW_DB='' -AIRFLOW_DB_SCHEME='' -DB_USER='' -DB_PASSWORD='' -``` -Once the environment variables values with the RDS are updated then provide this environment variable file as part of docker compose command. +To use OpenMetadata behind an Nginx reverse proxy, add an entry resembling the following to the http context of your Nginx configuration file for each OpenMetadata instance. ``` -docker compose --env-file ./config/.env.prod up -d openmetadata_ingestion +server { + access_log /var/log/nginx/stage-reverse-access.log; + error_log /var/log/nginx/stage-reverse-error.log; + server_name stage.open-metadata.org; + location / { + proxy_pass http://127.0.0.1:8585; + } +} ``` -### Steps for Deploying OpenMetadata-Server -- Download the docker-compose.yml file from the release page [here](https://github.com/open-metadata/OpenMetadata/releases). -- Update the environment variables below for OpenMetadata-Ingestion Docker Compose backed systems to connect with Database and ElasticSearch and Ingestion. 
-``` -# MySQL Environment Variables -DB_DRIVER_CLASS='com.mysql.cj.jdbc.Driver' -DB_SCHEME='mysql' -DB_USE_SSL='true' -DB_USER_PASSWORD='' -DB_HOST='' -DB_USER='' -OM_DATABASE='' -DB_PORT='' -# ElasticSearch Environment Variables -ELASTICSEARCH_SOCKET_TIMEOUT_SECS='60' -ELASTICSEARCH_USER='' -ELASTICSEARCH_CONNECTION_TIMEOUT_SECS='5' -ELASTICSEARCH_PORT='443' -ELASTICSEARCH_SCHEME='https' -ELASTICSEARCH_BATCH_SIZE='10' -ELASTICSEARCH_HOST='' -ELASTICSEARCH_PASSWORD='' -# Ingestion or Airflow Environment Variables -AIRFLOW_HOST: '' -SERVER_HOST_API_URL: '' -``` -Once the environment variables values with the RDS are updated then provide this environment variable file as part of docker compose command. - -``` -docker compose --env-file ./config/.env.prod up -d openmetadata_server -``` ## Run OpenMetadata with AWS Services If you are running OpenMetadata in AWS, it is recommended to use [Amazon RDS](https://docs.aws.amazon.com/rds/index.html) and [Amazon OpenSearch Service](https://docs.aws.amazon.com/opensearch-service/?id=docs_gateway). -We support +We support - Amazon RDS (MySQL) engine version 8 or greater - Amazon OpenSearch (ElasticSearch) engine version up to 7.10 or Amazon OpenSearch engine version up to 1.3 -- Amazon RDS (PostgreSQL) engine version between 12 and 14.6 +- Amazon RDS (PostgreSQL) engine version 12 or greater + +Note:- +When using AWS Services the SearchType Configuration for elastic search should be `opensearch`, for both cases ElasticSearch and OpenSearch, +as you can see in the ElasticSearch configuration example. For Production Systems, we recommend Amazon RDS to be in Multiple Availability Zones. For Amazon OpenSearch (or ElasticSearch) Service, we recommend Multiple Availability Zones with minimum 3 Master Nodes. 
@@ -244,6 +265,7 @@ DB_USER='' OM_DATABASE='' DB_PORT='' # ElasticSearch Environment Variables +SEARCH_TYPE='opensearch' ELASTICSEARCH_SOCKET_TIMEOUT_SECS='60' ELASTICSEARCH_USER='' ELASTICSEARCH_CONNECTION_TIMEOUT_SECS='5' @@ -256,8 +278,51 @@ ELASTICSEARCH_PASSWORD='' Replace the environment variables values with the RDS and OpenSearch Service ones and then provide this environment variable file as part of docker compose command. +```bash +docker compose --env-file ./env-mysql up --detach +``` + +## Advanced + +### Add Docker Volumes for OpenMetadata Server Compose Service + +There are many scenarios where you would want to provide additional files to the OpenMetadata Server and serve them while running the application. In such scenarios, it is recommended to provision docker volumes for OpenMetadata Application. + +{%note noteType="Tip"%} + +If you are not familiar with Docker Volumes with Docker Compose Services, please refer to the [official documentation](https://docs.docker.com/storage/volumes/#use-a-volume-with-docker-compose) for more information. + +{%/note%} + +For example, we would like to provide custom JWT Configuration Keys to be served to OpenMetadata Application. This requires the OpenMetadata Containers to have docker volumes sharing the private and public keys. Let's assume you have the keys available in `jwtkeys` directory in the same directory where your `docker-compose` file is available in the host machine. + +We add the volumes section to mount the keys onto the docker containers created with docker compose as follows - + +```yaml +services: + openmetadata-server: + ... + volumes: + - ./jwtkeys:/etc/openmetadata/jwtkeys + ... ``` -docker compose --env-file ./config/.env.prod up -d openmetadata_server + +The above example uses [bind mounts](https://docs.docker.com/storage/bind-mounts/#use-a-bind-mount-with-compose) to share files and directories between host machine and openmetadata container. 
+ +Next, in your environment file, update the jwt configurations to use the right path from inside the container. + +```bash +... +# JWT Configuration +RSA_PUBLIC_KEY_FILE_PATH="/etc/openmetadata/jwtkeys/public_key.der" +RSA_PRIVATE_KEY_FILE_PATH="/etc/openmetadata/jwtkeys/private_key.der" +... +``` + +Once the changes are made, remove any running containers first using the `docker compose down` command, then recreate the containers with the command below + +```commandline +docker compose up --detach ``` ## Troubleshooting @@ -283,26 +348,26 @@ OPENMETADATA_HEAP_OPTS="-Xmx2G -Xms2G" The flag `Xmx` specifies the maximum memory allocation pool for a Java virtual machine (JVM), while `Xms` specifies the initial memory allocation pool. -Restart the OpenMetadata Docker Compose Application using `docker compose --env-file my-env-file -f docker-compose.yml up -d` which will recreate the containers with new environment variable values you have provided. - -# Production Deployment - -If you are planning on going to PROD, we also recommend taking a look at the following -other deployment strategies: - -{%inlineCalloutContainer%} - {%inlineCallout - color="violet-70" - icon="storage" - bold="Deploy on Bare Metal" - href="/deployment/bare-metal"%} - Deploy OpenMetadata directly using the binaries. - {%/inlineCallout%} - {%inlineCallout - color="violet-70" - icon="fit_screen" - bold="Deploy on Kubernetes" - href="/deployment/kubernetes"%} - Deploy and scale with Kubernetes - {%/inlineCallout%} -{%/inlineCalloutContainer%} +Restart the OpenMetadata Docker Compose Application using `docker compose --env-file my-env-file -f docker-compose.yml up --detach` which will recreate the containers with new environment variable values you have provided. 
+ +### PostgreSQL Issue permission denied to create extension "pgcrypto" + +{% partial file="/v1.2/deployment/postgresql-issue-permission-denied-extension-pgcrypto.md" /%} + +{%note%} + +In the above command, replace `` with the sql user used by OpenMetadata Application to connect to PostgreSQL Database. + +{%/note%} + +## Security + +Please follow our [Enable Security Guide](/deployment/docker/security) to configure security for your OpenMetadata +installation. + +## Next Steps + +1. Visit the [Features](/releases/features) overview page and explore the OpenMetadata UI. +2. Visit the [Connectors](/connectors) documentation to see what services you can integrate with + OpenMetadata. +3. Visit the [API](/swagger.html) documentation and explore the rich set of OpenMetadata APIs. diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/authorization.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/authorization.md new file mode 100644 index 000000000000..2587b2b77a99 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/authorization.md @@ -0,0 +1,142 @@ +--- +title: Building Blocks of Authorization - Rules, Policies, and Roles +slug: /how-to-guides/admin-guide-roles-policies/authorization +--- + +# Building Blocks of Authorization: Rules, Policies, and Roles + +## Building Blocks of Authorization: Rules + +A Rule in a Policy is the building block of Authorization. It contains the following: +1. **Name:** A unique name to define the rule +2. **Description:** Description of the rule +3. **Resources:** List of resources this rule applies to. An Admin can select a specific resource, such as Table or All, to apply against all resources. +4. **Operations:** List of operations this rule applies to. An Admin can select a specific operation such as EditOwner or All to apply against all the operations. +5. 
**Condition:** Expressions written using policy functions that evaluate to true or false. +6. **Effect:** Deny or Allow the operation. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules3.png" +alt="Building Blocks of Authorization: Rules" +caption="Building Blocks of Authorization: Rules" +/%} + +## Condition + +OpenMetadata provides [SpEL](https://docs.spring.io/spring-framework/docs/3.0.x/reference/expressions.html)-based conditions for Admins to select during rule creation. + +Here are some examples of conditions. + +| Condition | Description | +|--- | --- | +| **noOwner()** | Returns true if the resource being accessed has no owner. | +| **isOwner()** | Returns true if the user accessing the resource is the owner of the resource. | +| **matchAllTags(tagFqn, [tagFqn…])** | Returns true if the resource has all the tags from the tag list. | +| **matchAnyTag(tagFqn, [tagFqn…])** | Returns true if the resource has any of the tags from the tag list. | +| **matchTeam()** | Returns true if the user belongs to the team that owns the resource. | + +Conditions are used to assess Data Assets such as Tables, Topics, and Dashboards for specific attributes. + +Example: Consider the noOwner() condition when applied to the table **fact_orders**. If this table lacks an assigned owner, then the condition returns **true**. However, if an owner is present, it returns **false**. + +Another instance: The ***matchAnyTag('PII.Sensitive')*** condition, when applied to the **dim_address** table that carries the PII.Sensitive tag, will yield **true**. But, without this specific tag, the outcome is **false**. + +You can also combine conditions using logical operators like AND (&&) or OR (||). + +For instance, the combined condition: +***noOwner() && matchAllTags('PersonalData.Personal', 'Tier.Tier1', 'Business Glossary.Clothing')*** +will produce a true result if the Data Asset (be it a Table, Topic, etc.) has no assigned owner and concurrently matches all specified tags. 
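The condition semantics described above can be modelled in a few lines of Python. This is only an illustrative sketch: the `Resource` class, the Python function names, the owner name `data-eng`, and the tags placed on `fact_orders` are assumptions made for the example, and OpenMetadata itself evaluates these conditions as SpEL expressions rather than Python.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Resource:
    """Hypothetical stand-in for a data asset such as a table or topic."""
    name: str
    owner: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


def no_owner(resource: Resource) -> bool:
    # Mirrors noOwner(): true when the resource has no owner.
    return resource.owner is None


def match_all_tags(resource: Resource, *tag_fqns: str) -> bool:
    # Mirrors matchAllTags(...): true when every listed tag is on the resource.
    return set(tag_fqns) <= resource.tags


def match_any_tag(resource: Resource, *tag_fqns: str) -> bool:
    # Mirrors matchAnyTag(...): true when at least one listed tag is on the resource.
    return bool(set(tag_fqns) & resource.tags)


# Assumed sample data: an unowned table carrying three tags, and an owned PII table.
fact_orders = Resource(
    "fact_orders",
    owner=None,
    tags={"PersonalData.Personal", "Tier.Tier1", "Business Glossary.Clothing"},
)
dim_address = Resource("dim_address", owner="data-eng", tags={"PII.Sensitive"})

# Equivalent of: noOwner() && matchAllTags('PersonalData.Personal', 'Tier.Tier1', ...)
combined = no_owner(fact_orders) and match_all_tags(
    fact_orders, "PersonalData.Personal", "Tier.Tier1", "Business Glossary.Clothing"
)
print(combined)                                      # True
print(match_any_tag(dim_address, "PII.Sensitive"))   # True
print(no_owner(dim_address))                         # False
```

Modelling tags as a set keeps both tag checks to a single set operation, which makes the difference between "all" (subset) and "any" (non-empty intersection) easy to see.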
+ +These dynamic conditions empower admins to craft rules that holistically consider Data Assets and their attributes when dictating access control. + +### Default Policy and Rule + +When navigating to Settings -> Policies -> Organization Policy, you'll discover the default rules set at the organizational level. Here’s a quick breakdown of these rules: + +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules4.png" +alt="Default Policy and Rule" +caption="Default Policy and Rule" +/%} + +#### OrganizationPolicy-NoOwner-Rule + +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules5.png" +alt="Organization Policy - No Owner - Rule" +caption="Organization Policy - No Owner - Rule" +/%} + +**Purpose:** This rule allows users to assign ownership to resources without an owner. + +**Example:** If a user accesses fact_table and finds that it is unowned, then they can modify the ownership field to establish a new owner. However, for a table like dim_address that already has an assigned owner, any attempt to change the ownership will be restricted. + +#### OrganizationPolicy-Owner-Rule + +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules6.png" +alt="Organization Policy - Owner - Rule" +caption="Organization Policy - Owner - Rule" +/%} + +**Purpose:** This rule grants permissions based on the ownership of a data asset. + +**Details:** When a user who either personally owns a table or is part of the team owning that table logs in, they're granted extensive rights. They can modify all properties of that Data Asset and access complete information about it. + +By setting up such default rules, the organization ensures clarity in access control based on ownership status and user roles. + +## Building Blocks of Authorization: Policies + +A policy encompasses multiple rules, as delineated earlier. When a user accesses a resource, the policy evaluates all its associated rules in the context of that user's current session. 
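As a rough illustration of that evaluation, the Python sketch below checks every rule in a policy against a request and lets a matching Deny override any matching Allow, as the conflict-resolution note in this guide states. The `Rule` class, the `is_allowed` function, the rule names, and the choice to refuse a request when no rule matches are all assumptions made for the example; they are not OpenMetadata's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical rule model; fields follow the rule parts listed in this guide."""
    name: str
    resources: List[str]    # e.g. ["table"] or ["All"]
    operations: List[str]   # e.g. ["EditOwner"] or ["All"]
    effect: str             # "Allow" or "Deny"
    condition: Optional[Callable[[Dict], bool]] = None

    def applies(self, resource_type: str, operation: str, context: Dict) -> bool:
        # A rule applies when resource, operation, and condition all match.
        resource_ok = "All" in self.resources or resource_type in self.resources
        operation_ok = "All" in self.operations or operation in self.operations
        condition_ok = self.condition is None or self.condition(context)
        return resource_ok and operation_ok and condition_ok


def is_allowed(rules: List[Rule], resource_type: str, operation: str, context: Dict) -> bool:
    # Evaluate every rule; a matching Deny takes precedence over any Allow.
    matched = [r for r in rules if r.applies(resource_type, operation, context)]
    if any(r.effect == "Deny" for r in matched):
        return False
    # Assumption: with no matching Allow rule, the request is refused.
    return any(r.effect == "Allow" for r in matched)


allow_edit = Rule("allow-edit-description", ["All"], ["EditDescription"], "Allow")
deny_edit = Rule("deny-edit-description", ["All"], ["EditDescription"], "Deny")
no_owner_rule = Rule(
    "no-owner-rule", ["All"], ["EditOwner"], "Allow",
    condition=lambda ctx: ctx.get("owner") is None,
)

print(is_allowed([allow_edit], "table", "EditDescription", {}))             # True
print(is_allowed([allow_edit, deny_edit], "table", "EditDescription", {}))  # False
print(is_allowed([no_owner_rule], "table", "EditOwner", {"owner": None}))   # True
print(is_allowed([no_owner_rule], "table", "EditOwner", {"owner": "ann"}))  # False
```

The last two calls show how a rule shaped like the default no-owner rule only takes effect while the resource is unowned.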
+ +**Resolving Conflicts** +In instances where a policy has contradicting rules - for example, one rule allows "EditDescription" for all resources while another denies the same - the "Deny" action takes precedence. + +**Policy Assignments** +While policies can be associated with a specific team within the organizational hierarchy, they cannot be directly linked to individual users. + +**Inheritance and Application** + +{% image +src="/images/v1.1/how-to-guides/roles-policies/inheritance.png" +alt="Inheritance and Application" +caption="Inheritance and Application" +/%} + +Any user positioned within a team structure inherently adopts its policies. For instance, if the **"Organization-NoOwner-Policy"** is instituted at the organization's apex, all its internal members will be governed by this policy and the rules therein. + +Similarly, if a policy is designated to "Division1" such as “Division Policy”, every member, be it "Department", "Team1", "Team2", or the individual users in these groups, will fall under its purview. + +However, if you formulate a policy explicitly for "Team1", only the members of "Team1" will be affected. + +**The Philosophy Behind the Design** + +This architecture aims to establish broad, overarching rules at the organizational level, potentially being more lenient in nature. As you move down the hierarchy, teams can sculpt stricter, more tailored policies. For instance, a policy might dictate, "Deny access to everyone outside of Team 1." This ensures a blend of flexibility at the top and precision at the grassroots level. + +## Building Blocks of Authorization: Roles + +Policies serve as mechanisms to enforce authorization, while Roles offer a more structured hierarchy for the same purpose. Each role is closely aligned with a user's function or job description. 
+ +{% image +src="/images/v1.1/how-to-guides/roles-policies/role1.png" +alt="Building Blocks of Authorization: Roles" +caption="Building Blocks of Authorization: Roles" +/%} + +For instance, in an organization, you might have: +- **Data Engineers:** Tasked with producing data assets. +- **Data Scientists:** Responsible for creating dashboards and utilizing assets developed by Data Engineers. +- **Data Stewards:** Experts in all data-related matters who oversee governance duties within the organization. + +Roles provide the advantage of bundling multiple policies that encapsulate a user’s specific function or job. For example, a "Data Consumer" role would encompass a "Data Consumer Policy." Any individual or team assigned this role would automatically be subject to the stipulations outlined in the associated policies. + +Moreover, roles can be allocated either to individual users or teams within an organizational hierarchy. When a role is assigned to a team, every member of that team inherits the privileges of that role. This design is intentional, aiming to simplify the role assignment process for administrators. + +{%inlineCallout + color="violet-70" + bold="Use Cases: Creating Roles & Policies in OpenMetadata" + icon="MdArrowForward" + align="right" + href="/how-to-guides/admin-guide-roles-policies/use-cases"%} + Tailor your policies to meet your organizational and team needs. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/index.md new file mode 100644 index 000000000000..e4c4008e6e5f --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/index.md @@ -0,0 +1,111 @@ +--- +title: Admin Guide for Roles and Policies +slug: /how-to-guides/admin-guide-roles-policies +--- + +# Admin Guide for Roles and Policies + +## Users and Teams + +OpenMetadata introduces a versatile hierarchical team structure that aligns with your organization's setup. Administrators can mirror their organizational hierarchy by creating various team types. + +**Organization** serves as the foundation of the team hierarchy representing the entire company. Under Organization, you can add Business Units, Divisions, Departments, Groups, and Users. For instance, if your company is Facebook, then the Organization represents entire Facebook itself, which further houses diverse teams like Engineering, Sales, Finance, and Marketing. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/all-teams.png" +alt="Teams Hierarchy" +caption="Teams Hierarchy" +/%} + +**BusinessUnit** is positioned one level below the Organization and can contain other Business Units, Divisions, Departments, and Groups. To illustrate, the Engineering Business Unit could be one of the top-tier Business Units in the Organization. It contains other teams like Groups and additional Business Units. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/b-u.png" +alt="Business Unit" +caption="Business Unit" +/%} + +**Division** is positioned below Business Unit and can include Divisions, Departments, and Groups. For example, a Division named 'Product Development' under the Engineering Business Unit. 
It can have teams like 'Software Division,' 'Hardware Division,' and 'QA Division.' + +**Department** is positioned below Division and can include other Departments and Groups. For example, a 'Data Engineering' Department could include specialized teams like 'Infrastructure,' 'Data Science,' and 'Platform.' + +**Group** represents the final tier in this hierarchy. It contains a group of users that reflect finite teams within your organization. + +***Notably, only Groups have the privilege of owning Data Assets within the OpenMetadata platform.*** + +This structured hierarchy enhances your control over team management and resource ownership. By creating a dynamic model mirroring your organization's functions, OpenMetadata empowers you to effortlessly manage permissions, access controls, and data ownership at different levels of granularity. + +## Access Control Design: Roles and Policies + +OpenMetadata incorporates a robust Access Control framework that merges Role-Based Access Control (RBAC) with Attribute-Based Access Control (ABAC) in a powerful hybrid model. This security design is reinforced by: + +**Authentication with SSO Integration:** OpenMetadata seamlessly integrates with various Single Sign-On (SSO) providers, including Azure AD, Google, Okta, Auth0, OneLogin, and more. This ensures a unified and secure authentication experience for users. + +**Team Hierarchy:** OpenMetadata offers a structured team hierarchy that mirrors your organization's structure, enhancing manageability and granularity in access control. + +**Roles and Policies:** Policies and Roles are pivotal in determining who can access what resources and perform what actions. These policies are based on a combination of user attributes, roles, and resource attributes. + +**User and Bots Authentication:** OpenMetadata accommodates human users and automated applications (bots). For human users, logging into the OpenMetadata UI mandates SSO authentication. 
Upon successful authentication, a JWT token is issued. +Bots, on the other hand, are equipped with a JWT token generated based on SSL certificates. This token serves as their identity and authorization mechanism when interacting with the OpenMetadata server APIs. + +## Authentication Flow + +{% image +src="/images/v1.1/how-to-guides/roles-policies/auth.png" +alt="Authentication Flow" +caption="Authentication Flow" +/%} + +**User Authentication:** When users access the OpenMetadata UI, they authenticate with their SSO provider. Upon successful authentication, a JWT token is generated. This token validates the user's session and permits them to authenticate requests to the OpenMetadata server. + +**Bot Authentication:** Automated applications like the ingestion connector are equipped with a pre-generated JWT Token. OpenMetadata, with its configured SSL Certificates, authenticates the JWT token, establishing the bot's identity. This token authorizes the bot to interact with OpenMetadata server APIs. + +## Authorization Framework + +OpenMetadata's authorization is a result of evaluating three crucial factors: + +{% image +src="/images/v1.1/how-to-guides/roles-policies/access.png" +alt="Authorization Framework" +caption="Authorization Framework" +/%} + +**Who is the User (Authentication):** This aspect is determined by the authentication process – whether it is a user or a bot – ensuring that only authorized entities access the system. + +**What Resource (Resource Attributes):** Based on the API calls being made, OpenMetadata identifies the target resource and its associated attributes. + +Below is a list of resources that correspond to Entities such as Table, Topic, Pipeline, etc. 
+ +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules1.png" +alt="Resources Correspond to Entities" +caption="Resources Correspond to Entities" +/%} + +**What Operation (API Call):** Each API call is linked to a specific operation, such as editing descriptions, deleting tags, changing ownership, etc. + +There are common operations such as Create, Delete, and ViewAll that apply to all the resources. Each resource can also have its own specific operations, such as ViewTests and ViewQueries for Table. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/rules2.png" +alt="Each Resource has its Own Set of Granular Operations" +caption="Each Resource has its Own Set of Granular Operations" +/%} + +By synthesizing these components, OpenMetadata dynamically ascertains whether a user or bot can perform a particular action on a specific resource. This **fusion of RBAC and ABAC** in the hybrid model contributes to a robust and flexible access control mechanism, bolstering the security and control of your OpenMetadata environment. + +{%inlineCallout + color="violet-70" + bold="Building Blocks of Authorization: Rules, Policies, and Roles" + icon="add_moderator" + href="/how-to-guides/admin-guide-roles-policies/authorization"%} + Learn all the details of Rules, Policies, and Roles +{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="Use Cases: Creating Roles & Policies in OpenMetadata" + icon="add_moderator" + href="/how-to-guides/admin-guide-roles-policies/use-cases"%} + Tailor your policies to meet your organizational and team needs. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/use-cases.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/use-cases.md new file mode 100644 index 000000000000..52bfc8429c37 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/admin-guide-roles-policies/use-cases.md @@ -0,0 +1,64 @@ +--- +title: Use Cases - Creating Roles & Policies in OpenMetadata +slug: /how-to-guides/admin-guide-roles-policies/use-cases +--- + +# Use Cases: Creating Roles & Policies in OpenMetadata + +OpenMetadata comes with default configurations such as the Organization Policy and Data Consumer Roles. These roles are set up to foster data collaboration. + +We advise retaining the Organization policy, which enables everyone to view the assets and claim ownership when no owner is specified. + +For individual teams, tailor your policies according to the specific needs of both the organization and the team. You may choose to adopt stricter policies as detailed in the previous sections. + +### Use Case 1: We want our teams to be able to create services and extract metadata + +You can create a policy with DatabaseService, Ingestion Pipeline, and Workflow resources with All operations set to allow. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/policy1.png" +alt="Creating Roles & Policies in OpenMetadata" +caption="Allow All Operations" +/%} + +You can create a Role, such as a ServiceOwner role, and assign the above policy to it. Once the role is created, you can assign it to users so that they can create services themselves without the need for an Admin. + +### Use Case 2: Roles for Data Steward + +A data steward in OpenMetadata should be able to create Glossaries and Glossary Terms and be able to view all data and manage it for governance purposes. + +Here is an example of a policy to enable it for Data Stewards using two rules. +1. 
**Allow Glossary Operations:** Allows all Glossary-related operations. +2. **Edit Rule:** Grants the Data Steward access to edit descriptions and tags on all entities, enabling the user to manage the data. + +You can fine-tune these permissions to suit your organizational needs. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/policy2.png" +alt="Roles for Data Steward" +caption="Roles for Data Steward" +/%} + +### Use Case 3: Only the team that owns the data asset should be able to access it + +To safeguard the data owned by a specific team, you can prevent external access. + +The rule shown below denies all operations if the logged-in user is not the owner, or if the logged-in user’s team is not the owner, of an asset. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/policy3.png" +alt="Team Only Policy" +caption="Team Only Policy" +/%} + +### Use Case 4: Deny all the access if the data asset is tagged with PII.Sensitive and allow only the owners + +Just like the above policy, you can create a rule with complex conditions as shown below. + +{% image +src="/images/v1.1/how-to-guides/roles-policies/policy4.png" +alt="PII Sensitive Tag Policy" +caption="PII Sensitive Tag Policy" +/%} + +This rule denies operations if the table is tagged with PII.Sensitive and if the logged-in user is not the owner, or their team is not the owner of the Table. 
\ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/aws/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/aws/index.md deleted file mode 100644 index d3cc0aac7100..000000000000 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/aws/index.md +++ /dev/null @@ -1,26 +0,0 @@ ---- -title: How to enable AWS RDS IAM Auth on postgresql -slug: /how-to-guides/aws/index.md ---- - -# Aws resources on Rds IAM Auth -https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html - -# Requirements -1. AWS Rds Cluster with IAM auth enabled -2. User on Db Cluster with iam enabled -3. IAM policy with permission on rds connect -4. Role with IAM policy attached -5. IAM role attached to ec2 instance on which openmetadata is deployed or ServiceAccount/Kube2Iam role attached to pod - -# How to enable ADS RDS IAM Auth on postgresql - -Set environment variables -```Commandline - DB_USER_PASSWORD: "dummy" - DB_PARAMS: "awsRegion=eu-west-1&allowPublicKeyRetrieval=true&sslmode=require&serverTimezone=UTC" -``` -Either through helm (if deployed in kubernetes) or as env vars - -# Note -The `DB_USER_PASSWORD` is still required and cannot be empty. Set it to a random/dummy string. diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/cli-ingestion-with-basic-auth/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/cli-ingestion-with-basic-auth/index.md index dcd2ee1bbbd3..d0c8e5211ef9 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/cli-ingestion-with-basic-auth/index.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/cli-ingestion-with-basic-auth/index.md @@ -21,13 +21,13 @@ From `0.12.1` OpenMetadata has changed the default `no-auth` to `Basic` auth, So **1.** Go to the `settings` page from the navbar and then scroll down to the `Integrations` Section. Click on the `Bots` and you will see the list of bots, then click on the `ingestion-bot`. 
{% image - src="/images/v1.2/cli-ingestion-with-basic-auth/bot-list.png" + src="/images/v1.1/cli-ingestion-with-basic-auth/bot-list.png" alt="bot-list" /%} **2.** You will be redirected to the `ingestion-bot` details page. there you will get the JWT token, click on the copy button and copy the JWT token. {% image -src="/images/v1.2/cli-ingestion-with-basic-auth/bot-token.png" +src="/images/v1.1/cli-ingestion-with-basic-auth/bot-token.png" alt="bot-token" /%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/feature-configurations/bots.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/feature-configurations/bots.md index 771a766ce891..00d64dc96154 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/feature-configurations/bots.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/feature-configurations/bots.md @@ -15,7 +15,7 @@ for Google SSO, but it can apply to any SSO. - Click on `ingestion-bot`: {% image -src="/images/v1.2/how-to-guides/feature-configurations/bots/click-bot.png" +src="/images/v1.1/how-to-guides/feature-configurations/bots/click-bot.png" alt="click-bot" caption="Click on 'ingestion-bot'" /%} @@ -24,7 +24,7 @@ caption="Click on 'ingestion-bot'" /%} generated JWT Token by clicking the "**Revoke**" button: {% image -src="/images/v1.2/how-to-guides/feature-configurations/bots/revoke-jwt-token.png" +src="/images/v1.1/how-to-guides/feature-configurations/bots/revoke-jwt-token.png" alt="revoke-jwt-toke" caption="Revoke JWT Token" /%} @@ -32,15 +32,14 @@ caption="Revoke JWT Token" /%} - Then, click on "**Generate New Token**": {% image -src="/images/v1.2/how-to-guides/feature-configurations/bots/generate-new-token.png" +src="/images/v1.1/how-to-guides/feature-configurations/bots/generate-new-token.png" alt="generate-new-token" caption="Generate New Token to edit" /%} - Select your configured SSO from the list. In this case, `Google SSO`. 
-{% image -src="/images/v1.2/how-to-guides/feature-configurations/bots/select-google-sso.png" +{% image src="/images/v1.1/how-to-guides/feature-configurations/bots/select-google-sso.png" alt="select-google-sso" caption="Select 'Google SSO'" /%} @@ -48,7 +47,7 @@ caption="Select 'Google SSO'" /%} bot. {% image -src="/images/v1.2/how-to-guides/feature-configurations/bots/configure-bot.png" +src="/images/v1.1/how-to-guides/feature-configurations/bots/configure-bot.png" alt="configure-bot" caption="Configure the ingestion-bot with your SSO values" /%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-custom-logo.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-custom-logo.md index d20243288f49..416f070b7cfa 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-custom-logo.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-custom-logo.md @@ -10,14 +10,14 @@ To change the Logo for the application, we need to update logo at two locations. 1. Login Page {% image -src="/images/v1.2/how-to-guides/login-Page-Logo.png" +src="/images/v1.1/how-to-guides/login-Page-Logo.png" alt="loginPage-image" /%} 2. 
Navigation Bar {% image -src="/images/v1.2/how-to-guides/nav-Bar-Logo.png" +src="/images/v1.1/how-to-guides/nav-Bar-Logo.png" alt="navBar-image" /%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-language-support.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-language-support.md index 34f85f873c52..7419ad232404 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-language-support.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/how-to-add-language-support.md @@ -36,7 +36,7 @@ To copy the contents of en-us.json and add it to your translation JSON file, fol You can refer to the image below for a visual guide: {% image -src="/images/v1.2/how-to-guides/language-support.png" +src="/images/v1.1/how-to-guides/language-support.png" alt="copy-en-us" /%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/index.md index 723de36f9330..35ce858ed000 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/index.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/index.md @@ -6,3 +6,93 @@ slug: /how-to-guides # How to Guides How to Guides will give you a walk through on `How to do things in OpenMetadata`. + +# Overview of OpenMetadata + +## What is OpenMetadata? + +OpenMetadata is an all-in-one platform for data discovery, lineage, data quality, observability, governance, and team collaboration. It is one of the fastest-growing open-source projects with a vibrant community and adoption by a diverse set of companies in a variety of industry verticals. Powered by a centralized metadata store based on Open Metadata Standards/APIs, supporting connectors to a wide range of data services, OpenMetadata enables end-to-end metadata management, giving you the freedom to unlock the value of your data assets. 
+ +## Features in OpenMetadata + +{% tilesContainer %} +{% tile + title="Data Discovery" + description="Discover the right data assets to make timely business decisions." + link="/how-to-guides/openmetadata/data-discovery" + icon="discovery" +/%} +{% tile + title="Data Collaboration" + description="Foster data team collaboration to enhance data understanding." + link="/how-to-guides/openmetadata/data-collaboration" + icon="collaboration" +/%} +{% tile + title="Data Quality & Profiler" + description="Trust your data with quality tests that ensure freshness, & accuracy." + link="/how-to-guides/openmetadata/data-quality-profiler" + icon="quality" +/%} +{% tile + title="Data Lineage" + description="Trace the path of data across tables, pipelines, and dashboards." + link="/how-to-guides/openmetadata/data-lineage" + icon="lineage" +/%} +{% tile + title="Data Insights" + description="Define KPIs and set goals to proactively hone the data culture of your company." + link="/how-to-guides/openmetadata/data-insights" + icon="discovery" +/%} +{% tile + title="Data Governance" + description="Enhance your data platform governance using OpenMetadata." + link="/how-to-guides/openmetadata/data-governance" + icon="governance" +/%} +{% /tilesContainer %} + +## Quick Start Guides + +{% tilesContainer %} +{% tile + title="Admin Guide" + description="Admin users can get started with OpenMetadata with just three quick and easy steps & know-it-all with the advanced guides." + link="/how-to-guides/quick-start-guide-for-admins" + icon="administration" +/%} +{% tile + title="Guide for Data Stewards" + description="Get to know the basics of OpenMetadata and about the data assets that you can explore in the all-in-one platform." + link="/how-to-guides/user-guide-for-data-stewards" + icon="steward" +/%} +{% /tilesContainer %} + +## How OpenMetadata helps Data Teams? 
+ +OpenMetadata is a complete package for data teams to break down team silos, share data assets from multiple sources securely, collaborate around data, and build a documentation-first data culture in the organization. + +{% note %} + +- Centralized, **Single Source of Truth** for all your metadata. + +- **[Discover](/how-to-guides/openmetadata/data-discovery)** the right assets in time and reduce dependencies. + +- Foster **[Team Collaboration](/how-to-guides/openmetadata/data-collaboration)** with conversations, tasks, announcements, and alerts in real time. + +- Build trust in your data with **[Data Quality Tests](/how-to-guides/openmetadata/data-quality-profiler)** to ensure completeness and accuracy. + +- Track your data evolution with end-to-end **[Data Lineage](/how-to-guides/openmetadata/data-lineage)**. + +- Secure access to sensitive data by defining **[Roles and Policies](/how-to-guides/admin-guide-roles-policies)**. + +- Enhance organizational **[Data Culture](/how-to-guides/openmetadata/data-insights)** to gain crucial insights to drive innovation. + +- Define your **[Glossary](/how-to-guides/openmetadata/data-governance/glossary-classification)** to build a common understanding of terms within your organization. + +- Implement **[Data Governance](/how-to-guides/openmetadata/data-governance)** to maintain data integrity, security, and compliance. 
+ +{% /note %} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/activity-feeds.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/activity-feeds.md new file mode 100644 index 000000000000..8eb2f6f3b7e3 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/activity-feeds.md @@ -0,0 +1,26 @@ +--- +title: Understanding Activity Feeds +slug: /how-to-guides/openmetadata/data-collaboration/activity-feeds +--- + +# Understanding Activity Feeds + +The **Activity Feeds** in OpenMetadata displays all the activities around the data you own and the data you follow. Activity Feeds are great conversation starters to collaborate with your team. Click on the OpenMetadata icon to access the landing page with the activity feeds. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/activity1.png" +alt="My Data: Activity Feed Widget" +caption="My Data: Activity Feed Widget" +/%} + +The **Activity Feeds Widget** displays: +- **All:** All the activities related to the data assets that you own, follow, or where you are mentioned +- **@Mentions:** Feeds where you are mentioned +- **Tasks:** Tasks created by you, or assigned to you are displayed. Only the Open tasks are displayed here. + +Users can Reply to discuss further as well as Edit, Delete, or share Reactions with the team by using emojis. 
+{% image +src="/images/v1.1/how-to-guides/collaboration/emoji.png" +alt="React with Emojis" +caption="React with Emojis" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/add-announcement.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/add-announcement.md new file mode 100644 index 000000000000..9cb98c0d915b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/add-announcement.md @@ -0,0 +1,45 @@ +--- +title: How to Create an Announcement +slug: /how-to-guides/openmetadata/data-collaboration/add-announcement +--- + +# How to Create an Announcement + +{% note noteType="Tip" %} **Quick Tip:** Always watch out for announcements on the backward incompatible changes. Saves a ton of debugging time later on for data teams. {% /note %} + +To add an announcement: +- Navigate to **Explore** and the relevant **Data Asset** section to select a specific asset. +- Click on the vertical ellipsis icon **⋮** located on the top right and select **Announcements**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce5.png" +alt="Announcements Option" +caption="Announcements Option" +/%} + +- Click on **Add Announcement**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce6.png" +alt="Add an Announcement" +caption="Add an Announcement" +/%} + +- Enter the following information and click Submit. + - Title of the Announcement + - Start Date + - End Date + - Description + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce7.png" +alt="Add the Announcement Details" +caption="Add the Announcement Details" +/%} + +This announcement will be displayed in OpenMetadata during the scheduled time. It will be displayed to all the users who own or follow that particular data asset. 
+ +{% note noteType="Warning" %} +**Pro Tip:** Create Announcements for deletion, deprecation, and other important changes. Let your team know of a tentative date when these changes will be implemented. +{% /note noteType="Warning" %} +{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/announcements.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/announcements.md new file mode 100644 index 000000000000..ceab68430315 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/announcements.md @@ -0,0 +1,56 @@ +--- +title: Overview of Announcements +slug: /how-to-guides/openmetadata/data-collaboration/announcements +--- + +# Overview of Announcements + +It is a huge challenge to inform the data team about upcoming changes to data. In most organizations, data changes are announced in advance over email or Slack; and sometimes, this information is noticed pretty late, leaving very little time to prepare for the changes. + +In OpenMetadata, **announcements** can be set up to inform the entire team about the upcoming changes to a data asset. With the Announcements feature, you can now inform your entire team of all the upcoming events and changes, such as **deprecation, deletion, or schema changes**. These announcements can be scheduled with a start date and an end date. All the users following your data are not only notified in Activity Feeds but a banner is also shown on the data asset details page. + +{% note %} +**Tip:** Ideally, it’s best to schedule the announcements well in advance before modifying or deleting a data asset, so you can ensure that the entire team has a reasonable amount of time to plan accordingly. 
+{% /note %} + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce1.png" +alt="Banner on Data Assets Page" +caption="Banner on Data Assets Page" +/%} + +{% note noteType="Warning" %} +**Pro Tip:** Ensure that all **backward incompatible changes** are announced to the team well in advance. For example, when deleting a column from a table. +{% /note noteType="Warning" %} +{% /note %} + +Clicking on the announcement will display further details. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce2.png" +alt="Details of the Announcement" +caption="Details of the Announcement" +/%} + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce3.png" +alt="Details of an Announcement" +caption="Details of an Announcement" +/%} + +Details of an announcement are as follows: +- **Creator:** Get to know who added the announcement. +- **Data Asset:** Know the data asset type (Table, Pipeline) as well as name of the data asset it pertains to. +- **Scheduled Date:** A date range can be added during which the announcement will be displayed in OpenMetadata. This consists of a start and end date. + +These announcements are also displayed on the top right of the landing page. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce4.png" +alt="Announcement Display (Top Right)" +caption="Landing Page Announcement Display (Top Right)" +/%} + +{% note %} +**Advanced Tip:** Users can set up Alerts to be sent from OpenMetadata via Email, Chat, Slack, MS Teams, and Webhooks. If alerts have been set up for Activity Feeds, then the concerned data owners and followers will be notified via email, Slack, etc. 
+{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/index.md new file mode 100644 index 000000000000..dd0c1ef52e53 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/index.md @@ -0,0 +1,57 @@ +--- +title: Data Collaboration +slug: /how-to-guides/openmetadata/data-collaboration +--- + +# Overview of Data Collaboration + +OpenMetadata is a catalyst for collaboration that brings data teams together to break the information silos, share organizational knowledge, and sort the data deluge. Users can add documentation, descriptions, and annotations to metadata to provide context and share knowledge about data assets. This encourages collaboration among data users and enhances data understanding. + +There are three important aspects of data collaboration in OpenMetadata: +- **Conversation Threads:** Collaborate around data assets and tags by asking the right questions and discussing the details right within OpenMetadata. + +- **Tasks:** Create tasks around data assets to create and update descriptions, request tags, and initiate a glossary term approval workflow. + +- **Announcements:** Inform your entire team about upcoming events and changes such as deprecation, deletion, or schema changes. + +Watch the video on how to use the collaboration features in OpenMetadata. 
+ +{% youtube videoId="M6mbFLA1bQc" start="0:00" end="5:58" /%} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Understanding Activity Feeds" + icon="MdConnectWithoutContact" + href="/how-to-guides/openmetadata/data-collaboration/activity-feeds"%} + Learn more about the announcements in OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Request for Description" + icon="MdDescription" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-description"%} + Request for a description and discuss the same within OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Request for Tags" + icon="MdDiscount" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-tags"%} + Request for tags and discuss about the same, all within OpenMetadata. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Overview of Announcements" + icon="MdVolumeUp" + href="/how-to-guides/openmetadata/data-collaboration/announcements"%} + Learn more about the announcements in OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Create an Announcement" + icon="MdVolumeUp" + href="/how-to-guides/openmetadata/data-collaboration/add-announcement"%} + Follow the steps to add an announcement + {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-description.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-description.md new file mode 100644 index 000000000000..efc812fcac67 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-description.md @@ -0,0 +1,66 @@ +--- +title: Request for Description +slug: /how-to-guides/openmetadata/data-collaboration/request-description +--- + +# How to Request for 
Description + +Apart from adding a description to the data assets directly, users can also request an update to the description. This is typically done when the user wants another opinion on the description being added, or if the user does not have access to edit the description. Requesting for a description will create a Task in OpenMetadata. + +- Click on the **?** icon next to Description + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc3.png" +alt="Request for Data Asset Description" +caption="Request for Data Asset Description" +/%} + +- A Task will be created with some pre-populated details. Fill in the other important information: + - **Title** - This is auto-populated + - **Assignees** - Multiple users or teams can be added + - **Description** - Add the new description. + - You can view the **Current** description. + - You can add the **New** description. + - It will display the **Difference** as well. + - Click on **Submit** to create the task. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc4.png" +alt="Create a Task for Data Asset Description" +caption="Create a Task for Data Asset Description" +/%} + +Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset. The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc5.png" +alt="Task: Accept Suggestion and Comment" +caption="Task: Accept Suggestion and Comment" +/%} + +## Conversations around the Data Asset Description + +Apart from requesting for a description, users can also create a **Conversation** around the description of a data asset. +- Click on the **Conversation** icon next to the description.
+ +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc6.png" +alt="Conversation around Description" +caption="Conversation around Description" +/%} + +- Start a conversation right within the data asset page. Add **@mention** to tag a user or team. Add a **#mention** to tag a data asset. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc7.png" +alt="Start a Conversation" +caption="Start a Conversation" +/%} + +- Further in the conversation, users can **Reply** to discuss further as well as add **Reactions**, **Edit**, or **Delete**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc8.png" +alt="Conversation: Reply, React, Edit or Delete" +caption="Conversation: Reply, React, Edit or Delete" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-tags.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-tags.md new file mode 100644 index 000000000000..4172680e2f53 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-collaboration/request-tags.md @@ -0,0 +1,66 @@ +--- +title: How to Request for Tags +slug: /how-to-guides/openmetadata/data-collaboration/request-tags +--- + +# How to Request for Tags + +Apart from adding the tags directly to the data assets, users can also request to update tags. This is typically done when the user wants another opinion on the tag being added, or if the user does not have access to add tags directly. Requesting for a tag will create a Task in OpenMetadata. + +- Click on the **?** icon next to tags + +{% image +src="/images/v1.1/how-to-guides/governance/tag8.png" +alt="Request to Update Tags" +caption="Request to Update Tags" +/%} + +- A Task will be created with some pre-populated details. 
Fill in the other important information: + - **Title** - This is auto-populated + - **Assignees** - Multiple users can be added + - **Update Tags** - It displays 3 tabs. + - You can view the **Current** tags. + - You can add the **New** tags. + - It will display the **Difference** as well. + - Click on **Submit** to create the task. + + {% image + src="/images/v1.1/how-to-guides/governance/task1.png" + alt="Add a Task: Request to Update Tags" + caption="Add a Task: Request to Update Tags" + /%} + +Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset. The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**. + +{% image +src="/images/v1.1/how-to-guides/governance/task2.png" +alt="Task: Accept Suggestion and Comment" +caption="Task: Accept Suggestion and Comment" +/%} + +## Conversations around Tags + +Apart from requesting for tags, users can also create a **Conversation** around the tags assigned to a data asset. +- Click on the **Conversation** icon next to the tag. + +{% image +src="/images/v1.1/how-to-guides/governance/ct1.png" +alt="Conversations around Tags" +caption="Conversations around Tags" +/%} + +- Start a conversation right within the data asset page. Add **@mention** to tag a user or team. Add a **#mention** to tag a data asset. + +{% image +src="/images/v1.1/how-to-guides/governance/ct2.png" +alt="Start a Conversation" +caption="Start a Conversation" +/%} + +- Further in the conversation, users can **Reply** to discuss further as well as add **Reactions**, **Edit**, or **Delete**.
+ +{% image +src="/images/v1.1/how-to-guides/governance/ct3.png" +alt="Conversation: Reply, React, Edit or Delete" +caption="Conversation: Reply, React, Edit or Delete" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/advanced.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/advanced.md new file mode 100644 index 000000000000..be8ff7403ea7 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/advanced.md @@ -0,0 +1,54 @@ +--- +title: Add Complex Queries using Advanced Search +slug: /how-to-guides/openmetadata/data-discovery/advanced +--- + +# Add Complex Queries using Advanced Search + +In case of voluminous data, the advanced search option helps to narrow down the search results for data discovery. The query builder supports multiple conditions as well as grouped conditions to simplify search. + +{% note noteType="Tip" %} The advanced search option is a quick, easy-to-use UI query builder that supports complex queries for data discovery. {% /note %} + +To use the advanced search for complex queries: +- Navigate to the Explore page and click on the **Advanced** option on the top right + +{% image +src="/images/v1.1/how-to-guides/discovery/adv1.png" +alt="Advanced Search" +caption="Advanced Search" +/%} + +- Using the **Syntax Editor**, select the **Field** you would like to search by. Currently, the following fields are supported: Deleted, Owner, Tags, Tier, Service, Database, Database Schema, and Column. +- Select the required **Conditions** for your query. The following conditions are supported: Equal to, Not equal to, Any in, Not in, Contains, and Does not contain. The conditions will vary based on the field selected. +- Add in the values for the **Criteria**. +- You can add multiple conditions and group the conditions together. +- Use the AND/OR conditions.
Select `AND` to ensure that all the conditions are satisfied. Select `OR` to ensure that any one of the conditions is satisfied. + +{% image +src="/images/v1.1/how-to-guides/discovery/adv2.png" +alt="Add Complex Queries using Advanced Search" +caption="Add Complex Queries using Advanced Search" +/%} + +For example, we can set up a complex query as follows: +- Group one set of conditions together by defining the `Owner`. You can add multiple conditions to define different owners and use the `OR` condition to ensure that the owner is any one among them. + +{% image +src="/images/v1.1/how-to-guides/discovery/adv3.png" +alt="Grouped Condition based on the Owner of the Data Assets" +caption="Grouped Condition based on the Owner of the Data Assets" +/%} + +- Next, you can add another set of conditions specific to the data based on the Service, Database, Schema, or Columns. **Apply** the conditions to search. + +{% image +src="/images/v1.1/how-to-guides/discovery/adv4.png" +alt="Advanced Search Conditions" +caption="Advanced Search Conditions" +/%} + +{% image +src="/images/v1.1/how-to-guides/discovery/adv5.png" +alt="Advanced Search Results" +caption="Advanced Search Results" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/details.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/details.md new file mode 100644 index 000000000000..90ab75130106 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/details.md @@ -0,0 +1,177 @@ +--- +title: Detailed View of the Data Assets +slug: /how-to-guides/openmetadata/data-discovery/details +--- + +# Detailed View of the Data Assets + +The data asset details page displays the **Source, Owner (Team/User), Tier, Type, Usage, and Description** on the top panel. 
+ +{% image +src="/images/v1.1/how-to-guides/discovery/asset1.png" +alt="Overview of Data Assets" +caption="Overview of Data Assets" +/%} + +# Version History and Other Details + +On the top right of the data asset details page, we can view details on: +- **Tasks:** The circular icon displays the number of open tasks. +- **Version History:** The clock icon displays the details of the version history in terms of major and minor changes. +- **Follow:** The star icon displays the number of users following the data asset. +- **Share:** Users can share the link to the data asset. +- **Announcements:** On clicking the **⋮** icon, users can add announcements. +- **Rename:** On clicking the **⋮** icon, users can rename the data asset. +- **Delete:** On clicking the **⋮** icon, users can delete the data asset. + +{% image +src="/images/v1.1/how-to-guides/discovery/vh.png" +alt="Version History and Other Details" +caption="Version History and Other Details" +/%} + +# Data Asset Tabs +Depending on the type of data asset selected, there are separate tabs for Schema, Activity Feeds & Tasks, Sample Data, Queries, Profiler & Data Quality, Lineage, Custom Properties, Config, Details, Features, Children, and Executions. Let's take a look at each of the tabs.
+ +| **TABS** | **Table** | **Topic** | **Dashboard** | **Pipeline** | **ML Model** | **Container** | +|:--- | :--- | :--- | :--- | :--- | :--- | :--- | +| **Schema** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | +| **Activity Feeds & Tasks** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Sample Data** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Queries** | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Profiler & Data Quality** | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Lineage** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Custom Properties** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Config** | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Details** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon 
iconName="cross" /%} | +| **Executions** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Features** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | +| **Children** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | + +## Schema Tab + +The Schema tab is displayed only for Tables, Topics, and Containers. Schema will display the columns, type of column, and description, along with the tags and glossary terms associated with each column. The table also displays details on the **Frequently Joined Tables, Tags, and Glossary Terms** associated with it. + +{% image +src="/images/v1.1/how-to-guides/discovery/schema.png" +alt="Schema Tab" +caption="Schema Tab" +/%} + +## Activity Feeds & Tasks Tab + +The Activity Feeds & Tasks tab is displayed for all types of data assets. It displays all the tasks and mentions for a particular data asset. + +{% image +src="/images/v1.1/how-to-guides/discovery/aft1.png" +alt="Activity Feeds & Tasks Tab" +caption="Activity Feeds & Tasks Tab" +/%} + +## Sample Data Tab + +During metadata ingestion, you can opt to bring in sample data. If sample data is enabled, the same is displayed here. The Sample Data tab is displayed only for Tables and Topics. + +{% image +src="/images/v1.1/how-to-guides/discovery/sample.png" +alt="Sample Data Tab" +caption="Sample Data Tab" +/%} + +## Queries Tab + +The Queries tab is displayed only for Tables. It displays the SQL queries run against a particular table. It provides the details on when the query was run and the amount of time taken. It also displays if the query was used by other tables.
You can also add new queries. + +{% image +src="/images/v1.1/how-to-guides/discovery/query.png" +alt="Queries Tab" +caption="Queries Tab" +/%} + +## Profiler & Data Quality Tab + +The Profiler & Data Quality tab is displayed only for Tables. It has three sub-tabs for **Table Profile, Column Profile, and Data Quality**. The Profiler brings in details like the number of rows and columns for the table profile along with the details of the data volume, table updates, and volume change. For the column profile, it brings in the details of the type of each column, the value count, null value %, distinct value %, unique %, etc. Data quality tests can be run on this sample data. We can add tests at the table and column level. + +{% image +src="/images/v1.1/how-to-guides/discovery/dq1.png" +alt="Profiler & Data Quality" +caption="Profiler & Data Quality" +/%} + +{% image +src="/images/v1.1/how-to-guides/discovery/dq2.png" +alt="Column Profile of a Table" +caption="Column Profile of a Table" +/%} + +## Lineage Tab + +The Lineage tab is displayed for all types of data assets. The lineage view displays comprehensive lineage to capture the relation between the data assets. OpenMetadata UI displays end-to-end lineage traceability for the table and column levels. It displays both the upstream and downstream for each node. + +{% image +src="/images/v1.1/how-to-guides/discovery/lineage1.png" +alt="Comprehensive Lineage in OpenMetadata" +caption="Comprehensive Lineage in OpenMetadata" +/%} + +Users can configure the number of upstreams, downstreams, and nodes per layer by clicking on the Settings icon. OpenMetadata supports manual lineage. By clicking on the Edit icon, users can edit the lineage and connect the data assets with a no-code editor. Clicking on any data asset in the lineage view will display a preview with the details of the data asset, along with tags, schema, data quality and profiler metrics.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/lineage2.png" +alt="Data Asset Preview in Lineage Tab" +caption="Data Asset Preview in Lineage Tab" +/%} + +## Custom Properties Tab + +OpenMetadata uses a schema-first approach. We also support custom properties for all types of data assets. Organizations can extend the attributes as required to capture custom metadata. The Custom Properties tab shows up for all types of data assets. Users can add or edit the custom property values for the data assets from this tab. Learn [How to Create a Custom Property for a Data Asset](/how-to-guides/user-guide-for-data-stewards/overview-data-assets/custom) + +{% image +src="/images/v1.1/how-to-guides/discovery/custom3.png" +alt="Enter the Value for a Custom Property" +caption="Enter the Value for a Custom Property" +/%} + +## Config Tab + +The Config tab is displayed only for Topics. + +## Details Tab + +The Details tab is displayed only for Dashboards and ML Models. In case of Dashboards, the Details tab displays the chart name, type of chart, and description of the chart. It also displays the associated tags for each chart. +{% image +src="/images/v1.1/how-to-guides/discovery/dsb1.png" +alt="Dashboards: Details Tab" +caption="Dashboards: Details Tab" +/%} + +In case of ML Models, it displays the Hyper Parameters and Model Store details. +{% image +src="/images/v1.1/how-to-guides/discovery/mlm2.png" +alt="ML Models: Details Tab" +caption="ML Models: Details Tab" +/%} + +## Executions Tab + +The Executions tab is displayed only for Pipelines. It displays the Date, Time, and Status of the pipelines. You can get a quick glance of the status in terms of Success, Failure, Pending, and Aborted. The status can be viewed as a Chronological list or as a tree. You can filter by status as well as by date.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/exec.png" +alt="Pipelines: Executions Tab" +caption="Pipelines: Executions Tab" +/%} + +## Features Tab + +The Features tab is displayed only for ML Models. It displays a Description of the ML Model, and the features that have been used. Each feature will have further details on the Type of feature, Algorithm, Description, Sources, and the associated Glossary Terms and Tags. + +{% image +src="/images/v1.1/how-to-guides/discovery/mlm1.png" +alt="ML Models: Features Tab" +caption="ML Models: Features Tab" +/%} + +## Children Tab + +The Children tab is displayed only for Containers. \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/discover.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/discover.md new file mode 100644 index 000000000000..60d3a46c0c93 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/discover.md @@ -0,0 +1,128 @@ +--- +title: How to Discover Assets of Interest +slug: /how-to-guides/openmetadata/data-discovery/discover +--- + +# How to Discover Assets of Interest + +Search is at the front and center of OpenMetadata and is available in the top Menu bar across all the different pages. OpenMetadata simplifies data discovery with the following strategies. + +## Keyword Search +A simple yet powerful way to find assets by typing the name or description from the search interface. The search suggest will display matching data assets in several categories. Your query will retrieve all matching tables, topics, dashboards, pipelines, ML models, containers, glossaries, and tags. Your queries will match names for data assets and their components, such as column names for tables and chart names for dashboards. The queries will also match the descriptions used. 
+ +{% image +src="/images/v1.1/how-to-guides/discovery/kw.png" +alt="Keyword Search" +caption="Keyword Search" +/%} + +## Quick Filters +Multiple quick filter options further help to narrow down the search by **Owner, Tag, Tier, Service, Service Type**, and other filters relevant to the type of data asset like **Database, Schema, Columns**. You can also search by deleted data assets. + +{% image +src="/images/v1.1/how-to-guides/discovery/kw2.png" +alt="Filter using Multiple Parameters" +caption="Filter using Multiple Parameters" +/%} + +## Filter by the Type of Data Asset +The search results can be narrowed down by data assets such as Table, Topic, Dashboard, Pipeline, ML Model, Container, Glossary, or Tag. + +{% image +src="/images/v1.1/how-to-guides/discovery/da1.png" +alt="Filter by the Type of Data Asset" +caption="Filter by the Type of Data Asset" +/%} + +Users can navigate to the Explore page for a specific type of data asset and use the filter options relevant to that data asset to narrow down the search. + +## Filter by Asset Owner +A team or a user can own the data asset in OpenMetadata. Users can filter data assets by the asset owner. With information on the data asset owners, you can direct your questions to the right person or team. + +{% image +src="/images/v1.1/how-to-guides/discovery/owner.png" +alt="Filter by Asset Owner" +caption="Filter by Asset Owner" +/%} + +## Filter by Database +When searching while you are in a database page, you can narrow down your search to within the database or to include the overall search results within OpenMetadata. + +{% image +src="/images/v1.1/how-to-guides/discovery/db.png" +alt="Filter by Database" +caption="Filter by Database" +/%} + +## Filter based on Importance: Tiers +Using tiers, you can search for data based on importance.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/tier.png" +alt="Filter based on Importance using Tiers" +caption="Filter based on Importance using Tiers" +/%} + +## Filter based on Importance: Usage +OpenMetadata captures usage profiles for tables during metadata/profiler ingestion. This helps to learn how other data consumers are using the tables. You can use the quick filter to narrow down the search results by relevance by clicking on the down arrow on the top right of the Explore page. You can search for data by: +- **Last Updated** - Filter data by the recent updates and changes. +- **Weekly Usage** - Based on the data asset usage metrics. +- **Relevance** + +These details are based on the usage summary computations. Further, you can **sort** the results by ascending and descending order. + +{% image +src="/images/v1.1/how-to-guides/discovery/usage.png" +alt="Filter based on Importance: Usage" +caption="Filter based on Importance: Usage" +/%} + +## Discover Data through Association +OpenMetadata provides the links to the frequently joined tables and columns as measured by the data profiler. You can also discover assets through relationships based on data lineage. + +{% image +src="/images/v1.1/how-to-guides/discovery/fjt.png" +alt="Frequently Joined Tables" +caption="Frequently Joined Tables" +/%} + +## Discover Assets through Relationships +OpenMetadata helps to locate assets of interest by tracing data lineage. You can view the upstream and downstream nodes to discover the sources of data and learn about the tables, pipelines, and more. The table and column descriptions help to decide if the data is helpful for your use case. Similarly, the pipeline description helps to uncover the transformation and more data of interest. 
+ +{% image +src="/images/v1.1/how-to-guides/discovery/lineage.png" +alt="Discover Assets through Relationships: Lineage" +caption="Discover Assets through Relationships: Lineage" +/%} + +## Advanced Search +Users can find data assets matching strict criteria by multiple parameters on metadata properties, using the **syntax editor** with and/or conditions. Advanced search in OpenMetadata supports Boolean operators and faceted queries to search for specific facets of your data. Separate advanced search options are available for Tables, Topics, Dashboards, Pipelines, ML Models, Containers, Glossary, and Tags. + +{% image + src="/images/v1.1/features/data-discovery.gif" +/%} + +## Discover Data Evolution +By viewing lineage and metadata versioning, users can discover the data evolution of data assets. + +{% image +src="/images/v1.1/how-to-guides/discovery/version.png" +alt="Discover Data Evolution: Version History" +caption="Discover Data Evolution: Version History" +/%} + +## Filter by Deleted Data Assets +Users can also search for the soft-deleted data assets in OpenMetadata. Use the toggle bar to search for deleted assets. The deleted data assets are read-only. + +{% image +src="/images/v1.1/how-to-guides/discovery/deleted.png" +alt="Filter by Deleted Data Assets" +caption="Filter by Deleted Data Assets" +/%} + +Users can click on **Clear** to unselect all the filter options. 
+{% image +src="/images/v1.1/how-to-guides/discovery/clear.png" +alt="Clear the Filters" +caption="Clear the Filters" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/index.md new file mode 100644 index 000000000000..29f22401323e --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/index.md @@ -0,0 +1,50 @@ +--- +title: Data Discovery +slug: /how-to-guides/openmetadata/data-discovery +--- + +# Overview of Data Discovery + +Discovering data among thousands of datasets is hard without rich metadata and faceted search. OpenMetadata with a single catalog aggregates metadata about all data assets, and presents the right information to users depending on their needs. OpenMetadata aims to help **data producers** make smart decisions to evolve data and prioritize bug fixes; and help **data consumers** make timely decisions with the right data. + +OpenMetadata provides a user-friendly interface for **data discovery**. OpenMetadata enables you to discover your data using a variety of strategies, including: keyword search, data associations (e.g., frequently joined tables, lineage), and complex queries. Using OpenMetadata you can search across tables, topics, dashboards, pipelines, ML models, containers, glossaries, and tags. OpenMetadata supports detailed metadata for assets and their components (e.g., columns, charts), including support for complex data types such as arrays and structs. Users can get a complete picture of their data by viewing the **data evolution** tracked using lineage and metadata versioning. + +{% image + src="/images/v1.1/features/data-discovery.gif" +/%} + +Watch the video on how easy it is to discover your data in OpenMetadata.
+ +{% youtube videoId="3xaHf3A2PgU" start="0:00" end="3:17" /%} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="How to Discover Assets of Interest" + icon="MdSearch" + href="/how-to-guides/openmetadata/data-discovery/discover"%} + Discover the right data assets quickly. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Get a Quick Glance of the Data Assets" + icon="MdSearch" + href="/how-to-guides/openmetadata/data-discovery/preview"%} + Quick preview of the selected data asset. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Asset Details" + icon="MdSearch" + href="/how-to-guides/openmetadata/data-discovery/details"%} + Get a holistic view of the data assets. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Advanced Search" + icon="MdSearch" + href="/how-to-guides/openmetadata/data-discovery/advanced"%} + Add complex queries using advanced search. + {%/inlineCallout%} +{%/inlineCalloutContainer%} + diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/preview.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/preview.md new file mode 100644 index 000000000000..ee1454edba6b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-discovery/preview.md @@ -0,0 +1,73 @@ +--- +title: Get a Quick Glance of the Data Assets +slug: /how-to-guides/openmetadata/data-discovery/preview +--- + +# Get a Quick Glance of the Data Assets + +For each of the data assets displayed on the Explore page, some basic information is displayed on the data asset card. Users can view the **Source, Name of the Data Asset, Description, Owner (Team/User details), Tier, and Usage** information for each data asset.
+{% image +src="/images/v1.1/how-to-guides/discovery/prv8.png" +alt="Basic Information about the Data Asset" +caption="Basic Information about the Data Asset" +/%} + +OpenMetadata provides a quick preview of the data asset on the right side panel. Just click on the empty space next to the relevant data asset to get a quick preview. + +## Preview based on the Data Asset Type +Based on the type of data asset (Table, Topic, Dashboard, Pipeline, ML Model, Container, Glossary, Tag), the quick preview provides information. For example, the **type of table, the number of queries, and columns** are displayed for `tables`. +{% image +src="/images/v1.1/how-to-guides/discovery/prv1.png" +alt="Quick Glance of the Table Details" +caption="Quick Glance of the Table Details" +/%} + +Similarly, the quick glance displays the information on the **Partitions, Replication Factor, Retention Size, CleanUp Policies, Max Message Size, and Schema Type** for `topics`. +{% image +src="/images/v1.1/how-to-guides/discovery/prv2.png" +alt="Quick Glance of the Topic Details" +caption="Quick Glance of the Topic Details" +/%} + +For `ML Models`, it displays the **Algorithm, Target, Server, and Dashboard**. +{% image +src="/images/v1.1/how-to-guides/discovery/prv3.png" +alt="Quick Glance of the ML Model Details" +caption="Quick Glance of the ML Model Details" +/%} + +A `glossary` preview displays the **Reviewers, Synonyms, and Children**. +{% image +src="/images/v1.1/how-to-guides/discovery/prv4.png" +alt="Quick Glance of the Glossary Term Details" +caption="Quick Glance of the Glossary Term Details" +/%} + +Likewise, for `dashboards` and `pipelines`, it displays the **Dashboard URL** and **Pipeline URL** respectively. For containers, the **Objects, Service Type, and Columns** are displayed. The `tag` preview displays the **Usage** of the tags. + +## Data Quality and Profiler Metrics + +The data quality and profiler metrics display the details on the **Tests Passed, Aborted, and Failed**.
+{% image +src="/images/v1.1/how-to-guides/discovery/prv5.png" +alt="Quick Glance of the Data Quality and Profiler Metrics" +caption="Quick Glance of the Data Quality and Profiler Metrics" +/%} + +## Tags + +Users can view all the tags associated with a particular data asset. +{% image +src="/images/v1.1/how-to-guides/discovery/prv6.png" +alt="Quick Glance of the Tags" +caption="Quick Glance of the Tags" +/%} + +## Schema + +The Schema provides the details on the **column names, type of column, and column description**. +{% image +src="/images/v1.1/how-to-guides/discovery/prv7.png" +alt="Quick Glance of the Schema" +caption="Quick Glance of the Schema" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/assets.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/assets.md new file mode 100644 index 000000000000..be0a3a73d30b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/assets.md @@ -0,0 +1,56 @@ +--- +title: How to Add Assets to Glossary Terms +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/assets +--- + +# How to Add Assets to Glossary Terms + +After creating a glossary term, data assets can be associated with the term. In the **Glossary Term > Assets Tab** all the assets associated with the glossary term are displayed. These data assets are further subgrouped as Tables, Topics, Dashboards, etc. + +{% image +src="/images/v1.1/how-to-guides/governance/term3.png" +alt="Assets Tab" +caption="Assets Tab" +/%} + +You can add more assets by clicking on **Add > Assets**. + +{% image +src="/images/v1.1/how-to-guides/governance/asset.png" +alt="Add Asset" +caption="Add Asset" +/%} + +You can further search and filter assets by type. Simply select the relevant assets and click **Save**. 
+ +{% image +src="/images/v1.1/how-to-guides/governance/asset1.png" +alt="Assets Related to the Glossary Term" +caption="Assets Related to the Glossary Term" +/%} + +The glossary term lists the Assets, which makes it easy to discover all the data assets related to the term. + +## Glossary Terms and Tags + +If **Tags** are associated with a **Glossary Term**, then applying that glossary term to a data asset will also automatically apply the associated tags to that data asset. For example, the glossary term ‘Account’ has a PII.Sensitive tag associated with it. When you add this glossary term to a data asset, the PII.Sensitive tag is added as well. + +{% image +src="/images/v1.1/how-to-guides/governance/tag5.png" +alt="Glossary Term and Associated Tags" +caption="Glossary Term and Associated Tags" +/%} + +{% image +src="/images/v1.1/how-to-guides/governance/tag6.png" +alt="Glossary Term and Tag Get Added to the Data Asset" +caption="Glossary Term and Tag Get Added to the Data Asset" +/%} + +{%inlineCallout + color="violet-70" + bold="How to Classify Data Assets" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%} + Add tags to data assets, or request them and discuss them, all within OpenMetadata. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices.md new file mode 100644 index 000000000000..c9b3e19e0363 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices.md @@ -0,0 +1,78 @@ +--- +title: Best Practices for Glossary and Classification +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/best-practices +--- + +# Best Practices for Glossary and Classification + +A controlled vocabulary is an organized arrangement of words and phrases to define terminology to organize and retrieve information. **Glossary** and **Classification** are both controlled vocabulary. + +Here are the **Top 8 Best Practices** around Terminologies: + +## 1. Use Hierarchical Relationships + +A hierarchical structure helps in grouping similar concepts and helps in better understanding. Instead of using a flat list of glossary terms, add a hierarchical (Parent-Child) relationship. This provides more context to a glossary term. The additional context helps in classification and policy enforcement. + +When using hierarchy, it is better to limit the hierarchy to three levels. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary7.png" +alt="Phone Number in the Context of a User and Business" +caption="Phone Number in the Context of a User and Business" +/%} + +In a flat list, the term ‘Phone Number’ lacks context and it would be difficult to ascertain the sensitivity of data. A ‘User Phone Number’ is PII-Sensitive, whereas a ‘Business Phone Number’ is not PII-Sensitive. This can be best represented with hierarchical relationships and by grouping concepts. + +## 2. 
Add Classification Tags to Glossary Terms + +Classification tags can be added to a glossary term. This helps to define both the semantic meaning and type of data in a single step. Instead of adding classification tags manually, a glossary term can be added to define the **meaning** of the data, and classification tags like PII-sensitive can be added to the term to define the **type** of data. This helps to auto-assign PII tags. + +Organizations have data producers who create tables, and build data models. Team members who understand regulatory compliance requirements are good at classifying data. Among them, those who understand the data as well as the regulatory requirements, can help organizations scale by adding glossary terms along with the classification and tags. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary4.png" +alt="Add Classification Tags to Glossary Terms" +caption="Add Classification Tags to Glossary Terms" +/%} + +## 3. Make Use of Tier Classification + +Tiering helps define the importance of data to an organization. By focusing on Tier 1 data, organizations can create the highest impact. Identifying Tier 5 can help declutter the existing data. Learn more about [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers). + +## 4. Use Classifications to Simplify Policies + +Along with ownership and team membership, tags are a powerful way to group data assets. A single policy can be created at the Resource level instead of managing multiple policies for various resources. + +Resources can be grouped using classification tags like sensitive data, restrictive data, external data, raw data, public data, internal data, etc. Further, Policies can be created based on Tags to simplify data governance. 
+ +Instead of creating policies for separate tables with sensitive data, the ‘Sensitive’ tag can be attached to various data assets, and a single policy can then match on the Sensitive tag, which will take care of all the resources marked accordingly. + +## 5. Use Display Name to Improve Names + +When classifications and glossaries are inherited from source systems, the names may not communicate the concept well. For example, a name may be `dep-prod` instead of Product Department. Users are more likely to search using common terms like Product or Department, so a descriptive display name helps in better discovery. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary5.png" +alt="Add Display Names for Better Discovery" +caption="Add Display Names for Better Discovery" +/%} + +In cases where abbreviations or acronyms are used, a better display name helps in data discovery. For example, `c_id` can be changed to `Customer ID`, and `CAC` can be changed to `Customer Acquisition Cost`. + +## 6. Use Glossary Import Export + +For glossary bulk edits to update descriptions, ownership, reviewer, and status, export the Glossary, make the edits in a CSV file, and import it. Learn more about [Glossary Bulk Import](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary). + +## 7. Don’t Delete Classification & Glossary Terms; Rename them + +When glossary terms or classifications have typos, users tend to delete the term. All the effort spent in tagging the data assets is lost when terms are deleted. OpenMetadata supports renaming Glossary and Classification terms. Simply rename the terms. + +## 8. Group Similar Concepts Together + +When adding terms, building a semantic relationship helps to understand data through concepts. For example, grouping related terms helps in understanding the various terms and their overall relationship. 
+ +{% image +src="/images/v1.1/how-to-guides/governance/glossary6.png" +alt="Group Similar Concepts Together" +caption="Group Similar Concepts Together" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classification.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classification.md new file mode 100644 index 000000000000..1634a75b626b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classification.md @@ -0,0 +1,71 @@ +--- +title: What is Classification +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/classification +--- + +# What is Classification + +**Classification** is a tag or annotation that categorizes or classifies a data asset. Classification does not define the semantics or meaning of data, but it helps define the type of data. For example, data can be: +- Sensitive or Non-sensitive, +- PII or Non-PII in terms of privacy, +- Verified or Unverified in terms of readiness for data consumption. + +Classification is used for policy enforcement purposes. Classification helps in browsing, searching, grouping, and managing data. It also helps in Security, Data Privacy, and Data Protection use cases. All of this is done by defining Policies, like Access Control policies, Retention policies, and Data Management policies. + +For Classification in OpenMetadata, we use a flat list of terms from knowledge organization systems. Classification groups together a set of similar terms called **Tags**, which can be accessed from **Govern > Classification**. + +In the below example, PersonalData is a Classification and it further has Tags under it. `PersonalData` is also a **System** Classification. System classifications are an important part of OpenMetadata and therefore cannot be deleted. 
The descriptions for the System tags can be modified. They can also be disabled. `PII` and `Tiers` are the other important system classifications in OpenMetadata. + +{% image +src="/images/v1.1/how-to-guides/governance/tag4.png" +alt="Classification: Groups together Tags" +caption="Classification: Groups together Tags" +/%} + +## Classification and Categorization Tags + +OpenMetadata supports both Classification and Categorization tags. +- **Classification tags** are **mutually exclusive**. A data asset can be in only one class in a hierarchy. Data can either be Public or Private, Sensitive or Non-sensitive. It cannot be both. + +- **Categorization tags** are **not mutually exclusive**. A data asset can belong to multiple categories. The same table can have Usage, Financial, Reporting and Compliance tags. + +## Mutually Exclusive Tags + +There are cases where only one tag from a particular classification is relevant for a data asset. For example, an asset can either be PII Sensitive or PII Non-Sensitive. It cannot be both. For such cases, a Classification can be created where the tags can be mutually exclusive. If this configuration is enabled, you won’t be able to assign multiple tags from the same Classification to the same data asset. + +{% note %} +**Pro Tip:** The Global Search in OpenMetadata also helps discover related Glossary Terms and Tags. +{% image +src="/images/v1.1/how-to-guides/governance/tag1.png" +alt="Search for Glossary Terms and Tags" +caption="Search for Glossary Terms and Tags" +/%} +{% /note %} + +## How Classification Helps? + +- You can discover the data assets in the Tags page. +- You can also search for data assets and filter them by tags. +- Tags can be used for authoring Policies. + +## Classification APIs + +OpenMetadata has extensive classification APIs to automate tagging. These APIs support two kinds of entities - Classification and Tags. These entities are identified by a Unique ID. 
Tags have a fully qualified name in the form of `classification.tagTerm`. + +Refer to the **[API Documentation on Classification](https://sandbox.open-metadata.org/docs#tag/Classifications)**. + +{%inlineCallout + color="violet-70" + bold="What are Tiers" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/tiers"%} + Tiers help define the importance of data to an organization. +{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="How to Classify Data Assets" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%} + Add tags to data assets, or request them and discuss them, all within OpenMetadata. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets.md new file mode 100644 index 000000000000..4e81717d489e --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets.md @@ -0,0 +1,145 @@ +--- +title: How to Classify Data Assets +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets +--- + +# How to Classify Data Assets + +## How to Add Classification Tags + +- From the Explore page, select a data asset and click on the edit icon or + Add for Tags. +- Search for the relevant tags. You can either type and search, or scroll to select from the options provided. +- Click on the checkmark to save the changes. + +{% image +src="/images/v1.1/how-to-guides/governance/tag7.png" +alt="Add Tags to Classify Data Assets" +caption="Add Tags to Classify Data Assets" +/%} + +The tagged data assets can be discovered right from the Classification page. 
+- Navigate to **Govern >> Classification**. +- The list of tags is displayed along with the details of Usage in various data assets. +- Click on the Usage number to view the tagged assets. + +{% image +src="/images/v1.1/how-to-guides/governance/tag2.png" +alt="Usage: Number of Assets Tagged" +caption="Usage: Number of Assets Tagged" +/%} + +{% image +src="/images/v1.1/how-to-guides/governance/tag3.png" +alt="Discover the Tagged Data Assets" +caption="Discover the Tagged Data Assets" +/%} + +You can view all the tags in the right panel. + +Data assets can also be classified using Tiers. Learn more about [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers). + +Among the Classification Tags, OpenMetadata has some System Classification. Learn more about the [System Tags](/how-to-guides/openmetadata/data-governance/glossary-classification/classification). + +## Task: Request to Update Tags + +Apart from adding the tags directly, users can also request to update tags. This is typically done when the user wants another opinion on the tag being added, or if the user does not have access to add tags directly. + +- Click on the **?** icon next to tags + +{% image +src="/images/v1.1/how-to-guides/governance/tag8.png" +alt="Request to Update Tags" +caption="Request to Update Tags" +/%} + +- A Task will be created with some pre-populated details. Fill in the other important information: + - **Title** - This is auto-populated + - **Assignees** - Multiple users can be added + - **Update Tags** - It displays 3 tabs. + - You can view the **Current** tags. + - You can add the **New** tags. + - It will display the **Difference** as well. + - Click on **Submit** to create the task. + + {% image + src="/images/v1.1/how-to-guides/governance/task1.png" + alt="Add a Task: Request to Update Tags" + caption="Add a Task: Request to Update Tags" + /%} + +Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset. 
The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**. + +{% image +src="/images/v1.1/how-to-guides/governance/task2.png" +alt="Task: Accept Suggestion and Comment" +caption="Task: Accept Suggestion and Comment" +/%} + +## Conversations around Classification + +Apart from requesting tags, users can also create a **Conversation** around the tags assigned to a data asset. +- Click on the **Conversation** icon next to the tag. + +{% image +src="/images/v1.1/how-to-guides/governance/ct1.png" +alt="Conversations around Tags" +caption="Conversations around Tags" +/%} + +- Start a conversation right within the data asset page. Add an **@mention** to tag a user or team. Add a **#mention** to tag a data asset. + +{% image +src="/images/v1.1/how-to-guides/governance/ct2.png" +alt="Start a Conversation" +caption="Start a Conversation" +/%} + +- Later in the conversation, users can **Reply** to continue the discussion, add **Reactions**, and **Edit** or **Delete** messages. + +{% image +src="/images/v1.1/how-to-guides/governance/ct3.png" +alt="Conversation: Reply, React, Edit or Delete" +caption="Conversation: Reply, React, Edit or Delete" +/%} + +## Auto-Classification in OpenMetadata + +OpenMetadata identifies PII data and either auto-tags it or suggests tags. The data profiler automatically tags the PII-Sensitive data. The addition of tags about PII data helps consumers and governance teams identify data that needs to be treated carefully. + +In the example below, the columns ‘user_name’ and ‘social security number’ are auto-tagged as PII-sensitive. This works using NLP as part of the profiler during ingestion. 
+ +{% image +src="/images/v1.1/how-to-guides/governance/auto1.png" +alt="User_name and Social Security Number are Auto-Classified as PII Sensitive" +caption="User_name and Social Security Number are Auto-Classified as PII Sensitive" +/%} + +In the below example, the column ‘dwh_x10’ is also auto-tagged as PII Sensitive, even though the column name does not provide much information. + +{% image +src="/images/v1.1/how-to-guides/governance/auto2.png" +alt="Column Name does not provide much information" +caption="Column Name does not provide much information" +/%} + +When we look at the content of the column ‘dwh_x10’ in the Sample Data tab, it becomes clear that the auto-classification is based on the data in the column. + +{% image +src="/images/v1.1/how-to-guides/governance/auto3.png" +alt="Column Data provides information" +caption="Column Data provides information" +/%} + +You can read more about [Auto PII Tagging](https://docs.open-metadata.org/v1.1.x/connectors/ingestion/auto_tagging) here. + +## Tag Mapping + +Tag mapping is supported in the backend and not in the OpenMetadata UI. When two related tags are associated with each other, applying one tag, automatically applies the other tag. For example, when the tag `Personal Data.Personal` is applied, it automatically applies another tag `Data Classification.Confidential`. That way, applying the tag `Personal` automatically applies the tag `Confidential`. + +{%inlineCallout + color="violet-70" + bold="Best Practices for Glossary and Classification" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices"%} + Here are the Top 8 Best Practices around Terminologies. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms.md new file mode 100644 index 000000000000..1589408c3215 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms.md @@ -0,0 +1,48 @@ +--- +title: How to Create Glossary Terms +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms +--- + +# How to Create Glossary Terms + +Once a glossary has been created, you can add multiple **Glossary Terms** and **Child Terms** in it. + +- Once in the Glossary, click on **Add Term**. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary-term.png" +alt="Add Glossary Term" +caption="Add Glossary Term" +/%} + +- Enter the required information: + - **Name*** - This contains the name of the glossary term, and is a required field. + - **Display Name** - This contains the Display name of the glossary term. + - **Description*** - A unique and clear definition to establish consistent usage and understanding of the term. This is a required field. + - **Tags** - Classification tags can be added to glossary terms. When adding a glossary term to assets, it will also add the associated tags to that asset. This helps to further describe and categorize the data assets. + - **Synonyms** - Other terms that are used for the same concept. For e.g., for a term ‘Customer’, the synonyms can be ‘Client’, ‘Shopper’, ‘Purchaser’. + - **Related Terms** - These terms can build a network of concepts to capture an associative relationship. For e.g., for a term ‘Customer’, the related terms can be ‘Customer LTV (LifeTime Value)’, ‘Customer Acquisition Cost (CAC)’. 
+ - **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can either be ‘PII-Sensitive’ or a ‘PII-NonSensitive’. It cannot be both. For such cases, a Glossary Term can be created where the child terms can be mutually exclusive. If this configuration is enabled, you won’t be able to assign multiple terms from the same Glossary Term to the same data asset. + - **References** - Add links from the internet from where you inherited the term. + - **Owner** - Either a Team or a User can be the Owner of a Glossary term. + - **Reviewers** - Multiple reviewers can be added. + +Once a glossary term has been added, you can create **Child Terms** under it. The child terms help to build a conceptual hierarchy (Parent-Child relationship) to go from generic to specific concepts. For e.g., for a term ‘Customer’, the child terms can be ‘Loyal Customer’, ‘New Customer’, ‘Online Customer’. + +Instead of creating a glossary manually, you can **[bulk upload glossary terms](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary)** using a CSV file. + +{%inlineCallout + color="violet-70" + bold="How to Bulk Import a Glossary" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%} + Save time and effort by bulk uploading glossary terms using a CSV file. 
+{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="How to Add Assets to Glossary Terms" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%} + Associate glossary terms with data assets to make data discovery easier. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary.md new file mode 100644 index 000000000000..5a6ba35f6857 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/glossary.md @@ -0,0 +1,120 @@ +--- +title: What is a Glossary +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary +--- + +# What is a Glossary + +A Glossary is a Controlled Vocabulary to describe important concepts and terminologies within your organization to foster a common and consistent understanding of data. It defines concepts related to a specific domain, for example, a Business Glossary or a Bank Glossary. A well-defined business glossary helps foster team collaboration with the use of standard terms. Glossaries are important for data discovery, retrieval, and exploration through conceptual terms, and they facilitate **Data Governance**. + +A Glossary adds semantics or meaning to data. OpenMetadata models a Glossary as a Thesaurus that organizes terms with **hierarchical**, equivalent, and associative relationships within a domain. + +The Glossary in OpenMetadata can be accessed from **Govern >> Glossary**. All the Glossaries are displayed in the left nav bar. Clicking on a specific glossary will display the expanded view to show the entire hierarchy of the glossary terms (parent-child terms). 
+ +{% image +src="/images/v1.1/how-to-guides/governance/banking.png" +alt="Banking Glossary" +caption="Banking Glossary" +/%} + +{% note %} +**Tip:** A well-defined and centralized glossary makes it easy to **onboard new team members** and help them get familiar with the **organizational terminology**. +{% /note %} + +## Glossary Term + +A Glossary Term is a preferred terminology for a concept. In a Glossary term, you can add tags, synonyms, related terms to build a conceptual semantic graph, and also add reference links. + +The glossary term can include additional information as follows: +- **Description** - A unique and clear definition to establish consistent usage and understanding of the term. This is a mandatory requirement. + +- **Tags** - Classification tags can be added to glossary terms. When adding a glossary term to assets, it will also add the associated tags to that asset. This helps to further describe and categorize the data assets. + +- **Synonyms** - Other terms that are used for the same concept. For e.g., for a term ‘Customer’, the synonyms can be ‘Client’, ‘Shopper’, ‘Purchaser’. + +- **Child Terms** - Child terms help to build a conceptual hierarchy (Parent-Child relationship) to go from generic to specific concepts. For e.g., for a term ‘Customer’, the child terms can be ‘Loyal Customer’, ‘New Customer’, ‘Online Customer’. + +- **Related Terms** - These terms can build a network of concepts to capture an associative relationship. For e.g., for a term ‘Customer’, the related terms can be ‘Customer LTV (LifeTime Value)’, ‘Customer Acquisition Cost (CAC)’. + +- **References** - Add links from the internet from where you inherited the term. + +- **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can either be ‘PII-Sensitive’ or a ‘PII-NonSensitive’. It cannot be both. 
For such cases, a Glossary or a Glossary Term can be created where the child terms can be mutually exclusive. If this configuration is enabled, you won’t be able to assign multiple terms from the same Glossary/Term to the same data asset. + +- **Reviewers** - Multiple reviewers can be added. + +- **Assets** - After creating a glossary term, data assets can be associated with the term. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary-term.png" +alt="Glossary Term Requirements" +caption="Glossary Term Requirements" +/%} + +The details of a Glossary Term in OpenMetadata are displayed in three tabs: Overview, Glossary Terms, and Assets. The **Overview tab** displays the details of the term, along with the synonyms, related terms, references, and tags. It also displays the Owner and the Reviewers for the Glossary Term. + +{% image +src="/images/v1.1/how-to-guides/governance/term1.png" +alt="Overview of a Glossary Term" +caption="Overview of a Glossary Term" +/%} + +The **Glossary Term Tab** displays all the child terms associated with the parent term. You can also add more child terms from this tab. + +{% image +src="/images/v1.1/how-to-guides/governance/term2.png" +alt="Glossary Terms Tab" +caption="Glossary Terms Tab" +/%} + +{% note %} +**Tip:** Glossary terms help to organize as well as discover data assets. +{% /note %} + +The **Assets Tab** displays all the assets that are associated with the glossary term. These data assets are further subgrouped as Tables, Topics, Dashboards. The right side panel shows a preview of the data assets selected. + +{% image +src="/images/v1.1/how-to-guides/governance/term3.png" +alt="Assets Tab" +caption="Assets Tab" +/%} + +You can add more assets by clicking on **Add > Assets**. You can further search and filter assets by type. Simply select the relevant assets and click Save. The glossary term lists the Assets, which makes it easy to discover all the data assets related to the term. 
+ +{% note %} +**Pro Tip:** The Global Search in OpenMetadata also helps discover related Glossary Terms and Tags. +{% image +src="/images/v1.1/how-to-guides/governance/tag1.png" +alt="Search for Glossary Terms and Tags" +caption="Search for Glossary Terms and Tags" +/%} +{% /note %} + +## Glossary and Glossary Term Version History + +The glossary as well as the terms maintain a version history, which can be viewed on the top right. Clicking on the number will display the details of the **Version History**. + +{% image +src="/images/v1.1/how-to-guides/governance/version.png" +alt="Glossary Term Version History" +caption="Glossary Term Version History" +/%} + +The Backward compatible changes result in a **Minor** version change. A change in the description, tags, or ownership will increase the version of the entity metadata by **0.1** (e.g., from 0.1 to 0.2). + +The Backward incompatible changes result in a **Major** version change. For example, when a term is deleted, the version increases by **1.0** (e.g., from 0.2 to 1.2). + +## Glossary APIs + +OpenMetadata has extensive Glossary APIs. The main entities are **Glossary** and **Glossary Term**. These entities are identified by a Unique ID. Glossary terms have a fully qualified name in the form of `glossary.parentTerm.childTerm` + +You can create, delete, modify, and update using APIs. Refer to the **[Glossary API documentation](https://sandbox.open-metadata.org/docs#tag/Glossaries)**. + +You can also [export or bulk import the glossary terms](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary) using a CSV file. + +{%inlineCallout + color="violet-70" + bold="What is Classification" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/classification"%} + Learn about the classification tags, system tags, and mutually exclusive tags. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary.md new file mode 100644 index 000000000000..43a90a48fab5 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary.md @@ -0,0 +1,113 @@ +--- +title: How to Bulk Import a Glossary +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary +--- + +# How to Bulk Import a Glossary + +OpenMetadata supports **Glossary Bulk Upload** to save time and effort by uploading a CSV with thousands of terms in one go. You can create or update multiple glossary terms simultaneously. When bulk uploading, Owners and Reviewers can be defined for the glossary, and they are propagated to every glossary term. + +To import a glossary into OpenMetadata: +- Navigate to **Govern > Glossary** +- Click on the **⋮** icon and **Export** the glossary file. If the Glossary already has terms, they will be exported as a CSV file. If there are no terms in the Glossary, then a blank CSV template will be downloaded. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary8.png" +alt="Export Glossary File" +caption="Export Glossary File" +/%} + +- Once you have the template, you can fill in the following details: + - **parent** - The parent column helps to define the hierarchy of the glossary terms. If you leave this field blank, the Term will be created at the root level. If you want to create a hierarchy of Glossary Terms, the parent details must be entered as per the hierarchy. 
For example, from the Glossary level, `Banking.Account.Savings Account` + + {% image + src="/images/v1.1/how-to-guides/governance/glossary9.png" + alt="Hierarchy can be defined in the Parent Column" + caption="Hierarchy can be defined in the Parent Column" + /%} + - **name*** - This contains the name of the glossary term, and is a required field. + + - **displayName** - This contains the Display name of the glossary term. + + - **description*** - This contains the description or details of the glossary term and is a required field. + + - **synonyms** - Include words that have the same meaning as the glossary term. For e.g., for a term ‘Customer’, the synonyms can be ‘Client’, ‘Shopper’, ‘Purchaser’. In the CSV file, the synonyms must be separated by a semicolon (;) as in `Client;Shopper;Purchaser` + + - **relatedTerms** - A term which has a related concept as the glossary term. This term must be available in OpenMetadata. For e.g., for a term ‘Customer’, the related terms can be ‘Customer LTV (LifeTime Value)’, ‘Customer Acquisition Cost (CAC)’. In the CSV file, the relatedTerms must contain the hierarchy, which is separated by a full stop (.). Multiple terms must be separated by a semicolon (;) as in `Banking.Account.Savings account;Banking.Debit card` + - **references** - Add links from the internet from where you inherited the term. In the CSV file, the references must be in the format (name;url;name;url) `IBM;https://www.ibm.com/;World Bank;https://www.worldbank.org/` + - **tags** - Add the tags which are already existing in OpenMetadata. In the CSV file, the tags must be in the format `PII.Sensitive;PersonalData.Personal` + +The * marked fields are required fields. +- To create a new glossary, navigate to **Govern > Glossary** and first **Add** a new glossary. You can also bulk upload terms to an existing glossary. 
+ +{% image +src="/images/v1.1/how-to-guides/governance/glossary1.png" +alt="Add a New Glossary" +caption="Add a New Glossary" +/%} + +- Add the Name*, Display Name, Description*, Tags, Owner, and Reviewer details for the glossary. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary2.png" +alt="Configure the Glossary" +caption="Configure the Glossary" +/%} + +## Mutually Exclusive + +You can also mark the Glossary as Mutually Exclusive if you want only one of the terms from the glossary to be applicable to the data assets. There are cases where only one glossary term from a Glossary is relevant for a data asset. For example, an asset can either be PII Sensitive or PII Non-Sensitive. It cannot be both. For such cases, a Glossary can be created where the terms can be mutually exclusive. If this configuration is enabled, you won’t be able to assign multiple tags from the same Glossary to the same data asset. + +## Add Owners and Reviewers to a Glossary + +If the Owner details are added while creating the glossary, the same will be inherited for the glossary terms. Either a Team or a User can be the **Owner** of a Glossary. Multiple users can be **Reviewers**. These can be changed later. The glossary **Owner and Reviewers** are inherited for all the glossary terms. + +- Once the CSV file is ready, click on the ⋮ icon and select the **Import** button. + +- Drag and drop the CSV file, or upload it by clicking on the Browse button. + +{% image +src="/images/v1.1/how-to-guides/governance/import0.png" +alt="Import the Glossary CSV File" +caption="Import the Glossary CSV File" +/%} + +- The import utility will validate the file and a **Preview** of the elements that will be imported to OpenMetadata is displayed. + +- After previewing the uploaded terms, click on **Import**. 
+ +{% image +src="/images/v1.1/how-to-guides/governance/import1.png" +alt="Preview of the Glossary" +caption="Preview of the Glossary" +/%} + +- The glossary terms will be scanned and imported, after which a Success or Failure message will be displayed. + +{% image +src="/images/v1.1/how-to-guides/governance/import2.png" +alt="Glossary Imported Successfully" +caption="Glossary Imported Successfully" +/%} + +- Once some or all of the terms are created successfully, the Import button will be displayed. Click on Import to create the glossary terms from the CSV file in OpenMetadata. + +- Next, you can **View** the imported glossary. You can **Expand All** the terms to view the nested terms. Glossary terms can be **dragged and dropped** as required to rearrange the glossary. + +- The glossary **Owner** is inherited for all the glossary terms. + +{% image +src="/images/v1.1/how-to-guides/governance/import3.png" +alt="Drag and Drop Glossary Terms to Rearrange the Hierarchy" +caption="Drag and Drop Glossary Terms to Rearrange the Hierarchy" +/%} + +Both importing and exporting the Glossary from OpenMetadata are quick and easy!
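The CSV conventions described in the import guide above (semicolon-separated synonyms, related terms, and tags; references as alternating name and URL) can be sketched in Python. This is a minimal illustration, not part of OpenMetadata: the helper names `build_row` and `to_csv` are hypothetical, and the column order is an assumption, so treat the template exported from the UI as the authoritative header.

```python
import csv
import io

# Column names come from the guide; their order here is an assumption.
# The CSV template exported from the UI is the authoritative header.
COLUMNS = [
    "parent", "name", "displayName", "description",
    "synonyms", "relatedTerms", "references", "tags",
]


def build_row(name, description, parent="", display_name="",
              synonyms=(), related_terms=(), references=(), tags=()):
    """Build one glossary-term row (hypothetical helper, for illustration)."""
    # name and description are the fields marked as required in the guide.
    if not name or not description:
        raise ValueError("name and description are required")
    # References are flattened to name;url;name;url.
    flat_refs = [part for ref_name, url in references for part in (ref_name, url)]
    return {
        "parent": parent,
        "name": name,
        "displayName": display_name,
        "description": description,
        "synonyms": ";".join(synonyms),
        "relatedTerms": ";".join(related_terms),
        "references": ";".join(flat_refs),
        "tags": ";".join(tags),
    }


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    row = build_row(
        name="Customer",
        description="A person or organization that buys goods or services.",
        synonyms=["Client", "Shopper", "Purchaser"],
        references=[("IBM", "https://www.ibm.com/"),
                    ("World Bank", "https://www.worldbank.org/")],
        tags=["PII.Sensitive", "PersonalData.Personal"],
    )
    print(to_csv([row]))
```

Generating the file this way keeps the separators consistent across thousands of terms. The import preview in the UI remains the final check, since it validates the file before anything is created.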
+ +{%inlineCallout + color="violet-70" + bold="How to Add Assets to Glossary Terms" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%} + Associate glossary terms to data assets making it easier for data discovery +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/index.md new file mode 100644 index 000000000000..e10cefb820ea --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/index.md @@ -0,0 +1,78 @@ +--- +title: Glossary and Classification +slug: /how-to-guides/openmetadata/data-governance/glossary-classification +--- + +# Glossary and Classification + +**Glossary** and **Classification** are both controlled vocabulary and can be used for labeling data. A controlled vocabulary is an organized arrangement of words and phrases to define terminology to organize and retrieve information. Glossary adds meaning to data by defining the business terminologies, whereas Classification helps in defining the type of data. + +Watch the [Webinar on Glossaries and Classifications in OpenMetadata](https://www.youtube.com/watch?v=LII_5CDo_0s) + +[![Watch the video](/images/v1.1/how-to-guides/governance/glossary-webinar.png)](https://www.youtube.com/watch?v=LII_5CDo_0s) + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="What is a Glossary" + icon="MdMenuBook" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary"%} + Create glossaries in OpenMetadata with hierarchically arranged glossary terms. 
+ {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="What is Classification" + icon="MdDiscount" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/classification"%} + Learn about the classification tags, system tags, and mutually exclusive tags. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="What are Tiers" + icon="MdDiscount" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/tiers"%} + Tiers help define the importance of data to an organization. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Setup a Glossary" + icon="MdMenuBook" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary"%} + Learn how to set up a glossary manually in OpenMetadata. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Create Glossary Terms" + icon="MdMenuBook" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms"%} + Set up glossary terms to define the terminology. Add tags, synonyms, related terms, links, etc. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Bulk Import a Glossary" + icon="MdUpload" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%} + Save time and effort by bulk uploading glossary terms using a CSV file. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Add Assets to Glossary Terms" + icon="MdPushPin" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%} + Associate glossary terms to data assets, making data discovery easier. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Classify Data Assets" + icon="MdDiscount" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets"%} + Add tags to data assets, or request and discuss them, all within OpenMetadata.
+ {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Best Practices for Glossary and Classification" + icon="MdThumbUp" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/best-practices"%} + Here are the Top 8 Best Practices around Terminologies. + {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary.md new file mode 100644 index 000000000000..5681982fa390 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary.md @@ -0,0 +1,75 @@ +--- +title: How to Setup a Glossary +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary +--- + +# How to Setup a Glossary + +To create a glossary manually in OpenMetadata: +- Navigate to **Govern > Glossary** +- Click on **+ Add** to add a new glossary + +{% image +src="/images/v1.1/how-to-guides/governance/glossary1.png" +alt="Add a New Glossary" +caption="Add a New Glossary" +/%} + +- Enter the details to configure the glossary. + - **Name*** - This is a required field. + + - **Display Name** + + - **Description*** - Describe the context or domain of the glossary. This is a required field. + + - **Tags** - Classification tags can be added to a glossary. + + - **Mutually Exclusive** - There are cases where only one term from a particular glossary is relevant for a data asset. For example, an asset can either be ‘PII-Sensitive’ or a ‘PII-NonSensitive’. It cannot be both. For such cases, a Glossary can be created where the glossary terms can be mutually exclusive. If this configuration is enabled, you won’t be able to assign multiple terms from the same Glossary to the same data asset. 
+ + - **Owner** - Either a Team or a User can be the Owner of a Glossary. + + - **Reviewers** - Multiple reviewers can be added. + +{% image +src="/images/v1.1/how-to-guides/governance/glossary2.png" +alt="Configure the Glossary" +caption="Configure the Glossary" +/%} + +## Add an Owner and Reviewers to a Glossary + +When creating a glossary, you can add the glossary owner. Either a Team or a User can be the Owner of the Glossary. Simply click on the option for **Owner** to select the user or team. + +Multiple users can be added as Reviewers by clicking on the pencil icon. If the **Reviewer** details exist for a glossary, then the same details are reflected when adding a new term manually as well. + +{% image +src="/images/v1.1/how-to-guides/governance/owner.png" +alt="Add Owner and Reviewers" +caption="Add Owner and Reviewers" +/%} + +If the Owner and Reviewer details are added while creating the glossary, and the glossary terms are **[bulk uploaded using a CSV file](/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary)**, then the glossary Owner and Reviewers are inherited for all the glossary terms. These details can be changed later. + +{%inlineCallout + color="violet-70" + bold="How to Create Glossary Terms" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms"%} + Set up Glossary Terms to define the terminology. Add tags, synonyms, related terms, links, etc. +{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="How to Bulk Import a Glossary" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary"%} + Save time and effort by bulk uploading glossary terms using a CSV file.
+{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="How to Add Assets to Glossary Terms" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-governance/glossary-classification/assets"%} + Associate glossary terms to data assets making it easier for data discovery +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/tiers.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/tiers.md new file mode 100644 index 000000000000..e883a9e5528b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/glossary-classification/tiers.md @@ -0,0 +1,36 @@ +--- +title: What are Tiers +slug: /how-to-guides/openmetadata/data-governance/glossary-classification/tiers +--- + +# What are Tiers + +Tiering is an important concept of data classification in OpenMetadata. Tiers should be based on the importance of data. Using Tiers, data producers or owners can define the importance of data to an organization. + +In OpenMetadata, Tiers are System Classification tags and can be accessed from **Govern > Classification > Tier**. + +{% image +src="/images/v1.1/how-to-guides/governance/tier1.png" +alt="Classification Tags: Tiers" +caption="Classification Tags: Tiers" +/%} + +In case of tiering, it is easiest to start with the most important (Tier 1) and the least important (Tier 5) data. Once the **Tier 1** or most important data is identified, organizations can focus on improving the descriptions and data quality. The Data Insights in OpenMetadata helps identify the unused datasets as **Tier 5**. The Tier 5 datasets can be deleted periodically to declutter. Other tiers can be added as per your organizational needs. **Tags** can be added to further mark the data assets. 
+ +| **Tier** | **Impact** | **Used for** | **Type of Impact** | **Usage** | +|--- | --- | --- | --- | --- | +| **Tier 1** | High | External & Internal Decisions | Revenue, Regulatory, & Reputational | Highly used | +| **Tier 2** | Moderate | Some External & Mostly Internal Decisions | Some Regulatory | Highly used | +| **Tier 3** | Low | Internal Decisions | - | Highly used (Top N percentile) | +| **Tier 4** | Low | Internal Team Decisions | - | - | +| **Tier 5** | Individual owned | Unused Datasets | - | - | + +## How to Add Tiers + +From the **Explore** page, select a data asset and click on the edit icon for **Tier**. Select the appropriate tier. Clicking on the arrow next to the tier will provide a description of the tier. + +{% image +src="/images/v1.1/how-to-guides/governance/tier2.png" +alt="Add a Tier to Data Asset" +caption="Add a Tier to Data Asset" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/index.md new file mode 100644 index 000000000000..118857f1a58f --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-governance/index.md @@ -0,0 +1,18 @@ +--- +title: Data Governance +slug: /how-to-guides/openmetadata/data-governance +--- + +# Data Governance + +OpenMetadata is a rich collaborative platform for data teams. Data producers and data consumers can access all their organizational metadata from OpenMetadata. Users can mutually benefit from the team’s collaborative expertise around data. With several teams and users having access to the organizational data assets in OpenMetadata, it is crucial to have some form of governance in place. OpenMetadata supports [fine-grained Access Control Roles and Policies](https://docs.open-metadata.org/v1.1.x/how-to-guides/admin-guide-roles-policies) to ensure data security. 
+ +Apart from well-defined access control roles and policies, a common vocabulary within the organization fosters effective collaboration and helps in data governance. A **Business Glossary** plays an important role in defining the common terminology in the organization. Data also needs to be classified and tagged for policy enforcement purposes like privacy policy, data management policy, data retention policy, and so on. Using **Classification** you can manage access to the PII sensitive data in OpenMetadata. + +{%inlineCallout +color="violet-70" +bold="Glossary and Classification" +icon="MdMenuBook" +href="/how-to-guides/openmetadata/data-governance/glossary-classification"%} +Learn more about the Glossaries and Classification Tags in OpenMetadata. +{%/inlineCallout%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/data-culture.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/data-culture.md new file mode 100644 index 000000000000..8dea93e372ef --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/data-culture.md @@ -0,0 +1,86 @@ +--- +title: How to Transform the Data Culture of Your Company +slug: /how-to-guides/openmetadata/data-insights/data-culture +--- + +# How to Transform the Data Culture of Your Company + +## What is Data Culture? + +Data Culture is a shared belief in the organization to use data to improve decision making and performance. It has essentially three important characteristics: +- People are empowered to use data. +- Data is prioritized in decision making. +- Data assets are managed as products. + +## Why Data Culture is Important? + +It is observed that data-driven organizations experience above-market growth, leading to increased revenue, profitability, and operational efficiency. In order to fully realize the benefit of a data-driven organization, data culture plays a crucial role.
Clear ownership of data, customer sensitivity and keeping the data high quality and reliable should be part of the data culture. It cannot be an ad hoc reactive measure. Long term data quality solutions require a strong data culture. + +Data is a shared responsibility of the organization and requires an end-to-end approach. Data producers and consumers can together work towards better documentation and classification. Here, collaboration is key to solving issues and improving data. Also, an organization needs to have a clear understanding of where they are with data so that they can set clear goals, achieve those goals, and measure success. + +## Key Aspects of Data Culture + +### 1. Data Needs Clear Ownership + +All important data must be owned. Individuals should not own important data assets. Team ownership is preferred over User ownership. It also pushes the data responsibility to a team instead of an individual user. + +### 2. Measure What Matters + +You cannot improve what you do not measure. It is important to set metrics to understand how your data is doing. Based on that, you can set goals towards improving your data. + +### 3. Treat Data as a Product + +- Maintain clear documentation of data with SLA and guarantees. +- Understand your user and use cases and build the data accordingly. +- Continuously improve data with discipline. +- Track user feedback using surveys. + +### 4. Reduce Toil with Integrated Tools and Automation + +Too many tools around specific use cases clutter the data landscape. It is better to use integrated tools that handle multiple scenarios like data discovery, collaboration, lineage, quality, etc. Automated data quality tools help to ensure updated and fresh data. Tools that identify the Tier 5 or least important data help to declutter with automated data deletion. + +### 5. Organize for Data + +Data does not come free. Teams need resources to keep the data high quality, reliable, and trustworthy.
Decentralize data for end-to-end domain-based data ownership. + +## How OpenMetadata Helps Enhance Data Culture +In order to enhance the data culture of a company, data needs to be Trusted, Documented and Discoverable across the organization. OpenMetadata is an all-in-one platform for data discovery, collaboration, quality, governance, observability, lineage, glossary, and much more. Along with ensuring reliable quality of your data, you can use the collaborative features to maintain proper documentation, ownership, and appropriate tiering of your data assets. + +### 1. Centralize your Metadata in OpenMetadata + +OpenMetadata helps to understand your data landscape. It captures all your metadata in a single place. It is a collaborative tool for both technical and business users. + +### 2. Set KPIs to Drive Data Ownership + +The data insights feature allows you to set up KPIs using time-based goals to track ownership. Goal-based tasks can be set up for different teams. You can claim data asset ownership in OpenMetadata. + +### 3. Set KPIs to Drive Documentation + +Data without description is hard to use, resulting in the loss of productivity. Similarly, invalid or missing descriptions result in poor data outcomes. Good descriptions help to discover data assets quickly. You can set up KPIs with a specific goal to cover data documentation. + +### 4. Develop Data Vocabulary + +Data vocabulary helps in the consistent understanding of data. In OpenMetadata, using the [Glossary](/how-to-guides/openmetadata/data-governance/glossary-classification) feature, you can describe business terms and concepts in a single place. Also, the data assets can be labelled using these glossary terms in order to provide semantic meaning. + +### 5. Identify Important Data with Tiers + +Tiering is an important concept of data classification in OpenMetadata.
Using [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers), data producers or owners can define the importance of data to an organization. + +In case of tiering, it is easiest to start with the most important (Tier 1) and the least important (Tier 5) data. Once the **Tier 1** or most important data is identified, organizations can focus on improving the descriptions and data quality. The Data Insights in OpenMetadata helps identify the unused datasets as **Tier 5**. The Tier 5 datasets can be deleted periodically to declutter. + +### 6. Provide Feedback to Teams + +OpenMetadata provides continuous feedback by way of weekly reports. The detailed reports help to track progress over time. They keep leaders well informed and help to recognize the teams that are doing well. + +### 7. Use OpenMetadata Browser Extension + +By using the Chrome browser extension, users can consume the metadata in the tools of their choice. It provides consistent understanding of metadata at their fingertips, and helps improve productivity. + +### 8. Data as a Product in OpenMetadata + +OpenMetadata helps customers understand their data with a 360° view. Admins can set up sample data, table and column profiling for the important data assets. Data quality is important and it is a shared responsibility in the organization. Admins can set up data quality tests in OpenMetadata to detect and fix the issues early on. Both data producers and consumers can collaborate to capture assumptions about data and set up tests accordingly. + +Go ahead, leverage Data Insights to transform the data culture of your organization! +Watch the video to learn more about proactively honing the data culture of your company by setting targets, monitoring, and boosting teams to accomplish data goals with OpenMetadata.
+ +{% youtube videoId="lOQepnTdA58" start="0:00" end="58:23" /%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/index.md new file mode 100644 index 000000000000..274ea190223a --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/index.md @@ -0,0 +1,56 @@ +--- +title: Data Insights +slug: /how-to-guides/openmetadata/data-insights +--- + +# Overview of Data Insights + +OpenMetadata is a centralized, active metadata repository where all your data resides. Organizations can drive the adoption of OpenMetadata by monitoring its usage and setting up company-wide KPIs. The built-in goal-setting and tracking mechanisms help proactively drive your company's data culture. You can define the **Key Performance Indicators** and set goals to be achieved within a timeframe towards **better documentation, ownership, and tiering**. + +The **Data Insights Dashboard** provides an analytical view of all the key metrics around data assets and user activity. The KPIs help to drive platform adoption. You can monitor data health and track the progress toward the organizational goals. The Data Insights Report is emailed at a regular cadence so that teams can assess their performance relative to the KPIs to determine data ownership, tiering, and documentation. You can also assess user engagement and growth with the aggregated user activity. + +Watch the video to learn more about proactively honing the data culture of your company by setting targets, monitoring, and boosting teams to accomplish data goals. 
+ +{% youtube videoId="lOQepnTdA58" start="0:00" end="58:23" /%} + +Watch a demo of Data Insights in OpenMetadata + +{% youtube videoId="Epd9G6igLUM" start="0:00" end="21:58" /%} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="What is Tiering" + icon="MdInsights" + href="/how-to-guides/openmetadata/data-insights/tiering"%} + Set business importance of data using Tiers. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Set Up Data Insights Ingestion" + icon="MdInsights" + href="/how-to-guides/openmetadata/data-insights/ingestion"%} + Set up the ingestion pipeline right from the UI. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Key Performance Indicators (KPI)" + icon="MdInsights" + href="/how-to-guides/openmetadata/data-insights/kpi"%} + Define the KPIs and set goals for documentation, and ownership. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Insights Report" + icon="MdInsights" + href="/how-to-guides/openmetadata/data-insights/report"%} + Get a quick glance of data asset description, ownership, and tiering coverage. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Transform the Data Culture of Your Company" + icon="MdInsights" + href="/how-to-guides/openmetadata/data-insights/data-culture"%} + Improve your data culture for data-driven decision making. 
+ {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/ingestion.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/ingestion.md new file mode 100644 index 000000000000..6e8e694223cd --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/ingestion.md @@ -0,0 +1,51 @@ +--- +title: Set Up Data Insights Ingestion +slug: /how-to-guides/openmetadata/data-insights/ingestion +--- + +# Set Up Data Insights Ingestion + +Admin users can set up a data insights ingestion pipeline right from the OpenMetadata UI. + +- Navigate to **Settings >> OpenMetadata >> Data Insights**. +- Click on **Add Data Insights Ingestion**. + +{% image +src="/images/v1.1/how-to-guides/insights/di1.png" +alt="Set Up Data Insights Ingestion" +caption="Set Up Data Insights Ingestion" +/%} + +- A default name is generated for the ingestion pipeline. You can leave it as it is or edit the name as required. +- You can choose to enable the Debug Log. + +{% image +src="/images/v1.1/how-to-guides/insights/di2.png" +alt="Set Up Data Insights Ingestion" +caption="Set Up Data Insights Ingestion" +/%} + +- Choose a schedule execution time for your workflow. The schedule time is displayed in UTC. We recommend running this workflow overnight or when activity on the platform is at its lowest to ensure accurate data. It is scheduled to run daily. +- Click on **Add & Deploy**. + +{% image +src="/images/v1.1/how-to-guides/insights/di3.png" +alt="Set Up Data Insights Ingestion Schedule" +caption="Set Up Data Insights Ingestion Schedule" +/%} + +{% image +src="/images/v1.1/how-to-guides/insights/di4.png" +alt="Data Insights Ingestion Created and Deployed" +caption="Data Insights Ingestion Created and Deployed" +/%} + +Navigate to the Insights page. 
You should see your [Data Insights Reports](/how-to-guides/openmetadata/data-insights/report). Note that if you have just deployed OpenMetadata, App Analytics data might not be present. App Analytics data is fetched from the previous day (UTC). + +{%inlineCallout + color="violet-70" + bold="Key Performance Indicators (KPI)" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-insights/kpi"%} + Define the KPIs and set goals for documentation, and ownership. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/kpi.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/kpi.md new file mode 100644 index 000000000000..e43347008966 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/kpi.md @@ -0,0 +1,61 @@ +--- +title: Key Performance Indicators (KPI) +slug: /how-to-guides/openmetadata/data-insights/kpi +--- + +# Key Performance Indicators (KPI) + +Admins can define the Key Performance Indicators (KPIs) and set goals within OpenMetadata to work towards **better documentation, ownership, and tiering**. These goals are based on data assets and driven to achieve targets within a specified timeframe. For example, Admins can set goals to have at least 60% of the entities documented, owned and tiered by the end of Q4 2023. + +The data insights feature in OpenMetadata helps organizations to decentralize documentation and ownership of data assets. Organizations can drive the adoption of OpenMetadata by setting up company-wide KPIs to track the documentation, ownership, and tiering of data assets. + +## KPI Categories + +OpenMetadata currently supports the following KPI Categories. + +**Completed Description:** This KPI measures the description coverage of your data assets in OpenMetadata. You can choose an absolute (number) or a relative (percentage) value. 
+ +**Completed Ownership:** This KPI measures the ownership coverage of your data assets in OpenMetadata. You can choose an absolute (number) or a relative (percentage) value. + +## How to Add KPIs + +When OpenMetadata is set up with data ingestion from third party sources, the details on the description, ownership, and tiering are also brought into OpenMetadata. You can track the existing documentation and ownership coverage and work towards a better data culture by setting up data insights. Configure the KPIs and set goals at an organizational level to encourage your team to get your data to a better state. + +To add KPIs: +- Navigate to **Insights** and click on **Add KPI**. + +{% image +src="/images/v1.1/how-to-guides/insights/kpi1.png" +alt="Add a KPI" +caption="Add a KPI" +/%} + +- Enter the following details on the KPI configuration page: + - **Select a Chart** among the available chart options. + - Enter a **Display Name**. + - Select the **Metric Type**, i.e., Percentage or Number. You can choose an absolute number or define a relative percentage of the data assets to be covered. + - Select a **Start and End Date** by which to achieve the KPI target. + - Add a **Description** to define what the KPI is about. +- Click on **Submit**. + +{% image +src="/images/v1.1/how-to-guides/insights/kpi2.png" +alt="Details of the KPI" +caption="Details of the KPI" +/%} + +{% image +src="/images/v1.1/how-to-guides/insights/kpi3.png" +alt="Ownership Coverage KPI Added" +caption="Ownership Coverage KPI Added" +/%} + +The line graph represents the progress made on a daily basis. It also displays the days left to achieve the target and the coverage so far. + +{%inlineCallout + color="violet-70" + bold="Data Insights Report" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-insights/report"%} + Get a quick glance of data asset description, ownership, and tiering coverage. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/report.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/report.md new file mode 100644 index 000000000000..2a8cd328b024 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/report.md @@ -0,0 +1,143 @@ +--- +title: Data Insights Report +slug: /how-to-guides/openmetadata/data-insights/report +--- + +# Data Insights Report + +The data insights report provides a quick glance at aspects like data ownership, description coverage, data tiering, and so on. Admins can view the aggregated user activity and get insights into user engagement and user growth. Admins can check for Daily active users and know how the tool is being used. + +OpenMetadata offers a suite of reports providing platform analytics around specific areas. The reports are available in three sections: +- Data Assets +- App Analytics +- KPIs + +{% image +src="/images/v1.1/how-to-guides/insights/insights1.png" +alt="Data Insights Report" +caption="Data Insights Report" +/%} + +All the reports can be filtered by **Teams, Data Tiers, and a Time Filter**. +{% image +src="/images/v1.1/how-to-guides/insights/insights2.png" +alt="Data Insights Report Filters: Team, Tier, Time" +caption="Data Insights Report Filters: Team, Tier, Time" +/%} + +## Data Assets Report +The Data Asset reports display important metrics around your data assets in OpenMetadata. This report also displays the organizational health at a glance with details on the Total Data Assets, Data Assets with Description, Owners, and Tiers. + +{% image +src="/images/v1.1/how-to-guides/insights/ohg.png" +alt="Organization Health at a Glance" +caption="Organization Health at a Glance" +/%} + +### Total Data Assets + +This chart represents the total number of data assets present in OpenMetadata. 
It offers a view of your data assets broken down by asset type (i.e. DatabaseSchema, Database, Dashboard, Chart, Topic, ML Model, etc.) + +{% image +src="/images/v1.1/how-to-guides/insights/tda.png" +alt="Total Data Assets" +caption="Total Data Assets" +/%} + +### Percentage of Data Assets with Description + +It displays the percentage of data assets with description by data asset type. + +{% image +src="/images/v1.1/how-to-guides/insights/pdad.png" +alt="Percentage of Data Assets with Description" +caption="Percentage of Data Assets with Description" +/%} + +### Percentage of Data Assets with Owners + +This chart represents the percentage of data assets present in OpenMetadata with an owner assigned. Data assets that do not support assigning an owner will not be counted in this percentage. It allows you to quickly view the ownership coverage for your data assets in OpenMetadata. + +{% image +src="/images/v1.1/how-to-guides/insights/pdao.png" +alt="Percentage of Data Assets with Owners" +caption="Percentage of Data Assets with Owners" +/%} + +### Total Data Assets by Tier + +It displays a broken down view of data assets by Tiers. Data Assets with no tiers assigned are not included in this. It allows you to quickly view the breakdown of data assets by tier. + +{% image +src="/images/v1.1/how-to-guides/insights/tdat.png" +alt="Total Data Assets by Tier" +caption="Total Data Assets by Tier" +/%} + +## App Analytics + +App analytics helps to track user engagement. This report provides important metrics around the usage of OpenMetadata. This report also displays the organizational health at a glance with details on the Page Views by Data Assets, Daily Active Users on the Platform, and the Most Active User. + +{% image +src="/images/v1.1/how-to-guides/insights/ohg2.png" +alt="Organization Health at a Glance" +caption="Organization Health at a Glance" +/%} + +### Most Viewed Data Assets + +Know the 10 most viewed data assets in your platform. 
It offers a quick view to identify the data assets of most interest in your organization. + +{% image +src="/images/v1.1/how-to-guides/insights/mvda.png" +alt="Most Viewed Data Assets" +caption="Most Viewed Data Assets" +/%} + +### Page Views by Data Assets + +It helps to understand the total number of page views by asset type. This allows you to understand which asset family drives the most interest in your organization. + +{% image +src="/images/v1.1/how-to-guides/insights/pvda.png" +alt="Page Views by Data Assets" +caption="Page Views by Data Assets" +/%} + +### Daily Active Users on the Platform + +Active users are users with at least one session. This report allows you to understand the platform usage and see how your organization leverages OpenMetadata. + +{% image +src="/images/v1.1/how-to-guides/insights/daup.png" +alt="Daily Active Users on the Platform" +caption="Daily Active Users on the Platform" +/%} + +### Most Active Users + +This report displays the most active users on the platform based on Page Views. They are the power users in your data team. + +{% image +src="/images/v1.1/how-to-guides/insights/mau.png" +alt="Most Active Users" +caption="Most Active Users" +/%} + +## Key Performance Indicators (KPI) + +While data insights reports give an analytical view of the OpenMetadata platform, KPIs are here to drive platform adoption. The report below displays the percentage coverage of description and ownership of the data assets. + +{% image +src="/images/v1.1/how-to-guides/insights/kpi.png" +alt="Key Performance Indicators (KPI)" +caption="Key Performance Indicators (KPI)" +/%} + +{%inlineCallout + color="violet-70" + bold="How to Transform the Data Culture of Your Company" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-insights/data-culture"%} + Improve your data culture for data-driven decision making. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/tiering.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/tiering.md new file mode 100644 index 000000000000..16e81780bdbf --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-insights/tiering.md @@ -0,0 +1,44 @@ +--- +title: What is Tiering +slug: /how-to-guides/openmetadata/data-insights/tiering +--- + +# What is Tiering + +Tiering is an important concept of data classification in OpenMetadata. Data Producers and Consumers can set business importance of data by setting Tiers. `Tier 1` is the most important data of an organization. + +In OpenMetadata, Tiers are System Classification tags and can be accessed from **Govern > Classification > Tier**. + +{% image +src="/images/v1.1/how-to-guides/governance/tier1.png" +alt="Classification Tags: Tiers" +caption="Classification Tags: Tiers" +/%} + +In case of tiering, it is easiest to start with the most important (Tier 1) and the least important (Tier 5) data. Once the **Tier 1** or most important data is identified, organizations can focus on improving the descriptions and data quality. The Data Insights in OpenMetadata helps identify the unused datasets as **Tier 5**. The Tier 5 datasets can be deleted periodically to declutter. Other tiers can be added as per your organizational needs. **Tags** can be added to further mark the data assets. 
+ +| **Tier** | **Impact** | **Used for** | **Type of Impact** | **Usage** | +|--- | --- | --- | --- | --- | +| **Tier 1** | High | External & Internal Decisions | Revenue, Regulatory, & Reputational | Highly used | +| **Tier 2** | Moderate | Some External & Mostly Internal Decisions | Some Regulatory | Highly used | +| **Tier 3** | Low | Internal Decisions | - | Highly used (Top N percentile) | +| **Tier 4** | Low | Internal Team Decisions | - | - | +| **Tier 5** | Individual owned | Unused Datasets | - | - | + +## How to Add Tiers + +From the **Explore** page, select a data asset and click on the edit icon for **Tier**. Select the appropriate tier. Clicking on the arrow next to the tier will provide a description of the tier. + +{% image +src="/images/v1.1/how-to-guides/governance/tier2.png" +alt="Add a Tier to Data Asset" +caption="Add a Tier to Data Asset" +/%} + +{%inlineCallout + color="violet-70" + bold="Set Up Data Insights Ingestion" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-insights/ingestion"%} + Set up the ingestion pipeline right from the UI. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/column.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/column.md new file mode 100644 index 000000000000..72d2b7be81aa --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/column.md @@ -0,0 +1,45 @@ +--- +title: How Column-Level Lineage Works +slug: /how-to-guides/openmetadata/data-lineage/column +--- + +# How Column-Level Lineage Works + +OpenMetadata supports rich column-level lineage for understanding the relationship between tables and to perform impact analysis. Users can manually edit both the table and column level lineage to capture any information that is not automatically surfaced. 
+ +{% image +src="/images/v1.1/how-to-guides/lineage/lineage1.png" +alt="Column-Level Data Lineage in OpenMetadata" +caption="Column-Level Data Lineage in OpenMetadata" +/%} + +{% note noteType="Tip" %} **Quick Tip:** Drill down to view all the available columns for a table when viewing column-level lineage. {% /note %} + +You can generate the column-level lineage automatically by running the **Lineage Ingestion**. + +{% image +src="/images/v1.1/how-to-guides/lineage/ingestion.png" +alt="Lineage Ingestion" +caption="Lineage Ingestion" +/%} + +## Manually Edit Column Level Lineage + +OpenMetadata supports manual editing of both table and column level lineage. You can edit the lineage for the individual columns by clicking on the edit option on the top right. Use the anchor points on either side of the columns to create links and trace individual columns through their lineage. You can also add new tables that have columns you want to trace. Connect the relevant columns to the current lineage. + +{% image +src="/images/v1.1/how-to-guides/lineage/column1.png" +alt="Manually Edit Column Level Lineage" +caption="Manually Edit Column Level Lineage" +/%} + +Watch the video on editing column-level lineage. +{% youtube videoId="HTkbTvi2H9c" start="0:00" end="00:51" /%} + +{%inlineCallout + color="violet-70" + bold="Manual Lineage" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-lineage/manual"%} + Edit the table and column level lineage manually. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/explore.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/explore.md new file mode 100644 index 000000000000..506af4806915 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/explore.md @@ -0,0 +1,63 @@ +--- +title: Explore the Lineage View +slug: /how-to-guides/openmetadata/data-lineage/explore +--- + +# Explore the Lineage View + +OpenMetadata UI displays end-to-end lineage traceability for the table and column levels. OpenMetadata supports lineage for Database, Dashboard, and Pipelines. Just search for a data asset and expand the graph to unfold lineage. It displays the upstream and downstream edges for each node. The lineage details specify the SQL query, pipeline information, and column lineage. + +In the lineage view example below, the table on the left is the parent or **Source** node. The table on the right is the **Target** node. You can also identify the target node by looking at the arrow attached to it. The arrow connecting the data assets or tables is the **Edge**. Clicking on an edge connecting a source and a target will display all the edge information: the Source, Target, Description, and SQL Query. It displays the SQL query used to generate the view (the table is of the Type View). The SQL query provides information on how the target table was generated from the source table. + +{% image +src="/images/v1.1/how-to-guides/lineage/edge.png" +alt="Edge Information: Source and Target" +caption="Edge Information: Source and Target" +/%} + +{% note noteType="Tip" %} **Tip:** Metadata ingestion also brings in the View Lineage, if the database has views (Data assets of the Type View). 
{% /note %} + +You can set up the **Lineage Config** to display the required number of Upstream and Downstream Nodes, as well as the Nodes per layer. You can set up to **3** Upstream and Downstream Nodes. +{% image +src="/images/v1.1/how-to-guides/lineage/nodes.png" +alt="Lineage Config" +caption="Lineage Config" +/%} + +You can click on the data assets to view the data asset details. +- Users can view the Source, Name of the Data Asset, Description, Owner (Team/User details), Tier, and Usage information for the data asset. +- Based on the **type of data asset** (Table, Topic, Dashboard, Pipeline, ML Model, Container), the quick preview provides additional information. For example, for `tables`, the type of table, the number of queries, and columns are displayed. +- The **data quality and profiler metrics** display the details on the Tests Passed, Aborted, and Failed. +- Users can view all the **tags** associated with the data asset. +- The **Schema** provides the details on the column names, type of column, and column description. + +{% image +src="/images/v1.1/how-to-guides/lineage/lineage2.png" +alt="Quick Glance at the Data Asset from Lineage View" +caption="Quick Glance at the Data Asset from Lineage View" +/%} + +Clicking on the tables will display the list of columns and column-level lineage. +{% image +src="/images/v1.1/how-to-guides/lineage/lineage1.png" +alt="Column-Level Data Lineage in OpenMetadata" +caption="Column-Level Data Lineage in OpenMetadata" +/%} + +In the case of **Pipelines**, we first have the lineage ingested from the databases. Further, when setting up the pipeline ingestion, we specify the database service name. That way we display the lineage of the database tables connected via pipelines. If a lineage is created through a pipeline, the same is displayed in the Edge information. 
+ +{% image +src="/images/v1.1/how-to-guides/lineage/pipeline.png" +alt="Database and Pipeline Lineage" +caption="Database and Pipeline Lineage" +/%} + +Similarly for a **Dashboard**, we first have the lineage ingested from the databases. Further, when setting up the dashboard ingestion, the data models and charts are ingested. That way we display the lineage of the database tables connected using the dashboard data models. + +{%inlineCallout + color="violet-70" + bold="Column-Level Lineage" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-lineage/column"%} + Explore and edit the rich column-level lineage. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/index.md new file mode 100644 index 000000000000..472af1b34b11 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/index.md @@ -0,0 +1,49 @@ +--- +title: Data Lineage +slug: /how-to-guides/openmetadata/data-lineage +--- + +# Overview of Data Lineage + +OpenMetadata tracks data lineage, showing how data moves through the organization's systems. Users can visualize how data is transformed and where it is used, helping with data traceability and impact analysis. OpenMetadata supports lineage for Database, Dashboard, and Pipelines. + +{% image +src="/images/v1.1/how-to-guides/lineage/lineage1.png" +alt="Data Lineage in OpenMetadata" +caption="Data Lineage in OpenMetadata" +/%} + +Watch the video on data lineage to understand the different options to automatically extract the lineage from data warehouses such as Snowflake and dashboard services such as Metabase. You can also learn about creating lineage programmatically with the Python SDK. 
+ +{% youtube videoId="jEbN1tt89H0" start="0:00" end="41:43" /%} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Lineage Workflow" + icon="MdPolyline" + href="/how-to-guides/openmetadata/data-lineage/workflow"%} + Configure a lineage workflow right from the UI. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Explore Lineage" + icon="MdPolyline" + href="/how-to-guides/openmetadata/data-lineage/explore"%} + Explore the rich lineage view in OpenMetadata. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Column-Level Lineage" + icon="MdViewColumn" + href="/how-to-guides/openmetadata/data-lineage/column"%} + Explore and edit the rich column-level lineage. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Manual Lineage" + icon="MdPolyline" + href="/how-to-guides/openmetadata/data-lineage/manual"%} + Edit the table and column level lineage manually. + {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/manual.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/manual.md new file mode 100644 index 000000000000..1ba35c139bf8 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/manual.md @@ -0,0 +1,42 @@ +--- +title: How to Manually Add or Edit Lineage +slug: /how-to-guides/openmetadata/data-lineage/manual +--- + +# How to Manually Add or Edit Lineage + +Edit lineage to provide a richer understanding of the provenance of data. The OpenMetadata no-code editor provides a drag and drop interface. Drop tables, topics, pipelines, dashboards, ML models, and containers onto the lineage graph. You may add new edges or delete existing edges to better represent data lineage. + +OpenMetadata supports manual editing of both table and column level lineage. 
We can build the lineage by creating edges. You can connect the source of the lineage to the destination by connecting the nodes. + +Once you have ingested your database and dashboard services: +- Start by picking one database service, and select a table. In the data asset details page, navigate to the Lineage Tab. +- Click on the Edit option to enable the lineage editor. +- Select the type of data asset (table, topic, dashboard, ML model, container, pipeline) to connect to as the destination. + +{% image +src="/images/v1.1/how-to-guides/lineage/l1.png" +alt="Data Asset: Lineage Tab" +caption="Data Asset: Lineage Tab" +/%} + +- Search and select the relevant data asset. +- Create an edge between these two data assets. + +{% image +src="/images/v1.1/how-to-guides/lineage/l2.png" +alt="Link the Table to the Dashboard to Add Lineage Manually" +caption="Link the Table to the Dashboard to Add Lineage Manually" +/%} + +- You can also expand a table to view the available columns. +- Link the relevant columns together by connecting the column edges to trace column-level lineage. + +{% image +src="/images/v1.1/how-to-guides/lineage/l3.png" +alt="Column-Level Lineage" +caption="Column-Level Lineage" +/%} + +Watch the video about lineage (13:30 to 15:50). +{% youtube videoId="jEbN1tt89H0" start="13:30" end="15:48" /%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/workflow.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/workflow.md new file mode 100644 index 000000000000..c168c4aa4d70 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-lineage/workflow.md @@ -0,0 +1,89 @@ +--- +title: How to Deploy a Lineage Workflow +slug: /how-to-guides/openmetadata/data-lineage/workflow +--- + +# How to Deploy a Lineage Workflow + +Lineage data can be ingested from your data sources right from the OpenMetadata UI. 
Currently, the lineage workflow is supported for a limited set of connectors, like [BigQuery](/connectors/database/bigquery), [Snowflake](/connectors/database/snowflake), [MSSQL](/connectors/database/mssql), [Redshift](/connectors/database/redshift), [Clickhouse](/connectors/database/clickhouse), [Postgres](/connectors/database/postgres), [Databricks](/connectors/database/databricks). + +{% note noteType="Tip" %} **Tip:** Trace the upstream and downstream dependencies with Lineage. {% /note %} + +## View Lineage from Metadata Ingestion +Once the metadata ingestion runs correctly, and we are able to explore the service Entities, we can add the view lineage information for the data assets. This will populate the Lineage tab in the data asset page. During the Metadata Ingestion workflow we differentiate if a Table is a View. For those sources, where we can obtain the query that generates the View, we bring in the view lineage along with the metadata. After all Tables have been ingested in the workflow, it's time to parse all the queries generating Views. During the query parsing, we will obtain the source and target tables, search if the Tables exist in OpenMetadata, and finally create the lineage relationship between the involved Entities. + +If the database has views, then the view lineage would be generated automatically, along with the column-level lineage. In such a case, the table type is **View** as shown in the example below. + {% image + src="/images/v1.1/how-to-guides/lineage/view.png" + alt="View Lineage through Metadata Ingestion" + caption="View Lineage through Metadata Ingestion" + /%} + +## Lineage Ingestion from UI +Apart from the Metadata ingestion, we can create a workflow that will obtain the query log and table creation information from the underlying database and feed it to OpenMetadata. The Lineage Ingestion will be in charge of obtaining this data. 
The metadata ingestion will only bring in the View lineage queries, whereas the lineage ingestion workflow will bring in all those queries that can be used to generate lineage information. + +### 1. Add a Lineage Ingestion + +Navigate to **Settings >> Services**. Select the required service. + {% image + src="/images/v1.1/how-to-guides/lineage/wkf1.png" + alt="Select a Service" + caption="Select a Service" + /%} + +Go to the **Ingestions** tab. Click on **Add Ingestion** and select **Add Lineage Ingestion**. + {% image + src="/images/v1.1/how-to-guides/lineage/wkf2.png" + alt="Add a Lineage Ingestion" + caption="Add a Lineage Ingestion" + /%} + +### 2. Configure the Lineage Ingestion + +Here you can enter the Lineage Ingestion details: + {% image + src="/images/v1.1/how-to-guides/lineage/wkf3.png" + alt="Configure the Lineage Ingestion" + caption="Configure the Lineage Ingestion" + /%} + +### Lineage Options + +**Query Log Duration:** Specify the duration in days for which the profiler should capture lineage data from the query logs. For example, if you specify 2 as the value for the duration, the data profiler will capture lineage information for the 2 **days** (48 hours) prior to when the ingestion workflow is run. + +**Parsing Timeout Limit:** Specify the timeout limit for parsing the SQL queries to perform the lineage analysis. This must be specified in **seconds**. + +**Result Limit:** Set the limit for the query log results to be run at a time. This is the **number of rows**. + +**Filter Condition:** We execute a query on the query history table of the respective data source to perform the query analysis and extract the lineage and usage information. This field is useful when you want to exclude some queries from this analysis. In this field you can specify a SQL condition that will be applied on the query history result set. You can check more about [Usage Query Filtering here](/connectors/ingestion/workflows/usage/filter-query-set). 
+ +### 3. Schedule and Deploy + +After clicking Next, you will be redirected to the Scheduling form. This will be the same as the Metadata Ingestion. Select your desired schedule and click on Deploy. The lineage pipeline is then added to the Service Ingestions. + {% image + src="/images/v1.1/how-to-guides/lineage/wkf4.png" + alt="Schedule and Deploy the Lineage Ingestion" + caption="Schedule and Deploy the Lineage Ingestion" + /%} + +## dbt Ingestion + +We can also generate lineage through [dbt ingestion](/connectors/ingestion/workflows/dbt/ingest-dbt-ui). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column level lineage through dbt. + +You can learn more about [lineage ingestion here](/connectors/ingestion/lineage). + +## Query Logs using CSV File + +Lineage ingestion is supported for a few connectors as mentioned earlier. For the unsupported connectors, you can set up [Lineage Workflows using Query Logs](/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs) using a CSV file. + +## Manual Lineage + +Lineage can also be added and edited manually in OpenMetadata. For more information, refer to [adding lineage manually](/how-to-guides/openmetadata/data-lineage/manual). + +{%inlineCallout + color="violet-70" + bold="Explore Lineage" + icon="MdArrowForward" + href="/how-to-guides/openmetadata/data-lineage/explore"%} + Explore the rich lineage view in OpenMetadata. 
+{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/alerts.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/alerts.md new file mode 100644 index 000000000000..90ea49140d13 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/alerts.md @@ -0,0 +1,33 @@ +--- +title: How to Set Alerts for Test Case Fails +slug: /how-to-guides/openmetadata/data-quality-profiler/alerts +--- + +# How to Set Alerts for Test Case Fails + +Users can set up alerts to be notified when data quality tests fail. + +To set up an alert for test failures: +- Navigate to **Settings >> Alerts** +- Click on **Create Alert** + +{% image +src="/images/v1.1/how-to-guides/quality/alert1.png" +alt="Set up Alerts for Test Failure" +caption="Set up Alerts for Test Failure" +/%} + +Enter the following details: +- **Name:** Add a name for the alert. +- **Description:** Describe what the alert is for. +- **Trigger:** Uncheck the trigger for `All` and add a trigger for `Test Case`. +- **Filters:** Add filters to narrow down to the `Test Results` which `Failed`. You can also add another filter to specify the `FQN` to only include the tables that you want to consider. +- **Destination:** Specify the destination where the test failure notification must be sent. The alerts can be sent to Email, Slack, MS Teams, Google Chat, and other Webhooks. Notifications can also be sent only to Admins, Owners and Followers of data assets. + +{% image +src="/images/v1.1/how-to-guides/quality/alert2.png" +alt="Configure an Alert for Test Failure" +caption="Configure an Alert for Test Failure" +/%} + +**Save** the details to create an Alert. 
\ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/index.md new file mode 100644 index 000000000000..bce041bf88ef --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/index.md @@ -0,0 +1,56 @@ +--- +title: Data Quality and Profiler +slug: /how-to-guides/openmetadata/data-quality-profiler +--- + +# Overview of Data Quality and Profiler + +With OpenMetadata, you can build trust in your data by creating tests to monitor that the data is complete, fresh, and accurate. OpenMetadata supports data quality tests for all of the supported database connectors. Users can run tests at the table and column levels to verify their data for business as well as technical use cases. + +The profiler in OpenMetadata helps to understand the shape of your data and to quickly validate assumptions. The data profiler helps to capture table usage statistics over a period of time. This happens as part of profiler ingestion. Data profiles enable you to check for null values in non-null columns, for duplicates in a unique column, etc. You can gain a better understanding of column data distributions through the descriptive statistics provided. + +OpenMetadata provides Data Quality workflows, which help with: +- **Native tests** for all database connectors to run assertions. +- **Alerting system** to send notifications on test failure. +- **Health dashboard** to track real-time test failures and to prioritize efforts. +- **Resolution workflow** to inform the data consumer on test resolutions. + +The data quality in OpenMetadata is also **extensible** to adapt to your needs. 
+ +{% image +src="/images/v1.1/how-to-guides/quality/quality1.png" +alt="Profiler & Data Quality" +caption="Profiler & Data Quality" +/%} + +Watch the video to understand OpenMetadata’s native Data Profiler and Data Quality tests. + +{% youtube videoId="gLdTOF81YpI" start="0:00" end="1:08:10" /%} + +Watch the video on Data Quality Simplified to effortlessly build, deploy, monitor, and configure alerts using OpenMetadata's no-code platform + +{% youtube videoId="ihwtuNHt1kI" start="0:00" end="29:08" /%} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Profiler and Data Quality Tab" + icon="MdSecurity" + href="/how-to-guides/openmetadata/data-quality-profiler/tab"%} + Get a complete picture of the Table Profile, Column Profile, and Data Quality details. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Write and Deploy No-Code Test Cases" + icon="MdSecurity" + href="/how-to-guides/openmetadata/data-quality-profiler/test"%} + Verify your data quality with table and column level tests. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Set Alerts for Test Case Fails" + icon="MdSecurity" + href="/how-to-guides/openmetadata/data-quality-profiler/alerts"%} + Get notified when a data quality test fails. + {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/tab.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/tab.md new file mode 100644 index 000000000000..ed6092cfe6cf --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/tab.md @@ -0,0 +1,147 @@ +--- +title: Profiler and Data Quality Tab +slug: /how-to-guides/openmetadata/data-quality-profiler/tab +--- + +# Profiler and Data Quality Tab + +The Profiler & Data Quality tab is displayed only for Tables. 
It has three sub-tabs for **Table Profile, Column Profile, and Data Quality**. + +## Table Profile Tab + +The table profile helps to monitor and understand the table structure. It displays the number of **rows and columns** in the table. You can view these details over a timeframe to understand how the table has been evolving. It displays the **profile sample** either as an absolute number or as a percentage of data. You also get details on the **size of the data** as well as when the table was created. + +{% image +src="/images/v1.1/how-to-guides/quality/tp.png" +alt="Table Profile" +caption="Table Profile" +/%} + +The Table Profile tab also displays timeseries graphs on **Data Volume, Table Updates, and Volume Change**. + +### Data Volume + +The **Data Volume** chart gives an overview of how the data is evolving across a time period. + +{% image +src="/images/v1.1/how-to-guides/quality/dv.png" +alt="Table Profile: Data Volume" +caption="Table Profile: Data Volume" +/%} + +### Table Updates +In the **Table Updates** chart, users can view the changes that happened in the table in terms of data inserts, updates, and deletes. + +{% image +src="/images/v1.1/how-to-guides/quality/tu.png" +alt="Table Profile: Table Updates" +caption="Table Profile: Table Updates" +/%} + +### Volume Change + +In the **Volume Change** chart, users can view the changes that happened in the table in terms of data volume for inserts, updates, and deletes. + +{% image +src="/images/v1.1/how-to-guides/quality/vc.png" +alt="Table Profile: Volume Change" +caption="Table Profile: Volume Change" +/%} + +## Column Profile Tab + +The Column Profile tab provides a summary of table metrics similar to the Table Profile tab. It displays the number of **rows and columns** over a period of time. It displays the **profile sample** either as an absolute number or as a percentage of data. You also get details on the **size of the data** as well as when the table was created. 
+ +The column profile helps to monitor and understand the column structure with a summary of metrics for every column. You can view the type of each column, the value count, null value %, distinct value %, unique %, the tests run as well as the test status. + +{% image +src="/images/v1.1/how-to-guides/quality/cp.png" +alt="Column Profile of a Table" +caption="Column Profile of a Table" +/%} + +By clicking on any column, you can view more detailed reports about that column. + +{% image +src="/images/v1.1/how-to-guides/quality/cp1.png" +alt="Column Profile of a Column" +caption="Column Profile of a Column" +/%} + +The Column Profile for a particular column also displays timeseries graphs on **Data Counts, Data Proportions, Data Range, Data Aggregate, Data Quartiles, and Data Distribution**. Based on the type of column you are viewing, you can verify the accuracy of the data values. + +### Data Counts + +The data counts chart provides information on the **Distinct Count, Null Count, Unique Count, and Values Count**. + +{% image +src="/images/v1.1/how-to-guides/quality/dc.png" +alt="Column Profile: Data Counts" +caption="Column Profile: Data Counts" +/%} + +### Data Proportions + +The data proportions chart displays the **Distinct, Null, and Unique Proportions**. + +{% image +src="/images/v1.1/how-to-guides/quality/dp.png" +alt="Column Profile: Data Proportions" +caption="Column Profile: Data Proportions" +/%} + +### Data Range + +The length of the strings stored in the database is profiled. The data range displays the Minimum, Maximum, and Mean values, which can be helpful for users who are doing NLP or text analysis. 
+ +{% image +src="/images/v1.1/how-to-guides/quality/dr.png" +alt="Column Profile: Data Range" +caption="Column Profile: Data Range" +/%} + +### Data Aggregate + +{% image +src="/images/v1.1/how-to-guides/quality/da.png" +alt="Column Profile: Data Aggregate" +caption="Column Profile: Data Aggregate" +/%} + +### Data Quartiles + +This chart displays the First Quartile, Median, Inter Quartile Range, and the Third Quartile. + +{% image +src="/images/v1.1/how-to-guides/quality/dq.png" +alt="Column Profile: Data Quartiles" +caption="Column Profile: Data Quartiles" +/%} + +### Data Distribution + +The distribution of the character length inside the column is displayed to help you get a sense of the structure of your data. + +{% image +src="/images/v1.1/how-to-guides/quality/dd.png" +alt="Column Profile: Data Distribution" +caption="Column Profile: Data Distribution" +/%} + +## Data Quality Tab + +Data quality tests can be run on the sample data. We can add tests at the table and column level. The Data Quality tab displays the total number of tests that were run, and also the number of tests that were successful, aborted, or failed. The list of test cases displays the details of the table or column on which the test was run. + +{% image +src="/images/v1.1/how-to-guides/quality/dq1.png" +alt="Profiler & Data Quality" +caption="Profiler & Data Quality" +/%} + +You can click on a Test Case to view further details. You can use a time filter on these reports. You can also edit these tests by clicking on the pencil icon next to each test. 
+ +{% image +src="/images/v1.1/how-to-guides/quality/dq2.png" +alt="Details of a Test Case" +caption="Details of a Test Case" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/test.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/test.md new file mode 100644 index 000000000000..f701fb3717cb --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/data-quality-profiler/test.md @@ -0,0 +1,145 @@ +--- +title: How to Write and Deploy No-Code Test Cases +slug: /how-to-guides/openmetadata/data-quality-profiler/test +--- + +# How to Write and Deploy No-Code Test Cases + +OpenMetadata supports data quality tests at the table and column level on all of the supported database connectors. OpenMetadata supports both business-oriented tests as well as data engineering tests. The data engineering tests are more on the technical side to ascertain a sanity check on the data. It ensures that your data meets the technical definition of the data assets, like the columns are not null, columns are unique, etc. + +There is no need to fill a YAML file or a JSON config file to set up data quality tests in OpenMetadata. You can simply select the options and add in the details right from the UI to set up test cases. + +To create a test in OpenMetadata: + +- Navigate to the table you would like to create a test for. Click on the **Profiler & Data Quality** tab. +- Click on **Add Test** to select a `Table` or `Column` level test. + +{% image +src="/images/v1.1/how-to-guides/quality/test1.png" +alt="Write and Deploy No-Code Test Cases" +caption="Write and Deploy No-Code Test Cases" +/%} + +## Table Level Test + +To create a **Table Level Test** enter the following details: +- **Name:** Add a name that best defines your test case. +- **Test Type:** Based on the test type, you will have further fields to define your test. 
+- **Description:** Describe the test case. +Click on **Submit** to set up a test. + +OpenMetadata currently supports the following table level test types: +1. Table Column Count to be Between: Define the Min. and Max. +2. Table Column Count to Equal: Define a number. +3. Table Column Name to Exist: Define a column name. +4. Table Column Names to Match Set: Add comma-separated column names to match. You can also verify if the column names are in order. +5. Custom SQL Query: Define a SQL expression. Select whether the strategy applies to Rows or to Count. Define a threshold to determine if the test passes or fails. +6. Table Row Count to be Between: Define the Min. and Max. +7. Table Row Count to Equal: Define a number. +8. Table Row Inserted Count to be Between: Define the Min. and Max. row count. This test works for columns whose values are of the type Timestamp, Date, or Date Time. Specify the range type in terms of Hour, Day, Month, or Year. Define the interval based on the range type selected. + +{% image +src="/images/v1.1/how-to-guides/quality/test4.png" +alt="Configure a Table Level Test" +caption="Configure a Table Level Test" +/%} + +## Column Level Test + +To create a **Column Level Test** enter the following details: +- **Column:** Select a column. On the right-hand side, you can view some context about that column. +- **Name:** Add a name that best defines your test case. +- **Test Type:** Based on the test type, you will have further fields to define your test. +- **Description:** Describe the test case. +Click on **Submit** to set up a test. + +OpenMetadata currently supports the following column level test types: +1. Column Value Lengths to be Between: Define the Min. and Max. +2. Column Value Max. to be Between: Define the Min. and Max. +3. Column Value Mean to be Between: Define the Min. and Max. +4. Column Value Median to be Between: Define the Min. and Max. +5. Column Value Min. to be Between: Define the Min. and Max. +6. 
Column Values Missing Count: Define the number of missing values. You can also match all null and empty values as missing. You can also configure additional missing strings like N/A. +7. Column Values Sum to be Between: Define the Min. and Max. +8. Column Value Std Dev to be Between: Define the Min. and Max. +9. Column Values to be Between: Define the Min. and Max. +10. Column Values to be in Set: You can add an array of allowed values. +11. Column Values to be Not in Set: You can add an array of forbidden values. +12. Column Values to be Not Null +13. Column Values to be Unique +14. Column Values to Match Regex Pattern: Define the regular expression that the column entries should match. +15. Column Values to Not Match Regex: Define the regular expression that the column entries should not match. + +{% image +src="/images/v1.1/how-to-guides/quality/test2.png" +alt="Configure a Column Level Test" +caption="Configure a Column Level Test" +/%} + +Once the test has been created, you can view the test suite. The test case will be displayed in the Data Quality tab. You can also edit the Display Name and Description for the test. + +{% image +src="/images/v1.1/how-to-guides/quality/test3.png" +alt="Column Level Test Created" +caption="Column Level Test Created" +/%} + +A pipeline can be set up for the tests to run at a regular cadence. +- Click on the `Pipeline` tab +- Add a pipeline + +{% image +src="/images/v1.1/how-to-guides/quality/test5.png" +alt="Set up a Pipeline" +caption="Set up a Pipeline" +/%} + +- Set up the scheduler for the desired frequency. The timezone is in UTC. +- Click on **Submit**. + +{% image +src="/images/v1.1/how-to-guides/quality/test6.png" +alt="Schedule the Pipeline" +caption="Schedule the Pipeline" +/%} + +The pipeline has been set up and will run at the scheduled time. 
+ +{% image +src="/images/v1.1/how-to-guides/quality/test7.png" +alt="Pipeline Scheduled" +caption="Pipeline Scheduled" +/%} + +The tests will be run and the results will be updated in the Data Quality tab. + +{% image +src="/images/v1.1/how-to-guides/quality/test8.png" +alt="Data Quality Tests" +caption="Data Quality Tests" +/%} + +If a **test fails**, you can **Edit the Test Status** to New, Acknowledged, or Resolved status by clicking on the Status icon. + +{% image +src="/images/v1.1/how-to-guides/quality/test9.png" +alt="Failed Test: Edit Status" +caption="Failed Test: Edit Status" +/%} + +- Select the Test Status +{% image +src="/images/v1.1/how-to-guides/quality/test10.png" +alt="Edit Test Status" +caption="Edit Test Status" +/%} + +- If you are marking the test status as **Resolved**, you must specify the **Reason** for the failure and add a **Comment**. The reasons for failure can be Duplicates, False Positive, Missing Data, Other, or Out of Bounds. +- Click on **Submit**. +{% image +src="/images/v1.1/how-to-guides/quality/test11.png" +alt="Resolved Status: Reason" +caption="Resolved Status: Reason" +/%} + +Users can also set up [alerts](/how-to-guides/openmetadata/data-quality-profiler/alerts) to be notified when a test fails. \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/index.md new file mode 100644 index 000000000000..111abafc802b --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/openmetadata/index.md @@ -0,0 +1,57 @@ +--- +title: The Six Pillars of OpenMetadata +slug: /how-to-guides/openmetadata +--- + +# The Six Pillars of OpenMetadata + +OpenMetadata is an all-in-one platform for data discovery, lineage, data quality, observability, governance, and team collaboration. 
Powered by a centralized metadata store based on Open Metadata Standards/APIs, supporting connectors to a wide range of data services, OpenMetadata enables end-to-end metadata management, giving you the freedom to unlock the value of your data assets. + +OpenMetadata is a complete package for data teams to break down team silos, share data assets from multiple sources securely, collaborate around data, and build a documentation-first data culture in the organization. + +Let us learn more about the six pillars of OpenMetadata that help it maintain its ground as the best in effective metadata management. + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Data Discovery" + icon="MdSearch" + href="/how-to-guides/openmetadata/data-discovery"%} + Discover the right data assets to make timely business decisions. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Collaboration" + icon="MdGroups" + href="/how-to-guides/openmetadata/data-collaboration"%} + Foster data team collaboration to enhance data understanding. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Quality and Profiler" + icon="MdSecurity" + href="/how-to-guides/openmetadata/data-quality-profiler"%} + Trust your data with quality tests that ensure freshness and accuracy. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Lineage" + icon="MdPolyline" + href="/how-to-guides/openmetadata/data-lineage"%} + Trace the path of data across tables, pipelines, and dashboards. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Insights" + icon="MdInsertChart" + href="/how-to-guides/openmetadata/data-insights"%} + Define KPIs and set goals to proactively hone the data culture of your company. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Governance" + icon="MdMenuBook" + href="/how-to-guides/openmetadata/data-governance"%} + Enhance your data platform governance using OpenMetadata. 
+ {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/delete-service-connection.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/delete-service-connection.md new file mode 100644 index 000000000000..182573ad5791 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/delete-service-connection.md @@ -0,0 +1,20 @@ +--- +title: How to Delete a Service Connection +slug: /how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata/delete-service-connection +--- + +# How to Delete a Service Connection + +To delete a service connection, navigate to the service page and click on the ⋮ icon on the right of the page, and click on **Delete**. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/delete1.png" + alt="Delete a Service Connection" + caption="Delete a Service Connection" + /%} + +To permanently delete the database, type DELETE and Confirm. 
+{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/delete2.png" + alt="Permanently Delete the Database" + caption="Permanently Delete the Database" + /%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata.md new file mode 100644 index 000000000000..db250d4b896a --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata.md @@ -0,0 +1,169 @@ +--- +title: How to Ingest Metadata +slug: /how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata +--- + +# How to Ingest Metadata + +*This section deals with integrating third-party sources with OpenMetadata and running the workflows from the UI.* + +OpenMetadata gives you the flexibility to bring in your data from third-party sources using CLI, or the UI. Let’s start with ingesting your metadata from various sources through the UI. Follow the easy steps to add a connector to fetch metadata on a regular basis at your desired frequency. + +{% note %} + +**Note:** Ensure that you have **Admin access** in the source tools to be able to add a connector and ingest metadata. + +{% /note %} + +Admin users can connect to multiple data sources like Databases, Dashboards, Pipelines, ML Models, Messaging, Storage, as well as Metadata services. + +{%note%} + +{%inlineCallout +color="violet-70" +bold="Connector Documentation" +icon="add_moderator" +href="/connectors"%} +Refer to the Docs to ingest metadata from multiple sources - Databases, Dashboards, Pipelines, ML Models, Messaging, Storage, as well as Metadata services. 
+ {%/inlineCallout%} + +- **Database Services:** [Athena](/connectors/database/athena), [AzureSQL](/connectors/database/azuresql), [BigQuery](/connectors/database/bigquery), [Clickhouse](/connectors/database/clickhouse), [Databricks](/connectors/database/databricks), [Datalake](/connectors/database/datalake), [DB2](/connectors/database/db2), [DeltaLake](/connectors/database/deltalake), [Domo Database](/connectors/database/domo-database), [Druid](/connectors/database/druid), [DynamoDB](/connectors/database/dynamodb), [Glue](/connectors/database/glue), [Hive](/connectors/database/hive), [Impala](/connectors/database/impala), [MariaDB](/connectors/database/mariadb), [MongoDB](/connectors/database/mongodb), [MSSQL](/connectors/database/mssql), [MySQL](/connectors/database/mysql), [Oracle](/connectors/database/oracle), [PinotDB](/connectors/database/pinotdb), [Postgres](/connectors/database/postgres), [Presto](/connectors/database/presto), [Redshift](/connectors/database/redshift), [Salesforce](/connectors/database/salesforce), [SAP Hana](/connectors/database/sap-hana), [SingleStore](/connectors/database/singlestore), [Snowflake](/connectors/database/snowflake), [SQLite](/connectors/database/sqlite), [Trino](/connectors/database/trino), and [Vertica](/connectors/database/vertica). + +- **Dashboard Services:** [Domo Dashboard](/connectors/dashboard/domo-dashboard), [Looker](/connectors/dashboard/looker), [Metabase](/connectors/dashboard/metabase), [Mode](/connectors/dashboard/mode), [PowerBI](/connectors/dashboard/powerbi), [Qlik Sense](/connectors/dashboard/qliksense), [QuickSight](/connectors/dashboard/quicksight), [Redash](/connectors/dashboard/redash), [Superset](/connectors/dashboard/superset), and [Tableau](/connectors/dashboard/tableau). + +- **Messaging Services:** [Kafka](/connectors/messaging/kafka), [Kinesis](/connectors/messaging/kinesis), and [Redpanda](/connectors/messaging/redpanda). 
+ +- **Pipeline Services:** [Airbyte](/connectors/pipeline/airbyte), [Airflow](/connectors/pipeline/airflow), [Dagster](/connectors/pipeline/dagster), [Databricks Pipeline](/connectors/pipeline/databricks-pipeline), [Domo Pipeline](/connectors/pipeline/domo-pipeline), [Fivetran](/connectors/pipeline/fivetran), [Glue Pipeline](/connectors/pipeline/glue-pipeline), [NiFi](/connectors/pipeline/nifi), and [Spline](/connectors/pipeline/spline). + +- **ML Model Services:** [MLflow](/connectors/ml-model/mlflow), and [Sagemaker](/connectors/ml-model/sagemaker). + +- **Storage Service:** [Amazon S3](/connectors/storage/s3) + +- **Metadata Services:** [Amundsen](/connectors/metadata/amundsen), and [Atlas](/connectors/metadata/atlas) + +{%/note%} + +Let’s start with an example of fetching metadata from a database service, i.e., Snowflake. + +- Start by creating a service connection by clicking on **Settings** from the left nav bar. Navigate to the **Services** section, and click on **Databases**. Click on **Add New Service**. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/connector1.jpg" + alt="Create a Service Connection" + caption="Create a Service Connection" + /%} + +- Select the Database service of your choice. For example, Snowflake. Click **Next**. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/connector2.jpg" + alt="Select the Database Connector" + caption="Select the Database Connector" + /%} + +- To configure Snowflake, enter a unique service name. Click **Next**. + - **Name:** No spaces allowed. Apart from letters and numbers, you can use _ - . & ( ) + - **Description:** It is optional, but best to add documentation to improve data culture. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake1.png" + alt="Configure Snowflake" + caption="Configure Snowflake" + /%} + +- Enter the **Connection Details**. 
The Connector documentation is available right within OpenMetadata in the right side panel. The connector details will differ based on the service selected. Users can add their credentials to create a service and further set up the workflows. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake2.png" + alt="Connection Details" + caption="Connection Details" + /%} + +- Users can **Test the Connection** before creating the service. Test Connection checks for access, and also about what details can be ingested using the connection. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake3.png" + alt="Test the Connection" + caption="Test the Connection" + /%} + +- The **Connection Status** will verify access to the service as well as to the data assets. Once the connection has been tested, you can save the details. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/testconnection1.png" + alt="Connection Successful" + caption="Connection Successful" + /%} + +- Once the database service is created and the connections are established, admins can set up Pipelines to ingest all the source data into OpenMetadata. + - Clicking on **View Service** will navigate to the Database service page, where you can view the Databases, Ingestion, and Connection Details Tabs. You can also **Add the Metadata Ingestion** from the Ingestion tab. + + - Or, you can directly start with **Adding Ingestion**. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake4.png" + alt="Snowflake Service Created" + caption="Snowflake Service Created" + /%} + +{% note %} +**Tip:** In the Service page, the **Connection Tab** provides information on the connection details as well as details on what data can be ingested from the source using this connection. 
+{% /note %} + +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake5.png" + alt="View Snowflake Service" + caption="View Snowflake Service" + /%} + +- Click on **Add Ingestion** and enter the details to ingest metadata: + - **Name:** The name is randomly generated, and includes the Service Name, and a randomly generated text to create a unique name. + - **Database Filter Pattern:** to include or exclude certain databases. A database service has multiple databases, of which you can selectively ingest the required databases. + - **Schema Filter Pattern:** to include or exclude certain schemas. A database can have multiple schemas, of which you can selectively ingest the required schemas. + - **Table Filter Pattern:** Use the toggle options to: + - Use FQN for Filtering + - Include Views - to generate lineage + - Include Tags + - Enable Debug Log: We recommend enabling the debug log. + - Mark Deleted Tables + - **View Definition Parsing Timeout Limit:** The default is set to 300. + +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake6.png" + alt="Configure Metadata Ingestion" + caption="Configure Metadata Ingestion" + /%} + +- **Schedule Metadata Ingestion** - Define when the metadata ingestion pipeline must run on a regular basis. Users can also use a **Custom Cron** expression. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/schedule.png" + alt="Schedule and Deploy Metadata Ingestion" + caption="Schedule and Deploy Metadata Ingestion" + /%} + +After the ingestion pipeline has been created and deployed successfully, click on **View Service**. The **Ingestion Tab** will provide all the details for the recent runs, like if the pipeline is queued, running, failed, or successful. On hovering over the ingestion details, admin users can view the scheduling frequency, as well as the start and end times for the recent runs. 
Users can perform certain actions, like: +- **Run** the pipeline now. +- **Kill** to end all the currently running pipelines. +- **Redeploy:** When a service connection is set up, it fetches the data as per the access provided. If the connection credentials are changed at a later point in time, redeploying will fetch additional data with updated access, if any. + +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/view-service.png" + alt="View Service Ingestion" + caption="View Service Ingestion" + /%} + +By connecting to a database service, you can ingest the databases, schemas, tables, and columns. In the Service page, the **Databases Tab** will display all the ingested databases. Users can further drill down to view the **Schemas** and **Tables**. +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake7.png" + alt="View Table Details" + caption="View Table Details" + /%} + +{% note %} +**Note:** Once you’ve run a metadata ingestion pipeline, you can create separate pipelines to bring in [**Usage**](/connectors/ingestion/workflows/usage), [**Lineage**](/connectors/ingestion/workflows/lineage), [**dbt**](/connectors/ingestion/workflows/dbt), or to run [**Profiler**](/connectors/ingestion/workflows/profiler). To add ingestion pipelines, select the required type of ingestion and enter the required details. +{% /note %} + +{% image + src="/images/v1.1/how-to-guides/quick-start-guide-for-admins/snowflake8.png" + alt="Add Ingestion Pipelines for Usage, Lineage, Profiler, and dbt" + caption="Add Ingestion Pipelines for Usage, Lineage, Profiler, and dbt" + /%} + +Admin users can create, edit, or delete services. They can also view the connection details for the existing services. + +{% note %} +**Pro Tip:** Refer to the [Best Practices for Metadata Ingestion](/connectors/ingestion/best-practices). 
+{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/index.md new file mode 100644 index 000000000000..e5f7a062871c --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/index.md @@ -0,0 +1,43 @@ +--- +title: Quick Start Guide for Admins +slug: /how-to-guides/quick-start-guide-for-admins +--- + +# Quick Start Guide for Admins + +Admin users have access to manage all the data assets. They can manage all the functions to create, edit, or delete. Admins can manage Roles, Policies, Services, Notifications, Custom Properties, Data Insights and more. They can add other users, or create teams to onboard users. An organization can have multiple Admins so that separate Admins can effectively manage different teams and departments. + +Get started with OpenMetadata with just **three quick and easy steps**. + +{%inlineCallout + color="violet-70" + bold="Ingest your Data from Multiple Sources" + icon="MdOutlineShare" + href="/how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata"%} + Integrate with third-party sources and ingest your metadata. +{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="Create Teams" + icon="MdGroups" + href="/how-to-guides/quick-start-guide-for-admins/teams-and-users"%} + Create hierarchical teams and manage access effectively. +{%/inlineCallout%} + +{%inlineCallout + color="violet-70" + bold="Invite Users to Start Collaborating on Data" + icon="MdPersonAdd" + href="/how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users"%} + Data is a team game. Start collaborating to get your data right. +{%/inlineCallout%} + +Ready to start with advanced access management using **Roles and Policies**? 
+{%inlineCallout + color="violet-70" + bold="Admin Guide for Roles and Policies" + icon="MdAdminPanelSettings" + href="/how-to-guides/admin-guide-roles-policies"%} + Know it all about Roles and Policies to get started. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-team.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-team.md new file mode 100644 index 000000000000..8f6ddf9f0bae --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-team.md @@ -0,0 +1,38 @@ +--- +title: How to Add a Team +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users/add-team +--- + +# How to Add a Team + +Creating a Team in OpenMetadata is easy. Decide on the `teamType` that you would like to add. Refer to the [**Team Structure in OpenMetadata**](/how-to-guides/quick-start-guide-for-admins/teams-and-users) to get a clear understanding of the various **Team Types**. + +**1.** Click on **Settings >> Teams**. Further, navigate to the relevant `BusinessUnit`, `Division`, or `Department` where you would like to create a new team. Click on **Add Team**. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-team1.png" +alt="select-team-type" +caption="Add a Team" +/%} + +**2.** Enter the details like `Name`, `Display Name`, `Email`, `Team Type`, and `Description` and click on **OK**. The choice of the `teamType` is restricted by the type of the parent team selected. More information can be found in the [**Team Structure**](/how-to-guides/quick-start-guide-for-admins/teams-and-users) document. + +{% note noteType="Warning" %} +- Once created, the teamType for `Group` **cannot be changed later**. +- Only the Teams of the type `Group` can **own data assets**. 
+{% /note noteType="Warning" %} +{% /note %} + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-team2.png" +alt="select-team-type" +caption="Enter the Team Details" +/%} + +**3.** The new team has been created. You can further add Users or create another Team within it. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-team3.png" +alt="select-team-type" +caption="New Team Created" +/%} diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-users.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-users.md new file mode 100644 index 000000000000..47148d34dd84 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-users.md @@ -0,0 +1,42 @@ +--- +title: How to Add Users to Teams +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users/add-users +--- + +# How to Add Users to Teams + +If the user does not already exist in OpenMetadata, then [invite the user](/how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users) to OpenMetadata. While creating the **new user**, you can add them to the relevant **Team** as well as assign them the relevant **Roles**. + +{% note %} +**Note:** You can add a User to multiple Teams when creating a New User. +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-user5.png" +alt="Add User to Team" +/%} +{% /note %} + + +If the user already exists in OpenMetadata, then: +- Go to **Settings >> Teams >> Users Tab** +- Select the specific team you would like to add a user to. A team may have further sub-teams. Select the required sub-team. +- Click on **Add User**. + +{% note %} +**Note:** Users will inherit the Roles that apply to the Team they belong to. 
+{% /note %} + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-user3.png" +alt="Add User to Team" +caption="Select Relevant Team" +/%} + +- Search for the user, select the checkbox, and click on **Update**. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-user4.png" +alt="Add User to Team" +caption="Add User to the Team" +/%} + +It's that simple to add users to teams! \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/change-team-type.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/change-team-type.md new file mode 100644 index 000000000000..81b5a03a2696 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/change-team-type.md @@ -0,0 +1,36 @@ +--- +title: How to Change the Team Type +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users/change-team-type +--- + +# How to Change the Team Type + +Refer to the [**Team Structure in OpenMetadata**](/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata) to get a clear understanding of the various **Team Types**. + +Let's say you have a team `Cloud_Infra` of the type **`Department`** and you want to change it to the type **`BusinessUnit`**. You can easily change that through the UI. + +**1.** Navigate to **Settings >> Teams**. Click on the `Cloud_Infra` team name to view the details. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/change-team1.png" +alt="Cloud_Infra Team" +caption="Click on the Team Name" +/%} + +**2.** On the `Cloud_Infra` team details page, you will see the `Type - Department` with an edit icon. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/change-team2.png" +alt="Type - Department" +caption="Edit Team Type" +/%} + +**3.** Click on the edit button. 
You will get a set of options, from which you can select `BusinessUnit`. Click on ✅ to save it. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/change-team3.png" +alt="select-team-type" +caption="Select the Required Team Type" +/%} + +**4.** This changes the `Cloud_Infra` team type from `Department` to `BusinessUnit`. diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/index.md new file mode 100644 index 000000000000..3d0c9c3a4200 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/index.md @@ -0,0 +1,25 @@ +--- +title: Manage Teams and Users +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users +--- + +# Manage Teams and Users + +OpenMetadata’s versatile hierarchical team structure helps align with your organization's setup. Admins can mirror their organizational hierarchy by creating various team types. You can onboard new users to the relevant teams. An organization can have multiple Admins, so that different teams and departments can be effectively managed by separate Admins. [Learn more about the Team Structure in OpenMetadata](/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata). + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Basics of Teams and Users" + icon="add_moderator" + href="/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata"%} + Quick overview of the hierarchical team structure in OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Admin Guide for Roles and Policies" + icon="add_moderator" + href="/how-to-guides/admin-guide-roles-policies"%} + Advanced guide to know it all about roles and policies. 
+ {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users.md new file mode 100644 index 000000000000..b7ef44fbf3fc --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users.md @@ -0,0 +1,34 @@ +--- +title: How to Invite Users to OpenMetadata +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users +--- + +# How to Invite Users to OpenMetadata + +Data is a team game and OpenMetadata is a platform to discover, collaborate and get your data right. Collaboration works best when all the team members have access to a standard tool. Admins can send invitations and invite users to onboard to OpenMetadata. + +- Navigate to **Settings >> Users** and click on **Add Users**. + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-user1.png" +alt="Add User" +caption="Add User" +/%} + +- Enter the details and click on **Create**. + - Email, + - Display Name, + - Description, + - Teams - Users can be invited to multiple teams. + - Roles - Multiple roles can be assigned to a user. + - Admin - Use the toggle to provide Admin access. + +{% note %} +**Note:** Users will inherit the Roles that apply to the Team they belong to. Additional roles can be explicitly assigned to a user. 
+{% /note %} + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/add-user2.png" +alt="Add User" +caption="Invite User" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata.md new file mode 100644 index 000000000000..af4f0c41eccb --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata.md @@ -0,0 +1,30 @@ +--- +title: Team Structure in OpenMetadata +slug: /how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata +--- + +# Team Structure in OpenMetadata + +OpenMetadata supports a hierarchical team structure with **teamType** that can be `Organization`, `Business Unit`, `Division`, `Department`, and `Group` (default team type). **Organization** serves as the foundation of the team hierarchy representing the entire company. The other **team types** under Organization are Business Units, Divisions, Departments, and Groups. + +- **`Organization`** is the **root team** in the hierarchy. _It cannot have a parent_. It can have children of the type `Business Unit`, `Division`, `Department`, `Group` along with `Users` directly as children (who are without teams). + +- **`BusinessUnit`** is the next level of the team in the hierarchy. It can have `Business Unit`, `Division`, `Department`, and `Group` as children. It can only have **one parent** either of the type `Organization`, or `Business Unit`. + +- **`Division`** is the next level of the team in the hierarchy below `Business Unit`. It can have `Division`, `Department`, and `Group` as children. It can only have **one parent** of the type `Organization`, `Business Unit`, or `Division`. 
+ +- **`Department`** is the next level of the team in the hierarchy below `Division`. It can have `Department` and `Group` as children. It can have `Organization`, `Business Unit`, `Division`, or `Department` as parents. It can have **multiple parents**. + +- **`Group`** is the last level of the team in the hierarchy. It can only have `Users` as children and not any other teams. It can have all the team types as parents. It can have **multiple parents**. + +{% note noteType="Warning" %} +- Once created, the teamType for `Group` **cannot be changed later**. +- Only the Teams of the type `Group` can **own data assets**. +{% /note noteType="Warning" %} +{% /note %} + +{% image +src="/images/v1.1/how-to-guides/teams-and-users/teams.png" +alt="team-structure" +caption="Team Hierarchy in OpenMetadata" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/how-to-organise-teams-and-users.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/how-to-organise-teams-and-users.md deleted file mode 100644 index d6e1dda62e89..000000000000 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/how-to-organise-teams-and-users.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -title: How To Organise Teams And users -slug: /how-to-guides/teams-and-users/how-to-organise-teams-and-users ---- - -# How To Organise Teams And users - -## Team structure in OpenMetadata - -In OpenMetadata we have hierarchal team structure with `teamType` that can be `Organization`, `Business Unit`, `Division`, `Department`, and `Group` (default team type). - -- `Organization` is the root team in the hierarchy. _It can't have a parent_. It can have children of type `Business Unit`, `Division`, `Department`, `Group` along with `Users` directly as children (who are without teams). - -- `BusinessUnit` is the next level of the team in the hierarchy. 
It can have `Business Unit`, `Division`, `Department`, and `Group` as children. It can only have **one parent** either of type `Organization`, or `Business Unit`. - -- `Division` is the next level of the team in the hierarchy below `Business Unit`. It can have `Division`, `Department`, and `Group` as children. It can only have **one parent** of type `Organization`, `Business Unit`, or `Division`. - -- `Department` is the next level of the team in the hierarchy below `Division`. It can have `Department` and `Group` as children. It can have `Organization`, `Business Unit`, `Division`, or `Department` as parents. **It can have multiple parents**. - -- `Group` is the last level of the team in the hierarchy. It can have only `Users` as children and not any other teams. It can have all the team types as parents. **It can have multiple parents**. - -{% image -src="/images/v1.2/how-to-guides/teams-and-users/teams-structure.png" -alt="team-structure" -/%} - - -## How to change the team type - -Let's say you have team `Cloud_Infra` of type `Department` and you want to change it to the type `BusinessUnit`, you can easily do that through UI. - -**1.** Click on the `Cloud_Infra` team name and it will take you to the `Cloud_Infra` details page. - -{% image -src="/images/v1.2/how-to-guides/teams-and-users/cloud-infra.png" -alt="cloud-infra" -/%} - -**2.** On details page you will see the `Type - Department` with edit button. - -{% image -src="/images/v1.2/how-to-guides/teams-and-users/team-type.png" -alt="team-type" -/%} - -**3.** Now Click on the edit button and you will get a set of options and from them select `BusinessUnit` and click on ✅ to save it. - -{% image -src="/images/v1.2/how-to-guides/teams-and-users/select-team-type.png" -alt="select-team-type" -/%} - -**4.** Now you can see `Cloud_Infra` team type changed to `BusinessUnit` from `Department`. 
diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/index.md deleted file mode 100644 index eb85034795f6..000000000000 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/teams-and-users/index.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: Team structure in OpenMetadata -slug: /how-to-guides/teams-and-users ---- - -# Team structure in OpenMetadata - -In OpenMetadata we have hierarchal team structure with `teamType` that can be `Organization`, `Business Unit`, `Division`, `Department`, and `Group` (default team type). - -- `Organization` is the root team in the hierarchy. _It can't have a parent_. It can have children of type `Business Unit`, `Division`, `Department`, `Group` along with `Users` directly as children (who are without teams). - -- `BusinessUnit` is the next level of the team in the hierarchy. It can have `Business Unit`, `Division`, `Department`, and `Group` as children. It can only have **one parent** either of type `Organization`, or `Business Unit`. - -- `Division` is the next level of the team in the hierarchy below `Business Unit`. It can have `Division`, `Department`, and `Group` as children. It can only have **one parent** of type `Organization`, `Business Unit`, or `Division`. - -- `Department` is the next level of the team in the hierarchy below `Division`. It can have `Department` and `Group` as children. It can have `Organization`, `Business Unit`, `Division`, or `Department` as parents. **It can have multiple parents**. - -- `Group` is the last level of the team in the hierarchy. It can have only `Users` as children and not any other teams. It can have all the team types as parents. **It can have multiple parents**. 
- - -{% image -src="/images/v1.2/how-to-guides/teams-and-users/teams-structure.png" -alt="team-structure" -/%} - diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/basics-openmetadata.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/basics-openmetadata.md new file mode 100644 index 000000000000..c8e3deef6d42 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/basics-openmetadata.md @@ -0,0 +1,78 @@ +--- +title: Understanding the Basics of OpenMetadata +slug: /how-to-guides/user-guide-for-data-stewards/basics-openmetadata +--- + +# Understanding the Basics of OpenMetadata + +## Let’s Start with ‘My Data’ Page + +Once you log in to OpenMetadata, you can start exploring from the widget-rich ‘My Data’ page. My Data provides a single-pane view of all your data assets, collaboration updates, data insights, and more. This landing page comprises widgets for Activity Feeds, Data Insights, Announcements, and other data asset related widgets. + +## Activity Feed Widget + +The main section consists of the **Activity Feed Widget**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/activity1.png" +alt="My Data: Activity Feed Widget" +caption="My Data: Activity Feed Widget" +/%} + +The **Activity Feed Widget** displays: +- **All:** All the activities related to the data assets that you own, follow, or where you are mentioned +- **@Mentions:** Feeds where you are mentioned +- **Tasks:** Tasks created by you or assigned to you. Only the Open tasks are displayed here. + +## My Data Widget + +A quick glance at the **My Data Widget** will display all the data assets that you own. If you or your team do not own any data assets, you can start claiming the assets from the Explore page. 
+ +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data1.png" +alt="My Data Widget" +caption="My Data Widget" +/%} + +## Key Performance Indicators (KPI) Widget + +The KPI widget is accessible to Admins only. Other users can also be given access to the KPI widget. The KPI widget gives ready information about the data asset ownership coverage, description coverage, and tiering. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/kpi.png" +alt="Key Performance Indicators (KPI) Widget" +caption="Key Performance Indicators (KPI) Widget" +/%} + +## Total Data Assets Widget + +This widget displays the trend of the total data assets in the last 14 days. It displays the total Tables, Dashboards, Databases, Database Schemas, Pipelines, Topics, MLModels, Charts, etc. Users can quickly identify the most populated data assets. In the below example, the organization has more Tables when compared to any other data asset. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data2.png" +alt="Total Data Assets Widget" +caption="Total Data Assets Widget" +/%} + +## Announcements, Following, and Recent Views + +The right-side panel of the ‘My Data’ page displays the **Recent Announcements**, the data assets that you are **Following**, and the **Recently Viewed** data assets. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data3.png" +alt="Announcements, Following, and Recent Views" +caption="Announcements, Following, and Recent Views" +/%} + +- **Announcements:** View all the recent announcements about the data assets you own or follow. Learn [How to Add an Announcement](/how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement). +- **Following:** View all the data assets that you are following. [Learn How to Follow a Data Asset](/how-to-guides/user-guide-for-data-stewards/overview-data-assets/follow-data-asset). 
+- **Recent Views:** Displays all the recently viewed data assets. + +{%inlineCallout + color="violet-70" + bold="Overview of Data Assets" + icon="MdArrowForward" + align="right" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets"%} + Know it all about data assets. +{%/inlineCallout%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/index.md new file mode 100644 index 000000000000..e1d5262b925a --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/index.md @@ -0,0 +1,29 @@ +--- +title: User Guide for Data Stewards +slug: /how-to-guides/user-guide-for-data-stewards +--- + +# User Guide for Data Stewards + +This user guide will provide more information on the basics of OpenMetadata, the details of the landing page and the widgets for Activity Feeds, Data Insights, Announcements, and more. You’ll also get more information about the data assets, and the details associated with them, like tags, tasks, conversations, announcements, versions, and more. + +Get quick access to all the data in your organization and the activities around it. + +{% note noteType="Tip" %} **Tip:** Navigate to the My Data page from anywhere by clicking on the OpenMetadata logo. {% /note %} + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Understanding the Basics of OpenMetadata" + icon="add_moderator" + href="/how-to-guides/user-guide-for-data-stewards/basics-openmetadata"%} + Learn about the OpenMetadata landing page and widgets + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Overview of Data Assets" + icon="add_moderator" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets"%} + Learn about the data assets and the details associated with them. 
+ {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/add-announcement.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/add-announcement.md new file mode 100644 index 000000000000..9ff7c36ab77d --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/add-announcement.md @@ -0,0 +1,45 @@ +--- +title: How to Create an Announcement +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement +--- + +# How to Create an Announcement + +{% note noteType="Tip" %} **Quick Tip:** Always watch out for announcements on the backward incompatible changes. Saves a ton of debugging time later on for data teams. {% /note %} + +To add an announcement: +- Navigate to **Explore** and the relevant **Data Asset** section to select a specific asset. +- Click on the vertical ellipsis icon **⋮** located on the top right and select **Announcements**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce5.png" +alt="Announcements Option" +caption="Announcements Option" +/%} + +- Click on **Add Announcement**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce6.png" +alt="Add an Announcement" +caption="Add an Announcement" +/%} + +- Enter the following information and click Submit. + - Title of the Announcement + - Start Date + - End Date + - Description + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce7.png" +alt="Add the Announcement Details" +caption="Add the Announcement Details" +/%} + +This announcement will be displayed in OpenMetadata during the scheduled time. It will be displayed to all the users who own or follow that particular data asset. 
+ +{% note noteType="Warning" %} +**Pro Tip:** Create Announcements for deletion, deprecation, and other important changes. Let your team know of a tentative date when these changes will be implemented. +{% /note noteType="Warning" %} +{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/announcements.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/announcements.md new file mode 100644 index 000000000000..051b0f1885ea --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/announcements.md @@ -0,0 +1,56 @@ +--- +title: Overview of Announcements +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/announcements +--- + +# Overview of Announcements + +It is a huge challenge to inform the data team about upcoming changes to data. In most organizations, data changes are announced in advance over email or Slack; and sometimes, this information is noticed pretty late, leaving very little time to prepare for the changes. + +In OpenMetadata, **announcements** can be set up to inform the entire team about the upcoming changes to a data asset. With the Announcements feature, you can now inform your entire team of all the upcoming events and changes, such as **deprecation, deletion, or schema changes**. These announcements can be scheduled with a start date and an end date. All the users following your data are not only notified in Activity Feeds but a banner is also shown on the data asset details page. + +{% note %} +**Tip:** Ideally, it’s best to schedule the announcements well in advance before modifying or deleting a data asset, so you can ensure that the entire team has a reasonable amount of time to plan accordingly. 
+{% /note %} + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce1.png" +alt="Banner on Data Assets Page" +caption="Banner on Data Assets Page" +/%} + +{% note noteType="Warning" %} +**Pro Tip:** Ensure that all **backward incompatible changes** are announced to the team well in advance. For example, when deleting a column from a table. +{% /note noteType="Warning" %} +{% /note %} + +Clicking on the announcement will display further details. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce2.png" +alt="Details of the Announcement" +caption="Details of the Announcement" +/%} + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce3.png" +alt="Details of an Announcement" +caption="Details of an Announcement" +/%} + +Details of an announcement are as follows: +- **Creator:** Get to know who added the announcement. +- **Data Asset:** Know the data asset type (Table, Pipeline) as well as name of the data asset it pertains to. +- **Scheduled Date:** A date range can be added during which the announcement will be displayed in OpenMetadata. This consists of a start and end date. + +These announcements are also displayed on the top right of the landing page. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/announce4.png" +alt="Announcement Display (Top Right)" +caption="Landing Page Announcement Display (Top Right)" +/%} + +{% note %} +**Advanced Tip:** Users can set up Alerts to be sent from OpenMetadata via Email, Chat, Slack, MS Teams, and Webhooks. If alerts have been set up for Activity Feeds, then the concerned data owners and followers will be notified via email, Slack, etc. 
+{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/custom.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/custom.md new file mode 100644 index 000000000000..758c619081b8 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/custom.md @@ -0,0 +1,39 @@ +--- +title: How to Create a Custom Property for a Data Asset +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/custom +--- + +# How to Create a Custom Property for a Data Asset + +OpenMetadata uses a schema-first approach, and that's why we support custom properties for all types of data assets. Organizations can extend the attributes as required to capture custom metadata. You can view the Custom Properties tab in the detailed view for all types of data assets. + +To create a Custom Property in OpenMetadata: +- Navigate to **Settings** >> **Custom Attributes** +- Click on the type of data asset you would like to create a custom property for. +- Click on **Add Property** + +{% image +src="/images/v1.1/how-to-guides/discovery/custom1.png" +alt="Create a Custom Property" +caption="Create a Custom Property" +/%} + +- Enter the required details: `Name`, `Type`, and `Description`. You can look up details about the requested information in the right-side panel. + - **Name:** The name must start with a lowercase letter, as preferred in the camelCase format. Uppercase letters and numbers can be included in the field name, but spaces, underscores, and dots are not supported. + - **Type:** Integer, Markdown, and String are supported. + - **Description:** Describe your custom property to provide more information to your team. +- Click on **Create**. 
+ +{% image +src="/images/v1.1/how-to-guides/discovery/custom2.png" +alt="Define a Custom Property" +caption="Define a Custom Property" +/%} + +Once the custom property has been created for a type of data asset, you can add the values for the custom property from the Custom Property tab in the detailed view of the data assets. + +{% image +src="/images/v1.1/how-to-guides/discovery/custom3.png" +alt="Enter the Value for a Custom Property" +caption="Enter the Value for a Custom Property" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/data-ownership.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/data-ownership.md new file mode 100644 index 000000000000..bc2ab8bd2c63 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/data-ownership.md @@ -0,0 +1,49 @@ +--- +title: How to Assign or Change Data Ownership +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/data-ownership +--- + +# How to Assign or Change Data Ownership + +## Data Asset Ownership + +In OpenMetadata, either a **team** or an **individual user** can be the owner of a data asset. Owners have access to perform all the operations on a data asset. For example, edit description, tags, glossary terms, etc. + +## Assign Data Ownership + +Admin users have access to add or change data ownership. + +- Navigate to the data asset and click on the edit icon next to the Owner of the data asset. +- Select a Team or a User as the Owner of the Data Asset. 
+ +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data-owner1.png" +alt="Assign an Owner to a Data Asset" +caption="Assign an Owner to a Data Asset" +/%} + +## Change Data Ownership + +If the data asset already has an owner, you can change the owner by clicking on the edit icon for Owner and simply selecting a team or user to change ownership. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data-owner2.png" +alt="Change the Owner of the Data Asset" +caption="Change the Owner of the Data Asset" +/%} + +If no owner is selected, and if the Database or Database Schema has an owner, then by default the same owner will be assigned to the Database Schema or Table respectively, based on the owner propagation in OpenMetadata. + +## Owner Propagation in OpenMetadata + +OpenMetadata supports Owner Propagation and the owner will be propagated based on a top-down hierarchy. The owner of the Database will be auto-propagated as the owner of the Database Schemas and Tables under it. Similarly, the owner of the Database Schema will be auto-propagated as the owner of the Tables under it. + +- Owner Propagation does not work for data assets that already have an Owner assigned to them. If there is **no owner**, then an Owner will be assigned based on the hierarchy. + +- If a Database or Database Schema has an Owner assigned, and you **delete the owner** from the Database Schema or Tables under it, then the Owner will be auto-assigned in this case based on the existing Owner details at the top of the hierarchy. + +- You can also assign a different owner manually. + +## Team Ownership is Preferred + +OpenMetadata is a data collaboration platform. We highly recommend Team Ownership of data assets, because individual users will only have part of the context about the data asset in question. Assigning team ownership will give access to all the members of a particular team. Only teams of the type ‘**Groups**’ can own data assets. 
\ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/delete.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/delete.md new file mode 100644 index 000000000000..9156e715cc98 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/delete.md @@ -0,0 +1,30 @@ +--- +title: How to Delete a Data Asset +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/delete +--- + +# How to Delete a Data Asset + +Data assets have a lot of user-generated metadata, such as descriptions, tags, ownership, tiering. There’s also rich metadata generated by OpenMetadata through the data profiler, usage data, lineage, test results, and other graph relationships with other data assets. When a data asset is deleted, all of this rich information is lost, and it’s not easy to recreate it. OpenMetadata supports both soft delete and hard delete. + +To delete a data asset: + +- Navigate to **Explore** and the relevant **Data Asset** section to select a specific asset. +- Click on the vertical ellipsis icon **⋮** located on the top right and click on **Delete**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/delete.png" +alt="Delete a Data Asset" +caption="Delete a Data Asset" +/%} + +- You can choose to soft delete or hard delete a data asset. Soft deleting will provide a read-only access to the data assets. Hard deleting will permanently delete the data asset from OpenMetadata. +- Type DELETE to confirm deletion. 
+ +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/delete2.png" +alt="Soft or Hard Delete a Data Asset" +caption="Soft or Hard Delete a Data Asset" +/%} + +{% note noteType="Tip" %} **Tip:** Notify team members of data changes by creating [announcements](/how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement).{% /note %} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/description.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/description.md new file mode 100644 index 000000000000..a126d26cc8e4 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/description.md @@ -0,0 +1,33 @@ +--- +title: How to Add Description using Markdown +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/description +--- + +# How to Add Description using Markdown + +Description allows you to document the data to help data consumers understand it. In OpenMetadata, the search option also searches based on the text in the description. You can also document nested columns and search based on the descriptions. You can tag the columns using glossary terms to add semantic meaning. You can add classification tags to classify the data. You can set business importance of data by setting Tier. Tier 1 is the most important data of an organization. All the descriptive metadata you add in the form of description, ownership, and tier are available in Search and Explore. + +In OpenMetadata, you can work around the data asset descriptions in one of the following ways: +- Add or edit description +- Request for description +- Start a conversation around the description + +## Add or Edit Description +- From the Explore page, navigate to the data asset you would like to edit. 
+- Click on the pencil icon to add or edit the description. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc1.png" +alt="Add or Edit the Data Asset Description" +caption="Add or Edit the Data Asset Description" +/%} + +- You can add or edit the description from the editor. You can also preview the description. +- Descriptions support Markdown. You can also use the available editing options, such as Headers, Bold, Italics, Strikethrough, Bulleted list, Numbered list, Hyperlinks, Line break, Block quote, Inline code, and Code block. +- Refer to the [Markdown Guide](https://www.markdownguide.org/cheat-sheet/) for more information. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc2.png" +alt="Add or Edit the Data Asset Description" +caption="Add or Edit the Data Asset Description" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/follow-data-asset.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/follow-data-asset.md new file mode 100644 index 000000000000..6ff0bdc53051 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/follow-data-asset.md @@ -0,0 +1,43 @@ +--- +title: How to Follow a Data Asset +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/follow-data-asset +--- + +# How to Follow a Data Asset + +Users can get timely information about all the **activities**, **announcements**, and **feeds** related to a data asset by following that asset. The owners of the data assets will receive all the updates by default. The follow option can be used for the assets that you do not own. Data asset owners have more access to edit the assets, whereas followers can only **rate** the data assets, and not edit them. + +To follow a data asset: +- Navigate to the **Explore** page. 
+- Select the relevant type of data asset (Tables, Topics, Dashboards, Pipelines, ML Models, and Containers). +- Select a data asset. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data4.png" +alt="Select a Data Asset" +caption="Select a Data Asset" +/%} + +- **Star** the data asset to start following it. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data5.png" +alt="Star the Data Asset" +caption="Star the Data Asset" +/%} + +The data assets you are following will be displayed in the ‘My Data’ page on the right-hand side. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/data6.png" +alt="Data Assets you are Following" +caption="Data Assets you are Following" +/%} + +Clicking on **View All** to check all the data assets you are following will redirect to the **Profile Page > Following Tab**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/profile1.png" +alt="Profile Page: Following Tab" +caption="Profile Page: Following Tab" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/glossary.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/glossary.md new file mode 100644 index 000000000000..79b1d7cc09ee --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/glossary.md @@ -0,0 +1,40 @@ +--- +title: How to Add Glossary Terms +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/glossary +--- + +# How to Add Glossary Terms + +- From the Explore page, select a data asset and click on the edit icon or + Add for Glossary Term. +- Search for the relevant tags. You can either type and search, or scroll to select from the options provided. +- Click on the checkmark to save the changes. 
+ +{% image +src="/images/v1.1/how-to-guides/discovery/glossary1.png" +alt="Add Glossary Terms" +caption="Add Glossary Terms" +/%} + +You can view all the associated glossary terms in the right panel. + +{% image +src="/images/v1.1/how-to-guides/discovery/glossary2.png" +alt="Add Glossary Terms" +caption="Add Glossary Terms" +/%} + +## Glossary Terms and Tags + +If **Tags** are associated with a **Glossary Term**, then applying that glossary term to a data asset will also automatically apply the associated tags to that data asset. For example, the glossary term ‘Account’ has a PII.Sensitive tag associated with it. When you add this glossary term to a data asset, the PII.Sensitive tag is added as well. + +{% image +src="/images/v1.1/how-to-guides/governance/tag5.png" +alt="Glossary Term and Associated Tags" +caption="Glossary Term and Associated Tags" +/%} + +{% image +src="/images/v1.1/how-to-guides/governance/tag6.png" +alt="Glossary Term and Tag get Added to the Data Asset" +caption="Glossary Term and Tag get Added to the Data Asset" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/index.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/index.md new file mode 100644 index 000000000000..02c9cbd66653 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/index.md @@ -0,0 +1,110 @@ +--- +title: Overview of Data Assets +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets +--- + +# Overview of Data Assets + +OpenMetadata displays a single-pane view for each of the data assets. In the detailed view of a data asset, the **Source, Owner (Team/User), Tier, Type, Usage, and Description** are displayed on the top panel.
Further, there are separate tabs each for Schema, Activity Feeds & Tasks, Sample Data, Queries, Profiler & Data Quality, Lineage, Custom Properties, Config, Details, Features, Children, and Executions based on the type of data asset selected. + +{% image +src="/images/v1.1/how-to-guides/discovery/asset1.png" +alt="Overview of Data Assets" +caption="Overview of Data Assets" +/%} + +This section will deal with all the details related to data assets: how to change data ownership, how to add tags, glossary terms, announcements, how to follow data assets, how data assets versioning works, and so on. + +{%inlineCalloutContainer%} + {%inlineCallout + color="violet-70" + bold="Data Asset Tabs" + icon="MdTab" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/tabs"%} + Get a detailed view of the data assets + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Add Description" + icon="MdDescription" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/description"%} + Describe your data assets using Markdown + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Request for Description" + icon="MdDescription" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-description"%} + Request for a description and discuss the same within OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Ownership" + icon="MdPerson" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/data-ownership"%} + Learn how to assign or change data owners + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Follow a Data Asset" + icon="MdDone" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/follow-data-asset"%} + Learn how to follow data assets + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Add Tags" + icon="MdDiscount" + 
href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/tags"%} + Add tags to data assets and learn about auto-PII classification. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Request for Tags" + icon="MdDiscount" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-tags"%} + Request tags and discuss them, all within OpenMetadata. + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="How to Add Glossary Terms" + icon="MdPushPin" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/glossary"%} + Add glossary terms to data assets to make data discovery easier + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Custom Properties" + icon="MdDashboardCustomize" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/custom"%} + Learn how to create custom attributes for data assets + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Overview of Announcements" + icon="MdVolumeUp" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/announcements"%} + Learn more about the announcements in OpenMetadata + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Create an Announcement" + icon="MdVolumeUp" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement"%} + Follow the steps to add an announcement + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Data Asset Versioning" + icon="MdChangeCircle" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/versions"%} + Review all the major and minor changes with version history + {%/inlineCallout%} + {%inlineCallout + color="violet-70" + bold="Delete a Data Asset" + icon="MdCancel" + href="/how-to-guides/user-guide-for-data-stewards/overview-data-assets/delete"%} + Soft or hard delete data assets + {%/inlineCallout%} +{%/inlineCalloutContainer%} \ No newline at end of
file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-description.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-description.md new file mode 100644 index 000000000000..3fb3bfeb7574 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-description.md @@ -0,0 +1,66 @@ +--- +title: Request for Description +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-description +--- + +# How to Request for Description + +Apart from adding a description to the data assets directly, users can also request an update to the description. This is typically done when the user wants another opinion on the description being added, or if the user does not have access to edit the description. Requesting a description will create a Task in OpenMetadata. + +- Click on the **?** icon next to Description + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc3.png" +alt="Request for Data Asset Description" +caption="Request for Data Asset Description" +/%} + +- A Task will be created with some pre-populated details. Fill in the other important information: + - **Title** - This is auto-populated + - **Assignees** - Multiple users or teams can be added + - **Description** - Add the new description. + - You can view the **Current** description. + - You can add the **New** description. + - It will display the **Difference** as well. + - Click on **Submit** to create the task. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc4.png" +alt="Create a Task for Data Asset Description" +caption="Create a Task for Data Asset Description" +/%} + +Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset.
The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc5.png" +alt="Task: Accept Suggestion and Comment" +caption="Task: Accept Suggestion and Comment" +/%} + +## Conversations around the Data Asset Description + +Apart from requesting a description, users can also create a **Conversation** around the description of a data asset. +- Click on the **Conversation** icon next to the description. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc6.png" +alt="Conversation around Description" +caption="Conversation around Description" +/%} + +- Start a conversation right within the data asset page. Add **@mention** to tag a user or team. Add a **#mention** to tag a data asset. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc7.png" +alt="Start a Conversation" +caption="Start a Conversation" +/%} + +- Within the conversation, users can **Reply** to continue the discussion, add **Reactions**, and **Edit** or **Delete** messages.
+ +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/desc8.png" +alt="Conversation: Reply, React, Edit or Delete" +caption="Conversation: Reply, React, Edit or Delete" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-tags.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-tags.md new file mode 100644 index 000000000000..8f2a1d70f785 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/request-tags.md @@ -0,0 +1,66 @@ +--- +title: How to Request for Tags +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-tags +--- + +# How to Request for Tags + +Apart from adding the tags directly to the data assets, users can also request to update tags. This is typically done when the user wants another opinion on the tag being added, or if the user does not have access to add tags directly. Requesting for a tag will create a Task in OpenMetadata. + +- Click on the **?** icon next to tags + +{% image +src="/images/v1.1/how-to-guides/governance/tag8.png" +alt="Request to Update Tags" +caption="Request to Update Tags" +/%} + +- A Task will be created with some pre-populated details. Fill in the other important information: + - **Title** - This is auto-populated + - **Assignees** - Multiple users can be added + - **Update Tags** - It displays 3 tabs. + - You can view the **Current** tags. + - You can add the **New** tags. + - It will display the **Difference** as well. + - Click on **Submit** to create the task. + + {% image + src="/images/v1.1/how-to-guides/governance/task1.png" + alt="Add a Task: Request to Update Tags" + caption="Add a Task: Request to Update Tags" + /%} + +Once a task has been created, it is displayed in the **Activity Feeds & Tasks** tab for that Data Asset. 
The assignees can either `Accept the Suggestion` or `Edit and Accept the Suggestion`. Assignees can also add a **Comment**. They can also add other users as **Assignees**. + +{% image +src="/images/v1.1/how-to-guides/governance/task2.png" +alt="Task: Accept Suggestion and Comment" +caption="Task: Accept Suggestion and Comment" +/%} + +## Conversations around Tags + +Apart from requesting tags, users can also create a **Conversation** around the tags assigned to a data asset. +- Click on the **Conversation** icon next to the tag. + +{% image +src="/images/v1.1/how-to-guides/governance/ct1.png" +alt="Conversations around Tags" +caption="Conversations around Tags" +/%} + +- Start a conversation right within the data asset page. Add **@mention** to tag a user or team. Add a **#mention** to tag a data asset. + +{% image +src="/images/v1.1/how-to-guides/governance/ct2.png" +alt="Start a Conversation" +caption="Start a Conversation" +/%} + +- Within the conversation, users can **Reply** to continue the discussion, add **Reactions**, and **Edit** or **Delete** messages. + +{% image +src="/images/v1.1/how-to-guides/governance/ct3.png" +alt="Conversation: Reply, React, Edit or Delete" +caption="Conversation: Reply, React, Edit or Delete" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tabs.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tabs.md new file mode 100644 index 000000000000..9ace4f3a8965 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tabs.md @@ -0,0 +1,170 @@ +--- +title: Data Asset Tabs +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/tabs +--- + +# Data Asset Tabs + +The data asset details page displays the Source, Owner (Team/User), Tier, Type, Usage, and Description on the top panel.
There are separate tabs each for Schema, Activity Feeds & Tasks, Sample Data, Queries, Profiler & Data Quality, Lineage, Custom Properties, Config, Details, Features, Children, and Executions based on the type of data asset selected. Let's take a look at each of the tabs. + +| **TABS** | **Table** | **Topic** | **Dashboard** | **Pipeline** | **ML Model** | **Container** | +|:--- | :--- | :--- | :--- | :--- | :--- | :--- | +| **Schema** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | +| **Activity Feeds & Tasks** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Sample Data** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Queries** | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Profiler & Data Quality** | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Lineage** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Custom Properties** | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | {% icon iconName="check" /%} | +| **Config** | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | 
{% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Details** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | +| **Executions** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | +| **Features** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | {% icon iconName="cross" /%} | +| **Children** | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="cross" /%} | {% icon iconName="check" /%} | + +## Schema Tab + +The Schema tab is displayed only for Tables, Topics, and Containers. It displays the columns, the type of each column, and its description, along with the tags and glossary terms associated with each column. The table also displays details on the **Frequently Joined Tables, Tags, and Glossary Terms** associated with it. + +{% image +src="/images/v1.1/how-to-guides/discovery/schema.png" +alt="Schema Tab" +caption="Schema Tab" +/%} + +## Activity Feeds & Tasks Tab + +The Activity Feeds & Tasks tab is displayed for all types of data assets. It displays all the tasks and mentions for a particular data asset. + +{% image +src="/images/v1.1/how-to-guides/discovery/aft1.png" +alt="Activity Feeds & Tasks Tab" +caption="Activity Feeds & Tasks Tab" +/%} + +## Sample Data Tab + +During metadata ingestion, you can opt to bring in sample data. If sample data is enabled, it is displayed here. The Sample Data tab is displayed only for Tables and Topics.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/sample.png" +alt="Sample Data Tab" +caption="Sample Data Tab" +/%} + +## Queries Tab + +The Queries tab is displayed only for Tables. It displays the SQL queries run against a particular table. It provides the details on when the query was run and the amount of time taken. It also shows whether the query was used by other tables. You can also add new queries. + +{% image +src="/images/v1.1/how-to-guides/discovery/query.png" +alt="Queries Tab" +caption="Queries Tab" +/%} + +## Profiler & Data Quality Tab + +The Profiler & Data Quality tab is displayed only for Tables. It has three sub-tabs for **Table Profile, Column Profile, and Data Quality**. The Profiler brings in details like number of rows and columns for the table profile along with the details of the data volume, table updates, and volume change. For the column profile, it brings in the details of the type of each column, the value count, null value %, distinct value %, unique %, etc. Data quality tests can be run on this sample data. We can add tests at the table and column level. + +{% image +src="/images/v1.1/how-to-guides/discovery/dq1.png" +alt="Profiler & Data Quality" +caption="Profiler & Data Quality" +/%} + +{% image +src="/images/v1.1/how-to-guides/discovery/dq2.png" +alt="Column Profile of a Table" +caption="Column Profile of a Table" +/%} + +For more detailed information, refer to the [Profiler and Data Quality Tab](/how-to-guides/openmetadata/data-quality-profiler/tab). + +## Lineage Tab + +The Lineage tab is displayed for all types of data assets. The lineage view displays comprehensive lineage to capture the relation between the data assets. OpenMetadata UI displays end-to-end lineage traceability at the table and column levels. It displays both the upstream and downstream dependencies for each node.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/lineage1.png" +alt="Comprehensive Lineage in OpenMetadata" +caption="Comprehensive Lineage in OpenMetadata" +/%} + +Users can configure the number of upstreams, downstreams, and nodes per layer by clicking on the Settings icon. OpenMetadata supports manual lineage. By clicking on the Edit icon, users can edit the lineage and connect the data assets with a no-code editor. Clicking on any data asset in the lineage view will display a preview with the details of the data asset, along with tags, schema, data quality and profiler metrics. + +{% image +src="/images/v1.1/how-to-guides/discovery/lineage2.png" +alt="Data Asset Preview in Lineage Tab" +caption="Data Asset Preview in Lineage Tab" +/%} + +## Custom Properties Tab + +OpenMetadata uses a schema-first approach. We also support custom properties for all types of data assets. Organizations can extend the attributes as required to capture custom metadata. The Custom Properties tab shows up for all types of data assets. Users can add or edit the custom property values for the data assets from this tab. Learn [How to Create a Custom Property for a Data Asset](/how-to-guides/user-guide-for-data-stewards/overview-data-assets/custom). + +{% image +src="/images/v1.1/how-to-guides/discovery/custom3.png" +alt="Enter the Value for a Custom Property" +caption="Enter the Value for a Custom Property" +/%} + +## Config Tab + +The Config tab is displayed only for Topics. + +## Details Tab + +The Details tab is displayed only for Dashboards and ML Models. In case of Dashboards, the Details tab displays the chart name, type of chart, and description of the chart. It also displays the associated tags for each chart. +{% image +src="/images/v1.1/how-to-guides/discovery/dsb1.png" +alt="Dashboards: Details Tab" +caption="Dashboards: Details Tab" +/%} + +In case of ML Models, it displays the Hyper Parameters and Model Store details.
+{% image +src="/images/v1.1/how-to-guides/discovery/mlm2.png" +alt="ML Models: Details Tab" +caption="ML Models: Details Tab" +/%} + +## Executions Tab + +The Executions tab is displayed only for Pipelines. It displays the Date, Time, and Status of the pipelines. You can see the status at a glance in terms of Success, Failure, Pending, and Aborted. The status can be viewed as a chronological list or as a tree. You can filter by status as well as by date. + +{% image +src="/images/v1.1/how-to-guides/discovery/exec.png" +alt="Pipelines: Executions Tab" +caption="Pipelines: Executions Tab" +/%} + +## Features Tab + +The Features tab is displayed only for ML Models. It displays a Description of the ML Model, and the features that have been used. Each feature will have further details on the Type of feature, Algorithm, Description, Sources, and the associated Glossary Terms and Tags. + +{% image +src="/images/v1.1/how-to-guides/discovery/mlm1.png" +alt="ML Models: Features Tab" +caption="ML Models: Features Tab" +/%} + +## Children Tab + +The Children tab is displayed only for Containers. + +# Version History and Other Details + +On the top right of the data asset details page, we can view details on: +- **Tasks:** The circular icon displays the number of open tasks. +- **Version History:** The clock icon displays the details of the version history in terms of major and minor changes. +- **Follow:** The star icon displays the number of users following the data asset. +- **Share:** Users can share the link to the data asset. +- **Announcements:** On clicking the **⋮** icon, users can add announcements. +- **Rename:** On clicking the **⋮** icon, users can rename the data asset. +- **Delete:** On clicking the **⋮** icon, users can delete the data asset.
+ +{% image +src="/images/v1.1/how-to-guides/discovery/vh.png" +alt="Version History and Other Details" +caption="Version History and Other Details" +/%} \ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tags.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tags.md new file mode 100644 index 000000000000..a8a1cbd16f78 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/tags.md @@ -0,0 +1,69 @@ +--- +title: How to Add Tags +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/tags +--- + +# How to Add Tags + +- From the Explore page, select a data asset and click on the edit icon or + Add for Tags. +- Search for the relevant tags. You can either type and search, or scroll to select from the options provided. +- Click on the checkmark to save the changes. + +{% image +src="/images/v1.1/how-to-guides/governance/tag7.png" +alt="Add Tags to Classify Data Assets" +caption="Add Tags to Classify Data Assets" +/%} + +The tagged data assets can be discovered right from the Classification page. +- Navigate to **Govern >> Classification**. +- The list of tags is displayed along with the details of Usage in various data assets. +- Click on the Usage number to view the tagged assets. + +{% image +src="/images/v1.1/how-to-guides/governance/tag2.png" +alt="Usage: Number of Assets Tagged" +caption="Usage: Number of Assets Tagged" +/%} + +{% image +src="/images/v1.1/how-to-guides/governance/tag3.png" +alt="Discover the Tagged Data Assets" +caption="Discover the Tagged Data Assets" +/%} + +You can view all the tags in the right panel. + +Data assets can also be classified using Tiers. Learn more about [Tiers](/how-to-guides/openmetadata/data-governance/glossary-classification/tiers). 
+ +Among the Classification Tags, OpenMetadata has some System Classification. Learn more about the [System Tags](/how-to-guides/openmetadata/data-governance/glossary-classification/classification). + +## Auto-Classification in OpenMetadata + +OpenMetadata identifies PII data and auto tags or suggests the tags. The data profiler automatically tags the PII-Sensitive data. The addition of tags about PII data helps consumers and governance teams identify data that needs to be treated carefully. + +In the example below, the columns ‘user_name’ and ‘social security number’ are auto-tagged as PII-sensitive. This works using NLP as part of the profiler during ingestion. + +{% image +src="/images/v1.1/how-to-guides/governance/auto1.png" +alt="User_name and Social Security Number are Auto-Classified as PII Sensitive" +caption="User_name and Social Security Number are Auto-Classified as PII Sensitive" +/%} + +In the below example, the column ‘dwh_x10’ is also auto-tagged as PII Sensitive, even though the column name does not provide much information. + +{% image +src="/images/v1.1/how-to-guides/governance/auto2.png" +alt="Column Name does not provide much information" +caption="Column Name does not provide much information" +/%} + +When we look at the content of the column ‘dwh_x10’ in the Sample Data tab, it becomes clear that the auto-classification is based on the data in the column. + +{% image +src="/images/v1.1/how-to-guides/governance/auto3.png" +alt="Column Data provides information" +caption="Column Data provides information" +/%} + +You can read more about [Auto PII Tagging](https://docs.open-metadata.org/v1.1.x/connectors/ingestion/auto_tagging) here. 
\ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/versions.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/versions.md new file mode 100644 index 000000000000..315883725301 --- /dev/null +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/how-to-guides/user-guide-data-stewards/overview-data-assets/versions.md @@ -0,0 +1,38 @@ +--- +title: Data Asset Versioning +slug: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/versions +--- + +# Data Asset Versioning + +OpenMetadata maintains the version history for all data assets using a number with the format *major.minor*, starting with 0.1 as the initial version of an entity. Changes in metadata result in version changes as follows: +- **Backward compatible** changes result in a **Minor version** change. A change in the description, tags, or ownership will increase the version of the data asset metadata by 0.1 (e.g., from 0.1 to 0.2). +- **Backward incompatible** changes result in a **Major version** change. For example, when a column in a table is deleted, the version increases by 1.0 (e.g., from 0.2 to 1.2). + +Metadata versioning helps **simplify debugging processes**. View the version history to see if a recent change led to a data issue. Data owners and admins can review changes and revert if necessary. + +Versioning also helps in **broader collaboration** among consumers and producers of data. Admins can provide access to more users in the organization to change certain fields. Crowd-sourcing makes metadata the collective responsibility of the entire organization. + +{% image + src="/images/v1.1/features/ingestion/versioning/metadata-versioning.gif" + alt="Metadata versioning" + caption="Metadata Versioning" + /%} + +OpenMetadata versions all the changes to the metadata to capture the evolution of data over time in the Versions History. 
This is tracked for all the data assets. OpenMetadata also captures the metadata changes at the source. Click on the Versions icon to see the Version History of your data. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/v1.png" +alt="Version History Icon" +caption="Version History Icon" +/%} + +If a user adds a description to a column, it is recorded as a **Minor version** change, and the version number is incremented by 0.1. Adding, updating, or removing a description, owner, or tags is likewise recorded as a minor version change. These are backward-compatible changes. When a column is deleted at the source, OpenMetadata captures it as a backward-incompatible change. To indicate that, the **major version** is changed by incrementing the version number by 1.0. + +{% image +src="/images/v1.1/how-to-guides/user-guide-for-data-stewards/v2.png" +alt="Version History" +caption="Version History" +/%} + +All the changes that have happened to your data and metadata are at your fingertips to understand the evolution of your data over time. This is also key for Data Governance.
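
The version arithmetic described in this guide can be sketched in a few lines of Python. This is only an illustration of the rule as stated here (add 0.1 for a backward-compatible change, add 1.0 for a backward-incompatible one), not OpenMetadata's actual implementation. The function name `next_version` and the `backward_compatible` flag are hypothetical and exist only for this example.

```python
from decimal import Decimal


def next_version(current: str, backward_compatible: bool) -> str:
    """Return the next version under the rule described in this guide.

    Hypothetical helper for illustration only: a backward-compatible
    change adds 0.1, a backward-incompatible change adds 1.0.
    """
    step = Decimal("0.1") if backward_compatible else Decimal("1.0")
    # Decimal keeps the arithmetic exact; binary floats can drift
    # when 0.1 is added repeatedly.
    return str(Decimal(current) + step)


# A description is edited: minor change.
print(next_version("0.1", backward_compatible=True))
# A column is deleted at the source: major change.
print(next_version("0.2", backward_compatible=False))
```

With the inputs above, the sketch follows the examples in the text: 0.1 becomes 0.2 after a description change, and 0.2 becomes 1.2 after a column is deleted.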
\ No newline at end of file diff --git a/openmetadata-docs/content/v1.2.x-SNAPSHOT/menu.md b/openmetadata-docs/content/v1.2.x-SNAPSHOT/menu.md index 217bb8faf496..26cf83d294f7 100644 --- a/openmetadata-docs/content/v1.2.x-SNAPSHOT/menu.md +++ b/openmetadata-docs/content/v1.2.x-SNAPSHOT/menu.md @@ -579,26 +579,154 @@ site_menu: - category: Connectors / Ingestion / Best Practices url: /connectors/ingestion/best-practices - - category: How to guides + - category: How to Guides url: /how-to-guides color: violet-70 icon: openmetadata - - category: How to guides / CLI Ingestion with basic auth + - category: How to Guides / Quick Start Guide for Admins + url: /how-to-guides/quick-start-guide-for-admins + - category: How to Guides / Quick Start Guide for Admins / How to Ingest Metadata + url: /how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata + - category: How to Guides / Quick Start Guide for Admins / How to Ingest Metadata / How to Delete a Service Connection + url: /how-to-guides/quick-start-guide-for-admins/how-to-ingest-metadata/delete-service-connection + - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users + url: /how-to-guides/quick-start-guide-for-admins/teams-and-users + - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users / Team Structure in OpenMetadata + url: /how-to-guides/quick-start-guide-for-admins/teams-and-users/team-structure-openmetadata + - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users / How to Add a Team + url: /how-to-guides/quick-start-guide-for-admins/teams-and-users/add-team + - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users / How to Invite Users to OpenMetadata + url: /how-to-guides/quick-start-guide-for-admins/teams-and-users/invite-users + - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users / How to Add Users to Teams + url: 
/how-to-guides/quick-start-guide-for-admins/teams-and-users/add-users
+  - category: How to Guides / Quick Start Guide for Admins / Manage Teams and Users / How to Change the Team Type
+    url: /how-to-guides/quick-start-guide-for-admins/teams-and-users/change-team-type
+  - category: How to Guides / Admin Guide for Roles and Policies
+    url: /how-to-guides/admin-guide-roles-policies
+  - category: How to Guides / Admin Guide for Roles and Policies / Building Blocks of Authorization - Rules, Policies, and Roles
+    url: /how-to-guides/admin-guide-roles-policies/authorization
+  - category: How to Guides / Admin Guide for Roles and Policies / Use Cases - Creating Roles & Policies in OpenMetadata
+    url: /how-to-guides/admin-guide-roles-policies/use-cases
+  - category: How to Guides / User Guide for Data Stewards
+    url: /how-to-guides/user-guide-for-data-stewards
+  - category: How to Guides / User Guide for Data Stewards / Understanding the Basics of OpenMetadata
+    url: /how-to-guides/user-guide-for-data-stewards/basics-openmetadata
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / Data Asset Tabs
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/tabs
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Add Description using Markdown
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/description
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / Request for Description
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-description
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Assign or Change Data Ownership
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/data-ownership
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Follow a Data Asset
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/follow-data-asset
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Add Tags
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/tags
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Request for Tags
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/request-tags
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Add Glossary Terms
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/glossary
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Create a Custom Property for a Data Asset
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/custom
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / Overview of Announcements
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/announcements
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Create an Announcement
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/add-announcement
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / Data Asset Versioning
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/versions
+  - category: How to Guides / User Guide for Data Stewards / Overview of Data Assets / How to Delete a Data Asset
+    url: /how-to-guides/user-guide-for-data-stewards/overview-data-assets/delete
+  - category: How to Guides / The Six Pillars of OpenMetadata
+    url: /how-to-guides/openmetadata
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Discovery
+    url: /how-to-guides/openmetadata/data-discovery
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Discovery / How to Discover Assets of Interest
+    url: /how-to-guides/openmetadata/data-discovery/discover
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Discovery / Get a Quick Glance of the Data Assets
+    url: /how-to-guides/openmetadata/data-discovery/preview
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Discovery / Detailed View of the Data Assets
+    url: /how-to-guides/openmetadata/data-discovery/details
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Discovery / Add Complex Queries using Advanced Search
+    url: /how-to-guides/openmetadata/data-discovery/advanced
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration
+    url: /how-to-guides/openmetadata/data-collaboration
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration / Understanding Activity Feeds
+    url: /how-to-guides/openmetadata/data-collaboration/activity-feeds
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration / How to Request for Description
+    url: /how-to-guides/openmetadata/data-collaboration/request-description
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration / How to Request for Tags
+    url: /how-to-guides/openmetadata/data-collaboration/request-tags
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration / Overview of Announcements
+    url: /how-to-guides/openmetadata/data-collaboration/announcements
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Collaboration / How to Create an Announcement
+    url: /how-to-guides/openmetadata/data-collaboration/add-announcement
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Quality and Profiler
+    url: /how-to-guides/openmetadata/data-quality-profiler
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Quality and Profiler / Profiler and Data Quality Tab
+    url: /how-to-guides/openmetadata/data-quality-profiler/tab
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Quality and Profiler / How to Write and Deploy No-Code Test Cases
+    url: /how-to-guides/openmetadata/data-quality-profiler/test
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Quality and Profiler / How to Set Alerts for Test Case Fails
+    url: /how-to-guides/openmetadata/data-quality-profiler/alerts
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage
+    url: /how-to-guides/openmetadata/data-lineage
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How to Deploy a Lineage Workflow
+    url: /how-to-guides/openmetadata/data-lineage/workflow
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / Explore the Lineage View
+    url: /how-to-guides/openmetadata/data-lineage/explore
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How Column-Level Lineage Works
+    url: /how-to-guides/openmetadata/data-lineage/column
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Lineage / How to Manually Add or Edit Lineage
+    url: /how-to-guides/openmetadata/data-lineage/manual
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights
+    url: /how-to-guides/openmetadata/data-insights
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights / What is Tiering
+    url: /how-to-guides/openmetadata/data-insights/tiering
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights / Set Up Data Insights Ingestion
+    url: /how-to-guides/openmetadata/data-insights/ingestion
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights / Key Performance Indicators (KPI)
+    url: /how-to-guides/openmetadata/data-insights/kpi
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights / Data Insights Report
+    url: /how-to-guides/openmetadata/data-insights/report
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Insights / How to Transform the Data Culture of Your Company
+    url: /how-to-guides/openmetadata/data-insights/data-culture
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance
+    url: /how-to-guides/openmetadata/data-governance
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / What is a Glossary
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / What is Classification
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/classification
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / What are Tiers
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/tiers
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Setup a Glossary
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/setup-glossary
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Create Glossary Terms
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/glossary-terms
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Bulk Import a Glossary
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/import-glossary
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Add Assets to Glossary Terms
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/assets
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / How to Classify Data Assets
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/classify-assets
+  - category: How to Guides / The Six Pillars of OpenMetadata / Data Governance / Glossary and Classification / Best Practices for Glossary and Classification
+    url: /how-to-guides/openmetadata/data-governance/glossary-classification/best-practices
+  - category: How to Guides / CLI Ingestion with basic auth
     url: /how-to-guides/cli-ingestion-with-basic-auth
-  - category: How to guides / Feature configurations
+  - category: How to Guides / Feature configurations
     url: /how-to-guides/feature-configurations
-  - category: How to guides / Feature configurations / Bots
+  - category: How to Guides / Feature configurations / Bots
     url: /how-to-guides/feature-configurations/bots
-  - category: How to guides / Teams and Users
-    url: /how-to-guides/teams-and-users
-  - category: How to guides / Teams and Users / How to Organise Teams and Users
-    url: /how-to-guides/teams-and-users/how-to-organise-teams-and-users
-  - category: How to guides / How to add a custom property to an entity
+  - category: How to Guides / How to add a custom property to an entity
     url: /how-to-guides/how-to-add-custom-property-to-an-entity
-  - category: How to guides / How to add Custom Logo
+  - category: How to Guides / How to add Custom Logo
     url: /how-to-guides/how-to-add-custom-logo
-  - category: How to guides / How to Add Language Support
+  - category: How to Guides / How to Add Language Support
     url: /how-to-guides/how-to-add-language-support
   - category: Features