feat(docs): Document __DATAHUB_TO_FILE_ directive (datahub-project#10968)

Co-authored-by: Harshal Sheth <[email protected]>
pedro93 and hsheth2 authored Jul 26, 2024
1 parent f816a14 commit 1f7c92b
Showing 1 changed file with 29 additions and 0 deletions.
29 changes: 29 additions & 0 deletions metadata-ingestion/recipe_overview.md
@@ -90,6 +90,35 @@ similar to variable substitution in GNU bash or in docker-compose files.
For details, see [variable-substitution](https://docs.docker.com/compose/compose-file/compose-file-v2/#variable-substitution).
Use this environment variable substitution to keep sensitive information out of recipe files. As long as the environment variables reach the ingestion process securely, there is no need to store sensitive information in the recipe itself.
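
To illustrate the idea, here is a minimal Python sketch of `${VAR}`-style substitution using the standard library's `os.path.expandvars`. It is not DataHub's resolver, which may support additional syntax. The helper name `expand_env` and the variable `MYSQL_PASSWORD` are hypothetical and chosen only for this example.

```python
import os


def expand_env(value: str) -> str:
    """Expand $VAR and ${VAR} references from the process environment.

    Illustrative only: references to unset variables are left unchanged.
    """
    return os.path.expandvars(value)


# In practice the variable is set outside the process (shell, secret store);
# it is set here only so the example is self-contained.
os.environ["MYSQL_PASSWORD"] = "s3cret"

recipe_line = "password: ${MYSQL_PASSWORD}"
expanded_line = expand_env(recipe_line)
print(expanded_line)
```

The recipe file itself only ever contains the placeholder, so it can be committed or shared without exposing the credential.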

### Loading Sensitive Data as Files in Recipes


Some sources (e.g. kafka, bigquery, mysql) require paths to files on a local file system. This doesn't work for UI ingestion, where the recipe needs to be fully self-contained. To supply such files to the ingestion process as part of the configuration, DataHub offers the `__DATAHUB_TO_FILE_` directive, which lets a recipe set the contents of a file.

The syntax for this directive is `__DATAHUB_TO_FILE_<property>: <value>`, which is turned into `<property>: <path to file containing value>`. The value can be specified inline or via an environment variable or secret.

For example:

```yaml
source:
  type: mysql
  config:
    # Coordinates
    host_port: localhost:3306
    database: dbname

    # Credentials
    username: root
    password: example
    # If you need to use SSL with MySQL:
    options:
      connect_args:
        __DATAHUB_TO_FILE_ssl_key: '${secret}' # use this for secrets that you need to mount to a file
        # this will get converted into
        # ssl_key: /tmp/path/to/file # where file contains the contents of ${secret}
...
```
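
The rewrite that the directive performs can be sketched in Python as follows. This is a sketch under stated assumptions, not DataHub's implementation: the function name `resolve_to_file_directives` is hypothetical, and the use of `tempfile.mkstemp` for the file location is an assumption made for the example.

```python
import os
import tempfile
from typing import Any

PREFIX = "__DATAHUB_TO_FILE_"


def resolve_to_file_directives(node: Any) -> Any:
    """Replace `__DATAHUB_TO_FILE_<property>: <value>` entries with
    `<property>: <path to a temp file containing value>`, recursively.

    Hypothetical helper for illustration only.
    """
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if isinstance(key, str) and key.startswith(PREFIX):
                # mkstemp creates the file readable/writable by the owner only.
                fd, path = tempfile.mkstemp()
                with os.fdopen(fd, "w") as handle:
                    handle.write(str(value))
                resolved[key[len(PREFIX):]] = path
            else:
                resolved[key] = resolve_to_file_directives(value)
        return resolved
    if isinstance(node, list):
        return [resolve_to_file_directives(item) for item in node]
    return node


# A trimmed-down config mirroring the recipe above; the key material is a placeholder.
recipe_config = {
    "host_port": "localhost:3306",
    "options": {
        "connect_args": {"__DATAHUB_TO_FILE_ssl_key": "-----BEGIN KEY-----"},
    },
}

resolved = resolve_to_file_directives(recipe_config)
ssl_key_path = resolved["options"]["connect_args"]["ssl_key"]
```

After the call, `connect_args` holds a plain `ssl_key` entry pointing at a file on disk, which is the shape that a source expecting a file path can consume.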

### Transformations

If you'd like to modify data before it reaches the ingestion sinks – for instance, adding additional owners or tags – you can use a transformer to write your own module and integrate it with DataHub. Transformers require extending the recipe with a new section to describe the transformers that you want to run.