Merge branch 'datahub-project:master' into master
anshbansal authored Jul 31, 2024
2 parents e6de1e6 + f73149a commit 4a54301
Showing 3 changed files with 36 additions and 3 deletions.
@@ -11,7 +11,6 @@
import javax.annotation.Nullable;
import lombok.extern.slf4j.Slf4j;


@Slf4j
public class InputFieldsMapper {

@@ -40,9 +39,14 @@ public com.linkedin.datahub.graphql.generated.InputFields apply(
 if (field.hasSchemaFieldUrn()) {
   fieldResult.setSchemaFieldUrn(field.getSchemaFieldUrn().toString());
   try {
-    parentUrn = Urn.createFromString(field.getSchemaFieldUrn().getEntityKey().get(0));
+    parentUrn =
+        Urn.createFromString(field.getSchemaFieldUrn().getEntityKey().get(0));
   } catch (URISyntaxException e) {
-    log.error("Field urn resolution: failed to extract parentUrn successfully from {}. Falling back to {}", field.getSchemaFieldUrn(), entityUrn, e);
+    log.error(
+        "Field urn resolution: failed to extract parentUrn successfully from {}. Falling back to {}",
+        field.getSchemaFieldUrn(),
+        entityUrn,
+        e);
   }
 }
 if (field.hasSchemaField()) {
1 change: 1 addition & 0 deletions docs/api/graphql/getting-started.md
@@ -27,6 +27,7 @@ For more information on, please refer to the following links."
- [Querying for Domain of a Dataset](/docs/api/tutorials/domains.md#read-domains)
- [Querying for Glossary Terms of a Dataset](/docs/api/tutorials/terms.md#read-terms)
- [Querying for Deprecation of a dataset](/docs/api/tutorials/deprecation.md#read-deprecation)
- [Querying for all DataJobs that belong to a DataFlow](/docs/lineage/airflow.md#get-all-datajobs-associated-with-a-dataflow)

### Search

28 changes: 28 additions & 0 deletions docs/lineage/airflow.md
@@ -266,6 +266,34 @@ with DAG(
- ingest this DAG, and it will remove all the obsolete pipelines and tasks from the Datahub based on the `cluster` value set in the `airflow.cfg`


## Get all dataJobs associated with a dataFlow

If you are looking to find all tasks (aka DataJobs) that belong to a specific pipeline (aka DataFlow), you can use the following GraphQL query:

```graphql
query {
  dataFlow(urn: "urn:li:dataFlow:(airflow,db_etl,prod)") {
    childJobs: relationships(
      input: {
        types: ["IsPartOf"],
        direction: INCOMING,
        start: 0,
        count: 100
      }
    ) {
      total
      relationships {
        entity {
          ... on DataJob {
            urn
          }
        }
      }
    }
  }
}
```
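
The query above can also be sent programmatically. The following is a minimal sketch, not an official client: it assumes a local deployment that serves GraphQL at `http://localhost:8080/api/graphql` and a personal access token passed as a Bearer token. The endpoint URL and the helper names (`build_request`, `extract_datajob_urns`, `fetch_datajob_urns`) are illustrative assumptions.

```python
import json
import urllib.request

# Assumption: local deployment; replace with your own GraphQL endpoint.
GRAPHQL_ENDPOINT = "http://localhost:8080/api/graphql"

# The same query as above, reproduced verbatim.
QUERY = """
query {
  dataFlow(urn: "urn:li:dataFlow:(airflow,db_etl,prod)") {
    childJobs: relationships(
      input: { types: ["IsPartOf"], direction: INCOMING, start: 0, count: 100 }
    ) {
      total
      relationships {
        entity {
          ... on DataJob {
            urn
          }
        }
      }
    }
  }
}
"""


def build_request(query, token, endpoint=GRAPHQL_ENDPOINT):
    """Build a POST request carrying the GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(endpoint, data=body)
    request.add_header("Content-Type", "application/json")
    request.add_header("Authorization", f"Bearer {token}")
    return request


def extract_datajob_urns(response):
    """Collect DataJob URNs from a parsed GraphQL response.

    Entities that are not DataJobs come back as empty objects because the
    query selects `urn` only inside the `... on DataJob` fragment, so they
    are skipped here.
    """
    data_flow = (response.get("data") or {}).get("dataFlow") or {}
    relationships = (data_flow.get("childJobs") or {}).get("relationships") or []
    return [
        rel["entity"]["urn"]
        for rel in relationships
        if (rel.get("entity") or {}).get("urn")
    ]


def fetch_datajob_urns(token, endpoint=GRAPHQL_ENDPOINT):
    """Send the query and return the DataJob URNs (performs a network call)."""
    with urllib.request.urlopen(build_request(QUERY, token, endpoint)) as resp:
        return extract_datajob_urns(json.load(resp))
```

Calling `fetch_datajob_urns("<your-token>")` returns the URNs from the first page only. Since the query asks for `count: 100`, compare `total` with the number of relationships returned and raise `start` to page through larger pipelines.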

## Emit Lineage Directly

If you can't use the plugin or annotate inlets/outlets, you can also emit lineage using the `DatahubEmitterOperator`.
