feat(ingest): connector for Neo4j (datahub-project#11526)
Co-authored-by: kbartlett <[email protected]>
Co-authored-by: Andrew Sikowitz <[email protected]>
Co-authored-by: Jay Feldman <[email protected]>
Co-authored-by: Harshal Sheth <[email protected]>
Co-authored-by: Mayuri Nehate <[email protected]>
Co-authored-by: Shirshanka Das <[email protected]>
Co-authored-by: deepgarg-visa <[email protected]>
Co-authored-by: Felix Lüdin <[email protected]>
9 people authored Dec 2, 2024
1 parent a31c88e commit dc87b51
Showing 11 changed files with 612 additions and 0 deletions.
4 changes: 4 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/constants.ts
@@ -38,6 +38,7 @@ import sigmaLogo from '../../../../images/sigmalogo.png';
import sacLogo from '../../../../images/saclogo.svg';
import cassandraLogo from '../../../../images/cassandralogo.png';
import datahubLogo from '../../../../images/datahublogo.png';
import neo4j from '../../../../images/neo4j.png';

export const ATHENA = 'athena';
export const ATHENA_URN = `urn:li:dataPlatform:${ATHENA}`;
@@ -137,6 +138,8 @@ export const DATAHUB_GC = 'datahub-gc';
export const DATAHUB_LINEAGE_FILE = 'datahub-lineage-file';
export const DATAHUB_BUSINESS_GLOSSARY = 'datahub-business-glossary';
export const DATAHUB_URN = `urn:li:dataPlatform:${DATAHUB}`;
export const NEO4J = 'neo4j';
export const NEO4J_URN = `urn:li:dataPlatform:${NEO4J}`;

export const PLATFORM_URN_TO_LOGO = {
[ATHENA_URN]: athenaLogo,
@@ -180,6 +183,7 @@ export const PLATFORM_URN_TO_LOGO = {
[SAC_URN]: sacLogo,
[CASSANDRA_URN]: cassandraLogo,
[DATAHUB_URN]: datahubLogo,
[NEO4J_URN]: neo4j,
};

export const SOURCE_TO_PLATFORM_URN = {
8 changes: 8 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/sources.json
@@ -325,5 +325,13 @@
"description": "Ingest databases and tables from any Iceberg catalog implementation",
"docsUrl": "https://datahubproject.io/docs/generated/ingestion/sources/iceberg",
"recipe": "source:\n type: \"iceberg\"\n config:\n env: dev\n # each thread will open internet connections to fetch manifest files independently, \n # this value needs to be adjusted with ulimit\n processing_threads: 1 \n # a single catalog definition with a form of a dictionary\n catalog: \n demo: # name of the catalog\n type: \"rest\" # other types are available\n uri: \"uri\"\n s3.access-key-id: \"access-key\"\n s3.secret-access-key: \"secret-access-key\"\n s3.region: \"aws-region\"\n profiling:\n enabled: false\n"
},
{
"urn": "urn:li:dataPlatform:neo4j",
"name": "neo4j",
"displayName": "Neo4j",
"description": "Import Nodes and Relationships from Neo4j.",
"docsUrl": "https://datahubproject.io/docs/generated/ingestion/sources/neo4j/",
"recipe": "source:\n  type: 'neo4j'\n  config:\n    uri: 'neo4j+ssc://host:7687'\n    username: 'neo4j'\n    password: 'password'\n    env: 'PROD'\n\nsink:\n  type: \"datahub-rest\"\n  config:\n    server: 'http://localhost:8080'"
}
]
Binary file added datahub-web-react/src/images/neo4j.png
20 changes: 20 additions & 0 deletions metadata-ingestion/docs/sources/neo4j/neo4j.md
@@ -0,0 +1,20 @@
## Integration Details

<!-- Plain-language description of what this integration is meant to do. -->
<!-- Include details about where metadata is extracted from (ie. logs, source API, manifest, etc.) -->
Neo4j metadata is ingested into DataHub by running the following query:
`CALL apoc.meta.schema() YIELD value UNWIND keys(value) AS key RETURN key, value[key] AS value;`
The returned data is parsed and displayed as Nodes and Relationships in DataHub.
Each object is tagged to describe what kind of DataHub object it is.
The defaults are 'Node' and 'Relationship'. These tag values can be overridden in the recipe.
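
The tagging step described above can be sketched in plain Python. This is a minimal illustration, not the connector's actual code: the function name `tag_schema_rows`, its `node_tag` and `relationship_tag` parameters, and the sample rows are all hypothetical. It only assumes that each value returned by `apoc.meta.schema()` carries a `type` of `"node"` or `"relationship"`.

```python
from typing import Any, Dict, List


def tag_schema_rows(
    rows: List[Dict[str, Any]],
    node_tag: str = "Node",  # hypothetical parameter name
    relationship_tag: str = "Relationship",  # hypothetical parameter name
) -> List[Dict[str, Any]]:
    """Attach a tag to each (key, value) row shaped like the schema query output."""
    tags = {"node": node_tag, "relationship": relationship_tag}
    tagged = []
    for row in rows:
        kind = row["value"].get("type")
        if kind not in tags:
            continue  # ignore anything that is neither a node nor a relationship
        tagged.append(
            {
                "name": row["key"],
                "tag": tags[kind],
                "properties": sorted(row["value"].get("properties", {})),
            }
        )
    return tagged


# Hand-written sample rows for illustration only; not output from a real database.
sample_rows = [
    {"key": "Person", "value": {"type": "node", "properties": {"name": {"type": "STRING"}}}},
    {"key": "KNOWS", "value": {"type": "relationship", "properties": {"since": {"type": "INTEGER"}}}},
]

tagged = tag_schema_rows(sample_rows)
overridden = tag_schema_rows(sample_rows, node_tag="GraphNode")
```

Passing a different `node_tag` or `relationship_tag` plays the role of overriding the default tag values in the recipe.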



## Metadata Ingestion Quickstart

### Prerequisites

In order to ingest metadata from Neo4j, you will need:

* A Neo4j instance with APOC installed

12 changes: 12 additions & 0 deletions metadata-ingestion/docs/sources/neo4j/neo4j_recipe.yml
@@ -0,0 +1,12 @@
source:
  type: 'neo4j'
  config:
    uri: 'neo4j+ssc://host:7687'
    username: 'neo4j'
    password: 'password'
    env: 'PROD'

sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
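
The `uri` value in the recipe uses the `neo4j+ssc` scheme, which in the Neo4j drivers means an encrypted connection that accepts self-signed certificates. As a rough sketch of how such a URI breaks down, the snippet below parses it with the standard library. The helper name `describe_uri` and the wording of the security descriptions are assumptions made for illustration and are not part of the connector.

```python
from urllib.parse import urlsplit

# Neo4j driver URI schemes and a short description of their TLS behaviour.
SCHEME_SECURITY = {
    "neo4j": "unencrypted",
    "neo4j+s": "encrypted, CA-signed certificate required",
    "neo4j+ssc": "encrypted, self-signed certificate accepted",
    "bolt": "unencrypted",
    "bolt+s": "encrypted, CA-signed certificate required",
    "bolt+ssc": "encrypted, self-signed certificate accepted",
}


def describe_uri(uri: str) -> dict:
    """Split a Neo4j connection URI into scheme, host, port, and TLS behaviour."""
    parts = urlsplit(uri)
    if parts.scheme not in SCHEME_SECURITY:
        raise ValueError(f"Unsupported Neo4j URI scheme: {parts.scheme!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "security": SCHEME_SECURITY[parts.scheme],
    }


info = describe_uri("neo4j+ssc://host:7687")
```

A check like this can catch a mistyped scheme before the recipe is handed to the ingestion run.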
3 changes: 3 additions & 0 deletions metadata-ingestion/setup.py
@@ -525,6 +525,7 @@
"qlik-sense": sqlglot_lib | {"requests", "websocket-client"},
"sigma": sqlglot_lib | {"requests"},
"sac": sac,
"neo4j": {"pandas", "neo4j"},
}

# This is mainly used to exclude plugins from the Docker image.
@@ -673,6 +674,7 @@
"sigma",
"sac",
"cassandra",
"neo4j",
]
if plugin
for dependency in plugins[plugin]
@@ -792,6 +794,7 @@
"sigma = datahub.ingestion.source.sigma.sigma:SigmaSource",
"sac = datahub.ingestion.source.sac.sac:SACSource",
"cassandra = datahub.ingestion.source.cassandra.cassandra:CassandraSource",
"neo4j = datahub.ingestion.source.neo4j.neo4j_source:Neo4jSource",
],
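
The new `neo4j = ...:Neo4jSource` line above follows the standard `name = module:attribute` entry-point syntax. As a small sketch of how that string is interpreted, the standard library's `importlib.metadata.EntryPoint` can split it (Python 3.9+ for `.module` and `.attr`). The group name passed here is an assumption, since the hunk does not show it, and nothing is imported from DataHub.

```python
from importlib.metadata import EntryPoint

# Build an EntryPoint object by hand from the string registered in setup.py.
# The group name is assumed for illustration; it is not visible in the hunk.
ep = EntryPoint(
    name="neo4j",
    value="datahub.ingestion.source.neo4j.neo4j_source:Neo4jSource",
    group="datahub.ingestion.source.plugins",
)

module_path = ep.module  # dotted module that would be imported
class_name = ep.attr  # attribute looked up on that module
```

Calling `ep.load()` would import the module and return the class, which requires the `neo4j` extra to be installed.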
"datahub.ingestion.transformer.plugins": [
"pattern_cleanup_ownership = datahub.ingestion.transformer.pattern_cleanup_ownership:PatternCleanUpOwnership",
@@ -22,6 +22,8 @@ class DatasetSubTypes(StrEnum):
SAC_MODEL = "Model"
SAC_IMPORT_DATA_MODEL = "Import Data Model"
SAC_LIVE_DATA_MODEL = "Live Data Model"
NEO4J_NODE = "Neo4j Node"
NEO4J_RELATIONSHIP = "Neo4j Relationship"

# TODO: Create separate entity...
NOTEBOOK = "Notebook"
Empty file.