Merge pull request #225 from treeverse/glue-iceberg-demo-v1.0
Added Iceberg integration notebooks which work with AWS Glue
kesarwam authored Aug 23, 2024
2 parents bec59d7 + e7259b7 commit 4b03ecd
Showing 5 changed files with 1,267 additions and 0 deletions.
1 change: 1 addition & 0 deletions 00_notebooks/00_index.ipynb
@@ -82,6 +82,7 @@
"* [AWS **Databricks**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/aws-databricks/)\n",
"* [AWS **Glue and Athena**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/aws-glue-athena/)\n",
"* [AWS **Glue and Trino**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/aws-glue-trino/)\n",
"* [AWS **Glue and Iceberg**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/aws-glue-iceberg/)\n",
"* [lakeFS + **Dagster**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/dagster-integration/)\n",
"* [lakeFS + **Prefect**](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/prefect-integration/)\n",
"* [Reproducibility and Data Version Control for **LangChain** and **LLM/OpenAI** Models](https://github.com/treeverse/lakeFS-samples/blob/main/01_standalone_examples/llm-openai-langchain-integration/)<br/>_See also the [accompanying blog](https://lakefs.io/blog/lakefs-langchain-loader/)_\n",
26 changes: 26 additions & 0 deletions 01_standalone_examples/aws-glue-iceberg/README.md
@@ -0,0 +1,26 @@
# Integration of lakeFS with Glue Notebooks and Iceberg

Start by ⭐️ starring the [lakeFS open source](https://go.lakefs.io/oreilly-course) project.

This repository includes the following Glue notebooks:

1. iceberg-books:
* Use Case: Isolated Dev/Test Environments
* This interactive notebook demonstrates the integration of lakeFS with [Glue Notebooks](https://docs.aws.amazon.com/glue/latest/dg/notebook-getting-started.html) and [Iceberg](https://docs.lakefs.io/integrations/iceberg.html) using a basic example.

2. iceberg-lakefs-nyc:
* Use Case: Isolated Dev/Test Environments
* This interactive notebook demonstrates the integration of lakeFS with Glue Notebooks and Iceberg using the [NYC Film Permits](https://data.cityofnewyork.us/City-Government/Film-Permits/tg4x-b46p/about_data) example.


## Prerequisites
* lakeFS installed and running in your AWS environment or in lakeFS Cloud. If you don't already have lakeFS running, either use [lakeFS Cloud](https://lakefs.cloud/), which provides a free lakeFS server on demand with a single click, or follow the [Deploy lakeFS on AWS](https://docs.lakefs.io/howto/deploy/aws.html) doc.


## Setup

Download these notebooks from GitHub and upload them as Jupyter notebooks in AWS Glue Studio.
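
The download step can also be scripted. The following is a minimal sketch, not part of the original instructions: it assumes each notebook file is named after its entry in the list above with an `.ipynb` extension, and that the files live on the `main` branch under the directory linked from the index notebook. The helper names `notebook_url` and `download_notebooks` are hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Raw-content base URL, derived from the repository path linked in the index notebook.
RAW_BASE = (
    "https://raw.githubusercontent.com/treeverse/lakeFS-samples/"
    "main/01_standalone_examples/aws-glue-iceberg"
)

# Assumption: file names match the notebook names listed in this README plus ".ipynb".
NOTEBOOKS = ["iceberg-books", "iceberg-lakefs-nyc"]


def notebook_url(name: str) -> str:
    """Build the raw download URL for one notebook (hypothetical helper)."""
    return f"{RAW_BASE}/{name}.ipynb"


def download_notebooks(target_dir: str = "glue-notebooks") -> list[Path]:
    """Download every notebook into target_dir and return the local paths."""
    out = Path(target_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in NOTEBOOKS:
        dest = out / f"{name}.ipynb"
        urlretrieve(notebook_url(name), dest)  # requires network access
        paths.append(dest)
    return paths
```

Calling `download_notebooks()` would place the files in a local `glue-notebooks` directory, from where they can be uploaded through the Glue Studio notebook UI. If the actual file names differ, adjust `NOTEBOOKS` accordingly.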

## Demo Instructions

Open the notebook in Glue Studio and follow the instructions.