Docs & notebook to use the published jar
No need to build the jar by oneself any more
juhoautio committed Apr 16, 2021
1 parent 81f1d13 commit 99eb45c
Showing 2 changed files with 60 additions and 119 deletions.
28 changes: 9 additions & 19 deletions README.md
@@ -108,14 +108,15 @@ satisfies the requirements of `DruidSource`
First, set the following spark conf:

```python
.conf("spark.jars",
"s3://my-bucket/my/prefix/rovio-ingest-1.0-SNAPSHOT.jar") \

.conf("spark.jars.repositories",
"https://s01.oss.sonatype.org/content/repositories/snapshots") \
.conf("spark.jars.packages",
"com.rovio.ingest:rovio-ingest:1.0.0_spark_3.0.1-SNAPSHOT") \
.conf("spark.submit.pyFiles",
"s3://my-bucket/my/prefix/rovio_ingest.zip")
```

-This is assuming that you [built from source](#building-from-source) and copied the packages to s3.
+This is assuming that you [built a python zip](#building-rovio_ingest-python) and copied it to s3.
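
For reference, a minimal sketch of how those confs might be applied when building a PySpark session: the repository URL, package coordinates, and s3 path are the values from the snippet above, while the app name and the use of the standard `SparkSession.builder.config(...)` call (in place of the snippet's chained `.conf(...)`) are illustrative assumptions.

```python
from pyspark.sql import SparkSession

# Sketch only: resolve the published rovio-ingest snapshot from the Sonatype
# snapshots repository and ship the python zip to executors via pyFiles.
spark = (
    SparkSession.builder
    .appName("rovio-ingest-example")  # placeholder app name
    .config("spark.jars.repositories",
            "https://s01.oss.sonatype.org/content/repositories/snapshots")
    .config("spark.jars.packages",
            "com.rovio.ingest:rovio-ingest:1.0.0_spark_3.0.1-SNAPSHOT")
    .config("spark.submit.pyFiles",
            "s3://my-bucket/my/prefix/rovio_ingest.zip")
    .getOrCreate()
)
```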

```python
from rovio_ingest import DRUID_SOURCE
@@ -170,14 +171,13 @@ Maven: see [Java](#java).

A `Dataset[Row]` extension is provided to repartition the dataset for the `DruidSource` Datasource.

-First, set the following spark conf:
+For an interactive spark session you can set the following spark conf:

```scala
("spark.jars", "s3://my-bucket/my/prefix/rovio-ingest-1.0-SNAPSHOT.jar")
("spark.jars.repositories", "https://s01.oss.sonatype.org/content/repositories/snapshots"),
("spark.jars.packages", "com.rovio.ingest:rovio-ingest:1.0.0_spark_3.0.1-SNAPSHOT")
```

-This is assuming that you [built from source](#building-rovio-ingest-jar) and copied the jar to s3.

```scala
import org.apache.spark.sql.{Dataset, Row, SaveMode, SparkSession}
import com.rovio.ingest.extensions.DruidDatasetExtensions._
@@ -235,14 +235,6 @@ Maven (for a full example, see [examples/rovio-ingest-maven-example](examples/ro

A `DruidDataset` wrapper class is provided to repartition the dataset for the `DruidSource` DataSource.

-First, set the following spark conf:
-
-```java
-("spark.jars", "s3://my-bucket/my/prefix/rovio-ingest-1.0-SNAPSHOT.jar")
-```
-
-This is assuming that you [built from source](#building-rovio-ingest-jar) and copied the jar to s3.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
@@ -372,9 +364,7 @@ To build the jar package:

The recommended way is to build a shaded jar and use it.

-Another option is to depend on `rovio-ingest` as a maven module (or use the plain jar), but there may
-be version conflicts between maven dependencies. If you'd still like to do it that way, see
-[this notebook](python/notebooks/druid_ingestion_test.ipynb) for guidance.
+To test the jar in practice, see [this notebook](python/notebooks/druid_ingestion_test.ipynb) as an example.

#### Building rovio_ingest (python)
