
Change spark-tensorflow-connector dependency to be spark 3.0.0 preview #141

Open · wants to merge 6 commits into master
Conversation

@WeichenXu123 (Contributor) commented Oct 9, 2019

Change spark-tensorflow-connector to be spark-3.0.0-preview2
Test:

```
cd $PROJ_HOME/hadoop
mvn clean install  # build tensorflow-hadoop:1.10.0 and install into local repo

cd $PROJ_HOME/spark/spark-tensorflow-connector
mvn clean install
```
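For reference, a minimal sketch of the dependency change involved: the coordinates for the Spark 3.0.0-preview2 release and the Scala 2.12 artifact suffix are shown, but the property names and pom layout here are assumptions, and the pom in this PR is authoritative.

```xml
<!-- Sketch only: property names are assumptions; see the actual pom changes in this PR. -->
<properties>
  <spark.version>3.0.0-preview2</spark.version>
  <scala.binary.version>2.12</scala.binary.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

With something like that in place, the two builds above first install tensorflow-hadoop into the local repo and then build the connector against the Spark 3.0.0-preview2 artifacts.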

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

```diff
@@ -86,7 +87,7 @@
       <configuration>
         <recompileMode>incremental</recompileMode>
         <useZincServer>true</useZincServer>
-        <scalaVersion>${scala.binary.version}</scalaVersion>
+        <scalaVersion>${scala.version}</scalaVersion>
```
@WeichenXu123 (Contributor, Author) commented:

We need to specify the full Scala version ("2.12.10") instead of the binary version ("2.12") here; otherwise it causes a compatibility issue inside the Maven Scala plugin.
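For illustration, assuming the build uses scala-maven-plugin (net.alchim31.maven), which the recompileMode/useZincServer options suggest, the relevant properties and plugin configuration would look roughly like this. This is a sketch: the property values are taken from the comment above, and the surrounding pom structure is assumed.

```xml
<properties>
  <!-- Full version for the compiler plugin; binary version for artifact suffixes. -->
  <scala.version>2.12.10</scala.version>
  <scala.binary.version>2.12</scala.binary.version>
</properties>

<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <configuration>
    <recompileMode>incremental</recompileMode>
    <useZincServer>true</useZincServer>
    <!-- Must be the full version (2.12.10), not the binary version (2.12). -->
    <scalaVersion>${scala.version}</scalaVersion>
  </configuration>
</plugin>
```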

@WeichenXu123 (Contributor, Author)

@googlebot I signed it!

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot added cla: yes and removed cla: no labels on Oct 9, 2019

@WeichenXu123 (Contributor, Author)

@jhseu Could you help review it? Thanks!
Just run mvn clean install under the spark/spark-tensorflow-connector directory to verify the PR's correctness.
Btw, why is there no Jenkins test?

@jkbradley

When I try to build this, I'm hitting:

[ERROR] Failed to execute goal on project spark-tensorflow-connector_2.11: Could not resolve dependencies for project org.tensorflow:spark-tensorflow-connector_2.11:jar:1.10.0: Could not find artifact org.tensorflow:tensorflow-hadoop:jar:1.10.0 in central (https://repo.maven.apache.org/maven2) -> [Help 1]

It looks like this tries to get a tensorflow-hadoop version which matches the spark-tensorflow-connector version. Is that intentional (given that tensorflow-hadoop is on version 1.14.0, whereas spark-tensorflow-connector is on version 1.10.0)?

@WeichenXu123 (Contributor, Author)

@jkbradley
Yes, the project version is 1.10.0, so it depends on tensorflow-hadoop:1.10.0.

The default Maven repo only includes tensorflow-hadoop versions >= 1.11, so we need to build it from the hadoop directory first:

```
cd $PROJ_HOME/hadoop
mvn clean install  # build tensorflow-hadoop:1.10.0 and install into local repo

cd $PROJ_HOME/spark/spark-tensorflow-connector
mvn clean install
```
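To illustrate why the local install is needed: the connector pins tensorflow-hadoop to its own project version. A sketch of such a dependency declaration follows; whether the actual pom uses ${project.version} or a separate property is an assumption here.

```xml
<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>tensorflow-hadoop</artifactId>
  <!-- Resolves to 1.10.0 for this project, which is not on Maven Central,
       hence the mvn clean install in the hadoop directory first. -->
  <version>${project.version}</version>
</dependency>
```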

@jkbradley

Whoops, my bad, I did not realize it's in the same project and is a manually handled dependency. Thanks!

@jkbradley

Since this project's CI isn't running, I tested this PR locally. There may be some flakiness in the implementation or tests right now: I ran the tests once (mvn clean install) and hit the following failure, but when I ran them again (mvn test) they passed, and a third run (mvn clean install) passed as well.

Failure in LocalWriteSuite:

- should write data locally *** FAILED ***
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, c02w81rbhtd5.attlocal.net, executor driver): java.lang.IllegalStateException: LocalPath /var/folders/y_/_46df7ns1cn8dj_6hrs2fdxm0000gp/T/spark-connector-propagate2230735357410018221 already exists. SaveMode: ErrorIfExists.
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.writePartitionLocal(DefaultSource.scala:182)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.mapFun$1(DefaultSource.scala:212)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1(DefaultSource.scala:214)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1$adapted(DefaultSource.scala:214)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:889)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:889)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:127)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:455)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:458)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:1979)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1967)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1966)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1966)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:946)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:946)
  at scala.Option.foreach(Option.scala:407)
  ...
  Cause: java.lang.IllegalStateException: LocalPath /var/folders/y_/_46df7ns1cn8dj_6hrs2fdxm0000gp/T/spark-connector-propagate2230735357410018221 already exists. SaveMode: ErrorIfExists.
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.writePartitionLocal(DefaultSource.scala:182)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.mapFun$1(DefaultSource.scala:212)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1(DefaultSource.scala:214)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1$adapted(DefaultSource.scala:214)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:889)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:889)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  ...
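As an illustration of the failure mode, the check the trace points at behaves roughly as follows; this is a hypothetical sketch with made-up names, not the actual DefaultSource code:

```scala
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SaveMode

object LocalPathCheck {
  // Hypothetical helper; names are illustrative only.
  def checkLocalPath(localPath: String, mode: SaveMode): Unit = {
    if (mode == SaveMode.ErrorIfExists && Files.exists(Paths.get(localPath))) {
      // A directory left over from an earlier run, or created by another task
      // writing to the same local path, trips this check and aborts the task.
      throw new IllegalStateException(
        s"LocalPath $localPath already exists. SaveMode: $mode.")
    }
  }
}
```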

Also, one nit: the artifact name in the pom, spark-tensorflow-connector_2.11, should be updated to use the 2.12 suffix.

@jhseu (Contributor) commented Oct 15, 2019

I'm not opposed to this, but wouldn't it be better to wait until Spark 3.0.0 is released?

@mengxr (Contributor) commented Oct 15, 2019

@jhseu After we verify correctness, we can keep this PR open so there is less work for users who want to try out the Spark 3.0 preview with spark-tensorflow-connector.

@jhseu (Contributor) commented Oct 15, 2019

Yeah, I don't mind keeping this open.

@WeichenXu123 (Contributor, Author)

@jkbradley The flaky test is fixed; you could retest it. The pom artifact name is also updated to _2.12.

@mengxr (Contributor) commented Oct 16, 2019

@WeichenXu123 Could you explain the test flakiness? Is it related to the Spark 3.0 upgrade? If not, let's submit another PR so the fix can go in separately.

@WeichenXu123 (Contributor, Author)

@mengxr It is not related to Spark 3.0. I created a new PR with some explanation here: #144

@WeichenXu123 changed the title from "Change spark-tensorflow-connector dependency to be spark 3.0.0 snapshot" to "Change spark-tensorflow-connector dependency to be spark 3.0.0 preview" on Nov 10, 2019
@mengxr (Contributor) commented Mar 31, 2020

@jhseu If we do not plan to make a new release that is 2.4 compatible, shall we review and merge this PR?

@vikatskhay (Contributor)

Hi, we would like to use this library with Spark 2.4 and Scala 2.12.10. Would it be possible to support multiple versions via multiple Maven profiles? I should probably create an issue, but I wanted to ask here as well.
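For what it's worth, a minimal sketch of what such profiles could look like; the profile ids, property names, and version numbers here are hypothetical, not something already in this repo:

```xml
<!-- Hypothetical profiles; ids, property names, and versions are illustrative. -->
<profiles>
  <profile>
    <id>spark-2.4</id>
    <properties>
      <spark.version>2.4.5</spark.version>
      <scala.version>2.12.10</scala.version>
      <scala.binary.version>2.12</scala.binary.version>
    </properties>
  </profile>
  <profile>
    <id>spark-3.0</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <spark.version>3.0.0-preview2</spark.version>
      <scala.version>2.12.10</scala.version>
      <scala.binary.version>2.12</scala.binary.version>
    </properties>
  </profile>
</profiles>
```

Selecting a profile would then be, e.g., mvn clean install -Pspark-2.4.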

@kangnak commented Jun 22, 2020

Spark 3.0.0 is now released, and I think we also need https://mvnrepository.com/artifact/org.tensorflow/spark-tensorflow-connector_2.12 to be released.
