
Change spark-tensorflow-connector dependency to be spark 3.0.0 preview #141

Open · wants to merge 6 commits into master
Conversation

@WeichenXu123 (Contributor) commented Oct 9, 2019

Change spark-tensorflow-connector to be spark-3.0.0-preview2
Test:

```
cd $PROJ_HOME/hadoop
mvn clean install  # build tensorflow-hadoop:1.10.0 and install into local repo

cd $PROJ_HOME/spark/spark-tensorflow-connector
mvn clean install
```
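For reference, a minimal sketch of the dependency change involved: the coordinates for the Spark 3.0.0-preview2 release and the Scala 2.12 artifact suffix are shown, but the property names and pom layout here are assumptions, and the pom in this PR is authoritative.

```xml
<!-- Sketch only: property names are assumptions; see the actual pom changes in this PR. -->
<properties>
  <spark.version>3.0.0-preview2</spark.version>
  <scala.binary.version>2.12</scala.binary.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

With something like that in place, the two builds above first install tensorflow-hadoop into the local repo and then build the connector against the Spark 3.0.0-preview2 artifacts.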

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

```diff
@@ -86,7 +87,7 @@
       <configuration>
         <recompileMode>incremental</recompileMode>
         <useZincServer>true</useZincServer>
-        <scalaVersion>${scala.binary.version}</scalaVersion>
+        <scalaVersion>${scala.version}</scalaVersion>
```
@WeichenXu123 (Contributor, Author) commented:

We need to specify the full Scala version ("2.12.10") instead of the binary version ("2.12") here; otherwise it causes a compatibility issue inside the Maven Scala plugin.
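For illustration, assuming the build uses scala-maven-plugin (net.alchim31.maven), which the recompileMode/useZincServer options suggest, the relevant properties and plugin configuration would look roughly like this. This is a sketch: the property values are taken from the comment above, and the surrounding pom structure is assumed.

```xml
<properties>
  <!-- Full version for the compiler plugin; binary version for artifact suffixes. -->
  <scala.version>2.12.10</scala.version>
  <scala.binary.version>2.12</scala.binary.version>
</properties>

<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <configuration>
    <recompileMode>incremental</recompileMode>
    <useZincServer>true</useZincServer>
    <!-- Must be the full version (2.12.10), not the binary version (2.12). -->
    <scalaVersion>${scala.version}</scalaVersion>
  </configuration>
</plugin>
```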

@WeichenXu123 (Contributor, Author)

@googlebot I signed it!

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot added cla: yes and removed cla: no labels on Oct 9, 2019

@WeichenXu123 (Contributor, Author)

@jhseu Could you help review it? Thanks!
Just run mvn clean install under the spark/spark-tensorflow-connector directory to verify the PR's correctness.
Btw, why is there no Jenkins test?

@jkbradley

When I try to build this, I'm hitting:

[ERROR] Failed to execute goal on project spark-tensorflow-connector_2.11: Could not resolve dependencies for project org.tensorflow:spark-tensorflow-connector_2.11:jar:1.10.0: Could not find artifact org.tensorflow:tensorflow-hadoop:jar:1.10.0 in central (https://repo.maven.apache.org/maven2) -> [Help 1]

It looks like this tries to get a tensorflow-hadoop version which matches the spark-tensorflow-connector version. Is that intentional (given that tensorflow-hadoop is on version 1.14.0, whereas spark-tensorflow-connector is on version 1.10.0)?

@WeichenXu123 (Contributor, Author)

@jkbradley
Yes, the project version is 1.10.0, so it depends on tensorflow-hadoop:1.10.0.

The default Maven repo only includes tensorflow-hadoop versions >= 1.11, so we need to build it from the hadoop directory first:

```
cd $PROJ_HOME/hadoop
mvn clean install  # build tensorflow-hadoop:1.10.0 and install into local repo

cd $PROJ_HOME/spark/spark-tensorflow-connector
mvn clean install
```
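To illustrate why the local install is needed: the connector pins tensorflow-hadoop to its own project version. A sketch of such a dependency declaration follows; whether the actual pom uses ${project.version} or a separate property is an assumption here.

```xml
<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>tensorflow-hadoop</artifactId>
  <!-- Resolves to 1.10.0 for this project, which is not on Maven Central,
       hence the mvn clean install in the hadoop directory first. -->
  <version>${project.version}</version>
</dependency>
```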

@jkbradley

Whoops, my bad, I did not realize it's in the same project and is a manually handled dependency. Thanks!

@jkbradley

Since this project's CI isn't running, I tested this PR locally. There may be some flakiness in the implementation or tests right now: I ran the tests once (mvn clean install) and hit the following failure, but when I ran them again (mvn test) they passed, and a third run (mvn clean install) passed as well.

Failure in LocalWriteSuite:

- should write data locally *** FAILED ***
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, c02w81rbhtd5.attlocal.net, executor driver): java.lang.IllegalStateException: LocalPath /var/folders/y_/_46df7ns1cn8dj_6hrs2fdxm0000gp/T/spark-connector-propagate2230735357410018221 already exists. SaveMode: ErrorIfExists.
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.writePartitionLocal(DefaultSource.scala:182)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.mapFun$1(DefaultSource.scala:212)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1(DefaultSource.scala:214)
	at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1$adapted(DefaultSource.scala:214)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:889)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:889)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:127)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:455)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:458)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:1979)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1967)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1966)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1966)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:946)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:946)
  at scala.Option.foreach(Option.scala:407)
  ...
  Cause: java.lang.IllegalStateException: LocalPath /var/folders/y_/_46df7ns1cn8dj_6hrs2fdxm0000gp/T/spark-connector-propagate2230735357410018221 already exists. SaveMode: ErrorIfExists.
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.writePartitionLocal(DefaultSource.scala:182)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.mapFun$1(DefaultSource.scala:212)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1(DefaultSource.scala:214)
  at org.tensorflow.spark.datasources.tfrecords.DefaultSource$.$anonfun$writePartitionLocalFun$1$adapted(DefaultSource.scala:214)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:889)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:889)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  ...
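As an illustration of the failure mode, the check the trace points at behaves roughly as follows; this is a hypothetical sketch with made-up names, not the actual DefaultSource code:

```scala
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SaveMode

object LocalPathCheck {
  // Hypothetical helper; names are illustrative only.
  def checkLocalPath(localPath: String, mode: SaveMode): Unit = {
    if (mode == SaveMode.ErrorIfExists && Files.exists(Paths.get(localPath))) {
      // A directory left over from an earlier run, or created by another task
      // writing to the same local path, trips this check and aborts the task.
      throw new IllegalStateException(
        s"LocalPath $localPath already exists. SaveMode: $mode.")
    }
  }
}
```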

Also, one nit: the artifact name in the pom, spark-tensorflow-connector_2.11, should be updated to use the 2.12 suffix.

@jhseu (Contributor) commented Oct 15, 2019

I'm not opposed to this, but wouldn't it be better to wait until Spark 3.0.0 is released?

@mengxr (Contributor) commented Oct 15, 2019

@jhseu After we verify correctness, we can keep this PR open so there is less work for users who want to try out the Spark 3.0 preview with spark-tensorflow-connector.

@jhseu (Contributor) commented Oct 15, 2019

Yeah, I don't mind keeping this open.

@WeichenXu123 (Contributor, Author)

@jkbradley The flaky test is fixed; you could retest it. The pom artifact name is also updated to _2.12.

@mengxr (Contributor) commented Oct 16, 2019

@WeichenXu123 Could you explain the test flakiness? Is it related to the Spark 3.0 upgrade? If not, let's submit another PR so the fix can go in separately.

@WeichenXu123 (Contributor, Author)

@mengxr It is not related to Spark 3.0. I created a new PR with some explanation here: #144

@WeichenXu123 changed the title from "Change spark-tensorflow-connector dependency to be spark 3.0.0 snapshot" to "Change spark-tensorflow-connector dependency to be spark 3.0.0 preview" on Nov 10, 2019
@mengxr (Contributor) commented Mar 31, 2020

@jhseu If we do not plan to make a new release that is 2.4 compatible, shall we review and merge this PR?

@vikatskhay (Contributor)

Hi, we would like to use this library with Spark 2.4 and Scala 2.12.10. Would it be possible to support multiple versions via multiple Maven profiles? I should probably create an issue, but I wanted to ask here as well.
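For what it's worth, a minimal sketch of what such profiles could look like; the profile ids, property names, and version numbers here are hypothetical, not something already in this repo:

```xml
<!-- Hypothetical profiles; ids, property names, and versions are illustrative. -->
<profiles>
  <profile>
    <id>spark-2.4</id>
    <properties>
      <spark.version>2.4.5</spark.version>
      <scala.version>2.12.10</scala.version>
      <scala.binary.version>2.12</scala.binary.version>
    </properties>
  </profile>
  <profile>
    <id>spark-3.0</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <spark.version>3.0.0-preview2</spark.version>
      <scala.version>2.12.10</scala.version>
      <scala.binary.version>2.12</scala.binary.version>
    </properties>
  </profile>
</profiles>
```

Selecting a profile would then be, e.g., mvn clean install -Pspark-2.4.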

@kangnak commented Jun 22, 2020

Spark 3.0.0 is now released, and I think we also need https://mvnrepository.com/artifact/org.tensorflow/spark-tensorflow-connector_2.12 to be released.
