TensorFlow on Hadoop YARN #33

Open
tslam75 opened this issue Feb 16, 2017 · 11 comments

Comments

@tslam75

tslam75 commented Feb 16, 2017

Hadoop YARN is a commonly deployed cluster manager. The ability to run TensorFlow on YARN would be very useful in such environments.

Our team is currently working on a YARN application for this purpose, and would like to contribute our work here. We will provide more details of our contribution soon.

-Jason

@jhseu
Contributor

jhseu commented Feb 16, 2017

Thanks! That'd be really useful.

@zhe-thoughts

@tslam75 Have you looked at https://issues.apache.org/jira/browse/YARN-6043?

@tslam75
Author

tslam75 commented Feb 22, 2017

@zhe-thoughts Thanks for the reference! I looked over YARN-6043; both it and our approach use a native application master for TensorFlow.

Attaching a design document here now. We also have an implementation based on this design, and will publish the code soon.

TensorFlow_on_YARN.pdf

@tslam75
Author

tslam75 commented Mar 21, 2017

Sorry for the delay.

Created pull request #39 while waiting for the CLA to be signed.

@IDerr

IDerr commented Apr 11, 2017

Awesome job :O

@leftnoteasy

In Hadoop 3.0, YARN native services can support running TensorFlow services on YARN without adding any dependencies or implementing a new YARN application master.

Please see our blogpost: https://hortonworks.com/blog/distributed-tensorflow-assembly-hadoop-yarn/ and let me know if you have any questions. Thanks!

@tbchj

tbchj commented Apr 21, 2017

focus ...

@butterluo

@tslam75 Does your 'TensorFlow on YARN' support fault tolerance? If so, how?

@zhanglistar

mark

@oliverhu

oliverhu commented Feb 1, 2018

+1

@zhe-thoughts

We (the LinkedIn Hadoop team) just open-sourced TonY:
Repo: https://github.com/linkedin/TonY
Blog post: https://engineering.linkedin.com/blog/2018/09/open-sourcing-tony--native-support-of-tensorflow-on-hadoop

Comments / discussions very welcome!

9 participants