changes to _runner.py to make .launch() more resilient to non-standard environments #18

Open
wants to merge 1 commit into
base: master

seanjensengrey

Hello Brandyn,

I chatted with amiller on IRC and told him I would be sending a pull request your way with my modifications. If you don't accept them, I'm cool with that. They work wonderfully in my environment:

CentOS 5.5
Python 2.6.6 installed in an NFS-mounted user directory

The biggest problem you might have is that I turned off the default typedbytes and sequencefile settings. The way I resolve the Python interpreter path helps in an environment where one isn't using the system Python. The find command now traverses symlinks, and if /usr/lib/hadoop exists I short-circuit to that path. If the search for streaming.jar doesn't find it, an exception is raised.

Sean

Parameters to launch() modified to be more noob-friendly.

 * single quote for shell safety, double quote for strings
    with possible $ expansion
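
The quoting rule above can be sketched as follows. This is a minimal illustration, not the code in this pull request: `shell_literal` and `shell_expandable` are hypothetical names, and the sketch uses `shlex.quote` from Python 3 (older Python 2 releases expose a similar function as `pipes.quote`).

```python
import shlex


def shell_literal(value):
    """Single-quote value so the shell passes it through verbatim.

    shlex.quote leaves strings made only of safe characters unquoted.
    """
    return shlex.quote(value)


def shell_expandable(value):
    """Double-quote value so $VARS still expand in the shell.

    Backslash, double quote and backtick are escaped; $ is deliberately
    left alone so that expansion can happen.
    """
    escaped = value
    for ch in ('\\', '"', '`'):  # backslash must be handled first
        escaped = escaped.replace(ch, '\\' + ch)
    return '"%s"' % escaped


print(shell_literal("my file$1.txt"))
print(shell_expandable("$HADOOP_HOME/bin/hadoop"))
```

Single quotes are the safer default; double quotes only make sense when the caller actually wants the shell to expand a variable.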

== streaming.jar search ==

 * Finding streaming.jar is more resilient to paths with symlinks.
 * switched to running find in a subshell
 * warns user if HADOOP_HOME is not set
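
The search behaviour described in this section might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `_runner.py`: the function name `find_streaming_jar`, the jar name pattern, and the order in which the roots are tried are all assumptions, and it walks the tree with `os.walk(followlinks=True)` instead of shelling out to `find` as the pull request does.

```python
import fnmatch
import os
import warnings


def find_streaming_jar(default_root="/usr/lib/hadoop", environ=None):
    """Hypothetical helper: return the path of a Hadoop streaming jar."""
    environ = os.environ if environ is None else environ
    hadoop_home = environ.get("HADOOP_HOME")
    if hadoop_home is None:
        warnings.warn("HADOOP_HOME is not set")

    # Assumption: if the default root exists, short-circuit to it;
    # otherwise fall back to HADOOP_HOME when it is set.
    if os.path.isdir(default_root):
        roots = [default_root]
    elif hadoop_home:
        roots = [hadoop_home]
    else:
        roots = []

    for root in roots:
        # followlinks=True descends into symlinked directories.
        for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
            for name in sorted(filenames):
                # Assumed name pattern for the streaming jar.
                if fnmatch.fnmatch(name, "hadoop*streaming*.jar"):
                    return os.path.join(dirpath, name)

    # Fail loudly instead of returning an empty path.
    raise IOError("streaming jar not found under: %r" % (roots,))
```

Following symlinks can loop forever if a link points back at one of its own parent directories, so a production version would want a depth limit or a set of visited directories.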

== changes in hadoopy.launch() ==

 * use_typedbytes=False, use_seqoutput=False,
 * use_autoinput=True
 * add_python=False
 * python_cmd=None, if you pass in python bin path, be explicit

If you specify add_python=True, it will use sys.executable. If you
need to override sys.executable, use python_cmd='path/to/python'.
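
As a rough sketch of that resolution rule: the helper name `resolve_python_cmd` is hypothetical, and it assumes that `python_cmd` is only consulted when `add_python=True`, which the description above does not state outright.

```python
import sys


def resolve_python_cmd(add_python=False, python_cmd=None):
    """Hypothetical helper: pick the interpreter to prefix the script with.

    Returns None when no interpreter should be added.
    """
    if not add_python:
        return None
    # An explicit python_cmd wins; otherwise use the running interpreter,
    # which may be a non-system install (e.g. one in an NFS home directory).
    return python_cmd if python_cmd is not None else sys.executable


print(resolve_python_cmd(add_python=True))
print(resolve_python_cmd(add_python=True, python_cmd="path/to/python"))
```

Using sys.executable means the job runs under the same interpreter that launched it, which is what makes a non-system Python work without extra configuration.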

The changes to _runner.py.launch() should make it a little more friendly
out of the box. If you need more advanced features you can
turn those on with the above named parameters.
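
To illustrate turning features back on, here is a small stand-in that only models the keyword defaults listed above. `launch_options` and `PR_LAUNCH_DEFAULTS` are hypothetical names for this sketch and are not part of hadoopy.

```python
# Defaults as listed in this pull request; the helper itself is hypothetical.
PR_LAUNCH_DEFAULTS = {
    "use_typedbytes": False,
    "use_seqoutput": False,
    "use_autoinput": True,
    "add_python": False,
    "python_cmd": None,
}


def launch_options(**overrides):
    """Merge caller overrides into the defaults, rejecting unknown names."""
    unknown = set(overrides) - set(PR_LAUNCH_DEFAULTS)
    if unknown:
        raise TypeError("unknown launch option(s): %s" % ", ".join(sorted(unknown)))
    options = dict(PR_LAUNCH_DEFAULTS)
    options.update(overrides)
    return options


# Opt back in to the advanced features explicitly.
advanced = launch_options(use_typedbytes=True, use_seqoutput=True)
```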

The changes to how the Python interpreter is located make it
possible to easily integrate non-system Python installs (like
Python 2.6 running on CentOS 5.5).

Tested on:
 * Python 2.6.6 x86_64
 * Hadoop 0.20.2+737