This is a Docker image built for Submarine development and quick-start testing.
Please note: do not use this image in a production environment. It is intended for testing only.
docker pull apache/submarine:mini-0.3.0
You may need a VPN if your network access is restricted.
1. Clone the source code of Submarine
git clone https://github.com/apache/submarine.git
2. Build Submarine
cd ./submarine
mvn clean install package -DskipTests
3. Build the mini-submarine image
To speed up the build, you can download these three archives in advance: zookeeper-3.4.14.tar.gz, hadoop-2.9.2.tar.gz, and spark-2.4.4-bin-hadoop2.7.tgz. Put them into "submarine/dev-support/mini-submarine/".
cd submarine/dev-support/mini-submarine/
./build_mini-submarine.sh
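If you pre-download the archives, a quick look at the build directory shows which of the three are still missing. The following is only a sketch: the function name missing_tarballs is made up for this example and is not part of the Submarine scripts, and it assumes the three file names listed in step 3.

```shell
# Hypothetical helper (not part of build_mini-submarine.sh):
# print the name of each expected archive that is absent from the given directory.
missing_tarballs() {
  dir="${1:-.}"
  for f in zookeeper-3.4.14.tar.gz hadoop-2.9.2.tar.gz spark-2.4.4-bin-hadoop2.7.tgz; do
    # -f is true only for an existing regular file
    [ -f "$dir/$f" ] || echo "$f"
  done
}
```

For example, running missing_tarballs submarine/dev-support/mini-submarine/ prints nothing when all three archives are already in place.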
When doing a release, the release manager might need to package the artifact candidates in this Docker image and publish the image candidate for a vote. In this scenario, we can do this:
Put the submarine candidate artifacts into a folder like "~/releases/submarine-release"
$ ls $release_candidates_path
submarine-dist-0.3.0-hadoop-2.9.tar.gz submarine-dist-0.3.0-src.tar.gz.asc
submarine-dist-0.3.0-hadoop-2.9.tar.gz.asc submarine-dist-0.3.0-src.tar.gz.sha512
submarine-dist-0.3.0-hadoop-2.9.tar.gz.sha512 submarine-dist-0.3.0-src.tar.gz
export submarine_version=0.3.0
export release_candidates_path=~/releases/submarine-release
./build_mini-submarine.sh
#docker run -it -h submarine-dev --net=bridge --privileged -P local/mini-submarine:0.3.0 /bin/bash
docker tag local/mini-submarine:0.3.0 apache/mini-submarine:0.3.0-RC0
docker push apache/mini-submarine:0.3.0-RC0
In the container, we can verify that the submarine jar version is the expected 0.3.0. Then we can upload this image with an "RC" tag for a vote.
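Before building a candidate image, it can help to confirm that every tarball in the release folder has its signature and checksum file next to it, as in the listing of $release_candidates_path. The sketch below checks only that the files exist. It does not verify the signatures or the checksums. The function name check_release_artifacts is hypothetical and not part of the Submarine tooling.

```shell
# Hypothetical helper: for every *.tar.gz in the candidate directory,
# report any missing .asc signature or .sha512 checksum file.
# Returns 0 when nothing is missing, 1 otherwise.
check_release_artifacts() {
  dir="$1"
  status=0
  for tarball in "$dir"/*.tar.gz; do
    # If the glob matched nothing, the pattern stays literal; skip it.
    [ -e "$tarball" ] || continue
    for ext in asc sha512; do
      if [ ! -f "$tarball.$ext" ]; then
        echo "missing: $(basename "$tarball").$ext"
        status=1
      fi
    done
  done
  return "$status"
}
```

A possible use is check_release_artifacts "$release_candidates_path" && ./build_mini-submarine.sh, so that the build only starts when the folder is complete.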
docker run -it -h submarine-dev --name mini-submarine --net=bridge --privileged -P local/mini-submarine:0.4.0-SNAPSHOT /bin/bash
# In the container, use root user to bootstrap hdfs and yarn
/tmp/hadoop-config/bootstrap.sh
# Two commands to check whether yarn and hdfs are running as expected
yarn node -list -showDetails
If you pull the image directly, please replace "local/mini-submarine:0.4.0-SNAPSHOT" with "apache/submarine:mini-0.3.0".
Total Nodes:1
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
submarine-dev:35949 RUNNING submarine-dev:8042 0
Detailed Node Information :
Configured Resources : <memory:8192, vCores:16, nvidia.com/gpu: 1>
Allocated Resources : <memory:0, vCores:0>
Resource Utilization by Node : PMem:4144 MB, VMem:4189 MB, VCores:0.25308025
Resource Utilization by Containers : PMem:0 MB, VMem:0 MB, VCores:0.0
Node-Labels :
hdfs dfs -ls /user
drwxr-xr-x - yarn supergroup 0 2019-07-22 07:59 /user/yarn
- Set up the MySQL/MariaDB server
Because MySQL and MariaDB are licensed under the GPL, the image does not contain any MySQL binaries. You need to run the following script manually to install the server.
/tmp/hadoop-config/setup-mysql.sh
You can then run the command mysql -uroot to log in to MySQL/MariaDB.
- Start submarine server
su yarn
/opt/submarine-current/bin/submarine-daemon.sh start getMysqlJar
- Log in to the submarine workbench
Run the following command on your host machine to get the access URL of the submarine workbench running in Docker:
echo "http://localhost:$(docker inspect --format='{{(index (index .NetworkSettings.Ports "8080/tcp") 0).HostPort}}' mini-submarine)"
Open the URL returned by the command (for example, http://localhost:32819) in a browser. The username and initial password of the workbench are both admin.
su yarn
cd /home/yarn/submarine/
# run TF 1 distributed training job
./run_submarine_mnist_tony.sh
# run TF 2 distributed training job
./run_submarine_mnist_tf2_tony.sh
When run_submarine_mnist_tony.sh is executed, the MNIST data is downloaded from the default URL (google mnist). If that URL is inaccessible, you can use the "-d" parameter to specify a custom URL. For example, if you are in mainland China, you can use the following command:
./run_submarine_mnist_tony.sh -d http://yann.lecun.com/exdb/mnist/
Submarine server is designed to manage the job lifecycle. Clients submit job parameters or a yaml file to the submarine server instead of submitting jobs directly, and the submarine server handles the rest of the work.
Set submarine.server.rpc.enabled to true in the file /opt/submarine-current/conf/submarine-site
<property>
<name>submarine.server.rpc.enabled</name>
<value>true</value>
<description>Run jobs using rpc server.</description>
</property>
Run the following command to submit a job via submarine server
./run_submarine_mnist_tony_rpc.sh
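If the job is still submitted directly rather than through the server, it is worth checking that the property was actually saved as true. The following is a rough sketch, not an official check: the function name rpc_enabled is hypothetical, and it assumes the <name> and <value> elements sit on adjacent lines, as in the property snippet above.

```shell
# Hypothetical helper: succeed only if submarine.server.rpc.enabled is set to true
# in the given configuration file.
# Assumes <value> is on the line directly after <name>.
rpc_enabled() {
  # -F matches the strings literally; -A1 also prints the line after each match.
  grep -F -A1 '<name>submarine.server.rpc.enabled</name>' "$1" \
    | grep -q -F '<value>true</value>'
}
```

Pass the path of the configuration file you edited as the only argument. The function prints nothing and reports the result through its exit status, so it can be used as rpc_enabled <conf-file> && ./run_submarine_mnist_tony_rpc.sh.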
- Run the container with your source code mounted. You can also use "docker cp" to copy the file into an existing running container.
docker run -it -h submarine-dev --net=bridge --privileged -v pathToMyScript.py:/home/yarn/submarine/myScript.py local/hadoop-docker:submarine /bin/bash
- Refer to run_submarine_mnist_tony.sh and modify it to run your script.
- Try to run it. Since this is a single-node environment, keep in mind that the workers could conflict with each other. For instance, the mnist_distributed.py example has a workaround for the conflict that occurs when two workers use the same "data_dir" to download the data set.
You can follow these instructions to update the submarine container with your own modified and compiled submarine package.
cd submarine-project-dir/
mvn clean install package -DskipTests
docker cp submarine-all/target/submarine-all-<SUBMARINE_VERSION>-hadoop-<HADOOP_VERSION>.jar <container-id>:/tmp/
cd /home/yarn/submarine
vi run_customized_submarine-all_mnist.sh
# Need to modify environment variables based on hadoop and submarine version numbers
SUBMARINE_VERSION=<submarine-version-number>
HADOOP_VERSION=<hadoop-version-number> # default 2.9
cd /home/yarn/submarine
./run_customized_submarine-all_mnist.sh
When using mini-submarine, you can debug the submarine client, applicationMaster and executor for troubleshooting.
Run the following command to start mini-submarine.
docker run -it -P -h submarine-dev --net=bridge --expose=8000 --privileged local/mini-submarine:0.4.0-SNAPSHOT /bin/bash
Debug submarine client with the parameter "--debug"
./run_submarine_mnist_tony.sh --debug
Port 8000 is used for client debugging inside mini-submarine. You need to find the debug port mapping between mini-submarine and the host on which mini-submarine runs.
docker port <SUBMARINE_CONTAINER_ID>
For example, we can get some info like this
8000/tcp -> 0.0.0.0:32804
Then port 32804 can be used for remote debug.
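When the lookup has to be repeated for several debug ports, the host port can be pulled out of the docker port output with a small filter. This is a sketch under the assumption that the output lines have the "8000/tcp -> 0.0.0.0:32804" shape shown in the example. The function name host_debug_port is hypothetical.

```shell
# Hypothetical helper: read `docker port` output on stdin and print the host
# port mapped to the given container port (default 8000).
host_debug_port() {
  container_port="${1:-8000}"
  # Lines look like "8000/tcp -> 0.0.0.0:32804".
  # Field 3 is the host address; keep what follows its last colon.
  awk -v p="$container_port/tcp" \
    '$1 == p { n = split($3, parts, ":"); print parts[n]; exit }'
}
```

It could be used as docker port <SUBMARINE_CONTAINER_ID> | host_debug_port 8000, and the same filter works for ports 8001 and 8002 when debugging the applicationMaster or an executor.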
Run the following command to start mini-submarine.
docker run -it -P -h submarine-dev --net=bridge --expose=8001 --privileged local/mini-submarine:0.4.0-SNAPSHOT /bin/bash
Add the following configuration in the file /usr/local/hadoop/etc/hadoop/tony.xml.
<property>
<name>tony.task.am.jvm.opts</name>
<value>-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8001</value>
</property>
You can use run_submarine_mnist_tony.sh to submit a job. Port 8001 is used for AM debugging in mini-submarine. The debug port mapping can be obtained in the same way as described for debugging the submarine client.
Run the following command to start mini-submarine.
docker run -it -P -h submarine-dev --net=bridge --expose=8002 --privileged local/mini-submarine:0.4.0-SNAPSHOT /bin/bash
Add the following configuration in the file /usr/local/hadoop/etc/hadoop/tony.xml.
<property>
<name>tony.task.executor.jvm.opts</name>
<value>-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8002</value>
</property>
Port 8002 is used for executor debugging in mini-submarine. To avoid port conflicts, you need to use only one executor, which means the parameters of the submarine job should look like this:
--num_workers 1 \
--num_ps 0 \
You can get the debug port mapping in the same way as described for debugging the submarine client.
You can also run a distributedShell job in mini-submarine.
cd && ./yarn-ds-docker.sh
Spark jobs are supported as well.
cd && cd spark-script && ./run_spark.sh
- Submarine package name error
Because the package name of submarine 0.3.0 and higher has been changed from apache.hadoop.yarn.submarine to apache.submarine, you need to set the runtime settings in the /usr/local/hadoop/etc/hadoop/submarine-site.xml file.
<configuration>
<property>
<name>submarine.runtime.class</name>
<value>org.apache.submarine.server.submitter.yarn.YarnRuntimeFactory</value>
</property>
</configuration>