Bluefog Docker Test Notes

YBC edited this page Mar 12, 2020 · 24 revisions

Run on Single Machine

$ sudo docker run -it --gpus all bluefog:latest
root@Gyes:/examples# horovodrun -np 4 -H localhost:4 python ./blue-fog-examples/pytorch_cifar10_resnet.py {--no-bluefog}

Add the optional argument --no-bluefog to disable Bluefog and use Horovod instead.
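A minimal sketch of the two variants, reading the braces in the command above as marking an optional flag. The helper name `train_cmd` and its `horovod` argument are hypothetical, introduced only for illustration; the function prints the command rather than running it, so it can be inspected first.

```shell
# train_cmd is a hypothetical helper (not part of Bluefog or Horovod).
# It only prints the command; pass "horovod" to append --no-bluefog.
train_cmd() {
  cmd="horovodrun -np 4 -H localhost:4 python ./blue-fog-examples/pytorch_cifar10_resnet.py"
  if [ "${1:-bluefog}" = "horovod" ]; then
    # Fall back to Horovod instead of Bluefog.
    cmd="$cmd --no-bluefog"
  fi
  printf '%s\n' "$cmd"
}
```

Calling `train_cmd` prints the Bluefog command and `train_cmd horovod` prints the Horovod variant; either line can then be pasted into the container shell.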

Run for testing under Docker Container (GPU)

For easier testing in the Docker environment, it is better to mount the host directory into the Docker container. To build the test Docker image (you may not need to run this unless it is the first time):

$ sudo docker build -t bluefog:devel . -f dockerfile.gpu.test

Run the following command from the root folder of the repository to mount the bluefog folder:

$ sudo docker run -it --gpus all --name devtest \
   --mount type=bind,source="$(pwd)",target=/bluefog bluefog:devel

Remember to remove the devtest container when you no longer need it:

$ sudo docker container rm devtest

Run for testing under Docker Container (CPU)

The CPU version takes a similar approach to the GPU version:

$ sudo docker build -t bluefog_cpu:devel . -f dockerfile.cpu.test

Run the following command from the root folder of the repository to mount the bluefog folder:

$ sudo docker run -it --name devtest --mount type=bind,source="$(pwd)",target=/bluefog \
   bluefog_cpu:devel

Remember to remove the devtest container when you no longer need it:

$ sudo docker container rm devtest
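As an alternative to removing the container by hand, `docker run` accepts `--rm`, which deletes the container when it exits and frees the name devtest for the next run. The sketch below wraps the GPU and CPU commands from this page with that flag. The helper name `run_devtest` and the `DOCKER` override variable are assumptions added for illustration, not part of Bluefog.

```shell
# DOCKER is a hypothetical override: set DOCKER=echo to preview the
# command without actually starting a container.
DOCKER="${DOCKER:-sudo docker}"

# run_devtest is a hypothetical helper. $1 is "gpu" (default) or "cpu".
# --rm asks Docker to delete the container when it exits, so no manual
# "docker container rm devtest" is needed afterwards.
run_devtest() {
  if [ "${1:-gpu}" = "cpu" ]; then
    $DOCKER run -it --rm --name devtest \
      --mount type=bind,source="$(pwd)",target=/bluefog bluefog_cpu:devel
  else
    $DOCKER run -it --rm --gpus all --name devtest \
      --mount type=bind,source="$(pwd)",target=/bluefog bluefog:devel
  fi
}
```

Run `run_devtest` or `run_devtest cpu` from the root folder of the repository, as with the manual commands. The trade-off is that anything written inside the container outside the /bluefog mount is lost on exit.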

Run on Multiple Machines

On the slave machine:

$ sudo docker run -it --gpus all --network=host -v /mnt/share/ssh:/root/.ssh \
   --mount type=bind,source="$(pwd)",target=/bluefog bluefog:devel
$ /usr/sbin/sshd -p 40000

On the master machine:

$ sudo docker run -it --gpus all --network=host -v /mnt/share/ssh:/root/.ssh  \
   --mount type=bind,source="$(pwd)",target=/bluefog bluefog:devel
$ ssh labg -p 40000 date
$ horovodrun -np 8 -H localhost:4,labg:4 -p 40000 python examples/pytorch_mnist.py {--epochs=3} {--no-bluefog}
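The manual `ssh labg -p 40000 date` check can be scripted when more hosts are involved. The sketch below is a rough preflight under stated assumptions: `check_hosts`, `total_slots`, and the `SSH` override variable are hypothetical names introduced here. It also assumes that the `-np` value should equal the sum of the per-host slots, as it does in the example above (4 + 4 = 8).

```shell
# SSH is a hypothetical override so the check can be stubbed out when testing.
SSH="${SSH:-ssh}"

# check_hosts PORT HOST...: run `date` on every host, like the manual check.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_hosts() {
  port=$1
  shift
  for host in "$@"; do
    if ! $SSH -o BatchMode=yes -p "$port" "$host" date >/dev/null; then
      echo "cannot reach $host on port $port" >&2
      return 1
    fi
  done
}

# total_slots HOSTS: sum the slot counts in a "host:slots,host:slots" list.
total_slots() {
  total=0
  old_ifs=$IFS
  IFS=,
  for entry in $1; do
    # Keep only the text after the last ':' (the slot count).
    total=$((total + ${entry##*:}))
  done
  IFS=$old_ifs
  printf '%s\n' "$total"
}

# Example use inside the master container:
#   hosts=localhost:4,labg:4
#   check_hosts 40000 labg && \
#     horovodrun -np "$(total_slots "$hosts")" -H "$hosts" -p 40000 python examples/pytorch_mnist.py
```

Failing early on an unreachable host is easier to diagnose than a launch that hangs while waiting for a worker.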