We appreciate all contributions. If you are planning to contribute a bug fix for an open issue, please comment on the thread and we will be happy to provide guidance. You are welcome to pick issues with the "good first issue" and "help wanted" labels to get started.
If you plan to contribute new features or extensions to this repository, first open an issue and discuss the feature with us. Sending a PR without discussion might result in a rejected PR, because we might be taking the repository in a different direction.
We recommend you use our prebuilt Docker image to start your development work using either VS Code or a local container:
- Create an empty directory for your workspace on your development host. These instructions assume you are using a remote host and are connecting to it over SSH.
- Clone PyTorch, TorchVision, and PyTorch/XLA into your workspace directory:
git clone --recursive --depth=1 https://github.com/pytorch/pytorch.git
# Install TorchVision if you need to run tests that involve vision modules
git clone --recursive --depth=1 https://github.com/pytorch/vision.git
# Clone with HTTPS if you use a GitHub personal access token
git clone https://github.com/pytorch/xla.git pytorch/xla
# Or clone with SSH if you prefer:
git clone git@github.com:pytorch/xla.git pytorch/xla
- Create links to VS Code configuration files in your workspace directory:
ln -s pytorch/xla/.devcontainer/ .devcontainer
ln -s pytorch/xla/contrib/vscode/ .vscode
ln -s pytorch/xla/.style.yapf .style.yapf
ln -s pytorch/xla/.clang-format .clang-format
- Start VS Code and ensure you have the Remote Development Extension Pack installed. It includes the Remote - SSH and Dev Containers extensions.
- From VS Code, connect to your remote host and open your workspace directory. You will be prompted to reopen your workspace in a container. Choose the container that matches your local accelerator; use `tpu-contributor` if you are unsure which to use. If you are not prompted, open the VS Code command palette and type `Dev Containers: Reopen in Container` to open your workspace in one of our pre-built Docker containers.
- Open a new terminal window in VS Code. Since you are running as root in this container, mark the repository directories as safe. The commands below assume your workspace directory is `torch`; update them to use your workspace directory.
git config --global --add safe.directory /workspaces/torch/pytorch
git config --global --add safe.directory /workspaces/torch/pytorch/xla
git config --global --add safe.directory /workspaces/torch/vision
- In the terminal window, run the following commands to build PyTorch, TorchVision, and PyTorch/XLA:
cd pytorch
# pytorch/xla requires the PyTorch wheel to be present under pytorch/dist
python setup.py bdist_wheel
python setup.py install
cd ../vision
python setup.py develop
cd ../pytorch/xla
python setup.py develop
# Optional: if you're using TPU, install libtpu
pip install torch_xla[tpu] \
-f https://storage.googleapis.com/libtpu-wheels/index.html \
-f https://storage.googleapis.com/libtpu-releases/index.html
- If you are running on a TPU VM, ensure `torch` and `torch_xla` were built and installed correctly:
python -c 'import torch_xla as xla; print(xla.device())'
# Output: xla:0
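As a complement to the one-liner above, the following minimal sketch checks whether both packages can be located before any device is touched. It uses only the standard library; the helper name `check_installed` is hypothetical and not part of PyTorch/XLA. It only looks the packages up, so it cannot confirm that the build itself is healthy.

```python
import importlib.util


def check_installed(packages=("torch", "torch_xla")):
    """Return a {package: bool} map saying whether each package can be found.

    Hypothetical helper for illustration. It looks each package up without
    importing it, so a broken build can still show up as "found".
    """
    return {
        name: importlib.util.find_spec(name) is not None for name in packages
    }


if __name__ == "__main__":
    for name, found in check_installed().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either package is reported as missing, rebuilding it with the commands in the previous step is the likely next move.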
Subsequent builds: after building the packages from source for the first time, you may need to rebuild everything, for example after a `git pull`. You can run `scripts/build_developer.sh`, which rebuilds PyTorch, TorchVision, and PyTorch/XLA.
- Set up the development Docker image:
docker pull us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/development:tpu
docker run --privileged --name ptxla -it -d -e "TERM=xterm-256color" us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/development:tpu
docker exec --privileged -it ptxla /bin/bash
All of the commands below are assumed to run inside this Docker container.
- Clone the PyTorch repo as per instructions:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch/
- Clone the PyTorch/XLA repo:
git clone --recursive https://github.com/pytorch/xla.git
- Build PyTorch:
# pytorch/xla requires the PyTorch wheel to be present under pytorch/dist
python setup.py bdist_wheel
python setup.py develop
- Build PyTorch/XLA:
cd xla/
python setup.py develop
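Because the PyTorch/XLA build depends on a PyTorch wheel being present under pytorch/dist, it can help to confirm that the wheel exists before building. The sketch below is one way to do that, assuming the wheel file name starts with `torch-` and that the script runs from the directory containing the `pytorch` checkout. The helper name `find_torch_wheels` is hypothetical.

```python
from pathlib import Path


def find_torch_wheels(pytorch_dir):
    """List wheel files under <pytorch_dir>/dist, sorted by name.

    The "torch-*.whl" pattern is an assumption about the wheel file name.
    """
    dist = Path(pytorch_dir) / "dist"
    if not dist.is_dir():
        return []
    return sorted(dist.glob("torch-*.whl"))


if __name__ == "__main__":
    wheels = find_torch_wheels("pytorch")
    if wheels:
        print("Found:", ", ".join(w.name for w in wheels))
    else:
        print("No wheel under pytorch/dist; run python setup.py bdist_wheel first.")
```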
Please refer to this guide.
In the `pytorch/xla` repo we enforce coding style for both C++ and Python files. Please format your code before submitting a pull request.
`pytorch/xla` uses `clang-format-11` with a customized style config.
If your PR touches C++ source files, run the following command before submitting it.
# How to install: sudo apt install clang-format-11
# If your PR only changes foo.cpp, run the following in xla/ folder
clang-format-11 -i -style=file /PATH/TO/foo.cpp
# To format all cpp files, run the following in xla/ folder
find -name '*.cpp' -o -name '*.h' -o -name '*.cc' | xargs clang-format-11 -i -style=file
`pytorch/xla` uses `yapf` (specifically version 0.30.0, in case newer versions are not backward compatible) with a customized style config.
If your PR touches Python source files, run the following command before submitting it.
# How to install: pip install yapf==0.30.0
yapf --recursive -i *.py test/ scripts/ torch_xla/ benchmarks/
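When a PR touches both C++ and Python files, choosing the formatter by file extension can be scripted. The following is a minimal sketch, not part of the repository: `format_command` is a hypothetical helper that maps a path to the `clang-format-11` or `yapf` invocation shown above. It assumes both tools are on `PATH` and that it is run from the xla/ folder so the style configs are picked up.

```python
from pathlib import PurePath

# Extensions covered by the find command in the C++ instructions.
CPP_SUFFIXES = {".cpp", ".cc", ".h"}


def format_command(path):
    """Return the formatter command for one file, or None if none applies."""
    suffix = PurePath(path).suffix
    if suffix in CPP_SUFFIXES:
        return ["clang-format-11", "-i", "-style=file", path]
    if suffix == ".py":
        return ["yapf", "-i", path]
    return None


if __name__ == "__main__":
    import subprocess
    import sys

    # Format every file named on the command line; skip unknown extensions.
    for path in sys.argv[1:]:
        cmd = format_command(path)
        if cmd is not None:
            subprocess.run(cmd, check=True)
```

Passing only the files changed in your PR keeps the diff free of unrelated formatting changes.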
To run the tests, follow one of the options below:
- Run on local CPU:
export PJRT_DEVICE=CPU
- Run on Cloud TPU:
export PJRT_DEVICE=TPU
- Run on GPU:
export PJRT_DEVICE=CUDA GPU_NUM_DEVICES=${NUM_GPU}
For more detail on configuring the runtime, please refer to this doc.
If you plan to build from source, and therefore use the latest PyTorch/TPU code base, we suggest selecting the Nightly builds when you create a Cloud TPU instance.
Then run `test/run_tests.sh` and `test/cpp/run_tests.sh` to verify that the setup works.
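If you launch the test scripts from Python rather than from a shell, the same runtime variables can be passed through the process environment. The sketch below builds that environment for the three options listed above; `pjrt_env` is a hypothetical helper, and the only variable names it uses are `PJRT_DEVICE` and `GPU_NUM_DEVICES` from this section.

```python
import os


def pjrt_env(device, num_gpus=None, base=None):
    """Return a copy of the environment with the PJRT variables set.

    `base` defaults to the current process environment.
    """
    if device not in ("CPU", "TPU", "CUDA"):
        raise ValueError(f"unsupported PJRT_DEVICE: {device}")
    env = dict(os.environ if base is None else base)
    env["PJRT_DEVICE"] = device
    if device == "CUDA":
        if num_gpus is None:
            raise ValueError("CUDA needs num_gpus (GPU_NUM_DEVICES)")
        env["GPU_NUM_DEVICES"] = str(num_gpus)
    return env


if __name__ == "__main__":
    # Show only the variables this helper adds on top of an empty base.
    print(pjrt_env("CPU", base={}))
```

The result can then be handed to `subprocess.run(["bash", "test/run_tests.sh"], env=pjrt_env("CPU"), check=True)` from the pytorch/xla checkout.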
- If local changes aren't visible, uninstall the existing pytorch/xla with `pip uninstall torch_xla` and `pip uninstall torch`, then rebuild PyTorch and PyTorch/XLA with `python setup.py develop` or `python setup.py install`.
- If you see PJRT errors when running on TPU, such as `The PJRT plugin has PJRT API version 0.34. The framework PJRT API version is 0.40`, you need to update your `libtpu.so` and ensure it is in a directory listed in your `LD_LIBRARY_PATH` environment variable. You can download a new `libtpu.so` from Google Cloud, where the builds are sorted by date. Download the newest one and install it with `pip install libtpu...whl`.
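To make the PJRT version mismatch easier to spot in long logs, the two version numbers can be pulled out of the message. This is a minimal sketch that assumes the error text has exactly the wording quoted above; both function names are hypothetical and nothing here is a PyTorch/XLA API.

```python
import re

# Matches the wording of the error message quoted in the list above.
_VERSION_RE = re.compile(
    r"plugin has PJRT API version (\d+)\.(\d+)"
    r".*?framework PJRT API version is (\d+)\.(\d+)",
    re.DOTALL,
)


def parse_pjrt_mismatch(message):
    """Return ((plugin_major, plugin_minor), (fw_major, fw_minor)) or None."""
    match = _VERSION_RE.search(message)
    if match is None:
        return None
    plugin = (int(match.group(1)), int(match.group(2)))
    framework = (int(match.group(3)), int(match.group(4)))
    return plugin, framework


def plugin_is_older(message):
    """True if the plugin version is lower than the framework's, None if unparsed."""
    versions = parse_pjrt_mismatch(message)
    if versions is None:
        return None
    plugin, framework = versions
    return plugin < framework


if __name__ == "__main__":
    sample = ("The PJRT plugin has PJRT API version 0.34. "
              "The framework PJRT API version is 0.40")
    print(parse_pjrt_mismatch(sample), plugin_is_older(sample))
```

When the plugin version is the lower one, as in the quoted message, updating `libtpu.so` as described above is the fix this section recommends.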