xla_extension failed encountered when trying to use exla in a Docker container #90
Comments
Is there a reason you are trying to build XLA from source, rather than use the precompiled binaries? We use these dockerfiles for precompilation, so those instructions should work. |
Ideally, we would prefer not to build the extension from source. I noticed that the xla gets built from source when we add exla in our dependencies. Here are the dependencies we've added along with exla:
We did not add the xla dependency in our list of dependencies, but somehow, it gets added (maybe because it's part of Nx). |
By default it will download a precompiled version. Does it print anything saying it can't use a precompiled archive and therefore must compile from source? |
Do you have XLA_BUILD set by any chance? |
I did not set it anywhere (.bash_profile, Dockerfile, etc.). Based on the README.md, it is set to false by default. |
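A quick way to confirm this from inside the image is to print the variable in the same shell that runs the build. A minimal shell sketch, where report_xla_build is a hypothetical helper name and XLA_BUILD is the variable discussed above:

```shell
# Report whether XLA_BUILD is present in the current environment.
# report_xla_build is an illustrative name, not part of xla or exla.
report_xla_build() {
  if [ -n "${XLA_BUILD+x}" ]; then
    # The variable exists (possibly with an empty value).
    echo "XLA_BUILD is set to '${XLA_BUILD}'"
  else
    echo "XLA_BUILD is not set"
  fi
}

report_xla_build
```

Running it in the same step that runs mix compile shows the value the build actually sees, which may differ from the host shell.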
The build should trigger only when One way to check would be to add |
I did notice the image uses a rather outdated combo of Elixir and OTP, as well as an older Debian. If possible, I'd update to rule out the compilation being triggered by a missing precompiled archive for that version/platform. |
It still went through 🥲
|
Interesting, I don't have any idea at the moment. It would be helpful if you could minimize it into a reproducible repo, like an empty mix project with the deps and the Dockerfile :) |
Got something similar:
5.162 ==> xla
5.162 Compiling 5 files (.ex)
5.267 Generated xla app
5.315
5.315 17:30:36.318 [info] Downloading a precompiled XLA archive for target aarch64-linux-gnu-cpu
9.752
9.752 17:30:40.757 [info] Successfully downloaded the XLA archive
10.47 ==> exla
10.47 Unpacking /root/.cache/xla/0.8.0/download/xla_extension-0.8.0-aarch64-linux-gnu-cpu.tar.gz into /app/deps/exla/cache
15.12 g++ cache/0.9.2/objs/exla.o cache/0.9.2/objs/exla_client.o cache/0.9.2/objs/exla_mlir.o cache/0.9.2/objs/custom_calls.o cache/0.9.2/objs/exla_nif_util.o cache/0.9.2/objs/ipc.o cache/0.9.2/objs/custom_calls/eigh_f32.o cache/0.9.2/objs/custom_calls/eigh_f64.o cache/0.9.2/objs/custom_calls/lu_bf16.o cache/0.9.2/objs/custom_calls/lu_f16.o cache/0.9.2/objs/custom_calls/lu_f32.o cache/0.9.2/objs/custom_calls/lu_f64.o cache/0.9.2/objs/custom_calls/qr_bf16.o cache/0.9.2/objs/custom_calls/qr_f16.o cache/0.9.2/objs/custom_calls/qr_f32.o cache/0.9.2/objs/custom_calls/qr_f64.o cache/0.9.2/objs/exla_cuda.o -o cache/libexla.so -Lcache/xla_extension/lib -lxla_extension -shared -Wl,-rpath,'$ORIGIN/xla_extension/lib'
15.14 cache/0.9.2/objs/exla.o: file not recognized: file format not recognized
15.14 collect2: error: ld returned 1 exit status
15.14 make: *** [Makefile:101: cache/libexla.so] Error 1
15.14 could not compile dependency :exla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile exla --force", update it with "mix deps.update exla" or clean it with "mix deps.clean exla"
15.14 ==> relax
15.14 ** (Mix) Could not compile with "make" (exit status: 2).
15.14 You need to have gcc and make installed. If you are using
15.14 Ubuntu or any other Debian-based system, install the packages
15.14 "build-essential". Also install "erlang-dev" package if not
15.14 included in your Erlang/OTP version. If you're on Fedora, run
15.14 "dnf group install 'Development Tools'".
[+] Running 0/1
⠹ Service api Building 94.2s
failed to solve: process "/bin/sh -c mix compile" did not complete successfully: exit code: 1
I'm running this on a MacBook M1 Pro. This is a bare minimal Elixir repo available at https://github.com/georgeguimaraes/relax (using All I'm running to trigger this is |
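The linker message "file not recognized: file format not recognized" generally means ld was handed an object file in a format it cannot read. The log above also goes straight from unpacking to the g++ link step with no compile lines for the objects, so those .o files already existed when make ran. A rough way to see what format they are in, sketched in shell. The helper name object_format is hypothetical, and the idea that the objects came from another platform is an assumption to verify, not a confirmed diagnosis. The magic numbers are the standard ELF and Mach-O signatures.

```shell
# Classify a file by its first four bytes (the "magic number").
# object_format is an illustrative helper, not part of exla or xla.
object_format() {
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    7f454c46) echo "elf" ;;               # Linux objects and shared libraries
    cffaedfe|cefaedfe) echo "mach-o" ;;   # macOS objects (64-bit / 32-bit)
    cafebabe) echo "mach-o-universal" ;;  # macOS universal (fat) binaries
    *) echo "unknown" ;;
  esac
}

# Inspect the object files under the exla cache, if any exist.
# The path mirrors the one in the log above; adjust it for your project.
for obj in deps/exla/cache/*/objs/*.o; do
  [ -e "$obj" ] || continue
  printf '%s: %s\n' "$obj" "$(object_format "$obj")"
done
```

If the objects inside a Linux container report anything other than elf, they were not produced by the container's own compiler.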
Changing the dependency to makes it work:
api-1 | ==> exla
api-1 | Using libexla.so from /root/.cache/xla/exla/elixir-1.17.3-erts-15.2-xla-0.8.0-exla-0.8.0-ioo6ddg2zbm7ovoei2oc4ucrjy/libexla.so
api-1 | Compiling 23 files (.ex)
api-1 | Generated exla app
api-1 | ==> relax
api-1 | Compiling 1 file (.ex)
api-1 | Generated relax app
api-1 | Running ExUnit with seed: 697364, max_cases: 8
api-1 |
api-1 | ..
api-1 | Finished in 0.01 seconds (0.00s async, 0.01s sync) |
Using
❯ docker compose up --build
[+] Running 0/0
[+] Running 0/1 Building 0.1s
[+] Building 57.3s (13/13) FINISHED docker:default
=> [api internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 577B 0.0s
=> [api internal] load metadata for mirror.gcr.io/hexpm/elixir:1.17.3-erlang-27.2-ubuntu-noble-20241015 1.3s
=> [api internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [api 1/7] FROM mirror.gcr.io/hexpm/elixir:1.17.3-erlang-27.2-ubuntu-noble-20241015@sha256:f3a173c0d868e720c77a63c83de10c4b169f939 0.0s
=> [api internal] load build context 0.5s
=> => transferring context: 3.14MB 0.5s
=> CACHED [api 2/7] RUN apt-get update -y && apt-get install -y inotify-tools build-essential erlang-dev git curl && apt-get clean 0.0s
=> CACHED [api 3/7] WORKDIR /app 0.0s
=> CACHED [api 4/7] RUN mix local.hex --force && mix local.rebar --force 0.0s
=> [api 5/7] COPY . . 0.9s
=> [api 6/7] RUN mix deps.get 2.8s
=> [api 7/7] RUN mix compile 49.7s
=> [api] exporting to image 2.1s
=> => exporting layers 2.1s
=> => writing image sha256:629ef48806cb54cd54e5c420d3761de5693c4b24cc56e60c70dada4c38250f04 0.0s
[+] Running 2/1o docker.io/library/relax-api 0.0s
✔ Service api Built 57.4s
✔ Container relax-api-1 Recreated 0.1s
Attaching to api-1
api-1 | ==> complex
api-1 | Compiling 2 files (.ex)
api-1 | Generated complex app
api-1 | ==> nx
api-1 | Compiling 36 files (.ex)
api-1 | Generated nx app
api-1 | ==> nimble_pool
api-1 | Compiling 2 files (.ex)
api-1 | Generated nimble_pool app
api-1 | ==> elixir_make
api-1 | Compiling 8 files (.ex)
api-1 | Generated elixir_make app
api-1 | ==> xla
api-1 | Compiling 5 files (.ex)
api-1 | Generated xla app
api-1 | ==> exla
api-1 | Using libexla.so from /root/.cache/xla/exla/elixir-1.17.3-erts-15.2-xla-0.8.0-exla-0.9.1-t34ppw6zq2bvv4txq247gllfci/libexla.so
api-1 | Compiling 24 files (.ex)
api-1 | warning: Nx.Defn.stream/3 is deprecated. Move the streaming loop to Elixir instead
api-1 | │
api-1 | 356 │ Nx.Defn.stream(function, args, Keyword.put(options, :compiler, EXLA))
api-1 | │ ~
api-1 | │
api-1 | └─ lib/exla.ex:356:13: EXLA.stream/3
api-1 |
api-1 | Generated exla app
api-1 | ==> relax
api-1 | Compiling 1 file (.ex)
api-1 | Generated relax app
api-1 | Running ExUnit with seed: 309870, max_cases: 8
api-1 |
api-1 | ..
api-1 | Finished in 0.01 seconds (0.00s async, 0.01s sync)
api-1 | 1 doctest, 1 test, 0 failures
api-1 exited with code 0 |
btw you'll see in my repo that I'm using the latest Elixir, OTP, and Ubuntu available |
@georgeguimaraes in your case, the issue is that you do I was able to reproduce the error by running |
Tks @jonatanklosko! TIL :) |
I encounter an xla_extension failure when I try to use exla while building a Docker container. Here are some of the snippets from my Dockerfile:
I get this error when I build the Dockerfile:
I only encounter this issue when trying to build a docker container. I do not encounter any issues when I run mix phx.server.
Do we have an official Dockerfile sample for cases where docker container setup is required?