Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't use nasm feature on arch linux (for faster rav1d video decoding) #7671

Closed
jleibs opened this issue Oct 9, 2024 · 10 comments · Fixed by #7742
Closed

Can't use nasm feature on arch linux (for faster rav1d video decoding) #7671

jleibs opened this issue Oct 9, 2024 · 10 comments · Fixed by #7742
Assignees
Labels
🪳 bug Something isn't working dependencies concerning crates, pip packages etc 🐧 linux Linux-specific problems 🚀 performance Optimization, memory use, etc 📺 re_viewer affects re_viewer itself

Comments

@jleibs
Copy link
Member

jleibs commented Oct 9, 2024

To reproduce:

cargo run --package rerun-cli --no-default-features --features native_viewer,nasm

Output:

  = note: /usr/bin/ld: /home/jleibs/rerun/target/debug/deps/librav1d-13aa763589daefd4.rlib(mc16_avx512.o): warning: relocation against `dav1d_mc_warp_filter' in read-only section `.text'
          /usr/bin/ld: /home/jleibs/rerun/target/debug/deps/librav1d-13aa763589daefd4.rlib(looprestoration_sse.o): relocation R_X86_64_PC32 against symbol `dav1d_sgr_x_by_x' can not be used when making a shared object; recompile with -fPIC
          /usr/bin/ld: final link failed: bad value
          collect2: error: ld returned 1 exit status

Affected platforms: Arch Linux

I'm confirmed I have nasm installed:

➜  nasm --version
NASM version 2.16.03 compiled on May  4 2024
@jleibs jleibs added 🪳 bug Something isn't working 📺 re_viewer affects re_viewer itself labels Oct 9, 2024
@jleibs
Copy link
Member Author

jleibs commented Oct 9, 2024

@jprochazk
Copy link
Member

jprochazk commented Oct 10, 2024

I can repro this on Linux (Fedora). I tried setting -Clink-args=-fPIC, but that didn't help, it still claims it should be recompiled with -fPIC. Neither mold nor lld work.

Weirdly enough, it's just re_viewer. Compiling re_video by itself seems to work?

cargo run --release -p re_video --example frames -F nasm

But this does not:

cargo run --release --package rerun-cli --no-default-features --features native_viewer,nasm

@emilk
Copy link
Member

emilk commented Oct 10, 2024

So maybe some clash of dependencies then?

@teh-cmc teh-cmc self-assigned this Oct 10, 2024
@emilk emilk changed the title re_viewer no longer compiles: dav1d_sgr_x_by_x' can not be used when making a shared object; recompile with -fPIC Can't use nasm feature on arch linux (for faster rav1d video decoding) Oct 10, 2024
@emilk emilk added 🚀 performance Optimization, memory use, etc dependencies concerning crates, pip packages etc 🐧 linux Linux-specific problems labels Oct 10, 2024
@teh-cmc teh-cmc removed their assignment Oct 10, 2024
emilk added a commit that referenced this issue Oct 10, 2024
### What
* Circumvent #7671

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7675?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7675?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7675)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
@teh-cmc
Copy link
Member

teh-cmc commented Oct 10, 2024

TL;DR:

  • Only x64 platforms go through the nasm-based path, everything else uses the system's default assembler (I'm not sure why, but naively trying to use the system's assembler for x64 fails miserably).
  • On Linux x64, the nasm path generates relocation entries that fail to resolve at link-time for some unknown reason, even though the whole thing is fully statically link, only use relative relocs, etc. Even weirder, it only fails when linking the full viewer, simpler examples do link successfully (it's possible that the problematic symbols are unused and therefore optimized out before the reloc pass for these examples?).
  • On Mac x64, the nasm path just works. Somehow.

@Wumpf
Copy link
Member

Wumpf commented Oct 11, 2024

On windows x64 it works like a charm. Given that it makes 8k video usable (completely unusable without), I think as a first step we should limit the changes of #7675 to Linux

@Wumpf
Copy link
Member

Wumpf commented Oct 11, 2024

:sigh: that is ofc running against rust-lang/cargo#1197, it's very hard to make this feature have only an effect on certain platforms

@emilk
Copy link
Member

emilk commented Oct 11, 2024

I think we can work-around it by adding this to a crate other than re_video

[target.'cfg( not(target_os = "linux") )'.dependencies] # TODO(#7671)
re_video = { workspace = true, features = [ "nasm" ] }

@teh-cmc
Copy link
Member

teh-cmc commented Oct 11, 2024

Interestingly, the official dav1d (Dav1d, with a D!) libraries don't even have any of these problematic symbols:

readelf --wide -s /usr/lib/libdav1d.so | rg dav1d_
    39: 0000000000003284  1797 FUNC    GLOBAL DEFAULT   11 dav1d_open
    40: 000000000016a390   912 FUNC    GLOBAL DEFAULT   11 dav1d_flush
    41: 0000000000169910   610 FUNC    GLOBAL DEFAULT   11 dav1d_parse_sequence_header
    42: 0000000000003989    20 FUNC    GLOBAL DEFAULT   11 dav1d_close
    43: 0000000000169c60   113 FUNC    GLOBAL DEFAULT   11 dav1d_data_props_unref
    44: 0000000000169320   206 FUNC    GLOBAL DEFAULT   11 dav1d_data_wrap
    45: 0000000000002b81    12 FUNC    GLOBAL DEFAULT   11 dav1d_version
    46: 0000000000002d9f    96 FUNC    GLOBAL DEFAULT   11 dav1d_get_frame_delay
    47: 000000000016a060   413 FUNC    GLOBAL DEFAULT   11 dav1d_apply_grain
    48: 0000000000172f00   890 FUNC    GLOBAL DEFAULT   11 dav1d_get_picture
    49: 00000000001692a0   115 FUNC    GLOBAL DEFAULT   11 dav1d_data_create
    50: 0000000000169270    41 FUNC    GLOBAL DEFAULT   11 dav1d_get_event_flags
    51: 0000000000002020    11 FUNC    GLOBAL DEFAULT   11 dav1d_set_cpu_flags_mask
    52: 0000000000169ce0   253 FUNC    GLOBAL DEFAULT   11 dav1d_get_decode_error_data_props
    53: 0000000000002b8d    10 FUNC    GLOBAL DEFAULT   11 dav1d_version_api
    54: 0000000000169470     9 FUNC    GLOBAL DEFAULT   11 dav1d_data_unref
    55: 00000000001693f0   125 FUNC    GLOBAL DEFAULT   11 dav1d_data_wrap_user_data
    56: 0000000000172e50   164 FUNC    GLOBAL DEFAULT   11 dav1d_send_data
    57: 0000000000002b97    86 FUNC    GLOBAL DEFAULT   11 dav1d_default_settings
    58: 000000000016a720     9 FUNC    GLOBAL DEFAULT   11 dav1d_picture_unref

So I guess these routines are specific to the Rust implementation.

Wumpf added a commit that referenced this issue Oct 11, 2024
### What

Related to
* #7671
* #7605
* #7588

Biting the bullet and split the feature on our rav1d fork:

rerun-io/re_rav1d@emilk/dav1d-interface...rerun-io:rav1d:andreas/disable-asm-linux

Testing done:
* wsl linux release build works now (didn't before)
* windows build still works
    * shows no-asm warning if compiled with out nasm
    * works otherwise
* added a `panic` under `cfg(asm)` in one of the interface functions
(now removed again) .. then:
    * linux still worked
    * windows hit that panic when compiled with nasm feature 

### Checklist
* [x] main ci passes
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7694?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7694?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7694)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.

---------

Co-authored-by: Emil Ernerfeldt <[email protected]>
@teh-cmc teh-cmc self-assigned this Oct 12, 2024
@emilk
Copy link
Member

emilk commented Oct 15, 2024

Should have been fixed by rerun-io/re_rav1d#3, but lets test it before closing (and maybe follow-up and remove TODO(#7671) from the Rerun codebase)

@emilk
Copy link
Member

emilk commented Oct 15, 2024

Apparently the -F nasm feature makes a 25x perf difference for @teh-cmc on Linux 64.

Based on that, I think we should disable av1 decoding without -F nasm on Linux and Windows (Mac is fast even without it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working dependencies concerning crates, pip packages etc 🐧 linux Linux-specific problems 🚀 performance Optimization, memory use, etc 📺 re_viewer affects re_viewer itself
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants