Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update fmt (to 11.0.2) and spdlog (to 1.14.1) #689

Merged
merged 11 commits into from
Sep 23, 2024
2 changes: 1 addition & 1 deletion dependencies.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -173,7 +173,7 @@ dependencies:
- zlib
- output_types: [conda]
packages:
- fmt==9.1.0
- fmt==11.0.2
py_version:
specific:
- output_types: conda
Expand Down
13 changes: 0 additions & 13 deletions rapids-cmake/cpm/patches/fmt/fix_10_1_1_version.diff

This file was deleted.

24 changes: 24 additions & 0 deletions rapids-cmake/cpm/patches/fmt/fix_11_0_2_unreachable_loop.diff
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename this so it has the word "nvcc" in the filename. If that requires too much CI rerunning, don't bother.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's not worth the CI re-running. The patch has #if defined(__NVCC__) around its changes, so hopefully that's enough information for anyone looking at it that this is specific to nvcc.

Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
diff --git a/include/fmt/base.h b/include/fmt/base.h
index 62764942..60f7483b 100644
--- a/include/fmt/base.h
+++ b/include/fmt/base.h
@@ -464,6 +464,9 @@ template <typename Char> FMT_CONSTEXPR auto length(const Char* s) -> size_t {
return len;
}

+#if defined(__NVCC__)
+#pragma nv_diag_suppress 128
+#endif
template <typename Char>
FMT_CONSTEXPR auto compare(const Char* s1, const Char* s2, std::size_t n)
-> int {
@@ -474,6 +477,9 @@ FMT_CONSTEXPR auto compare(const Char* s1, const Char* s2, std::size_t n)
}
return 0;
}
+#if defined(__NVCC__)
+#pragma nv_diag_default 128
+#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch works around an issue we observed in raft (rapidsai/raft#2433 (comment)) and in the cudf pip devcontainers build.

  FAILED: CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o 
  /usr/bin/sccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DCUTLASS_NAMESPACE=raft_cutlass -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DRAFT_COMPILED -DRAFT_EXPLICIT_INSTANTIATE_ONLY -DRAFT_SYSTEM_LITTLE_ENDIAN=1 -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/home/coder/raft/cpp/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/hnswlib-src -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/rmm-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/thrust/thrust/cmake/../.. -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/cub/cub/cmake/../.. -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/fmt-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/spdlog-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvtx3-src/c/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cuco-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvidiacutlass-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvidiacutlass-build/include -I/usr/local/cuda/include -isystem /usr/local/cuda/targets/x86_64-linux/include -t=1 -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[sm_70]" "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -Xcompiler=-fPIC -Xcompiler=-Wno-deprecated-declarations -DRAFT_HIDE_DEPRECATION_WARNINGS -Xcompiler=-Wall,-Werror,-Wno-error=deprecated-declarations -Werror=all-warnings --expt-extended-lambda --expt-relaxed-constexpr -DCUDA_API_PER_THREAD_DEFAULT_STREAM -Xfatbin=-compress-all -Xcompiler=-fopenmp -MD -MT CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o -MF CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o.d -x cu -c /home/coder/raft/cpp/src/matrix/detail/select_k_half_uint32_t.cu -o CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o
  /home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/fmt-src/include/fmt/base.h(471): error #128-D: loop is not reachable
      for (; n != 0; ++s1, ++s2, --n) {
      ^
            detected during:
              instantiation of "auto fmt::v11::detail::compare(const Char *, const Char *, std::size_t)->int [with Char=char]" at line 583
              instantiation of "auto fmt::v11::basic_string_view<Char>::compare(fmt::v11::basic_string_view<Char>) const->int [with Char=char]" at line 591
              instantiation of class "fmt::v11::basic_string_view<Char> [with Char=char]" at line 1929
              instantiation of "auto fmt::v11::basic_format_args<Context>::get_id(fmt::v11::basic_string_view<Char>) const->int [with Context=fmt::v11::context, Char=char]" at line 1969
  
  Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
  
  1 error detected in the compilation of "/home/coder/raft/cpp/src/matrix/detail/select_k_half_uint32_t.cu".
  ninja: build stopped: subcommand failed.

I'm not sure what the root cause is, but thought the cost "a bit of unreachable code is compiled in" was worth it in exchange for getting this upgrade done.

Otherwise, I think we'd have to do one of these other higher-effort things:

  • put similar guards around every #include that can lead to #include <fmt/base.h> (recursively) across RAPIDS
  • find the true root cause of the warning

NOTE: taking this patch means that fmt will still be vendored in some RAPIDS conda packages though (rapidsai/rmm#1528).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need a strategy to upstream or fix this, so that we aren't carrying the patch indefinitely. Let's compose an issue to fmt that demonstrates this more minimally, and then figure out what to do for a fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember which nvcc versions were affected (just 12.5? also 11.8?) but that would be relevant to include in a bug report.

I assume that this is a fmt bug and not an nvcc bug, but if you think it's an nvcc bug, let's discuss and file that internally as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'd love to get rid of the patch.

I don't know the range of nvcc versions or whether the issue is in fmt or nvcc. Will put up a tracking issue here in rapids-cmake with steps to reproduce this so we can investigate later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put up #695 to track the work of trying to remove this patch.


namespace adl {
using namespace std;
41 changes: 0 additions & 41 deletions rapids-cmake/cpm/patches/spdlog/nvcc_constexpr_fix.diff

This file was deleted.

22 changes: 7 additions & 15 deletions rapids-cmake/cpm/versions.json
Original file line number Diff line number Diff line change
Expand Up @@ -24,14 +24,13 @@
"git_tag": "d3477661d771e0d6fd22259bf6dd6f8c64a7401c"
},
"fmt": {
"version": "10.1.1",
"version": "11.0.2",
"git_url": "https://github.com/fmtlib/fmt.git",
"git_tag": "${version}",
"patches": [
{
"file": "fmt/fix_10_1_1_version.diff",
"issue": "fmt 10.1.1 produces a CMake package with version 10.1.0",
"fixed_in": "10.2.0"
"file": "fmt/fix_11_0_2_unreachable_loop.diff",
"issue": "fmt 11.0.2 produces a warning about an unreachable loop when compiled with nvcc"
}
]
},
Expand Down Expand Up @@ -66,20 +65,13 @@
},
"rmm": {
"version": "${rapids-cmake-version}",
"git_url": "https://github.com/rapidsai/rmm.git",
"git_tag": "branch-${version}"
"git_url": "https://github.com/jameslamb/rmm.git",
"git_tag": "fmt-and-spdlog"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"git_url": "https://github.com/jameslamb/rmm.git",
"git_tag": "fmt-and-spdlog"
"git_url": "https://github.com/rapidsai/rmm.git",
"git_tag": "branch-${version}"

TODO: revert this back right before merging

},
"spdlog": {
"version": "1.12.0",
"version": "1.14.1",
"git_url": "https://github.com/gabime/spdlog.git",
"git_tag": "v${version}",
"patches": [
{
"file": "spdlog/nvcc_constexpr_fix.diff",
"issue": "Fix constexpr mismatch between spdlog and fmt [https://github.com/gabime/spdlog/issues/2856]",
"fixed_in": "1.13"
}
]
"git_tag": "v${version}"
}
}
}