-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update fmt (to 11.0.2) and spdlog (to 1.14.1) #689
Conversation
…++ compilers (only want this for nvcc)
} | ||
+#if defined(__NVCC__) | ||
+#pragma nv_diag_default 128 | ||
+#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch works around an issue we observed in raft
(rapidsai/raft#2433 (comment)) and in the cudf
pip devcontainers build.
FAILED: CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o
/usr/bin/sccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DCUTLASS_NAMESPACE=raft_cutlass -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DRAFT_COMPILED -DRAFT_EXPLICIT_INSTANTIATE_ONLY -DRAFT_SYSTEM_LITTLE_ENDIAN=1 -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/home/coder/raft/cpp/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/hnswlib-src -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/rmm-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/thrust/thrust/cmake/../.. -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cccl-src/cub/cub/cmake/../.. -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/fmt-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/spdlog-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvtx3-src/c/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/cuco-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvidiacutlass-src/include -I/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/nvidiacutlass-build/include -I/usr/local/cuda/include -isystem /usr/local/cuda/targets/x86_64-linux/include -t=1 -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[sm_70]" "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -Xcompiler=-fPIC -Xcompiler=-Wno-deprecated-declarations -DRAFT_HIDE_DEPRECATION_WARNINGS -Xcompiler=-Wall,-Werror,-Wno-error=deprecated-declarations -Werror=all-warnings --expt-extended-lambda --expt-relaxed-constexpr -DCUDA_API_PER_THREAD_DEFAULT_STREAM -Xfatbin=-compress-all -Xcompiler=-fopenmp -MD -MT CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o -MF CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o.d -x cu -c /home/coder/raft/cpp/src/matrix/detail/select_k_half_uint32_t.cu -o CMakeFiles/raft_objs.dir/src/matrix/detail/select_k_half_uint32_t.cu.o
/home/coder/raft/cpp/build/pip/cuda-12.5/release/_deps/fmt-src/include/fmt/base.h(471): error #128-D: loop is not reachable
for (; n != 0; ++s1, ++s2, --n) {
^
detected during:
instantiation of "auto fmt::v11::detail::compare(const Char *, const Char *, std::size_t)->int [with Char=char]" at line 583
instantiation of "auto fmt::v11::basic_string_view<Char>::compare(fmt::v11::basic_string_view<Char>) const->int [with Char=char]" at line 591
instantiation of class "fmt::v11::basic_string_view<Char> [with Char=char]" at line 1929
instantiation of "auto fmt::v11::basic_format_args<Context>::get_id(fmt::v11::basic_string_view<Char>) const->int [with Context=fmt::v11::context, Char=char]" at line 1969
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/raft/cpp/src/matrix/detail/select_k_half_uint32_t.cu".
ninja: build stopped: subcommand failed.
I'm not sure what the root cause is, but thought the cost "a bit of unreachable code is compiled in" was worth it in exchange for getting this upgrade done.
Otherwise, I think we'd have to do one of these other higher-effort things:
- put similar guards around every
#include
that can lead to#include <fmt/base.h>
(recursively) across RAPIDS - find the true root cause of the warning
NOTE: taking this patch means that fmt
will still be vendored in some RAPIDS conda packages though (rapidsai/rmm#1528).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need a strategy to upstream or fix this, so that we aren't carrying the patch indefinitely. Let's compose an issue to fmt
that demonstrates this more minimally, and then figure out what to do for a fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember which nvcc versions were affected (just 12.5? also 11.8?) but that would be relevant to include in a bug report.
I assume that this is a fmt bug and not an nvcc bug, but if you think it's an nvcc bug, let's discuss and file that internally as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I'd love to get rid of the patch.
I don't know the range of nvcc
versions or whether the issue is in fmt
or nvcc
. Will put up a tracking issue here in rapids-cmake
with steps to reproduce this so we can investigate later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put up #695 to track the work of trying to remove this patch.
rapids-cmake/cpm/versions.json
Outdated
"git_url": "https://github.com/jameslamb/rmm.git", | ||
"git_tag": "fmt-and-spdlog" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"git_url": "https://github.com/jameslamb/rmm.git", | |
"git_tag": "fmt-and-spdlog" | |
"git_url": "https://github.com/rapidsai/rmm.git", | |
"git_tag": "branch-${version}" |
TODO: revert this back right before merging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename this so it has the word "nvcc" in the filename. If that requires too much CI rerunning, don't bother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think it's not worth the CI re-running. The patch has #if defined(__NVCC__)
around its changes, so hopefully that's enough information for anyone looking at it that this is specific to nvcc.
} | ||
+#if defined(__NVCC__) | ||
+#pragma nv_diag_default 128 | ||
+#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember which nvcc versions were affected (just 12.5? also 11.8?) but that would be relevant to include in a bug report.
I assume that this is a fmt bug and not an nvcc bug, but if you think it's an nvcc bug, let's discuss and file that internally as well.
## Description Replaces #15603 Contributes to: * rapidsai/build-planning#54 * rapidsai/build-planning#56 * rapidsai/rapids-cmake#387 Now that most of `conda-forge` has been updated to `fmt >=11.0.1,<12` and `spdlog>=1.14.1,<1.15` (rapidsai/build-planning#56 (comment)), we're attempting to upgrade RAPIDS to similar versions of those libraries. This improves the likelihood that RAPIDS will be installable alongside newer versions of its dependencies and complementary packages on conda-forge. ## Notes for Reviewers This PR is testing changes made in rapidsai/rapids-cmake#689. It shouldn't be merged until those `rapids-cmake` changes are merged and any testing-specific details have been removed.
… libcuml conda host dependencies (#6071) Contributes to rapidsai/build-planning#56 * updates `fmt` and `spdlog` to newer versions, to match the rest of RAPIDS * adds `fmt` and `spdlog` to `host:` dependencies for `libcuml` conda packages (see #6071 (comment)) Now that most of `conda-forge` has been updated to `fmt >=11.0.1,<12` and `spdlog>=1.14.1,<1.15` (rapidsai/build-planning#56 (comment)), we're attempting to upgrade RAPIDS to similar versions of those libraries. This improves the likelihood that RAPIDS will be installable alongside newer versions of its dependencies and complementary packages on conda-forge. ## Notes for Reviewers This PR is testing changes made in rapidsai/rapids-cmake#689. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #6071
Description
Contributes to rapidsai/build-planning#56
Replaces #592
Now that most of
conda-forge
has been updated tofmt >=11.0.1,<12.0a0
andspdlog>=1.14.1,<1.15
(rapidsai/build-planning#56 (comment)), we're attempting to upgrade RAPIDS to similar versions of those libraries... so that the next release of RAPIDS will be installable alongside newer versions of its dependencies and complementary packages on conda-forge.Notes for Reviewers
This will also have the added side-benefit of (at least for now) shrinking the size and install time of some RAPIDS conda packages... because now builds will get
spdlog
from those conda-forge packages, instead of downloading it via CPM and then vendoring it.How I tested this
Opened PRs in some other RAPIDS repos using
rapids-cmake
from here andrmm
from rapidsai/rmm#1678.See the list at rapidsai/build-planning#56
Also tested in unified devcontainers, with all RAPIDS repos at (in devcontainers manifest.yaml) checked out to the branches from those PRs (pointed at my fork of
rapids-cmake
here), like this:uninstall-all -j clean-all -j build-all -j ./rmm/ci/run_pytests.sh ./cudf/ci/run_cudf_pytests.sh ./cuml/ci/run_cuml_singlegpu_pytests.sh pushd ./cuspatial/python/cuspatial/cuspatial python -m pytest \ --cache-clear \ --numprocesses=8 \ --dist=worksteal \ tests
Saw everything build successfully and all the tests pass (with CUDA 12.2, on x86_64).
Checklist
cmake-format.json
is up to date with these changes.include_guard(GLOBAL)
)