diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e7f7a20e307..dce92d7e613 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -217,7 +217,7 @@ cuda-gdb -ex r --args python .py
 ```
 
 ```bash
-cuda-memcheck python .py
+compute-sanitizer --tool memcheck python .py
 ```
 
 ### Device debug symbols
diff --git a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md
index 8188c466312..ce9840050a9 100644
--- a/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md
+++ b/cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md
@@ -1384,3 +1384,25 @@ cuIO is a component of libcudf that provides GPU-accelerated reading and writing
 formats commonly used in data analytics, including CSV, Parquet, ORC, Avro, and JSON_Lines.
 
 // TODO: add more detail and move to a separate file.
+
+# Debugging Tips
+
+Here are some tools that can help with debugging libcudf (besides printf of course):
+1. `cuda-gdb`\
+   Follow the instructions in the [Contributor to cuDF guide](../../../CONTRIBUTING.md#debugging-cudf) to build
+   and run libcudf with debug symbols.
+2. `compute-sanitizer`\
+   The [CUDA Compute Sanitizer](https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html)
+   tool can be used to locate many CUDA reported errors by providing a call stack
+   close to where the error occurs even with a non-debug build. The sanitizer includes various
+   tools including `memcheck`, `racecheck`, and `initcheck` as well as others.
+   The `racecheck` and `initcheck` have been known to produce false positives.
+3. `cudf::test::print()`\
+   The `print()` utility can be called within a gtest to output the data in a `cudf::column_view`.
+   More information is available in the [Testing Guide](TESTING.md#printing-and-accessing-column-data)
+4. GCC Address Sanitizer\
+   The GCC ASAN can also be used by adding the `-fsanitize=address` compiler flag.
+   There is a compatibility issue with the CUDA runtime that can be worked around by setting
+   environment variable `ASAN_OPTIONS=protect_shadow_gap=0` before running the executable.
+   Note that the CUDA `compute-sanitizer` can also be used with GCC ASAN by setting the
+   environment variable `ASAN_OPTIONS=protect_shadow_gap=0,alloc_dealloc_mismatch=0`.
diff --git a/cpp/doxygen/developer_guide/TESTING.md b/cpp/doxygen/developer_guide/TESTING.md
index a4ffe0f575b..9c86be5a55d 100644
--- a/cpp/doxygen/developer_guide/TESTING.md
+++ b/cpp/doxygen/developer_guide/TESTING.md
@@ -455,10 +455,19 @@ Column comparison functions in the `cudf::test::detail` namespace should **NOT**
 
 ### Printing and accessing column data
 
-`include/cudf_test/column_utilities.hpp` defines various functions and overloads for printing
+The `` header defines various functions and overloads for printing
 columns (`print`), converting column data to string (`to_string`, `to_strings`), and copying data to
-the host (`to_host`).
-
+the host (`to_host`). For example, to print a `cudf::column_view` contents or `column_wrapper` instance
+to the console use the `cudf::test::print()`:
+```cpp
+  cudf::test::fixed_width_column_wrapper input({1,2,3,4});
+  auto splits = cudf::split(input,{2});
+  cudf::test::print(input);
+  cudf::test::print(splits.front());
+```
+Fixed-width and strings columns output as comma-separated entries including null rows.
+Nested columns are also supported and output includes the offsets and data children as well as
+the null mask bits.
 
 ## Validating Stream Usage
 