quick start and integrations guides for the docs #47

Merged: 30 commits, merged Mar 13, 2024

Changes from 20 commits

Commits
e851903
Getting a start on the docs
cjnolet Mar 7, 2024
73fa0c8
Progress
cjnolet Mar 7, 2024
8e48382
Getting CAGRA C++ docs to build
cjnolet Mar 7, 2024
ca465e6
Updating
cjnolet Mar 8, 2024
58743b5
Checking in
cjnolet Mar 8, 2024
d2dd4cc
New docs
cjnolet Mar 8, 2024
19319d7
Updating build.sh to build the examples
cjnolet Mar 8, 2024
f8c1015
Merge branch 'branch-24.04' into docs_2404-api_docs
cjnolet Mar 8, 2024
358df75
Fixing docs per review
cjnolet Mar 8, 2024
e6bff2b
Merge branch 'docs_2404-api_docs' of github.com:rapidsai/cuvs into do…
cjnolet Mar 8, 2024
5ac765f
Merge branch 'branch-24.04' into docs_2404-api_docs
cjnolet Mar 8, 2024
64d594b
Removing developer guide. That's unneeded in the docs
cjnolet Mar 8, 2024
620e080
Stubbing out quick start guide
cjnolet Mar 8, 2024
f1f671f
Adding quick start
cjnolet Mar 11, 2024
31d014c
Checking in
cjnolet Mar 12, 2024
f69e86e
Renaming quick_start to getting started
cjnolet Mar 12, 2024
0afe86b
Removing osexamples
cjnolet Mar 12, 2024
929e9b4
Fixing package issue
cjnolet Mar 12, 2024
77e1cbe
Breaking apart getting started guide
cjnolet Mar 12, 2024
3a79601
More updates
cjnolet Mar 12, 2024
c9a50fe
Merge remote-tracking branch 'origin/branch-24.04' into docs_2404-api…
cjnolet Mar 12, 2024
935e0fc
Merge remote-tracking branch 'origin/branch-24.04' into docs_2404-api…
cjnolet Mar 12, 2024
ba7c1a7
Adding integrations page
cjnolet Mar 12, 2024
ed5b81c
More updates to integrations
cjnolet Mar 12, 2024
3841bad
Excluding rust docs dir
cjnolet Mar 12, 2024
841d5ec
Fixing links
cjnolet Mar 12, 2024
011af02
Fixing more links
cjnolet Mar 12, 2024
b121d32
Converting build and install guide to rst
cjnolet Mar 13, 2024
866193a
Adding rust
cjnolet Mar 13, 2024
80959bc
More updates
cjnolet Mar 13, 2024
4 changes: 2 additions & 2 deletions docs/source/api_docs.rst
@@ -1,5 +1,5 @@
API Documentation
=================
API Reference
=============

.. toctree::
:maxdepth: 1
91 changes: 91 additions & 0 deletions docs/source/basics.rst
@@ -0,0 +1,91 @@
cuVS API Basics
===============

.. toctree::
:maxdepth: 1
:caption: Contents:

* `Memory management`_
* `Resource management`_

Memory management
-----------------

Centralized memory management allows flexible configuration of allocation strategies, such as sharing the same CUDA memory pool across library boundaries. cuVS uses the `RMM <https://github.com/rapidsai/rmm>`_ library, which eases the burden of configuring different allocation strategies globally across GPU-accelerated libraries.

RMM currently has APIs for C++ and Python.
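Before looking at the real RMM APIs below, the pooling idea itself can be sketched in a few lines of plain Python. This is a toy illustration only, not RMM: one large upfront allocation is carved into cheap sub-allocations, which is what makes pool allocators fast.

```python
class ToyPool:
    """Toy bump-pointer pool: one large upfront allocation, then cheap
    sub-allocations carved out of it (illustration only, not RMM)."""

    def __init__(self, capacity):
        self._buffer = bytearray(capacity)  # stands in for one big device allocation
        self._offset = 0

    def allocate(self, nbytes):
        if self._offset + nbytes > len(self._buffer):
            raise MemoryError("pool exhausted")
        view = memoryview(self._buffer)[self._offset:self._offset + nbytes]
        self._offset += nbytes  # bump the pointer; no new upstream allocation
        return view


pool = ToyPool(capacity=1024)
a = pool.allocate(256)  # fast: just carves a slice out of the existing buffer
b = pool.allocate(256)
```

Real pool allocators also handle deallocation and coalescing of freed blocks; RMM's ``pool_memory_resource`` below does exactly that for device memory.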

C++
^^^

Here's an example of configuring RMM to use a pool allocator in C++ (derived from the RMM example `here <https://github.com/rapidsai/rmm?tab=readme-ov-file#example>`_):

.. code-block:: c++

rmm::mr::cuda_memory_resource cuda_mr;
// Construct a resource that uses a coalescing best-fit pool allocator
// With the pool initially half of available device memory
auto initial_size = rmm::percent_of_free_device_memory(50);
rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool_mr{&cuda_mr, initial_size};
rmm::mr::set_current_device_resource(&pool_mr); // Updates the current device resource pointer to `pool_mr`
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource(); // Points to `pool_mr`

Python
^^^^^^

And the corresponding code in Python (derived from the RMM example `here <https://github.com/rapidsai/rmm?tab=readme-ov-file#memoryresource-objects>`_):

.. code-block:: python

import rmm
pool = rmm.mr.PoolMemoryResource(
rmm.mr.CudaMemoryResource(),
initial_pool_size=2**30,
maximum_pool_size=2**32)
rmm.mr.set_current_device_resource(pool)


Resource management
-------------------

cuVS uses an API from the `RAFT <https://github.com/rapidsai/raft>`_ library of ML and data mining primitives to centralize and reuse expensive resources, such as those used for memory management. The code examples below demonstrate how to create these resources for use throughout this guide.

See RAFT's `resource API documentation <https://docs.rapids.ai/api/raft/nightly/cpp_api/core_resources/>`_ for more information.

C
^

.. code-block:: c

#include <cuda_runtime.h>
#include <cuvs/core/c_api.h>

cuvsResources_t res;
cuvsResourcesCreate(&res);

// ... do some processing ...

cuvsResourcesDestroy(res);

C++
^^^

.. code-block:: c++

#include <raft/core/device_resources.hpp>

raft::device_resources res;

Python
^^^^^^

.. code-block:: python

from pylibraft.common import DeviceResources

res = DeviceResources()


Rust
^^^^

26 changes: 10 additions & 16 deletions docs/source/build.md
@@ -1,8 +1,6 @@
**# Installation
# Installation

The cuVS software development kit provides APIs for C, C++, Python, and Rust languages. These

Both the C++ and Python APIs require CMake to build from source.
The cuVS software development kit provides APIs for C, C++, Python, and Rust languages. This guide outlines how to install the pre-compiled packages, build it from source, and use it in downstream applications.

## Table of Contents

@@ -25,10 +23,6 @@ Both the C++ and Python APIs require CMake to build from source.
- [Build Documentation](#build-documentation)
- [Use cuVS in your application](#use-cuvs-in-your-application)

[//]: # (- [Using cuVS in downstream projects]&#40;#using-raft-c-in-downstream-projects&#41;)

[//]: # ( - [CMake targets]&#40;#cmake-targets&#41;_)

------

## Installing Pre-compiled Packages
@@ -39,17 +33,17 @@ The easiest way to install the pre-compiled C, C++, and Python packages is throu

#### C++ Package
```bash
mamba install -c rapidsai -c conda-forge -c nvidia libcuvs cuda-version=11.8
mamba install -c rapidsai -c conda-forge -c nvidia libcuvs cuda-version=12.0
```

#### C Package
```bash
mamba install -c rapidsai -c conda-forge -c nvidia libcuvs_c cuda-version=11.8
mamba install -c rapidsai -c conda-forge -c nvidia libcuvs_c cuda-version=12.0
```

#### Python Package
```bash
mamba install -c rapidsai -c conda-forge -c nvidia pycuvs cuda-version=11.8
mamba install -c rapidsai -c conda-forge -c nvidia cuvs cuda-version=12.0
```

### Python through Pip
@@ -58,15 +52,15 @@ The cuVS Python package can also be [installed through pip](https://rapids.ai/pi

For CUDA 11 packages:
```bash
pip install pycuvs-cu11 --extra-index-url=https://pypi.nvidia.com
pip install cuvs-cu11 --extra-index-url=https://pypi.nvidia.com
```

And CUDA 12 packages:
```bash
pip install pycuvs-cu12 --extra-index-url=https://pypi.nvidia.com
pip install cuvs-cu12 --extra-index-url=https://pypi.nvidia.com
```

Note: these packages statically links the C and C++ libraries so the `libcuvs` and `libcuvs_c` shared libraries won't be readily available to use in your code.
Note: these packages statically link the C and C++ libraries so the `libcuvs` and `libcuvs_c` shared libraries won't be readily available to use in your code.

### Rust through crates.io

@@ -124,10 +118,10 @@ Compile the C and C++ Googletests using the `tests` target in `build.sh`.

The tests will be written to the build directory, which is `cpp/build/` by default, and they will be named `*_TEST`.

It can take sometime to compile all of the tests. You can build individual tests by providing a semicolon-separated list to the `--limit-tests` option in `build.sh`. Make sure to pass the `-n` flag so the tests are not installed.
It can take some time to compile all of the tests. You can build individual tests by providing a semicolon-separated list to the `--limit-tests` option in `build.sh`. Make sure to pass the `-n` flag so the tests are not installed.

```bash
./build.sh libcuvs tests -n --limit-tests=NEIGHBORS_TEST;CLUSTER_TEST
./build.sh libcuvs tests -n --limit-tests="NEIGHBORS_TEST;CAGRA_C_TEST"
```

### Python library
12 changes: 12 additions & 0 deletions docs/source/getting_started.rst
@@ -0,0 +1,12 @@
Getting Started
===============

This guide provides a starting point covering the basic concepts and usage of the various APIs in the cuVS software development kit.

.. toctree::
:maxdepth: 1
:caption: Contents:

basics.rst
interoperability.rst
working_with_ann_indexes.rst
4 changes: 2 additions & 2 deletions docs/source/index.rst
@@ -21,10 +21,10 @@ cuVS is a library for vector search and clustering on the GPU.
:maxdepth: 1
:caption: Contents:

quick_start.md
getting_started.rst
integrations.rst
build.md
api_docs.rst
developer_guide.md
contributing.md


13 changes: 13 additions & 0 deletions docs/source/integrations.rst
@@ -0,0 +1,13 @@
Integrations
============

In addition to using cuVS through any one of its different language APIs, cuVS can also be consumed through integrations with third-party libraries and systems, including the following.

FAISS
-----

Milvus
------

Kinetica
--------
72 changes: 72 additions & 0 deletions docs/source/interoperability.rst
@@ -0,0 +1,72 @@
Interoperability
================

DLPack (C)
^^^^^^^^^^

Multi-dimensional span (C++)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

cuVS is built on top of the GPU-accelerated machine learning and data mining primitives in the `RAFT <https://github.com/rapidsai/raft>`_ library. Most of the C++ APIs in cuVS accept `mdspan <https://arxiv.org/abs/2010.06474>`_ multi-dimensional array views for representing data in higher dimensions, similar to the `ndarray` in the NumPy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory.

The `mdarray` is an owning object that forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions:

.. code-block:: c++

#include <raft/core/device_mdarray.hpp>

int n_rows = 10;
int n_cols = 10;

auto scalar = raft::make_device_scalar<float>(handle, 1.0);
auto vector = raft::make_device_vector<float>(handle, n_cols);
auto matrix = raft::make_device_matrix<float>(handle, n_rows, n_cols);

The `mdspan` is a lightweight non-owning view that can wrap around any pointer, maintaining shape, layout, and indexing information for accessing elements.

We can construct `mdspan` instances directly from the above `mdarray` instances:

.. code-block:: c++

// Scalar mdspan on device
auto scalar_view = scalar.view();

// Vector mdspan on device
auto vector_view = vector.view();

// Matrix mdspan on device
auto matrix_view = matrix.view();

Since the `mdspan` is just a lightweight wrapper, we can also construct it from the underlying data handles in the `mdarray` instances above. We use the extents to query the shape of an `mdarray` or `mdspan`.

.. code-block:: c++

#include <raft/core/device_mdspan.hpp>

auto scalar_view = raft::make_device_scalar_view(scalar.data_handle());
auto vector_view = raft::make_device_vector_view(vector.data_handle(), vector.extent(0));
auto matrix_view = raft::make_device_matrix_view(matrix.data_handle(), matrix.extent(0), matrix.extent(1));
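For readers coming from Python, the owning/non-owning split above is analogous to the standard library's buffer and `memoryview` types. The following is a plain-Python sketch of the idea only; it is not a cuVS or RAFT API:

```python
import array

# An owning, contiguous buffer of 100 float32 values (plays the role of mdarray).
owner = array.array("f", [float(i) for i in range(100)])

# A lightweight non-owning view over the same memory (plays the role of mdspan),
# reshaped into a 10 x 10 "matrix" without copying any data.
matrix_view = memoryview(owner).cast("B").cast("f", shape=(10, 10))

# Writing through the view mutates the owner's memory: element (2, 3) of the
# view is element 2 * 10 + 3 = 23 of the flat owning buffer.
matrix_view[2, 3] = 42.0
```

Just as with `mdspan`, the view carries shape and layout information while the owning object remains responsible for the memory's lifetime.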

Of course, RAFT's `mdspan`/`mdarray` APIs aren't just limited to the `device`. You can also create `host` variants:

.. code-block:: c++

#include <raft/core/host_mdarray.hpp>
#include <raft/core/host_mdspan.hpp>

int n_rows = 10;
int n_cols = 10;

auto scalar = raft::make_host_scalar<float>(handle, 1.0);
auto vector = raft::make_host_vector<float>(handle, n_cols);
auto matrix = raft::make_host_matrix<float>(handle, n_rows, n_cols);

auto scalar_view = raft::make_host_scalar_view(scalar.data_handle());
auto vector_view = raft::make_host_vector_view(vector.data_handle(), vector.extent(0));
auto matrix_view = raft::make_host_matrix_view(matrix.data_handle(), matrix.extent(0), matrix.extent(1));

Please refer to RAFT's `mdspan documentation <https://docs.rapids.ai/api/raft/stable/cpp_api/mdspan/>`_ to learn more.


CUDA array interface (Python)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
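Python GPU libraries such as Numba and CuPy interoperate by exposing the ``__cuda_array_interface__`` protocol: a plain dictionary describing device memory. The sketch below shows the shape of that dictionary on a hypothetical ``FakeDeviceArray`` class with a made-up pointer value; no real device memory is involved:

```python
class FakeDeviceArray:
    """Hypothetical stand-in for a GPU array; the pointer below is fake."""

    def __init__(self, ptr, shape):
        self._ptr = ptr
        self._shape = shape

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,        # tuple of array dimensions
            "typestr": "<f4",            # little-endian float32, numpy-style
            "data": (self._ptr, False),  # (device pointer, read-only flag)
            "version": 3,                # protocol version
            "strides": None,             # None means C-contiguous
        }


arr = FakeDeviceArray(ptr=0xDEADBEEF, shape=(1000, 96))
iface = arr.__cuda_array_interface__
```

Libraries that support the protocol can consume any such object zero-copy, reading the device pointer, shape, and dtype directly instead of copying data through the host.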