Implement maybe-owning multi-dimensional container (mdbuffer) #1999

Merged
merged 176 commits into from
Jan 4, 2024

Conversation


@wphicks wphicks commented Nov 15, 2023

What is mdbuffer?

This PR introduces a maybe-owning multi-dimensional abstraction called mdbuffer to help simplify code that may require an mdarray but only if the data are not already in a desired form or location.

As a concrete example, consider a function foo_device which operates on memory accessible from the device. If we wish to pass it data originating on the host, a separate code path must be created in which a device_mdarray is created and the data are explicitly copied from host to device. This leads to a proliferation of branches as foo_device interacts with other functions with similar requirements.

As an initial simplification, mdbuffer allows us to write a single template that accepts an mdspan pointing to memory on either host or device and routes it through the same code:

```c++
template <typename mdspan_type>
void foo_device(raft::resources const& res, mdspan_type data) {
  auto buf = raft::mdbuffer{res, raft::mdbuffer{data}, raft::memory_type::device};
  // Data in buf is now guaranteed to be accessible from device.
  // If it was already accessible from device, no copy was performed. If it
  // was not, a copy was performed.

  some_kernel<<<...>>>(buf.view<raft::memory_type::device>());

  // It is sometimes useful to know whether or not a copy was performed to
  // e.g. determine whether the transformed data should be copied back to its original
  // location. This can be checked via the `is_owning()` method.
  if (buf.is_owning()) {
    raft::copy(res, data, buf.view<raft::memory_type::device>());
  }
}

foo_device(res, some_host_mdspan);  // Still works; memory is allocated and copy is performed
foo_device(res, some_device_mdspan);  // Still works and no allocation or copy is required
foo_device(res, some_managed_mdspan);  // Still works and no allocation or copy is required
```

While this is a useful simplification, it still leads to a proliferation of template instantiations. If this is undesirable, mdbuffer permits a further consolidation through implicit conversion of an mdspan to an mdbuffer:

```c++
void foo_device(raft::resources const& res, raft::mdbuffer<float, raft::matrix_extent<int>>&& data)
{
  auto buf = raft::mdbuffer{res, data, raft::memory_type::device};
  some_kernel<<<...>>>(buf.view<raft::memory_type::device>());
  if (buf.is_owning()) {
    raft::copy(res, data, buf.view<raft::memory_type::device>());
  }
}

// All of the following work exactly as before but no longer require separate template instantiations
foo_device(res, some_host_mdspan);
foo_device(res, some_device_mdspan);
foo_device(res, some_managed_mdspan);
```

mdbuffer also offers a simple way to perform runtime dispatching based on the memory type passed to it using standard C++ patterns. While mdbuffer's .view() method takes an optional template parameter indicating the mdspan type to retrieve as a view, that parameter can be omitted to retrieve a std::variant of all mdspan types which may provide a view on the mdbuffer's data (depending on its memory type). We can then use std::visit to perform runtime dispatching based on where the data are stored:

```c++
void foo(raft::resources const& res, raft::mdbuffer<float, raft::matrix_extent<int>>&& data) {
  std::visit([](auto view) {
    if constexpr (decltype(view)::accessor_type::is_device_accessible) {
      // Do something with these data on device
    } else {
      // Do something with these data on host
    }
  }, data.view());
}
```

In addition to moving data among various memory types (host, device, managed, and pinned currently), mdbuffer can be used to coerce data to a desired in-memory layout or to a compatible data type (e.g. floats to doubles). As with changes in the memory type, a copy will be performed if and only if it is necessary.

```c++
template <typename mdspan_type>
void foo_device(raft::resources const& res, mdspan_type data) {
  auto buf = raft::mdbuffer<float, raft::matrix_extent<int>, raft::row_major>{
    res, raft::mdbuffer{data}, raft::memory_type::device};
  // Data in buf is now guaranteed to be accessible from device, and
  // represented by floats in row-major order.

  some_kernel<<<...>>>(buf.view<raft::memory_type::device>());

  // The same check can be used to determine whether or not a copy was
  // required, regardless of the cause. I.e. if the data were already on
  // device but in column-major order, the is_owning() method would still
  // return true because new storage needed to be allocated.
  if (buf.is_owning()) {
    raft::copy(res, data, buf.view<raft::memory_type::device>());
  }
}
```

What mdbuffer is not

mdbuffer is not a replacement for either mdspan or mdarray. mdspan remains the standard object for passing data views throughout the RAFT codebase, and mdarray remains the standard object for allocating new multi-dimensional data. This is reflected in the fact that mdbuffer can only be constructed from an existing mdspan or mdarray or another mdbuffer. mdbuffer is intended to be used solely to simplify code where data may need to be copied to a different location.

Follow-ups

  • I have omitted the mdbuffer-based replacement for and generalization of temporary_device_buffer since this PR is already enormous. I have this partially written, however, and I'll post a link to its current state to help motivate the changes here.
  • For all necessary copies, mdbuffer uses raft::copy. For some transformations that require a change in data type or layout, raft::copy is not fully optimized. See [FEA] Use cache-oblivious copies for arbitrary copies in raft::copy #1842 for more information. Optimizing this will be an important change to ensure that mdbuffer can be used with absolutely minimal overhead in all cases. These non-optimized cases represent a small fraction of the real-world use cases we can expect for mdbuffer, however, so there should be little concern about beginning to use it as is.
  • std::visit's performance for a small number of variants is sometimes non-optimal. As a followup, it would be good to benchmark mdbuffer's current performance and compare to internal use of a visit implementation that uses a switch on the available memory types.

Resolve #1602


wphicks commented Dec 14, 2023

/ok to test

@wphicks wphicks added 5 - Ready to Merge and removed 4 - Waiting on Author Waiting for author to respond to review labels Dec 14, 2023

wphicks commented Dec 15, 2023

/ok to test

@cjnolet (Member) left a comment

Giving my approval again. Let's give this another day to see if @tfeher and @benfred have any feedback and then we can get it merged for the weekend!


wphicks commented Dec 15, 2023

/ok to test

@tfeher (Contributor) left a comment
Thanks @wphicks for this feature! Due to lack of time I just skimmed through a large part of the implementation details, so my review is restricted to the high-level API.

The API is well polished and seems convenient to use. I think it will be a really helpful utility!

You are handling host, device, managed, and pinned memory types. I agree that these are the variants that we expect to encounter in practice.

We have APIs in RAFT, specifically some of the ANN build algorithms, that just care about the data shape, and host/device dispatch is happening only deeper in the call chain. At that point mdbuffer is a great help to simplify the code when we finally need to guarantee accessibility from device. And I see that actually all four allocation types could be useful when we create an mdbuffer. So having the vocabulary to express this using raft::memory_type seems helpful.

Still, I am a little bit concerned by the proliferation of mdarray types. The allocation mechanism of mdarray can be defined simply by passing the desired memory_resource during creation. When we want to use the data, we care about whether it is accessible from host or device memory space, and that is expressed by the accessor_type. We should ask the question: is it really useful to have so many mdarray / mdspan variants? I am not sure about this, but I do not have any objection to the current proposal. We should still keep in mind that other memory requirements could come up (e.g. an array allocated with huge page attributes), and we probably don't want to define a new mdarray type for each of these.

Thanks once more for this work, it is a great addition to RAFT! The PR looks good to me.

@benfred (Member) left a comment
This looks great! I think it will really help simplify some of our code, and I'm looking forward to taking advantage of it.

cpp/include/raft/core/memory_type.hpp

wphicks commented Jan 2, 2024

Still, I am a little bit concerned by the proliferation of mdarray types.

I actually strongly agree with this. As you say, there are use cases where we really only care about whether or not we can access those data at a particular point in runtime and what the shape/layout is. In that case, we pay fairly dearly for the proliferation of compile-time types. The compile-time specification of memory type also runs counter to some of the general ways I see folks thinking about the purpose of mdspan.

I think there may be room for both constructs to exist alongside one another, however. There's no question that having the ability to dispatch to radically different compile-time paths based on the runtime condition of memory type is useful, and I'm hoping that's what mdbuffer will support. On the other hand, it would be good to revisit how we're using mdspan in cases where we explicitly wish to elide the difference between memory types for e.g. GPU/CPU compatible code. I had proposed something along those lines quite some time ago, but at the time others objected to it. Maybe we can review some of those ideas again as a follow-on to this PR and see if there is greater clarity on the different use cases now that mdbuffer has solidified.


wphicks commented Jan 2, 2024

Thank you everyone for all of the valuable feedback on this! I'm going to try to implement @benfred 's suggestion to align the memory_type values with cudaMemoryType, and if there are no objections, I'll trigger a merge after that.


cjnolet commented Jan 2, 2024

The compile-time specification of memory type also runs counter to some of the general ways I see folks thinking about the purpose of mdspan.

This is counter to the original use of mdspan within RAFT also. We intentionally created only two different types (device and host) with a general accessor wrapper that specifies whether or not the memory can be read from device or host. Note that the "both" case can still be specified through that pattern, which also covers managed memory, for example. The idea is to allow functions to define which access level they are interested in without having to introduce new template types for each possible combination.

The idea behind this lightweight accessor wrapper is so that we can support any accessor the user might be using so long as the accessor can produce the correct values, either on host or device, when asked... (intentionally side stepping the submdspan can of worms discussion here too).

I also agree that we should be able to simplify mdarray in this way, since the template itself should only care about what mdspan it needs to produce in order to interact with APIs that require it. Ideally, the mdarray would accept the memory type as a runtime argument so long as it produces an mdspan that is either device or host accessible. One problem here is that in our current mdarray design (which now slightly differs from libcu++ and the actual spec), the memory type is coupled to the container policy, which is a template argument. It's on the todo list to reconcile the mdarray with libcu++, and even eventually replace the current one with the one in libcu++. It just keeps getting put on the back-burner behind cuVS priorities.

I'm not convinced we need to be worrying about CUDA IPC (or even SCADA) at this layer. That seems like details we would want to delegate to the container policy. Hopefully we can someday find a good way to purge that from the templates altogether.


cjnolet commented Jan 2, 2024

I had proposed something along those lines quite some time ago, but at the time others objected to it. Maybe we can review some of those ideas again as a follow-on to this PR and see if there is greater clarity on the different use cases now that mdbuffer has solidified.

The decision to make separate "device" and "host" mdspan typedefs adheres to the design of the other c++ vocabulary types in RAPIDS and in the CCCL (including Thrust). As an organization, we should be striving to follow CCCL's lead here.

I personally think the mdbuffer is going to be very useful for gluing together the container and view layer, allowing things like smart copy-back in between. However, this shouldn't change the vocab types in our public APIs (which is why it's so great that mdbuffer supports implicit conversion).

cc @jrhemstad @mhoemmen @harrism


wphicks commented Jan 2, 2024

The decision to make separate "device" and "host" mdspan typedefs adheres to the design of the other c++ vocabulary types

Thanks! That's not really what I was getting at, but I think this is getting beyond the scope of this PR. We can discuss more elsewhere.


wphicks commented Jan 2, 2024

/ok to test


wphicks commented Jan 2, 2024

/merge


cjnolet commented Jan 2, 2024

@wphicks, my prior response was also in reply to @tfeher's earlier comment:

We should ask the question: is it really useful to have so many mdarray / mdspan variants? I am not sure about this, but I do not have any objection to the current proposal

(I can only speak to the decision for "device"-accessible and "host"-accessible variants)


wphicks commented Jan 3, 2024

/ok to test

@rapids-bot rapids-bot bot merged commit 0d65954 into rapidsai:branch-24.02 Jan 4, 2024
61 checks passed
ChristinaZ pushed a commit to ChristinaZ/raft that referenced this pull request Jan 17, 2024
Authors:
  - William Hicks (https://github.com/wphicks)
  - Tarang Jain (https://github.com/tarang-jain)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Artem M. Chirkin (https://github.com/achirkin)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Ben Frederickson (https://github.com/benfred)

URL: rapidsai#1999

Successfully merging this pull request may close these issues.

[FEA] mdbuffer implementation in RAFT
7 participants