Skip to content

Commit

Permalink
further gardening
Browse files Browse the repository at this point in the history
  • Loading branch information
paleolimbot committed Sep 18, 2024
1 parent 7ef151c commit d54c974
Showing 1 changed file with 30 additions and 33 deletions.
63 changes: 30 additions & 33 deletions docs/source/roadmap.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,66 +27,63 @@ have not yet been scheduled for implementation.
## C library

- **Type coverage**: The C library currently provides support for all types that
are available via the Arrow C Data interface except run-end encoded (REE)
types and string view/list view types. Support for these should be added in
nanoarrow as well.
- **Array append**: The `ArrowArrayAppend*()` family of functions provide a means
by which to incrementally build arrays; however, there is no built-in way to
append an `ArrowArrayView`, potentially more efficiently appending multiple
values at once. Among other things, this would provide a route to an
unoptimized filter/take implementation.
are available via the Arrow C Data interface except string view/list view types.
Support for these should be added in nanoarrow as well
([#583](https://github.com/apache/arrow-nanoarrow/issues/583),
[#616](https://github.com/apache/arrow-nanoarrow/issues/616),
[#510](https://github.com/apache/arrow-nanoarrow/issues/510)).
- **Remove Arrow C++ dependency for tests**: The C library and IPC extension rely
on Arrow C++ for some test code that was written early in the library's development.
These tests are valuable to ensure compatibility between nanoarrow and Arrow C++;
however, including them in the default test suite complicates release verification
for some users and prevents testing in environments where Arrow C++ does not
currently build (e.g., WASM, compilers without C++17 support).
currently build (e.g., WASM, compilers without C++17 support)
([#619](https://github.com/apache/arrow-nanoarrow/issues/619)).
- **Test verbosity**: Tests for the C library were written before testing utilities
in the `nanoarrow_testing` library were available (and before there was a
`nanoarrow_testing` library in which to put new ones). As a result, some of them
are very verbose and can be difficult to read, which can and should be improved.
are very verbose and can be difficult to read, which can and should be improved
([#577](https://github.com/apache/arrow-nanoarrow/issues/577),
[#566](https://github.com/apache/arrow-nanoarrow/issues/566)).
- **C++ integration**: The existing C++ integration is intentionally minimal;
however, there are likely improvements that could be made to better integrate
nanoarrow into existing C++ projects.
nanoarrow into existing C++ projects
([#599](https://github.com/apache/arrow-nanoarrow/issues/599)).
- **Documentation**: As the C library and its user base evolves, documentation
needs to be refined and expanded to support the current set of use cases.

## IPC extension

- **Dictionary support**: The IPC extension does not currently support reading
dictionary messages an IPC stream.
- **Compression**: The IPC extension does not currently support compressed
streams using per-buffer compression, although streams can be compressed
outside the nanoarrow library (e.g., gzip compression of the entire stream).

## Device extension

- **Define useful scope**: The device extension now provides basic support for
creating and consuming `ArrowDeviceArray` structures for CPU, CUDA, and
Apple Metal devices, including copying to/from the CPU device of arbitrary
arrays. As adoption of `ArrowDeviceArray` becomes more widespread, the
useful scope of the extension needs to be defined based on its use (or not)
in the Arrow community.
needs to be refined and expanded to support the current set of use cases
([#187](https://github.com/apache/arrow-nanoarrow/issues/187),
[#497](https://github.com/apache/arrow-nanoarrow/issues/497)).
- **IPC Dictionary support**: The IPC extension does not currently support reading
dictionary messages an IPC stream
([#622](https://github.com/apache/arrow-nanoarrow/issues/622)).
- **IPC Compression support**: The IPC extension does not currently support
compressed streams using per-buffer compression, although streams can be compressed
outside the nanoarrow library (e.g., gzip compression of the entire stream)
([#621](https://github.com/apache/arrow-nanoarrow/issues/621))

## R bindings

- **Conversion internals**: The initial implementation of conversion from
Arrow data to R vectors was implemented in C and its verbosity makes it
difficult to add support for new types. The internals should be refactored
to make the conversion code easier to understand for new developers.
to make the conversion code easier to understand for new developers
([#392](https://github.com/apache/arrow-nanoarrow/pull/392)).
- **Type support**: The R bindings currently rely on the Arrow R package for
conversion of some R types (e.g., list_of), and some types are not supported
in nanoarrow nor the arrow R package (e.g., run-end encoding, list view, and
string/binary view).
string/binary view)
([#617](https://github.com/apache/arrow-nanoarrow/issues/617)).
- **ALTREP support**: A recent R release added enhanced ALTREP support such that
types that convert to `list()` can defer materialization cost/allocation.
Arrow sources that arrive in chunks (e.g., from a `Table` or `ChunkedArray`)
currently can't be converted via any ALTREP mechanism and support could be
added.
added ([#219](https://github.com/apache/arrow-nanoarrow/issues/219)).

## Python bindings

- **Type support**: The Python bindings do not currently support unions,
string/binary view, or list view, or run-end-encoded types. When creating
Arrow arrays from iterables of Python objects, some types are not yet
supported (e.g., struct, list, datetime objects).
supported (e.g., struct, list, datetime objects)
([#618](https://github.com/apache/arrow-nanoarrow/issues/618),
[#620](https://github.com/apache/arrow-nanoarrow/issues/620)).

0 comments on commit d54c974

Please sign in to comment.