Client-side chunks 3: micro-batching (#6440)
This is a fork of the old `DataTable` batcher, and works very similarly. Like before, this batcher will micro-batch using both space and time thresholds.

There are two main differences:
- This batcher maintains a dataframe per entity, as opposed to the old one, which worked globally.
- Once a threshold is reached, this batcher further splits the incoming batch in order to fulfill these invariants:
  ```rust
  /// In particular, a [`Chunk`] cannot:
  /// * contain data for more than one entity path
  /// * contain rows with different sets of timelines
  /// * use more than one datatype for a given component
  /// * contain more rows than a pre-configured threshold if one or more timelines are unsorted
  ```

Most of the code is the same; the real interesting piece is `PendingRow::many_into_chunks` (see the sketch below), as well as the newly added tests.

- Fixes #4431

---

Part of a PR series to implement our new chunk-based data model on the client-side (SDKs):
- #6437
- #6438
- #6439
- #6440
- #6441
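
To make the splitting step more concrete, here is a minimal, self-contained sketch of how pending rows might be bucketed so that each chunk covers exactly one entity path and one set of timelines. The `PendingRow` and `Chunk` types below are simplified stand-ins for the real ones, and the datatype and sortedness invariants are deliberately left out; this is an illustration of the grouping idea, not the actual implementation.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified stand-in for a single logged row waiting to be batched.
#[derive(Debug, Clone)]
struct PendingRow {
    entity_path: String,
    /// Timeline name -> time value for this row.
    timepoint: BTreeMap<String, i64>,
}

/// Simplified stand-in for a `Chunk`: rows sharing one entity path and one set
/// of timelines.
#[derive(Debug)]
struct Chunk {
    entity_path: String,
    rows: Vec<PendingRow>,
}

/// Sketch of the splitting step: group pending rows so that each resulting
/// chunk covers exactly one entity path and one set of timelines, mirroring
/// the invariants quoted above (datatype and sortedness checks are omitted).
fn many_into_chunks(rows: Vec<PendingRow>) -> Vec<Chunk> {
    // Key: (entity path, set of timeline names). Rows that differ in either
    // end up in separate chunks.
    let mut buckets: BTreeMap<(String, BTreeSet<String>), Vec<PendingRow>> = BTreeMap::new();

    for row in rows {
        let timelines: BTreeSet<String> = row.timepoint.keys().cloned().collect();
        buckets
            .entry((row.entity_path.clone(), timelines))
            .or_default()
            .push(row);
    }

    buckets
        .into_iter()
        .map(|((entity_path, _timelines), rows)| Chunk { entity_path, rows })
        .collect()
}

fn main() {
    let rows = vec![
        PendingRow {
            entity_path: "points".to_owned(),
            timepoint: BTreeMap::from([("frame".to_owned(), 0)]),
        },
        PendingRow {
            entity_path: "points".to_owned(),
            timepoint: BTreeMap::from([("frame".to_owned(), 1)]),
        },
        PendingRow {
            entity_path: "points".to_owned(),
            // Different timeline set -> must land in a separate chunk.
            timepoint: BTreeMap::from([("frame".to_owned(), 2), ("sim_time".to_owned(), 20)]),
        },
    ];

    let chunks = many_into_chunks(rows);
    // Expect two chunks: one per (entity path, timeline set) combination.
    for chunk in &chunks {
        println!("{} -> {} row(s)", chunk.entity_path, chunk.rows.len());
    }
}
```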