-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dataframe API v2 p1: API definitions #7559
Conversation
18886e3
to
a989ac2
Compare
@@ -242,6 +242,8 @@ pub struct ComponentColumnDescriptor { | |||
pub store_datatype: ArrowDatatype, | |||
|
|||
/// How the data will be joined into the resulting `RecordBatch`. | |||
// | |||
// TODO(cmc): remove with the old re_dataframe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? I believe we still need this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the API is a row iterator, there's nothing to join 🤷
#[derive(Debug, Clone, PartialEq, Eq, Hash)] | ||
pub struct ControlColumnSelector { | ||
/// Name of the control column. | ||
// | ||
// TODO(cmc): this should be `component_name`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like an unnecessary verbosity. We generally refer to components by name. For example, we write things like "The Point3D component" and consider it synonymous with "The component named Point3D".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the convention across the codebase is that component
the actual component data (component: impl Component)
whereas component_name
is just a name (component_name: ComponentName
).
/// view contents: it is possible to end up with values from outside the view! | ||
LatestAtGlobal, | ||
// | ||
// TODO(cmc): `LatestAtView`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 or more importantly, LatestAtWindowed(...)
Co-authored-by: Jeremy Leibs <[email protected]>
7d1cb72
to
39cfb1a
Compare
Co-authored-by: Jeremy Leibs <[email protected]>
A first implementation of the new dataframe APIs. The name is now very misleading though: there isn't anything dataframe-y left in here, it is a row-based iterator with Rerun semantics baked in, driven by a sorted streaming join. It is rather slow (related: #7558 (comment)), lacks many features and is full of edge cases, but it works. It does support dedupe-latest semantics (slowly), view contents and selections, chunk overlaps, and pagination (horribly, by virtue of implementing `Iterator`). It does _not_ support `Clear`s, nor `latest-at` sparse-filling, nor PoVs, nor index sampling. Yet. Upcoming PRs will be all about fixing these shortcomings one by one. It should look somewhat familiar: ```rust let query_cache = QueryCache::new(store); let query_engine = QueryEngine { store, cache: &query_cache, }; let mut query = QueryExpression2::new(timeline); query.view_contents = Some( query_engine .iter_entity_paths(&entity_path_filter) .map(|entity_path| (entity_path, None)) .collect(), ); query.filtered_index_range = Some(ResolvedTimeRange::new(time_from, time_to)); eprintln!("{query:#?}:"); let query_handle = query_engine.query(query.clone()); // eprintln!("{:#?}", query_handle.selected_contents()); for batch in query_handle.into_batch_iter().skip(offset).take(len) { eprintln!("{batch}"); } ``` No tests until we have the guarantee that these are the semantics we will commit to. * Part of #7495 * Requires #7559
The new public API definition and nothing else. Speak now.
Checklist
main
build: rerun.io/viewernightly
build: rerun.io/viewerCHANGELOG.md
and the migration guideTo run all checks from
main
, comment on the PR with@rerun-bot full-check
.