Optimize gathering of point cloud colors #3730

Merged · 16 commits · Oct 10, 2023
10 changes: 9 additions & 1 deletion — crates/re_log_types/src/data_cell.rs

```diff
@@ -1,7 +1,7 @@
 use std::sync::Arc;

 use arrow2::datatypes::DataType;
-use re_types::{Component, ComponentName, DeserializationError};
+use re_types::{Component, ComponentBatch, ComponentName, DeserializationError};

 use crate::SizeBytes;

@@ -164,6 +164,14 @@ pub struct DataCellInner {
 // TODO(#1696): Check that the array is indeed a leaf / component type when building a cell from an
 // arrow payload.
 impl DataCell {
+    /// Builds a new `DataCell` from a component batch.
+    #[inline]
+    pub fn from_component_batch(batch: &dyn ComponentBatch) -> re_types::SerializationResult<Self> {
+        batch
+            .to_arrow()
+            .map(|arrow| DataCell::from_arrow(batch.name(), arrow))
+    }
+
     /// Builds a new `DataCell` from a uniform iterable of native component values.
     ///
     /// Fails if the given iterable cannot be serialized to arrow, which should never happen when
```
26 changes: 25 additions & 1 deletion — crates/re_log_types/src/data_row.rs

```diff
@@ -1,6 +1,6 @@
 use ahash::HashSetExt;
 use nohash_hasher::IntSet;
-use re_types::ComponentName;
+use re_types::{AsComponents, ComponentName};
 use smallvec::SmallVec;

 use crate::{DataCell, DataCellError, DataTable, EntityPath, SizeBytes, TableId, TimePoint};

@@ -266,6 +266,30 @@ pub struct DataRow {
 }

 impl DataRow {
+    /// Builds a new `DataRow` from anything implementing [`AsComponents`].
+    pub fn from_component_batches(
+        row_id: RowId,
+        timepoint: TimePoint,
+        entity_path: EntityPath,
+        as_components: &dyn AsComponents,
+    ) -> anyhow::Result<Self> {
```
Member:
I would expect this to take an iterator of ComponentBatches: it's strictly more expressive, easier to build / come by and more consistent with the rest of our APIs.

Suggested change:

```diff
-    pub fn from_component_batches(
-        row_id: RowId,
-        timepoint: TimePoint,
-        entity_path: EntityPath,
-        as_components: &dyn AsComponents,
-    ) -> anyhow::Result<Self> {
+    pub fn from_component_batches(
+        row_id: RowId,
+        timepoint: TimePoint,
+        entity_path: EntityPath,
+        comp_batches: impl IntoIterator<Item = &'a dyn ComponentBatch>,
+    ) -> anyhow::Result<Self> {
```

Member (author):
How would we get the `num_instances` in that case? Take the max of all the batches? What if there are no batches, or if they are all splats?

Member:
> Take the max of all the batches?

Yes, that matches the behavior of our `log` methods (in all languages, even!):

> What if there are no batches

Then there's nothing in the row and there are no instances.

> or if they are all splats?

You cannot have "all splats": that would just result in a row with `num_instances = 1`.
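The rule this exchange settles on (no batches means no instances, otherwise take the max batch length) can be modeled with a tiny hypothetical helper — this is a sketch of the discussed behavior, not the actual Rerun implementation:

```python
# Toy model of the discussed rule: a row's num_instances is the maximum
# length over its component batches. Hypothetical helper, not Rerun code.
def num_instances(batches: list[list]) -> int:
    # No batches -> nothing in the row, so zero instances.
    # If every batch has length 1, the row simply has num_instances = 1.
    return max((len(batch) for batch in batches), default=0)

print(num_instances([]))                # → 0 (empty row)
print(num_instances([[1], [2]]))        # → 1 (all length-1 batches)
print(num_instances([[1, 2, 3], [4]]))  # → 3 (max over batches)
```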

@emilk (Member, author), Oct 10, 2023:

Oh, I thought we stored `num_instances` separately.

So what happens if I log a full point cloud first and then later want to update all the colors with a splat color? That would be a new row with `num_instances = 1`, even though it will affect several instances?

Member:
That's where it becomes funky...

If you do this:

```python
rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])])
```

then you're going to end up with the original colors being discarded: a single red point, and the rest of the points using the default color for this entity path (because that `ColorBatch` is not a splat).

Now, there is a trick at your disposal... you could do this:

```python
rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])], num_instances=2)
```

And now you'll end up with only red points, because you explicitly said that the data was 2 instances wide, and so the log function considers the `ColorBatch` to be a splat...

Of course we could change things so that logging 1 thing is always considered a splat, but then you have the opposite problem, which might or might not be better depending on the situation 🤷.

And this is why I don't like that splats are a logging-time rather than a query-time concern: the view should get to decide what to do with the data it has at its disposal, and that behavior should be configurable through blueprints and through the UI. This instance-key business is pretty similar to, e.g., configurable texture clamping modes in gfx APIs, after all.
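The logging-time splat rule described in this comment can be reduced to a one-line predicate — a hypothetical sketch of the described behavior, not the SDK's actual code: a length-1 batch is treated as a splat only when the row's declared `num_instances` is greater than 1.

```python
# Toy model of the logging-time splat decision discussed above.
# Hypothetical helper; not the actual Rerun SDK implementation.
def is_splat(batch_len: int, num_instances: int) -> bool:
    # A length-1 batch splats across the row only if the row is
    # declared wider than the batch itself.
    return batch_len == 1 and num_instances > 1

print(is_splat(1, 1))  # → False: lone ColorBatch, row width 1 (one red point)
print(is_splat(1, 2))  # → True: num_instances=2 makes the batch a splat
print(is_splat(3, 3))  # → False: per-instance batch, one color per point
```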

```diff
+        re_tracing::profile_function!();
+
+        let data_cells = as_components
+            .as_component_batches()
+            .into_iter()
+            .map(|batch| DataCell::from_component_batch(batch.as_ref()))
+            .collect::<Result<Vec<DataCell>, _>>()?;
+
+        Ok(DataRow::from_cells(
+            row_id,
+            timepoint,
+            entity_path,
+            as_components.num_instances() as _,
+            data_cells,
+        )?)
+    }
+
     /// Builds a new `DataRow` from an iterable of [`DataCell`]s.
     ///
     /// Fails if:
```