filter_is_not_null / filtered_point_of_view returns incorrect results #7681

Closed
jleibs opened this issue Oct 10, 2024 · 1 comment · Fixed by #7683
🪳 bug Something isn't working feat-dataframe-api Everything related to the dataframe API

jleibs commented Oct 10, 2024

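After filtering with filter_is_not_null on the Color column, the Position3D value of the remaining row comes back as null, even though Position3D was logged together with Color at that index.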

Can be reproduced on top of #7680

Simple repro in Python:

import rerun as rr
import tempfile


rr.init("rerun_example_test_recording")

rr.set_time_sequence("my_index", 1)
rr.log("points", rr.Points3D([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
rr.set_time_sequence("my_index", 7)
rr.log("points", rr.Points3D([[10, 11, 12]], colors=[[255, 0, 0]]))

with tempfile.TemporaryDirectory() as tmpdir:
    rrd = tmpdir + "/tmp.rrd"

    rr.save(rrd)

    recording = rr.dataframe.load_recording(rrd)

color_col = rr.dataframe.ComponentColumnSelector("points", rr.components.Color)

view = recording.view(index="my_index", contents="points")

# Baseline
table = view.select().read_all()
print(80 * "=")
print(table)
print(80 * "=")

# my_index, log_time, log_tick, points, colors
assert table.num_columns == 5
assert table.num_rows == 2

assert table.column("my_index").combine_chunks()[1].as_py() == 7

assert table.column("/points:Color").combine_chunks()[1][0].as_py() == 4278190335
assert table.column("/points:Position3D").combine_chunks()[1][0].as_py() == [10, 11, 12]

# Filtered to Color is not null
table = view.filter_is_not_null(color_col).select().read_all()
print(80 * "=")
print(table)
print(80 * "=")

# my_index, log_time, log_tick, points, colors
assert table.num_columns == 5
assert table.num_rows == 1

assert table.column("my_index").combine_chunks()[0].as_py() == 7

assert table.column("/points:Color").combine_chunks()[0][0].as_py() == 4278190335
assert table.column("/points:Position3D").combine_chunks()[0][0].as_py() == [10, 11, 12]

Output:

================================================================================
pyarrow.Table
log_tick: int64
log_time: timestamp[ns]
my_index: int64
/points:Color: list<item: uint32>
  child 0, item: uint32
/points:Position3D: list<item: fixed_size_list<item: float not null>[3]>
  child 0, item: fixed_size_list<item: float not null>[3]
      child 0, item: float not null
----
log_tick: [[1]]
log_time: [[2024-10-10 16:22:00.557968899]]
my_index: [[7]]
/points:Color: [[[4278190335]]]
/points:Position3D: [[null]]
================================================================================
Traceback (most recent call last):
  File "/home/jleibs/rerun/repro.py", line 50, in <module>
    assert table.column("/points:Position3D").combine_chunks()[0][0].as_py() == [10, 11, 12]
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "pyarrow/scalar.pxi", line 691, in pyarrow.lib.ListScalar.__getitem__
TypeError: 'NoneType' object is not subscriptable
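
For reference, the semantics the failing assertion expects can be modelled in plain Python. This is only a sketch of the expected behaviour, not Rerun's implementation: the row dictionaries and the standalone `filter_is_not_null` helper below are hypothetical stand-ins for the dataframe view. The point is that the filter should drop whole rows, without nulling out the other columns of the rows it keeps.

```python
from typing import Any, Optional

# A row maps column names to values; None models a null cell.
Row = dict[str, Optional[Any]]


def filter_is_not_null(rows: list[Row], column: str) -> list[Row]:
    """Keep whole rows whose `column` is non-null; other columns are left untouched."""
    return [row for row in rows if row.get(column) is not None]


# Hypothetical stand-in for the two rows of the baseline table in the repro.
rows: list[Row] = [
    {
        "my_index": 1,
        "/points:Color": None,  # no colors were logged at index 1
        "/points:Position3D": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    },
    {
        "my_index": 7,
        "/points:Color": [4278190335],
        "/points:Position3D": [[10, 11, 12]],
    },
]

filtered = filter_is_not_null(rows, "/points:Color")
```

Under this model the single remaining row still carries its Position3D value, which is what the last assertion in the repro checks for.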

teh-cmc commented Oct 10, 2024

Repro at the query-engine level (Rust):

let mut query = QueryExpression::new(Timeline::new_sequence("my_index"));
query.view_contents = Some([("points".into(), None)].into_iter().collect());
query.filtered_point_of_view = Some(ComponentColumnSelector::new_for_component_name(
    "points".into(),
    "rerun.components.Color".into(),
));
eprintln!("{query:#?}:");

let query_handle = query_engine.query(query.clone());
// eprintln!("{:#?}", query_handle.selected_contents());
for batch in query_handle.into_batch_iter() {
    eprintln!("{batch}");
}

Output:

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CHUNK METADATA:                                                                                                                    │
│                                                                                                                                    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ┌─────────────┬───────────────────────────────┬─────────────┬───────────────────────────────┬────────────────────────────────────┐ │
│ │ log_tick    ┆ log_time                      ┆ my_index    ┆ /points:Color                 ┆ /points:Position3D                 │ │
│ │ ---         ┆ ---                           ┆ ---         ┆ ---                           ┆ ---                                │ │
│ │ type: "i64" ┆ type: "timestamp(ns)"         ┆ type: "i64" ┆ type: "list[u32]"             ┆ type: "list[fixed-list[f32; 3]]"   │ │
│ │             ┆                               ┆             ┆ sorbet.path: "/points"        ┆ sorbet.path: "/points"             │ │
│ │             ┆                               ┆             ┆ sorbet.semantic_type: "Color" ┆ sorbet.semantic_type: "Position3D" │ │
│ ╞═════════════╪═══════════════════════════════╪═════════════╪═══════════════════════════════╪════════════════════════════════════╡ │
│ │ 1           ┆ 2024-10-10 16:34:41.186071606 ┆ 7           ┆ [4278190335]                  ┆ -                                  │ │
│ └─────────────┴───────────────────────────────┴─────────────┴───────────────────────────────┴────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

@teh-cmc teh-cmc self-assigned this Oct 10, 2024
teh-cmc added a commit that referenced this issue Oct 11, 2024
Simpler. Faster. Correct-er.

* Fixes #7681
* DNM: requires #7677