Warn when selection only includes static data (#7758)
### What
Based on top of:
- #7754
- Will need rebase after merging ^

This tries to alleviate a possible footgun where a user creates what
appears to be a valid view expression, but it only includes static data.
In these cases `.select()` won't produce any rows, since there are no
row-providing columns.
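As a rough mental model of why this happens (a simplified sketch, not Rerun's actual query engine; `ToyColumn` and `row_index_values` are hypothetical names made up for illustration): rows exist only at index values contributed by non-static columns, so a view made up solely of static columns has nothing to generate rows from.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy column for illustration only (an assumption, not Rerun's data model):
/// either one static value, or values keyed by an index such as a frame number.
enum ToyColumn {
    Static(f64),
    Temporal(BTreeMap<i64, f64>),
}

/// Collects the index values at which a result row would exist.
/// Only temporal columns contribute; static columns provide no index values.
fn row_index_values(columns: &[ToyColumn]) -> BTreeSet<i64> {
    columns
        .iter()
        .filter_map(|c| match c {
            ToyColumn::Temporal(values) => Some(values.keys().copied()),
            ToyColumn::Static(_) => None,
        })
        .flatten()
        .collect()
}
```

In this model, a slice containing only `ToyColumn::Static` entries yields an empty set of index values, and therefore no rows.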

There are many possible ways to end up in this state, but the logic here
should rarely produce false warnings while still providing a reasonable
degree of user safety.

If the user:
- Writes a content expression that only matches static content
- AND writes a select statement that queries static data
- AND does not call `using_index_values(...)`

Then we will produce a warning.
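
The three conditions above can be sketched as a single predicate. This is a minimal, standalone illustration: the `Column` struct and the `should_warn` function are hypothetical stand-ins, and the real change works on `ColumnDescriptor` values obtained from the query handle.

```rust
/// Hypothetical stand-in for a data column; only the static flag matters here.
struct Column {
    is_static: bool,
}

/// Returns true when all three conditions hold:
/// - the view contents matched at least one column, and all of them are static
/// - the selection includes at least one static column
/// - the user did not supply explicit index values
fn should_warn(
    view_columns: &[Column],
    selected_columns: &[Column],
    using_index_values: bool,
) -> bool {
    // An empty view is not treated as "all static".
    let all_contents_are_static =
        !view_columns.is_empty() && view_columns.iter().all(|c| c.is_static);
    let any_selected_data_is_static = selected_columns.iter().any(|c| c.is_static);

    !using_index_values && all_contents_are_static && any_selected_data_is_static
}
```

Note that an empty view never triggers the warning, since getting no results from a view with no matching columns should not be surprising.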

The most likely false positive would be a user querying for a mixture of
static and non-static data in a circumstance where sometimes none of the
non-static data is logged, and the user (correctly) expects to get no
rows in this case. However, these circumstances generally imply a more
advanced user who could work around this with a mixed query + join anyway.

Future work:
- #7759
jleibs authored Oct 16, 2024
1 parent dba6c76 commit 520917e
Showing 1 changed file with 41 additions and 0 deletions.
41 changes: 41 additions & 0 deletions rerun_py/src/dataframe.rs
@@ -42,6 +42,16 @@ pub(crate) fn register(m: &Bound<'_, PyModule>) -> PyResult<()> {
    Ok(())
}

fn py_rerun_warn(msg: &str) -> PyResult<()> {
    Python::with_gil(|py| {
        let warning_type = PyModule::import_bound(py, "rerun")?
            .getattr("error_utils")?
            .getattr("RerunWarning")?;
        PyErr::warn_bound(py, &warning_type, msg, 0)?;
        Ok(())
    })
}

/// Python binding for `IndexColumnDescriptor`
#[pyclass(frozen, name = "IndexColumnDescriptor")]
#[derive(Clone)]
@@ -513,6 +523,37 @@ impl PyRecordingView {

        let query_handle = engine.query(query_expression);

        // If the only contents found are static, we might need to warn the user since
        // this means we won't naturally have any rows in the result.
        let available_data_columns = query_handle
            .view_contents()
            .iter()
            .filter(|c| matches!(c, ColumnDescriptor::Component(_)))
            .collect::<Vec<_>>();

        // We only consider all contents static if there are at least some columns
        let all_contents_are_static = !available_data_columns.is_empty()
            && available_data_columns.iter().all(|c| c.is_static());

        // Additionally, we only want to warn if the user actually tried to select some
        // of the static columns. Otherwise the fact that there are no results shouldn't
        // be surprising.
        let selected_data_columns = query_handle
            .selected_contents()
            .iter()
            .map(|(_, col)| col)
            .filter(|c| matches!(c, ColumnDescriptor::Component(_)))
            .collect::<Vec<_>>();

        let any_selected_data_is_static = selected_data_columns.iter().any(|c| c.is_static());

        if self.query_expression.using_index_values.is_none()
            && all_contents_are_static
            && any_selected_data_is_static
        {
            py_rerun_warn("RecordingView::select: tried to select static data, but no non-static contents generated an index value on this timeline. No results will be returned. Either include non-static data or consider using `select_static()` instead.")?;
        }

        let schema = query_handle.schema();
        let fields: Vec<arrow::datatypes::Field> =
            schema.fields.iter().map(|f| f.clone().into()).collect();