Skip to content

Commit

Permalink
New decorator/context manager to catch exceptions when not in strict …
Browse files Browse the repository at this point in the history
…mode. (#3531)

### What
Resolves: #3513

The permutations and indirections in our new API style made this fairly
involved.

In an effort to be as-forgiving-as-possible I wanted to be able to
handle exceptions down multiple levels in the construction hierarchy.
For example:
```
rr.log("points", rr.Points3D(positions, colors='RED'))
```

Should produce an error:
```
my_points3d.py:12: RerunWarning: ColorBatch: ValueError(expected sequence of length of 3 or 4, received 1)
```
But still successfully log positions; just not colors.

However, if something else goes wrong in that constructor, or the things
using it, the next level up in the stack, for example `log` should catch
the error instead. In this case no data will be sent, but we'll still
get a warning and not crash the program.

To handle this I introduced a hybrid decorator / context manager:
`catch_and_log_exceptions` which tries to behave as sanely as possible
in the context of a situation where we might recursively call multiple
levels of functions, any one of which might produce an exception we
would like to handle.

The basic idea is that the deepest context manager to see the error will
handle it and produce the message. This means we can add the
`catch_and_log_exceptions` decorator or context manager anywhere we know
that we can try to recover sanely. We'll get an error produced from that
location, but the code higher up in the stack will still run.

To improve the user experience, errors are appended to a warning list as
they are encountered, and finally we output them when exiting the
top-most context manager. This means we get warning lines that point to
the user call-site rather than some arbitrary location within the guts
of rerun SDK.

One irritation I discovered is that decorating `__init__` methods causes
two major issues. For one, it leaves the object in a potentially
uninitialized state, just waiting to cascade into more exceptions in
future. Additionally, it wreaks havoc on the type-checker. To
accommodate this,`__init__` functions follow a pattern where they use
the context-manager directly and then fallback on a hopefully safer
code-path.

The Archetype constructors now generally call `__attr_clear__` which is
a new helper function designed to call `__attrs_init__` with `None` for
all the known params.

I also code-gen a `_clear` helper to use as a similar fallback path in
builder functions.

Recommend reviewing with:

![image](https://github.com/rerun-io/rerun/assets/3312232/6f05d3b8-8c5c-48ff-a227-c515a92735ea)

Since a number of big snippets got indented when adding the context
managers.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3531) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3531)
- [Docs
preview](https://rerun.io/preview/41f3a7dae887503cf098e063121e54bcdeb7378d/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/41f3a7dae887503cf098e063121e54bcdeb7378d/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
  • Loading branch information
jleibs authored Sep 30, 2023
1 parent db71899 commit d4041b4
Show file tree
Hide file tree
Showing 50 changed files with 1,428 additions and 423 deletions.
6 changes: 6 additions & 0 deletions crates/re_sdk/src/log_sink.rs
Original file line number Diff line number Diff line change
Expand Up @@ -147,6 +147,12 @@ impl MemorySinkStorage {
self.msgs.read()
}

/// How many messages are currently written to this memory sink
#[inline]
pub fn num_msgs(&self) -> usize {
self.read().len()
}

/// Consumes and returns the inner array of [`LogMsg`].
///
/// This automatically takes care of flushing the underlying [`crate::RecordingStream`].
Expand Down
48 changes: 47 additions & 1 deletion crates/re_types_builder/src/codegen/python.rs
Original file line number Diff line number Diff line change
Expand Up @@ -324,6 +324,7 @@ impl PythonCodeGenerator {
import pyarrow as pa
import uuid
from {rerun_path}error_utils import catch_and_log_exceptions
from {rerun_path}_baseclasses import (
Archetype,
BaseExtensionType,
Expand Down Expand Up @@ -547,7 +548,7 @@ fn code_for_struct(
if field.is_nullable {
format!("converter={typ_unwrapped}Batch._optional, # type: ignore[misc]\n")
} else {
format!("converter={typ_unwrapped}Batch, # type: ignore[misc]\n")
format!("converter={typ_unwrapped}Batch._required, # type: ignore[misc]\n")
}
} else if !default_converter.is_empty() {
code.push_text(&converter_function, 1, 0);
Expand Down Expand Up @@ -618,6 +619,10 @@ fn code_for_struct(
code.push_text(quote_init_method(obj, ext_class, objects), 2, 4);
}

if obj.kind == ObjectKind::Archetype {
code.push_text(quote_clear_methods(obj), 2, 4);
}

if obj.is_delegating_component() {
code.push_text(
format!(
Expand Down Expand Up @@ -1656,6 +1661,20 @@ fn quote_init_method(obj: &Object, ext_class: &ExtensionClass, objects: &Objects
format!("self.__attrs_init__({})", attribute_init.join(", "))
};

// Make sure Archetypes catch and log exceptions as a fallback
let forwarding_call = if obj.kind == ObjectKind::Archetype {
unindent::unindent(&format!(
r#"
with catch_and_log_exceptions(context=self.__class__.__name__):
{forwarding_call}
return
self.__attrs_clear__()
"#
))
} else {
forwarding_call
};

format!(
"{head}\n{}",
indent::indent_all_by(
Expand All @@ -1665,6 +1684,33 @@ fn quote_init_method(obj: &Object, ext_class: &ExtensionClass, objects: &Objects
)
}

fn quote_clear_methods(obj: &Object) -> String {
let param_nones = obj
.fields
.iter()
.map(|field| format!("{} = None, # type: ignore[arg-type]", field.name))
.join("\n ");

let classname = &obj.name;

unindent::unindent(&format!(
r#"
def __attrs_clear__(self) -> None:
"""Convenience method for calling `__attrs_init__` with all `None`s."""
self.__attrs_init__(
{param_nones}
)
@classmethod
def _clear(cls) -> {classname}:
"""Produce an empty {classname}, bypassing `__init__`"""
inst = cls.__new__(cls)
inst.__attrs_clear__()
return inst
"#
))
}

// --- Arrow registry code generators ---
use arrow2::datatypes::{DataType, Field, UnionMode};

Expand Down
39 changes: 2 additions & 37 deletions rerun_py/rerun_sdk/rerun/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,7 @@
from .archetypes.boxes2d_ext import Box2DFormat
from .components import MediaType, TextLogLevel
from .datatypes import Quaternion, RotationAxisAngle, Scale3D, TranslationAndMat3x3, TranslationRotationScale3D
from .error_utils import set_strict_mode
from .log_deprecated.annotation import AnnotationInfo, ClassDescription, log_annotation_context
from .log_deprecated.arrow import log_arrow
from .log_deprecated.bounding_box import log_obb, log_obbs
Expand Down Expand Up @@ -199,7 +200,6 @@
"unregister_shutdown",
"cleanup_if_forked_child",
"shutdown_at_exit",
"strict_mode",
"set_strict_mode",
"start_web_viewer_server",
]
Expand All @@ -224,11 +224,6 @@ def _init_recording_stream() -> None:
_init_recording_stream()


# If `True`, we raise exceptions on use error (wrong parameter types, etc.).
# If `False` we catch all errors and log a warning instead.
_strict_mode = False


def init(
application_id: str,
*,
Expand Down Expand Up @@ -298,8 +293,7 @@ def init(
random.seed(0)
np.random.seed(0)

global _strict_mode
_strict_mode = strict
set_strict_mode(strict)

# Always check whether we are a forked child when calling init. This should have happened
# via `_register_on_fork` but it's worth being conservative.
Expand Down Expand Up @@ -506,35 +500,6 @@ def wrapper(*args: Any, **kwargs: Any) -> Any:
# ---


def strict_mode() -> bool:
"""
Strict mode enabled.
In strict mode, incorrect use of the Rerun API (wrong parameter types etc.)
will result in exception being raised.
When strict mode is on, such problems are instead logged as warnings.
The default is OFF.
"""

return _strict_mode


def set_strict_mode(mode: bool) -> None:
"""
Turn strict mode on/off.
In strict mode, incorrect use of the Rerun API (wrong parameter types etc.)
will result in exception being raised.
When strict mode is off, such problems are instead logged as warnings.
The default is OFF.
"""
global _strict_mode

_strict_mode = mode


def start_web_viewer_server(port: int = 0) -> None:
"""
Start an HTTP server that hosts the rerun web viewer.
Expand Down
41 changes: 26 additions & 15 deletions rerun_py/rerun_sdk/rerun/_baseclasses.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@
import pyarrow as pa
from attrs import define, fields

from .error_utils import catch_and_log_exceptions

T = TypeVar("T")


Expand Down Expand Up @@ -152,7 +154,7 @@ class BaseBatch(Generic[T]):

def __init__(self, data: T | None) -> None:
"""
Primary method for creating Arrow arrays for required components.
Construct a new batch.
This method must flexibly accept native data (which comply with type `T`). Subclasses must provide a type
parameter specifying the type of the native data (this is automatically handled by the code generator).
Expand All @@ -172,22 +174,30 @@ def __init__(self, data: T | None) -> None:
-------
The Arrow array encapsulating the data.
"""

# If data is already an arrow array, use it
if isinstance(data, pa.Array):
if data.type == self._ARROW_TYPE:
self.pa_array = data
return
elif data.type == self._ARROW_TYPE.storage_type:
self.pa_array = self._ARROW_TYPE.wrap_array(data)
if data is not None:
with catch_and_log_exceptions(self.__class__.__name__):
# If data is already an arrow array, use it
if isinstance(data, pa.Array) and data.type == self._ARROW_TYPE:
self.pa_array = data
elif isinstance(data, pa.Array) and data.type == self._ARROW_TYPE.storage_type:
self.pa_array = self._ARROW_TYPE.wrap_array(data)
else:
self.pa_array = self._ARROW_TYPE.wrap_array(
self._native_to_pa_array(data, self._ARROW_TYPE.storage_type)
)
return

if data is None:
pa_array = _empty_pa_array(self._ARROW_TYPE.storage_type)
else:
pa_array = self._native_to_pa_array(data, self._ARROW_TYPE.storage_type)
# If we didn't return above, default to the empty array
self.pa_array = _empty_pa_array(self._ARROW_TYPE)

self.pa_array = self._ARROW_TYPE.wrap_array(pa_array)
@classmethod
def _required(cls, data: T | None) -> BaseBatch[T]:
"""
Primary method for creating Arrow arrays for optional components.
Just calls through to __init__, but with clearer type annotations.
"""
return cls(data)

@classmethod
def _optional(cls, data: T | None) -> BaseBatch[T] | None:
Expand Down Expand Up @@ -270,9 +280,10 @@ def component_name(self) -> str:
return self._ARROW_TYPE._TYPE_NAME # type: ignore[attr-defined, no-any-return]


@catch_and_log_exceptions(context="creating empty array")
def _empty_pa_array(type: pa.DataType) -> pa.Array:
if isinstance(type, pa.ExtensionType):
return _empty_pa_array(type.storage_type)
return type.wrap_array(_empty_pa_array(type.storage_type))

# Creation of empty arrays of dense unions aren't implemented in pyarrow yet.
if isinstance(type, pa.UnionType):
Expand Down
23 changes: 19 additions & 4 deletions rerun_py/rerun_sdk/rerun/_log.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@

from . import components as cmp
from ._baseclasses import AsComponents, ComponentBatchLike
from .error_utils import _send_warning
from .error_utils import _send_warning, catch_and_log_exceptions
from .recording_stream import RecordingStream

__all__ = ["log", "IndicatorComponentBatch", "AsComponents"]
Expand Down Expand Up @@ -102,13 +102,15 @@ def _splat() -> cmp.InstanceKeyBatch:
return pa.array([_MAX_U64], type=cmp.InstanceKeyType().storage_type) # type: ignore[no-any-return]


@catch_and_log_exceptions()
def log(
entity_path: str,
entity: AsComponents | Iterable[ComponentBatchLike],
*,
ext: dict[str, Any] | None = None,
timeless: bool = False,
recording: RecordingStream | None = None,
strict: bool | None = None,
) -> None:
"""
Log an entity.
Expand All @@ -127,7 +129,10 @@ def log(
Specifies the [`rerun.RecordingStream`][] to use.
If left unspecified, defaults to the current active data recording, if there is one.
See also: [`rerun.init`][], [`rerun.set_global_data_recording`][].
strict:
If True, raise exceptions on non-loggable data.
If False, warn on non-loggable data.
if None, use the global default from `rerun.strict_mode()`
"""
# TODO(jleibs): Profile is_instance with runtime_checkable vs has_attr
# Note from: https://docs.python.org/3/library/typing.html#typing.runtime_checkable
Expand All @@ -139,7 +144,7 @@ def log(
if hasattr(entity, "as_component_batches"):
components = entity.as_component_batches()
else:
components = entity
components = list(entity)

if hasattr(entity, "num_instances"):
num_instances = entity.num_instances()
Expand All @@ -156,6 +161,7 @@ def log(
)


@catch_and_log_exceptions()
def log_components(
entity_path: str,
components: Iterable[ComponentBatchLike],
Expand All @@ -164,6 +170,7 @@ def log_components(
ext: dict[str, Any] | None = None,
timeless: bool = False,
recording: RecordingStream | None = None,
strict: bool | None = None,
) -> None:
"""
Log an entity from a collection of `ComponentBatchLike` objects.
Expand All @@ -190,7 +197,10 @@ def log_components(
Specifies the [`rerun.RecordingStream`][] to use. If left unspecified,
defaults to the current active data recording, if there is one. See
also: [`rerun.init`][], [`rerun.set_global_data_recording`][].
strict:
If True, raise exceptions on non-loggable data.
If False, warn on non-loggable data.
if None, use the global default from `rerun.strict_mode()`
"""
instanced: dict[str, pa.Array] = {}
splats: dict[str, pa.Array] = {}
Expand All @@ -204,6 +214,11 @@ def log_components(
num_instances = max(len(arr) for arr in arrow_arrays)

for name, array in zip(names, arrow_arrays):
# Array could be None if there was an error producing the empty array
# Nothing we can do at this point other than ignore it. Some form of error
# should have been logged.
if array is None:
pass
# Strip off the ExtensionArray if it's present. We will always log via component_name.
# TODO(jleibs): Maybe warn if there is a name mismatch here.
if isinstance(array, pa.ExtensionArray):
Expand Down
2 changes: 1 addition & 1 deletion rerun_py/rerun_sdk/rerun/_validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ def flat_np_float_array_from_array_like(data: Any, dimension: int) -> npt.NDArra

if not valid:
raise ValueError(
f"Expected either a flat array with a length a of {dimension} elements, or an array with shape (`num_elements`, {dimension}). Shape of passed array was {array.shape}."
f"Expected either a flat array with a length multiple of {dimension} elements, or an array with shape (`num_elements`, {dimension}). Shape of passed array was {array.shape}."
)

return array.reshape((-1,))
Expand Down
21 changes: 19 additions & 2 deletions rerun_py/rerun_sdk/rerun/archetypes/annotation_context.py

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit d4041b4

Please sign in to comment.