
Fix patch_off_object #114

Merged
merged 18 commits into thousandbrainsproject:main from patch_off_object on Dec 20, 2024

Conversation

@scottcanoe (Contributor) commented Dec 14, 2024

Problem Statement

In a small percentage of runs (<1%), Monty gets stuck bouncing between off-object observations early in the run. These experiments time out after max_monty_steps with no matching steps performed. The timeouts occur mostly, but not exclusively, in experiments with distant agents, so I've focused on distant agents so far. Here's a quick overview of what I've observed and done.

In distant agents experiments, get_good_view is called at the beginning of an episode. It moves the agent to a reasonable distance from the object and orients the agent so that the view finder's central pixel is on-object. However, the view finder's central area is not a perfect proxy for sensor module 0 (a patch), and this method does not guarantee that the patch will be on-object, even if the view finder is.

  1. Because of the differences in resolution/zoom between the patch and the view finder, the view finder's central pixel is covered by about 10 pixels in the patch's image. If the view finder is aimed directly at an object's edge, the patch's central pixel can end up slightly off-object. If the agent then moves the wrong way while trying to find the object and ends up even further off-object, the system gets stuck in a loop, bouncing back and forth between these two off-object observations and repeatedly calling fixme_undo_last_action. Example observed with the "banana" object.

  2. A semantic map can contain "holes", which is problematic if the agent orients directly onto one. For example, the "chain" object has (real) topological holes in its links. If the agent orients onto a hole, it may jump to an off-object location, reverse that action, and ultimately get stuck in a fixme_undo_last_action loop. In another case, a small (spurious) hole in the knife's semantic map caused the system to get stuck in the same kind of loop: the patch's semantic map showed holes on that part of the knife.
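To make the failure mode concrete, here is a toy sketch of the bounce loop (illustrative only, not Monty code; the pose labels and step counter are invented for this example):

```python
def bounce_loop(max_monty_steps: int) -> int:
    """Toy model of a stuck episode: every observation is off-object, so
    each fixme_undo_last_action flips the agent between two poses."""
    pose = "off_object_A"
    matching_steps = 0  # incremented only by on-object observations
    for _ in range(max_monty_steps):
        # Undoing the last action just lands on the other off-object pose.
        pose = "off_object_B" if pose == "off_object_A" else "off_object_A"
        # Neither pose is on-object, so no matching step is ever performed.
    return matching_steps
```

Such an episode runs for max_monty_steps and ends with zero matching steps, which is exactly the timeout signature described above.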

Solutions

I've made several changes here.

  1. Where get_good_view was called (at the beginning of an episode and after a jump), it is now called twice: first with the view finder and then with sensor module 0. The second call allows finer orienting movements that ensure the central sensor patch ends up on-object.
  2. Allowing the agent to move forward on the second pass with the sensor patch caused problems, so I added a parameter to get_good_view that disables move-forward actions. Together, changes (1) and (2) are the primary fix for problem (1).
  3. I shrunk the smoothing kernel in find_location_to_look_at to reduce the likelihood of orienting onto a semantic hole. The previous kernel was quite large, and the resulting smearing of semantic information made it difficult to discourage orientation toward semantic holes. This fixed problem (2).
  4. I've also added a fix that resets an LM's current_mlh between episodes. The stale value wasn't affecting the system's behavior, but it did cause incorrect logging of these patch_off_object episodes.
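As a rough sketch of how changes (1) and (2) compose (the class context and sensor IDs here are assumptions for illustration; `get_good_view` and `allow_translation` are the names from this PR):

```python
class MotorSystemSketch:
    """Minimal stand-in for the motor system; only the call pattern of the
    two-pass good-view procedure is shown, not the real movement logic."""

    def __init__(self):
        self.calls = []  # record of (sensor_id, allow_translation) calls

    def get_good_view(self, sensor_id, allow_translation=True):
        # Stub: the real method orients (and optionally moves) the agent
        # until the named sensor's central pixel is on-object.
        self.calls.append((sensor_id, allow_translation))

    def get_good_view_with_patch_refinement(self):
        # Pass 1: coarse positioning with the wide view finder; the agent
        # may translate (move forward) to reach a good distance.
        self.get_good_view("view_finder")
        # Pass 2: fine orientation with the zoomed-in sensor patch; forward
        # motion is disabled because translating with the narrow patch
        # caused problems (change 2).
        self.get_good_view("patch", allow_translation=False)
```

The view-finder pass gets the agent close and roughly aimed; the patch pass then guarantees the higher-zoom sensor itself is on-object.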

These changes eliminated patch_off_object episodes for all benchmark experiments tested, except for the multi-object experiment and the surface agent experiments. Those will have to be tackled in a future PR.

Benchmark updates for the 10- and 77-object experiments have been included in this PR with no substantial changes to report.

scottcanoe and others added 15 commits December 12, 2024 08:13
Enforce stricter criteria for considering a view on-object, and improve the ability to locate a good movement target when it is not.
Added `allow_translation` for get_good_view to make moving closer to the object optional.

Also changed callers of get_good_view to use the double check (view finder first, then a patch)
@scottcanoe scottcanoe marked this pull request as ready for review December 16, 2024 21:18
@scottcanoe (Contributor Author):

Tagging @hlee9212 for visibility.

@scottcanoe scottcanoe added bug Something isn't working documentation Improvements or additions to documentation enhancement New feature or request labels Dec 16, 2024
@scottcanoe (Contributor Author):

Also noting here that I don't know what's causing the spurious semantic holes to sometimes show up (like in the "knife" example above), but I'll plan to take a closer look at that while working on the upcoming semantic sensor PR.

@scottcanoe scottcanoe changed the title WIP: Fix patch_off_object Fix patch_off_object Dec 16, 2024
@tristanls tristanls added the triaged This issue or pull request was triaged label Dec 17, 2024
@nielsleadholm (Contributor) left a comment:

Really nice, thanks for the fix and the great documentation. Just added a function suggestion.

@@ -715,6 +735,10 @@ def handle_successful_jump(self):

else:
self.get_good_view("view_finder")
@nielsleadholm (Contributor) commented Dec 18, 2024:

I wonder if this could be made a bit "DRYer", and the underlying assumptions about the central sensor patch made a bit more visible, by having an outer function, maybe called something like multi_step_get_good_view (or a better name), which takes no arguments (since we are hard-coding these anyway). This function would then call:

self.get_good_view("view_finder")
for patch_id in ("patch", "patch_0"):
    if patch_id in self._observation["agent_id_0"].keys():
        self.get_good_view(patch_id, allow_translation=False)
        break

which could be used on both lines 487 and 738.

Contributor Author:

How about get_good_view_with_patch_refinement? It's a long name, but multi-step to me sounds like it could refer to multiple steps taken with the same sensor (as is partly done already with move forward).

Contributor:

Yeah good point, I like it!

Contributor Author:

Updated.

@nielsleadholm (Contributor):

@vkakerbeck I'm curious whether, when you've examined YCB objects in the past, you ever encountered holes in objects like the knife blade? I'm assuming this is a defect in the underlying mesh that emerged during the original 3D scanning of the YCB objects, since it's being picked up by the ground-truth semantic sensor.

@vkakerbeck (Contributor) left a comment:

LGTM, thanks for the fix! Just a few questions so I understand all the changes.

Oh and yes, the YCB objects have tons of weird artifacts like that...

@@ -724,7 +725,7 @@ def find_location_to_look_at(self, sem3d_obs, image_shape, target_semantic_id):
# as expected, which can otherwise break if e.g. on_object_image is passed
# as an int or boolean rather than float
smoothed_on_object_image = scipy.ndimage.gaussian_filter(
-    on_object_image, sem3d_obs.shape[0] / 10, mode="constant"
+    on_object_image, 2, mode="constant"
Contributor:

Why is this number hardcoded now?

Contributor Author:

I just landed on a kernel size of 2 because it made the smoothed images look "reasonably" smooth and solved the issue with holes on the object. I can see why we would want the kernel size to scale with the resolution, but what we really want to scale it by is the physical distance on the object between pixels, since holes in the object are presumably pretty small (in centimeters). This is something I can come back to, but I left it as-is since it gave good results in all the images I looked at during debugging.
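For concreteness, the two sigma rules under discussion can be written out as follows (the 64-pixel image height is an assumed example value, not something from the PR):

```python
def old_sigma(image_height: int) -> float:
    # Previous rule: sigma scaled with resolution (sem3d_obs.shape[0] / 10),
    # so e.g. a 64-pixel image was smoothed with sigma 6.4 -- large enough
    # to smear small semantic holes across much of the patch.
    return image_height / 10

def new_sigma(image_height: int) -> float:
    # New rule in this PR: a fixed sigma of 2, independent of resolution.
    return 2.0
```

A resolution-independent alternative, as discussed above, would scale sigma by the physical inter-pixel distance on the object instead of the pixel count.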

self.current_mlh["location"] = [0, 0, 0]
self.current_mlh["rotation"] = Rotation.from_euler("xyz", [0, 0, 0])
self.current_mlh["scale"] = 1
self.current_mlh["evidence"] = 0
Contributor:

What was the reason these needed to be added now?

Contributor Author:

Since not resetting current_mlh was causing incorrect logging for patch_off_object episodes, I thought it made sense to include it here. More specifically, an episode could end up with patch_off_object but it would report that the most likely object was whatever the MLH was for the previous episode.
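A dependency-free sketch of the reset being discussed (the real code stores a scipy `Rotation` object; plain Euler angles and the helper name are assumptions used here to keep the sketch self-contained):

```python
def make_neutral_mlh() -> dict:
    """Return a neutral most-likely-hypothesis dict, used to reset an LM's
    current_mlh between episodes so that a patch_off_object episode does
    not log the previous episode's most likely object."""
    return {
        "location": [0, 0, 0],
        "rotation_euler_xyz": [0, 0, 0],  # identity rotation
        "scale": 1,
        "evidence": 0,
    }
```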

Contributor:

ah okay, good catch!

| base_77obj_surf_agent | 98.27% | 4.33% | 52 | 0.18 | 42m | 123s |
| randrot_noise_77obj_dist_agent | 87.01% | 29.00% | 151 | 0.63 | 2h10m | 468s |
| randrot_noise_77obj_surf_agent | 94.37% | 21.65% | 113 | 0.61 | 1h31m | 339s |
| randrot_noise_77obj_5lms_dist_agent | 90.91% | 5.19% | 70 | 1.01 | 1h7m | 1439s |
Contributor:

Do you have an idea why performance is worse now in terms of % correct and % used mlh?

Contributor Author:

I was hoping to come up with a good answer for this, but ultimately I don't really know. I reran with a handful of different random seeds, sometimes recovering roughly the same accuracy as before. Somehow the very slightly different starting conditions had a noticeable effect on accuracy, but I don't have much to say beyond that. It's something I'd like to investigate further in the near future. I'll be redoing all benchmarks for the semantic sensor PR, so maybe there's more I can learn then.

@scottcanoe scottcanoe merged commit f610702 into thousandbrainsproject:main Dec 20, 2024
13 checks passed
@scottcanoe scottcanoe deleted the patch_off_object branch December 20, 2024 17:33