Add future work documentation #87

Merged
Commits (39)
7602233
Add SM suggestions
nielsleadholm Nov 27, 2024
f2a4d63
Add more suggestions to future work
nielsleadholm Nov 27, 2024
1435220
Add more info esp. on policies
nielsleadholm Nov 28, 2024
ffb5d05
Further updates to policy directions
nielsleadholm Nov 28, 2024
bf3b402
Add RL and decomposition motor info
nielsleadholm Nov 28, 2024
8b3cdfa
Add more CMP and multi-object etc. descriptions
nielsleadholm Nov 28, 2024
cf95ff1
Add more descriptions e.g. symmetry handling
nielsleadholm Nov 29, 2024
7a7485d
Add note on scale invariance
nielsleadholm Dec 2, 2024
cea1e11
Merge branch 'main' into Update-future-work-documentation
nielsleadholm Dec 2, 2024
645ae53
Fix broken links
nielsleadholm Dec 2, 2024
5d115a8
Add links to new tasks
nielsleadholm Dec 2, 2024
af5b69e
Typo fixes
nielsleadholm Dec 2, 2024
9eb2fea
Further typo fixes
nielsleadholm Dec 2, 2024
8423b85
Updates to memory description
nielsleadholm Dec 2, 2024
5f1b891
Add dataset link
nielsleadholm Dec 2, 2024
774238e
General language improvements
nielsleadholm Dec 2, 2024
b101f13
Change naming of less to fewer
nielsleadholm Dec 5, 2024
f40b8b3
Add details on re-anchoring
nielsleadholm Dec 5, 2024
aa80d5f
Fix typo
nielsleadholm Dec 5, 2024
1d2b7cf
Adjustments to voting-related work
nielsleadholm Dec 10, 2024
1d29d40
Update to supervision description
nielsleadholm Dec 10, 2024
257b550
Improve comments re. multi-object datasets
nielsleadholm Dec 10, 2024
b7eeea7
Reformulate using priors work
nielsleadholm Dec 10, 2024
1b1eb15
Address further review comments
nielsleadholm Dec 11, 2024
2ddee71
Update docs/future-work/learning-module-improvements/use-off-object-o…
nielsleadholm Dec 11, 2024
3b58a04
Update docs/future-work/learning-module-improvements/use-off-object-o…
nielsleadholm Dec 11, 2024
720600b
Update docs/future-work/voting-improvements/use-pose-for-voting.md
nielsleadholm Dec 11, 2024
24566d9
Update docs/future-work/motor-system-improvements/implement-efficient…
nielsleadholm Dec 11, 2024
8ef1072
Update particle filter description
nielsleadholm Dec 11, 2024
d86623c
Merge remote-tracking branch 'origin/Update-future-work-documentation…
nielsleadholm Dec 11, 2024
a6fe0f5
Merge branch 'main' into Update-future-work-documentation
nielsleadholm Dec 11, 2024
daec672
Further changes for review comments
nielsleadholm Dec 11, 2024
d45d72f
Update hash-tags for tasks
nielsleadholm Dec 11, 2024
399b48b
Make naming consistent for future work sections
nielsleadholm Dec 11, 2024
1073091
Use better term for policy switching
nielsleadholm Dec 12, 2024
8baaab5
Reorder motor tasks in list
nielsleadholm Dec 12, 2024
20c23df
Remove confusing reference to voting for hierarchical connections
nielsleadholm Dec 12, 2024
9607328
Update wording around displacement CMP
nielsleadholm Dec 12, 2024
b141759
Merge branch 'main' into Update-future-work-documentation
nielsleadholm Dec 12, 2024
Binary file added docs/figures/future-work/dinner_medieval.png
Binary file added docs/figures/future-work/dinner_standard.png
5 changes: 2 additions & 3 deletions docs/future-work/cmp-hierarchy-improvements.md
@@ -4,9 +4,8 @@ description: Improvements we would like to add to the CMP or hierarchical inform
---
These are the things we would like to implement:

- [Figure out performance measure and supervision in heterarchy](cmp-hierarchy-improvements/figure-out-performance-measure-and-supervision-in-heterarchy.md) #infrastructure
- [Add top-down connections](cmp-hierarchy-improvements/add-top-down-connections.md) #numsteps
- [Add associative connections](cmp-hierarchy-improvements/add-associative-connections.md) #abstract #numsteps
- [Figure out performance measures and supervision in heterarchy](cmp-hierarchy-improvements/figure-out-performance-measure-and-supervision-in-heterarchy.md) #infrastructure #compositional
- [Add top-down connections](cmp-hierarchy-improvements/add-top-down-connections.md) #numsteps #multiobj #compositional
- [Run & Analyze experiments with >2LMs in heterarchy testbed](cmp-hierarchy-improvements/run-analyze-experiments-with-2lms-in-heterarchy-testbed.md) #compositional
- [Run & Analyze experiments in multiobject environment looking at scene graphs](cmp-hierarchy-improvements/run-analyze-experiments-in-multiobject-environment-looking-at-scene-graphs.md) #multiobj
- [Test learning at different speeds depending on level in hierarchy](cmp-hierarchy-improvements/test-learning-at-different-speeds-depending-on-level-in-hierarchy.md) #learning #generalization

This file was deleted.

@@ -2,4 +2,8 @@
title: Add Top-Down Connections
---

One of the main roles of top-down connections is the associative recall and prediction outlined in [Associative Connections](add-associative-connections.md). However, top-down projections can also support decomposing goal-states into specific sub-goals, as discussed in [Decomposing Goal States](../motor-system-improvements/decompose-goals-into-subgoals-communicate.md).
In Monty systems, low-level LMs project to high-level LMs when their sensory receptive fields are co-aligned. Hierarchical connections should be able to learn a mapping between the objects represented in these low-level LMs and the objects represented in high-level LMs with which they frequently co-occur. Such learning would be similar to that required for [Generalizing Voting To Associative Connections](../voting-improvements/generalize-voting-to-associative-connections.md).

For example, a high-level LM of a dinner-set might have learned that the fork is present at a particular location in its internal reference frame. When at that location, it would therefore predict that the low-level LM should be sensing a fork, enabling the perception of a fork in the low-level LM even when there is a degree of noise or other source of uncertainty in the low-level LM's representation.

In the brain, these top-down projections correspond to L6 to L1 connections, where the synapses at L1 would support predictions about object ID. However, these projections also form local synapses en route through the L6 layer of the lower-level cortical column. In a Monty LM, this would correspond to the top-down connection predicting not just the object that the low-level LM should be sensing, but also the specific location at which it should be sensing it. This could be complemented with predicting a particular pose of the low-level object (see [Use Better Priors for Hypothesis Initialization](../learning-module-improvements/use-better-hypothesis-priors.md)).
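
To make the idea concrete, below is a minimal sketch of such a learned top-down association. The class and names (`TopDownConnection`, `learn`, `predict`, the quantization step) are purely illustrative assumptions, not part of the Monty codebase.

```python
# Toy sketch of a learned top-down connection. Names and structure are
# illustrative only -- this is not the tbp.monty API.
from collections import defaultdict

class TopDownConnection:
    """Maps (high-level object, location in its reference frame) to the
    low-level object expected at that location."""

    def __init__(self):
        # (parent_object_id, quantized_location) -> {child_object_id: count}
        self.associations = defaultdict(lambda: defaultdict(int))

    def learn(self, parent_id, location, child_id):
        # Strengthen the association each time the pairing co-occurs.
        self.associations[(parent_id, self._quantize(location))][child_id] += 1

    def predict(self, parent_id, location):
        # Return the most frequently associated low-level object, which the
        # low-level LM could use to bias its evidence when its input is noisy.
        candidates = self.associations.get((parent_id, self._quantize(location)))
        if not candidates:
            return None
        return max(candidates, key=candidates.get)

    @staticmethod
    def _quantize(location, resolution=0.01):
        # Coarse binning so nearby locations map to the same key.
        return tuple(round(coordinate / resolution) for coordinate in location)

# Usage: a "dinner_set" LM predicts a fork at a given location in its frame.
connection = TopDownConnection()
connection.learn("dinner_set", (0.15, 0.0, 0.3), "fork")
print(connection.predict("dinner_set", (0.151, 0.001, 0.299)))  # -> "fork"
```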
@@ -1,9 +1,10 @@
---
title: Figure out Performance Measures and Supervision in Heterarchy
---
As we introduce hierarchy and compositional objects, such as a dinner-table setting, we need to figure out both how to measure the performance of the system and how to supervise the learning. For the latter, we might choose to train the system on component objects in isolation (a fork, a knife, etc.) before then showing Monty the full compositional object (the dinner-table setting). When evaluating performance, we might then see how well the system retrieves representations at different levels of the hierarchy. However, in the core setting of unsupervised learning, representations of the sub-objects would likely also emerge at the high level (a coarse knife representation, etc.), while we may also find some representations of the dinner scene in low-level LMs. Deciding how to measure performance will then be more difficult.

As we introduce hierarchy and leverage more unsupervised learning, representations will emerge at different levels of the system that may not correspond to any labels present in our datasets. For example, handles, or the head of a spoon, may emerge as object-representations in low-level LMs, even though the dataset only recognizes labels like "mug" and "spoon".
When we move to objects with less obvious composition (i.e. where the sub-objects must be disentangled in a fully unsupervised manner), representations will emerge at different levels of the system that may not correspond to any labels present in our datasets. For example, handles, or the head of a spoon, may emerge as object-representations in low-level LMs, even though the dataset only recognizes labels like "mug" and "spoon".

One approach to measure the "correctness" of representations in this setting might be how well a predicted representation aligns with the outside world. For example, while LMs are not designed to be used as generative models, we could visualize how well an inferred object graph maps onto the object actually present in the world. Quantifying such alignment might leverage measures such as differences in point-clouds. This would provide some evidence of how well the learned decomposition of objects corresponds to the actual objects present in the world.
This is less clear, but one approach to measure the "correctness" of representations in this setting might be how well a predicted representation aligns with the outside world. For example, while LMs are not designed to be used as generative models, we could visualize how well an inferred object graph maps onto the object actually present in the world. Quantifying such alignment might leverage measures such as differences in point-clouds. This would provide some evidence of how well the learned decomposition of objects corresponds to the actual objects present in the world.
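
As a concrete illustration of such a point-cloud comparison, the sketch below computes a symmetric Chamfer distance between the locations of an inferred object graph and points sampled from the ground-truth object. The function name and the assumption that both inputs are (N, 3) NumPy arrays are illustrative, not an existing Monty utility.

```python
# Sketch of a possible alignment measure: symmetric Chamfer distance between
# the locations stored in an inferred object graph and a point cloud sampled
# from the ground-truth object. Assumes both inputs are (N, 3) NumPy arrays.
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(inferred_points, ground_truth_points):
    # Average distance from each inferred point to its nearest ground-truth
    # point, and vice versa; lower values mean better alignment.
    to_truth, _ = cKDTree(ground_truth_points).query(inferred_points)
    to_inferred, _ = cKDTree(inferred_points).query(ground_truth_points)
    return to_truth.mean() + to_inferred.mean()

# Example with random stand-in point clouds.
rng = np.random.default_rng(0)
inferred = rng.normal(size=(200, 3))
ground_truth = inferred + rng.normal(scale=0.01, size=(200, 3))  # slight noise
print(chamfer_distance(inferred, ground_truth))
```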

See also [Make Dataset to Test Compositional Objects](../environment-improvements/make-dataset-to-test-compositional-objects.md) and [Metrics to Evaluate Categories and Generalization](../environment-improvements/create-dataset-and-metrics-to-evaluate-categories-and-generalization.md).
Just jotting down here that I'm interested in this direction. :)

@@ -2,10 +2,16 @@
title: Send Similarity Encoding Object ID to Next Level & Test
---

We have implemented the ability to encode object IDs using sparse-distributed representations (SDRs), and in particular can use this as a way of capturing similarity and dissimilarity between objects. Using such encodings in learned [Associative Connections](add-associative-connections.md), we should observe a degree of natural generalization when recognizing compositional objects.
We have implemented the ability to encode object IDs using sparse-distributed representations (SDRs), and in particular can use this as a way of capturing similarity and dissimilarity between objects. Using such encodings in learned [Hierarchical Connections](add-top-down-connections.md), we should observe a degree of natural generalization when recognizing compositional objects.

For example, assume a Monty system learns a dinner table setting with normal cutlery and plates. Separately, the system learns about medieval instances of cutlery and plates, but never sees them arranged in a dinner table setting. Based on the similarity of the medieval cutlery objects to their modern counterparts, the objects should have considerable overlap in their SDR encodings.
For example, assume a Monty system learns a dinner table setting with normal cutlery and plates (see examples below). Separately, the system learns about medieval instances of cutlery and plates, but never sees them arranged in a dinner table setting. Based on the similarity of the medieval cutlery objects to their modern counterparts, the objects should have considerable overlap in their SDR encodings.

If the system was to then see a medieval dinner table setting for the first time, it should be able to recognize the arrangement as a dinner-table setting with reasonable confidence, even if the constituent objects are somewhat different from those present when the compositional object was first learned.
Contributor: Could be nice to include images of these two scenes here for better visualization

Contributor Author: Good point! Adding


We should note that we are still determining whether overlapping bits between SDRs is the best way to encode object similarity. As such, we are also open to exploring this task with alternative approaches, such as directly making use of values in the evidence-similarity matrix (from which SDRs are currently derived).
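
For illustration, the sketch below scores similarity as the number of overlapping active bits between two object-ID SDRs. The SDR size, sparsity, and the way bits are shared between similar objects are arbitrary assumptions, and the derivation of SDRs from the evidence-similarity matrix is not shown.

```python
# Minimal sketch of using SDR overlap as a similarity score between object IDs.
import numpy as np

def random_sdr(size=2048, on_bits=41, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    sdr = np.zeros(size, dtype=bool)
    sdr[rng.choice(size, on_bits, replace=False)] = True
    return sdr

def overlap(sdr_a, sdr_b):
    # Number of shared active bits; higher overlap encodes higher similarity.
    return int(np.count_nonzero(sdr_a & sdr_b))

rng = np.random.default_rng(0)
fork_modern = random_sdr(rng=rng)

# A similar object could be encoded by reusing most of the active bits:
# drop 10 of the shared bits and activate 10 new ones.
fork_medieval = fork_modern.copy()
fork_medieval[rng.choice(np.flatnonzero(fork_medieval), 10, replace=False)] = False
fork_medieval[rng.choice(np.flatnonzero(~fork_medieval), 10, replace=False)] = True

plate = random_sdr(rng=rng)  # unrelated object

print(overlap(fork_modern, fork_medieval))  # high overlap (most bits shared)
print(overlap(fork_modern, plate))          # low overlap (near chance)
```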

![Standard dinner table setting](../../figures/future-work/dinner_standard.png)
*Example of a standard dinner table setting with modern cutlery and plates that the system could learn from.*

![Medieval dinner table setting](../../figures/future-work/dinner_medieval.png)
*Example of a medieval dinner table setting with medieval cutlery and plates that the system could be evaluated on, after having observed the individual objects in isolation.*
@@ -8,4 +8,6 @@ For example, the brain has a specialized region for episodic memory (the hippoca

As such, we would like to explore adding forms of episodic and working memory by introducing high-level learning modules that learn information on extremely fast time scales relative to lower-level LMs. These should be particularly valuable in settings such as recognizing multi-object arrangements in a scene, and providing memory when a Monty system is performing a multi-step task. Note that because of the overlap in the core algorithms, LMs can be used largely as-is for these memory systems, with the only change being the learning rate.
Contributor: It could be worth noting that the GridObjectModel would be particularly well suited for introducing a learning rate parameter.


As an additional note, varying the learning rate across learning modules will likely play an important role in dealing with representational drift, and the impact it can have on continual learning. For example, we expect that low-level LMs, which partly form the representations in higher-level LMs, will change their representations more slowly.
It is worth noting that the `GridObjectModel` would be particularly well suited for introducing a learning-rate parameter, due to its constraints on the amount of information that can be stored.

As a final note, varying the learning rate across learning modules will likely play an important role in dealing with representational drift, and the impact it can have on continual learning. For example, we expect that low-level LMs, which partly form the representations in higher-level LMs, will change their representations more slowly.
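
As a toy illustration of the proposed parameter (not the actual `GridObjectModel` implementation, where no such parameter currently exists), the sketch below uses an exponential moving average so that a high learning rate behaves like episodic/working memory and a low learning rate yields slowly drifting representations.

```python
# Toy illustration of a per-LM learning rate: features stored at grid locations
# are updated with an exponential moving average. This is NOT the tbp.monty
# GridObjectModel implementation -- just a sketch of the proposed parameter.
import numpy as np

class ToyGridMemory:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate  # ~1.0: episodic; ~0.01: slow structural learning
        self.features = {}                  # grid location -> feature vector

    def update(self, location, observed_features):
        observed = np.asarray(observed_features, dtype=float)
        if location not in self.features:
            self.features[location] = observed
        else:
            stored = self.features[location]
            # High learning rates overwrite quickly (working/episodic memory);
            # low learning rates integrate slowly (stable low-level LMs).
            self.features[location] = (1 - self.learning_rate) * stored + self.learning_rate * observed

episodic_lm = ToyGridMemory(learning_rate=1.0)    # remembers the latest scene immediately
low_level_lm = ToyGridMemory(learning_rate=0.05)  # drifts slowly, protecting downstream LMs
for lm in (episodic_lm, low_level_lm):
    lm.update((0, 0, 0), [1.0, 0.0])
    lm.update((0, 0, 0), [0.0, 1.0])
print(episodic_lm.features[(0, 0, 0)])   # [0. 1.]     -- fully updated
print(low_level_lm.features[(0, 0, 0)])  # [0.95 0.05] -- mostly unchanged
```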
2 changes: 1 addition & 1 deletion docs/future-work/environment-improvements.md
@@ -5,7 +5,7 @@ description: New environments and benchmark experiments we would like to add.

These are the things we would like to implement:

- [Make dataset to test compositional objects](environment-improvements/make-dataset-to-test-compositional-objects.md) #compositional
- [Make dataset to test compositional objects](environment-improvements/make-dataset-to-test-compositional-objects.md) #compositional #multiobj
- [Set up Environment that allows for object manipulation](environment-improvements/set-up-environment-that-allows-for-object-manipulation.md) #goalpolicy
- [Set up object manipulation benchmark tasks and evaluation measures](environment-improvements/set-up-object-manipulation-benchmark-tasks-and-evaluation-measures.md) #goalpolicy
- [Create dataset and metrics to evaluate categories and generalization](environment-improvements/create-dataset-and-metrics-to-evaluate-categories-and-generalization.md) #generalization
This always reminded me of the problem of multi-label classification (https://paperswithcode.com/task/multi-label-classification). It might be worth looking into some off-the-shelf model that can attach multiple labels, or even multiple attributes / affordances.

@@ -6,6 +6,6 @@ Datasets do not typically capture the flexibility of object labels based on whet

Labeling a dataset with "hierarchical" labels, such that an object might be both a "can", as well as a "can of tomato soup" would be one approach to capturing this flexibility. Once available, classification accuracy could be assessed both at the level of individual object instances, as well as at the level of categories.

We might leverage crowd-sourced labels to ensure that this labeling is reflective of human perception, and not biased by our beliefs as designers of Monty.
We might leverage crowd-sourced labels to ensure that this labeling is reflective of human perception, and not biased by our beliefs as designers of Monty. This also relates to the general problem of [Multi-Label Classification](https://paperswithcode.com/task/multi-label-classification), and so there may be off-the-shelf solutions that we can explore.

Initially such labels should focus on morphology, as this is the current focus of Monty's recognition system. However, we would eventually want to also account for affordances, such as an object that is a chair, a vessel, etc.
Initially such labels should focus on morphology, as this is the current focus of Monty's recognition system. However, we would eventually want to also account for affordances, such as an object that is a chair, a vessel, etc. Being able to classify objects based on their affordances would be an experimental stepping stone to the true measure of the system's representations: how well affordances are used to manipulate the world.
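
As a small sketch of how such hierarchical labels could be scored, the snippet below reports accuracy at both the instance and category level; the label hierarchy is made up purely for illustration.

```python
# Sketch of scoring predictions against hierarchical labels at two levels.
# The label hierarchy below is made up for illustration.
CATEGORY_OF = {
    "can_of_tomato_soup": "can",
    "can_of_beans": "can",
    "coffee_mug": "mug",
}

def score(predictions, ground_truth):
    instance_hits = sum(p == g for p, g in zip(predictions, ground_truth))
    category_hits = sum(CATEGORY_OF[p] == CATEGORY_OF[g] for p, g in zip(predictions, ground_truth))
    n = len(ground_truth)
    return instance_hits / n, category_hits / n

preds = ["can_of_beans", "coffee_mug", "can_of_tomato_soup"]
truth = ["can_of_tomato_soup", "coffee_mug", "can_of_tomato_soup"]
print(score(preds, truth))  # (~0.67, 1.0): wrong instance but right category for the first object
```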
@@ -2,8 +2,14 @@
title: Make Dataset to Test Compositional Objects
---

We have developed an initial dataset based on setting a dinner-table with a variety of objects. For example, the objects can be arranged in a normal setting, or aligned in a row (i.e. not a typical dinner-table setting). Similarly, the component objects can be those of a modern dining table, or those from a "medieval" time-period. As such, this dataset can be used to test the ability of Monty systems to recognize compositional objects based on the specific arrangement of objects, and to test generalization to novel compositions.
We have developed an initial dataset based on recognizing a variety of dinner table sets with different arrangements of plates and cutlery. For example, the objects can be arranged in a normal setting, or aligned in a row (i.e. not a typical dinner-table setting). Similarly, the component objects can be those of a modern dining table, or those from a "medieval" time-period. As such, this dataset can be used to test the ability of Monty systems to recognize compositional objects based on the specific arrangement of objects, and to test generalization to novel compositions.

By using explicit objects to compose multi-part objects, this dataset has the advantage that we can learn on the component objects in isolation, using supervised learning signals if necessary.
By using explicit objects to compose multi-part objects, this dataset has the advantage that we can learn on the component objects in isolation, using supervised learning signals if necessary. It's worth noting that this is often how learning of complex compositional objects takes place in humans. For example, when learning to read, children begin by learning individual letters, which are themselves composed of a variety of strokes. Only once letters are learned can they combine them into words. More generally, disentangling an object from other objects is difficult without the ability to interact with it, or to see it in a sufficient range of contexts that its separation from other objects becomes clear.

However, we would eventually expect compositional objects to be learned in an unsupervised manner. When this is consistently possible, we can consider more diverse datasets where the component objects may not be as explicit. At that time, the challenges described in [Figure out Performance Measure and Supervision in Heterarchy](../cmp-hierarchy-improvements/figure-out-performance-measure-and-supervision-in-heterarchy.md) will become more relevant.

![Dinner table set](../../figures/future-work/dinner_variations_standard.png)
*Example of compositional objects made up of modern cutlery and plates.*

![Dinner table set](../../figures/future-work/dinner_variations_medieval.png)
*Example of compositional objects made up of medieval cutlery and plates.*
@@ -3,3 +3,5 @@ title: Set up Environment that Allows for Object Manipulation
---

See [Decompose Goals Into Subgoals & Communicate](../motor-system-improvements/decompose-goals-into-subgoals-communicate.md) for a discussion of the kind of tasks we are considering for early object-manipulation experiments. An even simpler task that we have recently considered is pressing a switch to turn a lamp on or off. We will provide further details on what these tasks might look like soon.
Contributor: Maybe add here that an important aspect of this task is to find a good simulator and figure out how we best set up an environment and agent for such a task (avoiding objects falling into the void, resetting the environment, modeling friction, ...)


Beyond the specifics of any given task, an important part of this future-work component is to identify a good simulator for such settings. For example, we would like to have a setup where objects are subject to gravity, but are prevented from falling into a void by a table or floor. Other elements of physics such as friction should also be simulated, while it should be straightforward to reset an environment, and specify the arrangement of objects (for example using 3D modelling software).