Docs describe policies fix typos #42

Closed
wants to merge 4 commits into from
81 changes: 54 additions & 27 deletions README.md
@@ -26,50 +26,77 @@ Stages are connected by **``Stage Transitions``**, which are directed edges asso
With this structure alone, a user can define a basic curriculum with the flexibility of defining skip connections and regressions. For nodes with multiple outgoing edges, edges are labelled by priority, set by the user.


| ![High-Level Curriculum](./examples/example_project/diagrams/high_level_curr_diagram.png "Title") |
|:--:|
| ![High-Level Curriculum](./examples/example_project/diagrams/high_level_curr_diagram.png "Title") |
|:--:|
|*An example curriculum consisting of purely stages and stage transitions. This **``Curriculum``** consists of a skip connection between **``Stage``** 'StageA' and **``Stage``** 'Graduated'. **``Stage Transitions``** are triggered on a parameter 't2' and the skip transition is ordered before the transition going to **``Stage``** StageB.* |
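
The priority ordering described in the caption can be sketched in a few lines of plain Python. This is only an illustration: the `stage_graph` dictionary, the `next_stage` helper, and the thresholds on 't2' are assumptions made for this sketch, not the library's data model or API.

```python
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]
Rule = Callable[[Metrics], bool]

# Outgoing edges per stage, listed in priority order (first = highest).
# Hypothetical structure for illustration; thresholds on 't2' are invented.
stage_graph: Dict[str, List[Tuple[Rule, str]]] = {
    "StageA": [
        (lambda m: m["t2"] > 5, "Graduated"),  # skip connection, checked first
        (lambda m: m["t2"] > 0, "StageB"),
    ],
    "StageB": [(lambda m: m["t2"] > 5, "Graduated")],
    "Graduated": [],
}


def next_stage(current: str, metrics: Metrics) -> str:
    """Return the destination of the highest-priority edge whose rule is True."""
    for rule, destination in stage_graph[current]:
        if rule(metrics):
            return destination
    return current  # no rule triggered: the mouse stays in place
```

Because the skip edge is listed first, a mouse whose metrics satisfy both rules goes straight to 'Graduated' rather than to 'StageB'.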

$~$

This library also supports **``Curriculum``** **hypergraphs**.
Stages are intended to represent 'checkpoint learning objectives', which wrap independent sets of parameters,
for example, Stage1 = {P1, P2, P3} -> Stage2 = {P4, P5, P6}.

Conceptually, a user may want to change the rig parameters associated with a stage, but this set of rig parameters would be unnatural to classify as a new training stage altogether.
In this situation, the user may define a graph of **``Policies``** and **``Policy Transitions``** within a **``Stage``**.
A **``Policy``** changes the task parameters of a **``Stage``**, as described above. A **``Policy Transition``** acts just like a **``Stage Transition``**, and defines transitions between **``Policies``** on a trigger condition. Like **``Stage Transitions``**, **``Policy Transitions``** can connect any two arbitrary **``Policies``** and are ordered by priority set by the user.
If a curriculum demands changing the same set of parameters,
for example, Stage1 = {P1, P2, P3} -> Stage1' = {P1', P2', P3}, it is a good idea to use PolicyGraphs.

A PolicyGraph is a **parallel programming interface** for changing **``Stage``** parameters.

| ![Full Curriculum](./examples/example_project/diagrams/my_curr_diagram.png "Title") |
|:--:|
|*An example **``Curriculum``** consisting of **``Stage``** and **``Policy``** graphs. Left: The high level policy graph. Right: Internal policy graphs.* |

**``Policies``** are more nuanced than **``Stages``**.
| ![Full Curriculum](./examples/example_project/diagrams/my_curr_diagram.png "Title") |
|:--:|
|*An example **``Curriculum``** consisting of **``Stage``** and **``Policy``** graphs. Left: The high level policy graph. Right: Internal policy graphs.* |

Yellow **``Policies``** in the example indicate '**Start Policies**'. To initialize the rig parameters of a **``Stage``**, the user must specify which **``Policy/Policies``** in the **``Stage``** policy graph to start with.

Unlike **``Stages``**, a mouse can occupy multiple active **``Policies``** within a **``Stage``**. As described later, the **``Trainer``** will record the net combination of rig parameters.
| ![Track Curriculum](./examples/example_project_2/diagrams/track_curr_diagram.png "Title") |
|:--:|
|*A 'Train Track' **``Curriculum``*** |

$~$

**Any hypergraph is supported!**

Here are some examples of the possibilities. The high-level stage graphs are shown to the left and the individual policy graphs are shown to the right.


| ![Tree Curriculum](./examples/example_project_2/diagrams/tree_curr_diagram.png "Title") |
|:--:|
A PolicyGraph consists of **``Policy``** nodes and **``PolicyTransition``** directed edges.
Policies are user-defined functions that take in the current Stage **``TaskParameters``** and return the updated Stage **``TaskParameters``**.
PolicyTransitions define conditional execution of downstream Policies. Like **``StageTransition``**, **``PolicyTransition``**
can connect any two arbitrary **``Policy``** and are ordered by priority set by the user.
The yellow policies indicate **Start policies**, which are entrypoint(s) into the PolicyGraph specified by the user.
Altogether, Policies and PolicyTransitions may be assembled to form arbitrary execution trees and loops.
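
As a rough illustration of these two building blocks, a policy can be pictured as a plain function over parameters, and a transition rule as a predicate over metrics. The sketch below is an assumption-laden stand-in: `TrackParameters`, `raise_p1`, `m1_rule`, and the 0.8 threshold are hypothetical names and values, not the library's classes or signatures.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class TrackParameters:
    """Hypothetical stand-in for a Stage's TaskParameters."""

    p1: float = 0.0
    p2: float = 0.0


Metrics = Dict[str, float]


def raise_p1(params: TrackParameters) -> TrackParameters:
    """A policy: take the current parameters and return an updated copy."""
    return replace(params, p1=params.p1 + 1.0)


def m1_rule(metrics: Metrics) -> bool:
    """A transition rule: decide from metrics whether to move downstream."""
    return metrics["m1"] >= 0.8  # invented threshold


params = TrackParameters()
if m1_rule({"m1": 0.9}):
    # The rule fired, so the downstream policy is applied.
    params = raise_p1(params)
```

Returning a new object rather than mutating the input keeps each policy a small, testable unit.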

Notably, PolicyGraph is executed in parallel (execution is done by the Trainer, discussed later).
A mouse may occupy multiple policies at once and will traverse down all trigger transitions returning True, similar to current in a circuit board.
While a mouse can only occupy one Stage at a time, a mouse can and will often occupy many active policies.
Intuitively, the current state of Stage parameters is the net parameter change of all active policies.

Parallel execution has the benefit of supporting asynchronous parameter updates, which is a more natural way of defining parameter changes.
Rather than defining how all stage parameters change as a group, a policy can instead define updates to individual parameters, which asynchronously trigger on different metrics.

A good example of using PolicyGraphs can be demonstrated in the 'Track' curriculum above.

Imagine 'Track Stage' manages two rig parameters, P1 and P2, and these rig parameters update independently from one another
Collaborator

I don't think we should use "rig parameters" in the language of the repo as this might be confusing against the way sci comp is using a rig.json schema

according to different metrics, in this case, metrics m1 and m2 associated with m1_rule and m2_rule respectively.
With parallel execution, the most natural way of implementing this situation is with two tracks as shown, where a mouse can progress asynchronously along each parameter track.
If PolicyGraph were limited to serial execution, implementing this use case would be possible but clumsier.
m1_rule and m2_rule would have to be combined into a compound policy transition and the left/right policies
would need to be combined into a compound policy with additional conditional logic inside checking if m1_rule or m2_rule was triggered.
With parallel execution, Policies and PolicyTransitions simplify into atomic operations.
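
The two-track idea can be sketched as follows. Everything here is a simplified assumption for illustration: the policy names, the `step` and `net_parameters` helpers, and the thresholds in `m1_rule` and `m2_rule` are invented, and the real Trainer may resolve transitions differently.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]
Metrics = Dict[str, float]
Policy = Callable[[Params], Params]
Rule = Callable[[Metrics], bool]


def set_param(name: str, value: int) -> Policy:
    """Build an atomic policy that updates exactly one parameter."""

    def policy(params: Params) -> Params:
        return {**params, name: value}

    return policy


def m1_rule(metrics: Metrics) -> bool:
    return metrics["m1"] > 0.8  # invented threshold


def m2_rule(metrics: Metrics) -> bool:
    return metrics["m2"] > 0.5  # invented threshold


# Policy nodes: one track per parameter.
policies: Dict[str, Policy] = {
    "p1_low": set_param("P1", 1),
    "p1_high": set_param("P1", 2),
    "p2_low": set_param("P2", 1),
    "p2_high": set_param("P2", 2),
}

# Outgoing transitions per policy, in priority order.
transitions: Dict[str, List[Tuple[Rule, str]]] = {
    "p1_low": [(m1_rule, "p1_high")],
    "p2_low": [(m2_rule, "p2_high")],
    "p1_high": [],
    "p2_high": [],
}


def step(active: List[str], metrics: Metrics) -> List[str]:
    """Advance every active policy independently along its own track."""
    updated = []
    for name in active:
        destination = name  # stay in place unless a rule fires
        for rule, target in transitions[name]:
            if rule(metrics):
                destination = target
                break
        updated.append(destination)
    return updated


def net_parameters(active: List[str], base: Params) -> Params:
    """Apply all active policies to get the net parameter state."""
    params = dict(base)
    for name in active:
        params = policies[name](params)
    return params


active = ["p1_low", "p2_low"]  # start policies, one per track
active = step(active, {"m1": 0.9, "m2": 0.1})  # only m1_rule fires
params = net_parameters(active, {"P1": 0, "P2": 0})
```

After this step only the P1 track has advanced; the P2 track waits until `m2_rule` fires, with no compound rule or compound policy needed.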

Writing to PolicyGraph is easy.
Similar to Curriculum's API for adding, removing, and reordering stages,
Stage comes with a simple API for adding, removing, and reordering policies.
The structure of the high-level graph and the policy graphs can always be seen using **``Curriculum.export_diagram(...)``**.

This library has been rigorously tested, and all combinations of StageGraph and PolicyGraph are supported.
Here are some more examples of the possibilities.
The high-level stage graphs are shown to the left and the individual policy graphs are shown to the right.
All diagrams have been generated automatically from examples/example_project and examples/example_project_2.

| ![Tree Curriculum](./examples/example_project_2/diagrams/tree_curr_diagram.png "Title") |
|:--:|
|*A 'Tree' **``Curriculum``*** |

| ![Track Curriculum](./examples/example_project_2/diagrams/track_curr_diagram.png "Title") |
|:--:|
|*A 'Train Track' **``Curriculum``*** |

| ![Policy Triangle Curriculum](./examples/example_project_2/diagrams/p_triangle_curr_diagram.png "Title") |
|:--:|
| ![Policy Triangle Curriculum](./examples/example_project_2/diagrams/p_triangle_curr_diagram.png "Title") |
|:--:|
|*A 'Policy Triangle' **``Curriculum``*** |

| ![Stage Triangle Curriculum](./examples/example_project_2/diagrams/s_triangle_curr_diagram.png "Title") |
|:--:|
| ![Stage Triangle Curriculum](./examples/example_project_2/diagrams/s_triangle_curr_diagram.png "Title") |
|:--:|
|*A 'Stage Triangle' **``Curriculum``*** |

$~$
10 changes: 0 additions & 10 deletions docs/source/aind_behavior_curriculum/base.rst

This file was deleted.

30 changes: 29 additions & 1 deletion docs/source/aind_behavior_curriculum/curriculum.rst
Collaborator

Any reason to go back to this way of documenting the classes?

@@ -1,9 +1,37 @@
curriculum
--------------------------------------------

.. automodule:: aind_behavior_curriculum.curriculum
.. autoclass:: aind_behavior_curriculum.curriculum.BehaviorGraph
:members:
:undoc-members:
:show-inheritance:
:no-index:

.. autoclass:: aind_behavior_curriculum.curriculum.Metrics
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.Rule
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.Policy
:members: rule, validate_rule
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.PolicyTransition
:members: rule, validate_rule
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.Stage
:members: add_policy, add_policy_transition, get_task_parameters, graph, name, remove_policy, remove_policy_transition, see_policies, see_policy_transitions, set_policy_transition_priority, set_start_policies, set_task_parameters, start_policies, task, validate_stage
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.StageTransition
:members: rule, validate_rule
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.StageGraph
:show-inheritance:

.. autoclass:: aind_behavior_curriculum.curriculum.Curriculum
:members: add_stage, add_stage_transition, download_curriculum, export_curriculum, export_diagram, export_json, graph, name, pkg_location, remove_stage, remove_stage_transition, see_stage_transitions, see_stages, set_stage_transition_priority, validate_curriculum
:show-inheritance:
92 changes: 50 additions & 42 deletions docs/source/index.rst
@@ -42,20 +42,13 @@ edges are labelled by priority, set by the user.

:math:`~`

This library also supports :py:class:`~aind_behavior_curriculum.curriculum.Curriculum` **hypergraphs**.

Conceptually, a user may want to change the task parameters associated
with a stage, but this set of task parameters would be unnatural to
classify as a new training stage altogether. In this situation, the user
may define a graph of :py:class:`~aind_behavior_curriculum.curriculum.Policy` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
within a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
. A :py:class:`~aind_behavior_curriculum.curriculum.Policy` changes the task parameters of
a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
, as described above. A :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` acts
just like a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, and defines transitions between
:py:class:`~aind_behavior_curriculum.curriculum.Policy` on a trigger condition. Like :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`,
:py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` can connect any two arbitrary
:py:class:`~aind_behavior_curriculum.curriculum.Policy` and are ordered by priority set by the user.
Stages are intended to represent 'checkpoint learning objectives', which wrap independent sets of parameters,
for example, Stage1 = {P1, P2, P3} -> Stage2 = {P4, P5, P6}.

If a curriculum demands changing the same set of parameters,
for example, Stage1 = {P1, P2, P3} -> Stage1' = {P1', P2', P3}, it is a good idea to use PolicyGraphs.

A PolicyGraph is a **parallel programming interface** for changing :py:class:`~aind_behavior_curriculum.curriculum.Stage` parameters.

|Full Curriculum|

@@ -65,43 +58,57 @@ just like a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, an
*graphs. Left: The high level policy graph. Right: Internal policy graphs.*


:py:class:`~aind_behavior_curriculum.curriculum.Policy` are more nuanced than :py:class:`~aind_behavior_curriculum.curriculum.Stage`.
|Track Curriculum|

Yellow :py:class:`~aind_behavior_curriculum.curriculum.Policy` in the example indicate '**Start Policies**'. To
initialize the task parameters of a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
, the user must specify
which :py:class:`~aind_behavior_curriculum.curriculum.Policy` in the :py:class:`~aind_behavior_curriculum.curriculum.Stage`
policy graph to start with.
*A 'Track'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`

Unlike :py:class:`~aind_behavior_curriculum.curriculum.Stage`
, a mouse can occupy multiple active
:py:class:`~aind_behavior_curriculum.curriculum.Policy` within a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
. As described later, the
:py:class:`~aind_behavior_curriculum.trainer.Trainer` will record the net combination of task parameters.

:math:`~`

**Any hypergraph is supported!**
A PolicyGraph consists of :py:class:`~aind_behavior_curriculum.curriculum.Policy` nodes and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` directed edges.
Policies are user-defined functions that take in the current Stage :py:class:`~aind_behavior_curriculum.task.TaskParameters` and return the updated Stage :py:class:`~aind_behavior_curriculum.task.TaskParameters`.
PolicyTransitions define conditional execution of downstream Policies. Like :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
can connect any two arbitrary :py:class:`~aind_behavior_curriculum.curriculum.Policy` and are ordered by priority set by the user.
The yellow policies indicate **Start policies**, which are entrypoint(s) into the PolicyGraph specified by the user.
Altogether, Policies and PolicyTransitions may be assembled to form arbitrary execution trees and loops.

Here are some examples of the possibilities. The high-level stage graphs
are shown to the left and the individual policy graphs are shown to the
right.
Notably, PolicyGraph is executed in parallel (execution is done by the Trainer, discussed later).
A mouse may occupy multiple policies at once and will traverse down all trigger transitions returning True, similar to current in a circuit board.
While a mouse can only occupy one Stage at a time, a mouse can and will often occupy many active policies.
Intuitively, the current state of Stage parameters is the net parameter change of all active policies.

|Tree Curriculum|
Parallel execution has the benefit of supporting asynchronous parameter updates, which is a more natural way of defining parameter changes.
Rather than defining how all stage parameters change as a group, a policy can instead define updates to individual parameters, which asynchronously trigger on different metrics.

*A 'Tree'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
A good example of using PolicyGraphs can be demonstrated in the 'Track' curriculum above.

Imagine 'Track Stage' manages two rig parameters, P1 and P2, and these rig parameters update independently from one another
according to different metrics, in this case, metrics m1 and m2 associated with m1_rule and m2_rule respectively.
With parallel execution, the most natural way of implementing this situation is with two tracks as shown, where a mouse can progress asynchronously along each parameter track.
If PolicyGraph were limited to serial execution, implementing this use case would be possible but clumsier.
m1_rule and m2_rule would have to be combined into a compound policy transition and the left/right policies
would need to be combined into a compound policy with additional conditional logic inside checking if m1_rule or m2_rule was triggered.
With parallel execution, Policies and PolicyTransitions simplify into atomic operations.

|Track Curriculum|
Writing to PolicyGraph is easy.
Similar to Curriculum's API for adding, removing, and reordering stages,
Stage comes with a simple API for adding, removing, and reordering policies.
The structure of the high-level graph and the policy graphs can always be seen using :py:meth:`~aind_behavior_curriculum.curriculum.Curriculum.export_diagram`.

*A 'Track'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
This library has been rigorously tested, and all combinations of StageGraph and PolicyGraph are supported.
Here are some more examples of the possibilities.
The high-level stage graphs are shown to the left and the individual policy graphs are shown to the right.
All diagrams have been generated automatically from examples/example_project and examples/example_project_2.


|Tree Curriculum|

*A 'Tree'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`

|Policy Triangle Curriculum|

*A 'Policy Triangle'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`


|Stage Triangle Curriculum|

*A 'Stage Triangle'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
@@ -123,18 +130,19 @@ policies as a starting place for evaluation.

2) Evaluation: For each registered mouse, the :py:class:`~aind_behavior_curriculum.trainer.Trainer` looks at
the mouse's current position in its hypergraph curriculum. The
:py:class:`~aind_behavior_curriculum.trainer.Trainer` collects all the current outgoing transitions and checks which evaluate to True. The :py:class:`~aind_behavior_curriculum.trainer.Trainer` determines the updated hypergraph position and associated :py:class:`~aind_behavior_curriculum.task.Task` parameters according to the following simple rules:
:py:class:`~aind_behavior_curriculum.trainer.Trainer` collects all the current outgoing transitions and checks which evaluate to True.
The :py:class:`~aind_behavior_curriculum.trainer.Trainer` determines the updated hypergraph position and associated :py:class:`~aind_behavior_curriculum.task.Task` parameters according to the following simple rules:

- :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.trainer.Trainer` with
the highest priority. If multiple :py:class:`~aind_behavior_curriculum.trainer.Trainer`
evaluate to True, then the :py:class:`~aind_behavior_curriculum.trainer.Trainer` with the
highest priority is chosen. Priority is set by the user.
- :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` with
the highest priority. If multiple :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`
evaluate to True, then the :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` with the
highest priority is chosen. Priority is set by the user using :py:meth:`~aind_behavior_curriculum.curriculum.Curriculum.set_stage_transition_priority`.
- :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` with
the highest priority. If multiple :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
evaluate to True, then the :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` with the
highest priority is chosen. Priority is set by the user.
- :py:class:`~aind_behavior_curriculum.trainer.Trainer` override :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`. If
a :py:class:`~aind_behavior_curriculum.trainer.Trainer` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` both
highest priority is chosen. Priority is set by the user using :py:meth:`~aind_behavior_curriculum.curriculum.Stage.set_policy_transition_priority`.
- :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` overrides :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`. If
a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` both
evaluate to True, the :py:class:`~aind_behavior_curriculum.trainer.Trainer` jumps directly to the next
:py:class:`~aind_behavior_curriculum.curriculum.Stage`.
- If no transitions are True, the mouse stays in place.
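
The evaluation rules above can be condensed into a short sketch. This is not the Trainer's implementation: the `evaluate` and `first_triggered` helpers, the edge dictionaries, the example stage and policy names, and the assumption that a mouse lands on the new stage's start policies after a stage transition are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Metrics = Dict[str, float]
Rule = Callable[[Metrics], bool]
Edges = List[Tuple[Rule, str]]  # (rule, destination), highest priority first


def first_triggered(edges: Edges, metrics: Metrics) -> Optional[str]:
    """Return the destination of the highest-priority rule that is True."""
    for rule, destination in edges:
        if rule(metrics):
            return destination
    return None


def evaluate(
    stage: str,
    active_policies: List[str],
    stage_edges: Dict[str, Edges],
    policy_edges: Dict[str, Edges],
    start_policies: Dict[str, List[str]],
    metrics: Metrics,
) -> Tuple[str, List[str]]:
    """Compute the updated (stage, active policies) position."""
    next_stage = first_triggered(stage_edges.get(stage, []), metrics)
    if next_stage is not None:
        # A stage transition overrides any policy transition.
        # Assumption: the mouse restarts at the new stage's start policies.
        return next_stage, list(start_policies.get(next_stage, []))

    updated = []
    for policy in active_policies:
        destination = first_triggered(policy_edges.get(policy, []), metrics)
        # If no transition is True, the mouse stays in place.
        updated.append(destination if destination is not None else policy)
    return stage, updated


# Hypothetical example graph with invented thresholds.
stage_edges = {"StageA": [(lambda m: m["score"] > 0.9, "StageB")]}
policy_edges = {"easy": [(lambda m: m["score"] > 0.5, "hard")]}
start_policies = {"StageA": ["easy"], "StageB": ["intro"]}
```

With a score of 0.95 both the stage rule and the policy rule are True, and the stage transition wins; with 0.6 only the policy advances; with 0.1 nothing changes.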