AllenNeuralDynamics · jwong-nd · Jun 14, 2024
diff --git a/README.md b/README.md
@@ -26,50 +26,77 @@ Stages are connected by **``Stage Transitions``**, which are directed edges asso
 With this structure alone, a user can define a basic curriculum with the flexibility of defining skip connections and regressions. For nodes with multiple ongoing edges, edges are labelled by priority, set by the user.
 
 
-| ![High-Level Curriculum](./examples/example_project/diagrams/high_level_curr_diagram.png "Title") | 
-|:--:| 
+| ![High-Level Curriculum](./examples/example_project/diagrams/high_level_curr_diagram.png "Title") |
+|:--:|
 |*An example curriculum consisting of purely stages and stage transitions. This **``Curriculum``** consists of a skip connection between **``Stage``** 'StageA' and **``Stage``** 'Graduated'. **``Stage Transitions``** are triggered on a parameter 't2' and the skip transition is ordered before the transition going to **``Stage``** StageB.* |
 
 $~$
 
-This library also supports **``Curriculum``** **hypergraphs**. 
+Stages are intended to represent 'checkpoint learning objectives', which wrap independent sets of parameters,
+for example, Stage1 = {P1, P2, P3} -> Stage2 = {P4, P5, P6}.
 
-Conceptually, a user may want to change the rig parameters associated with a stage, but this set of rig parameters would be unnatural to classify as a new training stage altogether.
-In this situation, the user may define a graph of **``Policies``** and **``Policy Transitions``** within a **``Stage``**.
-A **``Policy``**, changes the task parameters of a **``Stage``**, as described above. A **``Policy Transition``** acts just like a **``Stage Transition``**, and defines transitions between **``Policies``** on a trigger condition. Like **``Stage Transitions``**, **``Policy Transitions``**  can connect any two arbitrary **``Policies``** and are ordered by priority set by the user.
+If a curriculum demands changing the same set of parameters,
+for example, Stage1 = {P1, P2, P3} -> Stage1' = {P1', P2', P3}, it is a good idea to use PolicyGraphs.
 
+A PolicyGraph is a **parallel programming interface** for changing **``Stage``** parameters.
 
-| ![Full Curriculum](./examples/example_project/diagrams/my_curr_diagram.png "Title") | 
-|:--:| 
-|*An example **``Curriculum``** consisting of **``Stage``** and  **``Policy``** graphs. Left: The high level policy graph. Right: Internal policy graphs.* |
 
-**``Policies``** are more nuanced than **``Stages``**.
+| ![Full Curriculum](./examples/example_project/diagrams/my_curr_diagram.png "Title") |
+|:--:|
+|*An example **``Curriculum``** consisting of **``Stage``** and **``Policy``** graphs. Left: The high-level stage graph. Right: Internal policy graphs.* |
 
-Yellow **``Policies``** in the example indicate '**Start Policies**'. To initialize the rig parameters of a **``Stage``**, the user must specify which **``Policy/Policies``** in the **``Stage``** policy graph to start with.
 
-Unlike **``Stages``**, a mouse can occupy multiple active **``Policies``**  within a **``Stage``**. As described later, the **``Trainer``** will record the net combination of rig parameters.
+| ![Track Curriculum](./examples/example_project_2/diagrams/track_curr_diagram.png "Title") |
+|:--:|
+|*A 'Train Track' **``Curriculum``*** |
 
 $~$
 
-**Any hypergraph is supported!**
-
-Here are some examples of the possibilities. The high-level stage graph are shown to the left and the inidividual policy graphs are shown to the right.
-
-
-| ![Tree Curriculum](./examples/example_project_2/diagrams/tree_curr_diagram.png "Title") | 
-|:--:| 
+A PolicyGraph consists of **``Policy``** nodes and **``PolicyTransition``** directed edges.
+Policies are user-defined functions that take in the current Stage **``TaskParameters``** and return the updated Stage **``TaskParameters``**.
+PolicyTransitions define conditional execution of downstream Policies. Like a **``StageTransition``**, a **``PolicyTransition``**
+can connect any two **``Policy``** nodes, and transitions are ordered by a priority set by the user.
+The yellow policies indicate **Start policies**, which are the user-specified entry point(s) into the PolicyGraph.
+Altogether, Policies and PolicyTransitions may be assembled to form arbitrary execution trees and loops.
+
+Notably, PolicyGraph is executed in parallel (execution is done by the Trainer, discussed later).
+A mouse may occupy multiple policies at once and will traverse all transitions whose triggers return True, similar to current in a circuit board.
+While a mouse can only occupy one Stage at a time, a mouse can and will often occupy many active policies.
+Intuitively, the current state of Stage parameters is the net parameter change of all active policies.
+
+Parallel execution has the benefit of supporting asynchronous parameter updates, which is a more natural way of defining parameter changes.
+Rather than defining how all stage parameters change as a group, a policy can define updates to individual parameters, which trigger asynchronously on different metrics.
+
+The 'Track' curriculum above is a good example of using PolicyGraphs.
+
+Imagine 'Track Stage' manages two rig parameters, P1 and P2, and these rig parameters update independently of one another
+according to different metrics, in this case, metrics m1 and m2, associated with m1_rule and m2_rule respectively.
+With parallel execution, the most natural way of implementing this situation is with two tracks as shown, where a mouse can progress asynchronously along each parameter track.
+If PolicyGraph were limited to serial execution, implementing this use case would be possible but clumsier.
+m1_rule and m2_rule would have to be combined into a compound policy transition, and the left/right policies
+would need to be combined into a compound policy with additional conditional logic checking whether m1_rule or m2_rule was triggered.
+With parallel execution, Policies and PolicyTransitions simplify into atomic operations.
+
+Writing to PolicyGraph is easy.
+Similar to Curriculum's API for adding, removing, and reordering stages,
+Stage comes with a simple API for adding, removing, and reordering policies.
+The structure of the high-level graph and the policy graphs can always be seen using **``Curriculum.export_diagram(...)``**.
+
+This library has been rigorously tested, and all combinations of StageGraph and PolicyGraph are supported.
+Here are some more examples of the possibilities.
+The high-level stage graphs are shown on the left and the individual policy graphs on the right.
+All diagrams have been generated automatically from examples/example_project and examples/example_project_2.
+
+| ![Tree Curriculum](./examples/example_project_2/diagrams/tree_curr_diagram.png "Title") |
+|:--:|
 |*A 'Tree' **``Curriculum``*** |
 
-| ![Track Curriculum](./examples/example_project_2/diagrams/track_curr_diagram.png "Title") | 
-|:--:| 
-|*A 'Train Track' **``Curriculum``*** |
-
-| ![Policy Triangle Curriculum](./examples/example_project_2/diagrams/p_triangle_curr_diagram.png "Title") | 
-|:--:| 
+| ![Policy Triangle Curriculum](./examples/example_project_2/diagrams/p_triangle_curr_diagram.png "Title") |
+|:--:|
 |*A 'Policy Triangle' **``Curriculum``*** |
 
-| ![Stage Triangle Curriculum](./examples/example_project_2/diagrams/s_triangle_curr_diagram.png "Title") | 
-|:--:| 
+| ![Stage Triangle Curriculum](./examples/example_project_2/diagrams/s_triangle_curr_diagram.png "Title") |
+|:--:|
 |*A 'Stage Triangle' **``Curriculum``*** |
 
 $~$
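
A minimal sketch of the parallel 'Track' traversal described in the README text above, written in plain Python rather than against the library. Only P1, P2, `m1_rule`, and `m2_rule` come from the text. The dictionary-based graph, the policy names (`p1_start`, `p1_next`, `p2_start`, `p2_next`), the 0.8 thresholds, and the `advance` and `net_params` helpers are hypothetical stand-ins, not the `aind_behavior_curriculum` API.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]
Metrics = Dict[str, float]
PolicyFn = Callable[[Metrics, Params], Params]
RuleFn = Callable[[Metrics], bool]


def keep(metrics: Metrics, params: Params) -> Params:
    """Start policy: returns the stage parameters unchanged."""
    return dict(params)


def raise_p1(metrics: Metrics, params: Params) -> Params:
    """Policy on the P1 track: touches P1 only (value is illustrative)."""
    return {**params, "P1": 2}


def raise_p2(metrics: Metrics, params: Params) -> Params:
    """Policy on the P2 track: touches P2 only (value is illustrative)."""
    return {**params, "P2": 2}


def m1_rule(metrics: Metrics) -> bool:
    return metrics["m1"] > 0.8  # hypothetical threshold


def m2_rule(metrics: Metrics) -> bool:
    return metrics["m2"] > 0.8  # hypothetical threshold


# Hypothetical graph layout: policy name -> function, and
# policy name -> outgoing (rule, destination) edges, highest priority first.
POLICIES: Dict[str, PolicyFn] = {
    "p1_start": keep, "p1_next": raise_p1,
    "p2_start": keep, "p2_next": raise_p2,
}
TRANSITIONS: Dict[str, List[Tuple[RuleFn, str]]] = {
    "p1_start": [(m1_rule, "p1_next")],
    "p2_start": [(m2_rule, "p2_next")],
}


def advance(active: List[str], metrics: Metrics) -> List[str]:
    """Move every active policy independently; a policy stays put if no rule fires."""
    result: List[str] = []
    for name in active:
        dest = name
        for rule, target in TRANSITIONS.get(name, []):
            if rule(metrics):
                dest = target
                break
        if dest not in result:
            result.append(dest)
    return result


def net_params(active: List[str], metrics: Metrics, base: Params) -> Params:
    """Apply every active policy in turn to get the net stage parameters."""
    params = dict(base)
    for name in active:
        params = POLICIES[name](metrics, params)
    return params


stage_params = {"P1": 1, "P2": 1}
metrics = {"m1": 0.9, "m2": 0.1}            # only m1_rule fires
active = advance(["p1_start", "p2_start"], metrics)
current = net_params(active, metrics, stage_params)
print(active)   # ['p1_next', 'p2_start']
print(current)  # {'P1': 2, 'P2': 1}
```

In this sketch the P1 track advances while the P2 track stays at its start policy, so neither rule nor policy needs to know about the other track.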

diff --git a/docs/source/aind_behavior_curriculum/base.rst b/docs/source/aind_behavior_curriculum/base.rst
diff --git a/docs/source/aind_behavior_curriculum/curriculum.rst b/docs/source/aind_behavior_curriculum/curriculum.rst
@@ -1,9 +1,37 @@
 curriculum
 --------------------------------------------
 
-.. automodule:: aind_behavior_curriculum.curriculum
+.. autoclass:: aind_behavior_curriculum.curriculum.BehaviorGraph
    :members:
    :undoc-members:
    :show-inheritance:
+   :no-index:
 
+.. autoclass:: aind_behavior_curriculum.curriculum.Metrics
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.Rule
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.Policy
+   :members: rule, validate_rule
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.PolicyTransition
+   :members: rule, validate_rule
+   :show-inheritance:
 
+.. autoclass:: aind_behavior_curriculum.curriculum.Stage
+   :members: add_policy, add_policy_transition, get_task_parameters, graph, name, remove_policy, remove_policy_transition, see_policies, see_policy_transitions, set_policy_transition_priority, set_start_policies, set_task_parameters, start_policies, task, validate_stage
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.StageTransition
+   :members: rule, validate_rule
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.StageGraph
+   :show-inheritance:
+
+.. autoclass:: aind_behavior_curriculum.curriculum.Curriculum
+   :members: add_stage, add_stage_transition, download_curriculum, export_curriculum, export_diagram, export_json, graph, name, pkg_location, remove_stage, remove_stage_transition, see_stage_transitions, see_stages, set_stage_transition_priority, validate_curriculum
+   :show-inheritance:
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -42,20 +42,13 @@ edges are labelled by priority, set by the user.
 
 :math:`~`
 
-This library also supports :py:class:`~aind_behavior_curriculum.curriculum.Curriculum` **hypergraphs**.
-
-Conceptually, a user may want to change the task parameters associated
-with a stage, but this set of task parameters would be unnatural to
-classify as a new training stage altogether. In this situation, the user
-may define a graph of :py:class:`~aind_behavior_curriculum.curriculum.Policy` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
-within a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-. A :py:class:`~aind_behavior_curriculum.curriculum.Policy`, changes the task parameters of
-a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-, as described above. A :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` acts
-just like a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, and defines transitions between
-:py:class:`~aind_behavior_curriculum.curriculum.Policy` on a trigger condition. Like :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`,
-:py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` can connect any two arbitrary
-:py:class:`~aind_behavior_curriculum.curriculum.Policy` and are ordered by priority set by the user.
+Stages are intended to represent 'checkpoint learning objectives', which wrap independent sets of parameters,
+for example, Stage1 = {P1, P2, P3} -> Stage2 = {P4, P5, P6}.
+
+If a curriculum demands changing the same set of parameters,
+for example, Stage1 = {P1, P2, P3} -> Stage1' = {P1', P2', P3}, it is a good idea to use PolicyGraphs.
+
+A PolicyGraph is a **parallel programming interface** for changing :py:class:`~aind_behavior_curriculum.curriculum.Stage` parameters.
 
 |Full Curriculum|
 
@@ -65,43 +58,57 @@ just like a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, an
    *graphs. Left: The high level policy graph. Right: Internal policy graphs.*
 
 
-:py:class:`~aind_behavior_curriculum.curriculum.Policy` are more nuanced than :py:class:`~aind_behavior_curriculum.curriculum.Stage`.
+|Track Curriculum|
 
-Yellow :py:class:`~aind_behavior_curriculum.curriculum.Policy` in the example indicate '**Start Policies**'. To
-initialize the task parameters of a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-, the user must specify
-which :py:class:`~aind_behavior_curriculum.curriculum.Policy` in the :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-policy graph to start with.
+   *A 'Track'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
 
-Unlike :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-, a mouse can occupy multiple active
-:py:class:`~aind_behavior_curriculum.curriculum.Policy` within a :py:class:`~aind_behavior_curriculum.curriculum.Stage`
-. As described later, the
-:py:class:`~aind_behavior_curriculum.trainer.Trainer` will record the net combination of task parameters.
 
 :math:`~`
 
-**Any hypergraph is supported!**
+A PolicyGraph consists of :py:class:`~aind_behavior_curriculum.curriculum.Policy` nodes and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` directed edges.
+Policies are user-defined functions that take in the current Stage :py:class:`~aind_behavior_curriculum.task.TaskParameters` and return the updated Stage :py:class:`~aind_behavior_curriculum.task.TaskParameters`.
+PolicyTransitions define conditional execution of downstream Policies. Like a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`, a :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
+can connect any two :py:class:`~aind_behavior_curriculum.curriculum.Policy` nodes, and transitions are ordered by a priority set by the user.
+The yellow policies indicate **Start policies**, which are the user-specified entry point(s) into the PolicyGraph.
+Altogether, Policies and PolicyTransitions may be assembled to form arbitrary execution trees and loops.
 
-Here are some examples of the possibilities. The high-level stage graph
-are shown to the left and the individual policy graphs are shown to the
-right.
+Notably, PolicyGraph is executed in parallel (execution is done by the Trainer, discussed later).
+A mouse may occupy multiple policies at once and will traverse all transitions whose triggers return True, similar to current in a circuit board.
+While a mouse can only occupy one Stage at a time, a mouse can and will often occupy many active policies.
+Intuitively, the current state of Stage parameters is the net parameter change of all active policies.
 
-|Tree Curriculum|
+Parallel execution has the benefit of supporting asynchronous parameter updates, which is a more natural way of defining parameter changes.
+Rather than defining how all stage parameters change as a group, a policy can define updates to individual parameters, which trigger asynchronously on different metrics.
 
-   *A 'Tree'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
+The 'Track' curriculum above is a good example of using PolicyGraphs.
 
+Imagine 'Track Stage' manages two rig parameters, P1 and P2, and these rig parameters update independently of one another
+according to different metrics, in this case, metrics m1 and m2, associated with m1_rule and m2_rule respectively.
+With parallel execution, the most natural way of implementing this situation is with two tracks as shown, where a mouse can progress asynchronously along each parameter track.
+If PolicyGraph were limited to serial execution, implementing this use case would be possible but clumsier.
+m1_rule and m2_rule would have to be combined into a compound policy transition, and the left/right policies
+would need to be combined into a compound policy with additional conditional logic checking whether m1_rule or m2_rule was triggered.
+With parallel execution, Policies and PolicyTransitions simplify into atomic operations.
 
-|Track Curriculum|
+Writing to PolicyGraph is easy.
+Similar to Curriculum's API for adding, removing, and reordering stages,
+Stage comes with a simple API for adding, removing, and reordering policies.
+The structure of the high-level graph and the policy graphs can always be seen using :py:meth:`~aind_behavior_curriculum.curriculum.Curriculum.export_diagram`.
 
-   *A 'Track'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
+This library has been rigorously tested, and all combinations of StageGraph and PolicyGraph are supported.
+Here are some more examples of the possibilities.
+The high-level stage graphs are shown on the left and the individual policy graphs on the right.
+All diagrams have been generated automatically from examples/example_project and examples/example_project_2.
+
+
+|Tree Curriculum|
 
+   *A 'Tree'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
 
 |Policy Triangle Curriculum|
 
    *A 'Policy Triangle'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
 
-
 |Stage Triangle Curriculum|
 
    *A 'Stage Triangle'* :py:class:`~aind_behavior_curriculum.curriculum.Curriculum`
@@ -123,18 +130,19 @@ policies as a starting place for evaluation.
 
 2) Evaluation: For each registered mouse, the :py:class:`~aind_behavior_curriculum.trainer.Trainer` looks at
    the mouse's current position in its hypergraph curriculum. The
-   :py:class:`~aind_behavior_curriculum.trainer.Trainer` collects all the current outgoing transitions and checks which evaluate to True. The :py:class:`~aind_behavior_curriculum.trainer.Trainer` determines the updated hypergraph position and associated :py:class:`~aind_behavior_curriculum.task.Task` parameters according to the following simple rules:
+   :py:class:`~aind_behavior_curriculum.trainer.Trainer` collects all the current outgoing transitions and checks which evaluate to True.
+   The :py:class:`~aind_behavior_curriculum.trainer.Trainer` determines the updated hypergraph position and associated :py:class:`~aind_behavior_curriculum.task.Task` parameters according to the following simple rules:
 
-   -  :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.trainer.Trainer` with
-      the highest priority. If multiple :py:class:`~aind_behavior_curriculum.trainer.Trainer`
-      evaluate to True, then the :py:class:`~aind_behavior_curriculum.trainer.Trainer` with the
-      highest priority is chosen. Priority is set by the user.
+   -  :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` with
+      the highest priority. If multiple :py:class:`~aind_behavior_curriculum.curriculum.StageTransition`
+      evaluate to True, then the :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` with the
+      highest priority is chosen. Priority is set by the user using :py:meth:`~aind_behavior_curriculum.curriculum.Curriculum.set_stage_transition_priority`.
    -  :py:class:`~aind_behavior_curriculum.trainer.Trainer` takes the outgoing :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` with
       the highest priority. If multiple :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`
       evaluate to True, then the :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` with the
-      highest priority is chosen. Priority is set by the user.
-   -  :py:class:`~aind_behavior_curriculum.trainer.Trainer` override :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`. If
-      a :py:class:`~aind_behavior_curriculum.trainer.Trainer` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` both
+      highest priority is chosen. Priority is set by the user using :py:meth:`~aind_behavior_curriculum.curriculum.Stage.set_policy_transition_priority`.
+   -  :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` overrides :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition`. If
+      a :py:class:`~aind_behavior_curriculum.curriculum.StageTransition` and :py:class:`~aind_behavior_curriculum.curriculum.PolicyTransition` both
       evaluate to True, the :py:class:`~aind_behavior_curriculum.trainer.Trainer` jumps directly to the next
       :py:class:`~aind_behavior_curriculum.curriculum.Stage`.
    -  If no transitions are True, the mouse stays in place.
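
The evaluation rules listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the `Trainer` implementation: the `first_true` and `evaluate` helpers, the edge dictionaries, the policy names, the metric `t1`, and all thresholds are hypothetical. The stage names and `t2` echo the README's high-level example, in which the skip transition to 'Graduated' is ordered before the transition to 'StageB'.

```python
from typing import Callable, Dict, List, Optional, Tuple

Metrics = Dict[str, float]
Rule = Callable[[Metrics], bool]
Edges = List[Tuple[Rule, str]]  # (rule, destination), highest priority first


def first_true(edges: Edges, metrics: Metrics) -> Optional[str]:
    """Return the destination of the highest-priority edge whose rule is True."""
    for rule, dest in edges:
        if rule(metrics):
            return dest
    return None


def evaluate(
    stage: str,
    policies: List[str],
    stage_edges: Dict[str, Edges],
    policy_edges: Dict[str, Edges],
    metrics: Metrics,
) -> Tuple[str, List[str]]:
    """Apply the rules in order: stage transitions first, then policy transitions."""
    next_stage = first_true(stage_edges.get(stage, []), metrics)
    if next_stage is not None:
        # A stage transition overrides any policy transition. The new stage's
        # start policies are not modelled here, so an empty list is returned.
        return next_stage, []
    next_policies: List[str] = []
    for policy in policies:
        # If no outgoing policy transition is True, the policy stays in place.
        dest = first_true(policy_edges.get(policy, []), metrics) or policy
        if dest not in next_policies:
            next_policies.append(dest)
    return stage, next_policies


# Hypothetical edges; thresholds are made up for illustration.
STAGE_EDGES: Dict[str, Edges] = {
    "StageA": [
        (lambda m: m["t2"] > 10, "Graduated"),  # skip transition, priority 0
        (lambda m: m["t2"] > 5, "StageB"),      # priority 1
    ],
}
POLICY_EDGES: Dict[str, Edges] = {
    "PolicyA": [(lambda m: m["t1"] > 3, "PolicyB")],
}

skip = evaluate("StageA", ["PolicyA"], STAGE_EDGES, POLICY_EDGES, {"t1": 4, "t2": 12})
within = evaluate("StageA", ["PolicyA"], STAGE_EDGES, POLICY_EDGES, {"t1": 4, "t2": 1})
stay = evaluate("StageA", ["PolicyA"], STAGE_EDGES, POLICY_EDGES, {"t1": 0, "t2": 0})
print(skip)    # ('Graduated', [])
print(within)  # ('StageA', ['PolicyB'])
print(stay)    # ('StageA', ['PolicyA'])
```

With `t2 = 12` both stage rules are True, and the higher-priority skip edge wins even though the policy rule on `t1` is also True.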