
Releases: tspooner/rsrl

0.3.0

12 Feb 12:50
Pre-release

This release includes major changes to the agent interface. The ControlAgent and PredictionAgent traits have been unified into a single trait, Agent, with an associated type that distinguishes between control and prediction tasks. Methods specific to one kind of agent, such as those for choosing actions, are then defined in separate traits such as Controller.
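As a rough illustration of the pattern described above, the following is a minimal sketch and not the actual rsrl API. The method names (handle_sample, pi), the tuple sample types and the two tabular agents (TabularTD, TabularQ) are assumptions made for this example; only the trait names Agent and Controller come from the release notes.

```rust
/// Hypothetical unified trait: the associated type describes the kind of
/// experience an agent learns from, which separates prediction from control.
trait Agent {
    type Sample;

    fn handle_sample(&mut self, sample: &Self::Sample);
}

/// Hypothetical extra capability that only control agents provide.
trait Controller<S, A>: Agent {
    /// Choose an action for the given state.
    fn pi(&mut self, state: &S) -> A;
}

/// Prediction agent: tabular TD(0) over (state, reward, next_state) samples.
struct TabularTD {
    values: Vec<f64>,
    alpha: f64,
    gamma: f64,
}

impl Agent for TabularTD {
    type Sample = (usize, f64, usize);

    fn handle_sample(&mut self, sample: &Self::Sample) {
        let (s, r, ns) = *sample;
        let td_error = r + self.gamma * self.values[ns] - self.values[s];
        self.values[s] += self.alpha * td_error;
    }
}

/// Control agent: tabular Q-learning over
/// (state, action, reward, next_state) samples.
struct TabularQ {
    q: Vec<Vec<f64>>,
    alpha: f64,
    gamma: f64,
}

impl Agent for TabularQ {
    type Sample = (usize, usize, f64, usize);

    fn handle_sample(&mut self, sample: &Self::Sample) {
        let (s, a, r, ns) = *sample;
        let max_next = self.q[ns].iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let td_error = r + self.gamma * max_next - self.q[s][a];
        self.q[s][a] += self.alpha * td_error;
    }
}

impl Controller<usize, usize> for TabularQ {
    /// Greedy action: index of the largest Q-value (first one wins ties).
    fn pi(&mut self, state: &usize) -> usize {
        let row = &self.q[*state];
        let mut best = 0;
        for (a, &v) in row.iter().enumerate() {
            if v > row[best] {
                best = a;
            }
        }
        best
    }
}
```

Code that only needs learning can be written against Agent alone, while code that needs to act can require Controller in addition; TabularTD implements only the former.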

This makes it much easier to build a library of agents that offer some, but not all, guarantees about behaviour. It should also make it easier to support continuous action spaces and to compose agents (say, when using a predictor inside an actor-critic method).

In addition, this release comes with some new agent implementations based on "true online" gradient methods.

0.2.3

06 Feb 16:13
Pre-release

This release includes some internal naming changes, additional features for the geometry module (in particular geometry::spaces) and a new domain that interacts with the OpenAI Gym via a Python interpreter instance. Some changes are still needed to better handle infinite dimensions (which are common in the OpenAI Gym; see #24), but one can already run experiments and upload results to the OpenAI leaderboard.

0.2.2

18 Jan 11:44
Pre-release

This version includes an implementation of the Polynomial basis for linear function approximation, an associated usage example and some refactors to the codebase, including:

  • Dedicated type aliases for Array1 and Array2: Vector and Matrix, respectively; these have a default type argument of f64.
  • Cleanup of the cartesian_product utility function.
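To make the connection between a polynomial basis and a cartesian product concrete, here is a minimal, dependency-free sketch. It is not rsrl's implementation: the function signatures are assumptions, the name polynomial_features is hypothetical, and Vec<f64> stands in for the Vector alias over Array1 mentioned above. Each feature is a product of input components raised to exponents between 0 and the chosen order, with one feature per exponent combination.

```rust
/// All combinations that take one element from each input list, in
/// lexicographic order of the list positions.
fn cartesian_product(lists: &[Vec<u32>]) -> Vec<Vec<u32>> {
    let mut out: Vec<Vec<u32>> = vec![vec![]];
    for list in lists {
        let mut next = Vec::with_capacity(out.len() * list.len());
        for prefix in &out {
            for &item in list {
                let mut combo = prefix.clone();
                combo.push(item);
                next.push(combo);
            }
        }
        out = next;
    }
    out
}

/// Polynomial features of `input` up to `order` in each dimension:
/// one feature prod_i x_i^{e_i} per exponent tuple (e_1, ..., e_d).
fn polynomial_features(input: &[f64], order: u32) -> Vec<f64> {
    let exponents: Vec<u32> = (0..=order).collect();
    let lists = vec![exponents; input.len()];
    cartesian_product(&lists)
        .iter()
        .map(|combo| {
            input
                .iter()
                .zip(combo)
                .map(|(x, &e)| x.powi(e as i32))
                .product::<f64>()
        })
        .collect()
}
```

For a d-dimensional input the sketch produces (order + 1)^d features, so the basis grows quickly with both the order and the input dimension.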

0.2.1

15 Jan 12:45
Compare
Choose a tag to compare
0.2.1 Pre-release
Pre-release

Along with some minor additions and changes, this release includes a fix for the eligibility trace algorithms QLambda and SARSALambda, which were observed to diverge during testing. As explained in 5919e80, normalisation was applied inconsistently between the update and evaluation methods of the Linear function approximator, which led to instability.

These have been corrected and an example was added to show convergence of the SARSALambda algorithm on the MountainCar domain.

0.2.0

08 Jan 09:43
Pre-release

This breaking update changes the method used to optimise for dense/sparse basis projectors. Specifically, we replace SparseLinear and DenseLinear with a single unified struct Linear which handles both dense and sparse feature vectors.

The new Projection enum is returned from a Projector (itself originally called Projection), so callers need explicit match statements to handle either a dense or a sparse vector. A Projector must now also expose an expand_projection method, which converts a Projection into a dense feature vector (of type Array1<f64>).
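The shape of this design can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: Vec<f64> stands in for Array1<f64>, the sparse variant is assumed to hold the indices of active binary features, and the dim, project and evaluate methods as well as the UniformBins projector are hypothetical names introduced for the example.

```rust
/// Either a full feature vector or the indices of active binary features.
#[derive(Debug, Clone, PartialEq)]
enum Projection {
    Dense(Vec<f64>),
    Sparse(Vec<usize>),
}

trait Projector {
    /// Number of features in the dense representation.
    fn dim(&self) -> usize;

    /// Map an input onto the basis.
    fn project(&self, input: &[f64]) -> Projection;

    /// Convert either variant into a dense feature vector.
    fn expand_projection(&self, projection: Projection) -> Vec<f64> {
        match projection {
            Projection::Dense(phi) => phi,
            Projection::Sparse(active) => {
                let mut phi = vec![0.0; self.dim()];
                for i in active {
                    phi[i] = 1.0;
                }
                phi
            }
        }
    }
}

/// Single linear approximator that handles both variants.
struct Linear {
    weights: Vec<f64>,
}

impl Linear {
    fn evaluate(&self, projection: &Projection) -> f64 {
        match projection {
            // Dense: full dot product.
            Projection::Dense(phi) => self
                .weights
                .iter()
                .zip(phi)
                .map(|(w, x)| w * x)
                .sum::<f64>(),
            // Sparse: only sum the weights of the active features.
            Projection::Sparse(active) => {
                active.iter().map(|&i| self.weights[i]).sum::<f64>()
            }
        }
    }
}

/// Toy projector: one-hot bin index for a scalar input clamped to [0, 1].
struct UniformBins {
    n_bins: usize,
}

impl Projector for UniformBins {
    fn dim(&self) -> usize {
        self.n_bins
    }

    fn project(&self, input: &[f64]) -> Projection {
        let x = input[0].max(0.0).min(1.0);
        let bin = ((x * self.n_bins as f64) as usize).min(self.n_bins - 1);
        Projection::Sparse(vec![bin])
    }
}
```

The explicit match in evaluate is what lets one struct skip the full dot product for sparse features while still accepting dense ones.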

All code has been refactored and optimised with respect to these changes and further tests added.

0.1.1

01 Jan 15:17
Pre-release

This patch update brings improved normalisation methods and corrections for the existing projection classes, leading to improved stability of function approximation.

0.1.0

24 Dec 18:04
Pre-release

This is the alpha release of the rsrl framework. Please try it out and let me know what you think!