
Commit

Merge pull request #77 from dynamicslab/attr-workaround-docs
use workaround for attribute formatting
briandesilva authored May 4, 2020
2 parents 9c380a4 + 7f74d0a commit ea6950c
Showing 12 changed files with 119 additions and 41 deletions.
2 changes: 1 addition & 1 deletion README.rst
@@ -191,7 +191,7 @@ References
- Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz.
*Sparse identification of nonlinear dynamics with control (SINDYc).*
IFAC-PapersOnLine 49.18 (2016): 710-715.
`[DOI] <https://doi.org/10.1016/j.ifacol.2016.10.249>`
`[DOI] <https://doi.org/10.1016/j.ifacol.2016.10.249>`_


.. |BuildCI| image:: https://github.com/dynamicslab/pysindy/workflows/Build%20CI/badge.svg
39 changes: 39 additions & 0 deletions docs/conf.py
@@ -67,3 +67,42 @@ def setup(app):
gallery_dirs=["examples"],
pattern=".+.ipynb",
)

# -- Extensions to the Napoleon GoogleDocstring class ---------------------
# michaelgoerz.net/notes/extending-sphinx-napoleon-docstring-sections.html
from sphinx.ext.napoleon.docstring import GoogleDocstring # noqa: E402


def parse_keys_section(self, section):
    return self._format_fields("Keys", self._consume_fields())


GoogleDocstring._parse_keys_section = parse_keys_section


def parse_attributes_section(self, section):
    return self._format_fields("Attributes", self._consume_fields())


GoogleDocstring._parse_attributes_section = parse_attributes_section


def parse_class_attributes_section(self, section):
    return self._format_fields("Class Attributes", self._consume_fields())


GoogleDocstring._parse_class_attributes_section = parse_class_attributes_section


def patched_parse(self):
    """
    Patch the parse method to guarantee that the above section parsers
    are assigned to the _sections dict.
    """
    self._sections["keys"] = self._parse_keys_section
    self._sections["class attributes"] = self._parse_class_attributes_section
    self._unpatched_parse()


GoogleDocstring._unpatched_parse = GoogleDocstring._parse
GoogleDocstring._parse = patched_parse
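
For reference, a docstring that exercises the sections handled by this workaround might look like the following sketch (the class and its fields are illustrative, not part of this commit):

class Example:
    """Small container used only to illustrate the patched sections.

    Attributes:
        data (list): Stored values, one entry per sample.

    Class Attributes:
        version (str): Format version shared by all instances.
    """

    version = "1.0"

    def __init__(self, data):
        self.data = data
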
22 changes: 11 additions & 11 deletions docs/tips.rst
@@ -1,7 +1,7 @@
Practical tips
==============

Here we provide pragmatic advice for using `PySINDy` effectively. We discuss potential pitfalls and strategies for overcoming them. We also specify how to incorporate custom methods not implemented natively in `PySINDy`, where applicable. The information presented here is derived from a combination of experience and theoretical considerations.
Here we provide pragmatic advice for using PySINDy effectively. We discuss potential pitfalls and strategies for overcoming them. We also specify how to incorporate custom methods not implemented natively in PySINDy, where applicable. The information presented here is derived from a combination of experience and theoretical considerations.

Numerical differentiation
-------------------------
@@ -23,10 +23,10 @@ By default, a second order finite difference method is used to differentiate inp

A toy example illustrating the effect of noise on derivatives computed with a second order finite difference method. Left: The data to be differentiated; :math:`y=\sin(x)` with and without a small amount of additive noise (normally distributed with mean 0 and standard deviation 0.01). Right: Derivatives of the data; the exact derivative :math:`\cos(x)` (blue), the finite difference derivative of the exact data (black, dashed), and the finite difference derivative of the noisy data.

One way to mitigate the effects of noise is to smooth the measurements before computing derivatives. The `SmoothedFiniteDifference` method can be used for this purpose.
One way to mitigate the effects of noise is to smooth the measurements before computing derivatives. The :code:`SmoothedFiniteDifference` method can be used for this purpose.
A numerical differentiation scheme with total variation regularization has also been proposed [Chartrand_2011]_ and recommended for use in SINDy [Brunton_2016]_.
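
For example, a model using smoothed derivatives might be set up as in the sketch below; the :code:`smoother_kws` keyword and its values are our assumptions about the API and should be checked against the :code:`SmoothedFiniteDifference` documentation.

from pysindy import SINDy
from pysindy.differentiation import SmoothedFiniteDifference

# Smooth the measurements (a Savitzky-Golay filter by default) before
# taking finite differences; window_length and polyorder are illustrative.
smoothed_fd = SmoothedFiniteDifference(
    smoother_kws={"window_length": 11, "polyorder": 3}
)
model = SINDy(differentiation_method=smoothed_fd)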

Users wishing to employ their own numerical differentiation schemes have two ways of doing so. Derivatives of input measurements can be computed externally with the method of choice and then passed directly into the `SINDy.fit` method via the `x_dot` keyword argument. Alternatively, users can implement their own differentiation methods and pass them into the `SINDy` constructor using the `differentiation_method` argument. In this case, the supplied class need only have implemented a `__call__` method taking two arguments, `x` and `t`.
Users wishing to employ their own numerical differentiation schemes have two ways of doing so. Derivatives of input measurements can be computed externally with the method of choice and then passed directly into the :code:`SINDy.fit` method via the :code:`x_dot` keyword argument. Alternatively, users can implement their own differentiation methods and pass them into the :code:`SINDy` constructor using the :code:`differentiation_method` argument. In this case, the supplied class need only have implemented a :code:`__call__` method taking two arguments, :code:`x` and :code:`t`.
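
A minimal sketch of the second option, assuming only the :code:`__call__(x, t)` interface described above (the class name and the forward-difference scheme are purely illustrative):

import numpy as np

from pysindy import SINDy


class ForwardDifference:
    """Toy differentiation method: first-order forward differences."""

    def __call__(self, x, t):
        # x: measurements, shape (n_samples, n_features)
        # t: scalar time step or array of sample times
        if np.isscalar(t):
            t = t * np.arange(x.shape[0])
        dt = np.diff(t).reshape(-1, 1)
        x_dot = np.empty_like(x)
        x_dot[:-1] = np.diff(x, axis=0) / dt
        x_dot[-1] = x_dot[-2]  # repeat the last row to keep the shape
        return x_dot


model = SINDy(differentiation_method=ForwardDifference())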

Library selection
-----------------
@@ -35,25 +35,25 @@ The SINDy method assumes dynamics can be represented as a *sparse* linear combin

Typically, prior knowledge of the system of interest and its dynamics should be used to make a judicious choice of basis functions. When such information is unavailable, the default class of library functions, polynomials, is a good place to start, as smooth functions have rapidly converging Taylor series. Brunton et al. [Brunton_2016]_ showed that, equipped with a polynomial library, SINDy can recover the first few terms of the (zero-centered) Taylor series of the true right-hand side function :math:`\mathbf{f}(x)`. If one has reason to believe the dynamics can be sparsely represented in terms of Chebyshev polynomials rather than monomials, then the library should include Chebyshev polynomials.

`PySINDy` includes the `CustomLibrary` and `IdentityLibrary` objects to allow for flexibility in the library functions. When the desired library consists of a set of functions that should be applied to each measurement variable (or pair, triplet, etc. of measurement variables) in turn, the `CustomLibrary` class should be used. The `IdentityLibrary` class is the most customizable, but transfers the work of computing library functions over to the user. It expects that all the features one wishes to include in the library have already been computed and are present in `X` before `SINDy.fit` is called, as it simply applies the identity map to each variable that is passed to it.
PySINDy includes the :code:`CustomLibrary` and :code:`IdentityLibrary` objects to allow for flexibility in the library functions. When the desired library consists of a set of functions that should be applied to each measurement variable (or pair, triplet, etc. of measurement variables) in turn, the :code:`CustomLibrary` class should be used. The :code:`IdentityLibrary` class is the most customizable, but transfers the work of computing library functions over to the user. It expects that all the features one wishes to include in the library have already been computed and are present in :code:`X` before :code:`SINDy.fit` is called, as it simply applies the identity map to each variable that is passed to it.
It is best suited for situations in which one has very specific instructions for how to apply library functions (e.g. if some of the functions should be applied to only some of the input variables).
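
A sketch of the first option follows; the :code:`library_functions` and :code:`function_names` keywords reflect our reading of the :code:`CustomLibrary` API and should be verified against the documentation.

import numpy as np

from pysindy import SINDy
from pysindy.feature_library import CustomLibrary

# Each function is applied to every variable (or pair of variables) in turn;
# function_names controls how the corresponding terms are printed.
library = CustomLibrary(
    library_functions=[lambda x: np.sin(x), lambda x, y: x * y],
    function_names=[lambda s: "sin(" + s + ")", lambda s, t: s + " " + t],
)
model = SINDy(feature_library=library)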

As terms are added to the library, the underlying sparse regression problem becomes increasingly ill-conditioned. Therefore it is recommended to start with a small library whose size is gradually expanded until the desired level of performance is achieved.
For example, a user may wish to start with a library of linear terms and then add quadratic and cubic terms as necessary to improve model performance.
For the best results, the strength of regularization applied should be increased in proportion to the size of the library to account for the worsening condition number of the resulting linear system.
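
For instance, one might sweep the polynomial degree and the regularization strength together, as in the following sketch (the values are arbitrary and should be tuned for the problem at hand):

from pysindy import SINDy
from pysindy.feature_library import PolynomialLibrary
from pysindy.optimizers import STLSQ

# Grow the library gradually, raising the ridge penalty (alpha) with its size.
for degree, alpha in [(1, 0.01), (2, 0.05), (3, 0.1)]:
    model = SINDy(
        feature_library=PolynomialLibrary(degree=degree),
        optimizer=STLSQ(threshold=0.1, alpha=alpha),
    )
    # model.fit(x, t=t) and compare model.score(x, t=t) across degrees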

Users may also choose to implement library classes tailored to their applications. To do so one should have the new class inherit from our `BaseFeatureLibrary` class. See the documentation for guidance on which functions the new class is expected to implement.
Users may also choose to implement library classes tailored to their applications. To do so one should have the new class inherit from our :code:`BaseFeatureLibrary` class. See the documentation for guidance on which functions the new class is expected to implement.
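
A skeletal example of such a class is sketched below; the import path and the assumption that :code:`fit`, :code:`transform`, and :code:`get_feature_names` are the required methods should both be checked against the documentation.

import numpy as np

from pysindy.feature_library.feature_library import BaseFeatureLibrary


class SineLibrary(BaseFeatureLibrary):
    """Toy custom library: the sine of each input feature."""

    def fit(self, X, y=None):
        self.n_input_features_ = X.shape[1]
        self.n_output_features_ = X.shape[1]
        return self

    def transform(self, X):
        return np.sin(X)

    def get_feature_names(self, input_features=None):
        if input_features is None:
            input_features = ["x%d" % i for i in range(self.n_input_features_)]
        return ["sin(%s)" % name for name in input_features]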

Optimization
------------
`PySINDy` uses various optimizers to solve the sparse regression problem. For a fixed differentiation method, set of inputs, and candidate library, there is still some variance in the dynamical system identified by SINDy, depending on which optimizer is employed.
PySINDy uses various optimizers to solve the sparse regression problem. For a fixed differentiation method, set of inputs, and candidate library, there is still some variance in the dynamical system identified by SINDy, depending on which optimizer is employed.

The default optimizer in `PySINDy` is the sequentially-thresholded least-squares algorithm (`STLSQ`). In addition to being the method originally proposed for use with SINDy, it involves a single, easily interpretable hyperparameter, and it exhibits good performance across a variety of problems.
The default optimizer in PySINDy is the sequentially-thresholded least-squares algorithm (:code:`STLSQ`). In addition to being the method originally proposed for use with SINDy, it involves a single, easily interpretable hyperparameter, and it exhibits good performance across a variety of problems.
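
For example (the threshold value is illustrative):

from pysindy import SINDy
from pysindy.optimizers import STLSQ

# Coefficients whose magnitude falls below the threshold are zeroed at each
# iteration of the sequentially-thresholded least-squares algorithm.
model = SINDy(optimizer=STLSQ(threshold=0.1))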

The sparse relaxed regularized regression (`SR3`) [Zheng_2018]_ [Champion_2019]_ algorithm can be used when the results of `STLSQ` are unsatisfactory. It involves a few more hyperparameters that can be tuned for improved accuracy. In particular, the `thresholder` parameter controls the type of regularization that is applied. For optimal results, one may find it useful to experiment with :math:`L^0`, :math:`L^1`, and clipped absolute deviation (CAD) regularization. The other hyperparameters can be tuned with cross-validation.
The sparse relaxed regularized regression (:code:`SR3`) [Zheng_2018]_ [Champion_2019]_ algorithm can be used when the results of :code:`STLSQ` are unsatisfactory. It involves a few more hyperparameters that can be tuned for improved accuracy. In particular, the :code:`thresholder` parameter controls the type of regularization that is applied. For optimal results, one may find it useful to experiment with :math:`L^0`, :math:`L^1`, and clipped absolute deviation (CAD) regularization. The other hyperparameters can be tuned with cross-validation.
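
A sketch of switching to :code:`SR3` with :math:`L^1` regularization follows; the hyperparameter values, and the :code:`nu` keyword in particular, are illustrative assumptions to be tuned or checked against the documentation.

from pysindy import SINDy
from pysindy.optimizers import SR3

# thresholder selects the regularizer R(v); threshold and nu control its
# strength and the coupling between the two coefficient vectors.
model = SINDy(optimizer=SR3(threshold=0.1, nu=1.0, thresholder="l1"))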

Custom or third-party sparse regression methods are also supported. Simply instantiate the custom object and pass it to the `SINDy` constructor using the `optimizer` keyword. Our implementation is compatible with any of the linear models from Scikit-learn (e.g. `Ridge`, `Lasso`, and `ElasticNet`).
See the documentation for a list of methods and attributes a custom optimizer is expected to implement. There you will also find an example where the Scikit-learn `Lasso` object is used to perform sparse regression.
Custom or third-party sparse regression methods are also supported. Simply instantiate the custom object and pass it to the :code:`SINDy` constructor using the :code:`optimizer` keyword. Our implementation is compatible with any of the linear models from Scikit-learn (e.g. :code:`Ridge`, :code:`Lasso`, and :code:`ElasticNet`).
See the documentation for a list of methods and attributes a custom optimizer is expected to implement. There you will also find an example where the Scikit-learn :code:`Lasso` object is used to perform sparse regression.
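
For instance, using the Scikit-learn :code:`Lasso` object directly (a sketch; the :code:`alpha` value is arbitrary):

from sklearn.linear_model import Lasso

from pysindy import SINDy

# Any estimator exposing fit and predict together with the coef_, intercept_,
# fit_intercept, and normalize attributes can be wrapped in this way.
model = SINDy(optimizer=Lasso(alpha=0.1, fit_intercept=False, max_iter=5000))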

Regularization
--------------
@@ -64,7 +64,7 @@ Applying strong regularization biases the learned weights *away* from the values
Therefore once a sparse set of nonzero coefficients is discovered, our methods apply an extra "unbiasing" step where *unregularized* least-squares is used to find the values of the identified nonzero coefficients.
All of our built-in methods use regularization by default.

Some general best practices regarding regularization follow. Most problems will benefit from some amount of regularization. Regularization strength should be increased as the size of the candidate right-hand side library grows. If warnings about ill-conditioned matrices are generated when `SINDy.fit` is called, more regularization may help. We also recommend setting `unbias` to `True` when invoking the `SINDy.fit` method, especially when large amounts of regularization are being applied. Cross-validation can be used to select appropriate regularization parameters for a given problem.
Some general best practices regarding regularization follow. Most problems will benefit from some amount of regularization. Regularization strength should be increased as the size of the candidate right-hand side library grows. If warnings about ill-conditioned matrices are generated when :code:`SINDy.fit` is called, more regularization may help. We also recommend setting :code:`unbias` to :code:`True` when invoking the :code:`SINDy.fit` method, especially when large amounts of regularization are being applied. Cross-validation can be used to select appropriate regularization parameters for a given problem.
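
Putting these recommendations together, a heavily regularized fit might look like the following sketch (the data and hyperparameter values are illustrative; :code:`unbias` defaults to :code:`True`):

import numpy as np

from pysindy import SINDy
from pysindy.optimizers import STLSQ

# Illustrative data: replace with real measurements.
t = np.linspace(0, 10, 1000)
x = np.column_stack([np.sin(t), np.cos(t)])

# Use a larger alpha for a larger library; keeping unbias=True refits the
# identified nonzero coefficients without regularization.
model = SINDy(optimizer=STLSQ(threshold=0.1, alpha=0.5))
model.fit(x, t=t, unbias=True)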


.. [Chartrand_2011] R. Chartrand, “Numerical differentiation of noisy, nonsmooth data,” *ISRN Applied Mathematics*, vol. 2011, 2011.
13 changes: 8 additions & 5 deletions pysindy/feature_library/feature_library.py
@@ -96,10 +96,13 @@ class ConcatLibrary(BaseFeatureLibrary):
libraries_ : list of libraries
Library instances to be applied to the input matrix.
n_input_features_ : int
The total number of input features.
n_output_features_ : int
The total number of output features. The number of output features
is the product of the number of library functions and the number of
input features.
is the sum of the numbers of output features for each of the
concatenated libraries.
Examples
--------
@@ -135,13 +138,13 @@ def fit(self, X, y=None):
_, n_features = check_array(X).shape
self.n_input_features_ = n_features

# first fit all libs provided below
# First fit all libs provided below
fitted_libs = [lib.fit(X, y) for lib in self.libraries_]

# calculate the sum of output features
# Calculate the sum of output features
self.n_output_features_ = sum([lib.n_output_features_ for lib in fitted_libs])

# save fitted libs
# Save fitted libs
self.libraries_ = fitted_libs

return self
3 changes: 2 additions & 1 deletion pysindy/feature_library/identity_library.py
@@ -6,7 +6,8 @@

class IdentityLibrary(BaseFeatureLibrary):
"""
Generate an identity library.
Generate an identity library which maps all input features to
themselves.
Attributes
----------
13 changes: 10 additions & 3 deletions pysindy/feature_library/polynomial_library.py
@@ -25,11 +25,11 @@ class PolynomialLibrary(PolynomialFeatures, BaseFeatureLibrary):
The degree of the polynomial features.
include_interaction : boolean, optional (default True)
Determines whether interaction features are produced.
If false, features are all of the form `x[i] ** k`.
If false, features are all of the form ``x[i] ** k``.
interaction_only : boolean, optional (default False)
If true, only interaction features are produced: features that are
products of at most `degree` *distinct* input features (so not
`x[1] ** 2`, `x[0] * x[2] ** 3`, etc.).
products of at most ``degree`` *distinct* input features (so not
``x[1] ** 2``, ``x[0] * x[2] ** 3``, etc.).
include_bias : boolean, optional (default True)
If True (default), then include a bias column, the feature in which
all polynomial powers are zero (i.e. a column of ones - acts as an
@@ -42,6 +42,13 @@ class PolynomialLibrary(PolynomialFeatures, BaseFeatureLibrary):
----------
powers_ : array, shape (n_output_features, n_input_features)
powers_[i, j] is the exponent of the jth input in the ith output.
n_input_features_ : int
The total number of input features.
n_output_features_ : int
The total number of output features. This number is computed by
iterating over all appropriately sized combinations of input features.
"""

def __init__(
12 changes: 12 additions & 0 deletions pysindy/optimizers/base.py
@@ -49,6 +49,18 @@ class BaseOptimizer(LinearRegression, ComplexityMixin):
copy_X : boolean, optional (default True)
If True, X will be copied; else, it may be overwritten.
Attributes
----------
coef_ : array, shape (n_features,) or (n_targets, n_features)
Weight vector(s).
ind_ : array, shape (n_features,) or (n_targets, n_features)
Array of 0s and 1s indicating which coefficients of the
weight vector have not been masked out.
history_ : list
History of ``coef_`` over iterations of the optimization algorithm.
"""

def __init__(self, max_iter=20, normalize=False, fit_intercept=False, copy_X=True):
14 changes: 7 additions & 7 deletions pysindy/optimizers/sindy_optimizer.py
@@ -12,22 +12,22 @@ class SINDyOptimizer(BaseEstimator):
Enables single target regressors (i.e. those whose predictions are 1-dimensional)
to perform multi target regression (i.e. predictions are 2-dimensional).
Also adds an `_unbias` function to reduce bias when regularization is used.
Also adds an ``_unbias`` function to reduce bias when regularization is used.
Parameters
----------
optimizer: estimator object
The optimizer/sparse regressor to be wrapped, implementing `fit` and `predict`.
`optimizer` should also have the attributes `coef_`, `fit_intercept`,
`normalize`, and `intercept_`.
The optimizer/sparse regressor to be wrapped, implementing ``fit`` and
``predict``. ``optimizer`` should also have the attributes ``coef_``,
``fit_intercept``, ``normalize``, and ``intercept_``.
unbias : boolean, optional (default True)
Whether to perform an extra step of unregularized linear regression to unbias
the coefficients for the identified support.
For example, if `optimizer=STLSQ(alpha=0.1)` is used then the learned
For example, if ``optimizer=STLSQ(alpha=0.1)`` is used then the learned
coefficients will be biased toward 0 due to the L2 regularization.
Setting `unbias=True` will trigger an additional step wherein the nonzero
coefficients learned by the `STLSQ` object will be updated using an
Setting ``unbias=True`` will trigger an additional step wherein the nonzero
coefficients learned by the optimizer object will be updated using an
unregularized least-squares fit.
"""

9 changes: 7 additions & 2 deletions pysindy/optimizers/sr3.py
@@ -17,8 +17,8 @@ class SR3(BaseOptimizer):
.. math::
0.5\\|y-Xw\\|^2_2 + lambda \\times R(v)
+ (0.5 / nu)\\|w-v\\|^2_2
0.5\\|y-Xw\\|^2_2 + \\lambda \\times R(v)
+ (0.5 / \\nu)\\|w-v\\|^2_2
where :math:`R(v)` is a regularization function. See the following reference
for more details:
@@ -74,6 +74,11 @@ class SR3(BaseOptimizer):
Weight vector(s) that are not subjected to the regularization.
This is the w in the objective function.
history_ : list
History of sparse coefficients. ``history_[k]`` contains the
sparse coefficients (v in the optimization objective function)
at iteration k.
Examples
--------
>>> import numpy as np