Thanks for the great presentation today, @benfulcher!
Inspired, I looked in greater detail into the repository, to my shame perhaps for the first time at that level of detail.
What I understood is that your SPIs are not all implemented from scratch; there is a wealth of them, and some in turn rely on external dependencies. As such, pyspi is, morally, very similar to sktime, being a mix of de-novo implementations, direct interfaces to external algorithms, and implementations that use components with soft dependencies.
I also noticed that you have tags for the different SPIs, which again is very similar to sktime.
Further, when trying to interface SPIs individually, I noticed that this is currently not intended to be possible: only batch feature sets can be obtained? That seems a shame, since you have collected so many useful pairwise transformations! Unless, of course, you use the yaml, but then the process of discovery is tedious and currently cannot be automated, so composability with other frameworks is severely limited.
Based on this, I had a number of ideas, if you would like to hear me out:
Move the SPIs to an object-oriented strategy pattern, using the tag system provided by scikit-base. This would give you runtime discoverability for free, with no need to use the config yaml or the webpage.
Change SPIs from private API to public API: define a public API, and add a test suite that checks each individual SPI for interface conformance.
Isolate SPI-specific dependencies to the particular SPI. Then you can treat, I believe, all dependencies as soft dependencies.
Users could then say: compute all SPIs for which I have all dependencies installed. Or: get me all SPIs for this dependency set. SPIs would also tell the user directly which dependency to install.
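To make the ideas above concrete, here is a minimal sketch of what tag-based SPI objects with runtime lookup could look like. It deliberately uses only the standard library and does not use scikit-base itself; every name in it (`BaseSPI`, `all_spis`, the tag names `directed` and `python_dependencies`, and the placeholder package `hypothetical_spi_backend_xyz`) is an assumption made up for illustration, not existing pyspi or scikit-base API.

```python
import importlib.util
import math


class BaseSPI:
    """Hypothetical base class: one object per SPI, metadata carried in tags."""

    _tags = {"directed": False, "python_dependencies": []}
    _registry = []  # filled automatically as subclasses are defined

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        BaseSPI._registry.append(cls)

    @classmethod
    def get_class_tag(cls, name, default=None):
        # Walk the MRO so a subclass only needs to override the tags it changes.
        for klass in cls.__mro__:
            tags = vars(klass).get("_tags", {})
            if name in tags:
                return tags[name]
        return default

    @classmethod
    def missing_dependencies(cls):
        # Soft dependencies: check availability without importing the package.
        deps = cls.get_class_tag("python_dependencies", [])
        return [d for d in deps if importlib.util.find_spec(d) is None]

    def compute(self, x, y):
        raise NotImplementedError


class PearsonCorrelation(BaseSPI):
    """De-novo implementation with no extra dependencies."""

    def compute(self, x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
        norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
        return cov / (norm_x * norm_y)


class NeedsMissingPackage(BaseSPI):
    """Stand-in for an SPI that wraps an external, optional package."""

    _tags = {"directed": True, "python_dependencies": ["hypothetical_spi_backend_xyz"]}

    def compute(self, x, y):
        # Imported lazily, so the package is only required when this SPI is used.
        import hypothetical_spi_backend_xyz as backend  # placeholder, does not exist

        return backend.statistic(x, y)


def all_spis(filter_tags=None, only_installed=False):
    """Return registered SPI classes, optionally filtered by tags or installability."""
    found = []
    for cls in BaseSPI._registry:
        if filter_tags and any(cls.get_class_tag(k) != v for k, v in filter_tags.items()):
            continue
        if only_installed and cls.missing_dependencies():
            continue
        found.append(cls)
    return found
```

With this, `all_spis(only_installed=True)` would answer "compute all SPIs for which I have all dependencies installed", and `NeedsMissingPackage.missing_dependencies()` would tell the user which package to install. In an actual migration, the base class, tag handling, and lookup would presumably come from scikit-base rather than being hand-rolled like this.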
What do you think? I'd be happy to devote some time to shifting the code base gradually towards this schema. As a side effect, it would also make it easy to interface all SPIs as time series distances in sktime, and would make it easier to add SPIs for multivariate or unequal-length time series.
FYI @jmoo2880
Pointers for the ideas above:
Runtime discoverability: all_estimators in sktime: https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.registry.all_estimators.html
Soft dependencies: pyproject would look like this in sktime: https://github.com/sktime/sktime/blob/main/pyproject.toml - minimal core dependency set; and dependencies are managed via tags like this: https://www.sktime.net/en/latest/api_reference/tags.html#general-tags-packaging