Commit
Showing 11 changed files with 2,846 additions and 1,375 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
# TimeSeriesCVSplitter { #pytimetk.TimeSeriesCVSplitter } | ||
|
||
`TimeSeriesCVSplitter(self, *, frequency, train_size, forecast_horizon, time_series, gap=0, stride=None, window='rolling', mode='backward', start_dt=None, end_dt=None, split_limit=None)` | ||
|
||
The `TimeSeriesCVSplitter` is a scikit-learn-compatible cross-validator built on `TimeSeriesCV`. | |
|
||
This cross-validator generates splits based on time values, making it suitable for time series data. | ||
|
||
## Parameters: | ||
|
||
frequency: str | ||
The frequency of the time series (e.g., "days", "hours"). | ||
train_size: int | ||
Minimum number of time units in the training set. | ||
forecast_horizon: int | ||
Number of time units to forecast in each split. | ||
time_series: pd.Series | ||
A pandas Series or Index representing the time values. | ||
gap: int | ||
Number of time units to skip between training and testing sets. | ||
stride: int | ||
Number of time units to move forward after each split. | ||
window: str | ||
Type of window, either "rolling" or "expanding". | ||
mode: str | ||
Order of split generation, "forward" or "backward". | ||
start_dt: pd.Timestamp | ||
Start date for the time period. | ||
end_dt: pd.Timestamp | ||
End date for the time period. | ||
split_limit: int | ||
Maximum number of splits to generate. If None, all possible splits will be generated. | ||
|
||
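To make the interaction of `train_size`, `forecast_horizon`, `gap`, `stride`, and `window` concrete, here is a minimal sketch on integer positions. It is not pytimetk's implementation: the helper `sketch_splits` is hypothetical, it generates splits in forward order only, and it assumes exactly one observation per time unit.

```python
def sketch_splits(n, train_size, forecast_horizon, gap=0, stride=1, window="rolling"):
    """Yield (train, test) position lists for a series of n evenly spaced points.

    Hypothetical illustration only; pytimetk's own logic may differ.
    """
    if window not in ("rolling", "expanding"):
        raise ValueError("window must be 'rolling' or 'expanding'")
    train_end = train_size  # exclusive end of the first training window
    while train_end + gap + forecast_horizon <= n:
        # A rolling window keeps a fixed length; an expanding one always starts at 0.
        train_start = train_end - train_size if window == "rolling" else 0
        test_start = train_end + gap  # skip `gap` units after the training window
        train = list(range(train_start, train_end))
        test = list(range(test_start, test_start + forecast_horizon))
        yield train, test
        train_end += stride  # move the split point forward by `stride` units


rolling = list(sketch_splits(31, train_size=14, forecast_horizon=7, stride=1))
expanding = list(
    sketch_splits(31, train_size=14, forecast_horizon=7, stride=1, window="expanding")
)
```

Under these assumptions, every rolling training window has 14 positions, while the expanding windows grow by one position per split. The number of splits the library itself produces for the same settings may differ, depending on how it handles boundaries and `mode`.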
## Raises: | ||
|
||
ValueError: | ||
If the input arrays are incompatible in length with the time series. | ||
|
||
## Returns: | ||
|
||
A generator of tuples of arrays containing the training and forecast data. | ||
|
||
## See Also: | ||
|
||
TimeSeriesCV | ||
|
||
## Examples | ||
|
||
``` {python} | ||
import pandas as pd | ||
import numpy as np | ||
from pytimetk import TimeSeriesCVSplitter | ||
start_dt = pd.Timestamp(2023, 1, 1) | ||
end_dt = pd.Timestamp(2023, 1, 31) | ||
time_series = pd.Series(pd.date_range(start_dt, end_dt, freq="D")) | ||
size = len(time_series) | ||
df = pd.DataFrame(data=np.random.randn(size, 2), columns=["a", "b"]) | ||
X, y = df[["a", "b"]], df[["a", "b"]].sum(axis=1) | ||
cv = TimeSeriesCVSplitter( | |
    time_series=time_series, | |
    frequency="days", | |
    train_size=14, | |
    forecast_horizon=7, | |
    gap=0, | |
    stride=1, | |
    window="rolling", | |
) | |
cv | ||
``` | ||
|
||
``` {python} | ||
# Inspect the cross-validation splits | |
cv.splitter.plot(y, time_series=time_series) | |
``` | ||
|
||
``` {python} | ||
# Use the TimeSeriesCVSplitter as the `cv` argument of a scikit-learn search | |
from sklearn.linear_model import Ridge | |
from sklearn.model_selection import RandomizedSearchCV | |
# Candidate hyperparameter values to sample from | |
param_grid = { | |
    "alpha": np.linspace(0.1, 2, 10), | |
    "fit_intercept": [True, False], | |
    "positive": [True, False], | |
} | |
# Fit the search and get the best estimator | |
random_search_cv = RandomizedSearchCV( | |
    estimator=Ridge(), | |
    param_distributions=param_grid, | |
    cv=cv, | |
    n_jobs=-1, | |
).fit(X, y) | |
random_search_cv.best_estimator_ | |
``` | ||
|
||
## Methods | ||
|
||
| Name | Description | | ||
| --- | --- | | ||
| [get_n_splits](#pytimetk.TimeSeriesCVSplitter.get_n_splits) | Returns the number of splits. | | ||
| [split](#pytimetk.TimeSeriesCVSplitter.split) | Generates train and test indices for cross-validation. | | ||
|
||
### get_n_splits { #pytimetk.TimeSeriesCVSplitter.get_n_splits } | ||
|
||
`TimeSeriesCVSplitter.get_n_splits(X=None, y=None, groups=None)` | ||
|
||
Returns the number of splits. | ||
|
||
### split { #pytimetk.TimeSeriesCVSplitter.split } | ||
|
||
`TimeSeriesCVSplitter.split(X=None, y=None, groups=None)` | ||
|
||
Generates train and test indices for cross-validation. | ||
|
||
#### Parameters: | ||
|
||
X: | ||
Optional input features (ignored, for compatibility with scikit-learn). | ||
y: | ||
Optional target variable (ignored, for compatibility with scikit-learn). | ||
groups: | ||
Optional group labels (ignored, for compatibility with scikit-learn). | ||
|
||
#### Yields: | ||
|
||
Tuple[np.ndarray, np.ndarray]: | ||
Tuples of train and test indices. |
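The two methods documented above, `get_n_splits` and `split`, are the interface scikit-learn's search utilities rely on when given a `cv` object. As a minimal sketch of that protocol, the class below is a hypothetical stand-in (`FixedSplitter` is not part of pytimetk) that replays precomputed splits through the same two signatures.

```python
class FixedSplitter:
    """Hypothetical stand-in exposing the same two methods as documented above."""

    def __init__(self, splits):
        # splits: precomputed (train_positions, test_positions) pairs
        self._splits = list(splits)

    def get_n_splits(self, X=None, y=None, groups=None):
        # Arguments are accepted and ignored, mirroring the documented signature.
        return len(self._splits)

    def split(self, X=None, y=None, groups=None):
        # Yield one (train, test) pair per split, in the stored order.
        for train, test in self._splits:
            yield train, test


cv_sketch = FixedSplitter([([0, 1, 2], [3, 4]), ([1, 2, 3], [4, 5])])
folds = list(cv_sketch.split())
```

Because `X`, `y`, and `groups` are ignored, the splits depend only on what the object was constructed with, which matches the "ignored, for compatibility with scikit-learn" notes in the parameter list.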