diff --git a/README.md b/README.md index fbbe5693..b855f4d4 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,37 @@ This library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. -## Installation +# Install Dev Version + +```bash +pip install git+https://github.com/business-science/pytimetk.git +``` + +# Quickstart + +Here is a simple example that tests the `summarize_by_time` function: + +```python +import timetk +import pandas as pd + +df = timetk.datasets.load_dataset('bike_sales_sample') +df['order_date'] = pd.to_datetime(df['order_date']) + +df \ + .summarize_by_time( + date_column = 'order_date', + value_column = 'total_price', + groups = "category_2", + freq = "M", + kind = 'timestamp', + agg_func = ['mean', 'sum'] + ) +``` + + + +## Developers (Contributors): Installation To install `timetk` using [Poetry](https://python-poetry.org/), follow these steps: @@ -37,26 +67,4 @@ or you can create a virtualenv with poetry and install the dependencies ```bash poetry shell poetry install -``` - -# Usage - -This is a simple code to test the function `summarize_by_time`: - -```python -import timetk -import pandas as pd - -df = timetk.datasets.load_dataset('bike_sales_sample') -df['order_date'] = pd.to_datetime(df['order_date']) - -df \ - .summarize_by_time( - date_column='order_date', - value_column= 'total_price', - groups = "category_2", - freq = "M", - kind = 'timestamp', - agg_func = ['mean', 'sum'] - ) -``` +``` \ No newline at end of file diff --git a/docs/_freeze/getting-started/01_installation/execute-results/html.json b/docs/_freeze/getting-started/01_installation/execute-results/html.json index 84511cc7..219f0305 100644 --- a/docs/_freeze/getting-started/01_installation/execute-results/html.json +++ b/docs/_freeze/getting-started/01_installation/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "dd18664a45403263527305f37122f2d6", + "hash": "1d4b00d8fea30f9bbb9a80d51866b6cd", "result": { - "markdown": "---\ntitle: Install\n---\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n\n## Installation\n\nTo install `timetk` using [Poetry](https://python-poetry.org/), follow these steps:\n\n### 1. Prerequisites\n\nMake sure you have Python 3.9 or later installed on your system.\n\n### 2. Install Poetry\n\nTo install Poetry, you can use the [official installer](https://python-poetry.org/docs/#installing-with-the-official-installer) provided by Poetry. Do not use pip.\n\n### 3. Clone the Repository\n\nClone the `timetk` repository from GitHub:\n\n```\ngit clone https://github.com/business-science/pytimetk\n```\n\n### 4. Install Dependencies\n\nUse Poetry to install the package and its dependencies:\n\n```\npoetry install\n```\n\nor you can create a virtualenv with poetry and install the dependencies\n\n```\npoetry shell\npoetry install\n```\n\n", + "markdown": "---\ntitle: Install\ntoc: true\ntoc-depth: 3\nnumber-sections: true\nnumber-depth: 2\n---\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n## Quick Install\n\nLet's get you up and running with `timetk` fast. You can install from GitHub with this code. 
\n\n```bash\npip install git+https://github.com/business-science/pytimetk.git\n```\n\n", "supporting": [ - "01_installation_files\\figure-html" + "01_installation_files" ], "filters": [], "includes": {} diff --git a/docs/_freeze/getting-started/02_quick_start/execute-results/html.json b/docs/_freeze/getting-started/02_quick_start/execute-results/html.json index 2e9d896b..09d99894 100644 --- a/docs/_freeze/getting-started/02_quick_start/execute-results/html.json +++ b/docs/_freeze/getting-started/02_quick_start/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "235f1248a3cf054738487edbcbfd63db", + "hash": "4a7145c938a48d9a587433fff90c38f3", "result": { - "markdown": "---\ntitle: Quick Start\n---\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n## Quick Start: A Monthly Sales Analysis\n\nThis is a simple exercise to showcase the power of [`summarize_by_time()`](/reference/summarize_by_time.html):\n\n### Import Libraries & Data\n\nFirst, `import timetk as tk`. This gets you access to the most important functions. Use `tk.load_dataset()` to load the \"bike_sales_sample\" dataset.\n\n::: {.callout-note collapse=\"false\"}\n## About the Bike Sales Sample Dataset\n\nThis dataset contains \"orderlines\" for orders recieved. The `order_date` column contains timestamps. We can use this column to peform sales aggregations (e.g. total revenue).\n:::\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\nimport pandas as pd\n\ndf = tk.load_dataset('bike_sales_sample')\ndf['order_date'] = pd.to_datetime(df['order_date'])\n\ndf \n```\n\n::: {.cell-output .cell-output-display execution_count=1}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_idorder_lineorder_datequantitypricetotal_pricemodelcategory_1category_2frame_materialbikeshop_namecitystate
0112011-01-07160706070Jekyll Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
1122011-01-07159705970Trigger Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
2212011-01-10127702770Beast of the East 1MountainTrailAluminumKansas City 29ersKansas CityKS
3222011-01-10159705970Trigger Carbon 2MountainOver MountainCarbonKansas City 29ersKansas CityKS
4312011-01-1011066010660Supersix Evo Hi-Mod TeamRoadElite RoadCarbonLouisville Race EquipmentLouisvilleKY
..........................................
246132132011-12-22114101410CAAD8 105RoadElite RoadAluminumMiami Race EquipmentMiamiFL
246232212011-12-28112501250Synapse Disc TiagraRoadEndurance RoadAluminumPhoenix Bi-pedsPhoenixAZ
246332222011-12-28126602660Bad Habit 2MountainTrailAluminumPhoenix Bi-pedsPhoenixAZ
246432232011-12-28123402340F-Si 1MountainCross Country RaceAluminumPhoenix Bi-pedsPhoenixAZ
246532242011-12-28158605860Synapse Hi-Mod Dura AceRoadEndurance RoadCarbonPhoenix Bi-pedsPhoenixAZ
\n

2466 rows × 13 columns

\n
\n```\n:::\n:::\n\n\n### Using `summarize_by_time()` for a Sales Analysis\n\nYour company might be interested in sales patterns for various categories of bicycles. We can obtain a grouped monthly sales aggregation by `category_1` in two lines of code:\n\n1. First use pandas's `groupby()` method to group the DataFrame on `category_1`\n2. Next, use timetk's `summarize_by_time()` method to apply the sum function my month start (\"MS\") and use `wide_format` to return the dataframe in wide format. \n\nThe result is the total revenue for Mountain and Road bikes by month. \n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\nsummary_category_1_df = df \\\n .groupby(\"category_1\") \\\n .summarize_by_time(\n date_column = 'order_date', \n value_column = 'total_price',\n freq = \"MS\",\n agg_func = 'sum',\n wide_format = True\n )\n\nsummary_category_1_df\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_datetotal_price_Mountaintotal_price_Road
02011-01-01221490261525
12011-02-01660555501520
22011-03-01358855301120
32011-04-011075975751165
42011-05-01450440393730
52011-06-01723040690405
62011-07-01767740426690
72011-08-01361255318535
82011-09-01401125413595
92011-10-01377335357585
102011-11-01549345456740
112011-12-01276055197065
\n
\n```\n:::\n:::\n\n\n### Visualizing Sales Patterns\n\n::: {.callout-note collapse=\"false\"}\n## Coming soon: `plot_timeseries()`.\n\nWe are working on an even easier and more attractive plotting solution specifically designed for Time Series Analysis. It's coming soon. \n:::\n\nWe can visualize with `plotly`. \n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\nimport plotly.express as px\n\npx.line(\n summary_category_1_df, \n x = 'order_date', \n y = ['total_price_Mountain', 'total_price_Road'],\n template = \"plotly_dark\", \n title = \"Monthly Sales of Mountain and Road Bicycles\",\n width = 900\n)\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n## More coming soon...\n\nThere's a lot more coming in `timetk` for Python. You can check out our [Project Roadmap here](https://github.com/business-science/pytimetk/issues/2). \n\n", + "markdown": "---\ntitle: Quick Start\ntoc: true\ntoc-depth: 3\nnumber-sections: true\nnumber-depth: 2\n---\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n# Quick Start: A Monthly Sales Analysis\n\nThis is a simple exercise to showcase the power of our 2 most popular function:\n\n1. [`summarize_by_time()`](/reference/summarize_by_time.html)\n2. [`plot_timeseries()`](/reference/plot_timeseries.html)\n\n## Import Libraries & Data\n\nFirst, `import timetk as tk`. This gets you access to the most important functions. Use `tk.load_dataset()` to load the \"bike_sales_sample\" dataset.\n\n::: {.callout-note collapse=\"false\"}\n## About the Bike Sales Sample Dataset\n\nThis dataset contains \"orderlines\" for orders recieved. The `order_date` column contains timestamps. We can use this column to peform sales aggregations (e.g. total revenue).\n:::\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\nimport pandas as pd\n\ndf = tk.load_dataset('bike_sales_sample')\ndf['order_date'] = pd.to_datetime(df['order_date'])\n\ndf \n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_idorder_lineorder_datequantitypricetotal_pricemodelcategory_1category_2frame_materialbikeshop_namecitystate
0112011-01-07160706070Jekyll Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
1122011-01-07159705970Trigger Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
2212011-01-10127702770Beast of the East 1MountainTrailAluminumKansas City 29ersKansas CityKS
3222011-01-10159705970Trigger Carbon 2MountainOver MountainCarbonKansas City 29ersKansas CityKS
4312011-01-1011066010660Supersix Evo Hi-Mod TeamRoadElite RoadCarbonLouisville Race EquipmentLouisvilleKY
..........................................
246132132011-12-22114101410CAAD8 105RoadElite RoadAluminumMiami Race EquipmentMiamiFL
246232212011-12-28112501250Synapse Disc TiagraRoadEndurance RoadAluminumPhoenix Bi-pedsPhoenixAZ
246332222011-12-28126602660Bad Habit 2MountainTrailAluminumPhoenix Bi-pedsPhoenixAZ
246432232011-12-28123402340F-Si 1MountainCross Country RaceAluminumPhoenix Bi-pedsPhoenixAZ
246532242011-12-28158605860Synapse Hi-Mod Dura AceRoadEndurance RoadCarbonPhoenix Bi-pedsPhoenixAZ
\n

2466 rows × 13 columns

\n
\n```\n:::\n:::\n\n\n## Using `summarize_by_time()` for a Sales Analysis\n\nYour company might be interested in sales patterns for various categories of bicycles. We can obtain a grouped monthly sales aggregation by `category_1` in two lines of code:\n\n1. First use pandas's `groupby()` method to group the DataFrame on `category_1`\n2. Next, use timetk's `summarize_by_time()` method to apply the sum function my month start (\"MS\") and use `wide_format = 'False'` to return the dataframe in a long format (Note long format is the default). \n\nThe result is the total revenue for Mountain and Road bikes by month. \n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\nsummary_category_1_df = df \\\n .groupby(\"category_1\") \\\n .summarize_by_time(\n date_column = 'order_date', \n value_column = 'total_price',\n freq = \"MS\",\n agg_func = 'sum',\n wide_format = False\n )\n\n# First 5 rows shown\nsummary_category_1_df.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
category_1order_datetotal_price
0Mountain2011-01-01221490
1Mountain2011-02-01660555
2Mountain2011-03-01358855
3Mountain2011-04-011075975
4Mountain2011-05-01450440
\n
\n```\n:::\n:::\n\n\n## Visualizing Sales Patterns\n\n::: {.callout-note collapse=\"false\"}\n## Now available: `plot_timeseries()`.\n\nPlot time series is a quick and easy way to visualize time series and make professional time series plots. \n:::\n\nWith the data summarized by time, we can visualize with `plot_timeseries()`. `timetk` functions are `groupby()` aware meaning they understand if your data is grouped to do things by group. This is useful in time series where we often deal with 100s of time series groups. \n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\nsummary_category_1_df \\\n .groupby('category_1') \\\n .plot_timeseries(\n date_column = 'order_date',\n value_column = 'total_price',\n smooth_frac = 0.8\n )\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n# More coming soon...\n\nThere's a lot more coming in `timetk` for Python. You can check out our [Project Roadmap here](https://github.com/business-science/pytimetk/issues/2). \n\n", "supporting": [ - "02_quick_start_files\\figure-html" + "02_quick_start_files" ], "filters": [], "includes": { diff --git a/docs/_freeze/guides/01_visualization/execute-results/html.json b/docs/_freeze/guides/01_visualization/execute-results/html.json index c0364146..a211c04d 100644 --- a/docs/_freeze/guides/01_visualization/execute-results/html.json +++ b/docs/_freeze/guides/01_visualization/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "92850dbb8bfde6b36c5868a81d9acab3", + "hash": "18cc71a8b77510ec0143e6ffa2a3e1b5", "result": { "markdown": "---\ntitle: Data Visualization\ntoc: true\ntoc-depth: 3\nnumber-sections: true\nnumber-depth: 2\n---\n\nComing soon...\n\n", "supporting": [ - "01_visualization_files\\figure-html" + "01_visualization_files/figure-html" ], "filters": [], "includes": {} diff --git a/docs/_freeze/guides/02_timetk_concepts/execute-results/html.json b/docs/_freeze/guides/02_timetk_concepts/execute-results/html.json index f85fabf2..a57a7630 100644 --- a/docs/_freeze/guides/02_timetk_concepts/execute-results/html.json +++ b/docs/_freeze/guides/02_timetk_concepts/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "3f673790aa82e886d9342a125f3b10ca", + "hash": "0a95f85a50e1e9ec40c9cdab1de76f8e", "result": { "markdown": "---\ntitle: Timetk Basics\ntoc: true\ntoc-depth: 3\nnumber-sections: true\nnumber-depth: 2\n---\n\n> *Timetk has one mission:* To make time series analysis simpler, easier, and faster in Python. This goal requires some opinionated ways of treating time series in Python. We will conceptually lay out how `timetk` can help. \n\n::: {.callout-note collapse=\"false\"}\n## How this guide benefits you\n\nThis guide covers how to use `timetk` conceptually. Once you understand key concepts, you can go from basic to advanced time series analysis very fast. \n:::\n\n\n\nLet's first start with how to think about time series data conceptually. **Time series data has 3 core properties.** \n\n# The 3 Core Properties of Time Series Data\n\nEvery time series DataFrame should have the following properties:\n\n1. *Time Series Index:* A column containing 'datetime64' time stamps.\n2. *Value Columns:* One or more columns containing numeric data that can be aggregated and visualized by time\n3. *Group Columns (Optional):* One or more `categorical` or `str` columns that can be grouped by and time series can be evaluated by groups. \n\nIn practice here's what this looks like using the \"m4_daily\" dataset:\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\n# Import packages\nimport timetk as tk\nimport pandas as pd\nimport numpy as np\n\n# Import a Time Series Data Set\nm4_daily_df = tk.load_dataset(\"m4_daily\", parse_dates = ['date'])\nm4_daily_df\n```\n\n::: {.cell-output .cell-output-display execution_count=1}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevalue
0D102014-07-032076.2
1D102014-07-042073.4
2D102014-07-052048.7
3D102014-07-062048.9
4D102014-07-072006.4
............
9738D5002012-09-199418.8
9739D5002012-09-209365.7
9740D5002012-09-219445.9
9741D5002012-09-229497.9
9742D5002012-09-239545.3
\n

9743 rows × 3 columns

\n
\n```\n:::\n:::\n\n\n::: {.callout-note collapse=\"false\"}\n## (Example: m4_daily dataset) 3 Core Properties of Time Series Data\n\nWe can see that the `m4_daily` dataset has:\n\n1. *Time Series Index:* The `date` column\n2. *Value Column(s):* The `value` column\n3. *Group Column(s):* The `id` column\n:::\n\n::: {.callout-important collapse=\"false\"}\n## Missing any of the 3 Core Properties of Time Series Data\n\nIf your data is not formatted properly for `timetk`, meaning it's missing columns containing datetime, numeric values, or grouping columns, this can impact your ability to use `timetk` for time series anlysis. \n:::\n\n::: {.callout-important collapse=\"false\"}\n## No Pandas Index, No Problem\n\nTimetk standardizes using a date column. This is to reduce friction in converting to other package formats like `polars`, which don't use an an index (each row is indexed by its integer position). \n:::\n\n# The 2 Ways that Timetk Makes Time Series Analysis Easier\n\n::: {.callout-note collapse=\"false\"}\n## 2 Types of Time Series Functions\n\n1. Pandas `DataFrame` Operations\n2. Pandas `Series` Operations \n:::\n\nTimetk contains a number of functions designed to make time series analysis operations easier. In general, these operations come in 2 types of time series functions:\n\n1. *Pandas DataFrame Operations:* These functions work on `pd.DataFrame` objects and derivatives such as `groupby()` objects for Grouped Time Series Analysis. You will see `data` as the first parameter in these functions. \n \n2. *Pandas Series Operations:* These functions work on `pd.Series` objects.\n \n - *Time Series Index Operations:* Are designed for *Time Series index*. You will see `idx` as the first parameter of these functions. In these cases, these functions also work with `datetime64` values (e.g. those produced when you parse_dates via `pd.read_csv()` or create time series with `pd.date_range()`)\n \n - *Numeric Operations:* Are designed for *Numeric Values*. You will see `x` as the first parameter for these functions. \n\nLet's take a look at how to use the different types of Time Series Analysis functions in `timetk`. We'll start with Type 1: Pandas `DataFrame` Operations. \n\n## Type 1: Pandas DataFrame Operations\n\nBefore we start using `timetk`, let's make sure our data is set up properly. \n\n### Timetk Data Format Compliance\n\n::: {.callout-important collapse=\"false\"}\n## 3 Core Properties Must Be Upheald\n\nA `Timetk`-Compliant Pandas `DataFrame` must have:\n\n1. *Time Series Index:* A Time Stamp column containing `datetime64` values\n2. *Value Column(s):* The value column(s) containing `float` or `int` values\n3. *Group Column(s):* Optionally for grouped time series analysis, one or more columns containg `str` or `categorical` values (shown as an object)\n\nIf these are NOT upheld, this will impact your ability to use `timetk` DataFrame operations. \n:::\n\n::: {.callout-tip collapse=\"false\"}\n## Inspect the DataFrame\n\nUse Pandas `info()` method to check compliance. \n:::\n\nUsing pandas `info()` method, we can see that we have a compliant data frame with a `date` column containing `datetime64` and a `value` column containing `float64`. For grouped analysis we have the `id` column containing `object` dtype. 
\n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\n# Tip: Inspect for compliance with info()\nm4_daily_df.info()\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\nRangeIndex: 9743 entries, 0 to 9742\nData columns (total 3 columns):\n # Column Non-Null Count Dtype \n--- ------ -------------- ----- \n 0 id 9743 non-null object \n 1 date 9743 non-null datetime64[ns]\n 2 value 9743 non-null float64 \ndtypes: datetime64[ns](1), float64(1), object(1)\nmemory usage: 228.5+ KB\n```\n:::\n:::\n\n\n### Grouped Time Series Analysis with Summarize By Time\n\nFirst, inspect how the `summarize_by_time` function works by calling `help()`. \n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\n# Review the summarize_by_time documentation (output not shown)\nhelp(tk.summarize_by_time)\n```\n:::\n\n\n::: {.callout-note collapse=\"false\"}\n## Help Doc Info: `summarize_by_time()`\n\n- The first parameter is `data`, indicating this is a `DataFrame` operation. \n- The Examples show different use cases for how to apply the function on a DataFrame\n:::\n\nLet's test the `summarize_by_time()` DataFrame operation out using the grouped approach with method chaining. DataFrame operations can be used as Pandas methods with method-chaining, which allows us to more succinctly apply time series operations.\n\n::: {.cell execution_count=4}\n``` {.python .cell-code}\n# Grouped Summarize By Time with Method Chaining\ndf_summarized = (\n m4_daily_df\n .groupby('id')\n .summarize_by_time(\n date_column = 'date',\n value_column = 'value',\n freq = 'QS', # QS = Quarter Start\n agg_func = [\n 'mean', \n 'median', \n 'min',\n ('q25', lambda x: np.quantile(x, 0.25)),\n ('q75', lambda x: np.quantile(x, 0.75)),\n 'max',\n ('range',lambda x: x.max() - x.min()),\n ],\n )\n)\n\ndf_summarized\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevalue_meanvalue_medianvalue_minvalue_q25value_q75value_maxvalue_range
0D102014-07-011960.0788891979.901781.61915.2252002.5752076.2294.6
1D102014-10-012184.5869572154.052022.82125.0752274.1502344.9322.1
2D102015-01-012309.8300002312.302209.62284.5752342.1502392.4182.8
3D102015-04-012344.4813192333.002185.12301.7502391.0002499.8314.7
4D102015-07-012156.7543482186.701856.61997.2502289.4252368.1511.5
..............................
105D5002011-07-019727.3217399745.558964.59534.12510003.90010463.91499.4
106D5002011-10-018175.5652177897.006755.07669.8758592.5759860.03105.0
107D5002012-01-018291.3175828412.607471.57814.8008677.8508980.71509.2
108D5002012-04-018654.0208798471.108245.68389.8509017.2509349.21103.6
109D5002012-07-018770.5023538690.508348.18604.4008846.0009545.31197.2
\n

110 rows × 9 columns

\n
\n```\n:::\n:::\n\n\n::: {.callout-note collapse=\"false\"}\n## Key Takeaways: `summarize_by_time()`\n\n- The `data` must comply with the 3 core properties (date column, value column(s), and group column(s)) \n- The aggregation functions were applied by combination of group (id) and resample (Quarter Start)\n- The result was a pandas DataFrame with group column, resampled date column, and summary values (mean, median, min, 25th-quantile, etc)\n:::\n\n### Another DataFrame Example: Creating 29 Engineered Features\n\nLet's examine another `DataFrame` function, `tk.augment_timeseries_signature()`. Feel free to inspect the documentation with `help(tk.augment_timeseries_signature)`.\n\n::: {.cell execution_count=5}\n``` {.python .cell-code}\n# Creating 29 engineered features from the date column\n# Not run: help(tk.augment_timeseries_signature)\ndf_augmented = (\n m4_daily_df\n .augment_timeseries_signature(date_column = 'date')\n)\n\ndf_augmented.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevaluedate_index_numdate_yeardate_year_isodate_yearstartdate_yearenddate_leapyeardate_half...date_mdaydate_qdaydate_ydaydate_weekenddate_hourdate_minutedate_seconddate_mseconddate_nseconddate_am_pm
0D102014-07-032076.21404345600201420140002...33184000000am
1D102014-07-042073.41404432000201420140002...44185000000am
2D102014-07-052048.71404518400201420140002...55186000000am
3D102014-07-062048.91404604800201420140002...66187100000am
4D102014-07-072006.41404691200201420140002...77188000000am
\n

5 rows × 32 columns

\n
\n```\n:::\n:::\n\n\n::: {.callout-note collapse=\"false\"}\n## Key Takeaways: `augment_timeseries_signature()`\n\n- The `data` must comply with the 1 of the 3 core properties (date column) \n- The result was a pandas DataFrame with 29 time series features that can be used for Machine Learning and Forecasting\n:::\n\n\n### Making Future Dates with Future Frame\n\nA common time series task before forecasting with machine learning models is to make a future DataFrame some `length_out` into the future. You can do this with `tk.future_frame()`. Here's how. \n\n::: {.cell execution_count=6}\n``` {.python .cell-code}\n# Preparing a time series data set for Machine Learning Forecasting\nfull_augmented_df = (\n m4_daily_df \n .groupby('id')\n .future_frame('date', length_out = 365)\n .augment_timeseries_signature('date')\n)\nfull_augmented_df\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevaluedate_index_numdate_yeardate_year_isodate_yearstartdate_yearenddate_leapyeardate_half...date_mdaydate_qdaydate_ydaydate_weekenddate_hourdate_minutedate_seconddate_mseconddate_nseconddate_am_pm
0D102014-07-032076.21404345600201420140002...33184000000am
1D102014-07-042073.41404432000201420140002...44185000000am
2D102014-07-052048.71404518400201420140002...55186000000am
3D102014-07-062048.91404604800201420140002...66187100000am
4D102014-07-072006.41404691200201420140002...77188000000am
..................................................................
4556D5002013-09-19NaN1379548800201320130002...1981262000000am
4557D5002013-09-20NaN1379635200201320130002...2082263000000am
4558D5002013-09-21NaN1379721600201320130002...2183264000000am
4559D5002013-09-22NaN1379808000201320130002...2284265100000am
4560D5002013-09-23NaN1379894400201320130002...2385266000000am
\n

11203 rows × 32 columns

\n
\n```\n:::\n:::\n\n\nWe can then get the future data by keying in on the data with `value` column that is missing (`np.nan`).\n\n::: {.cell execution_count=7}\n``` {.python .cell-code}\n# Get the future data (just the observations that haven't happened yet)\nfuture_df = (\n full_augmented_df\n .query('value.isna()')\n)\nfuture_df\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevaluedate_index_numdate_yeardate_year_isodate_yearstartdate_yearenddate_leapyeardate_half...date_mdaydate_qdaydate_ydaydate_weekenddate_hourdate_minutedate_seconddate_mseconddate_nseconddate_am_pm
674D102016-05-07NaN1462579200201620160011...737128000000am
675D102016-05-08NaN1462665600201620160011...838129100000am
676D102016-05-09NaN1462752000201620160011...939130000000am
677D102016-05-10NaN1462838400201620160011...1040131000000am
678D102016-05-11NaN1462924800201620160011...1141132000000am
..................................................................
4556D5002013-09-19NaN1379548800201320130002...1981262000000am
4557D5002013-09-20NaN1379635200201320130002...2082263000000am
4558D5002013-09-21NaN1379721600201320130002...2183264000000am
4559D5002013-09-22NaN1379808000201320130002...2284265100000am
4560D5002013-09-23NaN1379894400201320130002...2385266000000am
\n

1460 rows × 32 columns

\n
\n```\n:::\n:::\n\n\n## Type 2: Pandas Series Operations\n\nThe main difference between a `DataFrame` operation and a Series operation is that we are operating on an array of values from typically one of the following `dtypes`:\n\n1. Timestamps (`datetime64`)\n2. Numeric (`float64` or `int64`) \n\nThe first argument of Series operations that operate on Timestamps will always be `idx`. \n\nLet's take a look at one shall we? We'll start with a common action: Making future time series from an existing time series with a regular frequency. \n\n### The Make Future Time Series Function\n\nSay we have a monthly sequence of timestamps. What if we want to create a forecast where we predict 12 months into the future? Well, we will need to create 12 future timestamps. Here's how. \n\nFirst create a `pd.date_range()` with dates starting at the beginning of each month.\n\n::: {.cell execution_count=8}\n``` {.python .cell-code}\n# Make a monthly date range\ndates_dt = pd.date_range(\"2023-01\", \"2024-01\", freq=\"MS\")\ndates_dt\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\nDatetimeIndex(['2023-01-01', '2023-02-01', '2023-03-01', '2023-04-01',\n '2023-05-01', '2023-06-01', '2023-07-01', '2023-08-01',\n '2023-09-01', '2023-10-01', '2023-11-01', '2023-12-01',\n '2024-01-01'],\n dtype='datetime64[ns]', freq='MS')\n```\n:::\n:::\n\n\nNext, use `tk.make_future_timeseries()` to create the next 12 timestamps in the sequence. \n\n::: {.panel-tabset group=\"future-dates\"}\n\n## Pandas Series\n\n::: {.cell execution_count=9}\n``` {.python .cell-code}\n# Pandas Series: Future Dates\nfuture_series = pd.Series(dates_dt).make_future_timeseries(12)\nfuture_series\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```\n0 2024-02-01\n1 2024-03-01\n2 2024-04-01\n3 2024-05-01\n4 2024-06-01\n5 2024-07-01\n6 2024-08-01\n7 2024-09-01\n8 2024-10-01\n9 2024-11-01\n10 2024-12-01\n11 2025-01-01\ndtype: datetime64[ns]\n```\n:::\n:::\n\n\n## DateTimeIndex\n\n::: {.cell execution_count=10}\n``` {.python .cell-code}\n# DateTimeIndex: Future Dates\nfuture_dt = tk.make_future_timeseries(\n idx = dates_dt,\n length_out = 12\n)\nfuture_dt\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```\n0 2024-02-01\n1 2024-03-01\n2 2024-04-01\n3 2024-05-01\n4 2024-06-01\n5 2024-07-01\n6 2024-08-01\n7 2024-09-01\n8 2024-10-01\n9 2024-11-01\n10 2024-12-01\n11 2025-01-01\ndtype: datetime64[ns]\n```\n:::\n:::\n\n\n:::\n\nWe can combine the actual and future timestamps into one combined timeseries. \n\n::: {.cell execution_count=11}\n``` {.python .cell-code}\n# Combining the 2 series and resetting the index\ncombined_timeseries = (\n pd.concat(\n [pd.Series(dates_dt), pd.Series(future_dt)],\n axis=0\n )\n .reset_index(drop = True)\n)\n\ncombined_timeseries\n```\n\n::: {.cell-output .cell-output-display execution_count=11}\n```\n0 2023-01-01\n1 2023-02-01\n2 2023-03-01\n3 2023-04-01\n4 2023-05-01\n5 2023-06-01\n6 2023-07-01\n7 2023-08-01\n8 2023-09-01\n9 2023-10-01\n10 2023-11-01\n11 2023-12-01\n12 2024-01-01\n13 2024-02-01\n14 2024-03-01\n15 2024-04-01\n16 2024-05-01\n17 2024-06-01\n18 2024-07-01\n19 2024-08-01\n20 2024-09-01\n21 2024-10-01\n22 2024-11-01\n23 2024-12-01\n24 2025-01-01\ndtype: datetime64[ns]\n```\n:::\n:::\n\n\nNext, we'll take a look at how to go from an irregular time series to a regular time series. \n\n### Flooring Dates\n\nAn example is `tk.floor_date`, which is used to round down dates. 
See `help(tk.floor_date)`.\n\nFlooring dates is often used as part of a strategy to go from an irregular time series to regular by combining with an aggregation. Often `summarize_by_time()` is used (I'll share why shortly). But conceptually, date flooring is the secret. \n\n\n::: {.panel-tabset group=\"flooring\"}\n\n## With Flooring\n\n::: {.cell execution_count=12}\n``` {.python .cell-code}\n# Monthly flooring rounds dates down to 1st of the month\nm4_daily_df['date'].floor_date(unit = \"M\")\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```\n0 2014-07-01\n1 2014-07-01\n2 2014-07-01\n3 2014-07-01\n4 2014-07-01\n ... \n9738 2012-09-01\n9739 2012-09-01\n9740 2012-09-01\n9741 2012-09-01\n9742 2012-09-01\nName: date, Length: 9743, dtype: datetime64[ns]\n```\n:::\n:::\n\n\n## Without Flooring\n\n::: {.cell execution_count=13}\n``` {.python .cell-code}\n# Before Flooring\nm4_daily_df['date']\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```\n0 2014-07-03\n1 2014-07-04\n2 2014-07-05\n3 2014-07-06\n4 2014-07-07\n ... \n9738 2012-09-19\n9739 2012-09-20\n9740 2012-09-21\n9741 2012-09-22\n9742 2012-09-23\nName: date, Length: 9743, dtype: datetime64[ns]\n```\n:::\n:::\n\n\n:::\n\nThis \"date flooring\" operation can be useful for creating date groupings.\n\n::: {.cell execution_count=14}\n``` {.python .cell-code}\n# Adding a date group with floor_date()\ndates_grouped_by_month = (\n m4_daily_df\n .assign(date_group = lambda x: x['date'].floor_date(\"M\"))\n)\n\ndates_grouped_by_month\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevaluedate_group
0D102014-07-032076.22014-07-01
1D102014-07-042073.42014-07-01
2D102014-07-052048.72014-07-01
3D102014-07-062048.92014-07-01
4D102014-07-072006.42014-07-01
...............
9738D5002012-09-199418.82012-09-01
9739D5002012-09-209365.72012-09-01
9740D5002012-09-219445.92012-09-01
9741D5002012-09-229497.92012-09-01
9742D5002012-09-239545.32012-09-01
\n

9743 rows × 4 columns

\n
\n```\n:::\n:::\n\n\nWe can then do grouped operations. \n\n::: {.cell execution_count=15}\n``` {.python .cell-code}\n# Example of a grouped operation with floored dates\nsummary_df = (\n dates_grouped_by_month\n .drop('date', axis=1) \\\n .groupby(['id', 'date_group'])\n .mean() \\\n .reset_index()\n)\n\nsummary_df\n```\n\n::: {.cell-output .cell-output-display execution_count=15}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddate_groupvalue
0D102014-07-011967.493103
1D102014-08-011985.548387
2D102014-09-011926.593333
3D102014-10-012100.077419
4D102014-11-012155.326667
............
318D5002012-05-018407.096774
319D5002012-06-019124.903333
320D5002012-07-018674.551613
321D5002012-08-018666.054839
322D5002012-09-019040.604348
\n

323 rows × 3 columns

\n
\n```\n:::\n:::\n\n\nOf course for this operation, we can do it faster with `summarize_by_time()` (and it's much more flexible). \n\n::: {.cell execution_count=16}\n``` {.python .cell-code}\n# Summarize by time is less code and more flexible\n(\n m4_daily_df \n .groupby('id')\n .summarize_by_time(\n 'date', 'value', \n freq = \"MS\",\n agg_func = ['mean', 'median', 'min', 'max']\n )\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
iddatevalue_meanvalue_medianvalue_minvalue_max
0D102014-07-011967.4931031978.801876.02076.2
1D102014-08-011985.5483871995.601914.72027.5
2D102014-09-011926.5933331920.951781.62023.5
3D102014-10-012100.0774192107.602022.82154.9
4D102014-11-012155.3266672149.302083.52245.4
.....................
318D5002012-05-018407.0967748430.808245.68578.1
319D5002012-06-019124.9033339163.858686.19349.2
320D5002012-07-018674.5516138673.608407.59091.1
321D5002012-08-018666.0548398667.408348.18939.6
322D5002012-09-019040.6043489091.408500.09545.3
\n

323 rows × 6 columns

\n
\n```\n:::\n:::\n\n\nAnd that's the core idea behind `timetk`, writing less code and getting more. \n\n\n\nNext, let's do one more function. The brother of `augment_timeseries_signature()`...\n\n### The Get Time Series Signature Function\n\nThis function takes a pandas `Series` or `DateTimeIndex` and returns a `DataFrame` containing the 29 engineered features. \n\nStart with either a DateTimeIndex...\n\n::: {.cell execution_count=17}\n``` {.python .cell-code}\ntimestamps_dt = pd.date_range(\"2023\", \"2024\", freq = \"D\")\ntimestamps_dt\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```\nDatetimeIndex(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04',\n '2023-01-05', '2023-01-06', '2023-01-07', '2023-01-08',\n '2023-01-09', '2023-01-10',\n ...\n '2023-12-23', '2023-12-24', '2023-12-25', '2023-12-26',\n '2023-12-27', '2023-12-28', '2023-12-29', '2023-12-30',\n '2023-12-31', '2024-01-01'],\n dtype='datetime64[ns]', length=366, freq='D')\n```\n:::\n:::\n\n\n... Or a Pandas Series.\n\n::: {.cell execution_count=18}\n``` {.python .cell-code}\ntimestamps_series = pd.Series(timestamps_dt)\ntimestamps_series\n```\n\n::: {.cell-output .cell-output-display execution_count=18}\n```\n0 2023-01-01\n1 2023-01-02\n2 2023-01-03\n3 2023-01-04\n4 2023-01-05\n ... \n361 2023-12-28\n362 2023-12-29\n363 2023-12-30\n364 2023-12-31\n365 2024-01-01\nLength: 366, dtype: datetime64[ns]\n```\n:::\n:::\n\n\nAnd you can use the pandas Series function, `tk.get_timeseries_signature()` to create 29 features from the date sequence. \n\n::: {.panel-tabset group=\"get_timeseries_signature\"}\n\n## Pandas Series\n\n::: {.cell execution_count=19}\n``` {.python .cell-code}\n# Pandas series: get_timeseries_signature\ntimestamps_series.get_timeseries_signature()\n```\n\n::: {.cell-output .cell-output-display execution_count=19}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
index_numyearyear_isoyearstartyearendleapyearhalfquarterquarteryearquarterstart...mdayqdayydayweekendhourminutesecondmsecondnsecondam_pm
0167253120020232022100112023Q11...111100000am
1167261760020232023000112023Q10...222000000am
2167270400020232023000112023Q10...333000000am
3167279040020232023000112023Q10...444000000am
4167287680020232023000112023Q10...555000000am
..................................................................
361170372160020232023000242023Q40...2889362000000am
362170380800020232023000242023Q40...2990363000000am
363170389440020232023000242023Q40...3091364000000am
364170398080020232023010242023Q40...3192365100000am
365170406720020242024101112024Q11...111000000am
\n

366 rows × 29 columns

\n
\n```\n:::\n:::\n\n\n## DateTimeIndex\n\n::: {.cell execution_count=20}\n``` {.python .cell-code}\n# DateTimeIndex: get_timeseries_signature\ntk.get_timeseries_signature(timestamps_dt)\n```\n\n::: {.cell-output .cell-output-display execution_count=20}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
index_numyearyear_isoyearstartyearendleapyearhalfquarterquarteryearquarterstart...mdayqdayydayweekendhourminutesecondmsecondnsecondam_pm
0167253120020232022100112023Q11...111100000am
1167261760020232023000112023Q10...222000000am
2167270400020232023000112023Q10...333000000am
3167279040020232023000112023Q10...444000000am
4167287680020232023000112023Q10...555000000am
..................................................................
361170372160020232023000242023Q40...2889362000000am
362170380800020232023000242023Q40...2990363000000am
363170389440020232023000242023Q40...3091364000000am
364170398080020232023010242023Q40...3192365100000am
365170406720020242024101112024Q11...111000000am
\n

366 rows × 29 columns

\n
\n```\n:::\n:::\n\n\n:::\n\n\n\n# More Coming Soon...\n\nWe are in the early stages of development. But it's obvious the potential for `timetk` now in Python. 🐍\n\n- For requested functions, please see our [Project Roadmap GH Issue #2](https://github.com/business-science/pytimetk/issues/2). You can make requests there. \n\n", "supporting": [ - "02_timetk_concepts_files\\figure-html" + "02_timetk_concepts_files/figure-html" ], "filters": [], "includes": { diff --git a/docs/_freeze/index/execute-results/html.json b/docs/_freeze/index/execute-results/html.json index 4e539e7a..40242f3b 100644 --- a/docs/_freeze/index/execute-results/html.json +++ b/docs/_freeze/index/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "7db1cfed92eb05de91f563b77fee6869", + "hash": "80db6db36ea9cd69c13c097172b872fa", "result": { - "markdown": "---\ntitle: timetk for Python \n---\n\n\n\n> The Time Series Toolkit for Python\n\n**Timetk's Mission:** To make time series analysis easier, faster, and more enjoyable in Python.\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n## Quick Start: A Monthly Sales Analysis\n\nThis is a simple exercise to showcase the power of [`summarize_by_time()`](/reference/summarize_by_time.html):\n\n### Import Libraries & Data\n\nFirst, `import timetk as tk`. This gets you access to the most important functions. Use `tk.load_dataset()` to load the \"bike_sales_sample\" dataset.\n\n::: {.callout-note collapse=\"false\"}\n## About the Bike Sales Sample Dataset\n\nThis dataset contains \"orderlines\" for orders recieved. The `order_date` column contains timestamps. We can use this column to peform sales aggregations (e.g. total revenue).\n:::\n\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\nimport pandas as pd\n\ndf = tk.load_dataset('bike_sales_sample')\ndf['order_date'] = pd.to_datetime(df['order_date'])\n\ndf \n```\n\n::: {.cell-output .cell-output-display execution_count=1}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_idorder_lineorder_datequantitypricetotal_pricemodelcategory_1category_2frame_materialbikeshop_namecitystate
0112011-01-07160706070Jekyll Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
1122011-01-07159705970Trigger Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
2212011-01-10127702770Beast of the East 1MountainTrailAluminumKansas City 29ersKansas CityKS
3222011-01-10159705970Trigger Carbon 2MountainOver MountainCarbonKansas City 29ersKansas CityKS
4312011-01-1011066010660Supersix Evo Hi-Mod TeamRoadElite RoadCarbonLouisville Race EquipmentLouisvilleKY
..........................................
246132132011-12-22114101410CAAD8 105RoadElite RoadAluminumMiami Race EquipmentMiamiFL
246232212011-12-28112501250Synapse Disc TiagraRoadEndurance RoadAluminumPhoenix Bi-pedsPhoenixAZ
246332222011-12-28126602660Bad Habit 2MountainTrailAluminumPhoenix Bi-pedsPhoenixAZ
246432232011-12-28123402340F-Si 1MountainCross Country RaceAluminumPhoenix Bi-pedsPhoenixAZ
246532242011-12-28158605860Synapse Hi-Mod Dura AceRoadEndurance RoadCarbonPhoenix Bi-pedsPhoenixAZ
\n

2466 rows × 13 columns

\n
\n```\n:::\n:::\n\n\n### Using `summarize_by_time()` for a Sales Analysis\n\nYour company might be interested in sales patterns for various categories of bicycles. We can obtain a grouped monthly sales aggregation by `category_1` in two lines of code:\n\n1. First use pandas's `groupby()` method to group the DataFrame on `category_1`\n2. Next, use timetk's `summarize_by_time()` method to apply the sum function my month start (\"MS\") and use `wide_format` to return the dataframe in wide format. \n\nThe result is the total revenue for Mountain and Road bikes by month. \n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\nsummary_category_1_df = df \\\n .groupby(\"category_1\") \\\n .summarize_by_time(\n date_column = 'order_date', \n value_column = 'total_price',\n freq = \"MS\",\n agg_func = 'sum',\n wide_format = True\n )\n\nsummary_category_1_df\n```\n\n::: {.cell-output .cell-output-display execution_count=2}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_datetotal_price_Mountaintotal_price_Road
02011-01-01221490261525
12011-02-01660555501520
22011-03-01358855301120
32011-04-011075975751165
42011-05-01450440393730
52011-06-01723040690405
62011-07-01767740426690
72011-08-01361255318535
82011-09-01401125413595
92011-10-01377335357585
102011-11-01549345456740
112011-12-01276055197065
\n
\n```\n:::\n:::\n\n\n### Visualizing Sales Patterns\n\n::: {.callout-note collapse=\"false\"}\n## Coming soon: `plot_timeseries()`.\n\nWe are working on an even easier and more attractive plotting solution specifically designed for Time Series Analysis. It's coming soon. \n:::\n\nWe can visualize with `plotly`. \n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\nimport plotly.express as px\n\npx.line(\n summary_category_1_df, \n x = 'order_date', \n y = ['total_price_Mountain', 'total_price_Road'],\n template = \"plotly_dark\", \n title = \"Monthly Sales of Mountain and Road Bicycles\",\n width = 900\n)\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n## Installation\n\nTo install `timetk` using [Poetry](https://python-poetry.org/), follow these steps:\n\n### 1. Prerequisites\n\nMake sure you have Python 3.9 or later installed on your system.\n\n### 2. Install Poetry\n\nTo install Poetry, you can use the [official installer](https://python-poetry.org/docs/#installing-with-the-official-installer) provided by Poetry. Do not use pip.\n\n### 3. Clone the Repository\n\nClone the `timetk` repository from GitHub:\n\n```\ngit clone https://github.com/business-science/pytimetk\n```\n\n### 4. Install Dependencies\n\nUse Poetry to install the package and its dependencies:\n\n```\npoetry install\n```\n\nor you can create a virtualenv with poetry and install the dependencies\n\n```\npoetry shell\npoetry install\n```\n\n", + "markdown": "---\ntoc: true\ntoc-depth: 3\nnumber-sections: true\nnumber-depth: 2\ntitle: timetk for Python \n---\n\n\n\n\n\n> The Time Series Toolkit for Python\n\n**Timetk's Mission:** To make time series analysis easier, faster, and more enjoyable in Python.\n\n::: {.callout-warning collapse=\"false\"}\n## Under Development\n\nThis library is currently under development and is not intended for general usage yet. Functionality is experimental until release 0.1.0. \n:::\n\n# Install GitHub Version\n\n```bash\npip install git+https://github.com/business-science/pytimetk.git\n```\n\n# Quick Start: A Monthly Sales Analysis\n\nThis is a simple exercise to showcase the power of [`summarize_by_time()`](/reference/summarize_by_time.html):\n\n### Import Libraries & Data\n\nFirst, `import timetk as tk`. This gets you access to the most important functions. Use `tk.load_dataset()` to load the \"bike_sales_sample\" dataset.\n\n::: {.callout-note collapse=\"false\"}\n## About the Bike Sales Sample Dataset\n\nThis dataset contains \"orderlines\" for orders recieved. The `order_date` column contains timestamps. We can use this column to peform sales aggregations (e.g. total revenue).\n:::\n\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\nimport pandas as pd\n\ndf = tk.load_dataset('bike_sales_sample')\ndf['order_date'] = pd.to_datetime(df['order_date'])\n\ndf \n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
order_idorder_lineorder_datequantitypricetotal_pricemodelcategory_1category_2frame_materialbikeshop_namecitystate
0112011-01-07160706070Jekyll Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
1122011-01-07159705970Trigger Carbon 2MountainOver MountainCarbonIthaca Mountain ClimbersIthacaNY
2212011-01-10127702770Beast of the East 1MountainTrailAluminumKansas City 29ersKansas CityKS
3222011-01-10159705970Trigger Carbon 2MountainOver MountainCarbonKansas City 29ersKansas CityKS
4312011-01-1011066010660Supersix Evo Hi-Mod TeamRoadElite RoadCarbonLouisville Race EquipmentLouisvilleKY
..........................................
246132132011-12-22114101410CAAD8 105RoadElite RoadAluminumMiami Race EquipmentMiamiFL
246232212011-12-28112501250Synapse Disc TiagraRoadEndurance RoadAluminumPhoenix Bi-pedsPhoenixAZ
246332222011-12-28126602660Bad Habit 2MountainTrailAluminumPhoenix Bi-pedsPhoenixAZ
246432232011-12-28123402340F-Si 1MountainCross Country RaceAluminumPhoenix Bi-pedsPhoenixAZ
246532242011-12-28158605860Synapse Hi-Mod Dura AceRoadEndurance RoadCarbonPhoenix Bi-pedsPhoenixAZ
\n

2466 rows × 13 columns

\n
\n```\n:::\n:::\n\n\n### Using `summarize_by_time()` for a Sales Analysis\n\nYour company might be interested in sales patterns for various categories of bicycles. We can obtain a grouped monthly sales aggregation by `category_1` in two lines of code:\n\n1. First use pandas's `groupby()` method to group the DataFrame on `category_1`\n2. Next, use timetk's `summarize_by_time()` method to apply the sum function my month start (\"MS\") and use `wide_format = 'False'` to return the dataframe in a long format (Note long format is the default). \n\nThe result is the total revenue for Mountain and Road bikes by month. \n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\nsummary_category_1_df = df \\\n .groupby(\"category_1\") \\\n .summarize_by_time(\n date_column = 'order_date', \n value_column = 'total_price',\n freq = \"MS\",\n agg_func = 'sum',\n wide_format = False\n )\n\n# First 5 rows shown\nsummary_category_1_df.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
category_1order_datetotal_price
0Mountain2011-01-01221490
1Mountain2011-02-01660555
2Mountain2011-03-01358855
3Mountain2011-04-011075975
4Mountain2011-05-01450440
\n
\n```\n:::\n:::\n\n\n### Visualizing Sales Patterns\n\n::: {.callout-note collapse=\"false\"}\n## Now available: `plot_timeseries()`.\n\nPlot time series is a quick and easy way to visualize time series and make professional time series plots. \n:::\n\nWith the data summarized by time, we can visualize with `plot_timeseries()`. `timetk` functions are `groupby()` aware meaning they understand if your data is grouped to do things by group. This is useful in time series where we often deal with 100s of time series groups. \n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\nsummary_category_1_df \\\n .groupby('category_1') \\\n .plot_timeseries(\n date_column = 'order_date',\n value_column = 'total_price',\n smooth_frac = 0.8\n )\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n# Contributing\n\nInterested in helping us make this the best Python package for time series analysis? We'd love your help. \n\n[Follow these instructions to Contribute.](/contributing.html)\n\n", "supporting": [ - "index_files\\figure-html" + "index_files" ], "filters": [], "includes": { diff --git a/docs/_freeze/reference/plot_timeseries/execute-results/html.json b/docs/_freeze/reference/plot_timeseries/execute-results/html.json index d7915398..6d6d9803 100644 --- a/docs/_freeze/reference/plot_timeseries/execute-results/html.json +++ b/docs/_freeze/reference/plot_timeseries/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "1228d56883035cbdd4cab0d1daf8c484", + "hash": "557e61a5df290c6979e8861a8f59aaaa", "result": { - "markdown": "---\ntitle: plot_timeseries\n---\n\n\n\n`plot_timeseries(data, date_column, value_column, color_column=None, facet_ncol=1, facet_nrow=None, facet_scales='free_y', facet_dir='h', line_color='#2c3e50', line_size=0.65, line_type='solid', line_alpha=1.0, y_intercept=None, y_intercept_color='#2c3e50', x_intercept=None, x_intercept_color='#2c3e50', smooth=True, smooth_color='#3366FF', smooth_frac=0.2, smooth_size=1.0, smooth_alpha=1.0, title='Time Series Plot', x_lab='', y_lab='', color_lab='Legend', x_axis_date_labels='%b %Y', base_size=11, width=None, height=None, engine='plotly')`\n\nCreates time series plots using different plotting engines such as Plotnine, Matplotlib, and Plotly.\n\n## Parameters\n\n| Name | Type | Description | Default |\n|----------------------|----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|\n| `data` | pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy | The input data for the plot. It can be either a Pandas DataFrame or a Pandas DataFrameGroupBy object. | _required_ |\n| `date_column` | str | The name of the column in the DataFrame that contains the dates for the time series data. | _required_ |\n| `value_column` | str | The `value_column` parameter is used to specify the name of the column in the DataFrame that contains the values for the time series data. This column will be plotted on the y-axis of the time series plot. | _required_ |\n| `color_column` | str | The `color_column` parameter is an optional parameter that specifies the column in the DataFrame that will be used to assign colors to the different time series. If this parameter is not provided, all time series will have the same color. | `None` |\n| `facet_ncol` | int | The `facet_ncol` parameter determines the number of columns in the facet grid. It specifies how many subplots will be arranged horizontally in the plot. | `1` |\n| `facet_nrow` | int | The `facet_nrow` parameter determines the number of rows in the facet grid. It specifies how many subplots will be arranged vertically in the grid. | `None` |\n| `facet_scales` | str | The `facet_scales` parameter determines the scaling of the y-axis in the facetted plots. 
It can take the following values: - \"free_y\": The y-axis scale will be free for each facet, but the x-axis scale will be fixed for all facets. This is the default value. - \"free_x\": The y-axis scale will be free for each facet, but the x-axis scale will be fixed for all facets. - \"free\": The y-axis scale will be free for each facet (subplot). This is the default value. | `'free_y'` |\n| `facet_dir` | str | The `facet_dir` parameter determines the direction in which the facets (subplots) are arranged. It can take two possible values: - \"h\": The facets will be arranged horizontally (in rows). This is the default value. - \"v\": The facets will be arranged vertically (in columns). | `'h'` |\n| `line_color` | str | The `line_color` parameter is used to specify the color of the lines in the time series plot. It accepts a string value representing a color code or name. The default value is \"#2c3e50\", which corresponds to a dark blue color. | `'#2c3e50'` |\n| `line_size` | float | The `line_size` parameter is used to specify the size of the lines in the time series plot. It determines the thickness of the lines. | `0.65` |\n| `line_type` | str | The `line_type` parameter is used to specify the type of line to be used in the time series plot. | `'solid'` |\n| `line_alpha` | float | The `line_alpha` parameter controls the transparency of the lines in the time series plot. It accepts a value between 0 and 1, where 0 means completely transparent (invisible) and 1 means completely opaque (solid). | `1.0` |\n| `y_intercept` | float | The `y_intercept` parameter is used to add a horizontal line to the plot at a specific y-value. It can be set to a numeric value to specify the y-value of the intercept. If set to `None` (default), no y-intercept line will be added to the plot | `None` |\n| `y_intercept_color` | str | The `y_intercept_color` parameter is used to specify the color of the y-intercept line in the plot. It accepts a string value representing a color code or name. The default value is \"#2c3e50\", which corresponds to a dark blue color. You can change this value. | `'#2c3e50'` |\n| `x_intercept` | str | The `x_intercept` parameter is used to add a vertical line at a specific x-axis value on the plot. It is used to highlight a specific point or event in the time series data. - By default, it is set to `None`, which means no vertical line will be added. - You can use a date string to specify the x-axis value of the intercept. For example, \"2020-01-01\" would add a vertical line at the beginning of the year 2020. | `None` |\n| `x_intercept_color` | str | The `x_intercept_color` parameter is used to specify the color of the vertical line that represents the x-intercept in the plot. By default, it is set to \"#2c3e50\", which is a dark blue color. You can change this value to any valid color code. | `'#2c3e50'` |\n| `smooth` | bool | The `smooth` parameter is a boolean indicating whether or not to apply smoothing to the time eries data. If set to True, the time series will be smoothed using the lowess algorithm. The default value is True. | `True` |\n| `smooth_color` | str | The `smooth_color` parameter is used to specify the color of the smoothed line in the time series plot. It accepts a string value representing a color code or name. The default value is `#3366FF`, which corresponds to a shade of blue. You can change this value to any valid color code. 
| `'#3366FF'` |\n| `smooth_frac` | float | The `smooth_frac` parameter is used to control the fraction of data points used for smoothing the time series. It determines the degree of smoothing applied to the data. A smaller value of `smooth_frac` will result in more smoothing, while a larger value will result in less smoothing. The default value is 0.2. | `0.2` |\n| `smooth_size` | float | The `smooth_size` parameter is used to specify the size of the line used to plot the smoothed values in the time series plot. It is a numeric value that controls the thickness of the line. A larger value will result in a thicker line, while a smaller value will result in a thinner line | `1.0` |\n| `smooth_alpha` | float | The `smooth_alpha` parameter controls the transparency of the smoothed line in the plot. It accepts a value between 0 and 1, where 0 means completely transparent and 1 means completely opaque. | `1.0` |\n| `title` | str | The title of the plot. | `'Time Series Plot'` |\n| `x_lab` | str | The `x_lab` parameter is used to specify the label for the x-axis in the plot. It is a string that represents the label text. | `''` |\n| `y_lab` | str | The `y_lab` parameter is used to specify the label for the y-axis in the plot. It is a string that represents the label for the y-axis. | `''` |\n| `color_lab` | str | The `color_lab` parameter is used to specify the label for the legend or color scale in the plot. It is used to provide a description of the colors used in the plot, typically when a color column is specified. | `'Legend'` |\n| `x_axis_date_labels` | str | The `x_axis_date_labels` parameter is used to specify the format of the date labels on the x-axis of the plot. It accepts a string representing the format of the date labels. For example, \"%b %Y\" would display the month abbreviation and year (e.g., Jan 2020). | `'%b %Y'` |\n| `base_size` | float | The `base_size` parameter is used to set the base font size for the plot. It determines the size of the text elements such as axis labels, titles, and legends. | `11` |\n| `width` | int | The `width` parameter is used to specify the width of the plot. It determines the horizontal size of the plot in pixels. | `None` |\n| `height` | int | The `height` parameter is used to specify the height of the plot in pixels. It determines the vertical size of the plot when it is rendered. | `None` |\n| `engine` | str | The `engine` parameter specifies the plotting library to use for creating the time series plot. It can take one of the following values: - \"plotly\" (interactive): Use the plotly library to create the plot. This is the default value. - \"plotnine\" (static): Use the plotnine library to create the plot. This is the default value. - \"matplotlib\" (static): Use the matplotlib library to create the plot. | `'plotly'` |\n\n## Returns\n\n| Type | Description |\n|------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| The function `plot_timeseries` returns a plot object, depending on the specified `engine` parameter. | - If `engine` is set to 'plotnine' or 'matplotlib', the function returns a plot object that can be further customized or displayed. - If `engine` is set to 'plotly', the function returns a plotly figure object. 
|\n\n## Examples\n\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\n\ndf = tk.load_dataset('m4_monthly', parse_dates = ['date'])\n\n# Plotly Object: Single Time Series\nfig = (\n df\n .query('id == \"M750\"')\n .plot_timeseries(\n 'date', 'value', \n facet_ncol = 1,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\n# Plotly Object: Grouped Time Series\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n facet_ncol = 2,\n facet_scales = \"free_y\",\n smooth_frac = 0.2,\n smooth_size = 2.0,\n y_intercept = None,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n width = 800,\n height = 500,\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\n# Plotly Object: Color Column\nfig = (\n df\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n smooth = False,\n y_intercept = 0,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=4}\n``` {.python .cell-code}\n# Plotnine Object: Single Time Series\nfig = (\n df\n .query('id == \"M1\"')\n .plot_timeseries(\n 'date', 'value', \n x_axis_date_labels = \"%Y\",\n engine = 'plotnine'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-5-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=5}\n``` {.python .cell-code}\n# Plotnine Object: Grouped Time Series\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n facet_ncol = 2,\n facet_scales = \"free\",\n line_size = 0.35,\n x_axis_date_labels = \"%Y\",\n engine = 'plotnine'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-6-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=5}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=6}\n``` {.python .cell-code}\n# Plotnine Object: Color Column\nfig = (\n df\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n smooth = False,\n y_intercept = 0,\n x_axis_date_labels = \"%Y\",\n engine = 'plotnine',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-7-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=7}\n``` {.python .cell-code}\n# Matplotlib object (same as plotnine, but converted to matplotlib object)\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n facet_ncol = 2,\n x_axis_date_labels = \"%Y\",\n engine = 'matplotlib'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](plot_timeseries_files/figure-html/cell-8-output-1.png){}\n:::\n:::\n\n\n", + "markdown": "---\ntitle: plot_timeseries\n---\n\n\n\n`plot_timeseries(data, date_column, value_column, color_column=None, facet_ncol=1, facet_nrow=None, facet_scales='free_y', facet_dir='h', line_color='#2c3e50', line_size=0.65, line_type='solid', line_alpha=1.0, y_intercept=None, y_intercept_color='#2c3e50', x_intercept=None, x_intercept_color='#2c3e50', smooth=True, smooth_color='#3366FF', smooth_frac=0.2, smooth_size=1.0, smooth_alpha=1.0, legend_show=True, title='Time Series Plot', x_lab='', y_lab='', color_lab='Legend', x_axis_date_labels='%b %Y', base_size=11, width=None, height=None, engine='plotly')`\n\nCreates time series plots using different plotting engines such as Plotnine, Matplotlib, and Plotly.\n\n## Parameters\n\n| Name | Type | Description | Default |\n|----------------------|----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|\n| `data` | pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy | The input data for the plot. It can be either a Pandas DataFrame or a Pandas DataFrameGroupBy object. | _required_ |\n| `date_column` | str | The name of the column in the DataFrame that contains the dates for the time series data. | _required_ |\n| `value_column` | str | The `value_column` parameter is used to specify the name of the column in the DataFrame that contains the values for the time series data. This column will be plotted on the y-axis of the time series plot. | _required_ |\n| `color_column` | str | The `color_column` parameter is an optional parameter that specifies the column in the DataFrame that will be used to assign colors to the different time series. If this parameter is not provided, all time series will have the same color. | `None` |\n| `facet_ncol` | int | The `facet_ncol` parameter determines the number of columns in the facet grid. It specifies how many subplots will be arranged horizontally in the plot. | `1` |\n| `facet_nrow` | int | The `facet_nrow` parameter determines the number of rows in the facet grid. It specifies how many subplots will be arranged vertically in the grid. | `None` |\n| `facet_scales` | str | The `facet_scales` parameter determines the scaling of the y-axis in the facetted plots. It can take the following values: - \"free_y\": The y-axis scale will be free for each facet, but the x-axis scale will be fixed for all facets. This is the default value. - \"free_x\": The y-axis scale will be free for each facet, but the x-axis scale will be fixed for all facets. - \"free\": The y-axis scale will be free for each facet (subplot). This is the default value. 
| `'free_y'` |\n| `facet_dir` | str | The `facet_dir` parameter determines the direction in which the facets (subplots) are arranged. It can take two possible values: - \"h\": The facets will be arranged horizontally (in rows). This is the default value. - \"v\": The facets will be arranged vertically (in columns). | `'h'` |\n| `line_color` | str | The `line_color` parameter is used to specify the color of the lines in the time series plot. It accepts a string value representing a color code or name. The default value is \"#2c3e50\", which corresponds to a dark blue color. | `'#2c3e50'` |\n| `line_size` | float | The `line_size` parameter is used to specify the size of the lines in the time series plot. It determines the thickness of the lines. | `0.65` |\n| `line_type` | str | The `line_type` parameter is used to specify the type of line to be used in the time series plot. | `'solid'` |\n| `line_alpha` | float | The `line_alpha` parameter controls the transparency of the lines in the time series plot. It accepts a value between 0 and 1, where 0 means completely transparent (invisible) and 1 means completely opaque (solid). | `1.0` |\n| `y_intercept` | float | The `y_intercept` parameter is used to add a horizontal line to the plot at a specific y-value. It can be set to a numeric value to specify the y-value of the intercept. If set to `None` (default), no y-intercept line will be added to the plot | `None` |\n| `y_intercept_color` | str | The `y_intercept_color` parameter is used to specify the color of the y-intercept line in the plot. It accepts a string value representing a color code or name. The default value is \"#2c3e50\", which corresponds to a dark blue color. You can change this value. | `'#2c3e50'` |\n| `x_intercept` | str | The `x_intercept` parameter is used to add a vertical line at a specific x-axis value on the plot. It is used to highlight a specific point or event in the time series data. - By default, it is set to `None`, which means no vertical line will be added. - You can use a date string to specify the x-axis value of the intercept. For example, \"2020-01-01\" would add a vertical line at the beginning of the year 2020. | `None` |\n| `x_intercept_color` | str | The `x_intercept_color` parameter is used to specify the color of the vertical line that represents the x-intercept in the plot. By default, it is set to \"#2c3e50\", which is a dark blue color. You can change this value to any valid color code. | `'#2c3e50'` |\n| `smooth` | bool | The `smooth` parameter is a boolean indicating whether or not to apply smoothing to the time eries data. If set to True, the time series will be smoothed using the lowess algorithm. The default value is True. | `True` |\n| `smooth_color` | str | The `smooth_color` parameter is used to specify the color of the smoothed line in the time series plot. It accepts a string value representing a color code or name. The default value is `#3366FF`, which corresponds to a shade of blue. You can change this value to any valid color code. | `'#3366FF'` |\n| `smooth_frac` | float | The `smooth_frac` parameter is used to control the fraction of data points used for smoothing the time series. It determines the degree of smoothing applied to the data. A smaller value of `smooth_frac` will result in more smoothing, while a larger value will result in less smoothing. The default value is 0.2. 
| `0.2` |\n| `smooth_size` | float | The `smooth_size` parameter is used to specify the size of the line used to plot the smoothed values in the time series plot. It is a numeric value that controls the thickness of the line. A larger value will result in a thicker line, while a smaller value will result in a thinner line | `1.0` |\n| `smooth_alpha` | float | The `smooth_alpha` parameter controls the transparency of the smoothed line in the plot. It accepts a value between 0 and 1, where 0 means completely transparent and 1 means completely opaque. | `1.0` |\n| `legend_show` | bool | The `legend_show` parameter is a boolean indicating whether or not to show the legend in the plot. If set to True, the legend will be displayed. The default value is True. | `True` |\n| `title` | str | The title of the plot. | `'Time Series Plot'` |\n| `x_lab` | str | The `x_lab` parameter is used to specify the label for the x-axis in the plot. It is a string that represents the label text. | `''` |\n| `y_lab` | str | The `y_lab` parameter is used to specify the label for the y-axis in the plot. It is a string that represents the label for the y-axis. | `''` |\n| `color_lab` | str | The `color_lab` parameter is used to specify the label for the legend or color scale in the plot. It is used to provide a description of the colors used in the plot, typically when a color column is specified. | `'Legend'` |\n| `x_axis_date_labels` | str | The `x_axis_date_labels` parameter is used to specify the format of the date labels on the x-axis of the plot. It accepts a string representing the format of the date labels. For example, \"%b %Y\" would display the month abbreviation and year (e.g., Jan 2020). | `'%b %Y'` |\n| `base_size` | float | The `base_size` parameter is used to set the base font size for the plot. It determines the size of the text elements such as axis labels, titles, and legends. | `11` |\n| `width` | int | The `width` parameter is used to specify the width of the plot. It determines the horizontal size of the plot in pixels. | `None` |\n| `height` | int | The `height` parameter is used to specify the height of the plot in pixels. It determines the vertical size of the plot when it is rendered. | `None` |\n| `engine` | str | The `engine` parameter specifies the plotting library to use for creating the time series plot. It can take one of the following values: - \"plotly\" (interactive): Use the plotly library to create the plot. This is the default value. - \"plotnine\" (static): Use the plotnine library to create the plot. This is the default value. - \"matplotlib\" (static): Use the matplotlib library to create the plot. | `'plotly'` |\n\n## Returns\n\n| Type | Description |\n|------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| The function `plot_timeseries` returns a plot object, depending on the specified `engine` parameter. | - If `engine` is set to 'plotnine' or 'matplotlib', the function returns a plot object that can be further customized or displayed. - If `engine` is set to 'plotly', the function returns a plotly figure object. 
|\n\n## Examples\n\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport timetk as tk\n\ndf = tk.load_dataset('m4_monthly', parse_dates = ['date'])\n\n# Plotly Object: Single Time Series\nfig = (\n df\n .query('id == \"M750\"')\n .plot_timeseries(\n 'date', 'value', \n facet_ncol = 1,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\n# Plotly Object: Grouped Time Series\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n facet_ncol = 2,\n facet_scales = \"free_y\",\n smooth_frac = 0.2,\n smooth_size = 2.0,\n y_intercept = None,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n width = 600,\n height = 500,\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\n# Plotly Object: Color Column\nfig = (\n df\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n smooth = False,\n y_intercept = 0,\n x_axis_date_labels = \"%Y\",\n engine = 'plotly',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n```{=html}\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=4}\n``` {.python .cell-code}\n# Plotnine Object: Single Time Series\nfig = (\n df\n .query('id == \"M1\"')\n .plot_timeseries(\n 'date', 'value', \n x_axis_date_labels = \"%Y\",\n engine = 'plotnine'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-5-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=5}\n``` {.python .cell-code}\n# Plotnine Object: Grouped Time Series\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n facet_ncol = 2,\n facet_scales = \"free\",\n line_size = 0.35,\n x_axis_date_labels = \"%Y\",\n engine = 'plotnine'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-6-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=5}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=6}\n``` {.python .cell-code}\n# Plotnine Object: Color Column\nfig = (\n df\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n smooth = False,\n y_intercept = 0,\n x_axis_date_labels = \"%Y\",\n engine = 'plotnine',\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display}\n![](plot_timeseries_files/figure-html/cell-7-output-1.png){}\n:::\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n
\n```\n:::\n:::\n\n\n::: {.cell execution_count=7}\n``` {.python .cell-code}\n# Matplotlib object (same as plotnine, but converted to matplotlib object)\nfig = (\n df\n .groupby('id')\n .plot_timeseries(\n 'date', 'value', \n color_column = 'id',\n facet_ncol = 2,\n x_axis_date_labels = \"%Y\",\n engine = 'matplotlib'\n )\n)\nfig\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n![](plot_timeseries_files/figure-html/cell-8-output-1.png){}\n:::\n:::\n\n\n", "supporting": [ - "plot_timeseries_files\\figure-html" + "plot_timeseries_files/figure-html" ], "filters": [], "includes": { diff --git a/docs/_freeze/reference/plot_timeseries/figure-html/cell-5-output-1.png b/docs/_freeze/reference/plot_timeseries/figure-html/cell-5-output-1.png index ede86673..c356eab5 100644 Binary files a/docs/_freeze/reference/plot_timeseries/figure-html/cell-5-output-1.png and b/docs/_freeze/reference/plot_timeseries/figure-html/cell-5-output-1.png differ diff --git a/docs/_freeze/reference/plot_timeseries/figure-html/cell-6-output-1.png b/docs/_freeze/reference/plot_timeseries/figure-html/cell-6-output-1.png index 531cfa3c..878fd69d 100644 Binary files a/docs/_freeze/reference/plot_timeseries/figure-html/cell-6-output-1.png and b/docs/_freeze/reference/plot_timeseries/figure-html/cell-6-output-1.png differ diff --git a/docs/_freeze/reference/plot_timeseries/figure-html/cell-7-output-1.png b/docs/_freeze/reference/plot_timeseries/figure-html/cell-7-output-1.png index 0fd55823..f60cb27f 100644 Binary files a/docs/_freeze/reference/plot_timeseries/figure-html/cell-7-output-1.png and b/docs/_freeze/reference/plot_timeseries/figure-html/cell-7-output-1.png differ diff --git a/docs/_quarto.yml b/docs/_quarto.yml index 49130680..3312ffc5 100644 --- a/docs/_quarto.yml +++ b/docs/_quarto.yml @@ -59,13 +59,26 @@ website: contents: getting-started/* - section: "🗺️ Guides:" contents: guides/* + - section: "📘 Applied Tutorials:" + contents: tutorials/* + - text: "---" - text: 📄 API Reference file: reference/index.qmd - text: "---" + - text: ⭐ Timetk on Github + file: https://github.com/business-science/pytimetk + external: true + - text: "---" + - text: 🍻 Contributing + file: contributing.qmd + - text: "---" - section: "More References:" - text: Business Science file: https://www.business-science.io/index.html external: true + - text: Quant Science + file: https://www.quantscience.io/ + external: true - text: Timetk (R Version) file: https://business-science.github.io/timetk/ external: true diff --git a/docs/_sidebar.yml b/docs/_sidebar.yml index d32fa27e..11efa8f5 100644 --- a/docs/_sidebar.yml +++ b/docs/_sidebar.yml @@ -19,19 +19,24 @@ website: section: "\U0001F3D7\uFE0F Adding Features to Time Series DataFrames (Augmenting)" - contents: - reference/ts_features.qmd + - reference/ts_summary.qmd section: TS Features - contents: - reference/make_future_timeseries.qmd - reference/make_weekday_sequence.qmd - reference/make_weekend_sequence.qmd - section: "\U0001F43C Time Series for Pandas Series" - - contents: + - reference/get_date_summary.qmd + - reference/get_frequency_summary.qmd + - reference/get_diff_summary.qmd - reference/get_pandas_frequency.qmd - reference/get_timeseries_signature.qmd - reference/get_holiday_signature.qmd + section: "\U0001F43C Time Series for Pandas Series" + - contents: - reference/floor_date.qmd - reference/is_holiday.qmd - reference/week_of_month.qmd + - reference/timeseries_unit_frequency_table.qmd section: "\U0001F6E0\uFE0F Utilities" - contents: - 
reference/get_available_datasets.qmd
diff --git a/docs/_site/contributing.html b/docs/_site/contributing.html
new file mode 100644
index 00000000..7be1a7c3
--- /dev/null
+++ b/docs/_site/contributing.html
@@ -0,0 +1,775 @@
[Generated Quarto page "timetk for Python - Contributing (Developer Setup)"; HTML head, script, and navigation markup omitted. The readable page text follows.]
Contributing (Developer Setup)

Interested in contributing?

Make sure to Fork the GitHub Repo. Clone your fork. Then use poetry to install the timetk package.

1 GitHub

To contribute, you’ll need to have a GitHub account. Then:

1.1 Fork our pytimetk repository

Head to our GitHub Repo and select “fork”. This makes a copied version of timetk for your personal use.

1.2 Clone your forked version

Cloning will put your own personal version of timetk on your local machine. Make sure to replace [your_user_name] with your user name.
git clone https://github.com/[your_user_name]/pytimetk
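Before running any Poetry commands, change into the repository you just cloned. This step isn't spelled out on the page; the directory name below is assumed from the default clone URL.

```bash
# Move into your local clone before running Poetry commands
# (directory name assumed from the default clone URL)
cd pytimetk
```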

2 Poetry Environment Setup


To install timetk using Poetry, follow these steps:


1. Prerequisites


Make sure you have Python 3.9 or later installed on your system.
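If you are not sure which interpreter is on your PATH, a quick check (not part of the original page) confirms the 3.9+ requirement:

```bash
# Confirm the Python version on your PATH is 3.9 or later
python --version   # or: python3 --version
```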


2. Install Poetry


To install Poetry, you can use the official installer provided by Poetry. Do not use pip.
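As a sketch of that step, the official installer is typically invoked as shown below; check the Poetry documentation for the current command before running it.

```bash
# Official Poetry installer from python-poetry.org (do not use pip)
curl -sSL https://install.python-poetry.org | python3 -
```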


3. Install Dependencies


Use Poetry to install the package and its dependencies:

poetry install

or you can create a virtualenv with poetry and install the dependencies

poetry shell
poetry install
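A quick way to sanity-check the environment afterwards (not part of the original instructions; the package imports as `timetk` throughout these docs):

```bash
# Verify the development install is importable inside the Poetry environment
poetry run python -c "import timetk; print('timetk imported OK')"
```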

3 Submit a Pull Request


1. Make changes on a Branch


Make your changes in your local version on a branch, where my-feature-branch is the name of a branch you’d like to create to contain your modifications.

git checkout -b my-feature-branch
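The page goes straight from branching to pushing; in between, stage and commit your work. These are standard git commands with a placeholder commit message.

```bash
# Stage and commit your changes before pushing the branch
git add .
git commit -m "Describe the change you made"
```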

2. Push to your forked version of pytimetk

git push origin my-feature-branch

3. Create a Pull Request

  • Go to your forked repository on GitHub and switch to your branch.
  • Click on “New pull request” and compare the changes you made with the original repository.
  • Fill out the pull request template with the necessary information, explaining your changes, the reason for them, and any other relevant information.

4. Submit the Pull Request

  • Review your changes and submit the pull request.

4 Next Steps 🍻


We will review your PR. If all goes well, we’ll merge! And then you’ve just helped the community. 🍻

\ No newline at end of file
diff --git a/docs/_site/getting-started/01_installation.html b/docs/_site/getting-started/01_installation.html
index 41dbb577..02706786 100644
--- a/docs/_site/getting-started/01_installation.html
+++ b/docs/_site/getting-started/01_installation.html
@@ -20,6 +20,40 @@
[Hunk adds Quarto's standard syntax-highlighting CSS (pre > code.sourceCode rules) to the page's inline styles.]
@@ -190,11 +224,55 @@
[Hunk updates the generated sidebar navigation markup; no readable documentation text.]