Record compute time for each SPI #7
Just wanted to echo this -- I presented pyspi at CNS 2022 and received questions about approximately how long each SPI takes, so that users can estimate the time requirements for a job.
We could check whether the preprocessed information is computed quickly enough that it can be neglected for an initial estimate. If so, this should be straightforward to benchmark on a range of simple VAR processes, varying the number of processes and the number of time points; a sketch of such a data generator follows.
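For concreteness, here is a minimal sketch of generating such VAR benchmark data (my own illustration, not from the thread; simulate_var1, the coupling strength, and the size grid are arbitrary choices):

import numpy as np

def simulate_var1(M, T, coupling=0.2, seed=0):
    """Simulate a stable VAR(1) process with M variables and T time points."""
    rng = np.random.default_rng(seed)
    # Random coupling matrix, rescaled so its spectral radius is < 1 (stability).
    A = rng.standard_normal((M, M))
    A *= coupling / max(abs(np.linalg.eigvals(A)))
    X = np.zeros((M, T))
    X[:, 0] = rng.standard_normal(M)
    for t in range(1, T):
        X[:, t] = A @ X[:, t - 1] + rng.standard_normal(M)
    return X

# Benchmark grid: vary the number of processes and the number of time points.
for M in (2, 5, 10):
    for T in (100, 500, 1000):
        dataset = simulate_var1(M, T)  # feed into pyspi's Calculator(dataset=dataset)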
For posterity, as I'm sure no one else cares after 18 months:

import numpy as np
import pandas as pd
import time
from pyspi.calculator import Calculator

np.random.seed(42)  # seed numpy's RNG (random.seed would not affect np.random.randn)
M = 2    # number of processes
T = 300  # number of observations (time points)
dataset = np.random.randn(M, T)
calc = Calculator(dataset=dataset)

# Time each SPI in isolation: clear the SPI table and re-add one entry at a time.
spi_items = calc.spis.copy()
df_rows = []
for (k, v) in spi_items.items():
    calc.spis.clear()
    calc.spis[k] = v
    begTime = time.perf_counter()
    calc.compute()
    calcTime = time.perf_counter() - begTime
    df_rows.append(dict(spi=k, time=calcTime))
pd.DataFrame(df_rows).to_csv("calc_spi_times.csv", index=False)
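To rank SPIs by cost once the CSV exists, a one-liner along these lines (my addition, not part of the original comment) does the job:

import pandas as pd
print(pd.read_csv("calc_spi_times.csv").sort_values("time", ascending=False).head(10))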
Thank you very much for adding this, @mesner! Very helpful, indeed :) You bring up an interesting point about computation taking ~2x as long when doing each SPI piecemeal versus all at once, which was also my experience when I tried a similar analysis. I believe @olivercliff designed pyspi with a sort of hierarchical computation scheme, wherein some parent computations are performed for a given SPI group (e.g., transfer entropy, precision matrices) and then propagate to the individual SPIs therein to save time/computation. So it's a bit tricky to derive the amount of time each individual SPI takes in practice, but I think this is a great approximation for users interested in the relative computation time of each SPI. For example, it makes sense that the convergent cross-mapping (ccm_) SPIs take orders of magnitude longer than most of the other SPIs.

For what it's worth, we played around with this question using different SPI subset configurations and multivariate time series (MTS) data sizes, if you're interested: https://pyspi-toolkit.readthedocs.io/en/latest/faq.html#how-long-does-pyspi-take-to-run
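A quick way to quantify that shared-computation effect (a sketch of my own that reuses the dataset size and CSV from the script above, not something from the thread) is to compare one batch compute() against the summed per-SPI timings:

import time
import numpy as np
import pandas as pd
from pyspi.calculator import Calculator

np.random.seed(42)
dataset = np.random.randn(2, 300)  # same toy data as the benchmark script
calc_all = Calculator(dataset=dataset)

t0 = time.perf_counter()
calc_all.compute()  # one pass: parent computations are shared within SPI groups
batch_time = time.perf_counter() - t0

# Sum of the isolated per-SPI timings recorded by the benchmark script.
piecemeal_time = pd.read_csv("calc_spi_times.csv")["time"].sum()
print(f"batch: {batch_time:.1f}s  piecemeal: {piecemeal_time:.1f}s  "
      f"ratio: {piecemeal_time / batch_time:.2f}x")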
This will be useful for knowing which methods are fast/slow to compute, and will allow users to select faster options.
This might be finicky, since many of the methods inherit preprocessed information from other methods (e.g., all spectral methods inherit spectral decompositions); a toy illustration of this sharing pattern is sketched below.
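The sharing pattern described here can be illustrated generically (purely illustrative; SpectralCache and coherence_like_spi are made-up names, not pyspi internals):

import numpy as np

class SpectralCache:
    """Toy illustration: spectral SPIs share one expensive decomposition."""
    def __init__(self, dataset):
        self.dataset = dataset  # (processes, time) array
        self._csd = None        # cross-spectral density, computed lazily

    def csd(self):
        if self._csd is None:
            # Expensive preprocessing, done once and then inherited by every
            # spectral SPI; timing a single SPI in isolation pays this cost.
            fft = np.fft.rfft(self.dataset, axis=1)
            self._csd = np.einsum('if,jf->ijf', fft, np.conj(fft))
        return self._csd

def coherence_like_spi(cache):
    # Each spectral SPI reuses the cached decomposition instead of redoing it.
    S = cache.csd()
    return np.abs(S).mean(axis=-1)

This is also why isolated per-SPI timings overestimate: whichever SPI runs first in a group absorbs the shared preprocessing cost.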