
Record compute time for each SPI #7

Open · olivercliff opened this issue Feb 15, 2022 · 4 comments
Labels: enhancement (New feature or request)

Comments

@olivercliff (Collaborator)

This would be useful for knowing which methods are fast/slow to compute, and would allow users to select faster options.

This might be finicky since many of the methods inherit preprocessed information from other methods (e.g., all spectral methods inherit spectral decompositions).

olivercliff added the enhancement (New feature or request) label on Feb 17, 2022
@anniegbryant (Member)

Just wanted to echo this -- I presented pyspi at CNS2022 and received questions about approximately how long each SPI takes, so that users can estimate the time requirements for a job.

@benfulcher (Collaborator)

Could check whether computing the preprocessed information is fast enough to be neglected for an initial estimate. If so, this could be straightforward to benchmark on a range of simple VAR processes (varying the number of processes and the number of time points), as sketched below.
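
A minimal sketch of such a benchmark, assuming only the Calculator API used elsewhere in this thread; the VAR(1) generator, coupling strength, and the grid of sizes are illustrative choices, not part of pyspi:

import time

import numpy as np
from pyspi.calculator import Calculator

def simulate_var1(n_processes, n_timepoints, coupling=0.2, seed=0):
    # Simulate a stable VAR(1) process: x_t = A @ x_{t-1} + noise,
    # rescaling A so its spectral radius stays below 1.
    rng = np.random.default_rng(seed)
    A = coupling * rng.standard_normal((n_processes, n_processes))
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    if radius >= 0.9:
        A *= 0.9 / radius
    X = np.zeros((n_processes, n_timepoints))
    for t in range(1, n_timepoints):
        X[:, t] = A @ X[:, t - 1] + rng.standard_normal(n_processes)
    return X

# Time the full computation over a small grid of sizes.
for M in (2, 5):
    for T in (100, 500):
        start = time.perf_counter()
        Calculator(dataset=simulate_var1(M, T)).compute()
        print(f"M={M}, T={T}: {time.perf_counter() - start:.1f} s")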

@mesner commented Jan 25, 2024

For posterity, as I'm sure no one else cares after 18 months: here's a code snippet I used for the same question. Yes, it's a hack, and the total time is about 2x what it takes to calculate them all at once (IIRC), but it's something. Note that some SPIs fail in my GPU-less Linux environment.

import time

import numpy as np
import pandas as pd
from pyspi.calculator import Calculator

np.random.seed(42)  # seed numpy, since np.random.randn is used below

M = 2    # number of processes
T = 300  # number of time points

dataset = np.random.randn(M, T)
calc = Calculator(dataset=dataset)

# Time each SPI in isolation by clearing the SPI table and
# reinstating one SPI at a time before calling compute().
spi_items = calc.spis.copy()
df_rows = []
for (k, v) in spi_items.items():
    calc.spis.clear()
    calc.spis[k] = v
    beg_time = time.perf_counter()
    calc.compute()
    calc_time = time.perf_counter() - beg_time
    df_rows.append(dict(spi=k, time=calc_time))

pd.DataFrame(df_rows).to_csv("calc_spi_times.csv", index=False)

Attachment: calc_spi_times.csv

@anniegbryant (Member)

Thank you very much for adding this, @mesner! Very helpful, indeed :)

You bring up an interesting point about computation taking ~2x as long when each SPI is run piecemeal versus all at once, which was also my experience when I tried a similar analysis. I believe @olivercliff designed pyspi with a sort of hierarchical computation scheme, wherein some parent computations are performed once for a given SPI group (e.g., transfer entropy, precision matrices) and then propagate to the individual SPIs therein to save time/computation. So it's a bit tricky to derive the amount of time each individual SPI takes in practice, but I think this is a great approximation for users interested in the relative computation time of each SPI. For example, it makes sense that the convergent cross-mapping (ccm_) SPIs take orders of magnitude longer than most of the other SPIs.
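
To see the shared-computation effect directly, one quick sanity check (a sketch, assuming the timing loop above has already written calc_spi_times.csv) is to compare one full compute() pass against the piecemeal sum:

import time

import numpy as np
import pandas as pd
from pyspi.calculator import Calculator

np.random.seed(42)
dataset = np.random.randn(2, 300)  # same size as in the loop above

# One full pass: parent computations are shared across SPIs.
start = time.perf_counter()
Calculator(dataset=dataset).compute()
full_time = time.perf_counter() - start

# Sum of the per-SPI times recorded piecemeal.
piecemeal_time = pd.read_csv("calc_spi_times.csv")["time"].sum()
print(f"one pass: {full_time:.1f} s; piecemeal sum: {piecemeal_time:.1f} s")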

For what it's worth, we played around with this question using different SPI subset configurations and multivariate time series (MTS) data sizes if you're interested: https://pyspi-toolkit.readthedocs.io/en/latest/faq.html#how-long-does-pyspi-take-to-run
