Running into errors when running algorithm #1

Open
szsb26 opened this issue Jun 4, 2021 · 3 comments

szsb26 commented Jun 4, 2021

Hi Aria,

First, I want to thank you immensely for implementing this algorithm, as I could not find an implementation anywhere to test the claims in the original fast RobustSTL paper.

However, I'm running into some issues when trying to run the algorithm.

When running your example with N = 1231 (the size of the data vector y), the algorithm runs fine, as shown below:

[Screenshots of the successful run omitted]

However, when I replace "y" with a vector of the same size called "data", which looks like the following:

[Screenshot of the "data" vector omitted]

I get the following error when running robustSTL

[Screenshot of the failing call omitted]

where the error is:

[Screenshot of the error message omitted]

Since "y" and "data" are exactly the same size except with different values, and all parameters are fixed, im a bit confused as to what is going on and would really appreciate it if you can provide some help/insight.

If it helps, i've also attached the data used in "data" vector. It is in the attached dataframe under column "values"

internet_traffic_stl_results_complete.csv
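
In case it helps with reproducing, the series can be loaded roughly like this (only the file name and the "values" column are from the attachment; the rest is just a sketch):

```python
import pandas as pd

# Load the attached CSV and pull out the series used as "data".
df = pd.read_csv("internet_traffic_stl_results_complete.csv")
data = df["values"].to_numpy()
print(len(data))  # should match N = 1231
```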

Thanks!
Sichen


xhd1203 commented Jun 16, 2021

@szsb26
Hello! I ran into the same problem when running the algorithm (after replacing the example data with my own). Did you manage to solve it? Also, when running the original example from the paper, I found that the decomposition results were particularly poor and completely different from the results in the paper. Did you notice this as well?


szsb26 commented Jun 16, 2021

@xhd1203
I solved the issue by scaling the data by its mean (i.e., data / data.mean()). After running the algorithm, I then rescale the trend and seasonal components back by multiplying by data.mean(). I think the issue above comes from the cvx package and not from the algorithm or implementation itself. My best guess at the core of the problem is that, since the time series I was dealing with had pretty large values, the cvx solver ran into numerical instability when handling them. Funnily enough, in the original RobustSTL paper the authors also scaled their data to [0, 1] or used a log transform. It would be useful to know whether they ran into the same issues, since they also used cvx...
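
For anyone else hitting this, a minimal sketch of the workaround looks like the following. The decomposition function here is just a placeholder for whatever this repo exposes (I'm not assuming its exact name, signature, or return order), so adjust accordingly:

```python
import numpy as np

def decompose_with_mean_scaling(data, robust_stl, **params):
    """Run the decomposition on mean-scaled data, then rescale the components.

    `robust_stl` is a stand-in for this repo's decomposition function and is
    assumed to return (trend, seasonal, remainder) arrays; adapt as needed.
    """
    data = np.asarray(data, dtype=float)
    scale = data.mean()      # assumes a strictly positive series
    scaled = data / scale    # keeps the values the cvx solver sees near 1

    trend, seasonal, remainder = robust_stl(scaled, **params)

    # Multiply back by the mean so the components are in the original units.
    return trend * scale, seasonal * scale, remainder * scale
```

The same idea should work with scaling to [0, 1] or a log transform, as in the original paper; the point is just to keep the values handed to the solver in a small range.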

I also ran into issues with the algorithm performing poorly. You can see this not only in the experiment in the paper but also in the Colab notebook linked in this repo: the magnitudes of the estimated seasonal components do not match the magnitudes of the synthetically generated true seasonal components at all. Granted, this is probably due to the paper not explaining things very well. For example, the fast RobustSTL paper does not explain how any of the denoising and regularization parameters were chosen, and, as the repo author mentioned, it does not provide enough detail for the GADMM portion to be replicated...


xhd1203 commented Jun 17, 2021

@szsb26
Thank you very much for your reply and help. The earlier problem is solved after dividing the data by its mean. I have tried adjusting the parameters many times, but the decomposition results are still not good. Anyway, thank you for your help!
