-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDS #205
MDS #205
Conversation
Do we need to preprocess? Because if we need to preprocess, then the best approach is likely to give the result of IsotonicRegression as an input to MDS. |
The preprocessing is generally needed for points that have the same |
So I think we can isotonic regression fit and then run the linear regression ourselves? |
I think I don't understand the concept |
What I mean is that you defined a |
Yeah, that's probably the best idea |
I don't know why the test results here are so strange. I mean, I know that when I invoke |
@msluszniak what exactly is strange about it? |
For example, when I choose a smaller epsilon |
Locally, the response for smaller epsilon was as expected: much more iterations and smaller stress loss. And they correspond to what sklearn implemenation returns. |
@msluszniak I see. In this case I can't help much, it seems to be related to the algorithm. :( maybe |
Yeap, there is definitely something wrong with smacof procedure since it's mathematically proven that the stress shouldn't increase but currently sometime it does for bigger number of samples |
Ok, I found a problem. The problem was in pairwise distance. For a function that calculates pairwise distance on a single tensor, there was only a check for negative values (that occures sometimes because of numerical instability). However, there is also a check needed for positive values on the main diagonal. Ideally, there should be only zeros, but because of numerical instability sometimes there are positive values that in my specific case cause that loss explode to huge values |
@josevalim why on one CI two test fails and on the other three? And why do even these tests fail? 😅 |
@msluszniak for what is worth, it is very common to have precision differences across machines and compiler versions. You will have to either increase the tolerance of the matrix comparison or test based on a property (I don't know if there is such a property for MDS). |
Ok, I understand. Unfortunately, there is no such property in MDS. This algorithm is very sensitive to any numerical changes, and the final result will be completely different (but also correct, only embedded in different space). I guess that we may flag these tests as @to_skip or something like that because I think it's good to have them just in case. |
So how do you know MDS is correct? :D |
Actually, via visualization and empirical check. Even |
Sounds like I plan to me. |
I think it's ready to go |
💚 💙 💜 💛 ❤️ |
Closes #163
Currently, because we call isotonic regression inside
defn
I changed an implementation of isotonic regression so it would be able to runing MDS. Ofc, commenting whole sections of code is temporal solution and I'm wondering what we should do with it.