
Moons dataset looks weird #16

Open
kennysong opened this issue May 6, 2020 · 4 comments

@kennysong

Nice article! The image of the moons dataset at the end of the article looks a bit strange since the points are heavily overlapping and not in a moon shape.

[Screenshot (2020-05-06): the moons dataset image from the end of the article]

Is it the correct image?

@apoorvagnihotri
Collaborator

apoorvagnihotri commented May 6, 2020

Thanks a lot for the appreciation. :)

Yes, it is the correct image; note that we used a variance of 1 when creating the dataset. You can find the code used to replicate the above image here.
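For reference, a minimal sketch of how such a dataset may have been generated, assuming scikit-learn's `make_moons` with its `noise` parameter set to 1 (my reading of the "variance of 1" mentioned above — the actual generation code is in the repo linked above):

```python
from sklearn.datasets import make_moons

# `noise` is the standard deviation of Gaussian noise added to each
# point. At noise=1.0 the two half-moons overlap heavily, which is
# why the plotted points no longer look moon-shaped.
X, y = make_moons(n_samples=500, noise=1.0, random_state=0)
print(X.shape, y.shape)  # (500, 2) (500,)
```

With a more typical value like `noise=0.1`, the two interleaving half-circles are clearly visible.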

@kennysong
Author

Thanks for the reference! Was the reason for the high noise just to make the task harder and spread out the performance of different hyperparameter sets?

@apoorvagnihotri
Collaborator

apoorvagnihotri commented May 8, 2020

Hi @kennysong. To be honest, we didn't consider varying the noise, but I think your point is valid here.

I ran some experiments with low noise, and Bayesian Optimization (BO) performs better. Please find attached the moons dataset results on an SVM.

New, simpler moons dataset.

New accuracies on using an SVM.

Old accuracies on using an SVM.


New accuracies on using an RF.

Old accuracies on using an RF.

The task in the original article is much more difficult because of the very high noise in the dataset, so it was hard for BO to perform well there. As seen here, reducing the complexity helped BO achieve better results.
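The effect of noise on task difficulty can be sketched directly (this is not the notebook's code — just a quick illustration, assuming scikit-learn's `make_moons` and an RBF `SVC` with default hyperparameters):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def moons_svm_accuracy(noise):
    """Held-out accuracy of an RBF SVM on a moons dataset with the given noise."""
    X, y = make_moons(n_samples=1000, noise=noise, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    return SVC(C=1.0, gamma="scale").fit(X_tr, y_tr).score(X_te, y_te)


print(moons_svm_accuracy(0.1))  # low noise: near-perfect accuracy
print(moons_svm_accuracy(1.0))  # high noise: noticeably lower
```

With high noise, the classes overlap so much that even well-tuned hyperparameters can't recover much accuracy, which compresses the gap between tuning strategies.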

You can have a look at the ipynb used here.

Thanks a lot for pointing this out. I’ll keep this issue open as I think this is an important point. :)

@kennysong
Copy link
Author

Oh cool! It does look like the original harder task spread out the performance of different algorithms more. With less noise, all approaches seem to do better and converge to the global optimum.

Thanks for following up!
