not able to reproduce paper results #1

Open
ronslos opened this issue Feb 8, 2018 · 15 comments

Comments

@ronslos

ronslos commented Feb 8, 2018

I have been running the code with the default parameters, but the loss does not decrease substantially and the results look nothing like the ones in the paper.
Here is what I am getting on CC:
image

image

Please advise on what should be changed in order to achieve results such as in the paper.
Thanks.

@lihenryhfl
Collaborator

What OS and versions of sklearn, numpy, tensorflow, and keras are you using? If there are any changes at all, can you try stashing them and running the CC code again?
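
A quick way to gather the details asked about here is a short script along the following lines. It is only a sketch: the helper name `collect_versions` is made up for illustration, and it assumes each package exposes a `__version__` attribute, which the four libraries named above do but which is not guaranteed in general.

```python
import importlib
import platform
import sys


def collect_versions(packages=("sklearn", "numpy", "tensorflow", "keras")):
    """Return a dict of environment details suitable for pasting into a bug report."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
            continue
        info[name] = getattr(module, "__version__", "unknown")
    return info


# Example use:
#   for key, value in collect_versions().items():
#       print(key, value)
```

Pasting the output as text rather than as a screenshot makes the versions easier to compare across machines.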

@ronslos
Author

ronslos commented Feb 8, 2018

Here are all the packages I am using currently.
image

I have not made any changes to the original code besides changing the name of the dataset and setting the active gpu index.
If you can tell me which versions you are using on your system I can try to test with your configuration.

@lihenryhfl
Collaborator

lihenryhfl commented Feb 8, 2018

That's strange. I have tested on two macOS systems and one Linux system, once using your version configuration, but I can't reproduce the error.

I have a few more questions:

  1. What OS are you using?
  2. Are you using python 2 or python 3?
  3. Are you running run.py, with src/applications/ as your working directory?

EDIT: I was not running with the gpu build of tensorflow-1.5.0. Can you try running with no GPU, or downgrading to tensorflow-1.4.0, and running again?

@ronslos
Author

ronslos commented Feb 8, 2018

  1. I am running Ubuntu 16.04
  2. My python version is 3.6
  3. I ran run.py from the SpectralNet root directory

I have downgraded to tensorflow 1.4.0 cpu only as you suggested. Still no change.

@lihenryhfl
Collaborator

Everything else seems normal, except that we haven't tried this on a machine with Ubuntu 16.04 before. It would be strange if this were the issue, but please try running the cc code on a different machine (preferably a mac) if possible.

@ronslos
Author

ronslos commented Feb 8, 2018

It's working on mac.
I wonder, what could cause it to vary between OS's?
Thanks!

@lihenryhfl
Collaborator

No problem! Agreed, it's definitely still an issue, and a pretty big one, since with v1.5.0 tensorflow-gpu now implicitly requires Ubuntu 16.04. I'm currently working on gaining access to a machine with Ubuntu 16.04 and seeing if I can reproduce it on my end.

@ronslos
Author

ronslos commented Mar 11, 2018

I'm still working with your code.
I managed to run the example datasets given in your code, but when I build my own dataset following an example given in your paper I get the following results:

image
Here is the code I use to build this dataset:

import numpy as np

def generate_circles(n=2400, circles_num=3, noise_sigma=0.01, train_set_fraction=1.):
    # concentric circles, one cluster per circle, radii shrinking from initial_r
    pts_per_cluster = int(n / circles_num)
    initial_r = 1.0
    r = initial_r
    x = np.zeros([0, 2])
    y = np.zeros([0, 1])

    # generate clusters and their labels
    for i in range(circles_num):
        theta = (np.random.uniform(0, 1, pts_per_cluster) * 2 * np.pi).reshape(pts_per_cluster, 1)
        cluster = np.concatenate((np.cos(theta) * r, np.sin(theta) * r), axis=1)
        x = np.concatenate((x, cluster), axis=0)
        y = np.concatenate((y, i * np.ones(shape=(pts_per_cluster, 1))), axis=0)
        r -= initial_r / circles_num

    # add noise to x
    x = x + np.random.randn(x.shape[0], 2) * noise_sigma

    # shuffle, using the actual point count (smaller than n when n is not divisible by circles_num)
    n_pts = x.shape[0]
    p = np.random.permutation(n_pts)
    y = y[p]
    x = x[p]

    # make train and test splits
    n_train = int(n_pts * train_set_fraction)
    x_train, x_test = x[:n_train], x[n_train:]
    y_train, y_test = y[:n_train].flatten(), y[n_train:].flatten()

    # the posted snippet ended before any return statement; this one is added so the
    # function is usable, and the ordering of the returned values is an assumption
    return x_train, x_test, y_train, y_test

Basically I copied make_cc and modified it a bit.
Can you please guide me on how to get SpectralNet to converge for this dataset?

@lihenryhfl
Collaborator

I'll look into this. But the most important hyperparameters to tweak are n_nbrs, scale_nbrs and affinity, so I'd recommend starting there.
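
One way to explore these three hyperparameters systematically is a small grid sweep. The sketch below only enumerates combinations; the candidate values and the helper name `hyperparameter_grid` are placeholders for illustration, not recommended settings, and how each combination is passed to a training run depends on how the configuration is set up locally.

```python
from itertools import product


def hyperparameter_grid(search_space):
    """Yield one dict per combination of the candidate values."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


# Candidate values are placeholders chosen for illustration only.
search_space = {
    "n_nbrs": [2, 3, 5],
    "scale_nbrs": [2, 5, 10],
    "affinity": ["full", "siamese"],
}

for params in hyperparameter_grid(search_space):
    # Merge `params` into the run configuration and launch one training run here.
    print(params)
```

Recording the final loss for every combination makes it easier to see which of the three settings matters most on a new dataset.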

@oj9040

oj9040 commented Aug 9, 2018

Hello,

I ran into the same reproducibility issue when running your SpectralNet code.

On the "cc" dataset, my results were close to those in the paper.

However, using the default code and hyperparameter settings,
I got ACC 0.752, NMI 0.745 for mnist, and ACC 0.747, NMI 0.448 for reuters, which is far below the numbers reported in the paper (ACC 0.971, NMI 0.924 for mnist, ACC 0.803, NMI 0.532 for reuters).

I found that one of the hyperparameters, "patience epochs", does not match Table 3 in the paper, where its value also varies by dataset.
After setting it to 10, as shown in the table, the accuracy went up for mnist but down for reuters: ACC 0.791, NMI 0.791 for mnist, and ACC 0.619, NMI 0.328 for reuters.

Can you give advice on how to reach the accuracy reported in the paper?

FYI, the OS environment is CentOS Linux 7 with TensorFlow 1.9.0 and Python 3.6.3.
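
For anyone comparing ACC figures like these, clustering accuracy is commonly computed after finding the best one-to-one mapping from cluster ids to class labels, since cluster ids are arbitrary. The pure-Python sketch below illustrates that idea with a brute-force search over mappings, so it is only practical for a handful of clusters. It is not the repository's evaluation code, and the function name is hypothetical.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Fraction of points labelled correctly under the best cluster-to-class mapping."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    if len(clusters) > len(classes):
        raise ValueError("this sketch assumes no more clusters than classes")

    # counts[(cluster, cls)] = number of points in `cluster` whose true label is `cls`
    counts = Counter(zip(y_pred, y_true))

    best = 0
    # Try every assignment of distinct classes to the clusters (brute force).
    for assigned in permutations(classes, len(clusters)):
        matched = sum(counts[(c, k)] for c, k in zip(clusters, assigned))
        best = max(best, matched)
    return best / len(y_true)
```

With ten or more clusters, as in mnist, an assignment solver such as the Hungarian algorithm would be needed instead of enumerating permutations.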

@lihenryhfl
Collaborator

That's perplexing. I don't have access to a CentOS operating system at the moment. Can you try running this on Ubuntu or Mac?

@oj9040

oj9040 commented Aug 14, 2018

Thank you for the advice.
Could you specify the exact versions of Ubuntu, TensorFlow, and Python that you have tested on?

@lihenryhfl
Collaborator

I've tested on Python 3.4-3.6, Tensorflow 1.4-1.8, and Ubuntu 14.04, 16.04, and 18.04. I have also tried running Tensorflow 1.5 on Python 3.5 on macOS. I will try running on Tensorflow 1.9 by the end of this week and get back to you. In the meantime, if convenient, can you try one of these?

@oj9040

oj9040 commented Aug 18, 2018

I have tried Python 3.5 and TensorFlow 1.4 on both Ubuntu 14.04.5 and CentOS 7.
The accuracy is now as expected:
ACC 0.969 and NMI 0.921 (Ubuntu 14.04.5)
ACC 0.97 and NMI 0.922 (CentOS 7)

Based on this, the OS does not seem to be the cause of the reproducibility issue; it appears to come from the Python and TensorFlow versions instead.
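
Given that the version appears to matter, a small guard at the top of a run script can flag an untested TensorFlow release before training starts. The sketch below assumes the tested range is TensorFlow 1.4 through 1.8, as stated earlier in this thread; the function names are hypothetical and the check looks only at the major and minor version numbers.

```python
def parse_major_minor(version):
    """Turn a version string such as '1.4.0' into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def in_tested_range(version, lowest=(1, 4), highest=(1, 8)):
    """Return True if `version` falls inside the inclusive tested range."""
    return lowest <= parse_major_minor(version) <= highest


# Example use (requires TensorFlow to be installed):
#   import tensorflow as tf
#   if not in_tested_range(tf.__version__):
#       print("Warning: untested TensorFlow version", tf.__version__)
```

Comparing tuples of integers rather than raw strings avoids the trap where "1.15" sorts before "1.4" alphabetically.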

@angus040107

Python 3.6, TensorFlow 1.4, Keras 2.1.6, Ubuntu 14.04:
mnist -> ACC: 0.97, NMI: 0.923
reuters -> ACC: 0.812, NMI: 0.544
But it doesn't work with Python 3.7, TensorFlow 1.15, Keras 2.3 on Ubuntu 14.04, and I don't know why.
image
image
