2011 census microdata play #68
From our discussion in-person just now:
I think the classifiers run for a long time when no specific classifier with specific hyperparameters is passed in the run-inputs file. In that case, a number of classifiers are tested, each with many combinations of hyperparameters. I recommend using something like this to reduce the run time. It uses only logistic regression with defined params.
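To give a rough sense of why pinning a single classifier helps, here is a minimal sketch that counts model fits for a multi-classifier grid versus one fixed configuration. The classifier names, grids, and the `cv_folds` value are purely illustrative assumptions; they are not QUIPP's actual defaults or its run-inputs schema.

```python
# Hypothetical search space: names and grids are illustrative only,
# not taken from QUIPP or its run-inputs files.
search_space = {
    "LogisticRegression": {"C": [0.01, 0.1, 1.0, 10.0], "max_iter": [100, 500]},
    "RandomForest": {"n_estimators": [10, 100, 500], "max_depth": [5, 10, None]},
    "SVC": {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]},
}


def count_fits(space, cv_folds=5):
    """Count model fits for an exhaustive grid search over every classifier.

    cv_folds is an assumed number of cross-validation folds.
    """
    total = 0
    for grid in space.values():
        n_combos = 1
        for values in grid.values():
            n_combos *= len(values)  # grid size is the product of option counts
        total += n_combos * cv_folds
    return total


# Every classifier with every hyperparameter combination.
full_search = count_fits(search_space)

# One classifier with one fixed set of hyperparameters.
fixed = count_fits({"LogisticRegression": {"C": [1.0], "max_iter": [100]}})

print(f"full search: {full_search} fits, fixed config: {fixed} fits")
```

Even with this small made-up grid, the fixed configuration needs a small fraction of the fits, and the gap grows multiplicatively with each extra hyperparameter.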
… around limitation on JADE)
As part of the Synthetic Data and Privacy Preservation - Turing/ONS partnership project 3, we're trying out the QUIPP pipeline on this dataset.
Note: may or may not need to ever merge this - just putting up so @ots22 can easily pull the branch
@ots22 I've attempted to modify the existing examples to run the different synth-method choices with stock parameters, only changing the parts referring to column names. Example 4, the SGF one, worked without any errors (I've set this one to enabled: true) - if you pull the branch and set enabled: false for any of the others you should hopefully get the errors I got for those.
On the SGF one, it seems to have generated a synthetic dataset! Only there are no values for the 2nd column (possibly I wrongly chose the categorical type for the column in the dataset json here, not sure)
Also, I created an issue #67 for the error I got on the CTGAN one - as I noticed the same error when I tried to run the existing CTGAN example from run-inputs