Correlated rank similarity metric #59
base: develop-paper
aae547f to b04dfae
@OscartGiles This PR adds
@gmingas - Just fixing the tests now, but it looks good. To get it onto the VMs I can either redeploy the VM(s) or we can just install it manually for now. It should be added the next time a VM is deployed.
Thanks! Yes, I did add it manually to do the weekend runs. No need to redeploy now; we can wait until the next time they are deployed.
The Test pipeline run now fails because it tries to run the household_poverty stuff but doesn't have the data.
We can grab the data in the Makefile using the Kaggle API, but that would require adding an authentication token to the repo. I don't know if it is possible to do this in a secure way (probably not, from a quick search), but maybe someone has done this before? The other option is to remove the household cleaning code from the Makefile and run it manually whenever we need it, after the data are added manually too.
We could set it as an environment variable (we can save it as a secret on GitHub for use in the CI pipeline). But then we also need to make sure it is an environment variable on all our VMs, and make it clear in the README that you need a Kaggle API token. For ref https://github.com/Kaggle/kaggle-api#api-credentials
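To make the environment-variable option concrete, here is a minimal sketch of a pre-flight check that could run before any download step. It assumes the credentials are exposed as `KAGGLE_USERNAME` and `KAGGLE_KEY`, the variable names described in the Kaggle API README linked above. The helper names are hypothetical and not part of this repo.

```python
import os

# Variable names taken from the Kaggle API README; everything else is illustrative.
REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_kaggle_credentials(env=None):
    """Return the names of the credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_kaggle_credentials(env=None):
    """Raise a readable error if the credentials are not available."""
    missing = missing_kaggle_credentials(env)
    if missing:
        raise RuntimeError(
            "Missing Kaggle API credentials: " + ", ".join(missing)
            + ". Set them as environment variables (see the README)."
        )
```

Calling `require_kaggle_credentials()` at the start of the data-fetching step would make the CI run and the VMs fail early with a clear message instead of failing partway through the household cleaning code.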
…ifications Feature/dataset modifications
This PR adds support for Jenning's and Sebastian's correlated rank similarity metric.
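For readers unfamiliar with the RBO calculations this metric sits alongside, here is a minimal sketch of plain rank-biased overlap truncated at a fixed depth. It assumes "RBO" here means the standard rank-biased overlap measure. It is not the code in `rbo.py`, it does not implement the correlated rank metric or any extrapolation, and the function name is hypothetical.

```python
def truncated_rbo(ranking_a, ranking_b, p=0.9, depth=None):
    """Rank-biased overlap summed up to `depth` (no extrapolation).

    Computes (1 - p) * sum_{d=1..depth} p**(d - 1) * A_d, where A_d is the
    fraction of items shared by the top-d prefixes of the two rankings.
    Each ranking is assumed to contain no duplicate items.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    max_depth = min(len(ranking_a), len(ranking_b))
    depth = max_depth if depth is None else min(depth, max_depth)

    score = 0.0
    for d in range(1, depth + 1):
        # Agreement at depth d: size of the prefix intersection divided by d.
        overlap = len(set(ranking_a[:d]) & set(ranking_b[:d]))
        score += p ** (d - 1) * overlap / d
    return (1 - p) * score
```

With identical rankings of length k this returns 1 - p**k rather than 1, because the tail beyond depth k is left out. That missing tail is what the extrapolated variants account for.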
Changes:
- `RankingSimilarity` class of `rbo.py`. These implement the correlated rank metric, its extrapolated version and the LP solver.
- `feature_importance.py` to calculate the metric; also adds a more complete RBO calculation (all types of RBO apart from uneven extrapolation) when comparing orig vs. rlds, orig vs. rand and orig vs. lower.
- Adds `pulp` to required libraries, which will require a rebuild of the Docker image.

WIP: