Mostly reading. Talked with Greg and Kasra about their experience with UCLH synthetic data. Offered informal chats to academic and clinical collaborators before kick-off on 31st Jan.
There were two work streams pre-Christmas, outlined in this standup. Briefly:
Privacy Analysis of Available Tables using IPF
This is currently in library form. The privacy analysis is still rudimentary, with two methods: 1) membership inference attacks, 2) 'information gain' from inclusion (James Jordon's idea).
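For anyone new to IPF (iterative proportional fitting), here is a minimal sketch of the idea on a two-way table. It is not the code in our library; the function name `ipf`, its parameters, and the toy numbers are all made up for illustration. The seed table is rescaled row by row, then column by column, until its margins match the target totals.

```python
def ipf(seed, row_targets, col_targets, iters=100, tol=1e-9):
    """Fit a 2-D table to target row and column totals (illustrative only)."""
    table = [[float(v) for v in row] for row in seed]
    for _ in range(iters):
        # Scale each row so it sums to its target total.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Scale each column so it sums to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns match after the column step; stop once rows match too.
        if all(abs(sum(table[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return table


# Toy example: uniform seed, hypothetical margins that both sum to 100.
fitted = ipf([[1, 1], [1, 1]], row_targets=[30, 70], col_targets=[40, 60])
```

With a uniform seed the fitted table is just the product of the margins divided by the grand total, so any structure beyond the margins has to come from the seed.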
QUIPP and synthgauge
synthgauge will be made public. It is an ONS Python library for visualising synthetic data and generating privacy and utility metrics.
Both of the above streams are paused for now. There is now another focus, from Lukasz: looking at the NIST competition-winning algorithm and comparing it with the IPF method above. The comparison will be made by applying the Stadler et al. privacy framework. It was suggested to use the 'Texas data set' used in https://arxiv.org/abs/2011.07018 (I'm not quite sure why we are using the Texas data set rather than census data).
My main job for this week is to understand the Stadler et al. privacy framework and be able to run the code.
Please post any useful updates from your project and agenda items you wish to discuss in our weekly Tuesday meeting.