Modeling non-independent labeling functions? #1641

Closed
moscow25 opened this issue Apr 5, 2021 · 3 comments

moscow25 commented Apr 5, 2021

My understanding is that the Snorkel/Google paper discusses modeling feature covariance. However, in the current implementation, as far as I can tell, all features are assumed to be independent.

Currently this class uses a conditionally independent label model, in which the LFs

Do I misunderstand, or is there a way to handle labels that are very highly correlated? For example, I may have a classifier that is more accurate for short text than for longer text. At the moment I can't really create "independent" features for "model" and "model_280" (which only applies to longer text), since this skews the bootstrapping of the model.

Please let me know if I am not interpreting this correctly.

henryre commented Apr 12, 2021

Hi @moscow25, please take a look at this related thread: #1596. In this case, you could manually resolve the dependency in the labeling function itself (e.g. by running the shorter model if the text field is below some character length limit), and you could also empirically test (e.g. using a hold-out set) whether adding both models as independent labeling functions actually helps performance.
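
For illustration, resolving the dependency inside a single labeling function might look roughly like the sketch below. Everything here is an assumption made for the example: the two classifier functions are hypothetical stand-ins, the 280-character cutoff is only guessed from the "model_280" name, and the function takes a plain string. In Snorkel it would typically be wrapped with the `@labeling_function()` decorator and read a text field from the data point instead.

```python
ABSTAIN = -1        # Snorkel's convention for "no vote"
LENGTH_LIMIT = 280  # hypothetical cutoff, guessed from the "model_280" name


def short_text_model(text: str) -> int:
    """Hypothetical stand-in for the classifier that does well on short text."""
    return 1 if "great" in text.lower() else 0


def long_text_model(text: str) -> int:
    """Hypothetical stand-in for the classifier used on longer text."""
    return 1 if text.lower().count("great") >= 2 else 0


def lf_model_by_length(text: str) -> int:
    """Emit a single vote per example by routing on text length.

    Because only one of the two models votes on any given example, the
    label model never sees two highly correlated votes for the same text.
    """
    if not text:
        return ABSTAIN
    if len(text) < LENGTH_LIMIT:
        return short_text_model(text)
    return long_text_model(text)
```

The design choice is simply that the correlation is handled before the votes reach the label model, so the conditional independence assumption is less strained.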

@moscow25

Thanks @henryre. I appreciate the link to #1596. Empirically, the function works OK, and it's good to know there's a formula one could implement from the paper.
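
As a rough sketch of the hold-out comparison discussed in this thread, the snippet below scores two labeling function configurations against a small gold-labeled set. It is not Snorkel's label model: a simple majority vote is swapped in as a stand-in, and the vote matrices and gold labels are made-up toy values, so only the procedure is meaningful, not the numbers. With Snorkel, the predictions of a fitted label model would take the place of `majority_vote`.

```python
from typing import Dict, Sequence

ABSTAIN = -1  # Snorkel's convention for "no vote"


def majority_vote(votes: Sequence[int]) -> int:
    """Combine one example's LF votes; abstain on ties or when nobody votes."""
    counts: Dict[int, int] = {}
    for v in votes:
        if v != ABSTAIN:
            counts[v] = counts.get(v, 0) + 1
    if not counts:
        return ABSTAIN
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def holdout_accuracy(vote_matrix: Sequence[Sequence[int]], gold: Sequence[int]) -> float:
    """Fraction of hold-out examples labeled correctly (abstentions count as wrong)."""
    preds = [majority_vote(row) for row in vote_matrix]
    return sum(p == y for p, y in zip(preds, gold)) / len(gold)


# Made-up toy hold-out set: one row per example, one column per LF.
GOLD = [1, 0, 1, 0]
# Columns: some other LF, "model", "model_280" as separate LFs.
VOTES_SEPARATE = [[1, 1, ABSTAIN], [0, 1, 1], [1, 1, 1], [0, 0, ABSTAIN]]
# Columns: some other LF, the two models merged into one LF.
VOTES_MERGED = [[1, 1], [0, 1], [1, 1], [0, 0]]

acc_separate = holdout_accuracy(VOTES_SEPARATE, GOLD)
acc_merged = holdout_accuracy(VOTES_MERGED, GOLD)
```

Whichever configuration scores better on a real hold-out set is the one worth keeping.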

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.