Fair and square model selection #10

Open
forrestbao opened this issue Dec 4, 2022 · 0 comments

forrestbao commented Dec 4, 2022

In the code below, we used two models of quite different capacities. For bert-score and bertscore-sentence-MNLI, we used RoBERTa-large, which is about 1.6 GB (the default for bert-score as implemented in HF's evaluate library). But for bertscore-sentence, which is built on top of Sentence-BERT, we used all-MiniLM-L6-v2, which is only about 80 MB. This puts our bertscore-sentence approach at a huge disadvantage; of course, we picked that model for speed in pilot studies.

https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/dar_env.py#L8-L11

I think that if we use a larger-capacity model for bertscore-sentence, we can further boost our sentence-based pairwise approach.

There are two directions we can try:

  1. A quick one: just use a larger model trained by the Sentence-BERT project. Let's try two, all-mpnet-base-v2 and all-roberta-large-v1. The former is still much smaller than RoBERTa-large but scores higher on the Sentence-BERT leaderboard, while the latter is just RoBERTa-large trained with Sentence-BERT's dot-product loss. So let's test both versions below:

       import sentence_transformers

       sent_embedder = sentence_transformers.SentenceTransformer("all-mpnet-base-v2")     # smaller, but higher leaderboard scores
       sent_embedder = sentence_transformers.SentenceTransformer("all-roberta-large-v1")  # RoBERTa-large with SBERT training

    BTW, we can use HF's transformers library for Sentence-BERT as well. That way, we don't have to import both transformers and sentence_transformers, and we can consolidate all code under one framework; see the first sketch after this list.

  2. A slower but completely fair approach: also use RoBERTa-large (generally trained, not fine-tuned on MNLI) to embed each sentence and extract the embedding corresponding to the [CLS] token. For how to do it, see here, and the second sketch after this list.
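
For the consolidation note under direction 1, here is a minimal sketch (not code from this repo; the checkpoint name and mean-pooling recipe follow the all-mpnet-base-v2 model card) of getting sentence embeddings through plain transformers:

    import torch
    from transformers import AutoTokenizer, AutoModel

    name = "sentence-transformers/all-mpnet-base-v2"  # illustrative checkpoint
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)

    sentences = ["Example sentence one.", "Example sentence two."]
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**inputs).last_hidden_state

    # Mean-pool token embeddings, ignoring padding positions.
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    sent_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)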
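
And for direction 2, a sketch under similar assumptions (generally trained roberta-large from HF, a batch of raw sentences). Note that RoBERTa's <s> token plays the role of [CLS] and sits at position 0 of last_hidden_state:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("roberta-large")
    model = AutoModel.from_pretrained("roberta-large")

    sentences = ["Example sentence one.", "Example sentence two."]
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # <s> (RoBERTa's [CLS] equivalent) is always the first token of each sequence.
    cls_embeddings = outputs.last_hidden_state[:, 0, :]  # (batch, hidden_size)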

TURX added a commit that referenced this issue Dec 9, 2022
bs_sent: mnli allow specify classifier, cos_sim allow specify embedder
fix: #5, #10 direction 1