In the code below, we used two models of quite different capacities: for bert-score and bertscore-sentence-MNLI, we used RoBERTa-large, which is about 1.6 GB (the default for bert-score as implemented in HF's evaluate library). But for bertscore-sentence, which is built on top of Sentence-BERT, we used all-MiniLM-L6-v2, which is only about 80 MB. This puts our bertscore-sentence approach at a huge disadvantage. Of course, we picked the small model for speed in pilot studies. See https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/dar_env.py#L8-L11
I think if we use a large-capacity model for bertscore-sentence, we can further boost our sentence-based pair-wise approach.
There are two directions we can try:
A quick one is to simply use a larger model trained by the Sentence-BERT project. Let's try two: all-mpnet-base-v2 and all-roberta-large-v1. The former is still much smaller than RoBERTa-large but scores higher on the Sentence-BERT leaderboard, while the latter is just RoBERTa-large fine-tuned with Sentence-BERT's dot-product loss. So let's test both of these versions below:
BTW, we can use HF's transformers library for Sentence-BERT as well. That way, we don't have to import both transformers and sentence_transformers, and we can consolidate all code under one framework.
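For reference, this is roughly what that consolidation looks like: the Sentence-BERT checkpoints are ordinary HF models, and their model cards describe mean pooling over token embeddings (masking out padding) followed by normalization. A sketch, assuming all-mpnet-base-v2:

```python
# Sketch: reproducing a Sentence-BERT embedding with plain HF transformers,
# using mean pooling as described on the all-mpnet-base-v2 model card.
import torch
from transformers import AutoModel, AutoTokenizer

name = "sentence-transformers/all-mpnet-base-v2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

sents = ["The cat sits on the mat.", "It is raining."]
batch = tok(sents, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# Mean-pool token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
emb = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
emb = torch.nn.functional.normalize(emb, dim=1)
print(emb.shape)
```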
A slower but completely fair approach: we also use RoBERTa-large (generically pretrained, not fine-tuned on MNLI) to embed each sentence and extract the embedding corresponding to the [CLS] token. For how to do it, see here.
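The [CLS] extraction above can be sketched as follows (note that RoBERTa's tokenizer uses `<s>` as its [CLS]-equivalent, and it always sits at position 0):

```python
# Sketch: sentence embeddings from generically pretrained roberta-large,
# taking the hidden state at the first position (<s>, RoBERTa's [CLS]).
import torch
from transformers import AutoModel, AutoTokenizer

name = "roberta-large"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

sents = ["The cat sits on the mat.", "It is raining."]
batch = tok(sents, padding=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# Position 0 of every sequence is the <s> token for RoBERTa models.
cls_emb = out.last_hidden_state[:, 0, :]
print(cls_emb.shape)
```

Since this uses the exact same backbone as bert-score's default, any score difference is attributable to the sentence-level pairing rather than model capacity.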