add Min-K%++ attack #19

Merged: 2 commits into iamgroot42:main, Apr 5, 2024
Conversation

@zjysteven (Contributor)

Hi @iamgroot42, first of all, thanks for putting up this unified benchmark.

Overview

This pull request adds our proposed method, Min-K%++, which outperforms other reference-free methods and performs on par with the Ref method on MIMIR. See the Experiments section below for details.

Changed files

We follow the contribution instructions to create attacks/mink_plus_plus.py and register the method in attacks/all_attacks.py and attacks/utils.py. Since our method needs the full probability distribution over the vocabulary, we modify the get_probabilities function in models.py with an additional argument, return_all_probs, which makes it also return the full distribution; a sketch follows.
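
For reference, a minimal standalone sketch of what the return_all_probs extension might look like. Apart from the argument name and the call shape visible in the review thread below, the signature and tensor names here are our illustration, not the repo's exact code:

import torch
import torch.nn.functional as F

# Sketch only: the actual method in models.py is a class method taking
# (self, document, tokens, ...); this standalone version shows the idea.
def get_probabilities(logits: torch.Tensor, input_ids: torch.Tensor,
                      return_all_probs: bool = False):
    shift_logits = logits[..., :-1, :].contiguous()        # predictions for positions 1..T
    shift_labels = input_ids[..., 1:].contiguous()         # tokens actually observed
    all_log_probs = F.log_softmax(shift_logits, dim=-1)    # (T-1, vocab_size)
    target_log_probs = all_log_probs.gather(
        -1, shift_labels.unsqueeze(-1)).squeeze(-1)        # (T-1,)
    if return_all_probs:
        return target_log_probs, all_log_probs
    return target_log_probs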

Meanwhile, we change run.py accordingly to include our Min-K%++ attack. Additionally, we let both Min-K% and Min-K%++ run over a list of k values, so that the two methods can be compared at their respective best-performing k, i.e., by their upper bounds; see the sketch below.
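
For context, here is a hedged sketch of the Min-K%++ score with the k-sweep built in, following the scoring rule described in the paper; tensor names and the default k values are illustrative, not MIMIR's exact code:

import torch

# Min-K%++ sketch: normalize each token's log-prob by the mean and std of
# log-probs under the model's next-token distribution, then average the
# lowest k% of the normalized scores for each k in a list.
def mink_plus_plus(target_log_probs, all_log_probs, ks=(0.1, 0.2, 0.3)):
    probs = all_log_probs.exp()                                  # (T, V) softmax
    mu = (probs * all_log_probs).sum(dim=-1)                     # E[log p], shape (T,)
    sigma = ((probs * all_log_probs.square()).sum(dim=-1)
             - mu.square()).clamp(min=0).sqrt()                  # std of log p
    token_scores = (target_log_probs - mu) / sigma               # normalized per token
    return {k: token_scores.topk(max(1, int(k * len(token_scores))),
                                 largest=False).values.mean().item()
            for k in ks}

Taking each method's best score across the swept k values then gives the upper-bound comparison described above.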

Lastly, we update the README to incorporate Min-K%++.

Experiments

[Image: results table comparing Min-K%++ with the other attacks in MIMIR]
I'm putting this results table here in case MIMIR wants to add a leaderboard or similar. We use non-deduped Pythia models and the default ngram_13_0.8 setting; the data splits are fetched from the cache_100_200_1000_512 folder on Hugging Face.

Let us know if there are any issues. Otherwise, we look forward to Min-K%++ being integrated into the MIMIR benchmark.

@iamgroot42 (Owner)

Thanks a lot for the PR, @zjysteven! I have a couple of suggestions, which I have added via comments. Submitting such a well-structured PR is already contribution enough, so I would understand if you would rather have us make these changes, but it would be great if you could make them yourself.

@zjysteven (Contributor, Author)

Yes, I can make further changes. Somehow I'm not seeing your comments, though; could you point me to them?

# Reuse cached (log-)probabilities when available; otherwise recompute them.
target_prob, all_prob = (
    (probs, all_probs)
    if (probs is not None and all_probs is not None)
    else self.model.get_probabilities(document, tokens=tokens, return_all_probs=True)
)

@iamgroot42 (Owner)

Is there a reason for computing softmax and log_softmax separately? Given access to the latter (which is the default right now), an exp() operation should give you the desired softmax outputs for use in Min-K%++.

@zjysteven (Contributor, Author) commented Apr 4, 2024

I see. Earlier I misread and thought that we only had softmax, and since log_softmax is numerically more stable than log(softmax), I computed them separately (but yes, as you pointed out, what we already have is log_softmax). I will make the change.
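
A small illustration of the stability point, not taken from the repo: with extreme logits, log(softmax(x)) underflows where log_softmax stays finite.

import torch
import torch.nn.functional as F

logits = torch.tensor([[100.0, 0.0, -100.0]])
print(F.log_softmax(logits, dim=-1))           # ≈ [[0., -100., -200.]]
print(torch.log(F.softmax(logits, dim=-1)))    # ≈ [[0., -100., -inf]] (underflow)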

@zjysteven (Contributor, Author)

I do want to clarify here that while target_prob holds the log probability of each input token, all_prob holds the log probabilities of every token in the vocabulary at each position (the whole categorical distribution); see the illustration below.
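
Concretely, the shapes relate as follows (made-up tensors and sizes, purely for illustration):

import torch

seq_len, vocab_size = 8, 50304   # illustrative sizes
all_prob = torch.randn(seq_len, vocab_size).log_softmax(dim=-1)         # (seq_len, vocab_size)
token_ids = torch.randint(vocab_size, (seq_len,))
target_prob = all_prob.gather(-1, token_ids.unsqueeze(-1)).squeeze(-1)  # (seq_len,)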

run.py (outdated)
loss=loss,
)
sample_information[attack].append(score)
if attack in [AllAttacks.MIN_K, AllAttacks.MIN_K_PLUS_PLUS]:

@iamgroot42 (Owner)

Our intent with the structure was to have as few attack-specific checks as possible. We plan on adding attack-specific hyper-parameters to the config file, so that any changes an attack requires here can be achieved through different config files rather than hard-coded hyper-parameters and switch-cases in the code!
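
A hypothetical illustration of that direction; the eventual MIMIR config schema may well look different:

# Hypothetical attack config: hyper-parameters live in the config, so run.py
# can stay generic and free of attack-specific branches.
config = {
    "attacks": {
        "min_k":   {"k": [0.1, 0.2, 0.3, 0.4, 0.5]},
        "min_k++": {"k": [0.1, 0.2, 0.3, 0.4, 0.5]},
    }
}
for attack_name, params in config["attacks"].items():
    for k in params["k"]:
        pass  # instantiate and run the attack with these hyper-parameters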

@zjysteven (Contributor, Author)

That's also a valid point. I will just remove this modification so that run.py follows the current design.

@iamgroot42 (Owner)

Thanks! They should be visible now.

@zjysteven (Contributor, Author) commented Apr 4, 2024

I've made the changes accordingly in the latest commit. @iamgroot42, would you take another look?

@@ -116,7 +113,6 @@ def get_probabilities(self,
         if no_grads:
             logits = logits.cpu()
         shift_logits = logits[..., :-1, :].contiguous()
-        probabilities = torch.nn.functional.softmax(shift_logits, dim=-1)

@zjysteven (Contributor, Author)

As suggested above, we don't need to call softmax separately, given that log_softmax is computed anyway; the attack can recover softmax outputs downstream.
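
Where softmax outputs are still needed (e.g., for the mean/std terms in Min-K%++), they can be recovered from the log_softmax output with a single exp(); a quick sanity check:

import torch

log_probs = torch.randn(4, 10).log_softmax(dim=-1)
probs = log_probs.exp()                      # same as softmax of the original logits
assert torch.allclose(probs.sum(dim=-1), torch.ones(4))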

@iamgroot42 (Owner)

Thanks for making the changes! I will run our standard workflow and merge it.

@iamgroot42 merged commit f50d0b3 into iamgroot42:main on Apr 5, 2024 (1 check passed).