Issue with Softmax in _coordinate_selection Leading to Saturated Outputs #15

Open
dezhi0730 opened this issue Sep 26, 2024 · 2 comments

@dezhi0730

Hello,

Thank you for maintaining this repository and for the effort you've put into it. While working with the model, I ran into an issue with the softmax in the _coordinate_selection function. The softmax output is often extremely saturated: one element of the position_probs tensor is 1 and all others are 0. This behavior is unexpected and may be causing problems when selecting edit positions.

Issue Details:

  • The issue occurs in the _coordinate_selection function.
  • After softmax(dim=-1) is applied to the position_probs tensor, the output has a single element equal to 1, and all other elements are 0.
  • As a result, the position with probability 1 is always selected, and the remaining edit positions are chosen at random, which is likely not the desired outcome.
  • When my is_corrupted tensor targets a specific region, such as the first half of tokenized_seq, the sequence still changes in the second half.
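
The saturation described in the list above can be reproduced outside the library. The following is a minimal sketch in plain Python, not the code from _coordinate_selection: the `softmax` helper and the `raw_scores` values are made up for illustration. It only shows that when the inputs to a softmax differ by tens of units, nearly all probability mass lands on the largest one.

```python
import math


def softmax(logits, temp=1.0):
    """Numerically stable softmax over a list of floats, with a temperature."""
    scaled = [x / temp for x in logits]
    m = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, unnormalized attribution scores with a large spread.
# These values are invented; they are not taken from the issue's screenshots.
raw_scores = [120.0, 35.0, 10.0, -40.0]

position_probs = softmax(raw_scores)

# The largest score takes essentially all of the probability mass.
print(position_probs)
```

In lower precision (for example float32 tensors) the tiny remaining probabilities can round to exactly 0, which would match the "one element is 1, the rest are 0" pattern reported here.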

Example:

[Three screenshots were attached here.]

Please feel free to reach out if further clarification is needed.

Best regards.

@samuelstanton
Collaborator

samuelstanton commented Oct 7, 2024

Thanks for raising the issue. I just pushed a change that normalizes the attributions before applying the softmax, which should alleviate the saturation. If the behavior persists in the most recent version, you can also avoid it by increasing the feature_attr_temp value.

Would you mind confirming the fix resolves your issue?

https://github.com/prescient-design/cortex/blob/main/cortex/optim/generative/_lambo.py#L264
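
To illustrate the two remedies mentioned in this comment (normalizing the attributions before the softmax, and raising the temperature), here is a minimal sketch in plain Python. It is not the code at the linked line. The choice of z-score standardization, the `standardize` helper, the `raw_scores` values, and the `temp` argument standing in for feature_attr_temp are all assumptions made for illustration; the actual normalization in the repository may differ.

```python
import math
import statistics


def softmax(logits, temp=1.0):
    """Numerically stable softmax over a list of floats, with a temperature."""
    scaled = [x / temp for x in logits]
    m = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def standardize(scores, eps=1e-8):
    """Z-score the scores (assumed normalization, chosen for illustration)."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    return [(s - mean) / (std + eps) for s in scores]


# Hypothetical attribution scores with a large spread (invented values).
raw_scores = [120.0, 35.0, 10.0, -40.0]

saturated = softmax(raw_scores)                        # no normalization
normalized = softmax(standardize(raw_scores))          # normalize first
smoother = softmax(standardize(raw_scores), temp=4.0)  # and raise the temperature

for name, probs in [("raw", saturated), ("normalized", normalized), ("temp=4", smoother)]:
    print(name, [round(p, 4) for p in probs])
```

Standardizing bounds the spread of the softmax inputs, so every position keeps a non-negligible probability. Dividing by a larger temperature flattens the distribution further, which is the effect a higher feature_attr_temp is expected to have.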

@samuelstanton samuelstanton self-assigned this Oct 7, 2024
@dezhi0730
Author

Hello, I believe the previous issue has been resolved. I have another question, about the constrain_fn used for optimization: would you mind open-sourcing the part of constrain_fn related to the edit budget? Thank you very much!
