Getting `hidden_states` and `scores` back only required passing the corresponding `output_*` flag to the HF API (see #8). Doing the same for attention with `output_attentions`, however, makes the model hit an error and crash.
I suspect the cause is either (1) DeepSpeed not supporting the output of attention distributions, or (2) HF's "wrapper" around the DeepSpeed APIs not retrieving the attention in the right way.
Further investigation is needed.
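
As a starting point for that investigation, here is a minimal sketch of how the flags could be passed and how the failing case could be isolated. It assumes the crash surfaces as a Python exception (a hard process crash would not be caught this way). `generation_output_flags` and `generate_with_attention_fallback` are hypothetical helper names, not part of HF or DeepSpeed; `generate_fn` stands in for something like `model.generate`. Only the keyword names `return_dict_in_generate`, `output_hidden_states`, `output_scores` and `output_attentions` come from the HF generation API.

```python
from typing import Any, Callable, Dict, Tuple


def generation_output_flags(
    hidden_states: bool = True,
    scores: bool = True,
    attentions: bool = False,
) -> Dict[str, bool]:
    """Build the output_* keyword arguments for an HF-style generate call."""
    return {
        # Needed so the extra outputs come back as named fields.
        "return_dict_in_generate": True,
        "output_hidden_states": hidden_states,
        "output_scores": scores,
        "output_attentions": attentions,
    }


def generate_with_attention_fallback(
    generate_fn: Callable[..., Any], **inputs: Any
) -> Tuple[Any, bool]:
    """Try to request attentions; fall back to the known-good flags on error.

    Returns (output, got_attentions). Hypothetical helper for debugging only.
    """
    try:
        out = generate_fn(**inputs, **generation_output_flags(attentions=True))
        return out, True
    except Exception as err:  # assumption: the failure is a catchable exception
        print(f"output_attentions=True failed: {type(err).__name__}: {err}")
        out = generate_fn(**inputs, **generation_output_flags(attentions=False))
        return out, False
```

With a real model this would be called roughly as `generate_with_attention_fallback(model.generate, **tokenized_inputs)`. Running it once with the DeepSpeed-wrapped model and once with the plain HF model should help tell hypothesis (1) apart from hypothesis (2).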