We have already implemented passing past_key_values to step_generate() to enable caching in the LLM.
generate() also accepts past_key_values, but supporting it needs a different implementation, since generate() truncates input_ids when past_key_values is provided.
We can pass past_key_values to generate(); however, generate() cannot return the updated past_key_values. We could modify the utils in the huggingface package to enable this, but that would be a hacky approach, so we have decided not to implement this feature for now.
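For readers following the thread, here is a toy sketch of the pattern being discussed: a step-wise loop that feeds only the tokens not yet covered by the cache and hands the updated cache back to the caller. It is pure Python and does not use the huggingface package or this project's step_generate(); `toy_forward`, `generate_with_cache`, and the list-based cache are all hypothetical names invented for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for past_key_values: one entry per token already processed.
Cache = List[int]


def toy_forward(input_ids: List[int], past: Optional[Cache] = None) -> Tuple[int, Cache]:
    """Toy forward pass (not a real model): returns a next token and the updated cache."""
    past = list(past or [])
    # Tokens already covered by the cache are skipped; this mirrors the
    # truncation of input_ids described above.
    new_tokens = input_ids[len(past):]
    cache = past + new_tokens
    next_token = sum(cache) % 100  # arbitrary deterministic "prediction"
    return next_token, cache


def generate_with_cache(
    input_ids: List[int], max_new_tokens: int, past: Optional[Cache] = None
) -> Tuple[List[int], Cache]:
    """Greedy toy loop that returns both the generated ids and the updated cache."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        next_token, past = toy_forward(ids, past)
        ids.append(next_token)
    return ids, list(past or [])


# First call: no cache yet.
ids, cache = generate_with_cache([1, 2, 3], max_new_tokens=2)
# Follow-up call: extend the prompt and reuse the returned cache.
ids_cached, _ = generate_with_cache(ids + [5], max_new_tokens=1, past=cache)
# The same follow-up call without a cache, for comparison.
ids_uncached, _ = generate_with_cache(ids + [5], max_new_tokens=1)
```

In this toy, the cached and uncached follow-up calls produce the same ids; the only difference is how many tokens each forward pass has to process. The second return value is the piece that generate() does not expose, according to the comment above.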