Is it possible to load the model in a quantized version with bitsandbytes, I mean int4, int8, or other, and how? Also, when I try an input around 100,000 tokens long, I get this message: `Token indices sequence length is longer than the specified maximum sequence length for this model (798 > 512). Running this sequence through the model will result in indexing errors`
And great work! I share the same philosophy; in particular, I think the T5 architecture is better suited for seq-to-seq tasks than decoder-only. Decoder-only LLMs overgenerate and their loss curves struggle to converge.
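For context, this is roughly the standard bitsandbytes path I had in mind, just a minimal sketch: the checkpoint name and `long_text` below are placeholders, not your actual model, and I don't know whether your code accepts this.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, BitsAndBytesConfig

# Placeholder checkpoint name -- substitute the actual model repo.
model_name = "your-org/your-encoder-decoder-model"

# 8-bit weights via bitsandbytes; switch to BitsAndBytesConfig(load_in_4bit=True)
# to try int4 (NF4) instead.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# The "798 > 512" warning comes from the tokenizer's model_max_length. Truncating
# silences it, but anything beyond what the model's positions support is still cut off.
long_text = "some very long document " * 5000  # stand-in for my real input
inputs = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
```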
@jeaneigsi, thank you! I am also a big believer in encoder-decoder architectures; in one of my projects, a translator between different formats of chemicals, they made a lot of sense.
Regarding your questions: unfortunately, our flash-attention implementation doesn't support int4 or int8.
OK, thanks a lot. I'll stay tuned for upcoming updates, keep building great things!