jinaai/jina-embeddings-v2-base-* not working #11
Comments
@michaelfeil do you maybe have another idea on how to get this sorted?
@TimPietrusky The output you posted is not really descriptive of the problem that occurs. The environment variables that are currently usable are not up to date. Here are all the functions that generate env variables in infinity. They are, however, just generating the defaults, so I think there is nothing to do here. https://github.com/michaelfeil/infinity/blob/main/libs/infinity_emb/infinity_emb/env.py Runpod-infinity currently uses:
@michaelfeil thanks for your quick response. Sorry for the unusable error messages, this is what we get in our UI. Hopefully I can get access to more in-depth logging or find someone who can actually help out here.
Which version do you recommend? Should we try
Yeah, maybe 0.0.53 fixes this? Can you try running infinity from this and 0.0.35 & see if it works?
@michaelfeil awesome, thank you! Will do.
@pandyamarut please let us know when you have had time to update this 🙏 Then I can do the testing.
Thanks to @pandyamarut we have updated the version in the worker
I guess this also doesn't help with finding anything. What could be the next steps here to debug what is going on @pandyamarut? Or maybe you have another idea @michaelfeil?
Talked with @pandyamarut: We will try to debug what is going on here!
I also ran into the same problem. The error seems to be related to the size of the embedding data that the server tries to send back to the client: I hit the endpoint with sentence lists of different lengths, and when I limit the length of the list it works fine. I tested this with the intfloat/multilingual-e5-large and nvidia/NV-Embed-v2 embedding models. The data I used for testing is https://www.kaggle.com/datasets/rtatman/questionanswer-dataset (the question data only).
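If the response size really is the trigger, the workaround described above (limiting the length of the input list) could be sketched as client-side batching. This is only a rough sketch under assumptions: `embed_batch` is a hypothetical callable that sends one list of sentences to the endpoint and returns one embedding per sentence, and the default `batch_size` of 16 is an arbitrary placeholder, not a limit confirmed for this worker.

```python
from typing import Callable, Iterator, List, Sequence

Embedding = List[float]


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def embed_in_batches(
    sentences: Sequence[str],
    embed_batch: Callable[[List[str]], List[Embedding]],
    batch_size: int = 16,  # assumption: arbitrary value, tune it for your setup
) -> List[Embedding]:
    """Send the sentences in small batches and concatenate the results.

    `embed_batch` is a hypothetical stand-in for whatever client call you
    use to reach the worker; it is not part of infinity or the RunPod SDK.
    """
    embeddings: List[Embedding] = []
    for batch in chunked(sentences, batch_size):
        result = embed_batch(batch)
        # Guard against silently dropped outputs, which is the symptom
        # reported in this issue (job completes, no embeddings returned).
        if len(result) != len(batch):
            raise RuntimeError(
                f"expected {len(batch)} embeddings, got {len(result)}"
            )
        embeddings.extend(result)
    return embeddings
```

Keeping the transport behind a callable means the same loop can be reused with either the native or the OpenAI-compatible API, and the length check makes a missing output fail loudly instead of returning a short list.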
When using the worker with the image runpod/worker-infinity-embedding:stable-cuda12.1.0, with this env var MODEL_NAMES: jinaai/jina-embeddings-v2-base-de, we see this error:
According to michaelfeil/infinity#115 (comment) we should be able to solve this by setting these env variables:
INFINITY_DISABLE_OPTIMUM: TRUE
INFINITY_DISABLE_COMPILE: TRUE
But this is not working, we still see an error:
Request
Output
So it looks like everything is completed, but there is no expected output (the embeddings).
OpenAI-compatible API
The behavior is the same when using the OpenAI-compatible API: It doesn't work, just provides the same output as above.