Describe the bug

Running Serverless HF Inference as instructed in the documentation fails:

```
File "/home/tobias/src/tamingLLMs/tamingllms/.venv/lib/python3.12/site-packages/lighteval/models/tgi_model.py", line 70, in __init__
    model_precision = self.model_info["model_dtype"]
                      ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'model_dtype'
```
Next, if you manually set `model_precision = "float16"` in lighteval/models/tgi_model.py, line 70, you get a second error:
```
ValueError: batch_size should be a positive integer value, but got batch_size=-1
```
which is then resolved if you pass `--override_batch_size 1` to the lighteval command.
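For context, the manual edit described above amounts to replacing the hard key lookup with a defaulted one. A minimal sketch of that local patch, assuming `self.model_info` is the metadata dict returned by the serverless endpoint (the `"float16"` default is an arbitrary choice, not lighteval's actual fix):

```python
# Local workaround sketch for lighteval/models/tgi_model.py, line 70.
# Assumption: self.model_info is the info dict returned by the serverless
# Inference API, which here does not contain a "model_dtype" entry.
model_precision = self.model_info.get("model_dtype", "float16")
```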
To Reproduce

```
lighteval accelerate --model_config_path="endpoint_model.yaml" --tasks "leaderboard|mmlu:econometrics|0|0" --output_dir="./evals/"
```
endpoint_model.yaml:

```yaml
model:
  type: "tgi"
  instance:
    inference_server_address: "https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Math-1.5B-Instruct"
    inference_server_auth: "<API-KEY>"
    model_id: null
```
Expected behavior

Run benchmark using Serverless HF Inference.
Version info

```
python = "^3.11"
lighteval = {extras = ["accelerate"], version = "^0.6.2"}
```
Other Suggestions

The user keeps getting "model is currently loading". One recommendation would be to leverage the `wait_for_model` param to avoid breaking the command-line call if the model is still loading. See the sketch below.
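For reference, the serverless Inference API accepts a `wait_for_model` flag in the `options` field of the request payload. A minimal sketch of a call that waits for the model instead of failing (the prompt is illustrative; the endpoint and API key placeholder are taken from the config above):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/Qwen/Qwen2.5-Math-1.5B-Instruct"
headers = {"Authorization": "Bearer <API-KEY>"}

# wait_for_model=True asks the serverless API to hold the request until the
# model is loaded, rather than returning a "model is currently loading" error.
payload = {
    "inputs": "What is 2 + 2?",
    "options": {"wait_for_model": True},
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```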
Hi! Can you try with the code of #445?