You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}. This is just a warning. We would not fail because of this.
Do you have any other error messages in the log?
Description
Unable to use open-ai endpoint, getting the error below.
Error Message
PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}.
How to Reproduce?
Using Docker.
"image": "deepjavalibrary/djl-serving:0.29.0-pytorch-inf2"
"envVars": "AWS_NEURON_VISIBLE_DEVICES=ALL
OPTION_TENSOR_PARALLEL_DEGREE=max
HF_HOME=/tmp/.cache/huggingface
OPTION_MODEL_ID=meta-llama/Meta-Llama-3.1-8B-Instruct
OPTION_ENTRYPOINT=djl_python.transformers_neuronx
OPTION_TRUST_REMOTE_CODE=true
SERVING_LOAD_MODELS=test::Python=/opt/ml/model
OPTION_ROLLING_BATCH=auto
OPTION_ENABLE_CHUNKED_PREFILL=true
OPTION_MAX_ROLLING_BATCH_SIZE=32
OPTION_N_POSITIONS=8192
OPTION_MAX_BATCH_DELAY=500
DJL_CACHE_DIR=/tmp/.cache/ ",
The text was updated successfully, but these errors were encountered: