
docker 0.29.0-pytorch-inf2 with meta-llama/Meta-Llama-3.1-8B-Instruct fails #2385

Open
yaronr opened this issue Sep 13, 2024 · 1 comment
Labels
bug Something isn't working

Comments


yaronr commented Sep 13, 2024

Description

Unable to use the OpenAI-compatible endpoint; getting the error below.

Error Message

PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}.

How to Reproduce?

Using Docker.
"image": "deepjavalibrary/djl-serving:0.29.0-pytorch-inf2"
"envVars": "AWS_NEURON_VISIBLE_DEVICES=ALL
OPTION_TENSOR_PARALLEL_DEGREE=max
HF_HOME=/tmp/.cache/huggingface
OPTION_MODEL_ID=meta-llama/Meta-Llama-3.1-8B-Instruct
OPTION_ENTRYPOINT=djl_python.transformers_neuronx
OPTION_TRUST_REMOTE_CODE=true
SERVING_LOAD_MODELS=test::Python=/opt/ml/model
OPTION_ROLLING_BATCH=auto
OPTION_ENABLE_CHUNKED_PREFILL=true
OPTION_MAX_ROLLING_BATCH_SIZE=32
OPTION_N_POSITIONS=8192
OPTION_MAX_BATCH_DELAY=500
DJL_CACHE_DIR=/tmp/.cache/ ",

yaronr added the bug label on Sep 13, 2024
@sindhuvahinis (Contributor) commented:

"PyProcess W-100-model-stdout: The following parameters are not supported by neuron with rolling batch: {'frequency_penalty'}" is just a warning; we would not fail because of this.
Do you have any other error messages in the log?
