This covers configurations that are used globally and as part of startup for DJL Serving.
Use the following command-line options when starting djl-serving. They override the default behavior:
djl-serving -h
usage: djl-serving [OPTIONS]
-f,--config-file <CONFIG-FILE> Path to the configuration properties file.
-h,--help Print this help.
-m,--models <MODELS> Models to be loaded at startup.
-s,--model-store <MODELS-STORE> Model store location where models can be loaded.
Details about the models, model-store, and workflows can be found in the equivalent configuration properties.
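As a rough illustration of how these options combine, the sketch below starts the server with a model store and one model URL. The values build/models and djl://ai.djl.zoo/mlp are the examples used on this page, and the variable names are hypothetical. It assumes the -s and -m options accept the same values as the equivalent configuration properties.

```shell
#!/bin/sh
# Placeholder values; adjust to your installation.
MODEL_STORE=build/models
MODEL_URL=djl://ai.djl.zoo/mlp

# Only start the server if the launcher is actually on PATH.
if command -v djl-serving >/dev/null 2>&1; then
    djl-serving -s "$MODEL_STORE" -m "$MODEL_URL"
else
    echo "djl-serving not found on PATH" >&2
fi
```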
DJL Serving uses a config.properties file to store configurations.
DJL Serving only allows localhost access by default.
- inference_address: inference API binding address, default: http://127.0.0.1:8080
- management_address: management API binding address, default: http://127.0.0.1:8080
Here are a couple of examples:
# bind inference API to all network interfaces with SSL enabled
inference_address=https://0.0.0.0:8443
# bind inference API to private network interfaces
inference_address=https://172.16.1.10:8443
Model Store
The model_store config property defines a directory in which each file or folder is a model to be loaded.
By default, DJL Serving attempts to load all of them.
Here is an example:
model_store=build/models
Load Models
The load_models config property defines a list of models (or workflows) to be loaded.
Specify it as a comma-separated list of URLs to load models from.
Each model can be given either as a bare URL or with endpoint data prepended, in the form [EndpointData]=modelUrl.
The endpoint data is a list of data items separated by colons.
The possible variations are:
[modelName]
[modelName:version]
[modelName:version:engine]
[modelName:version:engine:deviceNames]
The version can be an arbitrary string.
The engine uses the standard DJL Engine names.
Possible deviceNames strings include * for all devices and a ;-separated list of device names following the format defined in DJL Device.fromName.
If no device is specified, the DJL default device is used (usually GPU if available, otherwise CPU).
load_models=djl://ai.djl.zoo/mlp,[mlp:v1:PyTorch:*]=https://resources.djl.ai/test-models/mlp.zip
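To make the endpoint format concrete, here is a minimal sketch that splits a single load_models entry into its parts. parse_model_entry is a hypothetical helper written for illustration and is not part of DJL Serving. It assumes the endpoint fields are colon-separated, as in the variations listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DJL Serving): split one load_models entry of
# the form [modelName:version:engine:deviceNames]=modelUrl into its fields.
# POSIX sh has no local variables, so the names below are global.
parse_model_entry() {
    entry=$1
    name=
    version=
    engine=
    devices=
    url=$entry                          # a bare URL has no endpoint data
    case $entry in
        \[*\]=*)
            endpoint=${entry%%\]=*}     # text before the first "]="
            endpoint=${endpoint#\[}     # drop the leading "["
            url=${entry#*\]=}           # text after the first "]="
            _pme_ifs=$IFS
            set -f                      # keep "*" in deviceNames from globbing
            IFS=:
            set -- $endpoint            # split the endpoint data on ":"
            IFS=$_pme_ifs
            set +f
            name=${1-}
            version=${2-}
            engine=${3-}
            devices=${4-}
            ;;
    esac
    printf 'name=%s version=%s engine=%s devices=%s url=%s\n' \
        "$name" "$version" "$engine" "$devices" "$url"
}

parse_model_entry 'djl://ai.djl.zoo/mlp'
parse_model_entry '[mlp:v1:PyTorch:*]=https://resources.djl.ai/test-models/mlp.zip'
```

Fields that are omitted from the endpoint data come back empty, which matches the shorter variations such as [modelName] and [modelName:version].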
Workflows
Use the load_models config property to define initial workflows that should be loaded on startup.
load_models=https://resources.djl.ai/test-models/basic-serving-workflow.json
View the workflow documentation to see more information about workflows and their configuration format.
To enable HTTPS, change the protocol of inference_address or management_address from http to https, for example: inference_address=https://127.0.0.1.
This makes DJL Serving listen on localhost port 443 and accept HTTPS requests.
You must also provide a certificate and private key to enable SSL. DJL Serving supports two ways to configure SSL:
- Use keystore
  - keystore: keystore file location. If the keystore contains multiple private key entries, the first one is picked.
  - keystore_pass: keystore password. The key password (if applicable) MUST be the same as the keystore password.
  - keystore_type: type of keystore, default: PKCS12
- Use private-key/certificate files
  - private_key_file: private key file location. Both PKCS8 and OpenSSL private keys are supported.
  - certificate_file: X509 certificate chain file location.
This is a quick example that enables SSL with a self-signed certificate:
keytool -genkey -keyalg RSA -alias djl -keystore keystore.p12 -storepass changeit -storetype PKCS12 -validity 3600 -keysize 2048 -dname "CN=www.MY_DOMSON.com, OU=Cloud Service, O=model server, L=Palo Alto, ST=California, C=US"
Configure the following properties in config.properties:
inference_address=https://127.0.0.1:8443
management_address=https://127.0.0.1:8444
keystore=keystore.p12
keystore_pass=changeit
keystore_type=PKCS12
# generate a private key with the correct length
openssl genrsa -out private-key.pem 2048
# generate corresponding public key
openssl rsa -in private-key.pem -pubout -out public-key.pem
# create a self-signed certificate
openssl req -new -x509 -key private-key.pem -out cert.pem -days 360
# convert pem to pfx/p12 keystore
openssl pkcs12 -export -inkey private-key.pem -in cert.pem -out keystore.p12
Configure the following properties in config.properties:
inference_address=https://127.0.0.1:8443
management_address=https://127.0.0.1:8444
keystore=keystore.p12
keystore_pass=changeit
keystore_type=PKCS12
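The private-key/certificate option can also be used without converting to a keystore. As a minimal sketch, assuming the private-key.pem and cert.pem files produced by the openssl commands above sit in the working directory, the following writes a properties file that points DJL Serving at them through private_key_file and certificate_file. The file name ssl-example.properties and the SSL_CONFIG_FILE variable are placeholders chosen for illustration.

```shell
#!/bin/sh
# Placeholder output path; pass the resulting file to djl-serving with -f.
SSL_CONFIG_FILE=${SSL_CONFIG_FILE:-ssl-example.properties}

# Write an HTTPS configuration that references the PEM files directly.
cat > "$SSL_CONFIG_FILE" <<'EOF'
inference_address=https://127.0.0.1:8443
management_address=https://127.0.0.1:8444
private_key_file=private-key.pem
certificate_file=cert.pem
EOF
```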
You can set environment variables to change DJL Serving behavior. The following system environment variables can be set for DJL Serving:
Key | Type | Description |
---|---|---|
JAVA_HOME | env var | JDK home path |
MODEL_SERVER_HOME | env var | DJLServing home directory, default: Installation directory (e.g. /usr/local/Cellar/djl-serving//) |
DEFAULT_JVM_OPTS | env var | Override default JVM startup options and system properties. Default: -Dlog4j.configurationFile=${APP_HOME}/conf/log4j2.xml |
JAVA_OPTS | env var | Add extra JVM options. Default: -Xms1g -Xmx1g -XX:+ExitOnOutOfMemoryError |
SERVING_OPTS | env var | Add serving related JVM options. Default: N/A. Some DJL configuration can only be set through JVM system properties; set the DEFAULT_JVM_OPTS environment variable to configure them. Examples: -Dai.djl.pytorch.num_interop_threads=2 overrides interop threads for PyTorch; -Dai.djl.pytorch.num_threads=2 overrides OMP_NUM_THREADS for PyTorch; -Dai.djl.logging.level=debug changes the DJL logging level. |
Note: The system environment variables above take priority over the default values in the container.
Global settings are configured at the model server level. Changes to these settings usually require a model server restart to take effect.
Most model server specific configuration can be set in the conf/config.properties file.
You can find the configuration keys here:
ConfigManager.java
Each configuration key can also be overridden by an environment variable with the SERVING_ prefix, for example:
export SERVING_JOB_QUEUE_SIZE=1000 # This will override JOB_QUEUE_SIZE in the config
As with system environment variables, global environment variables take priority over config.properties.
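The naming rule can be sketched as a small helper that derives the override variable for a given configuration key. serving_env_name is a hypothetical function written for illustration. It assumes the variable name is simply the upper-cased key with the SERVING_ prefix, as the SERVING_JOB_QUEUE_SIZE example suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of DJL Serving): map a config.properties key
# to the environment variable assumed to override it.
serving_env_name() {
    printf 'SERVING_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

serving_env_name job_queue_size    # prints SERVING_JOB_QUEUE_SIZE

# Export the override so a server started from this shell would see it.
export "$(serving_env_name job_queue_size)=1000"
```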
There are two types of model settings: options and arguments. options are used at model loading time; arguments are used by the Translator for pre/post processing.
You can provide extra settings with environment variables, prefixed with OPTION_ for options and ARGS_ for arguments. See: model configurations for more details.
Note: per-model environment variables will NOT override values in serving.properties.
export SERVING_OPTS="-Dai.djl.logging.level=debug"
export DEFAULT_JVM_OPTS="-Dlog4j.configurationFile=/MY_CONF/log4j2.xml"
DJLServing provides a few built-in log4j2-XXX.xml files in DJLServing containers.
Use the following environment variable to print the HTTP access log to the console:
export DEFAULT_JVM_OPTS="-Dlog4j.configurationFile=/usr/local/djl-serving-0.28.0/conf/log4j2-access.xml"
Use the following environment variable to print the access log, server metrics, and model metrics to the console:
export DEFAULT_JVM_OPTS="-Dlog4j.configurationFile=/usr/local/djl-serving-0.28.0/conf/log4j2-console.xml"