high latency #2721
Comments
Could you share your Kubernetes deployment configuration? I can see you are running with DEBUG log level - can you share logs from the server in your deployment with a few predictions logged?
logs:
Okay, from the logs it looks like prediction takes ~1.4 s. Are you sure there are no resource constraints on your deployment?
I have been using this OpenVINO server for the last two years without any issues. However, we started encountering this problem in the most recent deployments, even though we are using the same resources as in previous deployments. This is the resource column for the deployment:
Your standalone server does not have CPU limitations, but your Kubernetes deployment does. Please try removing CPU from limits and requests in the resources field and check if performance gets closer to the standalone instance.
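A minimal sketch of what that change could look like, assuming a plain OVMS Deployment (the deployment name, image tag, model name, and memory values below are hypothetical, not taken from this issue):

```yaml
# Hypothetical OVMS Deployment with the cpu entries removed from resources.
# Memory requests/limits are kept to protect the node; with no cpu limit,
# the kubelet applies no CFS quota, so inference threads are not throttled.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ovms
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ovms
  template:
    metadata:
      labels:
        app: ovms
    spec:
      containers:
        - name: ovms
          image: openvino/model_server:latest
          args: ["--model_name", "my_model", "--model_path", "/models/my_model", "--port", "9000"]
          resources:
            requests:
              memory: "2Gi"
            limits:
              memory: "4Gi"   # note: no cpu under requests or limits
```

The reasoning behind the suggestion: a cpu limit becomes a CFS quota, and once OpenVINO's worker threads exhaust that quota within a scheduling period the whole container is throttled until the next period, which can show up as multi-second prediction latency.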
Sorry for the delayed response. We have tried it, but it didn't work. Now this issue is occurring in every deployment. We are running inference containers outside the cluster. Could you help us understand what might be causing this? It seems like there could be issues in other parts of the platform as well, right? However, from the logs, it's clear that the server itself is taking more than 2 seconds per inference.
So you observe the same issue outside Kubernetes?
From the logs I would say that only a fraction of the CPU is used, as if some limitation were set. I don't know the details of your platform, but you can try running it in isolation and sending requests manually to make sure it behaves the same.
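One way to run that isolated check (a sketch, not taken from the thread) is to start the same container directly and time single requests against the TensorFlow-Serving-style REST endpoint that OVMS exposes. The port, model name, and input shape below are assumptions based on the 640-pixel model mentioned in this issue:

```python
# Minimal latency probe for an OVMS REST endpoint.
# Assumptions: server reachable on localhost:8000, a model named "my_model",
# and a single NCHW float32 input of shape 1x3x640x640.
import time

import numpy as np
import requests

URL = "http://localhost:8000/v1/models/my_model:predict"  # TFS-compatible API
payload = {"instances": np.random.rand(1, 3, 640, 640).astype("float32").tolist()}

requests.post(URL, json=payload).raise_for_status()  # warm-up request

for _ in range(5):
    start = time.perf_counter()
    resp = requests.post(URL, json=payload)
    resp.raise_for_status()
    print(f"latency: {time.perf_counter() - start:.3f}s")
```

If the same container shows sub-10 ms latency when run this way but takes seconds inside the cluster, the slowdown is coming from the pod's resource configuration rather than from the server itself.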
Outside of Kubernetes, it is working fine. The previous deployments done on Kubernetes are facing the issue. I will try removing all limitations inside the cluster.
Latency in OpenVINO Model Server Inside Kubernetes Cluster
To Reproduce
Steps to reproduce the behavior:
Environment
I am using OpenVINO Model Server for model deployment on edge devices with a 13th Gen Intel Core i7 processor. My PyTorch model is trained with an image size of 640.
Expected Behavior
I expected the inference time for the OpenVINO Model Server inside the Kubernetes cluster to be comparable to that observed outside the cluster (i.e., <0.01 seconds).
model directory format
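For reference, OVMS expects a model directory containing numbered version subdirectories with the OpenVINO IR files; the model name below is a placeholder:

```
models/
└── my_model/          # --model_path points here
    └── 1/             # numeric version directory
        ├── model.xml  # IR topology
        └── model.bin  # IR weights
```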