awscurl: Missing token metrics when -t option specified #2340
Comments
You can use the "-j" parameter to define a JSON query for the Triton server output; see this test code: https://github.com/deepjavalibrary/djl-serving/blob/master/awscurl/src/test/java/ai/djl/awscurl/AwsCurlTest.java#L455-L456
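For illustration, a run combining token metrics with a JSON query might look like the following sketch (the placeholders and the exact JSONPath expression are illustrative, assuming -j accepts a JSONPath-style query as the linked test suggests):

TOKENIZER=<path_to_tokenizer> ./awscurl -c 1 -N 10 -X POST -n sagemaker <triton_endpoint> \
  --dataset <path_to_dataset> -H 'Content-Type: application/json' -P -t \
  -j "$.outputs[?(@.name=='generated_text')].data"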
Hello Frank, thank you for the valuable suggestion. Should a JSON query like "$.outputs[?(@.name=='generated_text')].data[0]": "$.response" work? Note that data is a JSON string, and I was wondering whether the execution code can process it accordingly by extracting the content of the response field (so that the tokens associated with the key are not counted). All the best!
I noticed that tokenThroughput = (totalTokens * 1000000000d / totalTime * clients). Could you please explain the meaning behind the 1000000000d constant and what tokenThroughput ends up measuring?
The totalTime is in nanoseconds; the constant converts it to seconds.
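As a minimal worked example of that conversion (the numbers below are hypothetical; only the formula comes from the code quoted above):

// totalTime is in nanoseconds, so the 1e9 factor converts
// tokens per nanosecond into tokens per second.
long totalTokens = 2000;         // hypothetical token count across all requests
long totalTime = 4_000_000_000L; // hypothetical: 4 seconds, in nanoseconds
int clients = 2;                 // hypothetical number of concurrent clients
double tokenThroughput = totalTokens * 1000000000d / totalTime * clients;
// 2000 / 4 * 2 = 1000, i.e. tokens per second aggregated across all clients

So tokenThroughput ends up measuring aggregate tokens per second across all concurrent clients.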
I don't think we can parse the nested JSON string.
Thanks for the reply. I also opened another awscurl bug ticket related to the tokenizer behavior. I would be very grateful if you could have a look. All the best!
Hello Frank, token metrics are no longer computed when specifying a JSON query.
Description
When requesting token metrics from an endpoint running an LMI container with a vLLM engine, non-zero values are returned for tokenThroughput, totalTokens, and tokenPerRequest (as expected).
When requesting token metrics from an endpoint running a Triton Inference Server, zero values are returned for tokenThroughput, totalTokens, and tokenPerRequest (unexpected). The Triton endpoint was tested successfully to verify that it responds to both individual and concurrent requests (it produces the expected output for the given inputs).
One difference between the two setups is the schema of the input requests and the output response. Specifically, the Triton endpoint operates with a different input schema and produces output structured differently from the LMI endpoint. Do you suspect this might be why the token metrics are not computed?
Expected Behavior
Return token metrics when the -t option is specified
Error Message
Zero-valued token metrics
How to Reproduce?
TOKENIZER=<path_to_tokenizer> ./awscurl -c 1 -N 10 -X POST -n sagemaker <triton_endpoint> --dataset <path_to_dataset> -H 'Content-Type: application/json' -P --connect-timeout 60
Triton input schema:
{"inputs": [
{"name": str,
"shape": [int],
"datatype": str,
"data: [str]}
]
}
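For illustration, a concrete request following this schema might look like this (all values are hypothetical):

{
  "inputs": [
    {
      "name": "text_input",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["What is Deep Java Library?"]
    }
  ]
}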
Triton output schema:
{
  "model_name": str,
  "model_version": str,
  "outputs": [
    {
      "name": str,
      "shape": [int],
      "datatype": str,
      "data": [str]
    }
  ]
}
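For illustration, a response following this schema might look like this (all values are hypothetical); note how data carries a serialized JSON string, which is presumably why the token counter cannot reach the nested response field:

{
  "model_name": "ensemble",
  "model_version": "1",
  "outputs": [
    {
      "name": "generated_text",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["{\"response\": \"Deep Java Library is ...\"}"]
    }
  ]
}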