Detectors with whole doc chunker and other chunker not returning results #271

evaline-ju · 2024-12-31T17:10:01Z

Describe the bug

When a streaming generation-with-classification request uses a combination of detectors, at least one of which depends on the whole document chunker (whole_doc_chunker) and another of which depends on a different chunker (e.g. a sentence tokenizer), the request currently returns an empty result.

With the current aggregation strategy, we would instead expect the result not to appear "streaming" and to be returned like a unary result, since the whole document chunker has to process and return the entire text.

Update: This may affect any combination of detectors that use different types of chunkers.
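
To make the expectation concrete, here is a minimal sketch of one possible aggregation rule. It is not the orchestrator's actual implementation. The `Chunk` type, the `aggregate_ready_spans` function, and the rule itself (a span is emitted only once every detector has a chunk boundary at its end offset) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A span of generated text produced by one chunker (hypothetical type)."""
    start: int
    end: int


def aggregate_ready_spans(chunks_by_detector: dict[str, list[Chunk]]) -> list[Chunk]:
    """Return the spans on which every detector has reported so far.

    Assumed rule: a span can be emitted only when each detector has a
    chunk ending at that span's end offset. Text is assumed to start at 0.
    """
    if not chunks_by_detector:
        return []
    # End offsets that are chunk boundaries for every detector.
    common_ends = set.intersection(
        *({chunk.end for chunk in chunks} for chunks in chunks_by_detector.values())
    )
    spans = []
    start = 0
    for end in sorted(common_ends):
        spans.append(Chunk(start, end))
        start = end
    return spans


# Sentence chunker yields two chunks; whole doc chunker yields one at the end.
sentence_chunks = [Chunk(0, 12), Chunk(12, 30)]
whole_doc_chunks = [Chunk(0, 30)]

# Mid-stream: the whole doc chunker has produced nothing yet, so nothing is ready.
mid_stream = aggregate_ready_spans(
    {"detector_1": sentence_chunks[:1], "detector_2": []}
)

# End of stream: the only shared boundary is the end of the text.
end_of_stream = aggregate_ready_spans(
    {"detector_1": sentence_chunks, "detector_2": whole_doc_chunks}
)
```

Under this assumed rule, nothing is emitted while the stream is in progress and a single whole-document span is emitted at the end, which is the "unary-looking" result described above rather than an empty one.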

Sample Code

Config with two detectors example

detectors:
    detector_1:
        type: text_contents
        service:
            hostname: detector1.com
            port: 443
        chunker_id: sentence_chunker
        default_threshold: 0.5
    detector_2:
        type: text_contents
        service:
            hostname: detector2.com
            port: 443
        chunker_id: whole_doc_chunker

Call example

curl -v -X 'POST' \
  'http://localhost:8033/api/v1/task/server-streaming-classification-with-text-generation' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model_id": "my-llm",
  "inputs": "Tell me a story about a dog",
  "guardrail_config": {
    "output": {
      "masks": [],
      "models": {"detector_1":{}, "detector_2": {}}
    },
    "input": {
      "models": {}
    }
  },
  "text_gen_parameters": {
    "max_new_tokens": 99,
    "min_new_tokens": 2
  }
}'
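
For anyone reproducing this from a script, the same request can be built in Python. This is a sketch only: the URL, headers, and payload are copied from the call example, while the `build_request` helper is a hypothetical name introduced here. Serializing with `json.dumps` guarantees the body is valid JSON.

```python
import json
import urllib.request

# Endpoint taken from the call example.
URL = (
    "http://localhost:8033/api/v1/task/"
    "server-streaming-classification-with-text-generation"
)


def build_request(detector_ids: list[str]) -> urllib.request.Request:
    """Build (but do not send) the streaming request for the given detectors.

    Hypothetical helper; the payload mirrors the call example.
    """
    payload = {
        "model_id": "my-llm",
        "inputs": "Tell me a story about a dog",
        "guardrail_config": {
            "output": {
                "masks": [],
                "models": {detector_id: {} for detector_id in detector_ids},
            },
            "input": {"models": {}},
        },
        "text_gen_parameters": {"max_new_tokens": 99, "min_new_tokens": 2},
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(["detector_1", "detector_2"])
```

Passing `request` to `urllib.request.urlopen` would send it. The shape of the streamed response is not shown in this issue, so parsing it is left out.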

Expected behavior

Results that look "unary", i.e. whole-document results, are returned.

Observed behavior

No results

evaline-ju added the bug (Something isn't working) label Dec 31, 2024

mdevino mentioned this issue Jan 2, 2025

Stream content endpoint #272

Merged
