MultimodalAgent interrupts itself when response is created from function result #1178

davidzhao · 2024-12-05T09:10:07Z

During the function call, the model could return audio letting the user know that it's performing an operation.

While that speech is being played back, the function call may finish and return its result. We then create an additional response, which interrupts the speech that is still playing.

For example:

Let's say I have a tool called turn_on_the_light.
The agent will say "I'll turn on the {cut-in} the light is now on".

This happens because of the response.create() line:

    if called_fnc.result is not None:
        self.conversation.item.create(tool_call, item_id)
        self.response.create()

We should wait until the current speech handle is finished before queuing additional speech.
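
A minimal sketch of that idea, assuming the agent tracks the currently playing speech in some awaitable handle. The `SpeechHandle` class, its `wait_done()` method, and `create_response_after_speech()` are hypothetical names used for illustration only; they are not the library's actual API, and the real fix may look different.

```python
import asyncio


class SpeechHandle:
    """Hypothetical stand-in for a playback handle (not the library's class)."""

    def __init__(self):
        self._done = asyncio.Event()

    def mark_done(self):
        # Called by the playback code once the audio has finished playing.
        self._done.set()

    async def wait_done(self):
        await self._done.wait()


async def create_response_after_speech(current_speech, create_response):
    # Hypothetical helper: wait for in-flight speech to finish so the
    # new response does not cut it off, then request the response.
    if current_speech is not None:
        await current_speech.wait_done()
    create_response()


async def demo():
    events = []
    handle = SpeechHandle()

    async def play():
        # Simulates the "I'll turn on the light" audio being played back.
        events.append("speech started")
        await asyncio.sleep(0.01)
        events.append("speech finished")
        handle.mark_done()

    playback = asyncio.create_task(play())
    # The function result arrives while the speech is still playing.
    await create_response_after_speech(
        handle, lambda: events.append("response created")
    )
    await playback
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the response is only requested after the handle reports completion, so the "performing the operation" speech is never cut off by the follow-up response.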


davidzhao added the bug label Dec 5, 2024
