Bug: TTS interruption causes discrepancy between spoken audio and displayed chat text #2607

@hsjun99

Describe the bug
When a user interrupts the agent's Text-to-Speech (TTS) stream, the final text that is displayed and added to the chat context does not match the audio that was actually spoken. The displayed text consistently includes one or two more words than the user heard before the interruption was triggered.

This pollutes the chat context with text that the user never received audibly, which can confuse the LLM in subsequent turns.

Steps to Reproduce

  1. Initiate a conversation with an agent using a streaming TTS provider.
  2. Receive a response from the agent that is long enough to be interruptible.
  3. While the agent's TTS is playing, interrupt it by starting to speak.
  4. Observe the final text message from the agent as it is reflected in the chat history.
  5. Compare the displayed text to the actual words spoken by the TTS audio before it was cut off.

Expected Behavior
The text added to the chat context should precisely match the words that were audibly synthesized and played by the TTS engine before the interruption occurred.
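
One way to picture the expected behavior, as a minimal sketch and not a description of how livekit-agents works internally: assume each synthesized word carries a timestamp for when it finishes playing, and keep only the words whose end time falls within the audio that was played before the interruption. `TimedWord`, `spoken_text`, and the timings below are hypothetical and do not come from the library or from either TTS provider.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """A word plus the playback time (in seconds) at which it finishes.

    Hypothetical structure for illustration only.
    """
    text: str
    end_time: float


def spoken_text(words: list[TimedWord], played_seconds: float) -> str:
    """Return only the words that finished playing before the interruption."""
    heard = [w.text for w in words if w.end_time <= played_seconds]
    return " ".join(heard)


# Made-up timings for the example sentence used in this report.
words = [
    TimedWord("The", 0.3), TimedWord("weather", 0.7), TimedWord("today", 1.1),
    TimedWord("is", 1.3), TimedWord("sunny", 1.8), TimedWord("with", 2.1),
    TimedWord("a", 2.2), TimedWord("high", 2.6), TimedWord("of", 2.8),
    TimedWord("25", 3.2), TimedWord("degrees", 3.8),
]

# Interruption after 1.9 seconds of audio: "sunny" has finished, "with" has not.
print(spoken_text(words, played_seconds=1.9))
```

Under these assumptions, the text saved to the chat context would end at "sunny", matching what the user heard.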

Actual Behavior
The final text in the chat context consistently contains more words than were actually spoken. For example, if the agent intended to say "The weather today is sunny with a high of 25 degrees" and was interrupted after speaking the word "sunny", the chat context might incorrectly save the text as "The weather today is sunny with a...".
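
To quantify the over-run when reproducing this, a small helper along the following lines could compare what was heard with what was saved. This is only a sketch: `extra_words` is a hypothetical name, and it assumes the saved text starts with the heard text and may end with a trailing "..." as in the example above.

```python
def extra_words(heard: str, saved: str) -> list[str]:
    """Return the words present in the saved text but never heard.

    Assumes `saved` begins with the same words as `heard`; trailing
    periods (such as a truncation marker "...") are ignored.
    """
    heard_words = heard.split()
    saved_words = saved.rstrip(".").split()
    if saved_words[: len(heard_words)] != heard_words:
        raise ValueError("saved text does not start with the heard text")
    return saved_words[len(heard_words):]


# Example from this report: two words were saved that the user never heard.
print(extra_words(
    "The weather today is sunny",
    "The weather today is sunny with a...",
))
```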

Environment

  • Library Version: livekit-agents v1.1.1
  • TTS Providers: This issue has been confirmed to happen with both ElevenLabs and Cartesia, indicating the problem is likely in the core TTS interruption handling logic and not specific to one provider.

Labels: bug
