LLM gives no answer #2928

@sarmientoF

Description

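Sometimes, after the user finishes speaking, the LLM node returns nothing, and the STT node then waits for the user to speak again. This happens with the basic agent example. In the log below, the LLM metrics line for the failed turn reports ttft=-1.00 and zero input and output tokens.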

2025-07-17 15:21:43,188 - INFO livekit.agents - STT metrics: audio_duration=5.00 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:48,203 - INFO livekit.agents - STT metrics: audio_duration=5.05 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:50,019 - DEBUG livekit.agents - received user transcript {"room": "playground-Nqt8-zYVy", "user_transcript": "¿Me puedes decir el clima de Japón?", "language": "es", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:50,198 - DEBUG livekit.plugins.turn_detector - eou prediction {"room": "playground-Nqt8-zYVy", "eou_probability": 0.8865267038345337, "input": "<|im_start|>assistant\nHey there! What can I help you with today?<|im_end|>\n<|im_start|>user\nHola, ¿me puedes escuchar<|im_end|>\n<|im_start|>assistant\n¡Hola! Sí, te escucho. ¿En qué puedo ayudarte hoy?<|im_end|>\n<|im_start|>user\n¿Me puedes decir el clima de Japón?", "duration": 0.088, "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:50,198 - INFO livekit.agents - EOU metrics: end_of_utterance_delay=0.67, transcription_delay=0.49 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:50,875 - INFO livekit.agents - LLM metrics: ttft=-1.00, input_tokens=0,  cached_input_tokens=0, output_tokens=0, tokens_per_second=0.00 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:53,225 - INFO livekit.agents - STT metrics: audio_duration=5.00 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:21:58,283 - INFO livekit.agents - STT metrics: audio_duration=5.05 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:22:03,305 - INFO livekit.agents - STT metrics: audio_duration=5.05 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:22:08,333 - INFO livekit.agents - STT metrics: audio_duration=5.00 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:22:13,354 - INFO livekit.agents - STT metrics: audio_duration=5.05 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
2025-07-17 15:23:48,929 - INFO livekit.agents - STT metrics: audio_duration=5.00 {"room": "playground-Nqt8-zYVy", "pid": 185, "job_id": "simulated-job-c4aa6d196dc2"}
import logging

from dotenv import load_dotenv
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    JobProcess,
    RoomInputOptions,
    RoomOutputOptions,
    RunContext,
    WorkerOptions,
    cli,
    metrics,
)
from livekit.agents.llm import function_tool
from livekit.agents.voice import MetricsCollectedEvent

# noise_cancellation provides Krisp background voice/noise cancellation
from livekit.plugins import deepgram, noise_cancellation, openai, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("basic-agent")

load_dotenv()


class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            # trailing spaces keep the concatenated sentences from running together
            instructions="Your name is Kelly. You would interact with users via voice. "
            "With that in mind, keep your responses concise and to the point. "
            "You are curious and friendly, and have a sense of humor.",
        )

    async def on_enter(self):
        # when the agent is added to the session, it'll generate a reply
        # according to its instructions
        self.session.generate_reply()

    # all functions annotated with @function_tool will be passed to the LLM when this
    # agent is active
    @function_tool
    async def lookup_weather(self, context: RunContext, location: str, latitude: str, longitude: str):
        """Called when the user asks for weather related information.
        Ensure the user's location (city or region) is provided.
        When given a location, please estimate the latitude and longitude of the location and
        do not ask the user for them.

        Args:
            location: The location they are asking for
            latitude: The latitude of the location, do not ask user for it
            longitude: The longitude of the location, do not ask user for it
        """

        logger.info(f"Looking up weather for {location}")

        return "sunny with a temperature of 70 degrees."


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    # each log entry will include these fields
    ctx.log_context_fields = {
        "room": ctx.room.name,
    }

    session = AgentSession(
        vad=ctx.proc.userdata["vad"],
        # any combination of STT, LLM, TTS, or realtime API can be used
        llm=openai.LLM(model="gpt-4o-mini"),
        stt=deepgram.STT(model="nova-2", language="es", detect_language=False),
        tts=openai.TTS(voice="ash"),
        # use LiveKit's turn detection model
        turn_detection=MultilingualModel(),
    )

    # log metrics as they are emitted, and total usage after session is over
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics_collected(ev: MetricsCollectedEvent):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        summary = usage_collector.get_summary()
        logger.info(f"Usage: {summary}")

    # shutdown callbacks are triggered when the session is over
    ctx.add_shutdown_callback(log_usage)

    await session.start(
        agent=MyAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(
            # Krisp BVC noise cancellation (enabled)
            noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(transcription_enabled=True),
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm))
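
The failed turn appears in the log above as an LLM metrics line with ttft=-1.00 and output_tokens=0. To find how often this happens in a longer log, the following is a minimal sketch that flags such lines. It assumes the exact log format shown in this issue; the name `is_empty_llm_turn` is hypothetical and not part of livekit-agents.

```python
import re

# Matches "LLM metrics" lines in the log format shown in this issue.
# \s+ after each comma tolerates the double space seen after input_tokens.
LLM_METRICS_RE = re.compile(
    r"LLM metrics: ttft=(?P<ttft>-?\d+(?:\.\d+)?),\s+"
    r"input_tokens=(?P<input_tokens>\d+),\s+"
    r"cached_input_tokens=(?P<cached_input_tokens>\d+),\s+"
    r"output_tokens=(?P<output_tokens>\d+)"
)


def is_empty_llm_turn(line: str) -> bool:
    """Return True if the line is an LLM metrics entry with no output.

    Assumption: an empty turn is one with a negative ttft and zero
    output tokens, as in the log excerpt from this issue.
    """
    match = LLM_METRICS_RE.search(line)
    if match is None:
        # Not an LLM metrics line (e.g. STT or EOU metrics).
        return False
    ttft = float(match.group("ttft"))
    output_tokens = int(match.group("output_tokens"))
    return ttft < 0 and output_tokens == 0


def find_empty_llm_turns(lines):
    """Return the log lines that report an empty LLM turn."""
    return [line for line in lines if is_empty_llm_turn(line)]
```

Applied to the log excerpt above, this would flag only the 15:21:50,875 entry, since it is the only LLM metrics line present.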

Labels: bug (Something isn't working)