Releases: livekit/agents
livekit-plugins-google@0.11.5
Patch Changes
- backporting fix to agents 0.x to ignore Gemini LLM responses with no candidates (#2898) - 73e5384c85ea9b29fa4c946f29c66bef80d5d160 (@davidzhao)
livekit-agents@1.2.2
Note
livekit-agents 1.2 introduced many new features. You can check out the changelog here.
New features
SpeechHandle
Waiting for the playout to finish inside a function tool could lead to deadlocks; in this version, an error is raised instead. To wait for the assistant's spoken response prior to executing a tool, use RunContext.wait_for_playout:
```python
@function_tool
async def my_function_tool(self, ctx: RunContext):
    # wait for the assistant's spoken response that started the execution of this tool
    await ctx.wait_for_playout()
```
False interruption detection
We now emit an event when the agent is interrupted but no transcript was received (likely a false interruption). This is useful for regenerating the assistant's reply so the agent doesn't appear stuck:
```python
@session.on("agent_false_interruption")
def on_false_interruption(ev: AgentFalseInterruptionEvent):
    session.generate_reply(instructions=ev.extra_instructions or NOT_GIVEN)
```
Initial conversation recording
We have begun implementing conversation recording directly within the Worker. Currently, it can be accessed using the console subcommand; a future update will provide an API for using this in production:
```shell
python3 examples/drive-thru/drivethru_agent.py console --record
```
What's Changed
- fix cartesia non-streaming tts by @longcw in #2942
- add RecorderIO and --record flag to the console mode by @theomonnom in #2934
- chore: remove prometheus database from repository by @mateuszkulpa in #2944
- parameterize inference worker init timeout by @levity in #2805
- plugins: openai: llm: add support for service_tier by @mike-r-mclaughlin in #2945
- fix: upgrade bithuman library to unblock accessing agents by @CathyL0 in #2948
- fix duplicated user messages when preemptive generation canceled by @longcw in #2949
- fix azure stt update options and add logs for error reason by @longcw in #2954
- Explictly calling ctx.connect before wait_for_participant by @sascotto in #2957
- azure stt: disable language detection if only one language sepcified by @longcw in #2959
- gemini: emit input_speech_started when new generation created by @longcw in #2963
- evals: fix realtime model RuntimeError by @theomonnom in #2965
- reveri/fix-11labs-error-fstring by @johncDepop in #2964
- add RunContext.wait_for_playout and guard against deadlocks by @theomonnom in #2966
- feat(realtime_model): correctly emit errors when the response is done by @bml1g12 in #2967
- slightly optimize import time by @theomonnom in #2968
- increase RoomInput frame_size_ms to 50ms by @theomonnom in #2970
- add warning when enabling unprovided input/output sinks by @longcw in #2969
- Handle RN format for preconnect mimeType by @davidzhao in #2952
- tune vad min_silence_duration and min_endpointing_delay by @longcw in #2953
- feat: add anam avatar by @karlson-anam in #2938
- fix types for anam avatar plugin by @longcw in #2976
- fix 11labs tts when audio is an empty string by @longcw in #2973
- support resume agent from a false interruption by @longcw in #2852
- feat: add simli avatar with example by @Antonyesk601 in #2923
- add simli plugin to ci by @longcw in #2978
- remove Resemble from CI by @theomonnom in #2979
- clean up avatar example and add retry for datastream io rpc call by @longcw in #2943
- expose transcription sync speed to RoomOutputOptions by @longcw in #2984
- hume tts: raise error message from the api by @longcw in #2982
- io: add input source hierarchy & cleanup by @theomonnom in #2983
- fix AgentFalseInterruptedEvent none message by @theomonnom in #2987
- rename AgentFalseInterruptedEvent -> AgentFalseInterruptionEvent by @theomonnom in #2988
- nit: update AgentFalseInterruptionEvent by @theomonnom in #2989
- fix deadlock & session close race by @theomonnom in #2997
- wait on_exit before pause scheduling by @longcw in #2996
- Gladia STT - add region parameter to gladia stt by @mfernandez-gladia in #2995
- fix: upgrade bithuman library version by @CathyL0 in #2998
- improve GetEmailTask instructions by @theomonnom in #3002
- message should be None when empty by @theomonnom in #3003
- ci: enable verbose evals by @theomonnom in #3004
- fix sensitive TTS tests by @theomonnom in #3005
New Contributors
- @levity made their first contribution in #2805
- @CathyL0 made their first contribution in #2948
- @ladvoc made their first contribution in #2956
- @johncDepop made their first contribution in #2964
- @bml1g12 made their first contribution in #2967
- @karlson-anam made their first contribution in #2938
- @Antonyesk601 made their first contribution in #2923
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.2.0...livekit-agents@1.2.2
livekit-agents@1.2.0
New Features
Evals & Testing:
You can now perform turn-by-turn evaluations on your agent interactions. Here's an example of how to validate expected behaviors:
```python
result = await sess.run(user_input="Can I book an appointment? What's your availability for the next two weeks?")
result.expect.skip_next_event_if(type="message", role="assistant")
result.expect.next_event().is_function_call(name="list_available_slots")
result.expect.next_event().is_function_call_output()
await result.expect.next_event().is_message(role="assistant").judge(llm, intent="must confirm no availability")
```
Check out these practical examples: drive-thru, frontdesk
Documentation: https://docs.livekit.io/agents/build/testing/
Preemptive Generation
This feature enables speculative initiation of LLM and TTS processing before the user's turn concludes, significantly reducing response latency by overlapping processing with user audio. Disabled by default:
```python
session = AgentSession(..., preemptive_generation=True)
```
Enhanced End-of-Turn (EOU) Detection
The end-of-turn model has been refined to reduce sensitivity to punctuation and better handle multilingual scenarios, notably improving Hindi language support.
Documentation: https://docs.livekit.io/agents/build/turns/turn-detector/#supported-languages
OpenTelemetry Integration
Agents now support tracing of LLM/TTS requests and user callbacks via OpenTelemetry. See the LangFuse example for a detailed implementation.
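Since the release notes don't show the wiring itself, here is a minimal sketch of a standard OpenTelemetry SDK setup with an OTLP exporter pointed at LangFuse. The endpoint URL and environment-variable names are assumptions for illustration; consult the LangFuse example in the repo for the exact integration:

```python
# Hypothetical sketch: plain OpenTelemetry SDK configuration; nothing here is
# LiveKit-specific. LangFuse accepts OTLP traces with Basic auth built from
# its public/secret keys (env var names assumed).
import base64
import os

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

auth = base64.b64encode(
    f"{os.environ['LANGFUSE_PUBLIC_KEY']}:{os.environ['LANGFUSE_SECRET_KEY']}".encode()
).decode()

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://cloud.langfuse.com/api/public/otel/v1/traces",  # assumed endpoint
            headers={"Authorization": f"Basic {auth}"},
        )
    )
)
# spans emitted for LLM/TTS requests and callbacks are exported via this provider
trace.set_tracer_provider(provider)
```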
Experimental Agent Tasks
AgentTask is a new experimental feature that lets an agent terminate once it achieves a specific goal. You can await AgentTasks directly in your workflows:
```python
@function_tool
async def schedule_appointment(self, ctx: RunContext[Userdata], slot_id: str) -> str:
    # Attempts to retrieve the user's email, allowing multiple agent-user interactions
    email_result = await beta.workflows.GetEmailTask(chat_ctx=self.chat_ctx)
```
Half-Duplex Pipeline
Combine Gemini or OpenAI's realtime STT/LLM with a separate TTS engine, optimizing your agent's voice interactions:
```python
session = AgentSession(
    llm=openai.realtime.RealtimeModel(modalities=["text"]),
    # Alternatively: llm=google.beta.realtime.RealtimeModel(modalities=[Modality.TEXT]),
    tts=openai.TTS(voice="ash"),
)
```
View the complete example.
Documentation: https://docs.livekit.io/agents/integrations/realtime/#separate-tts
Improved Transcription Synchronization
Align transcripts accurately with speech outputs from TTS engines such as Cartesia and 11labs for improved synchronization:
```python
session = AgentSession(..., use_tts_aligned_transcript=True)
```
Refer to the complete example.
Documentation: https://docs.livekit.io/agents/build/text/#tts-aligned-transcriptions
Upgraded Tokenization Engine
We transitioned from the previous naive implementation to the Blingfire tokenization engine, significantly improving handling and accuracy across multiple languages.
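To see why this matters, the snippet below (plain Python, no LiveKit or Blingfire APIs) reproduces the kind of abbreviation ambiguity a naive punctuation-based splitter mishandles and a trained tokenizer like Blingfire resolves:

```python
import re

def naive_sentences(text: str) -> list[str]:
    # naive rule: a sentence ends at ., ! or ? followed by whitespace
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

text = "Dr. Smith arrived at 3 p.m. He greeted everyone."
# The naive rule also fires after the abbreviations "Dr." and "p.m.",
# yielding 3 fragments instead of the 2 real sentences.
print(naive_sentences(text))
# → ['Dr.', 'Smith arrived at 3 p.m.', 'He greeted everyone.']
```

A trained model handles abbreviations, decimals, and non-Latin scripts that rule-based splitters trip over, which is the motivation for the switch.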
Complete changelog
- introduce AgentTask by @theomonnom in #2483
- introduce workflows & GetEmailAgent by @theomonnom in #2498
- drive-thru example by @theomonnom in #2609
- reuse SpeechHandle for all generations inside a single turn by @theomonnom in #2623
- introduce test & eval primitives by @theomonnom in #2662
- evals: add maybe_* utils by @theomonnom in #2681
- evals: better error message for assertions by @theomonnom in #2682
- evals: RunResult final_output on Agent tasks by @theomonnom in #2696
- evals: AgentTask GetEmailAdress tests e.g by @theomonnom in #2697
- allow optional RunResult output_type by @theomonnom in #2698
- evals: add EventRangeAssert utils by @theomonnom in #2699
- add front-desk agent example by @theomonnom in #2724
- fix InlineAgent agent resume on error by @theomonnom in #2730
- add ChatContext.merge & merge inline tasks chat_ctx by @theomonnom in #2731
- better GetEmailAgent instructions by @theomonnom in #2732
- exclude function_call inside ChatContext.merge by @theomonnom in #2733
- add Blingfire tokenizer & use it by default by @theomonnom in #2771
- fix RealtimeModel generate_reply authorization by @theomonnom in #2773
- support timed transcripts from tts by @longcw in #2580
- ignore empty sentence in tts stream adapter by @longcw in #2777
- fix types for agents 1.2 by @longcw in #2778
- fix MockTools type by @longcw in #2781
- fix RunResult order of fnc_call & agent_handoff by @theomonnom in #2782
- fix types by @theomonnom in #2783
- fix tr_input by @theomonnom in #2784
- fix GetEmailAgent instructions by @theomonnom in #2786
- fix blingfire tokenizer test by @longcw in #2785
- support tts with realtime model (audio in, text out) by @longcw in #2628
- fix assistant message order on the RunResult by @theomonnom in #2787
- fix FrontDeskAgent list_available_slots by @theomonnom in #2788
- initial evals for the FrontDesk agent by @theomonnom in #2790
- ignore empty assistant messages by @theomonnom in #2792
- evals: add CI by @theomonnom in #2791
- evals ci: use python 3.12 by @theomonnom in #2793
- fix confirmation/validation ambiguity on GetEmailAgent instructions by @theomonnom in #2794
- punctuation free turn detector by @jeradf in #2717
- frontdesk: ToolError example by @theomonnom in #2808
- evals API improvements by @theomonnom in #2846
- make arguments optional for mock_tools by @theomonnom in #2847
- allow returning Exception inside function tools by @theomonnom in #2848
- add envvar to enable verbose evals logs by @theomonnom in #2849
- preemptive generation before end of user turn by @longcw in #2728
- fix next_event return type by @theomonnom in #2856
- evals: add docstrings to the public API by @theomonnom in #2857
- only print the judge result when verbose is enabled by @theomonnom in #2858
- Add contains_agent_handoff assertion by @bcherry in #2862
- allow editing SpeechHandle allow_interruptions & add RunContext.disallow_interruptions by @theomonnom in #2864
- fix evals test by @theomonnom in #2865
- fix ruff and types by @longcw in #2889
- add opentelemetry trace by @longcw in #2873
- fix unordered user messages by @theomonnom in #2891
- fix livekit-agents 1.2 tests by @theomonnom in #2866
- cleanup & prepare for release by @theomonnom in #2893
- add prometheus by @theomonnom in #2908
- add gen_ai attributes to llm_request by @longcw in #2905
- fix types and aws realtime model by @longcw in #2910
- fix TTS fallback adapter metrics_collected event by @longcw in #2890
- add model property for llm plugins by @longcw in #2914
- nit: mprove drivethru by @theomonnom in #2918
- Removing ctx.connect() from examples by @sascotto in #2909
- expose tokenizer option for cartesia tts by @longcw in #2916
- remove openai prewarm by @theomonnom in #2919
- add tts_audio_duration to usage metrics collection by @Panmax in #2915
...
livekit-agents@1.1.7
What's Changed
- fix log extra field handling in log.py by @Panmax in #2875
- fix aws realtime model types by @longcw in #2877
- chore: export PlayHandle type by @davidzhao in #2903
- fix gemini realtime user transcription sent twice by @longcw in #2899
- append framework ID to User-Agent Header by @BumaldaOverTheWater94 in #2896
- add gemini tts (beta) by @longcw in #2834
- fix DatastreamIO cancellation race by @theomonnom in #2911
- DataStreamIO wait for start when capturing_frame by @theomonnom in #2912
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.6...livekit-agents@1.1.7
livekit-agents@1.1.6
What's Changed
- Fix LMNT plugin docs by @zachoverflow in #2762
- Update new plugin readmes for format and links by @bcherry in #2571
- fix update_chat_ctx bug by @BumaldaOverTheWater94 in #2763
- Include item id when converting to LG messages by @dkeller-sondermind in #2767
- fix schedule speech on windows when monotonic_ns resolution is rough by @longcw in #2770
- Install optional dependencies during docs gen by @bcherry in #2766
- Feat/mistralai plugins by @fabitokki in #2772
- fix docker-compose typo by @theomonnom in #2789
- suppress main_stream ended error in stt fallback adapter by @longcw in #2684
- [fix] Fixed Orus voice name definition by @Is44m in #2797
- fix aws sonic type checking by @longcw in #2804
- fix deepgram stt docs by @longcw in #2803
- Hotfix for Baseten STT by @htrivedi99 in #2801
- fix inactive user instructions by @theomonnom in #2809
- fix BackgroundAudio hanging on close error by @theomonnom in #2814
- reset closing_ws for openai stt by @longcw in #2813
- avoid sid error in console mode by @theomonnom in #2815
- ignore livekit api when using console mode by @theomonnom in #2816
- Feature : Add audio_mixer_kwargs to BackgroundAudioPlayer by @CyprienRicqueB2L in #2796
- fix FunctionToolsExecutedEvent import by @longcw in #2832
- feat: ability to use remote EOT inference when deployed in Cloud by @davidzhao in #2780
- Add support for CustomPronunciations in Google TTS plugin by @kechako in #2692
- Nova Sonic Example Agent by @BumaldaOverTheWater94 in #2817
- Prevent console mode from crashing by @donalffons in #2853
- Small fix to README by @kath0la in #2861
- Fix: Use synchronized transcript for interrupted session.say() responses by @eliotsamuelmiller in #2843
- fix aws sonic tools by @theomonnom in #2859
- log metrics in extra by @theomonnom in #2868
- accidentally omit a docstring by @BumaldaOverTheWater94 in #2869
New Contributors
- @zachoverflow made their first contribution in #2762
- @dkeller-sondermind made their first contribution in #2767
- @fabitokki made their first contribution in #2772
- @Is44m made their first contribution in #2797
- @donalffons made their first contribution in #2853
- @eliotsamuelmiller made their first contribution in #2843
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.5...livekit-agents@1.1.6
livekit-agents@1.1.5
What's Changed
- Preserve original path when connecting to web socket (fix for #2700) by @arpesenti in #2702
- disconnect room when session closed due to participant disconnected by @longcw in #2712
- make sure audio_output.flush called when capture frame failed by @longcw in #2718
- Update Inworld README by @ShayneP in #2723
- Updating whisper API by @htrivedi99 in #2726
- Lock google-genai package to stable v1.20.0 by @simplegr33n in #2725
- fix(google): pass in raw schema according to genai 1.20 spec by @davidzhao in #2727
- feat(google): expose seed parameter in LLM.chat by @mrkowalski in #2721
- upgrade google genai to 1.23 by @longcw in #2743
- support 11labs auto mode with sentence tokenizer by @longcw in #2744
- add livekit-blingfire by @theomonnom in #2734
- remove changesets by @theomonnom in #2749
- uv: ignore blingfire by @theomonnom in #2750
- fix aggregate-dumps when no file is present by @theomonnom in #2751
- run tts tests on top10 providers by @theomonnom in #2752
- delete changesets x2 by @theomonnom in #2753
- add build CI by @theomonnom in #2754
- fix blingfire build CI by @theomonnom in #2756
- BlingFire: use Release config on Windows by @theomonnom in #2757
- build blingfire for macos x86 & linux arm64 by @theomonnom in #2758
- Nova Sonic Realtime Plugin by @BumaldaOverTheWater94 in #2740
- keep aws nova sonic optional by @theomonnom in #2760
New Contributors
- @arpesenti made their first contribution in #2702
- @mrkowalski made their first contribution in #2721
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.4...livekit-agents@1.1.5
livekit-agents@1.1.4
What's Changed
- add --ignore-changesets to update_versions.py by @theomonnom in #2665
- remove frame_size_ms param when creating AudioStream by @longcw in #2667
- Gladia STT - add new parameters to gladia stt by @mfernandez-gladia in #2649
- expose automatic_function_calling config for google LLM by @longcw in #2675
- start user away timer after user join by @longcw in #2676
- preserve created_at timestamp when updating instructions by @Panmax in #2677
- use parameters_json_schema for raw function tool with google LLM by @longcw in #2686
- import TextInputEvent from room_io by @longcw in #2679
- reset agent and user state after session closed by @longcw in #2691
- Add hedra extra by @bcherry in #2705
- add markdown filter for tts and transcription nodes by @longcw in #2695
- Fix Example Typo by @toubatbrian in #2706
- Inworld TTS by @davidzhao in #2693
- add warning for deprecated speed and emotion control for cartesia tts by @longcw in #2708
- fix(plugins-inworld): change default voice to Ashley by @MichaelSolati in #2707
- deepgram: disable smart_format by default by @theomonnom in #2704
- livekit-agents 1.1.4 by @theomonnom in #2709
New Contributors
- @mfernandez-gladia made their first contribution in #2649
- @Panmax made their first contribution in #2677
- @MichaelSolati made their first contribution in #2707
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.2...livekit-agents@1.1.4
livekit-agents@1.1.2
What's Changed
- Add spitch optional dependency by @temibabs in #2559
- add Cartesia STT usage event by @ChenghaoMou in #2565
- use the cgroup cpu_count for the inference thread pool by @theomonnom in #2572
- avoid possible contention on concurrent inference executions by @theomonnom in #2575
- use onnx dynamic_block_base by @theomonnom in #2578
- add vad for stt FallbackAdapter by @longcw in #2582
- Don't require sarvam api key param for TTS by @bcherry in #2579
- Remove unnecessary model param from baseten tts by @bcherry in #2568
- Fix baseten STT api key lookup by @bcherry in #2576
- fix stt fallback adapter imports by @longcw in #2590
- Replace the office-ambience sound file by @bcherry in #2588
- chore(deepgram,cartesia): removed AudioEnergyFilter by @davidzhao in #2594
- unit tests for agent session by @longcw in #2518
- fix unknown energy filter parameter by @theomonnom in #2599
- fix type check by @longcw in #2596
- wait for final transcript in manual turn detection by @longcw in #2597
- add volume gain option by @jmugicagonz in #2603
- increase audio frame size by @theomonnom in #2610
- Add SSML support for Google TTS by @kechako in #2608
- fix OpenAI Realtime connect timeout by @theomonnom in #2612
- fix OpenAI Realtime tool_choice by @theomonnom in #2613
- add transcript_confidence to ChatMessage by @theomonnom in #2611
- fix(turn-detector): improve accuracy by combining adjacent turns by @davidzhao in #2595
- fix transcription delay when VAD false negative by @longcw in #2620
- Hume plugin fixes by @zgreathouse in #2591
- Updating metrics for cached tokens for Realtime model (OpenAI) by @tg-bomze in #2621
- Disable ensure_ascii by @tg-bomze in #2622
- add timeout for agent session tests by @longcw in #2624
- add error log when llm fallback adapter failed because chunk_sent by @longcw in #2626
- fix ChatContext.insert type check by @theomonnom in #2635
- Removes the split_utterances option from Hume TTS plugin by @zgreathouse in #2638
- wait for video track from avatar plugins by @longcw in #2627
- add http_options for gemini LLM and realtime model by @longcw in #2640
- correctly passing speaking_rate to StreamingAudioConfig by @david-rodriguez in #2631
- Fix : Increase audio mixer timeout by @CyprienRicqueB2L in #2646
- handling multiple audio chunk output by @raghavjaistra in #2641
- Fix Hume TTS by @bcherry in #2639
- Update sarvam defaults, add 2.5 by @bcherry in #2618
- fix tracing param in openai realtime by @longcw in #2652
- raise error from gladia stt for fallback adapter and retry by @longcw in #2653
- fix await tasks groups never return by @longcw in #2654
- chore: add note for job_context.api usage by @davidzhao in #2655
- fix(google): update dependency versions by @davidzhao in #2658
- feat(baseten): add LLM module by @davidzhao in #2657
- cleanup tee in agent activity by @longcw in #2660
- fix duplicated audio on flush by @theomonnom in #2663
- fix transcription sync warning when gemini no text output by @longcw in #2661
- livekit-agents v1.1.2 by @theomonnom in #2664
New Contributors
- @tg-bomze made their first contribution in #2621
- @david-rodriguez made their first contribution in #2631
- @CyprienRicqueB2L made their first contribution in #2646
- @raghavjaistra made their first contribution in #2641
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.0...livekit-agents@1.1.2
livekit-agents@1.1.0
What's Changed
- TTS improvements & tests by @theomonnom in #2152
- rewrite Azure TTS by @theomonnom in #2151
- fix google TTS by @theomonnom in #2410
- add to_provider_format for ChatContext by @longcw in #2295
- automatically close agent session when participant disconnected by @longcw in #2398
- fix type checks and tts fallback adapter by @longcw in #2419
- deprecate multi-segments SynthesizeStream by @theomonnom in #2421
- cartesia: fix break by @theomonnom in #2422
- avoid raising tts empty errors when pushed text is empty by @longcw in #2420
- don't error when pushing on closed stream by @theomonnom in #2424
- Add diarization support by @ShayneP in #2338
- fix gemini user transcription when tool calls by @longcw in #2439
- skip response if no llm set in user turn completed by @longcw in #2441
- fix type checks for plugins by @longcw in #2423
- Fix SpeechHandle Priority Schedule by @toubatbrian in #2433
- use time.monotonic_ns for speech scheduling by @theomonnom in #2446
- fix transcription sync when on_playback_finished missing after flush by @longcw in #2397
- LMNT agent plugin for TTS synthesis by @naiveen in #2413
- fix agent state for pipeline agent by @longcw in #2453
- add max_session_duration and auto reconnection for OAI realtime api by @longcw in #2360
- avatar publish video after waiting participant by @longcw in #2450
- PlayAI plugin: fix language tag by @bryananderson in #2458
- Update README.md by @theomonnom in #2466
- fixed identifying streamable http mcp servers containing api key in url by @Akshay-a in #2468
- fix(google): Live syncs context, supports manual turns by @davidzhao in #2401
- AssemblyAI Remove Hardcoded Default Configuration by @dan-ince-aai in #2456
- add duration_per_frame for datastream audio receiver by @longcw in #2474
- add logs after session closed by @longcw in #2479
- rename to frame_size_ms for data stream audio receiver by @longcw in #2481
- chore(assemblyai): renaming to format_turns and only emit formatted f… by @dan-ince-aai in #2485
- fix optional args in Annotated argument by @longcw in #2491
- fix text only example by @longcw in #2490
- add artificial delay between consecutive speech handles by @longcw in #2492
- Support for Spitch in LiveKit by @temibabs in #2430
- detect inactive user example by @theomonnom in #2499
- recover from incorrect LLM arguments in function_tool by @theomonnom in #2500
- add max_unrecoverable_errors and connection options for agent session by @longcw in #2494
- Collect prompt cached tokens count in llm usage in AWS LLM plugin by @alfredguiaugment in #2508
- fix tts fallback adapter test and stream adapter by @longcw in #2514
- add hedra plugin by @longcw in #2163
- Fix broken link in silero readme by @bcherry in #2521
- fix(google): proactivity and affective_dialog require v1alpha1 API by @davidzhao in #2523
- fix: LLM to honor custom timeouts by @davidzhao in #2526
- ignore empty assistant messages by @theomonnom in #2530
- feat(openai): strip thinking tokens by @davidzhao in #2524
- Fix typo in cerebras error msg by @bcherry in #2531
- feat: surface tavus conversation id by @mertgerdan in #2532
- feat: langgraph integration by @davidzhao in #2534
- cleanup bithuman when process shutdown by @longcw in #2536
- add eleven labs v3 model by @choso in #2540
- lmnt: Update default voice, add temperature, topp options by @naiveen in #2539
- Add Cartesia STT integration by @DineshTeja in #2538
- Baseten Livekit plugin integration by @htrivedi99 in #2520
- feat: Sarvam.ai plugin for STT and TTS by @AnshTanwar in #2241
- chore: tweaks to plugins CI by @davidzhao in #2543
- use not given for room io options by @longcw in #2542
- Add new speed and tracing options to OpenAI RealtimeModel and RealtimeSession by @mikevin920 in #2503
- add connect options and error retry for realtime model by @longcw in #2544
- initial prewarm by @theomonnom in #2527
- bithuman avatar refresh token after prewarm by @longcw in #2541
- ignore prewarm failures by @theomonnom in #2545
- run room_io.start and ctx.connect concurrently in session.start by @longcw in #2505
- convert TracingOptions for session updates by @theomonnom in #2546
- use pyht SDK by @theomonnom in #2459
- update x.ai models by @theomonnom in #2547
- cancel tasks on start failure by @theomonnom in #2548
- livekit-agents v1.1.0 by @theomonnom in #2549
New Contributors
- @dan-ince-aai made their first contribution in #2399
- @toubatbrian made their first contribution in #2433
- @naiveen made their first contribution in #2413
- @temibabs made their first contribution in #2430
- @alfredguiaugment made their first contribution in #2508
- @mertgerdan made their first contribution in #2532
- @choso made their first contribution in #2540
- @DineshTeja made their first contribution in #2538
- @htrivedi99 made their first contribution in #2520
- @AnshTanwar made their first contribution in #2241
- @mikevin920 made their first contribution in #2503
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.0.23...livekit-agents@1.1.0
livekit-agents@1.0.23
What's Changed
- Support empty array of parameters when using raw_schema by @koen-boost in #2328
- fix: avoid shadowing name in function_tool decorator by @davidzhao in #2331
- feat: Add with_letta OpenAI plugin by @mattzh72 in #2182
- fix tool choice by @jayeshp19 in #2332
- Expose all realtime model parameters by @Shubhrakanti in #2324
- add prewarm to bithuman avatar example by @longcw in #2337
- fix agent transcription truncate for console mode by @longcw in #2327
- fix chat context item order by @longcw in #2321
- google: add new models to LLM and live by @davidzhao in #2344
- handle missing token_count in realtime usage metrics by @fredvollmer in #2350
- [Rime] Increase timeout for arcana model to allow for synthesis of long audio by @MaCaki in #2343
- google: do not error when empty responses are returned by @davidzhao in #2345
- fix type check by @longcw in #2335
- add internal worker token by @real-danm in #2354
- Add input_audio_noise_reduction to OpenAI RealtimeModel by @RBT22 in #2362
- Setting openai temperature on LLM.chat by @free-soellingeraj in #2353
- on_end_of_turn is sync by @theomonnom in #2374
- multilingual model update by @jeradf in #2219
- ignore any_generics for mypy by @longcw in #2375
- rename insert_item to insert by @theomonnom in #2372
- support stt END_OF_SPEECH for stt turn detection by @longcw in #2363
- Implemented #2379 - Add support for more paramaters for Google Live API by @F1nnM in #2380
- disable split characters for tts by @longcw in #2366
- google: fix proactive audio config, update genai by @davidzhao in #2390
- fix race condition in avatar runner when reset playback_position by @longcw in #2396
- set user state to away after a timeout by @longcw in #2408
- add MCP support for streamable HTTP client by @Akshay-a in #2394
- Upgrade AssemblyAI to Universal-Streaming by @dan-ince-aai in #2399
- fix AssemblyAI & follow docs by @theomonnom in #2445
New Contributors
- @koen-boost made their first contribution in #2328
- @mattzh72 made their first contribution in #2182
- @fredvollmer made their first contribution in #2350
- @real-danm made their first contribution in #2354
- @RBT22 made their first contribution in #2362
- @free-soellingeraj made their first contribution in #2353
- @F1nnM made their first contribution in #2380
- @Akshay-a made their first contribution in #2394
- @dan-ince-aai made their first contribution in #2399
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.0.22...livekit-agents@1.0.23