
BaimingX

What
- add an optional return_segments path to indextts/infer_v2.py that collects per-segment text/sample metadata during synthesis.
- teach webui.py to request segment metadata, write a sibling .srt when available, and surface a subtitle preview and download.
- tweak the generation tab layout to host the new UI elements while keeping the audio UX unchanged.

Why
- subtitles are produced alongside the generated audio, eliminating post-hoc alignment steps and improving usability.

Details
- paths: indextts/infer_v2.py:326-674, webui.py:107-224, webui.py:245-499.
- helper functions normalize tokenizer output and format SRT timestamps.
- backward compatible: default audio-only inference still returns the original structure when return_segments=False.
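The timestamp helper mentioned above could look roughly like the sketch below. It is not the code from this PR: the names `format_srt_timestamp` and `segments_to_srt`, and the `(text, start_sample, end_sample)` tuple layout, are assumptions made for illustration. The only fixed part is the SRT format itself, which uses `HH:MM:SS,mmm` timestamps and numbers its cues from 1.

```python
from typing import Iterable, Tuple


def format_srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))  # clamp negative offsets to zero
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Tuple[str, int, int]], sample_rate: int) -> str:
    """Build SRT text from (text, start_sample, end_sample) tuples.

    The tuple layout is hypothetical; sample offsets are converted to
    seconds by dividing by sample_rate.
    """
    blocks = []
    for index, (text, start, end) in enumerate(segments, start=1):
        start_ts = format_srt_timestamp(start / sample_rate)
        end_ts = format_srt_timestamp(end / sample_rate)
        blocks.append(f"{index}\n{start_ts} --> {end_ts}\n{text.strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

For example, `format_srt_timestamp(3661.5)` yields `01:01:01,500`. Converting to integer milliseconds before splitting into fields avoids float rounding pushing the millisecond part to 1000.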

Tests
- python -m compileall webui.py indextts/infer_v2.py
- manual: run the Web UI, generate audio, confirm the .srt appears next to the .wav and the preview/download widgets work.

Notes
- restored upstream-tracked example audio and docs to avoid unintended deletions.
- follow-up idea (separate PR): extend the CLI path to emit .srt when return_segments=True.

[Screenshot: wechat_2025-10-11_081903_543]

@BaimingX BaimingX force-pushed the feat/auto-srt-inference-ui branch from 908bf7a to 8b8ea26 Compare October 10, 2025 21:41

hjj-lmx commented Oct 21, 2025

Where does seg_idx come from? It is not defined as a parameter.


2 participants
