Conversation

@Rrojaski
Contributor

Rrojaski commented Jun 27, 2024

Pull Request Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 🔨 chore
  • 📝 docs

Relevant Issues

resolves #1603

What is in this change?

Implemented Ctrl (Cmd on Mac) + 'm' hotkeys to activate the existing React Speech Recognition feature.
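
The hotkey check described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code merged in this PR: `KeyLike` and `isMicHotkey` are hypothetical names, and the only real API surface relied on is the standard `key`, `ctrlKey`, and `metaKey` fields of a browser `KeyboardEvent`.

```typescript
// Minimal shape of the KeyboardEvent fields this check needs.
// (Hypothetical type, so the sketch also runs outside a browser.)
interface KeyLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
}

// Hypothetical helper: true for Ctrl+M, or Cmd+M on macOS.
// The key is lowercased so the match also works with Caps Lock on.
function isMicHotkey(e: KeyLike): boolean {
  return (e.ctrlKey || e.metaKey) && e.key.toLowerCase() === "m";
}
```

In a browser, a `keydown` listener could call `isMicHotkey(event)` and start the speech recognition session when it returns `true`. Keeping the check separate from the listener makes it easy to test without a DOM.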

Additional Information

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@timothycarambat timothycarambat added the "PR:needs review" label (Needs review by core team) Jun 28, 2024
@timothycarambat timothycarambat changed the title from "1603 speach to text" to "1603 speech to text hotkey" Jun 28, 2024
@timothycarambat
Member

timothycarambat commented Jun 29, 2024

Love the idea. I have some feedback from a UI/UX perspective. When I press CMD+Enter the mic turns on, but it is not super obvious that my hotkey did that (the mic icon just turns white). Then I can speak and I see the words coming in, but I don't know how to really "stop" the streaming other than shutting up for the silence interval.

If I click on the mic icon, it'll auto-send, but if I click the send icon itself it will double-send my query as the mic gets the event to stop listening.

I'm thinking about this from a UX perspective: what would make it less confusing for someone who finds themselves toying with this hotkey combination?

@Rrojaski
Contributor Author

Rrojaski commented Jul 1, 2024

@timothycarambat

Thanks for the feedback! I've fixed the issue where the query was double-sent when clicking the mic button or pressing the hotkeys. The speech-to-text session now stops when the send button is clicked, and an event listener for PROMPT_INPUT_EVENT checks whether the user clicks send before the timeout finishes.
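
The double-send fix can be modelled roughly as below. This is a simplified sketch, not the merged implementation: `SpeechSession` and its methods are hypothetical, `endOfSpeech()` stands in for the silence timeout (or the mic being toggled off), and `sendClicked()` stands in for the handler that reacts to PROMPT_INPUT_EVENT. The point is only the ordering: the session is stopped before the prompt is submitted, so the later end-of-speech callback has nothing left to send.

```typescript
// Hypothetical model of the double-send guard; names are illustrative only.
type Submit = (text: string) => void;

class SpeechSession {
  private listening = false;
  private transcript = "";
  private readonly submit: Submit;

  constructor(submit: Submit) {
    this.submit = submit;
  }

  // Begin a fresh listening session.
  start(): void {
    this.listening = true;
    this.transcript = "";
  }

  // Append recognized words while the session is active.
  hear(words: string): void {
    if (this.listening) {
      this.transcript = (this.transcript + " " + words).trim();
    }
  }

  // Called when the silence interval elapses or the mic is toggled off.
  endOfSpeech(): void {
    if (!this.listening) return; // session already stopped: nothing to send
    this.listening = false;
    if (this.transcript) this.submit(this.transcript);
  }

  // Called when the user clicks send while the mic is still listening.
  sendClicked(): void {
    const text = this.transcript;
    this.listening = false; // stop first so endOfSpeech() becomes a no-op
    this.transcript = "";
    if (text) this.submit(text);
  }
}
```

With this ordering, calling `sendClicked()` followed by `endOfSpeech()` submits the prompt once, which is the behavior the comment above describes.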

Additionally, I've changed the hotkeys to CMD+M to avoid interfering with what the ENTER key currently does.

For a better user experience, I have added a pulse animation to the mic icon when it is active, making it more obvious when speech recognition is on.

All issues with the mic and double-send have been resolved.

@timothycarambat timothycarambat self-assigned this Jul 20, 2024
@timothycarambat timothycarambat merged commit 6a0f068 into Mintplex-Labs:master Aug 7, 2024
cabwds pushed a commit to cabwds/anything-llm that referenced this pull request Jul 3, 2025
* Added ctrl + enter hotkeys to init speach to text

* Ran linter

* Fixed speech transcript from being submitted twice when the user clicks the send button. Updated speech hotkeys.

* Added pulse animation to mic

* Fixed prompt double-send when clicking the send button or ending the TTS session.

* Fixed comment grammar

* Update mic hotkeys

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>

Labels

PR:needs review Needs review by core team

Development

Successfully merging this pull request may close these issues.

[FEAT]: Add speech-to-text prompting possibility activated with a key combination

2 participants