Opened 6 years ago
Last modified 5 years ago
#2871 assigned enhancement
Add speech recognition for running commands in VR
Reported by: | Tom Goddard | Owned by: | Tom Goddard |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | VR | Version: | |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
Conrad has added the ChimeraX speech bundle, which uses the PyPI SpeechRecognition module, PyAudio and the Google Speech Recognition service as an initial attempt at speech input. He reports that it works OK in a completely silent environment.
Testing in the VizVault with the Vive Pro VR headset microphone produced poor results. It recognized "open" and "close" one time and then mostly did not log any response to spoken input. Sometimes it logs a phrase from attempted commands given over a 60 second period. Sometimes it logs "cannot recognize speech". To debug further, it would need to report when audio is being submitted and play back that recorded audio.
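A minimal sketch of that kind of debugging aid, assuming the bundle uses the SpeechRecognition `Recognizer` / `Microphone` classes directly. The function names `save_wav` and `listen_and_log`, the file prefix and the phrase limits are made up for illustration and are not taken from the speech bundle. It logs when a captured phrase is about to be submitted and saves each capture as a WAV file that can be played back afterwards.

```python
import wave


def save_wav(path, raw_pcm, sample_rate, sample_width, channels=1):
    """Write raw PCM bytes to a WAV file and return its duration in seconds."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(raw_pcm)
    return len(raw_pcm) / (sample_rate * sample_width * channels)


def listen_and_log(log=print, wav_prefix="speech_debug", max_phrases=5):
    """Capture phrases, save each one to disk, and log every submission.

    Hypothetical helper: requires the SpeechRecognition and PyAudio packages
    and a working microphone. Imported here so save_wav works without them.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        for i in range(max_phrases):
            log("listening...")
            audio = recognizer.listen(source, phrase_time_limit=10)
            path = f"{wav_prefix}_{i}.wav"
            seconds = save_wav(path, audio.get_raw_data(),
                               audio.sample_rate, audio.sample_width)
            log(f"submitting {seconds:.1f} s of audio, saved to {path}")
            try:
                log("recognized: " + recognizer.recognize_google(audio))
            except sr.UnknownValueError:
                log("cannot recognize speech")
            except sr.RequestError as err:
                log(f"speech service request failed: {err}")
```

Listening to the saved files should show whether the headset microphone is delivering usable audio at all, or whether phrases are being cut off or merged before they reach the recognition service.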
PyAudio failed to install from PyPI due to a compilation problem on Windows 10 with Microsoft Visual Studio 2015, so a wheel from a separate site was used.
Change History (2)
comment:1 by , 6 years ago
comment:2 by , 5 years ago
Cc: | removed |
---|---|
Owner: | changed from | to
Molecular Zoo has fairly robust speech input, implemented in Unity in the following file by Ray Altenberg while he was an intern at UCSF.
https://github.com/alanbrilliant/MolecularZoo/blob/master/Assets/Scripts/VoiceRecog.cs
It uses UnityEngine.Windows.Speech, specifically two classes. The KeywordRecognizer recognizes fixed words like Oxygen or Reset and does not require an internet connection. The DictationRecognizer transcribes free speech after the word "create" is said in MolZoo, and the transcription is then used for a PubChem search. Both work quite well. The keyword recognizer uses a vocabulary of about 25 words or phrases and is especially robust.
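As a rough illustration of how a small fixed vocabulary could be applied on the ChimeraX side, here is a sketch that maps a transcribed phrase onto a short list of allowed commands. It is not how the Unity KeywordRecognizer works (that class constrains recognition itself). It is only a post-hoc match on the transcript using Python's `difflib`. The vocabulary, the `match_command` name and the cutoff value are assumptions chosen for the example.

```python
import difflib

# Hypothetical vocabulary: spoken phrase -> ChimeraX command text.
VOCABULARY = {
    "close": "close",
    "reset view": "view",
    "show atoms": "show atoms",
    "hide atoms": "hide atoms",
    "show cartoons": "show cartoons",
    "hide cartoons": "hide cartoons",
}


def match_command(transcript, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the command for the closest vocabulary phrase, or None."""
    # Normalize case and whitespace before comparing.
    phrase = " ".join(transcript.lower().split())
    if phrase in vocabulary:
        return vocabulary[phrase]
    # Tolerate small transcription errors such as a dropped plural.
    close = difflib.get_close_matches(phrase, list(vocabulary), n=1, cutoff=cutoff)
    return vocabulary[close[0]] if close else None
```

Inside ChimeraX the returned text could then be passed to `chimerax.core.commands.run(session, text)`. Transcripts that match nothing return `None` and can simply be logged and ignored.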
Here is documentation on the Unity KeywordRecognizer and DictationRecognizer classes
https://docs.unity3d.com/ScriptReference/Windows.Speech.KeywordRecognizer.html
https://docs.unity3d.com/ScriptReference/Windows.Speech.DictationRecognizer.html
More details about these Unity capabilities are given by Microsoft here
https://docs.microsoft.com/en-us/windows/mixed-reality/voice-input-in-unity
It looks likely that the dictation recognizer is using the Windows 10 online speech recognition capabilities used by Cortana and described here
https://support.microsoft.com/en-us/help/4468250/windows-10-speech-voice-activation-inking-typing-privacy
Possibly the keyword recognizer, which does not use an internet connection, uses Windows Speech Recognition
https://en.wikipedia.org/wiki/Windows_Speech_Recognition