Get ready state of recognizer #80

DanielUsselmann · 2024-03-17T20:33:24Z

Hi,
Is there a way to get something like a ready state from the recognizer?
My problem is that it takes a few seconds until the recognizer actually starts recognizing the user's speech.
While it is not ready, I want to show a loading spinner until the recognizer can be used.

I am using the getUserMedia() function for audio.

erikh2000 · 2024-03-17T21:52:17Z

I think I agree it would be nice to have, and maybe that feature is already there and I missed it. But...

If you maintain your own ready state and set it to true once all the setup work is done, you should have the same thing. The main asynchronous delay is loading and creating the model which is then passed to the KaldiRecognizer constructor. I believe that after the KaldiRecognizer instance is constructed, it's ready to receive data from the microphone via acceptWaveform(). (But that's going from memory, and I can't test it right now.)
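
A minimal sketch of that idea, assuming the setup is split into named steps. `createReadyTracker` and the step names are hypothetical and not part of the library; the real steps would be whatever asynchronous work the app does before audio can reach the recognizer.

```javascript
// Hypothetical helper (not a library API): tracks named setup steps and
// calls onReady once, when the last pending step has been marked done.
function createReadyTracker(stepNames, onReady) {
  const pending = new Set(stepNames);
  return {
    markDone(step) {
      // Ignore unknown or already-completed steps so onReady fires only once.
      if (!pending.delete(step)) return;
      if (pending.size === 0) onReady();
    },
    isReady() {
      return pending.size === 0;
    },
  };
}

// Usage: the step names below are assumptions about what the app sets up.
const tracker = createReadyTracker(
  ["model", "recognizer", "microphone", "worklet"],
  () => console.log("ready: hide the loading spinner")
);
tracker.markDone("model");
tracker.markDone("recognizer");
tracker.markDone("microphone");
tracker.markDone("worklet"); // last step: onReady runs here
```

The point of tracking every step, rather than a single flag, is that the spinner stays up until the slowest step has finished, whichever one that turns out to be.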

DanielUsselmann · 2024-03-17T22:10:42Z

I have the following React app:

main.jsx:
<Recognizer language={language} onModelLoad={handleLoadComplete} onResult={handleResult}/>

recognizer.jsx:
useEffect(() => {
  if (ready) {
    onModelLoad(); // Invoke onModelLoad callback to notify the parent component
  }
}, [ready]);

`useEffect(() => {
  const loadModel = async () => {
    const defaultModelPath = language;
    setLoading(true); // Set loading state to true initially
    const newChannel = new MessageChannel();
    setChannel(newChannel); // Set channel state
    const model = await createModel(defaultModelPath);
    model.registerPort(newChannel.port1);
    setLoadedModel({ model, path: defaultModelPath });
    const newRecognizer = new model.KaldiRecognizer(48000);
    newRecognizer.setWords(true);
    newRecognizer.on("result", (message) => {
      const result = message.result;
      if (result.text !== "") {
        setUtterances((utt) => [...utt, result]);
        onResultRef.current?.(result);
      }
    });
    newRecognizer.on("partialresult", (message) => {
      setPartial(message.result.partial);
    });
    setRecognizer(newRecognizer);
    setLoading(false); // Set loading state to false once loading is complete
  };

  loadModel();
}, [language]); // Load the default model when language changes`

mic.jsx:
`const startRecognitionStream = useCallback(async () => {
  if (recognizer) {
    if (!mediaStream) {
      try {
        mediaStream = await navigator.mediaDevices.getUserMedia({
          video: false,
          audio: {
            echoCancellation: true,
            noiseSuppression: true,
          },
        });
        if (mediaStream) {
          ready(true);
        }
        const audioContext = new AudioContext();
        await audioContext.audioWorklet.addModule('js/recognizer-processor.js');

        const recognizerProcessor = new AudioWorkletNode(audioContext, 'recognizer-processor', { channelCount: 1, numberOfInputs: 1, numberOfOutputs: 1 });
        recognizerProcessor.port.postMessage({ action: 'init', recognizerId: recognizer.id }, [channel.port2]);
        recognizerProcessor.connect(audioContext.destination);

        const source = audioContext.createMediaStreamSource(mediaStream);
        source.connect(recognizerProcessor);
      } catch (e) {
        f7.dialog.alert(e.name, e.message);
      }
    }
  }
}, [recognizer]);

useEffect(() => {
  startRecognitionStream();
}, [recognizer]);`

Sorry for the badly formatted code, but proper formatting somehow isn't supported.

To explain: once the model has been loaded, the loading flag is cleared, and once microphone access is allowed, the ready flag is set.
But somehow recognition does not start immediately.

erikh2000 · 2024-03-17T22:28:36Z

It's a little too much for me to find the issue in that code. I'll say that I would suspect that the setter functions (e.g. setRecognizer(), setLoading()) might not be updating values when you want them to inside of the function passed to useEffect().

I recommend setting breakpoints and stepping through the code in a browser debugger, like Chrome's or Firefox's. You can narrow it to the exact point of execution where something is happening outside of your expectations.

DanielUsselmann · 2024-03-18T17:43:51Z

I mean, the current code shows the spinner until the user allows microphone access, but it still takes a few seconds until you can speak.

erikh2000 · 2024-03-18T22:53:52Z

I don't trust my eyes and mind to sort out the React-based state logic above. That's why I say it might be useful for you to narrow down the issue with debugging.

So for example, one thing I would verify is that the series of events is really like this:

1. recognizer is constructed with the loaded model.
2. microphone is captured and audioWorklet is running.
3. user experiences delay of some seconds before words are recognized.

Because maybe you're actually seeing something more like:

2. microphone is captured and audioWorklet is running.
3. user experiences delay of some seconds before words are recognized.
1. recognizer is constructed with the loaded model.
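
One way to check which of those two orderings is really happening, besides breakpoints, is to timestamp each step. The sketch below is only a debugging aid; `createEventLog` and the event names are hypothetical, and the `mark()` calls would go wherever the corresponding step completes in the app.

```javascript
// Hypothetical debugging helper: records when each setup step happens.
// `now` is injectable so the helper can be exercised with a fake clock.
function createEventLog(now = () => Date.now()) {
  const events = [];
  return {
    mark(name) {
      events.push({ name, at: now() });
    },
    // Names in the order they were recorded.
    order() {
      return events.map((event) => event.name);
    },
    // Milliseconds between the first occurrences of two events,
    // or undefined if either one was never recorded.
    elapsedMs(from, to) {
      const start = events.find((event) => event.name === from);
      const end = events.find((event) => event.name === to);
      return start && end ? end.at - start.at : undefined;
    },
  };
}

// Usage: call mark() at the end of each step, then inspect the log.
const log = createEventLog();
log.mark("recognizer-constructed");
log.mark("worklet-running");
log.mark("first-result");
console.log(log.order());
console.log(log.elapsedMs("worklet-running", "first-result"));
```

If "worklet-running" shows up before "recognizer-constructed", the delay is in the model and recognizer setup rather than in recognition itself.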

I often find unexpected behavior around useEffect() and useState(). I'm sure it's based on my ignorance of how React works. But I'm just saying that I've been surprised a thousand times.

If it helps in any way, here is my code handling initialization: https://github.com/erikh2000/sl-web-speech/blob/main/src/speech/Recognizer.ts

DanielUsselmann · 2024-03-19T23:06:26Z

Thanks!
Do you know how to stop the recognizer from "recognizing"?
In my opinion, it should normally stop when a component is unmounted, right? Or do I have to do something manually?

erikh2000 · 2024-03-20T00:22:10Z

> Do you know how to stop the recognizer from "recognizing"?

One way is to stop sending samples to the recognizer via .acceptWaveform(). So in your audioworklet, you can check a "muted" flag and just not send samples if the flag is set. That should cut way down on CPU. And it also has a nice guarantee that the recognizer isn't continuing to listen in some unexpected way that will make your users upset about privacy.
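
A sketch of the "muted" flag, written as a plain function so it can be tried outside a worklet. `createSampleGate` and the `send` callback are hypothetical; in a real AudioWorkletProcessor, `setMuted` would presumably be driven by a message received on the processor's port, and `push` would be called from `process()` in place of the code that currently forwards samples.

```javascript
// Hypothetical gate: forwards audio chunks to `send` unless muted.
// `send` stands in for whatever currently delivers samples to the recognizer.
function createSampleGate(send) {
  let muted = false;
  return {
    setMuted(value) {
      muted = Boolean(value);
    },
    // Returns true if the chunk was forwarded, false if it was dropped.
    push(samples) {
      if (muted) return false; // nothing reaches the recognizer while muted
      send(samples);
      return true;
    },
  };
}

// Usage with a stand-in sender that just counts samples.
let forwarded = 0;
const gate = createSampleGate((samples) => {
  forwarded += samples.length;
});
gate.push(new Float32Array(128)); // forwarded
gate.setMuted(true);
gate.push(new Float32Array(128)); // dropped
console.log(forwarded);
```

Dropping chunks at the source keeps the check in one place, so there is no path by which audio reaches the recognizer while the flag is set.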

> In my opinion, it should normally stop when a component is unmounted, right? Or do I have to do something manually?

This is a really good question. I started to type an answer, and realized I was guessing beyond my knowledge.

With the combination of web workers, WASM, and React component lifecycle, I'm just not 100% sure. A hypothesis is that 1. all execution in the recognizer stops when you stop calling .acceptWaveform() and 2. memory of the recognizer instance is freed by garbage collection some time after your component unmounts, if your recognizer instance is stored in a variable scoped to the component and nowhere else.

On point #2, I prefer to keep the recognizer instance in a module-scoped variable that isn't bound to a React component. In this way, I can reuse the same recognizer instance even if the user exits a screen and returns. (My app has multiple screens, each rendered by a separate component.) By module-scoped variable, I mean a declaration of the recognizer instance like:

`let recognizer = null;

export function initRecognizer() {
  //...setup omitted
  recognizer = new KaldiRecognizer(...etc...);
}`
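
To show how the reuse across screens could work, here is a slightly fuller sketch of the same pattern. `getRecognizer` and the `createRecognizer` factory are hypothetical names; the factory stands in for the omitted setup (loading the model and constructing the recognizer).

```javascript
// Module-scoped: no component owns it, so it survives unmounts.
let recognizerPromise = null;

// `createRecognizer` is a hypothetical factory supplied by the caller.
// It may be sync or async; it runs at most once per successful setup.
function getRecognizer(createRecognizer) {
  if (recognizerPromise === null) {
    recognizerPromise = Promise.resolve()
      .then(createRecognizer)
      .catch((error) => {
        recognizerPromise = null; // allow a retry after a failed setup
        throw error;
      });
  }
  return recognizerPromise;
}

// Usage: every screen awaits the same promise, so setup happens once.
getRecognizer(async () => ({ name: "stand-in recognizer" })).then((recognizer) => {
  console.log(recognizer.name);
});
```

Caching the promise instead of the instance means two screens that mount while setup is still in progress share one setup run instead of starting two.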

DanielUsselmann · 2024-04-07T19:47:49Z

In my case I need the recognizer to be React state, so let is not an option.
I can call some functions on my const [recognizer, setRecognizer] = useState(), but I can't find anything to stop the recognizer.
How would you handle that?
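
One possible approach, offered as a sketch rather than a confirmed answer: keeping the recognizer in React state does not prevent stopping the audio that feeds it. The helper below uses only standard Web Audio and Media Capture calls (`disconnect()`, `MediaStreamTrack.stop()`, `AudioContext.close()`); `stopAudioPipeline` itself and the shape of its argument are hypothetical, and it assumes the objects created in startRecognitionStream were kept somewhere reachable, for example in a ref. It could be called from the cleanup function returned by a useEffect().

```javascript
// Hypothetical teardown helper: stops feeding audio to the recognizer.
// Every field is optional so it is safe to call on a half-built pipeline.
async function stopAudioPipeline({ mediaStream, audioContext, source, processorNode } = {}) {
  // Break the audio graph so the worklet no longer receives input.
  source?.disconnect();
  processorNode?.disconnect();
  // Release the microphone; this also clears the browser's recording indicator.
  mediaStream?.getTracks().forEach((track) => track.stop());
  // Closing an already-closed AudioContext rejects, so check the state first.
  if (audioContext && audioContext.state !== "closed") {
    await audioContext.close();
  }
}

// Usage with nothing set up yet: resolves without doing anything.
stopAudioPipeline().then(() => console.log("pipeline stopped"));
```

This only covers the audio side. Whether the recognizer or model needs an explicit teardown call of its own is a question for the library's documentation.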
