Research - Dynamic speech reflex #4
Interesting line of thought from here. The issue that immediately pops up for me is "personality": how much of a pushover is the thing? Does it stop talking as soon as you make a sound? Does it only speak when spoken to?
Abstractly, it's an event that fires based on some activation threshold. The threshold should be configurable!
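A minimal sketch of that on the Node side, assuming a plain EventEmitter; the event names, config shape, and the idea of feeding it one activation score per frame are my assumptions, not anything in the repo:

```typescript
import { EventEmitter } from "events";

// Hypothetical config: both knobs are the tunable "personality".
interface ReflexConfig {
  threshold: number; // activation level above which the user counts as speaking
  holdMs: number;    // silence needed before we consider the floor open
}

class SpeechReflex extends EventEmitter {
  private lastAboveThreshold = 0;
  private floorOpen = false;

  constructor(private readonly config: ReflexConfig) {
    super();
  }

  // Feed one activation score in [0, 1] per audio frame.
  update(activation: number, nowMs: number = Date.now()): void {
    if (activation >= this.config.threshold) {
      this.lastAboveThreshold = nowMs;
      this.floorOpen = false;
      this.emit("speech"); // e.g. interrupt: stop talking when the user makes a sound
    } else if (!this.floorOpen && nowMs - this.lastAboveThreshold >= this.config.holdMs) {
      this.floorOpen = true;
      this.emit("floorOpen"); // fires once per silence span
    }
  }
}
```

A "pushover" personality is then just a low threshold and a short holdMs.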
Thoughts on reusing this? https://github.com/ggerganov/whisper.cpp/blob/1d716d6e34f3f4ba57bd9706a9258a0bdb008153/examples/stream/stream.cpp#L584-L592 If that looks good, it should be easy enough to modify the current audio stream and fire an event (actually, what event for Talk?)
This is just a high-pass filter; it would probably fire too often, on almost any kind of noise. I think we need something more specific, ML-based. But we can use the same logic, just replacing this:

```cpp
#include <cmath>   // for M_PI
#include <vector>

// First-order high-pass filter from whisper.cpp's stream example, applied in place.
void high_pass_filter(std::vector<float> & data, float cutoff, float sample_rate) {
    const float rc = 1.0f / (2.0f * M_PI * cutoff);
    const float dt = 1.0f / sample_rate;
    const float alpha = dt / (rc + dt);

    float y = data[0];

    for (size_t i = 1; i < data.size(); i++) {
        y = alpha * (y + data[i] - data[i - 1]);
        data[i] = y;
    }
}
```
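For reference, the detection logic around that filter can stay dead simple even if the scorer later becomes ML-based. A rough TypeScript sketch, loosely modeled on the energy check in whisper.cpp's stream example (the function name and parameters here are invented, not whisper.cpp's API):

```typescript
// Hypothetical energy-based end-of-speech check: compare the energy of the
// most recent slice against the energy of the whole window.
function detectSpeechEnd(
  samples: Float32Array, // mono PCM in [-1, 1], already high-pass filtered
  sampleRate: number,
  lastMs: number,        // size of the trailing slice to inspect
  threshold: number      // e.g. 0.6; lower values fire more eagerly
): boolean {
  const nLast = Math.min(samples.length, Math.floor((sampleRate * lastMs) / 1000));
  if (samples.length === 0 || nLast === 0) return false;

  let energyAll = 0;
  let energyLast = 0;
  for (let i = 0; i < samples.length; i++) {
    energyAll += Math.abs(samples[i]);
    if (i >= samples.length - nLast) energyLast += Math.abs(samples[i]);
  }
  energyAll /= samples.length;
  energyLast /= nLast;

  // If the trailing slice is much quieter than the window as a whole,
  // the speaker has probably stopped: time to fire the event.
  return energyLast <= threshold * energyAll;
}
```

An ML-based scorer could replace detectSpeechEnd without changing anything downstream.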
We could also use [BLANK_AUDIO] as a response reflex when it is transcribed. This might require shrinking the buffer size to reduce latency; I'm not sure how that is controlled right now.
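A sketch of that reflex, assuming we see the transcript segment by segment as plain strings (the segment plumbing here is hypothetical):

```typescript
// whisper.cpp emits "[BLANK_AUDIO]" for silent windows, so a blank segment
// arriving right after non-blank ones can be treated as "floor open".
let sawSpeech = false;

function onTranscriptSegment(segment: string, respond: () => void): void {
  if (segment.includes("[BLANK_AUDIO]")) {
    if (sawSpeech) {
      sawSpeech = false;
      respond(); // the user spoke and has now gone quiet
    }
  } else if (segment.trim().length > 0) {
    sawSpeech = true;
  }
}
```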
@choombaa I merged your voice detection.
Keeping the issue open, as we might make it a bit more involved.
Right now, I'm planning to initiate the response with a "vim pedal", aka a hotkey, because knowing when to respond is difficult. https://github.com/yacineMTB/talk/blob/master/index.ts#L108-L135
When humans speak to each other, we use intonation and other signals to let the other person know when the floor is open, and we also use them to signal that we want the floor.
For now, we just need some naive event that fires when the speaker stops speaking.
Is this something that we can get out of whisper.cpp's embeddings? Possibly a classifier trained on top of the embeddings?
Also, I wouldn't shy away from running a Python sidecar that takes requests from the main Node process.
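If we go that route, the Node side could stay tiny. A sketch, with the endpoint, port, and response shape all invented for illustration (assumes Node 18+ for the global fetch):

```typescript
// Hypothetical client for a Python sidecar that scores audio chunks:
// POST raw PCM bytes, get back an activation score in [0, 1].
async function scoreChunk(chunk: Buffer): Promise<number> {
  const res = await fetch("http://127.0.0.1:8077/activation", {
    method: "POST",
    headers: { "content-type": "application/octet-stream" },
    body: chunk,
  });
  const { activation } = (await res.json()) as { activation: number };
  return activation;
}
```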
What would be awesome:
Figuring out how to get either whisper.cpp, or some sidecar, to take a byte stream and output a continuous "activation function" based on the likelihood that it's time to respond.
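Tying it together, reusing the hypothetical SpeechReflex and scoreChunk sketches from the comments above, that loop could look like:

```typescript
import { Readable } from "stream";

// Hypothetical glue: mic byte stream -> sidecar activation score -> reflex events.
async function runReflexLoop(mic: Readable, reflex: SpeechReflex): Promise<void> {
  for await (const chunk of mic) {
    reflex.update(await scoreChunk(chunk as Buffer));
  }
}

// Usage: reflex.on("floorOpen", () => { /* start responding */ });
```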