A question about how to accurately remove silence segments? #167

shenpengjie · 2023-05-11T10:33:14Z

When processing data, I take random-length chunks from each utterance and score them with the activitydetector() function in audiolib.py. Unfortunately, it is quite likely that a randomly sampled chunk is completely silent. For example, from the utterance "book_07839_chp_0009_reader_06500_4.wav" I sampled a segment of about 3.8 seconds that turned out to be entirely silent, yet activitydetector() returned 0.98 for it, so the score does not let me identify and exclude these bad cases. In such situations, how should I set the "energy_thresh" and "target_level" parameters of activitydetector()? Alternatively, how should I modify my random sampling strategy?
By the way, when processing the data, I only keep speech segments with activitydetector(audio) > 0.6.
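
One possible change to the sampling strategy, sketched below, is to check the raw level of each chunk before passing it to activitydetector() and to resample when the chunk is too quiet. This assumes the high score comes from the chunk being level-normalized before the energy threshold is applied, which I have not verified against audiolib.py. The names rms_dbfs and sample_active_chunk, the -50 dBFS floor, and the retry count are all hypothetical and would need tuning on real data.

```python
import numpy as np


def rms_dbfs(chunk, eps=1e-12):
    """Raw RMS level in dBFS (full scale = 1.0), computed without any normalization."""
    chunk = np.asarray(chunk, dtype=np.float64)
    return 10.0 * np.log10(np.mean(chunk ** 2) + eps)


def sample_active_chunk(audio, chunk_len, min_dbfs=-50.0, max_tries=20, rng=None):
    """Draw random chunks until one is louder than min_dbfs.

    Returns (start_index, chunk), or None if no acceptable chunk is found.
    min_dbfs and max_tries are illustrative defaults, not recommended values.
    """
    if rng is None:
        rng = np.random.default_rng()
    audio = np.asarray(audio, dtype=np.float64)
    if len(audio) < chunk_len:
        return None
    for _ in range(max_tries):
        # Upper bound is exclusive, so the last valid start is len(audio) - chunk_len.
        start = int(rng.integers(0, len(audio) - chunk_len + 1))
        chunk = audio[start:start + chunk_len]
        if rms_dbfs(chunk) >= min_dbfs:
            return start, chunk
    return None
```

A chunk that passes this check could then still be filtered with activitydetector(chunk) > 0.6 as before; the raw-level check only screens out chunks that are silent in absolute terms.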
