A question about how to accurately remove silence segments? #167

Open
shenpengjie opened this issue May 11, 2023 · 0 comments
When processing data, I take random-length chunks from each utterance and score them with the activitydetector() function in audiolib.py, keeping only chunks with activitydetector(audio) > 0.6. The problem is that a random chunk can easily land on a completely silent part of an utterance. For example, from "book_07839_chp_0009_reader_06500_4.wav" I sampled a chunk of approximately 3.8 seconds that turned out to be completely silent, yet activitydetector() returned 0.98 for it. With my 0.6 cutoff, this chunk is accepted as speech, so I cannot identify and exclude bad cases like this.

So my questions are:
1. In such situations, how should I set the "energy_thresh" and "target_level" parameters of activitydetector()?
2. Alternatively, how should I modify my random sampling strategy?
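
For the second question, one option is to check the absolute level of each chunk before any rescaling. The following is only a minimal sketch under stated assumptions: if activitydetector() rescales every segment to target_level before thresholding (worth verifying in audiolib.py), a silent chunk would be amplified and could score high, and a level check on the raw samples would avoid that. The helper names rms_dbfs and sample_non_silent_chunk, the -50 dBFS cutoff, and the retry count are all hypothetical and not part of audiolib.py; the waveform is assumed to be float samples in [-1, 1].

```python
import numpy as np

def rms_dbfs(audio, eps=1e-12):
    """RMS level of a float waveform in dB relative to full scale (1.0)."""
    audio = np.asarray(audio, dtype=np.float64)
    rms = np.sqrt(np.mean(audio ** 2)) if audio.size else 0.0
    return 20.0 * np.log10(max(rms, eps))

def sample_non_silent_chunk(audio, chunk_len, min_dbfs=-50.0,
                            max_tries=20, rng=None):
    """Draw random chunks until one exceeds min_dbfs; return None on failure.

    The level is measured on the raw, un-normalized samples, so a silent
    chunk cannot be amplified into looking active.
    min_dbfs is a hypothetical cutoff and should be tuned on real data.
    """
    if rng is None:
        rng = np.random.default_rng()
    audio = np.asarray(audio)
    if len(audio) < chunk_len:
        return None
    for _ in range(max_tries):
        # Upper bound is exclusive, so the chunk always fits in the signal.
        start = int(rng.integers(0, len(audio) - chunk_len + 1))
        chunk = audio[start:start + chunk_len]
        if rms_dbfs(chunk) >= min_dbfs:
            return chunk
    return None
```

A chunk that passes this check could still be scored with activitydetector(audio) > 0.6 afterwards, so the level gate only removes chunks that are silent in absolute terms.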
