
UnboundLocalError: local variable 'feas' referenced before assignment #18

Open
imKarthikeyanK opened this issue Jan 29, 2019 · 2 comments

@imKarthikeyanK

imKarthikeyanK commented Jan 29, 2019

While trying to execute the command below:

python spk-diarization2.py /mnt/c/users/karthikeyan/Downloads/proper.wav

I am getting:

Reading file: /mnt/c/users/karthikeyan/Downloads/proper.wav
Writing output to: stdout
Using feacat from: /home/userk/speaker-diarization/feacat
Writing temporal files in: /tmp
Writing lna files in: /home/userk/speaker-diarization/lna
Writing exp files in: /home/userk/speaker-diarization/exp
Writing features in: /home/userk/speaker-diarization/fea
Performing exp generation and feacat concurrently
Traceback (most recent call last):
File "./generate_exp.py", line 37, in <module>
from docopt import docopt
ImportError: No module named docopt
Calling voice-detection2.py
Reading recipe from: /tmp/initrypiaG.recipe
Reading .exp files from: /home/userk/speaker-diarization/exp
Writing output to: /tmp/vadHJVgzE.recipe
Sample rate set to: 125
Minimum speech turn duration: 0.5 seconds
Minimum nonspeech between-turns duration: 1.5 seconds
Segment before expansion set to: 0.0 seconds
Segment end expansion set to: 0.0 seconds
Error, /home/userk/speaker-diarization/exp/proper.exp does not exist
Waiting for feacat to end.
Calling spk-change-detection.py
Reading recipe from: /tmp/vadHJVgzE.recipe
Reading feature files from: /home/userk/speaker-diarization/fea
Feature files extension: .fea
Writing output to: /tmp/spkcM3EdlF.recipe
Conversion rate set to frame rate: 125.0
Using a growing window
Deltaws set to: 0.096 seconds
Using BIC as distance measure, lambda = 1.0
Window size set to: 1.0 seconds
Window step set to: 3.0 seconds
Threshold distance: 0.0
Useful metrics for determining the right threshold:

Maximum between windows distance: 0
Total windows: 0
Total segments: 0
Maximum between detected segments distance: 0
Total detected speaker changes: 0
Calling spk-clustering.py
('===', '/tmp/spkcM3EdlF.recipe')
Reading recipe from: /tmp/spkcM3EdlF.recipe
Reading feature files from: /home/userk/speaker-diarization/fea
Feature files extension: .fea
Writing output to: stdout
Conversion rate set to frame rate: 125.0
Using hierarchical clustering
Using BIC as distance measure, lambda = 1.3
Threshold distance: 0.0
Maximum speakers: 0
('::::::::::::::::::::::::::::::::::', 0)
Initial cluster with: 0 speakers
Traceback (most recent call last):
File "./spk-clustering.py", line 432, in <module>
process_recipe(parsed_recipe, speakers, outf)
File "./spk-clustering.py", line 293, in process_recipe
spk_cluster_m(feas[1], recipe, speakers, outf, dist, segf)
UnboundLocalError: local variable 'feas' referenced before assignment

I tried looking into spk-clustering.py: len(recipe) is 0, so feas never gets assigned.
Thank you,
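For anyone hitting this, the traceback pattern is just Python's scoping rules: feas is assigned only inside the loop over the recipe entries, and with an empty recipe the loop body never runs. A minimal sketch (a hypothetical simplification, not the actual spk-clustering.py code):

```python
def process_recipe(recipe):
    # 'feas' is bound only inside the loop body; with an empty recipe
    # the loop never executes, so the later reference fails with
    # UnboundLocalError, exactly as in the traceback above.
    for entry in recipe:
        feas = entry
    return feas

try:
    process_recipe([])          # empty recipe reproduces the error
except UnboundLocalError as err:
    print("reproduced:", err)

print(process_recipe(["seg1", "seg2"]))  # non-empty recipe works
```

So the error is a symptom, not the cause: the real problem is that the pipeline produced an empty recipe (here because the .exp files were never generated).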

@antoniomo
Collaborator

Do you have the docopt dependency, and are you using python2?
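(For reference, a minimal way to probe for a missing module without raising, sketched in Python 3 syntax; the project itself targets Python 2, where the usual fix is simply `pip2 install docopt`.)

```python
import importlib.util

# find_spec returns None when a top-level module is not installed,
# which is the condition behind "ImportError: No module named docopt".
def has_module(name):
    return importlib.util.find_spec(name) is not None

print(has_module("docopt"))  # False means the dependency is missing
```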

@imKarthikeyanK
Author

imKarthikeyanK commented Jan 29, 2019

Yeah, thank you. I was missing the docopt dependency. Now I'm getting this result:

userk@PSSHSRDT034:~/speaker-diarization$ python spk-diarization2.py /mnt/c/users/karthikeyan/Downloads/proper.wav
Reading file: /mnt/c/users/karthikeyan/Downloads/proper.wav
Writing output to: stdout
Using feacat from: /home/userk/speaker-diarization/feacat
Writing temporal files in: /tmp
Writing lna files in: /home/userk/speaker-diarization/lna
Writing exp files in: /home/userk/speaker-diarization/exp
Writing features in: /home/userk/speaker-diarization/fea
Performing exp generation and feacat concurrently
tokenpass: ./VAD/tokenpass/test_token_pass
Reading recipe: /tmp/initzDxEk1.recipe
Using model: ./hmms/mfcc_16g_11.10.2007_10
Writing .lna files in: /home/userk/speaker-diarization/lna
Writing .exp files in: /home/userk/speaker-diarization/exp
Processing file 1/1
Input: /mnt/c/users/karthikeyan/Downloads/proper.wav
Output: /home/userk/speaker-diarization/lna/proper.lna
FAN OUT: 0 nodes, 0 arcs
FAN IN: 0 nodes, 0 arcs
Prefix tree: 3 nodes, 6 arcs
WARNING: No tokens in final nodes. The result will be incomplete. Try increasing beam.
Calling voice-detection2.py
Reading recipe from: /tmp/initzDxEk1.recipe
Reading .exp files from: /home/userk/speaker-diarization/exp
Writing output to: /tmp/vadTalccO.recipe
Sample rate set to: 125
Minimum speech turn duration: 0.5 seconds
Minimum nonspeech between-turns duration: 1.5 seconds
Segment before expansion set to: 0.0 seconds
Segment end expansion set to: 0.0 seconds
Waiting for feacat to end.
Calling spk-change-detection.py
Reading recipe from: /tmp/vadTalccO.recipe
Reading feature files from: /home/userk/speaker-diarization/fea
Feature files extension: .fea
Writing output to: /tmp/spkcxxYN9G.recipe
Conversion rate set to frame rate: 125.0
Using a growing window
Deltaws set to: 0.096 seconds
Using BIC as distance measure, lambda = 1.0
Window size set to: 1.0 seconds
Window step set to: 3.0 seconds
Threshold distance: 0.0
Useful metrics for determining the right threshold:

Average between windows distance: -789.417532303
Maximum between windows distance: 35.230502772707496
Minimum between windows distance: -1378.4592347022503
Total windows: 23
Total segments: 2
Average between detected segments distance: 56.7217946043
Maximum between detected segments distance: 56.72179460426196
Minimum between detected segments distance: 56.72179460426196
Total detected speaker changes: 1
Calling spk-clustering.py
('===', '/tmp/spkcxxYN9G.recipe')
Reading recipe from: /tmp/spkcxxYN9G.recipe
Reading feature files from: /home/userk/speaker-diarization/fea
Feature files extension: .fea
Writing output to: stdout
Conversion rate set to frame rate: 125.0
Using hierarchical clustering
Using BIC as distance measure, lambda = 1.3
Threshold distance: 0.0
Maximum speakers: 0
Initial cluster with: 2 speakers
Merging: 1 and 2 distance: -2548.5851870160886
Final speakers: 1
Useful metrics for determining the right threshold:

Maximum between segments distance: 0
Minimum between segments distance: -2548.5851870160886
Total segments: 2
Total detected speakers: 1

From this, how can I get the number of audio segments for each speaker (e.g. speaker 1 has around 5 audio segments) and their durations (i.e. from where to where I should crop the audio)? Also, the wav file has two speakers, but it shows "Total detected speakers: 1".
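One way to get per-speaker segments is to parse the recipe lines the pipeline emits. A rough sketch, assuming each line carries key=value fields named start-time, end-time and speaker (these field names are an assumption; check them against your actual output):

```python
from collections import defaultdict

# Hypothetical parser for diarization output lines of the form
# "audio=... start-time=... end-time=... speaker=...".
def segments_per_speaker(lines):
    segs = defaultdict(list)
    for line in lines:
        fields = dict(f.split("=", 1) for f in line.split() if "=" in f)
        if "speaker" in fields:
            segs[fields["speaker"]].append(
                (float(fields["start-time"]), float(fields["end-time"])))
    return dict(segs)

example = [
    "audio=proper.wav start-time=0.0 end-time=4.5 speaker=speaker_1",
    "audio=proper.wav start-time=4.5 end-time=9.2 speaker=speaker_2",
]
for spk, segs in segments_per_speaker(example).items():
    print(spk, len(segs), segs)
```

Each (start, end) pair is a region you could then crop from the wav, e.g. with sox or pydub. As for getting 1 speaker instead of 2: the log shows the two initial clusters merging at distance -2548.59 against a threshold distance of 0.0, which suggests the clustering threshold may need tuning.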
