Gemini agent is not working #1242

Open
kikoncuo opened this issue Dec 16, 2024 · 2 comments
Labels
bug Something isn't working

Comments

kikoncuo commented Dec 16, 2024

I was testing the voice pipeline agent that uses Gemini as the LLM through the OpenAI API, with Google services for the other pipeline components (not the Gemini realtime agent, which is still being worked on):
https://github.com/livekit/agents/blob/main/examples/voice-pipeline-agent/gemini_voice_agent.py

I downloaded it, installed all dependencies, and created a set of credentials with the correct permissions. While the welcome message is being streamed, I get this error:

2024-12-16 19:12:11,961 - ERROR livekit.agents.pipeline - Error in _recognize_task
Traceback (most recent call last):
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/utils/log.py", line 16, in async_fn_logs
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/pipeline/human_input.py", line 150, in _recognize_task
    await asyncio.gather(*tasks)
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/pipeline/human_input.py", line 120, in _audio_stream_co
    stt_stream.push_frame(ev.frame)
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/stt/stt.py", line 265, in push_frame
    self._check_not_closed()
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/stt/stt.py", line 327, in _check_not_closed
    raise RuntimeError(f"{cls.__module__}.{cls.__name__} is closed")
RuntimeError: livekit.plugins.google.stt.SpeechStream is closed {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
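
For readers unfamiliar with this failure mode: the traceback is consistent with the STT stream being closed while the audio task is still pushing frames into it. The snippet below is only a toy model of that pattern using the standard library. `ToyStream`, `push_audio`, `close_early`, and `run_demo` are hypothetical names chosen to mirror the traceback; this is not LiveKit's implementation, and the issue does not establish why the real stream was closed.

```python
import asyncio


class ToyStream:
    """Hypothetical stand-in for an STT stream; not the LiveKit class."""

    def __init__(self) -> None:
        self._closed = False
        self.frames: list[bytes] = []

    def close(self) -> None:
        self._closed = True

    def push_frame(self, frame: bytes) -> None:
        # Same kind of guard as in the traceback: pushing after close raises.
        if self._closed:
            raise RuntimeError(f"{type(self).__name__} is closed")
        self.frames.append(frame)


async def push_audio(stream: ToyStream, n_frames: int) -> None:
    for i in range(n_frames):
        stream.push_frame(bytes([i]))
        await asyncio.sleep(0)  # yield so the other task can run


async def close_early(stream: ToyStream) -> None:
    # Simulates the consumer side shutting the stream down early.
    await asyncio.sleep(0)
    stream.close()


async def run_demo() -> str:
    stream = ToyStream()
    try:
        # gather() re-raises the first exception from any of its tasks,
        # which is how the error surfaces in _recognize_task above.
        await asyncio.gather(push_audio(stream, 10), close_early(stream))
    except RuntimeError as exc:
        return str(exc)
    return "no error"
```

Calling `asyncio.run(run_demo())` returns the error message raised by the guard. In a case like this, the more useful information is usually whatever caused the stream to close in the first place, which this traceback does not show.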

This is the only error or warning I see in the logs.

Here is the full log trace in case it helps:

python main.py dev
2024-12-16 19:11:52,728 - DEBUG asyncio - Using selector: EpollSelector
2024-12-16 19:11:52,729 - DEV  livekit.agents - Watching /home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/agent
2024-12-16 19:11:53,459 - DEBUG asyncio - Using selector: EpollSelector
2024-12-16 19:11:53,465 - INFO livekit.agents - starting worker {"version": "0.12.2", "rtc-version": "0.18.2"}
2024-12-16 19:11:53,649 - INFO livekit.agents - registered worker {"id": "AW_TnnRsKzECFut", "region": "France", "protocol": 15, "node_id": "NC_OMARSEILLE1A_LzQTokkbYhhv"}
2024-12-16 19:12:08,152 - INFO livekit.agents - received job request {"job_id": "AJ_dtKi2nzgZvqh", "dispatch_id": "", "room_name": "playground-n1ow-rvVz", "agent_name": "", "resuming": false}
2024-12-16 19:12:08,995 - INFO livekit.agents - initializing job process {"pid": 91447}
2024-12-16 19:12:09,044 - INFO livekit.agents - job process initialized {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,044 - DEBUG asyncio - Using selector: EpollSelector {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,046 - INFO voice-assistant - connecting to room playground-n1ow-rvVz {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,048 - INFO livekit - livekit_ffi::server:133:livekit_ffi::server - initializing ffi server v0.12.3 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,048 - INFO livekit - livekit_ffi::cabi:36:livekit_ffi::cabi - initializing ffi server v0.12.3 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,050 - INFO livekit - livekit_api::signal_client::signal_stream:96:livekit_api::signal_client::signal_stream - connecting to wss://testomni-b7j6hodm.livekit.cloud/rtc?sdk=python&protocol=15&auto_subscribe=0&adaptive_stream=0&version=0.18.2&access_token=... {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,156 - DEBUG livekit - rustls::anchors:150:rustls::anchors - add_parsable_certificates processed 146 valid and 0 invalid certs {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,156 - DEBUG livekit - tokio_tungstenite::tls::encryption::rustls:103:tokio_tungstenite::tls::encryption::rustls - Added 146/146 native root certificates (ignored 0) {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,156 - DEBUG livekit - rustls::client::hs:73:rustls::client::hs - No cached session for DnsName("testomni-b7j6hodm.livekit.cloud") {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,156 - DEBUG livekit - rustls::client::hs:132:rustls::client::hs - Not resuming any session {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,182 - DEBUG livekit - rustls::client::hs:615:rustls::client::hs - Using ciphersuite TLS13_AES_128_GCM_SHA256 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,182 - DEBUG livekit - rustls::client::tls13:142:rustls::client::tls13 - Not resuming {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,183 - DEBUG livekit - rustls::client::tls13:381:rustls::client::tls13 - TLS1.3 encrypted extensions: [] {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,183 - DEBUG livekit - rustls::client::hs:472:rustls::client::hs - ALPN protocol is None {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,283 - DEBUG livekit - tungstenite::handshake::client:95:tungstenite::handshake::client - Client handshake done. {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,492 - INFO voice-assistant - starting voice assistant for participant identity-RxbH {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,492 - DEBUG google.auth._default - Checking /home/enrique/.config/gcloud/application_default_credentials.json for explicit credentials as part of auth process... {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,492 - DEBUG google.auth._default - Explicit credentials path /home/enrique/.config/gcloud/application_default_credentials.json is the same as Cloud SDK credentials path, fall back to Cloud SDK credentials flow... {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:09,492 - DEBUG google.auth._default - Checking Cloud SDK credentials as part of auth process... {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:10,561 - DEBUG grpc._cython.cygrpc - Using AsyncIOEngine.POLLER as I/O engine {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:11,651 - DEBUG urllib3.connectionpool - Starting new HTTPS connection (1): oauth2.googleapis.com:443 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:11,652 - DEBUG urllib3.connectionpool - Starting new HTTPS connection (1): oauth2.googleapis.com:443 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:11,848 - DEBUG urllib3.connectionpool - https://oauth2.googleapis.com:443 "POST /token HTTP/11" 200 None {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:11,849 - DEBUG urllib3.connectionpool - https://oauth2.googleapis.com:443 "POST /token HTTP/11" 200 None {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:11,961 - ERROR livekit.agents.pipeline - Error in _recognize_task
Traceback (most recent call last):
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/utils/log.py", line 16, in async_fn_logs
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/pipeline/human_input.py", line 150, in _recognize_task
    await asyncio.gather(*tasks)
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/pipeline/human_input.py", line 120, in _audio_stream_co
    stt_stream.push_frame(ev.frame)
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/stt/stt.py", line 265, in push_frame
    self._check_not_closed()
  File "/home/enrique/Enrique/Omniloy/test/googleVoiceTests/realtime-playground/.venv/lib/python3.12/site-packages/livekit/agents/stt/stt.py", line 327, in _check_not_closed
    raise RuntimeError(f"{cls.__module__}.{cls.__name__} is closed")
RuntimeError: livekit.plugins.google.stt.SpeechStream is closed {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:12,490 - DEBUG livekit.agents.pipeline - speech playout started {"speech_id": "0426d2c5853d", "pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:12,491 - INFO livekit.agents - Pipeline TTS metrics: sequence_id=0426d2c5853d, ttfb=1.922347173000162, audio_duration=2.20 {"pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:14,692 - DEBUG livekit.agents.pipeline - speech playout finished {"speech_id": "0426d2c5853d", "interrupted": false, "pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
2024-12-16 19:12:14,692 - DEBUG livekit.agents.pipeline - committed agent speech {"agent_transcript": " Hi there, this is Gemini, how can I help you today?", "interrupted": false, "speech_id": "0426d2c5853d", "pid": 91447, "job_id": "AJ_dtKi2nzgZvqh"}
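
To double-check that a log like the one above contains no other warnings or errors, a small filter can help. This is a minimal sketch that assumes the `timestamp - LEVEL logger - message` layout seen in these lines and the standard Python level names `WARNING`, `ERROR`, and `CRITICAL`; the `problems` helper is a hypothetical name, not part of LiveKit.

```python
import re

# Matches records such as:
# 2024-12-16 19:12:11,961 - ERROR livekit.agents.pipeline - Error in _recognize_task
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def problems(log_text, levels=("WARNING", "ERROR", "CRITICAL")):
    """Return (timestamp, level, logger, message) for records at the given levels."""
    found = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        # Traceback lines do not match the pattern and are skipped.
        if m and m.group("level") in levels:
            found.append((m["ts"], m["level"], m["logger"], m["msg"]))
    return found
```

Passing the saved log text to `problems(...)` returns one tuple per matching record, so an otherwise clean run with a single failure yields a one-element list.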

I'm going to test with a different STT.

kikoncuo added the bug label Dec 16, 2024
Member

davidzhao commented Dec 17, 2024

It looks like TTS worked, based on your logs?

The error came from the STT side. Are the right services enabled on your Google account? I'm unable to reproduce this on our side.

Author

kikoncuo commented Dec 17, 2024

Sorry, I meant STT. I switched to a different STT and it worked.

The APIs are enabled; I was getting a different error when they weren't.

That earlier error described the problem clearly, whereas this one is a bit harder to debug.

I'm not using Google's STT myself; I just wanted to report the issue with the demo.

A friend reproduced the problem with a newly created Google account.


2 participants