I think RealtimeRelay should use RealtimeAPI instead of RealtimeClient #462
Comments
What exactly are you suggesting? Which file are you changing, and where?
The RealtimeRelay class in relay.js should be changed so that it uses RealtimeAPI instead of RealtimeClient.
Below is ChatGPT o1-preview's analysis of the difference between RealtimeAPI and RealtimeClient. I want to know if I need to implement something at the relay level where I would need RealtimeClient's interfaces, such as function calling on the server side, and to keep the state of conversations in a database to preserve the history of multiple chat sessions with a list of messages exchanged between a user and the system.

**With this change would we be limited to browser-only usage of RealtimeClient and whatever it can offer? Can anyone explain what is happening, where, and when it would be appropriate (use cases) to have RealtimeClient present on the relay server?**

RealtimeAPI vs. RealtimeClient

Key Features and Functionalities:

- Allows developers to configure and manage sessions, including setting modalities (e.g., text, audio), instructions, voice options, and temperature settings.
- Maintains the state of conversations across multiple interactions, enabling context preservation and multi-turn dialogues.
- Integrates audio transcription capabilities, enabling the processing of audio inputs (speech-to-text) and generating audio outputs (text-to-speech).
- Includes mechanisms for detecting when a user has finished speaking, which is essential for natural conversational flow in voice-based applications.
- Allows the incorporation of custom tools or functions that the assistant can call during a conversation.
- Provides methods and structures to handle various content types, such as text messages, audio clips, and function calls.
- Includes advanced event handling to manage conversation flow, response generation, and error handling.

Complex Conversational Applications:

- Ideal for applications that require advanced conversation management, such as virtual assistants, chatbots, or customer service agents.
- Suitable for applications that need to handle multiple input and output modalities, including text and audio.
- Essential for applications where maintaining context across multiple exchanges enhances the user experience.
- Beneficial for applications that require dynamic functionality, where the assistant can perform actions or provide information by invoking custom tools or functions.

The RealtimeClient serves as a comprehensive solution for developers aiming to build sophisticated conversational interfaces using the OpenAI Realtime API. By abstracting lower-level details and providing robust features for session and conversation management, it empowers developers to focus on crafting engaging and effective user experiences without worrying about the complexities of direct API interactions.
@Phodaie Have you tried this in practice? What's been your experience? I just ran into an issue: the relay server works fine in a development environment, but after deploying to prod the conversation breaks down, and I believe the WebSocket connection is stalling. Have you tried this swap, and how did it go? I might try it too as a debugging idea!
I have the following version of RealtimeRelay that uses RealtimeAPI instead of RealtimeClient. RealtimeClient has additional functionality (e.g. state management) that is not needed in the relay.
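To illustrate why the relay needs no conversation state, here is a minimal sketch of the pass-through wiring. It is not the commenter's code. `wireRelay` is a hypothetical helper, and the `upstream` object is assumed to expose `on(eventName, handler)`, `send(eventType, payload)`, `isConnected()`, and `disconnect()`, and to emit `server.*` and `close` events, in the style of RealtimeAPI; check these names against the library before relying on them.

```javascript
// Hypothetical helper: wires one browser socket to one upstream connection.
// Assumption: `upstream` behaves like RealtimeAPI (on/send/isConnected/disconnect).
// Assumption: `browserSocket` behaves like a `ws` WebSocket (on/send/close).
function wireRelay(browserSocket, upstream, log = () => {}) {
  // Client events that arrive before the upstream connection is ready.
  const pending = [];

  const forward = (raw) => {
    let event;
    try {
      event = JSON.parse(String(raw));
    } catch (err) {
      log(`Dropping unparseable message: ${err.message}`);
      return;
    }
    // Pass the event through unchanged; no history is kept here.
    upstream.send(event.type, event);
  };

  // Upstream -> browser: relay every server event as-is.
  upstream.on('server.*', (event) => browserSocket.send(JSON.stringify(event)));
  upstream.on('close', () => browserSocket.close());

  // Browser -> upstream: queue until connected, then forward directly.
  browserSocket.on('message', (raw) => {
    if (upstream.isConnected()) forward(raw);
    else pending.push(raw);
  });
  browserSocket.on('close', () => upstream.disconnect());

  // Call this once the upstream connection has been established.
  return function flush() {
    while (pending.length) forward(pending.shift());
  };
}
```

Under these assumptions, the relay only parses and forwards events. Anything that needs the conversation itself, such as server-side function calling or persisting message history to a database, would have to be added on top, which is where RealtimeClient's state tracking could become relevant again.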