🗺️ Vision / multi-modal #495
Labels
enhancement
New feature or request
language: js
Related to JavaScript or Typescript integration
language: python
Related to Python integration
roadmap
Comments
mikeldking added the enhancement (New feature or request) and triage (Issues that require triage) labels on May 23, 2024
mikeldking added the language: js (Related to JavaScript or Typescript integration) and language: python (Related to Python integration) labels on May 23, 2024
Example vLLM client that should also support vision (VLM_MODEL, VLLM_URL, VLLM_HEALTHCHECK, VLLM_READY_TIMEOUT, ALLOWED_IMAGE_TYPES, and wait_for_ready are defined elsewhere in the calling application):

```python
import base64

import filetype
import httpx


class VLMClient:
    def __init__(self, vlm_model: str = VLM_MODEL, vllm_url: str = VLLM_URL):
        self._vlm_model = vlm_model
        self._vllm_client = httpx.AsyncClient(base_url=vllm_url)
        if VLLM_HEALTHCHECK:
            wait_for_ready(
                server_url=vllm_url,
                wait_seconds=VLLM_READY_TIMEOUT,
                health_endpoint="health",
            )

    @property
    def vlm_model(self) -> str:
        return self._vlm_model

    async def __call__(
        self,
        prompt: str,
        image_bytes: bytes | None = None,
        image_filetype: filetype.Type | None = None,
        max_tokens: int = 10,
    ) -> str:
        # Assemble the message content
        message_content: list[dict[str, str | dict]] = [
            {
                "type": "text",
                "text": prompt,
            }
        ]
        if image_bytes is not None:
            if image_filetype is None:
                image_filetype = filetype.guess(image_bytes)
            if image_filetype is None:
                raise ValueError("Could not determine image filetype")
            if image_filetype not in ALLOWED_IMAGE_TYPES:
                raise ValueError(
                    f"Image type {image_filetype} is not supported. "
                    f"Allowed types: {ALLOWED_IMAGE_TYPES}"
                )
            image_b64 = base64.b64encode(image_bytes).decode("utf-8")
            message_content.append(
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:{image_filetype.mime};base64,{image_b64}",
                    },
                }
            )
        # Put together the request payload
        payload = {
            "model": self.vlm_model,
            "messages": [{"role": "user", "content": message_content}],
            "max_tokens": max_tokens,
            # "logprobs": True,
            # "top_logprobs": 1,
        }
        response = await self._vllm_client.post("/v1/chat/completions", json=payload)
        response_json = response.json()
        response_text: str = (
            response_json.get("choices")[0].get("message", {}).get("content", "").strip()
        )
        return response_text
```
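For reference, a minimal usage sketch of the client above; the image path, prompt, and token budget are illustrative only, and the module-level constants are assumed to be configured as in the snippet:

```python
import asyncio
from pathlib import Path


async def main() -> None:
    client = VLMClient()  # uses the VLM_MODEL / VLLM_URL defaults assumed above
    image_bytes = Path("example.png").read_bytes()  # placeholder image file
    answer = await client(
        prompt="Describe this image in one sentence.",
        image_bytes=image_bytes,
        max_tokens=50,
    )
    print(answer)


asyncio.run(main())
```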
mikeldking changed the title from "[feature request]. Capture OpenAI gpt 4o image messages" to "🗺️ Vision / multi-modal" on May 23, 2024
Closing as completed since image support is complete. Audio will come as part of the OpenAI realtime instrumentation.
GPT-4o introduces a new message type that contains images, encoded either as a URL or as a base64 string.
Example: https://platform.openai.com/docs/guides/vision
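For context, a minimal sketch of the message shape described in that guide, using the openai Python client; the model name, image URL, and prompt are placeholders, and an image can equivalently be passed as a base64 data URL:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    # A base64 data URL ("data:image/png;base64,...") works here too.
                    "image_url": {"url": "https://example.com/cat.png"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```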
Milestone 1
Milestone N
Tracing
Instrumentation
Testing
Image tracing
Context Attributes
Config
Suppress Tracing
UI / Javascript
Testing
Documentation
Evals