
Standardize the structure of the content of a prompt / completion #1557

Open
karthikscale3 opened this issue Nov 7, 2024 · 3 comments

@karthikscale3
Contributor

Area(s)

area:gen-ai

Is your change request related to a problem? Please describe.

We need to standardize the structure of prompts and completions; the proposal is to standardize on the OpenAI structure.

Describe the solution you'd like

Use the OpenAI structure to standardize it.

Describe alternatives you've considered

No response

Additional context

This was discussed during the Nov 7th GenAI SIG meeting - https://docs.google.com/document/d/1EKIeDgBGXQPGehUigIRLwAUpRGa7-1kXB736EaYuJ2M/edit?tab=t.0

@codefromthecrypt

Please recognize that I am playing devil's advocate and have less hands-on experience than most on the team; I may also misunderstand what this issue means, as I wasn't on the last call. So take this feedback with a grain of salt if I've misread the goal from the description.

I understand the idea here is to make the GenAI semantics for log events (i.e. request/response formats) the same as OpenAI's, so you can refer to the OpenAI OpenAPI spec when in doubt about the schema. This could be helpful for those doing indexing or search.
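
For concreteness, here is a rough sketch of what a chat-completion log event body could look like if it were shaped after OpenAI's chat completions response. This is illustrative only; the field names mirror the OpenAI response shape, not any finalized semantic convention.

```python
# Illustrative only: field names mirror OpenAI's chat completions response,
# not a finalized OpenTelemetry GenAI semantic convention.
completion_event_body = {
    "id": "chatcmpl-123",
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello!"},
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}
```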

My first thought is that we should solicit thoughts from folks who maintain production systems that attempt similar OpenAI normalization for portability, e.g. litellm (@ishaan-jaff) or vLLM (@simon-mo). Even though these projects aren't driven by observability, they have many practical intersections.

Personally, I can see a lot of value in continued investment in semantics that relate to OpenAI, and I have also noticed OpenAI-based portability mechanisms pop up, one of which is discussed below and could be pulled into another issue. We have at least a couple of choices:

  • continue to refine OpenAI semantic specifications as one option for structures including completions, embeddings, etc.
  • adopt OpenAI semantic specifications as the only prompt/completion (log event) structure.

There are some concerns we'd want to visit, both technically and from a neutrality standpoint, if we went for the latter:

  • what happens when openai changes its structure (think v3)?
  • would we also follow the same with normal attributes, or do this only for log structures?
  • if we were to do this, should we not, for the sake of OTel vendor neutrality, want buy-in from other services such as Bedrock, Anthropic, etc.? What sort of "cloud quorum" would make this a vendor-neutral decision?

Practicalities of openai portability via extra_body

I actually found this issue because I was wondering how we would model extra_body, which is already in use to tunnel parameters to other clouds or to attach configuration such as guardrails. I was looking to see whether we plan to edit OpenAI's semantics (here) to support it, and what to do about data-overlap concerns (since the extension lives inside the body itself). If this is totally independent, ask me to kick this part to a different issue.

For example, in our Python test data we use OpenAI's extra_body field, which helps pass extra parameters through for various reasons. I have seen it used in Azure OpenAI, LangChain, vLLM, and litellm code or issues and can cite examples as needed. However helpful it is for extending requests, extra_body is undocumented and not in the OpenAI OpenAPI spec.

Interestingly in litellm/langfuse, I ran into an example of using extra_body for trace ID propagation!
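
As an illustration of the pattern (not taken from our test data), the openai Python client accepts an extra_body argument that it merges into the request payload. The guardrail and metadata keys below are hypothetical stand-ins for the kinds of vendor- or proxy-specific parameters seen in the wild, including the trace-ID propagation trick mentioned above.

```python
from openai import OpenAI

# Assumed setup: an OpenAI-compatible proxy (e.g. a litellm deployment)
# listening locally. The extra_body keys are hypothetical stand-ins for
# vendor/proxy parameters; they are not documented OpenAI fields.
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-anything")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "guardrailConfig": {"guardrailIdentifier": "example-guardrail"},
        "metadata": {"trace_id": "0af7651916cd43dd8448eb211c80319c"},
    },
)
print(response.choices[0].message.content)
```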

@karthikscale3
Contributor Author

Thanks @codefromthecrypt for sharing your thoughts. We should definitely solicit thoughts from code owners of other related projects. For context, I created this issue for tracking since we couldn't find volunteers to work on this during our last SIG call. But in terms of standardizing on the OpenAI spec, in my humble opinion it's a decent medium-term strategy for us, for the following reasons:

  • LLM service providers and model providers are starting to converge towards the OpenAI spec, for example:
  1. https://github.com/aws-samples/bedrock-access-gateway - AWS allowing devs to use Bedrock models via the openai client.
  2. Google allowing access to Gemini models via the openai client - https://developers.googleblog.com/en/gemini-is-now-accessible-from-the-openai-library/
  3. A wide range of open-source models, like the Mistral and Llama families, are mostly consumed through the openai client.
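
Both of the hosted examples above amount to pointing the standard openai client at a different base_url. A minimal sketch (the Gemini endpoint URL is taken from the linked blog post and may change; a Bedrock access gateway deployment would use whatever URL the gateway is hosted at):

```python
from openai import OpenAI

# Gemini's OpenAI-compatible endpoint, per the linked Google blog post
# (URL may change). Swap base_url/api_key/model for a Bedrock access
# gateway deployment or any other OpenAI-compatible backend.
client = OpenAI(
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key="GEMINI_API_KEY",
)

response = client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```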

The main concern right now is the other closed model providers, i.e. Anthropic, Cohere, etc., which have their own clients for accessing their models. While those clients resemble OpenAI's client in many ways, they are also quite different when it comes to certain API parameters, data structures, etc. This is where proxies like litellm come into the picture. The good thing about proxies is that the litellm client, for instance, looks very similar to the OpenAI spec. In fact, the litellm OTel instrumentation code we had created (Langtrace) was mostly a copy-paste of the OpenAI OTel instrumentation.

My general thought is that the industry is converging towards the OpenAI spec based on the above data points, which is why I think it's a decent medium-term strategy. But what happens if OpenAI changes the spec is a very valid concern. I think soliciting thoughts from other OSS code owners would be a good way to converge towards a standard that works for all of us.

@codefromthecrypt

codefromthecrypt commented Nov 19, 2024

@karthikscale3 I think all of what you said are valid reasons: there is a de facto completion format and good reasons to use it.

As you know, my main anchor in OSS is diverse feedback. I think we can reach out more directly on a decision like this, and I'm happy to help. I'll summarize for the new folks mentioned below:

This thread is about the log format for any LLM provider, and a suggestion to normalize it to the OpenAI format (e.g. for chat completions). This means that regardless of LLM provider/platform, if you were capturing log events (just logs, really), the data structure would be based on some versioned OpenAI format. In implementation, instrumentation would coerce different data formats into the OpenAI format, which may imply some loss of signal for the convenience of a single format for processing. For example, tool calls are represented differently by certain providers.
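
As a hypothetical illustration of what that coercion involves (the input shape below is paraphrased from Anthropic's tool-use content blocks, and the helper is not real instrumentation code), mapping a provider-specific tool call onto an OpenAI-style tool_calls entry might look like:

```python
import json


def anthropic_tool_use_to_openai(block: dict) -> dict:
    """Hypothetical helper: map an Anthropic-style `tool_use` content block
    onto an OpenAI-style `tool_calls` entry. Real instrumentation would also
    need to handle streaming, multiple blocks, and fields with no equivalent."""
    return {
        "id": block["id"],
        "type": "function",
        "function": {
            "name": block["name"],
            # OpenAI represents arguments as a JSON string; Anthropic uses a dict.
            "arguments": json.dumps(block["input"]),
        },
    }


# Input shaped like an Anthropic tool_use content block.
anthropic_block = {
    "type": "tool_use",
    "id": "toolu_01",
    "name": "get_weather",
    "input": {"city": "Singapore"},
}
print(anthropic_tool_use_to_openai(anthropic_block))
```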

Do you see any issues with this? Even if you did, would you welcome it? Do you have any other thoughts?

@yuzisun, a Bloomberg end user who also develops KServe and the new Envoy AI Gateway
@sausheong, CTO of GovTech Singapore, who recently presented on AI coding at the Stack Conference (and also runs GopherCon Singapore!)
@ashwinb leads the llama-stack project at Meta, which also does local inference
@andymjames, a colleague at Elastic who recently blogged about GenAI observability in support cases
