
Support Azure content filters in completion #280

Open
Sufflope opened this issue Oct 10, 2024 · 14 comments · May be fixed by #287
Labels
out of scope Requests which are not related to OpenAI API

Comments

@Sufflope

Hello,

We currently use async-openai to query Azure OpenAI, and we would need support for the new content filters on Azure OpenAI. Basically, it's a new content_filter_results field in (Chat)Choice.

I would be happy to work on it; however, I read the Contributing section of the README 😄 and I understand you want to stick to the OpenAI specs, while this is an Azure extension. But also (thankfully for us) there is already some Azure-specific stuff, I reckon (e.g. the AzureConfig), although admittedly limited.

What would be the best course of action? I see a few options:

  1. add the support, but make this field optional (it is anyway, since you can choose not to enable those filters) so it doesn't break pure OpenAI
  2. add the support and add an azure feature to the crate and gate Azure-specific stuff behind it
  3. add some sort of "extra fields" to (Chat)Choice where anything extra would be stored as a serde_value::Map (and then people "do their thing" with those extra fields) so it supports "extra features" from providers without being Azure-specific
  4. I can also just fork (a friendly fork! or even just an internal one)

1 would work for me, of course, but I think 2 (my preference) would be interesting: keep pure OpenAI as the default, but also support Azure extensions, since I believe Azure is actually a big de facto OpenAI provider. I understand if you're not interested or willing, though. 3 sounds quite hackish. 4 is OK of course as a last resort. What do you think?

@64bit
Owner

64bit commented Oct 10, 2024

Hi there, thank you for the detailed description of your use case and for offering to contribute.

I added Azure support without even having access to it, because access required approval from Azure/Microsoft (not sure if that's still the case); thanks to the community for helping test and get PR #67 merged.

In retrospect, that was short-sighted, because there's no way for me to test new contributions or even the existing code, and Azure introducing incompatible changes has made it even more challenging.

As for the options you have listed, there's no easy answer there either; for example, how would you incorporate something like #106?

In short, it's just complicated to build and maintain, given that the majority of the issues are related to deserialization. All of this makes it unsustainable for me, and every time I end up concluding that Azure support should be removed. For now it's an uncertainty.

So unfortunately your best bet is to go with a public fork carrying the Azure-specific changes; that way the community has an option too.

@Sufflope
Author

Thanks for your quick reply, and for your understanding!

About #106, I'm not sure I see the issue. You already have a Config trait for polymorphism; I kind of agree with #106 (comment) about that. I think I would need a more concrete example of the use case for dynamic dispatch.

Anyway that's out of scope if you don't want to add Azure stuff. I absolutely understand that without a way to test it yourself or automatically, it's unreasonable.

I have some ideas on how to do an "extension" allowing access to extra Azure features, building on your crate, instead of a fork. I'll see how it goes!

@Sufflope
Author

Hello again,

I've begun exploring the idea of building on top of your crate instead of forking it, which sounds much better on many levels 😄 . The idea would be to add "extensions" to your Client to support Azure-specific stuff.

You can see a PoC here: https://github.com/Sufflope/async-openai/tree/azure_extensions

It's of course a WIP; I haven't decided whether I want to transparently "replace" the chat() method or add another one, among other design choices, but it shows the idea. I have validated locally that it works in my end-user project.

The only problem is: for it to work, I would need the pub(crate) get/post/... methods in Client to be made public so I can use them. Would you be OK with exposing them?

@Sufflope Sufflope reopened this Oct 11, 2024
@64bit
Owner

64bit commented Oct 12, 2024

Thanks for sharing an approach.

I think the root of the problem is not having the ability for the user to provide custom types for the request and response.

So something like:

// existing type
pub struct Chat<'c>  { ... }

impl<'c> Chat<'c> {

    // existing function
    pub async fn create(
        &self,
        request: CreateChatCompletionRequest,
    ) -> Result<CreateChatCompletionResponse, OpenAIError> { ... }

    // extension: let the user decide request and response types via generics.
    // Perhaps this can be implemented by just adding a #[ext] custom macro
    // on the existing create function above.
    pub async fn create_ext<I, O>(
        &self,
        request: I,
    ) -> Result<O, OpenAIError>
    where
        I: Serialize + ExtraRequiredTraitsForInput,
        O: DeserializeOwned + ExtraRequiredTraitsForOutput,
    { ... }
}

This way one can easily use a custom Request type via I and a custom Response type via O, and a lot of serialization and deserialization implementation can be avoided by just using the built-in serde features on these custom types.

But on the other hand, it would be nice to have the ability to just inject a custom nested type (for example, ChatChoice in your case) into the existing type hierarchy.

@Sufflope
Author

Your idea is interesting too, especially with a macro to reduce the boilerplate. You would have to add #[ext] to basically every existing function, though.

About the serde boilerplate: if you're talking about my manual Deserialize impl, that's one of the points I mentioned that I haven't settled yet, and it's a rather orthogonal topic. I did it like this because, as you saw, I wanted to piggy-back as much as possible on your types via the vanilla flattened fields. But with the derive, serde "consumes" the choices JSON field in my type, so it is no longer present when deserializing the vanilla struct, and I got a deserialization error. There is already a serde issue about this case (a field with the same name in both the parent and the flattened struct), and the answer was "unsupported, implement Deserialize yourself in that case" 😄.

Another approach would be to just copy those two types (CreateChatCompletionResponse and ChatChoice) into my extension crate and adapt them, instead of reusing yours. But I wanted to avoid that, and I would have the same "problem" with your dual-functions approach.

But on the other hand it would be nice to have ability to just inject a custom nested type (for example ChatChoice in your case) in existing type hierarchy.

Haha, sure, if I could just somehow monkey-patch ChatChoice it would be solved. But I don't see a way to do it without a generics nightmare where every FooResponse has only generic fields, themselves containing generics...
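For illustration, a minimal std-only sketch of the "generic nested type" route being dismissed here (all names are hypothetical stand-ins, not the real async-openai types): the response takes its choice type as a generic parameter with the vanilla type as the default, so existing callers compile unchanged while an extension can slot in its own choice type.

```rust
// Hypothetical sketch: a response generic over its choice type,
// defaulting to the vanilla one. Types are simplified stand-ins.

// Vanilla choice type, as shipped by the base crate.
#[derive(Debug, Clone)]
pub struct ChatChoice {
    pub index: u32,
    pub content: String,
}

// An Azure-flavoured choice carrying the extra filter data.
#[derive(Debug, Clone)]
pub struct AzureChatChoice {
    pub index: u32,
    pub content: String,
    pub content_filter_results: Option<String>, // simplified stand-in
}

// The response is generic over the choice type, defaulting to ChatChoice,
// so existing code keeps compiling unchanged.
#[derive(Debug, Clone)]
pub struct CreateChatCompletionResponse<Choice = ChatChoice> {
    pub id: String,
    pub choices: Vec<Choice>,
}

fn main() {
    // Default instantiation: indistinguishable from a non-generic struct.
    let vanilla: CreateChatCompletionResponse = CreateChatCompletionResponse {
        id: "cmpl-1".into(),
        choices: vec![ChatChoice { index: 0, content: "hi".into() }],
    };

    // Azure instantiation: same outer type, custom nested choice.
    let azure: CreateChatCompletionResponse<AzureChatChoice> =
        CreateChatCompletionResponse {
            id: "cmpl-2".into(),
            choices: vec![AzureChatChoice {
                index: 0,
                content: "hi".into(),
                content_filter_results: Some("severity: safe".into()),
            }],
        };

    assert_eq!(vanilla.choices.len(), 1);
    assert!(azure.choices[0].content_filter_results.is_some());
}
```

The catch, as noted above, is that in the real crate the generic parameter would have to thread through every response type that (transitively) contains a ChatChoice, which is the "generics nightmare" being described.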

@64bit
Owner

64bit commented Oct 12, 2024

I see. Combining both of our ideas: instead of wrapping the existing struct Chat like you did and inheriting its limitations, what if we had a struct ChatExt? It would offer exactly the same APIs as struct Chat, but would use generics for the input and output types:

pub struct ChatExt<'c, C: Config> {
    client: &'c Client<C>,
}

impl<'c, C: Config> ChatExt<'c, C> {
    pub fn new(client: &'c Client<C>) -> Self {
        Self { client }
    }

    /// Creates a model response for the given chat conversation.
    pub async fn create<I, O>(
        &self,
        request: I,
    ) -> Result<O, OpenAIError>
    where
        I: Serialize + ExtraRequiredTraitsForInput,
        O: DeserializeOwned + ExtraRequiredTraitsForOutput,
    {
        // Note: this is a sketch; reading `stream` on a generic `I`
        // would need a trait bound.
        if request.stream.is_some() && request.stream.unwrap() {
            return Err(OpenAIError::InvalidArgument(
                "When stream is true, use Chat::create_stream".into(),
            ));
        }
        self.client.post("/chat/completions", request).await
    }
}

// and same way generics for streaming types

// inside Client, something like:

impl Client {
    // for the corresponding chat() fn:
    pub fn chat_ext(&self) -> ChatExt { ... }
}

That way async-openai gets to keep its scope limited to the official OpenAI API, but lets you bring your own types. It also unblocks many other AI providers offering an "OpenAI-compatible API". Moreover, it lets struct ChatExt evolve on its own rather than being tied to struct Chat.

And of course this increases the work required: adding <API>Ext for all the APIs, not just Chat. What do you think, is this a good idea to implement?

@64bit 64bit added the out of scope Requests which are not related to OpenAI API label Oct 13, 2024
@Sufflope
Author

Sufflope commented Oct 20, 2024

As you said, that sounds like a lot of work: maintaining all those UsecaseExt with the same API as Usecase but with generics, and keeping them in sync also in terms of the HTTP method and path to send the requests to.

After taking some time to think about it, I believe your idea of an #[ext] macro on methods, generating a foo_ext that accepts I: Serialize, O: Deserialize generics and otherwise makes the same calls with the same logic, is probably the best in terms of extensibility with reuse and without boilerplate. It could live behind an ext feature so it doesn't pollute your base API or lengthen compilation by default.

@Sufflope
Author

That would require abstracting the validations somewhat, like whether the request is a stream, or the encoding_format one in embeddings, and a way to tell the macro to add the necessary bounds.

Probably something like:

// somewhere in common types
pub trait Streamable {
    fn stream(&self) -> bool;
}

// ...

impl Streamable for CreateChatCompletionRequest {
    fn stream(&self) -> bool {
        self.stream == Some(true)
    }
}

// ...

#[ext(Streamable)]
pub async fn create(
    &self,
    request: CreateChatCompletionRequest,
) -> Result<CreateChatCompletionResponse, OpenAIError> {
    if request.stream() {
        return Err(OpenAIError::InvalidArgument(
            "When stream is true, use Chat::create_stream".into(),
        ));
    }
    self.client.post("/chat/completions", request).await
}

That would generate a create_ext method which takes I: Streamable for requests.
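To make the proposal concrete, here is a minimal std-only sketch of what the generated create_ext could boil down to. It is hypothetical: the real method is async, the error type is simplified, and post_stub stands in for the actual HTTP call, but it shows the Streamable bound doing the same stream-validation on a caller-supplied request type.

```rust
// Hypothetical expansion of `#[ext(Streamable)]`; `post_stub`, the sync
// signature, and the simplified error type are assumptions for the sketch.

// Trait from the proposal above: lets the generated code run the same
// stream validation on any request type.
pub trait Streamable {
    fn stream(&self) -> bool;
}

#[derive(Debug, PartialEq)]
pub enum OpenAIError {
    InvalidArgument(String),
}

// A custom (e.g. Azure-flavoured) request type supplied by the caller.
pub struct MyChatRequest {
    pub stream: Option<bool>,
}

impl Streamable for MyChatRequest {
    fn stream(&self) -> bool {
        self.stream == Some(true)
    }
}

// Stand-in for the HTTP POST the real method would perform.
fn post_stub(_path: &str) -> String {
    "response body".to_string()
}

// What the macro could generate next to `create`: same logic, but
// generic over the request type via the declared bound.
pub fn create_ext<I: Streamable>(request: I) -> Result<String, OpenAIError> {
    if request.stream() {
        return Err(OpenAIError::InvalidArgument(
            "When stream is true, use Chat::create_stream".into(),
        ));
    }
    Ok(post_stub("/chat/completions"))
}

fn main() {
    // Streaming requests are rejected, exactly like the original `create`.
    assert!(create_ext(MyChatRequest { stream: Some(true) }).is_err());
    // Non-streaming requests go through.
    assert!(create_ext(MyChatRequest { stream: None }).is_ok());
}
```

The output type would be a second generic parameter in the real macro output; it is fixed to String here only to keep the sketch dependency-free.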

@64bit
Owner

64bit commented Oct 20, 2024

Thank you for looking into it further. I'm on board with either #[ext] or APIExt; they both have pros and cons. On the surface, APIExt looks like it might be easier to implement, but either one would get the job done.

There are other edge cases, like multiple inputs, the client enforcing DeserializeOwned, some type conversions to Form for POST calls, and some APIs returning raw Bytes. So with either approach, the work required would be non-trivial. Only with a working implementation will we know.

If you're on board too, a PR is welcome!

@Sufflope
Author

Sufflope commented Oct 21, 2024

I drafted a first PoC, you can see it in:

  1. the extension branch in my fork https://github.com/Sufflope/async-openai/tree/extension, specifically the last commit Sufflope@39678bd; this is the macro part that would need to be merged
  2. the updated azure_extensions branch https://github.com/Sufflope/async-openai/tree/azure_extensions, which shows that extensions would now only need to expose their custom types. I basically removed all the shadowing I was doing (my ClientExt, Chat, overriding fn create...); only the custom types tailored for Azure are left. I also refactored them to remove lots of serde boilerplate, but as I said previously, that's a pretty orthogonal topic. It's each extension author's choice how they develop the custom types that will be used through the generics
  3. I can't show you the whole code, but in my end project I validated that it works well and that I can access the Azure extensions. Basically, I import your crate (temporarily my local fork with my changes, of course) plus my extension crate from 2., and I use your client as usual; I just changed this to use my extension type:
+use async_azure_openai::types::CreateChatCompletionResponse;
...
-        let response = self
+        let response: CreateChatCompletionResponse = self
             .client
             .chat()
-            .create(request)
+            .create_ext(request)

As you can see, I tried to allow genericizing multiple parameters if needed, with the ability to add the needed bounds, but not necessarily all parameters. This should work for functions that accept a big Request parameter but also an additional e.g. String one.

For the result type, though, I went with the strict decision that it would be Result<Generic, OpenAIError>, because it would be much more difficult to dig through the result type otherwise. I don't know if that's OK?

@Sufflope
Author

Hi again! Could you have a look at my latest PoC? I'll soon need support for content filters for a release. I can settle for a temporary ad-hoc fork, so no pressure, but it would be great if I could use your upstream instead 😃

@64bit
Owner

64bit commented Nov 12, 2024

Hi @Sufflope, thank you for the PoC. I'll take a look today and get back to you.

@64bit
Owner

64bit commented Nov 13, 2024

Please excuse my delayed response to your PoC.

Your approach and choices look good to me. Having this minimal surface area for the public _ext APIs, including Result<Generic, OpenAIError> for the output type, is excellent.

I'd say that for the output type, Result<Generic, GenericError> would be more appropriate, because if the library is interacting with other OpenAI-compatible APIs, they may not respond with an error response that can be deserialized into OpenAIError::ApiError (I'm not 100% sure though), and so it defeats the purpose. What do you think?

Please do send a PR when you get a chance, and thank you for taking ownership of this feature; it's a very exciting addition!

@Sufflope
Author

I would say that OpenAIError::ApiError (or more exactly its inner ApiError) looks generic enough as an extensive description of an error, so chances are it won't need to be extended. I'd say we can leave it as the fixed error type; it keeps the macro simpler than otherwise, and when someone actually has a need, it can be made generic?

I'll prepare a PR soon.

Sufflope added a commit to Sufflope/async-openai that referenced this issue Nov 13, 2024
Sufflope added a commit to Sufflope/async-openai that referenced this issue Nov 14, 2024