Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Instructor when instrumented tries to instantiate the type that is given as a response_model #1160

Open
spott opened this issue Dec 7, 2024 · 3 comments
Labels
bug Something isn't working language: python Related to Python integration

Comments

@spott
Copy link

spott commented Dec 7, 2024

Describe the bug
When using instructor, one of the things you pass to the client.chat.completions.create is a Pydantic model type for the kwarg response_model. When openinference instruments instructor, it takes all kwargs to the client.chat.completions.create and tries to save them as attributes. Unfortunately, it doesn't expect to get anything that isn't a standard type (or sequence of standard types), and the mask operation for a span when saving attributes returns with:

return value() if callable(value) else value

which (in my case) tries to instantiate a pydantic type that can't be instantiated without arguments.

I would be happy to put together a pull request for this, however I'm not sure where the problem should be fixed. Should it be fixed in the mask function? or in the instructor _wrapped.py patched_new_func where it creates the dictionary of attributes (by stringifying all inputs), or in _WrappedSpan.set_attributes where it can check if a value isn't a AttributeValue and stringily it or similar before it passes it to _WrappedSpan.set_attribute.

Let me know where you think this should be fixed.

@spott spott added bug Something isn't working triage Issues that require triage labels Dec 7, 2024
@github-project-automation github-project-automation bot moved this to 📘 Todo in phoenix Dec 7, 2024
@dosubot dosubot bot added the language: python Related to Python integration label Dec 7, 2024
@mikeldking mikeldking removed the triage Issues that require triage label Dec 9, 2024
@mikeldking
Copy link
Contributor

Thanks for the note @spott - we'll take a look. If there's a minimum working example we can take a look at that would be amazing. Let us know if we can help in any way.

@spott
Copy link
Author

spott commented Dec 11, 2024

Let me try and show what I've found. I don't think a minimum working example is necessary, though it shouldn't be that hard to make one.

When you patch instructor, the patched async function here is called (there is a similar function for sync functions as well).

here that function calls start_as_current_span. However, notice that it takes the kwargs of the function, and flattens them and passes it as a dict as attributes for the span.

After a few more layers of indirection, we get to here, the _WrappedSpan class. This class has two functions, set_attributes which accepts dicts of AttributeValues, and set_attribute which accepts callables that return AttributeValue or an AttributeValue.

AttributeValue looks like this:

AttributeValue = Union[
    str,
    bool,
    int,
    float,
    Sequence[str],
    Sequence[bool],
    Sequence[int],
    Sequence[float],
]

set_attribute calls a mask function that, at the end, checks if the value is callable, and if it is, calls it with no arguments.

And there is where the problem lies. The function that this whole thing is wrapping (client.chat.completions.create in this case), takes as an argument a response_model... which is a Pydantic model. This isn't a AttributeValue, and when mask tries to call it with no arguments, it tries to construct the Pydantic object, and if that object has required fields, it throws.

Ultimately the problem is that the types that set_attributes accepts aren't enforced, and the flattening of the kwargs here passes non-AttributeValue values along the chain.

So, I'm happy to submit a pull request... but I'm not sure where the contract for AttributeValue should be enforced. Should it be enforced at the flattening of the kwargs? or should it be enforced at the set_attributes? I can see good arguments for both.

I'm leaning towards enforcing it at the flattening of the kwargs. In this case, I would stringify anything that wasn't an AttributeValue in the kwargs dict before passing it as an attribute.

It might also be worth adding some error catching so that an error in the tracing code doesn't result in the function not being called... but I'm not sure where that would go.

@mikeldking
Copy link
Contributor

Let me try and show what I've found. I don't think a minimum working example is necessary, though it shouldn't be that hard to make one.

When you patch instructor, the patched async function here is called (there is a similar function for sync functions as well).

here that function calls start_as_current_span. However, notice that it takes the kwargs of the function, and flattens them and passes it as a dict as attributes for the span.

After a few more layers of indirection, we get to here, the _WrappedSpan class. This class has two functions, set_attributes which accepts dicts of AttributeValues, and set_attribute which accepts callables that return AttributeValue or an AttributeValue.

AttributeValue looks like this:

AttributeValue = Union[
    str,
    bool,
    int,
    float,
    Sequence[str],
    Sequence[bool],
    Sequence[int],
    Sequence[float],
]

set_attribute calls a mask function that, at the end, checks if the value is callable, and if it is, calls it with no arguments.

And there is where the problem lies. The function that this whole thing is wrapping (client.chat.completions.create in this case), takes as an argument a response_model... which is a Pydantic model. This isn't a AttributeValue, and when mask tries to call it with no arguments, it tries to construct the Pydantic object, and if that object has required fields, it throws.

Ultimately the problem is that the types that set_attributes accepts aren't enforced, and the flattening of the kwargs here passes non-AttributeValue values along the chain.

So, I'm happy to submit a pull request... but I'm not sure where the contract for AttributeValue should be enforced. Should it be enforced at the flattening of the kwargs? or should it be enforced at the set_attributes? I can see good arguments for both.

I'm leaning towards enforcing it at the flattening of the kwargs. In this case, I would stringify anything that wasn't an AttributeValue in the kwargs dict before passing it as an attribute.

It might also be worth adding some error catching so that an error in the tracing code doesn't result in the function not being called... but I'm not sure where that would go.

Thanks for the detailed feedback. We'll get this prioritized. Appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working language: python Related to Python integration
Projects
Status: 📘 Todo
Development

No branches or pull requests

2 participants