Provide the option to obtain the full response when calling the vLLM generate function #1199
Comments
What do you mean by full response?
InspectAI needs the full `results` variable returned by the `model.generate` call of the vLLM API; see line 131 of outlines/outlines/models/vllm.py at a2fd35c. Currently only the texts are returned in a list, starting at line 137 (outlines/outlines/models/vllm.py, lines 137 to 149 at a2fd35c).
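For context, a rough sketch of the behavior described above, using vLLM's offline `LLM` API; the model name and prompts are placeholders, and this is not the exact Outlines code at a2fd35c:

```python
from vllm import LLM, SamplingParams

# Illustration of the information that is dropped: vLLM's generate returns
# RequestOutput objects, but only the text of each sample survives the
# wrapper's extraction.
llm = LLM(model="facebook/opt-125m")  # placeholder model
results = llm.generate(["Prompt one", "Prompt two"], SamplingParams(max_tokens=16))

# Each RequestOutput carries prompt_token_ids plus one CompletionOutput per
# sample, with token_ids, logprobs, and finish_reason.
texts = [[sample.text for sample in request.outputs] for request in results]
print(texts)  # only the strings; everything else is discarded
```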
Couldn't you just implement a [custom solver](https://inspect.ai-safety-institute.org.uk/solvers.html) for InspectAI?
Hi @rlouf, I'm also interested in this. Custom solvers and custom models are indeed the way to go. However, there is still the issue that we lose information when using Outlines' generate function. For example, with Outlines' wrapper of vLLM, we don't have the stop_reason, logprobs, and output_tokens fields of the LLM's output. Would adding an optional argument that returns the raw vLLM results directly make sense to you? The default behavior would remain the same. I can send a pull request.
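A minimal sketch of what such an opt-in could look like, written as a standalone helper rather than the actual Outlines wrapper; the `return_raw_output` name and the simplified signature are illustrative assumptions, not the library's API:

```python
from vllm import LLM, SamplingParams


def vllm_generate(model: LLM, prompts, sampling_params: SamplingParams,
                  return_raw_output: bool = False):
    """Hypothetical opt-in flag; `return_raw_output` is an illustrative name."""
    results = model.generate(prompts, sampling_params)
    if return_raw_output:
        # Hand back the untouched RequestOutput objects, which keep
        # token_ids, logprobs, and finish_reason for each sample.
        return results
    # Default: current behavior, only the generated strings.
    return [[sample.text for sample in request.outputs] for request in results]
```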
I'm using InspectAI to evaluate language models; in particular, I'm evaluating the benefits of structured text generation with Outlines. I would like to obtain the full response when calling the vLLM generate function, since InspectAI expects the full response. Could users be given the option to get the full response? The default should remain the current behavior, which returns a filtered response.