-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi-modal context management within gptel. #459
Comments
* gptel-anthropic.el (gptel-make-anthropic, gptel--anthropic-parse-multipart, gptel--anthropic-models): Add support for sending PDFs to the model `claude-3-5-sonnet-20241022'. This is the only model that supports reading PDFs as of now. Cache sent PDFs so follow up the input cost of reading the PDF in follow up messages is 90% cheaper.
Will address this when I work on the system messages next, for which, as I mentioned in #416, a fair bit of work is planned.
The context inspection buffer was supposed to fulfill this purpose. (More explanation below)
This is not the case, the context chunks as displayed in the context inspection buffer are read-only. By edit I guess you meant "remove the context chunk from gptel"?
In the context inspection buffer, pressing RET on a context chunk pops up a window with the relevant buffer. You can use this to jump there. The idea was that the context inspection buffer can fulfill both roles: it provides a listing of added context chunks (like So could you explain what exactly is missing? (If there are minor ergonomic deficiencies with the context buffer, we can address them.)
Image and media support was added over a month ago, before the 0.9.6 release. You can indeed add supported document types using PDF support for Claude 3.5 Sonnet (and only this model) is brand new. Anyway, I added it just now. |
|
I'll reply in detail when I can, but regarding point 2 I wanted to link to #475. |
Thanks I've fleshed out my thinking here: #475 (comment) |
First up I want to say:
GPTel is fantastic - it accelerates my Emacs workflow no end.
I want to thank you for creating this tool, in the way you have; lightweight and seamless, across the panopole of what Emacs offers.
My feedback here is in the context of "context":
As we have discussed, the system message management could be easier.
gptel--system-message is not robustly initialised. #416 (comment)
Context management is great, but the way one manages it is a little difficult, for me at least.
I am interested in your advice, but the means by which one adds and removes context feels a little rough. Adding context is simple enough; just call gptel-add with a region selected or just a buffer or file selected. Of course, the options via the transient menu are easy enough as well, although in this case, when one seeks to add several pieces of context, jumping around the buffers can be a challenge.
The difficulty lies in managing context once it has been created.
What seems to be missing is an easy list of the regions, buffers and files that have been added.
The C command, via the transient menu, opens a list of the context content, rather than clear references to where that content came from. What I think I would find more convenient is the ability to pop open a consult style list with completions to enable me to select the context I want to remove or edit. The functionality of the existing C command is great when one wants to edit the content, however, given that the content comes from other buffers, or regions within buffers, providing means by which a list of "context references" would be presented that would enable me to jump directly to the origin of that context, to edit in-place there, would be a lower friction workflow.
There may well be a way of doing this which I've not yet discovered, so please let me know if I've missed something!
Multi-modal context support would be fantastic. For example, integrating advancements like this.
https://docs.anthropic.com/en/docs/build-with-claude/pdf-support
whereby one could put PDFs, images and other multi-modal files in context, with gptel-add. This would feed into what I've described at suggestion 2, where instead of viewing the context content directly, one instead has references to the source of that content, especially given that in light of this suggestion 3, that content may not be text and therefore not directly editable with Emacs.
Thanks again for creating such a useful tool.
[2024-11-05 Tue 10:49]
The text was updated successfully, but these errors were encountered: