-
Kinda related: #249
-
Neat idea! The TL;DR is that I'm on board with the idea, but not sure about the implementation.

For gptel I'm not interested in providing opinionated/fit-to-purpose system messages or templates for tasks, since (for the most part) I don't want to be in the business of testing various templates to see which ones work best, and changing them over time. gptel is 75% basic infrastructure and 25% UI.

However I am interested in providing the flexibility to create your own templates like the ones in yap, preferably with some UI. This is part of a larger plan to allow gptel to specify an LLM pipeline interactively or with elisp, i.e. a linked graph where each node is an LLM API call and you can specify the parameters for each, including templates like the ones from your experiments. You can already do this by chaining together

The problem so far is that users are comfortable with the familiar chat and "operate on this region in place" paradigms, but it's not apparent to them how to use an LLM client to simulate any other kind of use. For example, you have to know elisp and read the documentation of

Other LLM clients (or web services) offer a plethora of specialized commands like

I'm curious to know if you have any thoughts about this approach.
-
I agree, I don't think gptel should be responsible for managing prompts. I mostly wanted to include a few sample templates in yap so that people have an idea of the kind of things they can build. I expect that in most cases, users have their own specific templates. Another option would be to have a separate "contrib" repo with less oversight that can collect just these templates. On that note, a lot of even my own personal use is just providing the last user prompt for the default template:
Just this alone would be huge, as the current way of adding the transformation in the system message produces worse results in my testing.
While being able to define templates via UI is good, I personally think that for most use cases, we should have predefined templates. Most of the time, I just want to select a region and give it a user prompt and have the tool template out a chat so as to optimize the llm output.
I think most end users only have to rely on this. As I said, once you have a good enough generic (probably mode-aware) template, it is just about selecting a region and giving a command. The main idea here is that we should split out the transformation to the region and the system prompt as separate messages. See my
I totally agree. Providing specialized commands is very restricting. Even in yap, I've avoided providing commands. There are a few builtin templates, but they are more of "these are the kind of things you can do". I have however focused a bit more on providing functions that can let you build templates like Here is a video of how I use Screen.Recording.2024-09-08.at.9.27.20.AM.mov
-
@meain I have added support for templates to gptel. It's currently in the The commit message doesn't explain much, so for now here's how you can use them:
A directive can be
Here are some examples of directives:

A string

Interpreted as a simple system message. No change from gptel's current behavior:

A list of strings

A template consisting of a system message followed by user/assistant interactions (synthetic example since I couldn't think of a static example that doesn't need a function):

("You are a large language model living in Emacs and a helpful assistant. Respond concisely." ; system
 "<user question for priming the llm here>" ; user
 "<canned llm response here>") ; llm

A function that returns a string

A string returned by a function is interpreted as the system message. A forced example here, since these can be three different system messages that you can manually select.

(defun mode-specific-instruction ()
  "Return a generic LLM system prompt depending on context."
  (cond
   ;; In programming modes
   ((provided-mode-derived-p major-mode 'prog-mode)
    (let ((lang (gptel--strip-mode-suffix major-mode)))
      (format
       (concat "You are a %s programmer. Answer my questions with a combination of %s code and text."
               " The text should be laced in appropriate code comments.")
       lang lang)))
   ;; When composing git commits (the variable may be unbound if git-commit is not loaded)
   ((or (bound-and-true-p git-commit-mode)
        (buffer-match-p "^COMMIT_EDITMSG$" (current-buffer)))
    (concat "You are a git commit helper. Generate a changelog for this changeset:\n\n```diff\n"
            (shell-command-to-string "git diff")
            "```"))
   ;; In text modes
   ((provided-mode-derived-p major-mode 'text-mode)
    (concat "You are a prose editor. Proofread the provided text and suggest changes based on:\n"
            "- Grammar and style: check for fragments, run-on sentences, purple prose and the like.\n"
            "- Brevity: Suggest ways to shorten or simplify this text without losing important details."))))

The first element of the list can be

A function that returns a list of strings

A function that returns a template, including the system message:

(defun rewrite-template ()
  "Rewrite or refactor selected region."
  (let ((lang (downcase (gptel--strip-mode-suffix major-mode))))
    (list ;; system
          (format (concat "You are a %s programmer. "
                          "Follow my instructions and refactor %s code I provide. "
                          "Generate ONLY %s code as output, without "
                          "any explanation or markdown code fences.")
                  lang lang lang)
          ;; user
          (buffer-substring-no-properties (region-beginning) (region-end))
          ;; llm
          "What is the required change?")))

Using these directives in gptel

These can be added to

(setq gptel-directives
      `((tutor . "You are a tutor and domain expert in the domain of my questions...")
        (synthetic . ("You are a large language model living in Emacs and a helpful assistant. Respond concisely."
                      "<user question for priming here>"
                      "<canned llm response here>"))
        (DTRT . ,#'mode-specific-instruction)
        (rewrite . ,#'rewrite-template)))
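To make the three directive shapes concrete, here is a minimal, self-contained sketch of how a directive value could be normalized into a system message plus alternating user/assistant turns. The name `my/expand-directive` is hypothetical and this is not gptel's actual implementation; it only mirrors the behavior described above (a string, a list of strings, or a function returning either).

```elisp
;; Hypothetical helper, not part of gptel: normalize a directive into
;; (SYSTEM . TURNS), where TURNS is a list of (ROLE . TEXT) pairs.
(defun my/expand-directive (directive)
  "Expand DIRECTIVE (a string, a list of strings, or a function)."
  (cond
   ;; A function is called and its result is expanded again.
   ((functionp directive)
    (my/expand-directive (funcall directive)))
   ;; A plain string is just the system message, with no turns.
   ((stringp directive)
    (cons directive nil))
   ;; A list: the first element is the system message, the rest
   ;; alternate user, assistant, user, ...
   ((listp directive)
    (let ((role 'user) turns)
      (dolist (text (cdr directive))
        (when text                      ; nil entries are skipped
          (push (cons role text) turns))
        (setq role (if (eq role 'user) 'assistant 'user)))
      (cons (car directive) (nreverse turns))))
   (t (error "Unsupported directive: %S" directive))))

(my/expand-directive
 '("Respond concisely." "<priming question>" "<canned response>"))
;; => ("Respond concisely." (user . "<priming question>")
;;     (assistant . "<canned response>"))
```

Treating a nil entry as "skip this turn" is an assumption made for the sketch; it keeps the user/assistant alternation intact when a slot is left empty.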
-
I spent a bit of time playing around with it, but I'm not sure how this would work. I was mostly trying out the refactor workflow. I've added the following to the gptel config:

(defun rewrite-template ()
  "Rewrite or refactor selected region"
  (let ((lang (downcase (gptel--strip-mode-suffix major-mode))))
    (list ;; system
          (format (concat "You are a %s programmer. "
                          "Follow my instructions and refactor %s code I provide. "
                          "Generate ONLY %s code as output, without "
                          "any explanation or markdown code fences.")
                  lang lang lang)
          ;; user
          (buffer-substring-no-properties (region-beginning) (region-end))
          ;; llm
          "What is the required change?")))

(add-to-list 'gptel-directives '(rewrite . #'rewrite-template))

I selected a section of code, called
PS: The inspected list object seems to have the selected section as the final message as well, which I'm assuming was an oversight.

(:model "gpt-4o-mini" :messages
 [(:role "system" :content
   "You are a lisp-interaction programmer. Follow my instructions and refactor lisp-interaction code I provide. Generate ONLY lisp-interaction code as output, without any explanation or markdown code fences.")
  (:role "user" :content "(add-to-list 'gptel-directives '(rewrite . #'rewrite-template))
")
  (:role "assistant" :content "What is the required change?")
  (:role "user" :content
   "(add-to-list 'gptel-directives '(rewrite . #'rewrite-template))")]
 :stream t :temperature 1.0)
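One Emacs Lisp detail worth checking in the `add-to-list` form above: nothing inside a quoted list is evaluated, so `#'rewrite-template` is stored as the literal form `(function rewrite-template)` rather than as the function symbol. The sketch below only demonstrates that reader behavior. Whether gptel tolerates the quoted form is not something this thread states, and `my/directives` and `my/rewrite-template` are stand-in names used purely for illustration.

```elisp
;; Stand-in for `gptel-directives'; hypothetical, for illustration only.
(defvar my/directives nil)

(defun my/rewrite-template ()
  "Placeholder template function (hypothetical)."
  (list "system message" "user text" "assistant text"))

;; Quoted: the cdr is the unevaluated form (function my/rewrite-template).
(cdr '(rewrite . #'my/rewrite-template))

;; Backquote plus comma (as in the maintainer's example) evaluates the
;; sharp-quote, so the cdr is the symbol `my/rewrite-template'.
(cdr `(rewrite . ,#'my/rewrite-template))

;; An equivalent form without backquote:
(add-to-list 'my/directives (cons 'rewrite #'my/rewrite-template))
(functionp (alist-get 'rewrite my/directives))
```

Writing the entry with a plain symbol, as in `'(rewrite . my/rewrite-template)`, stores the same value as the backquoted version.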
-
Yes, using a rewrite template this way won't work because there's nowhere to add the instruction. I've updated gptel's rewrite feature to use this template for now, and added the dry-run option so you can see what will be sent. (Run
It's not, this is expected since gptel includes the selected region in the prompt by default. This expectation is fine when using the previous way that rewriting worked, where the rewrite instructions were part of the system message and the code from the selected region was part of the prompt. Now both the code and instructions are part of the prompt, as specified by the rewrite template. Overall I'm not yet seeing the benefit of using templates -- it's easy enough to write a wrapper function to do the templating and call gptel inside it. Building templates into gptel inverts this nesting, but doesn't make it any easier to define new useful templates. It looks like templates will have to be dynamic to be useful, and thus require the user to write elisp functions anyway. Going back to my original concerns,
The templates as implemented right now should work pretty much how they do in Yap, except that "static" ones with unchanging text can be defined more simply.
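The "wrapper function" alternative mentioned above could look roughly like this. It is a sketch under assumptions: `my/rewrite-prompt` and `my/rewrite-region` are hypothetical names, the prompt wording is invented, and the `gptel-request` call uses only the `:system` and `:in-place` keywords that also appear elsewhere in this thread.

```elisp
(defun my/rewrite-prompt (code instruction)
  "Return a single user prompt combining CODE and INSTRUCTION."
  (format "Here is the code:\n\n%s\n\nRequired change: %s" code instruction))

(defun my/rewrite-region (beg end instruction)
  "Ask the LLM to rewrite the region from BEG to END per INSTRUCTION.
Requires gptel; signals a `user-error' if it is not loaded."
  (interactive "r\nsRewrite instruction: ")
  (unless (fboundp 'gptel-request)
    (user-error "gptel is not loaded"))
  ;; The templating happens here, outside gptel; gptel only sees the
  ;; finished prompt and a static system message.
  (gptel-request
   (my/rewrite-prompt (buffer-substring-no-properties beg end) instruction)
   :system "You are a programmer. Output only code, no explanations."
   :in-place t))
```

This keeps the nesting the maintainer describes: the wrapper owns the template and calls gptel, rather than gptel owning the template.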
-
@meain The templating feature is more or less complete, you can test it out now. I plan to merge it in a day or two. Of the three possibilities for directives: a string (system message), a list of strings (system message + templated conversation) or a function (dynamic system message or conversation), really only the string and function options are useful.
-
Templates
I think I understand the problem here. It's not the template implementation that's lacking, because it works more or less the same as Yap's templates. If you define a template using a function (like Yap does), you can put anything in it, anywhere. For example, here's how you could implement

(defun gptel-split-buffer-directive (&optional system-prompt)
  (list (or system-prompt (alist-get 'default gptel-directives))
        (format "I'll provide a document with a highlighted section.
The code is in %s.
The answer should be specific to the highlighted section,
but use the rest of the text as context to understand the patterns and intent."
                (gptel--strip-mode-suffix major-mode))
        "OK. What is the highlighted text?"
        (buffer-substring-no-properties (region-beginning) (region-end))
        "What is before the highlighted section?"
        (buffer-substring-no-properties (point-min) (region-beginning))
        "What is after the highlighted region?"
        (buffer-substring-no-properties (region-end) (point-max))
        "What can I help you with?"))

(For simplicity I've shown only the case with an active selection.) The limitation is actually that This makes Using an arbitrary template in gptel thus requires dropping into elisp. The above template can be used as:

(gptel-request (read-string "Rewrite instructions: ")
  :system 'gptel-split-buffer-directive
  :in-place t)

which should work exactly like A one-to-one comparison between
-
@meain Okay, after thinking about it for a while I've added templates (the full version) to the transient menu. These can be selected
Try it out and let me know what you think: The templates themselves are not included. You can evaluate the test templates here to add them. This brings yap's template functionality in full to gptel, but I think there are some issues with this implementation. I'll wait for your feedback before going into more depth on this.
-
I like the new templating interaction (I guess I'm biased here). I would have personally liked templating for rewrite as well but I can see myself using the templates functionality in gptel instead of One thing I've primarily been missing from yap is the ability to continue a conversation, ideally with added context. For example, I start off a conversation with a template, then continue that in a gptel buffer, additionally adding any extra context needed. It would be nice if we had something similar to the current "response to gptel session", but instead of just the last message, it would dump the entire templated chat Just some notes I had while going through the changes:
Unrelated request: It would be nice to inspect the query and then send it. Currently inspect dismisses the transient interface.
We might not always be able to pick the right template based on context. For example, if we have two different rewrite options, one to add comments and one to fix bugs, and we call rewrite from a prog-mode buffer, the right one cannot be picked automatically. We could use the context to narrow down to a particular one though. One possibility I was thinking about was to let people define templates limited to specific major-modes.
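The idea of limiting templates to specific major modes could be sketched as below. Everything here is hypothetical (the registry `my/templates`, the template names, and both helper functions); neither gptel nor yap is claimed to work this way. The only built-in relied upon is `provided-mode-derived-p`, which the examples earlier in the thread also use.

```elisp
;; Hypothetical registry: each entry is (NAME MODES FUNCTION).
;; MODES is a list of parent major modes; nil means "any mode".
(defvar my/templates
  '((add-comments (prog-mode) my/add-comments-template)
    (fix-bugs     (prog-mode) my/fix-bugs-template)
    (proofread    (text-mode) my/proofread-template)
    (summarize    nil         my/summarize-template)))

(defun my/templates-for-mode (mode)
  "Return names of templates in `my/templates' applicable to MODE."
  (let (names)
    (pcase-dolist (`(,name ,modes ,_fn) my/templates)
      (when (or (null modes)
                (apply #'provided-mode-derived-p mode modes))
        (push name names)))
    (nreverse names)))

(defun my/pick-template ()
  "Prompt for one of the templates valid in the current major mode."
  (intern (completing-read
           "Template: "
           (mapcar #'symbol-name (my/templates-for-mode major-mode))
           nil t)))
```

The context narrows the candidates (two rewrite options remain in a `prog-mode` buffer), and the final choice is still left to the user, which matches the concern raised above.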
-
The rewrite system message has been elevated to a first-class directive -- you can pick any from For the first problem (use any template with rewrites), we're almost there but there's one little problem left to sort out: The rewrite template is currently hardcoded to this:

(system-message               ; <-- Customizable, pick anything from gptel-directives or use the rewrite-hook
 <code here>                  ; <-- hardcoded
 what is the required change? ; <-- hardcoded
 <rewrite instruction here>)  ; <-- Read from the transient menu
It's easy to remove the hardcoding here, but then it becomes more difficult for the user to define a custom template. Instead of specifying just the system message, which they can do simply as a static string, they'll be forced to specify a function that constructs the full list. I'm not sure I want to do that.
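The hardcoded shape described above can be written as a small function in which only the system message is customizable. This is an illustrative sketch, not gptel's code: `my/rewrite-messages` is a hypothetical name, and accepting either a string or a function for the system message is an assumption based on the directive forms discussed earlier.

```elisp
(defun my/rewrite-messages (system code instruction)
  "Build the four-part rewrite conversation.
SYSTEM is a string, or a function returning a string.  CODE is the
text to rewrite and INSTRUCTION is the user's rewrite request."
  (list (if (functionp system) (funcall system) system) ; system message
        code                                             ; user: code to rewrite
        "What is the required change?"                   ; assistant: canned reply
        instruction))                                    ; user: rewrite instruction

(my/rewrite-messages "You are an elisp programmer."
                     "(foo a b)"
                     "flip the args")
```

Removing the hardcoding would mean the user supplies a function like this one in full, which is the extra burden the comment above is weighing.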
It's easy to modify the "response to gptel session" option to dump the filled-in chat template to the gptel session. The only problem is the UI: It will override the current behavior, which is valuable too. Do you think if the directive is a templated conversation (as opposed to a system message only), it always makes sense to dump the whole chat?
Yes, I prefer this approach as it has less syntax. You can always add a comment by the side:

'(system                   ;system message
  nil                      ;user
  "What would you like..." ;assistant
  "Give me X")             ;user

This can be done without much trouble. Extending gptel to handle this is easy:

'(system
  nil
  ("What would you like" :prop1 val1 :prop2 val2)
  ("Give me X" :prop3 val3))
I am aware. The Anthropic API is also quite finicky, but about different things about the prompt.
Cool. How do you test this kind of thing? I'm exhausted just thinking about testing all these variations.
This is surprisingly difficult to do right now. I need to rewrite the whole networking code path (required for other features), and I hope to make it easier to pause and resume requests like this.
Yes, this is the main drawback I mentioned in my previous message, and the reason I am not merging the templates option into master yet. I'm merging everything else, just not the templates UI from the latest commit. Fundamentally, there is no difference between a "template" and a "directive" as implemented in gptel. So this distinction is artificial and potentially confusing. If someone selects a directive and then a template, they'd have no idea what is going to be sent. From the perspective of a new-ish user (who is probably aware of what a "system prompt" is), that's too many non-orthogonal concepts required to use gptel. The difference between the two is only in whether the buffer text is included in the prompt. So I think better nomenclature and UI design is needed -- ideally I'd avoid the term "template" altogether and just make do with directives. If you have any ideas please let me know.
-
Sounds good to me.
OK, I think the problem is the difference in how we see rewrites. You should just see rewrite as "replace the current text with the response from LLM". You don't necessarily have to think of it as rewriting code or prose. I think an easy and intuitive (in my subjective opinion) approach is to add an option next to "respond in place" for rewrite. It could be something like "replace selected text" instead of providing a separate rewrite interface. With that in place, the current hardcoded template can be made to be just one of the templates in
Not necessarily. Both are useful. It is possible that you might want just the output, but it is also possible that you want the full chat. If we really had to pick a way to automatically decide between these, I would say: if it is a new gptel buffer, add full context, otherwise just the response. Otherwise it might be worth it to provide two separate options, but I'm worried that we are just adding too many options now 😅. Just for some statistics, in my personal use 80-85% of the time, I'm just using the default rewrite template which is similar to what you have for gptel. I use custom templates mostly for adding additional info like diagnostics from flymake. It also comes in handy when I need to improve the responses by templating specifically (e.g. role prompting).
I don't have any scientific methods for testing any of this. It is more tribal knowledge from randomly experimenting, and a lot of papers/blogs/posts.
I like the idea of being able to easily modify system prompt and "directive". That is lost with templates. I don't really have an answer on how to have the same level of ease. One thought I had was to merge system prompt and template; that way the current "system prompt" thing will be a template, but with only the system prompt, and they can add it in. As for the UI, this would be a separate option in the system prompt picker transient, something like:
-
All commits except the last one adding the "templates" option have been merged into master.
I don't follow. Have you tried using the (I'll refer to the The point of the dedicated rewrite interface is that it gives you a different UI after the response is received: you can ediff, diff etc. In the beginning I was directing people to the "respond in place" option, but there were many requests for something like the current, dedicated rewrite interface. Actually, the "respond in place" option also supports ediff. Try the following!
I like this older rewrite workflow, and think it's more elegant and general. But it turned out to be completely undiscoverable, and users kept asking for a dedicated rewrite menu/interface. They also wanted a way to see the proposed changes before they're applied, not after with an option to revert them. Hence the new, loud design with colorful overlays with user-driven changes. BTW, I remember you mentioned (possibly in a meetup) that gptel's rewrite workflow is too slow for your tastes and has too many steps. Just checking if you're aware of these two things:
This is already the case with
I don't want to populate the menu with even more options, at least for now. Dumping the full prompt if it's a new gptel session sounds like a good compromise to me, I'll do that.
You can do this even with the current implementation of the hardcoded rewrite interface:

(defun gptel--rewrite-with-diagnostics ()
  (list "You are a ... programmer. Follow instructions..." ;system
        (concat "Here are the relevant linter errors:"
                (collect-flymake-diagnostics-here))        ;user
        nil))                                              ;assistant

(add-to-list 'gptel-directives
             '(rewrite-with-diag . gptel--rewrite-with-diagnostics))

This is prepended to the hardcoded prompt to give you

(list "You are a ... programmer. Follow instructions..." ;system
      "Here are the relevant linter errors:
- error 1
- error 2
..." ;user
      nil ;assistant
      ;; The following is the current hardcoded template
      (list (buffer-substring-no-properties
             (region-beginning) (region-end)) ;user
            "What is the required change?" ;assistant
            (or rewrite-message gptel--rewrite-message))) ;user

Of course, this is a brittle sort of composition and we need a better solution.
I don't understand this idea. Here's how I'm using these terms:
This is not clear. They can add what in?
-
Ohh, this seems useful. I had misunderstood what it did. I was also totally unaware of "Tweak Response". Seems like a great alternative to having a separate rewrite interface to me. To be frank I did not spend a lot of time trying to play around with gptel initially before building out yap as I had found quite a few ideas that I wanted to experiment with which would have been hard with gptel.
Agreed on both of those points, about it being elegant and about it being undiscoverable. About the discoverability issue, one small change might be to change "respond in place" to "rewrite selection" if there is an active selection.
The ideal workflow for rewrites for me (and what is currently in yap) is to see the output of the llm while keeping the original intact, if necessary, see a diff and then apply if it looks good. Viewing the diff after the fact is an OK compromise. If we don't already, providing something like I also kinda like what chatgpt-shell does where the new output is added along with the current one in the same buffer (unlike yap where we open it in a different buffer). This might however get tricky if you are doing big (over 50 lines) rewrites which I tend to do ~25% of the time.
This is the part that I'm not a big fan of. In most cases, when I have performed a rewrite, I don't want to go back and then invoke
Yup, I understand. I just wanted to point out what my current use-case was like.
I don't think I communicated what I was thinking coherently. What I meant was that it would be nice if we could define "variables" within the template which could be visually filled in via the template the same way we do with system and directive entries. For example, in the YouTube example that you had, the transient would add a "button" for the URL and when you fill it in, you can see the url in the UI before you hit "RET". But the more I think about it, the more it feels like an unnecessary complication.
-
I couldn't get Here's what I think you're suggesting
The difference between this and how gptel-rewrite currently works is that
Does that cover it, or did I miss something?
I'm not sure how this would work? The original text is deleted when the response starts streaming in. (It remains available via the "Tweak response" options)
You don't have to go back to the menu. You can invoke all available actions directly:
This looks like the same as
Hmm, I need to think about this some more.
-
Huh, let me look into it. You are correct in the assumption. In any case, here is a demo. Also, would it be possible to try installing yap-rewrite-example.mov.mp4
True, but streaming in the response lets me start validating the response as it streams in, and for most paid services the tokens per second are more than my reading speed, so this makes for a good workflow. This way, I don't have to sit there doing nothing until the model response is fully available. Just a small note on the UI. "Refactor" seems too specific. "REWRITE READY" might be a better term to use in place of "REFACTOR READY". Also, probably don't insert "Refactor:" into the input box. It could be a generic set of instructions. Again, all are personal preferences. Just letting you know how I would have gone about it. In fact, I think rewrite too is a bit too specific, and I'm thinking about switching to replace in yap as I would also want to support images (whether I ever get to doing it is a separate question). I might want to provide an image as input, and ask it to "replace" the image with a modified version of it.
Again, my bad. I misunderstood what
-
My main takeaway from this thread is actually that I'm doing a very poor job of communicating gptel's features. For the longest time I was having trouble explaining to users that gptel works anywhere -- they assumed you had to open a special "chat" buffer. Perhaps they still do. Trying to make features like rewriting intuitive without hitting them in the face with it is going to be difficult!
I have one point of disagreement and one of agreement.
gptel-rewrite-in-place.mp4
Now you can see the response as it comes in, and the overlay with the same diff/ediff/merge/apply actions is available after it's done. It also avoids the problem with long text chunks that you get from auto-using the merge option (the chatgpt-shell feature). What do you think of this approach? The big problem with this approach is that the incoming text needs to actually be inserted into the buffer -- which the user may not want. I can do this trick with overlays alone and not modify the buffer, but the rewritten region will then be intangible, which causes other issues.
-
I also probably didn't do my fair share of reading of gptel docs.
One suggestion I would like to make is to add a gif at the top for refactor/rewrite. The top gifs are currently just about having a separate chat session. People (including me) don't generally even read the entire
I feel like it is a difference in workflow. Since I'm mostly working with code, as soon as I see about 10% of the response, I can mostly decide if the response is worth waiting for or if I should just ignore this one and think of a better prompt. And for this, I need to see the response as it streams in.
This seems to be better than the current approach. This way at least we can see what the llm is coming up with as it is coming up with it. I don't know if I'll be able to halfway through give up on the current output and decide to try another prompt.
I agree. In my workflow, I need to be able to know if the thing that the llm produces is useful before it is complete, but I ideally don't want it to change anything in my buffer. That said, most other tools that do refactor either provide an inline diff or rewrite the content, so this is definitely an improvement. Since the users can open a diff once the rewrite is complete, I think the workflow is good enough. Will have to do a good job at communicating about good workflows.
I too feel that that will be more annoying than useful.
-
Yeah, I'll need to do this. One thing I did in gptel-rewrite just now is hit the user over the head with the available options, probably to the point of annoying them. From the commit message:
Making it easy (one keybind) to abort rewrites is planned.
It was a prototype demo. I pushed it now, although it works somewhat differently from the demo (see below).
The more I thought about it the worse the idea seemed -- so I've switched to the full-overlay method now. Between auto-formatters, LSP, linters and other minor-modes, I think modifying the buffer can have all kinds of undesirable side effects, depending on what the user is running.
This is the approach I'm using now (please test). I've made it easier to accept/reject the changes, so I'm hoping that offsets the overlay tangibility problem. While it's not entirely intangible, it takes a little more care to make sure the point is in the overlay now. Hopefully that'll work.
gptel-rewrite-code-demo-1.mp4
-
Hahaha. Hopefully that is enough.
Huh, I did not think about it. Makes sense.
Seems to work well enough.
-
Aborting rewrites in progress now works correctly with
screencast_20241202T234523.mp4
I think the rewrite interface is in much better shape now. There's only one problem: the overlay display becomes wonky when it's larger than the screen height. It still works but will probably be confusing to users.
-
Sweet!
Any time I've worked with overlays, it works great in the beginning, but then you end up spending a lot of time fixing all the edge cases. While we are on the topic of rewrites: I've seen a few tools where they get a code response along with an explanation from the llm, but use just the first code block for the rewrite, or in general do some post-processing on the output received from the llm to get to what the rewrite text is. I'm not sure how we would want to model that interaction. I've been thinking more about this as I've had a hard time making some llms follow the instruction "only give me the code, no explanations". In the particular code-with-explanation scenario, I've seen the explanation go into a separate buffer with the code going where we would replace it. I think it should be easy to extend the package to do it at a later point if we want to. Not something that we should block on for "releasing" the new rewrite interface, but thought I would mention it.
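The "use just the first code block" post-processing mentioned above can be sketched in a few lines. `my/first-code-block` is a hypothetical helper, not something gptel or yap is stated to provide, and falling back to the full response when no fenced block is found is an assumption made for the sketch.

```elisp
(defun my/first-code-block (response)
  "Return the contents of the first fenced code block in RESPONSE.
If RESPONSE has no complete fenced block, return it unchanged."
  (with-temp-buffer
    (insert response)
    (goto-char (point-min))
    ;; Find the opening fence line, e.g. "```python".
    (if (re-search-forward "^```.*\n" nil t)
        (let ((beg (point)))
          ;; Find the closing fence on a line of its own.
          (if (re-search-forward "^```[ \t]*$" nil t)
              (buffer-substring-no-properties beg (match-beginning 0))
            response))
      response)))

(my/first-code-block "Sure, here you go:\n```python\nprint(1)\n```\nHope it helps.")
;; => "print(1)\n"
```

The text outside the block (the explanation) is simply discarded here; sending it to a separate buffer, as described above, would be a small extension.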
-
Just for reference, this is what Zed does. It is really good. I don't know how tricky it will be to get a similar flow working with Emacs overlays. This UI is super useful when there are only a few additions and we are not rewriting the entire thing.
Screen.Recording.2024-12-03.at.8.39.44.PM.mov
-
I'm planning to tag a new release with the improved rewrite UI and the generalized directives features. Templates aren't in yet. As I mentioned above, they're the same things as directives but used differently, and I haven't found a good way to communicate this to the user yet. This screen crosses a confusion threshold for me: There's a system message option, a "directive" option below that, and a "prompt from template" option, and they're all related. So this UI needs some work. Maybe there is some reshuffling/renaming of these options that can make it less confusing. Templates have one more problem -- they're non-interactive functions that query the user interactively. While this practice won't send the Elisp cops after us, it is considered bad design in Emacs. FWIW I'd like to avoid this too. Do you have any suggestions for me with regards to these features before I tag the release?
-
Dry run output can now be freely edited by hand and the request continued. The original request specification (callback etc) is respected.
-
I was wondering if there was some way I could template out a longer conversation than just system prompt + context. Currently, if I want to do some refactoring on a region (let's say I add the directive "flip the args"), the messages that get sent to the llm are:
From my testing I found that llms work much better when the system message/prompt is just a high-level role and the instruction is separate. For example, the messages would look something like this:
To avoid any confusion, the above set of messages is not created over time, but loaded from a template with the dynamic parts filled in. Let me know if I'm missing something, but as far as I can tell gptel does not allow creating such a list of messages to be sent to llms. It is even more useful when we have to add additional context and some metadata around it.
I've been hacking around these concepts on meain/yap and the responses with messages structured this way seem to be much better even with smaller models. I build out messages based on the user selection and "directive"/user-prompt. Also see here to see how one can provide additional data and nudge the llm to only use it for context.