Can I do SFT with dataset that includes tool usage with TorchTune? #1921
-
Hello! I am working on supervised fine-tuning (SFT) for Llama models using a chat dataset that includes tool calling in the OpenAI format. I'm unsure if this specific setup is directly supported by TorchTune. I see that fine-tuning on a chat dataset without tool calling works well, and I also noticed that there is a role (e.g., ipython) intended for tool calling. My question is: Is SFT on a chat dataset with tool calling supported? I assume it is. Thanks a lot for any help!
Replies: 1 comment 5 replies
-
Yes, tool-calling is supported in SFT as long as the model tokenizer you are using supports it. A tool call would be `Message(role="assistant", ipython=True)`, and the return from the tool call would be `Message(role="ipython")`.
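For reference, here is a minimal sketch of what a single tool-calling turn could look like as torchtune `Message` objects. The JSON strings used for the call and the return, and the example function name, are assumptions for illustration; use whatever payload format your model's chat template expects (masking is also omitted here for brevity).

```python
from torchtune.data import Message

messages = [
    Message(role="user", content="What's the weather in Paris?"),
    # The assistant's tool call: ipython=True marks this message as a tool call.
    Message(
        role="assistant",
        content='{"name": "get_weather", "arguments": {"city": "Paris"}}',
        ipython=True,
    ),
    # The tool's return value comes back under the "ipython" role.
    Message(role="ipython", content='{"temperature_c": 18, "condition": "sunny"}'),
    # The assistant's final, user-facing answer.
    Message(role="assistant", content="It's 18°C and sunny in Paris."),
]
```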
You will just need to ensure that your dataset gets translated to `Message`s correctly. You may need to make a custom message transform, using `torchtune.data.OpenAIToMessages` as a starting point. We might need to update that class to ensure tool calls and tool returns are converted correctly, so please let us know if you have any trouble with this.

I've been meaning to add a dataset example with tool calls and tool returns but haven't gotten a chance to. What datasets have you been looking at in particular (if they're on HF)? cc @joecummings who's been thinking about task-based dataset builders, maybe tool calling could be one?
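To make the custom-transform point above concrete, here is a rough sketch of what such a message transform could look like for an OpenAI-format dataset with tool calls. The field names (`tool_calls`, role `tool`, the nested `function` dict) follow the OpenAI chat format, but how the call is serialized into the message content and how masking is applied are assumptions you should adapt to your model's template; this is illustrative, not the torchtune implementation.

```python
import json
from typing import Any, Mapping

from torchtune.data import Message


class OpenAIToolCallToMessages:
    """Convert OpenAI-format chat samples (including tool calls) to torchtune Messages."""

    def __init__(self, train_on_input: bool = False):
        self.train_on_input = train_on_input

    def __call__(self, sample: Mapping[str, Any]) -> Mapping[str, Any]:
        messages = []
        for msg in sample["messages"]:
            role = msg["role"]
            if role == "assistant" and msg.get("tool_calls"):
                # Serialize the tool call(s) into the assistant message and flag
                # it with ipython=True so the tokenizer formats it as a tool call.
                content = json.dumps([tc["function"] for tc in msg["tool_calls"]])
                messages.append(
                    Message(role="assistant", content=content, ipython=True)
                )
            elif role == "tool":
                # Tool returns map to the "ipython" role in torchtune; mask them
                # so the model is not trained to predict tool output.
                messages.append(
                    Message(role="ipython", content=msg["content"], masked=True)
                )
            else:
                # Mask non-assistant turns unless training on the input as well.
                masked = role != "assistant" and not self.train_on_input
                messages.append(
                    Message(role=role, content=msg["content"], masked=masked)
                )
        return {"messages": messages}
```

Assuming your data otherwise fits the builder, an instance of this class could then be passed as the `message_transform` to `torchtune.datasets.SFTDataset`, with your model tokenizer as the `model_transform`.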