Commit

clearly point out the DJ format
yxdyc authored Dec 26, 2024
1 parent 1adca64 commit 62392ae
Showing 1 changed file with 18 additions and 1 deletion.
19 changes: 18 additions & 1 deletion tools/fmt_conversion/post_tuning_dialog/README.md
@@ -76,4 +76,21 @@ For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swi
}
```

In Data-Juicer, we pre-set fields to align with the last Query-Response format, which serves as our intermediate format for post-tuning dialog datasets. Correspondingly, we provide several tools to convert datasets in other formats to Query-Response format and vice versa.
In Data-Juicer, we pre-set fields to align with the last two formats (Alpaca and Query-Response), which together serve as our intermediate format for post-tuning dialog datasets. Correspondingly, we provide several tools to convert datasets in other formats to the Query-Response format and vice versa.

- DJ default format for post-tuning OPs:

```python
{
  "system": "<system>",
  "instruction": "<query-inst>",
  "query": "<query2>",
  "response": "<response2>",
  "history": [
    [
      "<query1>",
      "<response1>"
    ]
  ]
}
```
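
To illustrate how a multi-turn dialog maps onto these fields, here is a minimal sketch. It assumes that the final turn fills `query` and `response` and that all earlier turns go into `history`, as the placeholders above suggest. The helper name `turns_to_dj_sample` is hypothetical and is not one of the conversion tools shipped in this directory.

```python
from typing import Dict, List, Tuple


def turns_to_dj_sample(
    turns: List[Tuple[str, str]],
    system: str = "",
    instruction: str = "",
) -> Dict[str, object]:
    """Pack (query, response) turns into the DJ default layout.

    Hypothetical helper for illustration only; not part of Data-Juicer.
    """
    if not turns:
        raise ValueError("at least one (query, response) turn is required")
    # Assumption: the last turn is the current query/response pair,
    # and every earlier turn is stored in "history" in original order.
    *earlier, (last_query, last_response) = turns
    return {
        "system": system,
        "instruction": instruction,
        "query": last_query,
        "response": last_response,
        "history": [[q, r] for q, r in earlier],
    }
```

Calling `turns_to_dj_sample([("<query1>", "<response1>"), ("<query2>", "<response2>")], system="<system>", instruction="<query-inst>")` should reproduce the sample shown above.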
