Skip to content

Commit

Permalink
Update post_tuning_dialog/README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
yxdyc authored Dec 26, 2024
1 parent 6a51521 commit 461baf3
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions tools/fmt_conversion/post_tuning_dialog/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swift](https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Customization/Custom-dataset.md) and [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory/blob/main/data/README.md).

- Messages format (Also as ShareGPT format in LLaMA-Factory):
- Swift's Messages format (Very similar to the LLaMA-Factory's ShareGPT format, with different key names):

```python
{
Expand Down Expand Up @@ -31,7 +31,7 @@ For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swi
}
```

- ShareGPT format:
- Swift's ShareGPT format:

```python
{
Expand All @@ -49,7 +49,7 @@ For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swi
}
```

- Alpaca format:
- Alpaca format (used in the same definition in Swift and LLaMA-Factory):

```python
{
Expand All @@ -60,7 +60,7 @@ For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swi
}
```

- Query-Response format:
- Swift's Query-Response format:

```python
{
Expand All @@ -76,4 +76,4 @@ For post tuning formats, we mainly consider 4 formats to support [ModelScope-Swi
}
```

In Data-Juicer, we use the Query-Response format as our intermediate format for post tuning dialog datasets. Thus, Data-Juicer provides several tools to convert datasets in other formats to Query-Response format and vice versa.
In Data-Juicer, we pre-set fields to align with the last Query-Response format, which serves as our intermediate format for post-tuning dialog datasets. Correspondingly, we provide several tools to convert datasets in other formats to Query-Response format and vice versa.

0 comments on commit 461baf3

Please sign in to comment.