From ee91924584b3fa6d2a79c204ebe57a99906999f1 Mon Sep 17 00:00:00 2001
From: Sean Lee
Date: Wed, 7 Feb 2024 18:16:31 +0800
Subject: [PATCH] support training with only positive pairs (DatasetFormats.C)

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index c11420f..a259de9 100644
--- a/README.md
+++ b/README.md
@@ -229,6 +229,8 @@ We support two dataset formats:
 
 2) `DatasetFormats.B`: it is a triple format with three columns: `text`, `positive`, and `negative`. `positive` and `negative` store the positive and negative samples of `text`.
 
+3) `DatasetFormats.C`: it is a pair format with two columns: `text`, `positive`. `positive` stores the positive sample of `text`.
+
 You need to prepare your data into huggingface `datasets.Dataset` in one of the formats in terms of your supervised data.
 
 ### 2. Train
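
A minimal sketch of preparing data in the pair format this patch documents (`DatasetFormats.C`, columns `text` and `positive`). The sample rows and the helper names `check_pair_rows` and `to_columns` are hypothetical and not part of the library; the validation rule (non-empty strings in both columns) is an assumption, not something the README states. The only external call used is `datasets.Dataset.from_dict`, and it is guarded so the sketch also runs without `datasets` installed.

```python
# Hypothetical sample rows: each row pairs a `text` with one `positive` sample.
rows = [
    {"text": "How do I reset my password?", "positive": "Steps to change a forgotten password"},
    {"text": "A man is playing a guitar.", "positive": "Someone is playing an instrument."},
]

REQUIRED_COLUMNS = {"text", "positive"}


def check_pair_rows(rows):
    """Return True if every row has non-empty string values for `text` and `positive`.

    This check is an assumption made for illustration, not a rule from the library.
    """
    return all(
        REQUIRED_COLUMNS <= set(row)
        and all(isinstance(row[c], str) and row[c].strip() for c in REQUIRED_COLUMNS)
        for row in rows
    )


def to_columns(rows):
    """Convert row dicts into the column-oriented dict that `Dataset.from_dict` accepts."""
    return {col: [row[col] for row in rows] for col in ("text", "positive")}


assert check_pair_rows(rows)
columns = to_columns(rows)

try:
    from datasets import Dataset  # optional dependency: pip install datasets

    train_ds = Dataset.from_dict(columns)
except ImportError:
    train_ds = None  # the column dict is still usable once `datasets` is available
```

When `datasets` is installed, `train_ds` is a `datasets.Dataset` with the two columns the pair format expects; how it is then passed to training follows the README's "Train" section rather than this sketch.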