fix: add handling of empty role back
NanoCode012 committed Jan 18, 2024
1 parent 885f603 commit 13f31d8
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions src/axolotl/prompt_strategies/sharegpt.py
@@ -277,6 +277,17 @@ def tokenize_prompt(self, prompt):
     labels[:len_role] = [IGNORE_TOKEN_ID] * min(
         len_role, len(labels)
     )
+elif role == "":
+    turn = content
+    # this is only ever the first part, should include the bos token and the user query
+    res = self._tokenize(
+        turn, add_eos_token=False, strip_bos_token=False
+    )
+    if self.train_on_inputs:
+        labels = copy.deepcopy(res["input_ids"])
+    else:
+        # everything from this is masked out from the labels
+        labels = [IGNORE_TOKEN_ID] * len(res["input_ids"])
 else:
     LOG.warning(f"unhandled role: {role}")
     continue
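
The label handling in the restored empty-role branch can be illustrated in isolation. The sketch below is not part of the commit: `labels_for_empty_role` is a hypothetical helper, the token IDs are made up, and `IGNORE_TOKEN_ID = -100` is an assumption (it matches the default `ignore_index` of PyTorch's cross-entropy loss).

```python
import copy

# Assumed value for illustration; the diff only shows the name IGNORE_TOKEN_ID.
IGNORE_TOKEN_ID = -100


def labels_for_empty_role(input_ids, train_on_inputs):
    """Hypothetical helper mirroring the branch added in the diff.

    With train_on_inputs the labels are a copy of the token IDs, so the
    turn contributes to the loss. Otherwise every position is replaced by
    IGNORE_TOKEN_ID and the turn is masked out entirely.
    """
    if train_on_inputs:
        return copy.deepcopy(input_ids)
    return [IGNORE_TOKEN_ID] * len(input_ids)


# Made-up token IDs standing in for a tokenized first turn (BOS + user query).
input_ids = [1, 3148, 1001]

print(labels_for_empty_role(input_ids, train_on_inputs=True))   # [1, 3148, 1001]
print(labels_for_empty_role(input_ids, train_on_inputs=False))  # [-100, -100, -100]
```

In both cases the labels have the same length as the input IDs, which keeps them aligned when the turn is appended to the rest of the conversation.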