fix: add handling of empty role back
NanoCode012 committed Feb 22, 2024
1 parent 1623a50 commit 85ddde2
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions src/axolotl/prompt_strategies/sharegpt.py
@@ -280,6 +280,17 @@ def tokenize_prompt(self, prompt):
                         labels[:len_role] = [IGNORE_TOKEN_ID] * min(
                             len_role, len(labels)
                         )
+                elif role == "":
+                    turn = content
+                    # this is only ever the first part, should include the bos token and the user query
+                    res = self._tokenize(
+                        turn, add_eos_token=False, strip_bos_token=False
+                    )
+                    if self.train_on_inputs:
+                        labels = copy.deepcopy(res["input_ids"])
+                    else:
+                        # everything from this is masked out from the labels
+                        labels = [IGNORE_TOKEN_ID] * len(res["input_ids"])
                 else:
                     LOG.warning(f"unhandled role: {role}")
                     continue
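
The restored branch can be illustrated outside the tokenizer class with a minimal sketch. It is not the axolotl implementation: `toy_tokenize` is a hypothetical stand-in for `self._tokenize`, `labels_for_empty_role` is a hypothetical helper name, and `IGNORE_TOKEN_ID = -100` is an assumed value chosen to match the default `ignore_index` of PyTorch's cross-entropy loss. It only shows how the labels differ depending on `train_on_inputs`.

```python
import copy

# Assumed value; matches the default ignore_index of PyTorch's cross-entropy loss.
IGNORE_TOKEN_ID = -100


def toy_tokenize(text, bos_id=1):
    """Hypothetical stand-in for self._tokenize.

    Keeps a BOS id at the front (as with strip_bos_token=False) and adds
    no EOS id (as with add_eos_token=False). One id per character.
    """
    return {"input_ids": [bos_id] + [ord(ch) for ch in text]}


def labels_for_empty_role(content, train_on_inputs):
    """Build input ids and labels for a turn whose role is the empty string."""
    res = toy_tokenize(content)
    if train_on_inputs:
        # The turn contributes to the loss: labels are a copy of the input ids.
        labels = copy.deepcopy(res["input_ids"])
    else:
        # The turn is context only: every position is masked out of the loss.
        labels = [IGNORE_TOKEN_ID] * len(res["input_ids"])
    return res["input_ids"], labels


masked_ids, masked_labels = labels_for_empty_role("hi", train_on_inputs=False)
trained_ids, trained_labels = labels_for_empty_role("hi", train_on_inputs=True)
```

In both cases the labels have the same length as the input ids, so the turn can be appended to the running sequence without misaligning later turns.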