Fix wrong head divisor when loading dict
Great catch @ofivite!
janEbert committed Aug 16, 2024
1 parent ec55113 commit 92ca94c
Showing 1 changed file with 1 addition and 1 deletion.
@@ -137,7 +137,7 @@ def maybe_mup_init(module):
attn_norm_head_divisors = collections.defaultdict(lambda: attn_norm_head_divisor)
else:
# Here we don't use a `defaultdict` so that we get errors for missing values.
- attn_norm_head_divisors = base_head_widths
+ attn_norm_head_divisors = {name: math.sqrt(head_width) for (name, head_width) in base_head_widths.items()}

for name, layer in self.named_modules():
if (
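
A minimal sketch of the effect of this one-line change, assuming `base_head_widths` maps module names to head widths, as the added line implies. The helper name `build_head_divisors`, its `default_divisor` parameter, the `None` check that selects the fallback branch, and the layer names are hypothetical and not taken from the repository; only the dict comprehension mirrors the committed line.

```python
import collections
import math


def build_head_divisors(base_head_widths, default_divisor=None):
    """Return a per-layer mapping of attention-norm head divisors.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if base_head_widths is None:
        # Assumed fallback: every layer shares the same divisor.
        return collections.defaultdict(lambda: default_divisor)
    # Plain dict on purpose, so a missing layer name raises KeyError.
    # The divisor is the square root of the head width, not the width itself.
    return {
        name: math.sqrt(head_width)
        for (name, head_width) in base_head_widths.items()
    }


# Hypothetical layer names and widths.
widths = {"layers.0.attn": 64, "layers.1.attn": 128}
divisors = build_head_divisors(widths)
print(divisors["layers.0.attn"])  # 8.0, where the old code would have used 64
```

Before the fix, the loaded widths were used directly as divisors, so a layer with head width 64 would have been divided by 64 instead of 8.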
