Better tied weight handling #464

Merged
merged 4 commits into main from tied-weight-handling
Nov 30, 2024
Conversation

@cg123 (Collaborator) commented on Nov 30, 2024

Handle cases where some input models have a tied tensor and some don't.

For example, there are some fine-tunes of Llama 3.2 3B floating around that are ~3.6B parameters because they have a separate LM head; with these changes, they can be merged with standard-sized models. The output model will have an LM head if any of the inputs have one. Otherwise, behavior is unchanged.
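To illustrate the idea (this is a minimal sketch, not mergekit's actual implementation): a model whose LM head is tied to its embedding can contribute the embedding weight in place of a missing `lm_head` tensor, and the merged output only carries an `lm_head` when at least one input has an untied one. The tensor names below (`lm_head.weight`, `model.embed_tokens.weight`) are the standard Llama naming and the helper functions are hypothetical.

```python
from typing import Dict, List, Optional

import torch


def get_tensor_with_tie_fallback(
    weights: Dict[str, torch.Tensor],
    name: str,
    tied_fallback: Optional[str] = None,
) -> Optional[torch.Tensor]:
    """Return a tensor by name, falling back to its tied counterpart if absent.

    Hypothetical helper: a model without a separate LM head contributes its
    tied embedding weight instead.
    """
    if name in weights:
        return weights[name]
    if tied_fallback is not None and tied_fallback in weights:
        return weights[tied_fallback]
    return None


def merge_lm_heads(models: List[Dict[str, torch.Tensor]]) -> Optional[torch.Tensor]:
    """Average the LM head across input models (simple linear merge for illustration).

    Mirrors the behavior described in the PR: the output only gets an LM head
    if at least one input actually has a separate (untied) one.
    """
    if not any("lm_head.weight" in m for m in models):
        return None  # all inputs are tied; keep the output tied as before
    tensors = [
        get_tensor_with_tie_fallback(
            m, "lm_head.weight", tied_fallback="model.embed_tokens.weight"
        )
        for m in models
    ]
    tensors = [t for t in tensors if t is not None]
    return torch.stack(tensors).mean(dim=0)
```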

cg123 merged commit 68c4b65 into main on Nov 30, 2024
6 checks passed
cg123 deleted the tied-weight-handling branch on November 30, 2024 at 21:55