
minor tweaks to simplify #597

Merged: 1 commit merged into main on Sep 18, 2023
Conversation

winglian (Collaborator):
Don't debug-log the attention mask value, since it is almost always exclusively 1 in all cases.
Simplify the config by no longer asking users to set total_num_tokens and sample_packing_eff_est: that computation has been optimized, and setting those values ahead of time can cause headaches.
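
To illustrate the idea of deriving these values instead of requiring them in the config, here is a minimal sketch (not the axolotl implementation; the function name and the assumed dataset shape, tokenized examples with an "input_ids" field, are illustrative):

```python
def estimate_packing_stats(dataset, max_seq_len):
    """Estimate total token count and packing efficiency from the dataset,
    rather than asking users to configure total_num_tokens and
    sample_packing_eff_est by hand. Illustrative sketch only."""
    # Total number of tokens across all examples.
    total_num_tokens = sum(len(example["input_ids"]) for example in dataset)

    # Upper bound on capacity if every packed sequence were completely full.
    num_sequences = (total_num_tokens + max_seq_len - 1) // max_seq_len
    capacity = num_sequences * max_seq_len

    # Efficiency estimate: fraction of the packed capacity actually used.
    sample_packing_eff_est = total_num_tokens / capacity
    return total_num_tokens, sample_packing_eff_est
```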

@@ -489,6 +489,8 @@ def calculate_total_num_steps(cfg, train_dataset, tokenizer):
     data_loader_len = data_loader.len_w_stats()
     actual_eff = data_loader.efficiency()
     LOG.info(f"data_loader_len: {data_loader_len}")
+    # FIXME: is there a bug here somewhere? the total num steps depends
+    # on the agreed on value for sample_packing_eff_est
Collaborator:
What does this mean? Do you mean the total number of steps should be affected by the sample packing efficiency, or by some other variable?

winglian (Collaborator, Author):

Exactly. Because the various processes can arrive at different calculations, we choose the max value to find the smallest step count that they all share. It doesn't seem to be a major issue at the moment, but it should get addressed.
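
A minimal sketch of the agreement described above (assumptions: torch.distributed is initialized, and each rank starts with its own efficiency estimate; the function and parameter names are illustrative, not axolotl's actual code):

```python
import math

import torch
import torch.distributed as dist


def agree_on_total_steps(local_eff: float, total_num_tokens: int,
                         max_seq_len: int, batch_size: int) -> int:
    # All-reduce with MAX so every rank ends up with the same (largest)
    # efficiency estimate; higher efficiency means fewer steps, so this is
    # the smallest step count that all ranks can share.
    eff = torch.tensor(local_eff)
    dist.all_reduce(eff, op=dist.ReduceOp.MAX)

    # Steps needed to cover the dataset at the agreed packing efficiency.
    tokens_per_step = max_seq_len * batch_size * eff.item()
    return math.ceil(total_num_tokens / tokens_per_step)
```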

winglian merged commit 31b9e0c into main on Sep 18, 2023 (4 checks passed).
winglian deleted the misc-fixes-20230917 branch on September 18, 2023 at 15:45.
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request on Dec 15, 2023.
djsaunde pushed a commit that referenced this pull request on Dec 17, 2024.