This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

Probable error on line 306 in create_pretraining_data.py for albert #256

Open
wjdghks950 opened this issue Jan 18, 2022 · 0 comments


wjdghks950 commented Jan 18, 2022

a_end = rng.randint(1, len(current_chunk) - 1)

There appears to be a probable issue on line 306.

For random.randint(start, end), the method is end-inclusive.

So, when len(current_chunk) == 2, line 309 would stop at a single iteration.
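To make the inclusive-bound behavior concrete, here is a minimal sketch that isolates the quoted expression. The helper name `sample_a_end` and the toy segment lists are hypothetical stand-ins, not code from the repository, and `rng` is assumed to be a `random.Random` instance.

```python
import random


def sample_a_end(current_chunk, rng):
    # Same expression as the quoted line. It assumes len(current_chunk) >= 2;
    # with a single segment, randint(1, 0) would raise ValueError.
    return rng.randint(1, len(current_chunk) - 1)


rng = random.Random(0)

# Hypothetical stand-ins for current_chunk (lists of token lists).
two_segments = [["a"], ["b"]]
three_segments = [["a"], ["b"], ["c"]]

# Collect every value the expression produces over many draws.
values_for_two = {sample_a_end(two_segments, rng) for _ in range(1000)}
values_for_three = {sample_a_end(three_segments, rng) for _ in range(1000)}

print(values_for_two)    # {1}: randint(1, 1) can only return 1
print(values_for_three)  # a subset of {1, 2}, since both bounds are inclusive
```

With two segments the expression is deterministic, so any loop over `range(a_end)` runs exactly once.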

While this may allow the model to incorporate the single leftover chunk (if it were to enter the first elif statement in line 339), it will leave that chunk out of the training instances.

Please address this issue.

wjdghks950 changed the title from "Error on line 306 in create_pretraining_data.py for albert" to "Probable error on line 306 in create_pretraining_data.py for albert" on Jan 18, 2022
wjdghks950 reopened this on Jan 18, 2022