How is the file "cloze_test_test__spring2016 - cloze_test_ALL_test.csv" created? #41
Comments
You need to export each Google Sheet to a CSV file (from https://docs.google.com/spreadsheets/d/1FkdPMd7ZEw_Z38AsFSTzgXeiJoLdLyXY_0B_0JIJIbw/edit#gid=81257118 and https://docs.google.com/spreadsheets/d/11tfmMQeifqP-Elh74gi2NELp0rx9JMMjnQ_oyGKqCEg/edit#gid=410941117).
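For anyone who prefers to script the export, here is a minimal sketch. It assumes the sheets are publicly readable and that Google's `export?format=csv&gid=...` URL pattern applies to them; the helper name `sheet_csv_export_url` is hypothetical and not part of this repository. When you export from the Sheets UI instead (File > Download > Comma-separated values), the downloaded file appears to be named "<spreadsheet title> - <tab title>.csv", which would explain the filename asked about in this issue.

```python
import re
from urllib.parse import urlparse, parse_qs


def sheet_csv_export_url(edit_url: str) -> str:
    """Turn a Google Sheets 'edit' URL into a CSV export URL for the same tab.

    Assumption: the sheet is shared publicly and the export endpoint is available.
    """
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", edit_url)
    if match is None:
        raise ValueError("not a Google Sheets URL")
    sheet_id = match.group(1)
    # The tab id lives in the URL fragment (#gid=...); fall back to the first tab.
    fragment = urlparse(edit_url).fragment
    gid = parse_qs(fragment).get("gid", ["0"])[0]
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv&gid={gid}"


url = sheet_csv_export_url(
    "https://docs.google.com/spreadsheets/d/"
    "1FkdPMd7ZEw_Z38AsFSTzgXeiJoLdLyXY_0B_0JIJIbw/edit#gid=81257118"
)
print(url)
```

Fetching the printed URL in a browser or with any HTTP client should download the CSV. You would still need to save it under the filename that datasets.py expects.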
Thanks! So the model is not trained on the entire dataset "ROCStories__spring2016 - ROCStories_spring2016.csv"?
According to the datasets.py file, it's trained on 1497 examples from 'cloze_test_val__spring2016 - cloze_test_ALL_val.csv', validated on 374 examples from the same file, and tested on 'cloze_test_test__spring2016 - cloze_test_ALL_test.csv'.
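To make the numbers concrete: 1497 + 374 = 1871 examples come from the single validation file, and they are divided into a training part and a held-out part. The sketch below shows one way such a split could be done. It is not the code from datasets.py; the function name `split_train_valid` and the seed are hypothetical, and integers stand in for the CSV rows.

```python
import random


def split_train_valid(rows, n_valid=374, seed=0):
    """Shuffle rows reproducibly and hold out n_valid of them for validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    # The first n_valid shuffled rows are held out; the rest are used for training.
    return rows[n_valid:], rows[:n_valid]


# Placeholder data: one integer per row of the validation CSV.
examples = list(range(1497 + 374))
train, valid = split_train_valid(examples)
print(len(train), len(valid))  # 1497 374
```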
That looks like a catastrophically small dataset for a deep learning model, doesn't it? I have heard that about 1 GB of text data is a good starting point for an adequate model. How does this work?
The idea of the OpenAI paper is to use a pretrained network and transfer what it knows about language to another task. By doing this, you can obtain really good results with a small dataset.
The dataset downloaded from the website comes with different filenames, none of which matches this one. Can you please elaborate on how this file is created, e.g. by merging the train, test, and val files? Preferably with the filenames of those files. Thanks!