Error on training when used preprocess_shards #69

Closed
ghost opened this issue Nov 18, 2016 · 2 comments

ghost commented Nov 18, 2016

Because my training dataset is large, I had to use preprocess-shards.py to split it. When I run train.lua, I get the following error:
loading data...
/home/sergio/torch/install/bin/luajit: /home/sergio/torch/install/share/lua/5.1/hdf5/group.lua:312: HDF5Group:read() - no such child 'num_source_features' for [HDF5Group 33554432 /]
It seems 'num_source_features' is written by preprocess.lua but not by the shards script.
Could you please advise? Thanks

@guillaumekln
Contributor

Unfortunately, preprocess-shards.py still lags behind preprocess.lua in features because of heavy code duplication between the two. In the meantime, you can use the updated implementation from @mdasadul:

https://github.com/mdasadul/seq2seq-attn/blob/bcd899ec990da6b2c5c616aab5ac77b5c7760dc6/preprocess-shards.py

See #49.
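
To check a shard file before starting training, something like the following may help. It is only a sketch: it assumes the third-party h5py package is installed, and the only required name it knows about is 'num_source_features', taken from the error message above. The helper names `missing_keys` and `check_file` are made up for illustration and are not part of seq2seq-attn.

```python
# Names that train.lua is assumed to read from the data file. Only
# 'num_source_features' is confirmed by the error message in this thread;
# add any others you know your version of train.lua needs.
REQUIRED_KEYS = ["num_source_features"]


def missing_keys(group, required=REQUIRED_KEYS):
    """Return the required names that are absent from an HDF5 group.

    `group` can be an open h5py.File / h5py.Group or any mapping,
    since only the `in` operator is used.
    """
    return [name for name in required if name not in group]


def check_file(path, required=REQUIRED_KEYS):
    """Open an HDF5 file read-only and list the required names it lacks."""
    import h5py  # third-party; imported here so missing_keys works without it

    with h5py.File(path, "r") as f:
        return missing_keys(f, required)
```

Calling `check_file` on each file produced by the shards script returns an empty list when the file has everything listed in `REQUIRED_KEYS`, and otherwise the names that train.lua would fail on.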

ghost commented Nov 18, 2016

Thanks!
