- Full logging system (issue #5)
- Add multi-gpu support (issue #6)
- Use multiple workers to load embeddings, and support loading embeddings on the fly to reduce memory usage (issues #8, #11)
- Add convenience function to generate candidates - all pairs from a list / cartesian product of multiple lists
- Add error handling for CalledProcessError in utils.gpu_mem
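
The planned candidate-generation helper could look roughly like the sketch below. The names `all_pairs` and `cross_pairs` are hypothetical and are not part of the D-SCRIPT API; the sketch only illustrates the two modes mentioned above (all pairs from one list, Cartesian product of several lists) using the standard library.

```python
from itertools import combinations, product


def all_pairs(proteins):
    """Return every unordered pair from one list, without self-pairs."""
    return list(combinations(proteins, 2))


def cross_pairs(*protein_lists):
    """Return the Cartesian product: one item taken from each list."""
    return list(product(*protein_lists))


# Hypothetical protein identifiers, for illustration only.
pairs = all_pairs(["P1", "P2", "P3"])
cross = cross_pairs(["A1", "A2"], ["B1"])
```

Here `combinations` treats (P1, P2) and (P2, P1) as the same candidate, which is one reasonable choice if the model's prediction does not depend on pair order; that is an assumption of this sketch.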
- Resolve #24 by fixing training
- Can now run `dscript train --train data/pairs/human_train.tsv --test data/pairs/human_test.tsv --embedding /afs/csail/u/s/samsl/Work/databases/STRING/homo.sapiens/human_nonRed.h5 --output [output] --save-prefix [prefix] --device 0` to replicate paper results
- Updated code formatting with black and pre-commit
- Following previous update, addresses #24 by fixing model training while maintaining preferred API and command line usage
- Fixed significant bug in how training was run by reverting to older code
- Should address issue #24: unable to replicate paper results
- To do: code cleaning to bring up to formatting standards while maintaining performance
- The augmentation fix in v0.1.5 was still buggy and would throw an error; it now resets the index
- Changed `--use-w` and `--augment` to `--no-w` and `--no-augment` with `store_false`
- Updated package level imports
- Updated documentation
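
As a rough illustration of the flag change above, the sketch below shows how `store_false` flags behave in `argparse`: the feature is on by default and passing the flag turns it off. The `dest` names (`use_w`, `augment`) and the parser itself are assumptions made for this example, not the actual D-SCRIPT argument definitions.

```python
import argparse

# Stand-in parser; the real `dscript train` parser defines many more options.
parser = argparse.ArgumentParser(prog="train-sketch")

# action="store_false" gives the destination a default of True,
# and sets it to False when the flag is present.
parser.add_argument("--no-w", dest="use_w", action="store_false",
                    help="turn off the option formerly enabled by --use-w")
parser.add_argument("--no-augment", dest="augment", action="store_false",
                    help="turn off data augmentation")

defaults = parser.parse_args([])
disabled = parser.parse_args(["--no-w", "--no-augment"])
```

With no flags given, both `defaults.use_w` and `defaults.augment` are `True`; passing the flags sets them to `False`.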
- Fixed issue #13: improper augmentation of data
- Fixed issue #12: overwrites cmap data sets if they already exist
- Fixed issue #7: bug which would crash contact module if called directly
- Fixed issues #3, #4
- Basic logging system implemented to report skipped pairs
- Fixed wrong variable name in loading from sequence file
- Updated documentation
- Model should be put into `eval()` mode before prediction or evaluation, and when new models are downloaded; this makes the output deterministic by disabling dropout layers
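
The effect of `eval()` described in the last item can be seen in a minimal PyTorch sketch. The small model here is a stand-in with a dropout layer, not the D-SCRIPT architecture, and the tensor shapes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Stand-in model containing a dropout layer (not the D-SCRIPT model).
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
x = torch.randn(4, 8)

# eval() switches dropout off, so repeated forward passes on the
# same input return the same output.
model.eval()
with torch.no_grad():
    first = model(x)
    second = model(x)

same_in_eval = torch.equal(first, second)
```

Leaving the model in training mode would keep dropout active, so two predictions on the same input could differ between calls.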