
RNN Encoder-Decoder in PyTorch

A minimal PyTorch implementation of the RNN Encoder-Decoder (Cho et al. 2014) for sequence-to-sequence learning.

Supported features:

  • Mini-batch training with CUDA
  • Lookup, CNN, RNN and/or self-attentive encoding in the embedding layer
  • Global and local attention (Luong et al. 2015)
  • Vectorized computation of alignment scores in the attention layer
  • Input feeding (Luong et al. 2015)
  • CopyNet (Gu et al. 2016)
  • Beam search decoding
  • Attention visualization
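
To illustrate the vectorized alignment scores, here is a minimal sketch of Luong-style "general" global attention, computed for all source positions of a batch at once. The function and tensor names (global_attention, decoder_state, encoder_outputs, W_a) are illustrative, not this repository's API:

```python
import torch

def global_attention(decoder_state, encoder_outputs, W_a):
    # decoder_state: (batch, hidden); encoder_outputs: (batch, src_len, hidden)
    # W_a: (hidden, hidden) weight matrix of the "general" score function.
    # One batched matmul yields scores for every source position: (batch, src_len)
    scores = torch.bmm(encoder_outputs, (decoder_state @ W_a).unsqueeze(2)).squeeze(2)
    weights = torch.softmax(scores, dim=1)  # attention distribution over src_len
    # Context vector: attention-weighted sum of encoder outputs, (batch, hidden)
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
    return context, weights
```

In practice, padded source positions would also be masked out before the softmax; this sketch omits masking for brevity.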

Usage

Training data should be formatted as one tab-separated source/target pair per line:

source_sequence \t target_sequence
source_sequence \t target_sequence
...
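
For example, a toy training file in this format can be written like so (the pairs and the file name are illustrative only):

```python
# Write tab-separated source/target pairs, one pair per line.
pairs = [("he is happy", "il est heureux"), ("thank you", "merci")]
with open("training_data", "w") as f:
    for src, tgt in pairs:
        f.write(f"{src}\t{tgt}\n")
```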

To prepare data:

python prepare.py training_data

To train:

python train.py model vocab.src vocab.tgt training_data.csv num_epoch

To predict:

python predict.py model.epochN vocab.src vocab.tgt test_data

References

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473.

Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le. 2017. Massive Exploration of Neural Machine Translation Architectures. arXiv:1703.03906.

Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078.

Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li. 2016. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. arXiv:1603.06393.

Jiwei Li. 2017. Teaching Machines to Converse. Doctoral dissertation. Stanford University.

Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su. 2018. Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation. arXiv:1808.07374.

Minh-Thang Luong, Hieu Pham, Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. arXiv:1507.04025.

Sam Wiseman, Alexander M. Rush. 2016. Sequence-to-Sequence Learning as Beam-Search Optimization. arXiv:1606.02960.

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144.
