seq2seq-chatbot

**I've mostly stopped working on this project in favor of my newer implementation, neural-chatbot, built with TensorFlow.**

An implementation of Google's seq2seq architecture.

I am also simultaneously blogging about the process.

##How to Use

First, install Torch if you haven't already. Here is an easy install process.

You will need to install a few packages to get this to work as well:

$ luarocks install nn
$ luarocks install rnn

Optional (for GPU usage with NVIDIA/CUDA):

$ luarocks install cunn
$ luarocks install cutorch

Or, if you have an AMD GPU (OpenCL):

$ luarocks install clnn
$ luarocks install cltorch

If you get errors, you should try installing these packages as well:

$ luarocks install dpnn
$ luarocks install cunnx

You will need one or more large corpus text files in which each line is a conversational phrase. Within each pair of consecutive lines, the preceding line is assumed to be the source and the following line the target.
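To sanity-check a corpus before training, it can help to preview the source/target pairs it implies. The sketch below is not part of this repository: `pair_lines` is a hypothetical helper name, and it assumes that every line serves as the target of the line before it and the source of the line after it (overlapping pairs). If the loader pairs lines differently, adjust accordingly.

```shell
# pair_lines: print "source<TAB>target" for each pair of consecutive lines.
# Hypothetical helper, not shipped with this project.
# Reads the files given as arguments, or stdin if none are given.
pair_lines() {
  awk 'NR > 1 { print prev "\t" $0 }  # emit previous line as source, current as target
       { prev = $0 }                  # remember the current line for the next pair
      ' "$@"
}

# Small demo on three phrases; prints two tab-separated pairs.
printf 'hi\nhello\nhow are you\n' | pair_lines
```

Running it against a real corpus file and piping the output through `head` gives a quick look at what the model would be asked to learn.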

To simplify things, my plan is to either include a bash script that downloads a decent-sized, pre-cleaned corpus, or to include the corpus itself in the data directory. I will do this in the near future, probably after I finish the TODO list below.

##Examples of Usage and Training

Run

$ th train.lua

The dataset is stored in data/raw/ and comes from my other project, opensubtitles-parser.

##A few notes:

- Some of this was borrowed from char-rnn.
- This is based on Sutskever et al., 2014 and Vinyals & Le, 2015.
- The data this is being tested on is the OpenSubtitles dataset. I used a script I made to tokenize it and create the input/output sequences.
- I made heavy use of the rnn package provided by Element Research.

##Unfinished TODO

- Need to finish implementing prediction capabilities (i.e., an actual chatbot interface for trained models).
- Look into the checkpointing system: each checkpoint seems to take 3.6 GB. Something is clearly wrong there.
- Need to profile the code to check for unnecessary bottlenecks.
- More testing and algorithm verification.
