Skip to content

Commit

Permalink
Add requirements.txt, .gitignore, corpora simple shortcut.
Browse files Browse the repository at this point in the history
  • Loading branch information
kimdwkimdw committed Jun 19, 2017
1 parent 4572f29 commit ce1f163
Show file tree
Hide file tree
Showing 3 changed files with 110 additions and 3 deletions.
103 changes: 103 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

corpora
logdir
preprocessed
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,9 @@ I don't intend to replicate the paper exactly. Rather, I aim to implement the ma

## Training
* STEP 1. Download [IWSLT 2016 German–English parallel corpus](https://wit3.fbk.eu/download.php?release=2016-01&type=texts&slang=de&tlang=en) and extract it to `corpora/` folder.
```sh
wget -qO- --show-progress https://wit3.fbk.eu/archive/2016-01//texts/de/en/de-en.tgz | tar xz; mv de-en corpora
```
* STEP 2. Adjust hyper parameters in `hyperparams.py` if necessary.
* STEP 3. Run `prepro.py` to generate vocabulary files to the `preprocessed` folder.
* STEP 4. Run `train.py` or download the [pretrained files](https://u42868014.dl.dropboxusercontent.com/u/42868014/transformer/logdir.zip).
Expand Down Expand Up @@ -86,6 +89,3 @@ got: Oh yeah you all are incredibly
source: Dies ist nicht meine Meinung Das sind Fakten<br>
expected: This is not my opinion These are the facts<br>
got: This is not my opinion These are facts



4 changes: 4 additions & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
nltk>=3.2.4
numpy>=1.13.0
regex>=2017.6.7
tensorflow>=1.2.0

0 comments on commit ce1f163

Please sign in to comment.