Commit

update docs
thammegowda committed Oct 21, 2021
1 parent fb73aec commit 4adfaf4
Showing 8 changed files with 1,087 additions and 535 deletions.
11 changes: 6 additions & 5 deletions docs/00-intro.adoc
@@ -1,9 +1,10 @@
== Overview


https://github.com/isi-nlp/rtg[Reader-Translator-Generator (RTG)^] is a Neural Machine Translation toolkit based on PyTorch.

link:versions.html[_See all versions_^]
* link:versions.html[_See all versions_^]
* Demo: 500-Eng multilingual NMT: http://rtg.isi.edu/many-eng/
=== Features
* Reproducible experiments: one `conf.yml` that has everything -- data paths, params, and
@@ -17,21 +18,21 @@ link:versions.html[_See all versions_^]
*** Many varieties of transformer: width-varying, skip transformer, etc., configurable from YAML files
*** https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf[RNN based Encoder-Decoder^] with https://nlp.stanford.edu/pubs/emnlp15_attn.pdf[Attention^]. (No longer using it, but it's available for experimentation)
* Language Modeling: RNN, Transformer
* And more ..
* And more ...
** Easy and interpretable code (for those who read code as much as papers)
** Object Oriented Design. (Not too many levels of functions and function factories like Tensor2Tensor)
** Experiments and reproducibility are the main focus. To control an experiment, you edit a YAML file inside the experiment directory.
** Wherever possible, prefer https://www.wikiwand.com/en/Convention_over_configuration[convention-over-configuration^]. Have a look at the experiment directory structure (below).

[#colab-example]
=== Quick Start using Google Colab
=== Google Colab Example

Use this Google Colab notebook for learning __how to train your NMT model with RTG__: https://colab.research.google.com/drive/198KbkUcCGXJXnWiM7IyEiO1Mq2hdVq8T?usp=sharing


=== Setup

`rtg` has been published to PyPI at https://pypi.org/project/rtg/
image:https://badge.fury.io/py/rtg.svg["PyPI version", link="https://badge.fury.io/py/rtg"]

----
pip install rtg
132 changes: 129 additions & 3 deletions docs/10-conf.yml.adoc
@@ -1,4 +1,4 @@
[#conf.yml]
[#conf]
== RTG *`conf.yml`* File

The key component of the RTG toolkit is `conf.yml`. As the name suggests, it is a YAML file containing configuration
@@ -18,7 +18,7 @@ such as BPE/char/words, and vocabulary size.
** Suite - a set of source and reference file pairs, for computing BLEU scores
[#conf-minimal]
=== Minimal Yet Complete Config File:
=== Config Example:

.conf.yml
[source,yaml]
@@ -92,6 +92,132 @@ updated_at: '2019-03-09T21:15:33.707183' # automatically updated by system
seed: 12345 # fix the manual seed of pytorch + cuda + numpy + python_stdlib RNGs. Remove/comment this to disable
----

[#config-opts]
=== Config options

.Summary of component choices
[%autowidth]
|===
|Component | Choices

|model
|tfmnmt, rnnmt, rnnlm, tfmlm, skptfmnmt, wvtfmnmt, wvskptfmnmt, tfmextembmt, robertamt, mtfmnmt, hybridmt, CBOW, tfmcls

|optimizer
| adam, sgd, adagrad, adam_w, adadelta, sparse_adam

|schedule
| noam, inverse_sqrt

|criterion
|sparse_cross_entropy, kl_divergence, focal_loss, binary_cross_entropy, smooth_kld, triplet_loss, smooth_kld_and_triplet_loss, dice_loss, squared_error

|===
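
For illustration, the optimizer-related components from the table could be wired together roughly as in the sketch below. This is a minimal sketch with placeholder values, not an authoritative example: the component names come from the table above, the accepted `args` are described in the following subsections, and the complete layout is shown in the config example earlier in this chapter.

[source,yaml]
----
# sketch only: one choice per component; values are placeholders
optimizer:
  name: adam
schedule:
  name: noam
criterion:
  name: smooth_kld
  args:
    label_smoothing: 0.1
----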


[#config-schedule]
==== `schedule` options

. `noam` with args:
* warmup
* constant
* model_dim

. `inverse_sqrt` with args:
* warmup
* peak_lr
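
As a hedged sketch, the two schedules could be configured as below. The arg names follow the lists above; the values are placeholders, not recommendations.

[source,yaml]
----
# noam schedule (sketch; values are placeholders)
schedule:
  name: noam
  args:
    warmup: 8000
    constant: 2
    model_dim: 512

# inverse_sqrt schedule (alternative; uncomment to use instead)
#schedule:
#  name: inverse_sqrt
#  args:
#    warmup: 4000
#    peak_lr: 0.0005
----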

[#config-criterion]
==== `criterion` options

* `smooth_kld` (recommended; used since the first version of transformer)
** `label_smoothing`: float : [0.0, 1.0) : optional: default=0.1
.Args to `smooth_kld`
|===
|Name |Type| Range/Choices| Required |Default
|`label_smoothing`
|`float`
| `[0.0, 1.0)`
| Optional
|0.1
|===

* `sparse_cross_entropy`
.Args to `sparse_cross_entropy`
|===
|Name |Type| Range/Choices| Required |Default | Comment

|`weight`
|`str`
| `{inv_freq, inv_sqrt_freq, inv_log_freq}`
| Optional
| None => disable weighting
|

|`weight_calm_time`
|`int`
| `[0, ∞)`
| Optional
| 0 => disable calming;
| Applicable when `weight` is enabled

|===


* `kl_divergence` (re-implementation of `smooth_kld` with some extra features)
.Args to `kl_divergence`
|===
|Name |Type| Range/Choices| Required |Default

|`label_smoothing`
|`float`
| `[0.0, 1.0)`
| Optional
| 0.0 => disable label smoothing

|`weight`
|`str`
| `{inv_freq, inv_sqrt_freq, inv_log_freq}`
| Optional
| None => disable weighting

|`weight_calm_time`
|`int`
| `[0, ∞)`
| Optional
| 0 => disable calming => weights applicable from step 0

|===

* `focal_loss`
.Args to `focal_loss`
|===
|Name |Type| Range/Choices| Required |Default
|`gamma`
|`float`
| `[0.0, ∞)`
| Optional
| 0.0 => disable => cross entropy

|`weight_calm_time`
|`int`
| `[0, ∞)`
| Optional
| 0 => disable calming => weights applicable from step 0

|===

* _Experimental loss functions:_
** `dice_loss`
** `binary_cross_entropy`
** `triplet_loss`
** `squared_error`
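
As a rough sketch, a criterion with class weighting enabled could look like the block below; the arg names follow the tables above, while the values are placeholders only.

[source,yaml]
----
# kl_divergence with class weighting (sketch; values are placeholders)
criterion:
  name: kl_divergence
  args:
    label_smoothing: 0.1
    weight: inv_sqrt_freq    # one of inv_freq, inv_sqrt_freq, inv_log_freq
    weight_calm_time: 10000  # 0 disables calming, i.e. weights apply from step 0
----
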
[#conf-early-stop]
=== Early stop
Add the piece of config below to the `trainer` block to enable early stopping on convergence.
@@ -243,7 +369,7 @@ prep:
----

[#conf-vocab]
== Vocabulary Preprocessing using Sentencepiece or NLCodec
== Vocabulary Preprocessing

link:https://github.com/google/sentencepiece[Google's sentencepiece] is an awesome library for
preprocessing text datasets.
28 changes: 6 additions & 22 deletions docs/80-migration.adoc → docs/15-migration.adoc
@@ -1,11 +1,14 @@
[#migrate]
== Migration

[#migrate-to-0_6]
== Migration from v0.5.0 or earlier to v0.6.0
=== v0.5.0 or earlier to v0.6.0

The optimizer block got a big update in v0.6.0; as a result, it is not backward compatible.

.Old config, prior to v0.6.0:

[yaml]
[source,yaml]
----
optim:
args:
@@ -24,7 +27,7 @@ optim:
name: ADAM
----
.New config in v0.6.0
[yaml]
[source,yaml]
----
optimizer:
name: adam
@@ -47,22 +50,3 @@ criterion:
args:
label_smoothing: 0.1
----


=== Learning rate schedule

. `noam` with args:
* warmup
* constant
* model_dim

. `inverse_sqrt` with args:
* warmup
* peak_lr

=== Criterion
. `cross_entropy`
* label smoothing not implemented yet, FIXME: support label smoothing
. `smooth_kld`
* `label_smoothing`
. Other (experimental): `binary_cross_entropy`, `triplet_loss`
2 changes: 1 addition & 1 deletion docs/45-scaling.adoc
@@ -1,5 +1,5 @@
[#scaling-big]
== Scaling to Big Datasets Using PySpark
== Scaling Big Using PySpark

When dealing with big datasets, traditional tools such as multiprocessing and SQLite3 simply aren't enough.
In such scenarios, https://spark.apache.org/[PySpark] is a useful backend.
8 changes: 6 additions & 2 deletions docs/index.adoc
@@ -12,11 +12,14 @@ USC Information Sciences Institute Natural Language Group
//injects google analytics to <head>
:docinfo2:
:hide-uri-scheme:
:source-highlighter: rouge

include::00-intro.adoc[]

include::10-conf.yml.adoc[]

include::15-migration.adoc[]

include::20-clitools.adoc[]

include::30-environ.adoc[]
@@ -25,8 +28,9 @@ include::40-train-pro.adoc[]

include::45-scaling.adoc[]


include::50-serve.adoc[]

include::60-develop.adoc[]

include::80-migration.adoc[]

include::60-develop.adoc[]