
Understanding Prediction via Unsupervised Rationale Generation

This repo contains a TensorFlow implementation of the system described in
"Rationalizing Neural Predictions". Tao Lei, Regina Barzilay, and Tommi Jaakkola. EMNLP 2016. [PDF]

A Theano implementation is available at: https://github.com/taolei87/rcnn

To run the code with default parameters:

```
$ python3 model.py --embedding=data/review+wiki.filtered.200.txt.gz \
                   --training=data/reviews.aspect1.train.txt.gz \
                   --testing=data/reviews.aspect1.heldout.txt.gz \
                   --output=output.json
```

Data sets can be found through the GitHub repository linked above. Embeddings may be any set of word vectors.

Overview

The objective is to create a model which can simultaneously classify text documents and provide justifications for those classifications via the text itself. By specifying two sub-components, a Generator and an Encoder, and training them in concert, the model learns to choose concise phrases which are then used to make the classification.
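As a concrete illustration, here is a minimal sketch of the two components in modern TensorFlow/Keras. It is not the repository's code (which predates Keras); the layer types, sizes, and names are assumptions chosen for clarity.

```python
import tensorflow as tf

class Generator(tf.keras.Model):
    """Produces a per-token probability that each word is part of the rationale."""
    def __init__(self, vocab_size, embed_dim=200, hidden=128):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(vocab_size, embed_dim)
        self.rnn   = tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(hidden, return_sequences=True))
        self.prob  = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, tokens):                       # tokens: (batch, time) int ids
        h = self.rnn(self.embed(tokens))          # (batch, time, 2 * hidden)
        return tf.squeeze(self.prob(h), -1)       # (batch, time) selection probs

class Encoder(tf.keras.Model):
    """Predicts the score from only the words the generator selected."""
    def __init__(self, vocab_size, embed_dim=200, hidden=128):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(vocab_size, embed_dim)
        self.rnn   = tf.keras.layers.GRU(hidden)
        self.out   = tf.keras.layers.Dense(1)     # scalar rating

    def call(self, tokens, mask):                 # mask: (batch, time) in {0, 1}
        # Zero out the embeddings of unselected words before encoding.
        x = self.embed(tokens) * tf.expand_dims(mask, -1)
        return tf.squeeze(self.out(self.rnn(x)), -1)
```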

Using TensorBoard, the model can easily be visualized. The two components are linked at each step, such that they form a step-ladder reinforcement scheme.

In other words, the generator determines which text to select by gradually increasing or decreasing each word's probability of selection, based on the encoder's ability to predict the classification from that selection. In turn, the encoder learns to predict the correct classification for a given text based on the snippets provided by the generator.
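A sketch of what one joint training step might look like under this scheme, using a REINFORCE-style estimator to pass the encoder's per-sample cost back through the non-differentiable sampling step. The names continue the sketch above and are not taken from the repository.

```python
import tensorflow as tf

def train_step(tokens, labels, generator, encoder, gen_opt, enc_opt):
    with tf.GradientTape(persistent=True) as tape:
        probs = generator(tokens)                            # (batch, time)
        # Sample a hard binary selection mask from the generator's probabilities.
        mask  = tf.cast(tf.random.uniform(tf.shape(probs)) < probs, tf.float32)
        preds = encoder(tokens, mask)
        enc_loss = tf.reduce_mean(tf.square(preds - labels))

        # Sampling is non-differentiable, so the generator is trained with a
        # REINFORCE-style estimator: weight the log-likelihood of the sampled
        # mask by the cost the encoder incurred on that sample.
        cost = tf.stop_gradient(tf.square(preds - labels))   # (batch,)
        logp = mask * tf.math.log(probs + 1e-8) \
             + (1.0 - mask) * tf.math.log(1.0 - probs + 1e-8)
        gen_loss = tf.reduce_mean(cost * tf.reduce_sum(logp, axis=1))

    enc_opt.apply_gradients(zip(
        tape.gradient(enc_loss, encoder.trainable_variables),
        encoder.trainable_variables))
    gen_opt.apply_gradients(zip(
        tape.gradient(gen_loss, generator.trainable_variables),
        generator.trainable_variables))
    del tape
    return enc_loss
```

The `stop_gradient` on the cost ensures the generator is updated only through the log-likelihood of its own selections, while the encoder is updated through its prediction loss alone.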

This reinforcement pattern must be balanced to achieve learning: too much weight on either component overwhelms the other, and the model converges suboptimally.
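One simple way to express that balance (values purely illustrative, not the repository's): down-weight the generator's loss and give it a smaller learning rate, so selection probabilities change more slowly than the encoder's predictions improve.

```python
import tensorflow as tf

# Hypothetical balancing knobs; scale `gen_loss` from the sketch above
# by GEN_LOSS_WEIGHT before applying gradients.
GEN_LOSS_WEIGHT = 0.1
gen_opt = tf.keras.optimizers.Adam(learning_rate=1e-4)  # slower generator
enc_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)  # faster encoder
```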

Performance

Initially the generator selects text at random, which appears as uniform noise in the visualization. As the encoder provides feedback on these selections, the generator begins creating a sparser representation, eventually converging to groups of words in each text sample. The images below show different points in the training process, with the later ones showing groups of rationales emerging (the vertical axis represents the text document, and the horizontal the batch dimension). The dark blue portions of the images represent padding, which is ignored by the model.

[Images: selection heatmaps at successive points in training, from uniform noise to sparse groups of rationales]

The model as constructed tends to bounce between a small sampling rate and a large one, primarily because the evaluation function adds a higher cost at both extremes (too little text and too much). This allows the model to recheck previously discarded text later in training, re-evaluating it in the context of better-learned weights. It also slows overall convergence, but in essence performs a regularization on the selections. Alternative cost functions may speed up convergence to text segments by penalizing short chains of text more heavily than longer ones, in a more explicit manner than is currently implemented.
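For reference, a selection cost in the sparsity-plus-coherence style of the paper can be written in a few lines; a penalty that rises at both extremes, or one that weights short runs more heavily, would be variations on the same function. The coefficients and exact form here are assumptions, not the repository's implementation.

```python
import tensorflow as tf

def selection_cost(mask, lambda_sparsity=2e-4, lambda_coherence=4e-4):
    """Regularization on a binary selection mask of shape (batch, time).

    `sparsity` counts selected words; `coherence` counts breaks between
    selected and unselected words, so contiguous phrases are cheaper than
    scattered single words. Coefficients are illustrative only.
    """
    sparsity  = tf.reduce_sum(mask, axis=1)
    coherence = tf.reduce_sum(tf.abs(mask[:, 1:] - mask[:, :-1]), axis=1)
    return lambda_sparsity * sparsity + lambda_coherence * coherence
```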


Perfect convergence (on the current dataset) is likely impossible due to the subjective relationship between scores and reviews. Take, for example, the following test-set sample:

[Image: a test-set review with low information density and an ambiguous score]

It would be difficult to rationalize the prediction even as a human, due to the low information density and inherent ambiguity. Outliers likewise tend to be difficult for the network to predict:

[Image: an outlier sample with an inaccurate prediction]

For many samples, however, the network performs well both on the prediction and on the text selected to justify that prediction:

[Images: samples with accurate predictions and their selected rationales]

One thing to note: though these samples are predicted against the given rating for 'taste', the network doesn't necessarily select text which deals with taste directly (e.g. specific flavors). Instead, it selects the text most predictive of the score given; this text is entirely dependent upon what the reviewers themselves thought contributed to the score they gave.
