Paper, Tags: #nlp, #architectures
Our work is the first to apply a BI-LSTM-CRF model to NLP benchmark sequence tagging datasets. The model can use both past and future input features, and it can also use sentence-level tag information thanks to a CRF layer. It is robust and depends less on word embeddings than previous approaches.
An RNN maintains a memory based on history information. LSTM networks are the same as RNNs, except that the hidden layer updates are replaced by purpose-built memory cells. They're better at finding and exploiting long-range dependencies in the data.
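A minimal sketch (assuming PyTorch) of one LSTM step, showing how the plain RNN hidden update is replaced by gated memory-cell operations; the weight and gate names are illustrative, not from the paper:

```python
import torch

def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b):
    # One linear map produces all four gate pre-activations at once.
    gates = x_t @ W_x + h_prev @ W_h + b
    i, f, o, g = gates.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    c_t = f * c_prev + i * g       # memory cell carries long-range information forward
    h_t = o * torch.tanh(c_t)      # hidden state is a gated view of the cell
    return h_t, c_t
```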
BI-LSTM networks are used to have access to both past and future input features. They are trained using backpropagation through time.
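A minimal sketch, assuming PyTorch, of a bidirectional LSTM encoder: each position sees both past (forward direction) and future (backward direction) input features, and backpropagation through time is handled by autograd when `.backward()` is called; the sizes are made up for illustration:

```python
import torch
import torch.nn as nn

embed_dim, hidden_dim, num_tags = 100, 128, 9   # illustrative sizes, not from the paper
bilstm = nn.LSTM(embed_dim, hidden_dim, bidirectional=True, batch_first=True)
proj = nn.Linear(2 * hidden_dim, num_tags)      # forward + backward states are concatenated

x = torch.randn(1, 6, embed_dim)                # dummy 6-token sentence of word embeddings
h, _ = bilstm(x)                                # h: (1, 6, 2 * hidden_dim)
emissions = proj(h)                             # per-position tag scores, (1, 6, num_tags)
```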
CRF networks focus on the sentence level instead of individual positions, and their inputs and outputs are directly connected (as opposed to LSTM networks, where memory cells sit between them).
The combined network efficiently uses past input features via an LSTM layer and sentence-level tag information via a CRF layer. The CRF layer has a state transition matrix (position-independent) as parameters, so past and future tags can be used to predict the current tag, much like past and future input features are used via a BI-LSTM network.
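A minimal sketch of the sentence-level score such a CRF layer assigns to a tag sequence: per-position emission scores from the network plus a position-independent transition matrix A, where A[i, j] scores moving from tag i to tag j; shapes and names here are assumptions for illustration:

```python
import torch

def sentence_score(emissions, tags, transitions):
    # emissions: (seq_len, num_tags), tags: (seq_len,), transitions: (num_tags, num_tags)
    score = emissions[0, tags[0]]
    for t in range(1, tags.size(0)):
        score = score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    return score
```

Training would maximize this score for the gold tag sequence against the log-sum-exp over all possible tag sequences (computed with the forward algorithm), and decoding would pick the best-scoring sequence with Viterbi over the same emission and transition scores.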
The BI-LSTM-CRF model can use the future input features as well (via the bidirectional LSTM), which boosts tagging accuracy.