
leitro/handwrittenEsposallesCTC


handwrittenEsposallesCTC

Handwritten text recognition model for the Esposalles datasets, based on BLSTM and CTC.

My software environment:

  • Ubuntu 16.04 x64
  • Python 3.5
  • TensorFlow 1.1

Structure:

The model recognizes handwritten text using BLSTM and CTC. For now it uses the simplest possible design: a single convolutional layer for feature extraction, followed by a single BLSTM layer, with CTC as the loss function.
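To illustrate what CTC implies for the output side of such a network, here is a minimal sketch of best-path (greedy) decoding: the per-frame labels are collapsed by merging consecutive repeats and then dropping the blank symbol. The function name, the toy alphabet, and the choice of index 0 for the blank are assumptions made for this example; they are not taken from esposallesSequenceCTC.py.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


# Hypothetical alphabet for illustration only; index 0 plays the role of the CTC blank.
alphabet = ["<blank>", "a", "n", "y"]

# One label per time step, e.g. the argmax of the network output at each frame.
frames = [0, 1, 1, 0, 2, 2, 2, 0, 2, 3, 3]

decoded = "".join(alphabet[i] for i in ctc_greedy_collapse(frames))
print(decoded)  # anny
```

The blank between the two runs of "n" is what lets the doubled letter survive the merge step.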

Esposalles Datasets:

The Esposalles datasets (available at Esposalles Datasets) come in 2 types: textline-based and word-based.

Figure 1. Textline-based Esposalles datasets

Figure 2. Word-based Esposalles datasets

Note: This repository contains a folder named "groundTruth", which makes the datasets easier to use. After you download the Esposalles datasets, please copy the ground-truth txt files into the corresponding dataset folders.
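The copy step can also be scripted. The following is only a sketch: the helper name copy_ground_truth and both directory arguments are hypothetical, and which ground-truth file belongs in which dataset folder has to be checked against the repository layout.

```python
import shutil
from pathlib import Path


def copy_ground_truth(ground_truth_dir, dataset_dir):
    """Copy every .txt file from ground_truth_dir into dataset_dir.

    Both paths are placeholders supplied by the caller.
    Returns the names of the copied files.
    """
    src = Path(ground_truth_dir)
    dst = Path(dataset_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for txt_file in sorted(src.glob("*.txt")):
        # str() keeps this compatible with Python 3.5, where shutil
        # does not yet accept Path objects.
        shutil.copy2(str(txt_file), str(dst / txt_file.name))
        copied.append(txt_file.name)
    return copied
```

Call it once per dataset folder, passing the matching ground-truth folder and the folder where the downloaded images live.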

Usage:

  • esposallesData.py preprocesses the Esposalles datasets. Switch between the textline-based and word-based datasets by changing the boolean value "TEXTLINE".
  • esposallesSequenceCTC.py is the main program. It contains the class SeqLearn().

Result:

While the program runs, it writes "train_cer.log" and "test_cer.log". When training finishes, the character error rate (CER) can be plotted with showplt.py. On the textline-based datasets (batch size 8), the test CER reaches 10% at around epoch 400. On the word-based datasets (batch size 64), the test CER reaches 13.5% at around epoch 400. Here are the demo results:

Figure 3. Character error rate for textline-based datasets

Figure 4. Character error rate for word-based datasets
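For readers unfamiliar with the metric, character error rate is commonly defined as the total edit distance between predictions and references, divided by the total number of reference characters. The sketch below implements that common definition in plain Python; it is an assumption that the logged values follow the same definition, and the function names are made up for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance divided by total reference length."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Toy strings, not taken from the dataset.
print(character_error_rate(["abcd"], ["abed"]))  # 0.25
```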

To be improved:

  • The model has only 2 layers, one convolutional and one BLSTM. Adding more layers would likely improve the results.
  • Use a dynamic learning rate.
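As one possible shape for a dynamic learning rate, here is a sketch of an exponential decay schedule in plain Python. This is not something the repository implements; the function name and the example numbers are assumptions chosen for illustration. TensorFlow 1.x offers a comparable schedule as tf.train.exponential_decay.

```python
def exponential_decay(initial_lr, step, decay_steps, decay_rate, staircase=False):
    """Return initial_lr * decay_rate ** (step / decay_steps).

    With staircase=True the exponent is floored, so the rate drops in
    discrete jumps every decay_steps steps instead of continuously.
    """
    if staircase:
        exponent = step // decay_steps
    else:
        exponent = step / decay_steps
    return initial_lr * decay_rate ** exponent


# Illustrative values only: halve the rate every 100 steps.
for step in (0, 100, 200):
    print(step, exponential_decay(0.001, step, decay_steps=100, decay_rate=0.5))
```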
