Reaching CER 0.052 and WER 0.175 #2

Open
AniketGurav opened this issue Jun 29, 2024 · 2 comments
Comments

@AniketGurav

AniketGurav commented Jun 29, 2024

Hi georgeretsi,

Thanks for the updated code. I have been following this repo for a long time, and the new repo looks simple to run and maintain. However, when I train the model at line level, I only reach CER 0.052 and WER 0.175 after 800 epochs. I created the line-level data as you described (using the script prepare_iam.py).
Below are the values in the config file and other important parameters. The weights you provided give the expected result on the test data, but when I train the model myself it gets stuck at the CER and WER values mentioned above.

 conf: {'resume': None, 'save': './temp_16batchSize.pt', 'device': 'cuda:1', 'data': {'path': '/cluster/datastore/aniketag/allData/icpr2/output_path/'}, 'preproc': {'image_height': 128, 'image_width': 1024}, 'arch': {'cnn_cfg': [[2, 64], 'M', [3, 128], 'M', [2, 256]], 'head_type': 'both', 'rnn_type': 'lstm', 'rnn_layers': 3, 'rnn_hidden_size': 256, 'flattening': 'maxpool', 'stn': False}, 'train': {'lr': 0.001, 'num_epochs': 800, 'batch_size': 16, 'scheduler': 'mstep', 'save_every_k_epochs': 10, 'num_workers': 8}, 'eval': {'batch_size': 32, 'num_workers': 8, 'wer_mode': 'tokenizer'}}

Character classes: [' ', '!', '"', '#', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] (79 different characters)
training lines 3876
Character classes: [' ', '!', '"', '#', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] (79 different characters)
validation lines 613
Character classes: [' ', '!', '"', '#', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] (79 different characters)
testing lines 1918
Preparing Net - Architectural elements:
{'cnn_cfg': [[2, 64], 'M', [3, 128], 'M', [2, 256]], 'head_type': 'both', 'rnn_type': 'lstm', 'rnn_layers': 3, 'rnn_hidden_size': 256, 'flattening': 'maxpool', 'stn': False}

Am I missing something?

After looking more closely, I found that my training run uses 3876 training lines. As mentioned in the paper, this work uses a split from reference [21], which contains 6161 training lines. Could this be the root cause?

@georgeretsi
Owner

Hi there! Thanks for your interest in my repo!

Considering the number of train/val/test lines that you report, I'm guessing that you are using a subset of the actual dataset. The full dataset yields these numbers:

training lines 6482
validation lines 976
testing lines 2915
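
As a quick sanity check, the counts reported in the issue can be compared against the full-dataset counts listed above. This is only a minimal sketch; the dictionary and function names are illustrative and not part of the repo.

```python
# Line counts reported in the issue vs. the counts for the full dataset.
reported = {"train": 3876, "val": 613, "test": 1918}
expected = {"train": 6482, "val": 976, "test": 2915}


def coverage(reported, expected):
    """Return the fraction of the expected lines present in each split."""
    return {split: reported[split] / expected[split] for split in expected}


for split, frac in coverage(reported, expected).items():
    print(f"{split}: {reported[split]}/{expected[split]} lines ({frac:.0%})")
```

Each split comes out at roughly 60 to 66 percent of the full count, which fits the picture of a sizeable part of the images being absent.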

One potential reason for this is that the official IAM repo splits the forms into three archives (data/formsA-D.tgz, data/formsE-H.tgz, data/formsI-Z.tgz). You have to put all the images into a single common folder without subfolders. Judging by your reported numbers, it seems that one of these three sets is missing.
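
For illustration, here is a minimal sketch of collecting the images from the extracted archives into one flat folder. It assumes the archives were already extracted somewhere under a single root and that the images are PNG files; the function name `flatten_images` and the paths in the usage comment are hypothetical and not part of the repo.

```python
import shutil
from pathlib import Path


def flatten_images(src_root, dst_dir, pattern="*.png"):
    """Copy every image found under src_root (recursively) into dst_dir.

    dst_dir ends up with no subfolders. Returns the number of files copied.
    """
    src_root = Path(src_root).resolve()
    dst_dir = Path(dst_dir).resolve()
    dst_dir.mkdir(parents=True, exist_ok=True)

    copied = 0
    # sorted() materializes the listing before any file is copied.
    for img in sorted(src_root.rglob(pattern)):
        if dst_dir in img.parents:
            continue  # file already lives inside the destination folder
        target = dst_dir / img.name
        if target.exists():
            continue  # same file name was already copied
        shutil.copy2(img, target)
        copied += 1
    return copied


# Hypothetical usage (paths are placeholders):
# n = flatten_images("iam_extracted", "iam_forms")
# print(f"copied {n} images")
```

After flattening, rerunning the data preparation and comparing the printed line counts with the numbers above shows whether all three sets were picked up.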

Hope I helped!

@AniketGurav
Author

Thanks for the reply. I will check and update you.
