
model behaviour on text vs. non-text #69

Open
bertsky opened this issue Jul 3, 2024 · 1 comment

@bertsky
Contributor

bertsky commented Jul 3, 2024

If training was done on DIBCO datasets, some of which, according to @VChristlein, represent non-text differently, how can the models here be expected to perform on non-textual parts (separators, ornaments, line drawings, edge drawings, etc.)?

@vahidrezanezhad
Member

@bertsky First, let me clarify that our binarization models are not trained exclusively on the DIBCO dataset. In the early stages, DIBCO was the only ground truth (GT) available to us, so we initially trained some models on it. We then used these trained models to generate pseudo-labeled GT from the SBB datasets: I applied thresholding to binarize almost everything (every element in the document images), then used scaling and cropping to improve the binarization and extract only the desired results from each document image. Consequently, we ended up with a mix of the DIBCO dataset, which contains mostly text content, and pseudo-labeled SBB datasets, which also include non-text content.
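
For context, a minimal sketch of what such a threshold-and-crop pseudo-labelling step could look like in Python with OpenCV. The function name `pseudo_label`, the choice of Otsu thresholding, the patch size, and the scale set are illustrative assumptions, not the actual SBB pipeline:

```python
# A minimal pseudo-labelling sketch, assuming OpenCV is available.
# pseudo_label, the Otsu threshold, patch_size, and scales are illustrative
# assumptions; the actual SBB pipeline is not published in this thread.
import cv2


def pseudo_label(image_path, patch_size=448, scales=(0.5, 1.0, 2.0)):
    """Binarize one document image and cut it into multi-scale patches."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Global Otsu thresholding binarizes almost every element on the page,
    # text and non-text alike, which is what the raw pseudo-GT needs.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    patches = []
    for s in scales:
        # Rescale image and label together so they stay aligned.
        img_s = cv2.resize(gray, None, fx=s, fy=s,
                           interpolation=cv2.INTER_AREA)
        lbl_s = cv2.resize(binary, None, fx=s, fy=s,
                           interpolation=cv2.INTER_NEAREST)
        h, w = img_s.shape
        # Non-overlapping crops; incomplete border patches are discarded.
        for y in range(0, h - patch_size + 1, patch_size):
            for x in range(0, w - patch_size + 1, patch_size):
                patches.append((img_s[y:y + patch_size, x:x + patch_size],
                                lbl_s[y:y + patch_size, x:x + patch_size]))
    return patches
```

In the workflow described above, the models trained on DIBCO would then refine these raw thresholded labels; the sketch only covers the initial threshold-and-crop step.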
