Good news! Our new work, DocScanner: Robust Document Image Rectification with Progressive Learning, achieves state-of-the-art performance on the DocUNet Benchmark dataset.
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
ACM MM 2021 Oral
Any questions or discussions are welcome!
DocTr consists of two main components: a geometric unwarping transformer (GeoTr) and an illumination correction transformer (IllTr).
- For geometric unwarping, we train the GeoTr network on the Doc3D dataset.
- For illumination correction, we train the IllTr network on the DRIC dataset. (A minimal end-to-end sketch of this two-stage pipeline is given below.)
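Below is a minimal, illustrative sketch of how the two stages fit together. The import paths, constructor signatures, and tensor shapes are assumptions for illustration only; refer to the GeoTr and IllTr code in this repository and to inference.py for the actual interfaces.

```python
# Minimal sketch of the two-stage pipeline. Class names, constructor signatures,
# and tensor shapes are assumptions; see GeoTr.py, IllTr.py, and inference.py
# for the actual interfaces.
import torch
import torch.nn.functional as F

from GeoTr import GeoTr   # geometric unwarping transformer (assumed import path)
from IllTr import IllTr   # illumination correction transformer (assumed import path)

def rectify(distorted: torch.Tensor) -> torch.Tensor:
    """distorted: (1, 3, H, W) image tensor with values in [0, 1]."""
    geo_tr, ill_tr = GeoTr().eval(), IllTr().eval()
    with torch.no_grad():
        # 1) Predict a backward map (sampling grid) from the distorted image.
        bm = geo_tr(distorted)                      # assumed shape (1, 2, H, W)
        # grid_sample expects (N, H, W, 2) coordinates normalized to [-1, 1];
        # the normalization step is omitted here for brevity.
        grid = bm.permute(0, 2, 3, 1)
        unwarped = F.grid_sample(distorted, grid, align_corners=True)
        # 2) Remove shading and shadows from the geometrically corrected image.
        restored = ill_tr(unwarped)
    return restored
```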
Dataset path structure:

    DATASET_ROOT/
        imgs/    # image files
        uvs/     # UV files
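A minimal PyTorch Dataset for the layout above might look like the sketch below. The class name, file extensions, and the assumption that UV maps are stored as .npy arrays are illustrative only; adapt them to how your data is actually stored.

```python
# Hypothetical Dataset for the imgs/ + uvs/ layout above.
# File extensions and the .npy UV format are assumptions, not the repo's API.
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class Doc3DFolder(Dataset):
    def __init__(self, root):
        self.img_dir = os.path.join(root, "imgs")
        self.uv_dir = os.path.join(root, "uvs")
        # Pair samples by shared base name, e.g. imgs/0001.png <-> uvs/0001.npy.
        self.names = sorted(os.path.splitext(f)[0] for f in os.listdir(self.img_dir))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        img = Image.open(os.path.join(self.img_dir, name + ".png")).convert("RGB")
        uv = np.load(os.path.join(self.uv_dir, name + ".npy"))      # assumed (H, W, 2)
        img = torch.from_numpy(np.array(img)).permute(2, 0, 1).float() / 255.0
        uv = torch.from_numpy(uv).permute(2, 0, 1).float()
        return img, uv
```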
To train the model:

    python train.py

To change the configuration, edit the following settings in train.py:

    batch_size = 1
    num_epochs = 30
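For orientation, a heavily condensed sketch of what such a training loop could look like is given below; the optimizer, learning rate, loss, and dataset/model classes (reusing the illustrative Doc3DFolder and GeoTr sketches above) are assumptions, and train.py remains the authoritative reference.

```python
# Hypothetical condensed training loop; batch_size and num_epochs mirror the
# settings above, everything else (optimizer, loss, classes) is assumed.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

batch_size = 1
num_epochs = 30

dataset = Doc3DFolder("DATASET_ROOT")     # illustrative Dataset sketch from above
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=4)
model = GeoTr()                           # geometric unwarping network (assumed)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(num_epochs):
    for img, uv in loader:
        pred = model(img)                 # predicted backward map / UV target
        loss = F.l1_loss(pred, uv)        # regression loss on the flow (assumed)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last-batch loss {loss.item():.4f}")
```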
- Download the pretrained models from Google Drive or Baidu Cloud, and put them in $ROOT/model_pretrained/.
- Geometric unwarping:
    python inference.py
- Geometric unwarping and illumination rectification:
    python inference.py --ill_rec True
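If you want to call the pretrained geometric model from Python instead of the CLI, the loading pattern looks roughly like the sketch below. The checkpoint file name, input resolution, and flow normalization details are assumptions; inference.py is the authoritative reference.

```python
# Hypothetical programmatic use of the pretrained geometric model.
# The checkpoint name under model_pretrained/ and the 288x288 input size are assumptions.
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

model = GeoTr()                                      # illustrative class from the sketch above
state = torch.load("model_pretrained/geotr.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

img = Image.open("distorted.png").convert("RGB").resize((288, 288))
x = torch.from_numpy(np.array(img)).permute(2, 0, 1).float().unsqueeze(0) / 255.0
with torch.no_grad():
    bm = model(x)                                    # assumed (1, 2, H, W) backward map
# grid_sample expects coordinates normalized to [-1, 1]; normalization omitted here.
rectified = F.grid_sample(x, bm.permute(0, 2, 3, 1), align_corners=True)
```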
- We use the same evaluation code as the DocUNet Benchmark, based on Matlab 2019a.
- Scores can differ slightly across Matlab versions, so compare results computed with the same version.
- To reproduce the quantitative results on the DocUNet Benchmark reported in the paper, or for further comparison, use the rectified images available from Google Drive or Baidu Cloud.
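The official numbers must be computed with the Matlab code above; purely as a rough sanity check, an MS-SSIM score between a rectified image and its ground-truth scan can be approximated in Python with the third-party pytorch-msssim package (file names and the resize resolution here are placeholders, and this is not the benchmark's reference implementation):

```python
# Rough MS-SSIM sanity check only; NOT the official Matlab evaluation.
import numpy as np
import torch
from PIL import Image
from pytorch_msssim import ms_ssim  # pip install pytorch-msssim

def load_gray(path, size=(448, 598)):
    """Load a grayscale image as a (1, 1, H, W) float tensor in [0, 1]."""
    img = Image.open(path).convert("L").resize(size)
    return torch.from_numpy(np.array(img)).float()[None, None] / 255.0

rectified = load_gray("rectified_1.png")   # placeholder file names
scan = load_gray("scan_1.png")
score = ms_ssim(rectified, scan, data_range=1.0)
print(f"MS-SSIM: {score.item():.4f}")
```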
If you find this code useful for your research, please cite our work using the following BibTeX entries.
@inproceedings{feng2021doctr,
  title={DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction},
  author={Feng, Hao and Wang, Yuechen and Zhou, Wengang and Deng, Jiajun and Li, Houqiang},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={273--281},
  year={2021}
}

@article{feng2021docscanner,
  title={DocScanner: Robust Document Image Rectification with Progressive Learning},
  author={Feng, Hao and Zhou, Wengang and Deng, Jiajun and Tian, Qi and Li, Houqiang},
  journal={arXiv preprint arXiv:2110.14968},
  year={2021}
}