DocXclassifier: towards a robust and interpretable deep neural network for document image classification

This repository contains the evaluation code for the paper DocXclassifier: towards a robust and interpretable deep neural network for document image classification by Saifullah Saifullah, Stefan Agne, Andreas Dengel, and Sheraz Ahmed.

Requires Python 3+. For evaluation, please follow the steps below.

Environment Setup

Clone the repository

git clone https://github.com/saifullah3396/docxclassifier.git --recursive

Install requirements

Install the dependencies:

pip install -r requirements.txt

Setup environment variables:

export PYTHONPATH=./external/torchfusion/src
export DATA_ROOT_DIR=</your/data/dir> # directory under which the datasets are placed
export TORCH_FUSION_CACHE_DIR=</your/cache/dir> # can be any directory where datasets are cached
export TORCH_FUSION_OUTPUT_DIR=</your/output/dir> # can be any directory where model training outputs are generated
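
The evaluation scripts depend on the variables above, so it may be worth confirming that all of them are set before running anything. The following is a minimal sketch, not part of the repository: `check_env` is a hypothetical helper name, and it only checks that each variable is non-empty, not that the directories exist.

```shell
# Hypothetical helper (not shipped with this repository): report any of the
# variables named in this README that are unset or empty.
check_env() {
  missing=0
  for name in PYTHONPATH DATA_ROOT_DIR TORCH_FUSION_CACHE_DIR TORCH_FUSION_OUTPUT_DIR; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything is set, 1 otherwise.
  return "$missing"
}
```

After the exports, calling `check_env` in the same shell prints one `missing: NAME` line per unset variable and returns a non-zero status if any are missing.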

DocXClassifier Models

| Model | Dataset | Accuracy |
|---|---|---|
| DocXClassifier-B | RVL-CDIP | 94.00% |
| DocXClassifier-L | RVL-CDIP | 94.15% |
| DocXClassifier-XL | RVL-CDIP | 94.17% |
| DocXClassifier-B | Tobacco3482 (RVL-CDIP Pretraining) | 95.29% |
| DocXClassifier-L | Tobacco3482 (RVL-CDIP Pretraining) | 95.57% |
| DocXClassifier-XL | Tobacco3482 (RVL-CDIP Pretraining) | 95.43% |
| DocXClassifier-B | Tobacco3482 (ImageNet Pretraining) | 87.43% |
| DocXClassifier-L | Tobacco3482 (ImageNet Pretraining) | 88.43% |
| DocXClassifier-XL | Tobacco3482 (ImageNet Pretraining) | 90.14% |

DocXClassifierFPN Models

| Model | Dataset | Accuracy |
|---|---|---|
| DocXClassifierFPN-B | RVL-CDIP | 94.04% |
| DocXClassifierFPN-L | RVL-CDIP | 94.13% |
| DocXClassifierFPN-XL | RVL-CDIP | 94.19% |
| DocXClassifierFPN-B | Tobacco3482 (RVL-CDIP Pretraining) | 95.57% |
| DocXClassifierFPN-L | Tobacco3482 (RVL-CDIP Pretraining) | 95.71% |
| DocXClassifierFPN-XL | Tobacco3482 (RVL-CDIP Pretraining) | 94.86% |
| DocXClassifierFPN-B | Tobacco3482 (ImageNet Pretraining) | 88.43% |
| DocXClassifierFPN-L | Tobacco3482 (ImageNet Pretraining) | 89.57% |
| DocXClassifierFPN-XL | Tobacco3482 (ImageNet Pretraining) | 90.29% |

Evaluation on RVL-CDIP:

Please download the RVL-CDIP dataset and place it under the directory $DATA_ROOT_DIR/documents/rvlcdip. Evaluate the DocXClassifier models on the RVL-CDIP dataset using the following script:

./scripts/run/evaluate/document_classification/evaluate_rvlcdip_no_fpn.sh

Evaluate the DocXClassifierFPN models on the RVL-CDIP dataset using the following script:

./scripts/run/evaluate/document_classification/evaluate_rvlcdip_fpn.sh
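
Before launching either RVL-CDIP script, it can help to confirm that the dataset sits where the scripts expect it. The sketch below assumes only the location stated in this README (`$DATA_ROOT_DIR/documents/<name>`); `require_dataset` is a hypothetical helper, and it does not validate the internal layout of the dataset, which this README does not specify.

```shell
# Hypothetical helper: succeed only if $DATA_ROOT_DIR/documents/<name>
# exists and contains at least one entry.
require_dataset() {
  root="${DATA_ROOT_DIR:-}"
  # Strip a trailing slash so the path does not contain "//".
  dir="${root%/}/documents/$1"
  if [ ! -d "$dir" ]; then
    echo "dataset directory not found: $dir" >&2
    return 1
  fi
  if [ -z "$(ls -A "$dir")" ]; then
    echo "dataset directory is empty: $dir" >&2
    return 1
  fi
  echo "found dataset at $dir"
}
```

For example, `require_dataset rvlcdip` checks the RVL-CDIP location, and the same helper can be called with `tobacco3482` for the Tobacco3482 evaluation.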

Evaluation on Tobacco3482 dataset:

Please download the Tobacco3482 dataset and place it under the directory $DATA_ROOT_DIR/documents/tobacco3482. Evaluate the DocXClassifier models on the Tobacco3482 dataset with ImageNet pretraining using the following script:

./scripts/run/evaluate/document_classification/evaluate_tobacco3482_no_fpn.sh

Evaluate the DocXClassifierFPN models on the Tobacco3482 dataset with ImageNet pretraining using the following script:

./scripts/run/evaluate/document_classification/evaluate_tobacco3482_fpn.sh

Evaluate the DocXClassifier models on the Tobacco3482 dataset with RVL-CDIP pretraining using the following script:

./scripts/run/evaluate/document_classification/evaluate_tobacco3482_rvlcdip_pretrained_no_fpn.sh

Evaluate the DocXClassifierFPN models on the Tobacco3482 dataset with RVL-CDIP pretraining using the following script:

./scripts/run/evaluate/document_classification/evaluate_tobacco3482_rvlcdip_pretrained_fpn.sh
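
To run several evaluations back to back, the scripts can be wrapped in a small loop. This is only a sketch: `run_evaluations` is a hypothetical wrapper that is not part of the repository, and it assumes the scripts are executable and are run from the repository root. The script names in the commented call are the ones listed in this README.

```shell
# Hypothetical wrapper: run each named script from a directory in order
# and stop at the first one that exits with a non-zero status.
run_evaluations() {
  script_dir="$1"
  shift
  for name in "$@"; do
    script="$script_dir/$name"
    echo "running $script"
    if ! "$script"; then
      echo "failed: $script" >&2
      return 1
    fi
  done
}

# Example call for the four Tobacco3482 scripts:
# run_evaluations ./scripts/run/evaluate/document_classification \
#   evaluate_tobacco3482_no_fpn.sh \
#   evaluate_tobacco3482_fpn.sh \
#   evaluate_tobacco3482_rvlcdip_pretrained_no_fpn.sh \
#   evaluate_tobacco3482_rvlcdip_pretrained_fpn.sh
```

Stopping at the first failure keeps a misconfigured environment from producing a long run of repeated errors; drop the `return 1` if you would rather continue with the remaining scripts.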

Citation

If you find this useful in your research, please consider citing our associated paper:

@article{Saifullah2024,
  title = {DocXclassifier: towards a robust and interpretable deep neural network for document image classification},
  ISSN = {1433-2825},
  url = {http://dx.doi.org/10.1007/s10032-024-00483-w},
  DOI = {10.1007/s10032-024-00483-w},
  journal = {International Journal on Document Analysis and Recognition (IJDAR)},
  publisher = {Springer Science and Business Media LLC},
  author = {Saifullah,  Saifullah and Agne,  Stefan and Dengel,  Andreas and Ahmed,  Sheraz},
  year = {2024},
  month = jun
}

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.