From 747be4425b0fff2cb6ed2cf63aaec0ac75b6213d Mon Sep 17 00:00:00 2001
From: Geewook Kim
Date: Tue, 24 Oct 2023 20:56:38 +0900
Subject: [PATCH] Update README.md

---
 README.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 112e3ce..425f7fe 100644
--- a/README.md
+++ b/README.md
@@ -14,12 +14,12 @@ Official Implementation of **Web**-based **Vi**sual **Co**rpus **B**uilder (**WE
 **WEBVICOB** 🕸, **Web**-based **Vi**sual **Co**rpus **B**uilder, is a dataset generator that can readily construct a large-scale visual corpus (i.e., images with text annotations) from a raw Wikipedia HTML dump. The constructed visual corpora can be utilized in building Visual Document Understanding (VDU) backbones. Our academic paper, which describes our engine in detail and provides full experimental results and analyses, can be found here:
 > [**On Web-based Visual Corpus Construction for Visual Document Understanding**](https://arxiv.org/abs/2211.03256).
-> [Donghyun Kim](https://github.com/dhkim0225), [Teakgyu Hong](https://dblp.org/pid/183/0952.html), [Moonbin Yim](https://github.com/moonbings), [Yoonsik Kim](https://scholar.google.com/citations?user=nuxd_BsAAAAJ) and [Geewook Kim](https://geewook.kim). In ICDAR 2023 (to appear).
+> [Donghyun Kim](https://github.com/dhkim0225), [Teakgyu Hong](https://dblp.org/pid/183/0952.html), [Moonbin Yim](https://github.com/moonbings), [Yoonsik Kim](https://scholar.google.com/citations?user=nuxd_BsAAAAJ) and [Geewook Kim](https://geewook.kim). In ICDAR 2023.
 ![annot](resources/annot.png)
 ## Updates
-**_2023-05-03_** Our paper is accepted at ICDAR2023. A new version of the paper has been published on arxiv.
+**_2023-05-03_** Our paper is accepted at ICDAR2023. A new version of the paper has been published on arXiv.
 **_2023-02-11_** HTML Section Chunker added, Solve memory-leak issue.
 **_2022-11-08_** [Paper](https://arxiv.org/abs/2211.03256) published on arxiv.
 **_2022-11-04_** First Commit, We release the codebase.
@@ -111,12 +111,11 @@ And untar ndjson files on `[your workspace path]/raw`.
 ## How to Cite
 If you find this work useful to you, please cite:
 ```
-@inproceedings{kim2023web,
+@InProceedings{kim2023web,
  title = {On Web-based Visual Corpus Construction for Visual Document Understanding},
  author = {Kim, Donghyun and Hong, Teakgyu and Yim, Moonbin and Kim, Yoonsik and Kim, Geewook},
- booktitle = {International Conference on Document Analysis and Recognition (ICDAR)},
+ booktitle = {Document Analysis and Recognition - ICDAR 2023},
  year = {2023},
- note = {accepted, to appear},
 }
 ```