Commit

Update README.md
Added the following repositories:
1. Devanagari to Roman transliteration.
2. IndIE: triple extractor for Hindi, Tamil, Telugu, and Urdu.
3. Hindi-BenchIE: an automatic triple evaluator for 112 Hindi sentences.
ritwikmishra authored Sep 27, 2023
1 parent 19ec948 commit bf7d899
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -91,6 +91,7 @@ Indian language NLP has come a long way. We feature a few resources that are ill
## <a name='Libraries'></a>Libraries and Tools

- [Indic NLP Library](https://github.com/anoopkunchukuttan/indic_nlp_library): Python Library for various Indian language NLP tasks like tokenization, sentence splitting, normalization, script conversion, transliteration, _etc_
- [Devanagari to Roman transliteration](https://github.com/ritwikmishra/devanagari-to-roman-script-transliteration) using hand-crafted rules and lexicons.
- [pyiwn](https://github.com/riteshpanjwani/pyiwn): Python Interface to IndoWordNet
- [Indic-OCR](https://indic-ocr.github.io/) : OCR for Indic Scripts
- [CLTK](https://github.com/cltk/cltk/tree/master/cltk): Toolkit for many of the world's classical languages. Support for Sanskrit. Some parts of the Sanskrit library are forked from the Indic NLP Library.
@@ -99,6 +100,8 @@ Indian language NLP has come a long way. We feature a few resources that are ill
- [Smart Sanskrit Annotator](https://github.com/iamdsc/smart-sanskrit-annotator): Annotation tool for Sanskrit [paper](http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.874.pdf)
- [BNLP](https://github.com/sagorbrur/bnlp): Bengali language processing toolkit with tokenization, embedding, POS tagging, NER support
- [CodeSwitch](https://github.com/sagorbrur/codeswitch): Language identification, POS tagging, NER, and sentiment analysis support for code-mixed data, including Hindi and Nepali
- [IndIE](https://github.com/ritwikmishra/IndIE): An Open Information Extraction tool (triple extractor) for Hindi. It is conjectured to work for Tamil, Telugu, and Urdu as well.
- [Hindi-BenchIE](https://github.com/ritwikmishra/hindi-benchie): A triple evaluation tool for 112 Hindi sentences.

## <a name='Benchmarks'></a>Evaluation Benchmarks
