jtmart/text-processing
This repository contains a set of scripts for batch-processing textual data for analysis in R, Python, TXM and IRaMuTeQ. It mainly consists of tools that extract text and its metadata from digital sources (PDF, HTML, SRT), clean it (layout and OCR corrections) and export it as CSV+TXT for analysis.
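As an illustration of the extract, clean and export steps described above, here is a minimal Python sketch for SRT subtitle input. It is not taken from the repository's scripts: the function names (`srt_to_text`, `clean_text`, `write_corpus`), the `metadata.csv` file name and its columns are all hypothetical, and it assumes that "CSV+TXT" means one plain-text file per document plus a CSV index of metadata.

```python
import csv
import re
from pathlib import Path

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Drop cue numbers and timing lines, keep the subtitle text.

    Caveat: a subtitle line made only of digits is also dropped.
    """
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


def clean_text(text: str) -> str:
    """Rejoin words hyphenated across line breaks, then collapse whitespace."""
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    return re.sub(r"\s+", " ", text).strip()


def write_corpus(docs, out_dir):
    """Write one TXT file per document plus a metadata.csv index.

    `docs` is an iterable of (doc_id, source, text) tuples; this column
    layout is an assumption made for the example.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "metadata.csv", "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["doc_id", "source", "txt_file"])
        for doc_id, source, text in docs:
            txt_name = f"{doc_id}.txt"
            (out / txt_name).write_text(clean_text(text), encoding="utf-8")
            writer.writerow([doc_id, source, txt_name])
```

A call such as `write_corpus([("ep01", "srt", srt_to_text(raw))], "corpus")` would produce `corpus/ep01.txt` and `corpus/metadata.csv`. Keeping metadata in a separate CSV keyed by document ID lets the text files stay free of markup, though tools such as TXM or IRaMuTeQ may expect their own import formats.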
About
Text structuring for import/manipulation/analysis in R | TXM | IRaMuTeQ