Reading txt in the command line interface #35

Open
silviaegt opened this issue Aug 1, 2016 · 2 comments

@silviaegt

Dear CollateX creators,
thank you so much for making your tool available!! It sounds super useful. Sadly, I haven't been able to run it. It would be lovely if you could show me an example of how to read a .txt file using the command line interface.
Say I have output-adobe.txt, output-tesseract.txt, and original.txt and want to compare them.
I open CollateX like this:
C:\Users\xxxx\Desktop> java -jar collatex-tools-1.7.1.jar
and then?

@rhdekker
Member

rhdekker commented Aug 2, 2016

Dear Silvia,

Typing:
C:\Users\xxxxx\Desktop> java -jar collatex-tools-1.7.1.jar output-adobe.txt output-tesseract.txt original.txt
should produce an alignment table in JSON format.

If you don't like the JSON format, there are other formats, for example comma-separated values (CSV):

C:\Users\xxxxx\Desktop> java -jar collatex-tools-1.7.1.jar output-adobe.txt output-tesseract.txt original.txt -f csv

Hope this helps.

Best,
Ronald
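
For anyone who prefers to drive the two commands above from a script, here is a minimal Python sketch. It only uses the pieces shown in this thread (the jar name, the witness file names, and the `-f csv` switch); the helper names `build_collatex_command` and `run_collatex` are hypothetical and not part of CollateX, and the sketch assumes `java` is on the PATH.

```python
import subprocess
from pathlib import Path


def build_collatex_command(jar, witnesses, output_format=None):
    """Return the argument list for one CollateX command-line run.

    Hypothetical helper: it mirrors the commands quoted in this thread
    and does not know about any other CollateX options.
    """
    command = ["java", "-jar", str(jar)] + [str(w) for w in witnesses]
    if output_format is not None:
        # "-f csv" is the only format switch shown in this thread.
        command += ["-f", output_format]
    return command


def run_collatex(jar, witnesses, output_path, output_format=None):
    """Run CollateX and save whatever it prints on stdout to output_path."""
    command = build_collatex_command(jar, witnesses, output_format)
    # check=True raises CalledProcessError if java exits with an error.
    result = subprocess.run(command, capture_output=True, check=True)
    # Write the raw bytes so Python does not re-encode the tool's output.
    Path(output_path).write_bytes(result.stdout)
    return command
```

Calling `run_collatex("collatex-tools-1.7.1.jar", ["output-adobe.txt", "output-tesseract.txt", "original.txt"], "output.csv", "csv")` from the Desktop folder should be equivalent to the second command above with its output saved to a file.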

@silviaegt
Author

Dear Ronald, it worked perfectly, thank you!

However, I was not able to get the encoding right :(
In the documentation I found this:

plain text version can also be provided in other encodings supported by the Java Platform and will be converted to Unicode before comparison. The command line interface is one such interface which supports character set conversions

I tried doing this:
C:\Users\xxxxx\Desktop>java -jar collatex-tools-1.7.1.jar output_abby.txt output_tesseract.txt output_clean.txt -f csv -ie utf-8 -oe utf-8 >> output.csv

But it didn't work.

Thank you in advance for any help you can provide!

Cheers,

S.
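
The thread does not say what exactly went wrong with the encoding. One thing that can be checked independently of CollateX is whether each input file really is valid UTF-8, since that is the input encoding named in the command above. The Python sketch below is only an illustration: `first_decode_error` and `convert_to_utf8` are hypothetical helper names, and `cp1252` in the usage note is an assumed example of a Windows encoding, not something confirmed for these files.

```python
from pathlib import Path


def first_decode_error(path, encoding="utf-8"):
    """Return None if the file decodes cleanly, else a short description."""
    data = Path(path).read_bytes()
    try:
        data.decode(encoding)  # strict decoding: any invalid byte raises
    except UnicodeDecodeError as err:
        return f"byte offset {err.start}: {err.reason}"
    return None


def convert_to_utf8(source, target, source_encoding):
    """Re-encode a text file to UTF-8 without touching its line endings."""
    text = Path(source).read_bytes().decode(source_encoding)
    Path(target).write_bytes(text.encode("utf-8"))
```

If `first_decode_error("output_tesseract.txt")` returns a description rather than `None`, the file is not UTF-8. In that case one option is to re-encode it first, for example `convert_to_utf8("output_tesseract.txt", "output_tesseract_utf8.txt", "cp1252")` if the file turns out to be in that encoding, and pass the converted file to CollateX.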

@djbpitt added the java label on Aug 11, 2018