Skip to content

Commit

Permalink
Fill in readme
Browse files Browse the repository at this point in the history
  • Loading branch information
apeyser committed Feb 3, 2015
1 parent 1964a90 commit 86eb528
Showing 1 changed file with 35 additions and 0 deletions.
35 changes: 35 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,37 @@
# tab2opf
Remake of tab2opf dictionary builder for kindle

Script to convert tab delimited dictionary files into opf file to run
with kindlegen into a translation lookup dictionary for kindle.

Based on the generally available tab2opf.py by Klokan Petr Přidal
(www.klokan.cz) from 2007

The script has been mostly rewritten and extended. The encoding
convolutions have been removed, and the code migrated to python3

The input form is:
Word(s) \\t Definition

By running tab2opf.py path/file.txt in the current directory file.opf
and file*.html are created which can then be converted with kindlegen
file.opf into file.mobi

--source and --target options define which language to which we are
translating.

tab2opf has a -m option to load a module to load getkey and getdef
functions and mapping dictionary from that namespace into the current
one; if it doesn't find such defined member(s), no error is produced.

getkey converts the term (the Word(s)) into a search key. getdef
converts Definition in some arbitrary way. mapping is a dictionary
mapping from char -> char to replace some set of characters in the
input with characters in the output.

Used and tested is the dictcc.py, which converts a dict.cc german ->
english dictionary (http://www.dict.cc/?s=about%3Awordlist&l=e). The
keys are the longest word in the Word(s) after removal of
prepositional phrases and some pronouns (in getkey). The definitions
remap the dict.cc pattern of "definition \\t part-of-speech" into
(part-of-speech) definiton.

0 comments on commit 86eb528

Please sign in to comment.