epub 2 txt conversion issues. #47
Comments
I'm not really sure what's up with this, other than something odd in the epub formatting. My suggestion for a next test would be to see whether Calibre is able to cleanly export this to text. It is unfortunate that it's silently missing sections of a book, though. I wondered if this was related to a relatively recent merge that checks for both p and div, but I took another look at that and it would not have been behind this. I think it's got to be something with the formatting of the epub, as it works fine with books made by big publishers.
In the test epub you supplied, I did get output indicating some of the chapters had problems:
I also tried exporting to text with epub2tts, but comparing the output of the two didn't show me anything obviously missing. Without knowing specific phrases to search for (ones that were missing), I'm not really sure I can do anything here, sorry.
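One way to come up with phrases to search for is to pull the text out of the epub independently and list every chunk that never shows up in the txt. The following is only a rough sketch using the Python standard library, not how epub2tts itself works: `epub_chunks` and `missing_chunks` are hypothetical helper names, the list of block-level tags is an assumption, and files are read in sorted-name order rather than the book's real reading order.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, split into chunks at block-level tags."""

    # Assumed set of tags that start or end a chunk of text.
    BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
                  "li", "blockquote", "br"}
    SKIP_TAGS = {"script", "style", "head"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._buf = []
        self._skip_depth = 0

    def _flush(self):
        # Collapse whitespace so line wrapping does not affect matching.
        text = " ".join("".join(self._buf).split())
        if text:
            self.chunks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def epub_chunks(epub_file):
    """Return (file name, text chunk) pairs for every HTML file in the epub."""
    chunks = []
    with zipfile.ZipFile(epub_file) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(zf.read(name).decode("utf-8", errors="replace"))
                parser.close()
                chunks.extend((name, c) for c in parser.chunks)
    return chunks


def missing_chunks(epub_file, txt, min_len=20):
    """Return epub chunks of at least min_len characters not found in txt."""
    flat = " ".join(txt.split())
    return [(name, c) for name, c in epub_chunks(epub_file)
            if len(c) >= min_len and c not in flat]
```

Calling `missing_chunks("book.epub", open("book.txt", encoding="utf-8").read())` (both file names are placeholders) would return the file name and text of each chunk that is absent from the txt. If the converter rewrites punctuation such as curly quotes, some chunks will be reported as missing even though they are present, so treat the result as a list of candidates to check by hand.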
I'm using Calibre's FanFicFare plugin to download from Royal Road. I'm not quite sure what those error messages you listed mean. In this particular example, look at line 10485 in the text file. In the txt:
In the epub
I've run into issues a few times converting epubs to txt where it silently fails for parts of the book (and then I don't realize until later, when I'm confused about the book not making sense).
Look at Chapter 33 (which gets labeled part 34).
Half of the chapter is missing in the txt file.
Ghost in the City try2.zip