Add md5 hash to resource, to be used for fix_attachment_reference #54

Closed
linonetwo wants to merge 1 commit

@linonetwo linonetwo commented Oct 14, 2023

@linonetwo
Author

Mac users who want to use this before it is merged can modify

/usr/local/Cellar/evernote-backup/1.9.2_2/libexec/lib/python3.11/site-packages/evernote_backup/note_formatter.py

for example by opening it in VS Code: code /usr/local/Cellar/evernote-backup/1.9.2_2/libexec/lib/python3.11/site-packages/evernote_backup/note_formatter.py

@vzhd1701
Owner

Sorry, but the data element cannot have a hash attribute, as per the official specification. If you need to alter the produced ENEX file, you should write your own converter.

<!--
  Corresponds to the Resource.data field.
  The binary body of the resource must be encoded into Base-64 format.  The
  encoding may contain whitespace (e.g. to break into lines), or may be
  continuous without break.  Total length of the original binary body may not
  exceed 25MB.
-->
<!ELEMENT data (#PCDATA)>
<!ATTLIST data encoding NMTOKEN "base64">
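
The quoted fragment declares only an encoding attribute on data. As a rough illustration of what that means for the proposed change, here is a minimal stdlib-only sketch that lists attributes on data elements which the fragment does not declare. It is not a real DTD validator; the function name `undeclared_data_attributes` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes that the quoted ATTLIST declares for <data>.
DECLARED_DATA_ATTRS = {"encoding"}


def undeclared_data_attributes(enex_xml: str) -> list[str]:
    """Return the names of <data> attributes not declared in the quoted fragment."""
    root = ET.fromstring(enex_xml)
    found = set()
    for data in root.iter("data"):
        found.update(set(data.attrib) - DECLARED_DATA_ATTRS)
    return sorted(found)


# Hypothetical, heavily shortened export with an extra hash attribute.
SAMPLE = (
    '<en-export><note><resource>'
    '<data encoding="base64" hash="abc">aGVsbG8=</data>'
    '</resource></note></en-export>'
)

print(undeclared_data_attributes(SAMPLE))  # ['hash']
```

A consumer that validates strictly against the DTD could reject such a file, while a lenient parser would most likely ignore the extra attribute.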

@vzhd1701 vzhd1701 closed this Oct 18, 2023
@linonetwo
Author

linonetwo commented Oct 18, 2023

How do I write my own converter without forking this repo? Do you have a plugin mechanism?

Anyway, I have finished importing my notes into TiddlyWiki. I just want to help others who want to import theirs into TiddlyWiki or Obsidian (at the cost of writing a PR). If it requires a higher price, like maintaining a fork, then I won't have time for it, because my time goes to https://github.com/tiddly-gittly/TidGi-Desktop .

I still hope you can accept this PR, so that importing images into TiddlyWiki and Obsidian is possible. Evernote is dying, so we don't really have to stick to their spec! (And this hash attribute won't hurt other apps; it is just an addition.)
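
For importers, the matching can also be done on the consumer side without changing the file. The sketch below assumes, as the PR title suggests, that attachment references in the note content (en-media tags) carry the MD5 of the resource body in a hash attribute. The helper names and the sample note are hypothetical, and only the standard library is used.

```python
import base64
import hashlib
import re
import xml.etree.ElementTree as ET


def resources_by_hash(note: ET.Element) -> dict[str, bytes]:
    """Map the MD5 hex digest of each resource body to its decoded bytes."""
    result = {}
    for data in note.iterfind("resource/data"):
        body = base64.b64decode(data.text or "")
        result[hashlib.md5(body).hexdigest()] = body
    return result


def referenced_hashes(note: ET.Element) -> list[str]:
    """Collect the hash attributes of en-media tags found in the note content."""
    content = note.findtext("content") or ""
    return re.findall(r'<en-media\b[^>]*\bhash="([0-9a-f]{32})"', content)


# Hypothetical one-note sample: the attachment body is the bytes b"hello".
BODY = b"hello"
BODY_MD5 = hashlib.md5(BODY).hexdigest()
SAMPLE_NOTE = ET.fromstring(
    "<note>"
    f'<content><![CDATA[<en-note><en-media hash="{BODY_MD5}" type="text/plain"/></en-note>]]></content>'
    f'<resource><data encoding="base64">{base64.b64encode(BODY).decode()}</data></resource>'
    "</note>"
)

lookup = resources_by_hash(SAMPLE_NOTE)
for digest in referenced_hashes(SAMPLE_NOTE):
    # Each reference in the content resolves to the attachment bytes.
    print(digest, lookup.get(digest))
```

The trade-off is that every importer has to decode and hash each attachment itself, which is the work the proposed attribute would have precomputed.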

@vzhd1701
Owner

Just write a simple converter. Why do you need to use my repo specifically? Here is an example:

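import base64
import hashlib
import sys
from pathlib import Path

from lxml import etree


def parse_xml(xml_file):
    # Keep CDATA sections intact so the note content is written back unchanged.
    parser = etree.XMLParser(strip_cdata=False)

    with open(xml_file, 'rb') as f:
        return etree.parse(f, parser)


def add_resource_hashes(enex_file: Path):
    tree = parse_xml(enex_file)

    for resource in tree.findall('.//resource'):
        r_data = resource.find("data")

        # Skip resources that have no body to hash.
        if r_data is None or not r_data.text:
            continue

        # b64decode ignores the line breaks inside the encoded body.
        r_data_bin = base64.b64decode(r_data.text)
        r_data_md5 = hashlib.md5(r_data_bin).hexdigest()

        r_data.set("hash", r_data_md5)

    # The output file is created in the current working directory.
    new_file = Path(f"{enex_file.stem}_modified{enex_file.suffix}")

    with open(new_file, 'wb') as f:
        # Write the XML declaration explicitly so the output starts like the input.
        tree.write(f, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    add_resource_hashes(Path(sys.argv[1]))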
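
To sanity-check a file produced by a converter like the one above, a small stdlib-only sketch can recompute each digest and compare it with the stored attribute. The function name `mismatched_resources` and both sample strings are hypothetical; for a real export you would read the file from disk rather than use an inline string.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def mismatched_resources(enex_xml: str) -> list[int]:
    """Return indexes of <data> elements whose hash attribute is missing or wrong."""
    root = ET.fromstring(enex_xml)
    bad = []
    for index, data in enumerate(root.iter("data")):
        body = base64.b64decode(data.text or "")
        if data.get("hash") != hashlib.md5(body).hexdigest():
            bad.append(index)
    return bad


# "aGVsbG8=" is the Base64 encoding of b"hello".
GOOD = (
    '<en-export><note><resource>'
    '<data encoding="base64" hash="5d41402abc4b2a76b9719d911017c592">aGVsbG8=</data>'
    '</resource></note></en-export>'
)
BAD = GOOD.replace("5d41402a", "00000000")

print(mismatched_resources(GOOD))  # []
print(mismatched_resources(BAD))   # [0]
```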

@linonetwo
Author

Thanks, this makes sense. I didn't realize it could be as simple as this; I was also trying to make it all-in-one.
