Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double mnemonic #432

Merged
merged 44 commits into from
Aug 20, 2024
Merged

Double mnemonic #432

merged 44 commits into from
Aug 20, 2024

Conversation

tadeubas
Copy link
Contributor

@tadeubas tadeubas commented Jul 31, 2024

PR to fulfill this issue: #318

  • Show an asterisk after BIP39 mnemonic title if it is a double mnemonic when creating/loading a 24w mnemonic
  • Option to create a new double mnemonic from camera snapshot
  • Changed "Processing ..." to "Processing.." to adhere to the standard we use elsewhere
  • Changed "Saving ..." to "Processing.." in order to optimize translations
  • Create tests...
  • Create docs...

What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

@tadeubas tadeubas changed the base branch from main to develop July 31, 2024 23:52
Copy link

codecov bot commented Jul 31, 2024

Codecov Report

Attention: Patch coverage is 98.09524% with 2 lines in your changes missing coverage. Please review.

Project coverage is 94.20%. Comparing base (af4b621) to head (478730c).
Report is 45 commits behind head on develop.

Files Patch % Lines
src/krux/pages/login.py 93.93% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #432      +/-   ##
===========================================
+ Coverage    94.15%   94.20%   +0.04%     
===========================================
  Files           58       59       +1     
  Lines         7189     7264      +75     
===========================================
+ Hits          6769     6843      +74     
- Misses         420      421       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@odudex
Copy link
Member

odudex commented Aug 5, 2024

Before we proceed with this PR, I would like to see a more detailed context, including a clear definition of the feature and the term 'double mnemonic.' Additionally, the goals and demands being addressed.
The docs should also include a description of the functionality, and a tool to generate 'double mnemonics' must be created, as it does not currently exist in Krux.

@tadeubas
Copy link
Contributor Author

Finished this PR for review!

@tadeubas tadeubas marked this pull request as ready for review August 13, 2024 23:13
@jdlcdl
Copy link
Collaborator

jdlcdl commented Aug 14, 2024

As of latest optimization, uncontrolled testing (using Amigo, doing double-mnemonic again and again, watching the usb console) is showing that a try takes just less than 6ms.

My testing over many variations of this pr, finding 77 double mnemonics indicates that a double was found roughly once in 246 tries. Is my math correct here on the expected probability being 1/256? In Tadeu's implementation here, a try is done knowing the first and the second valid 12-word mnemonics, therefore putting these together into a 24-word mnemonic will have a valid last word 1 in 256 times just like any other 23 words + another word will have the same probability of being valid?
...and that is very different than an incorrect intuition that 24 scrambled words might have a 1 in 256 chance of being a valid double mnemonic. Rather, 24 scrambled words would have a 1/256 chance of being a valid 24w mnemonic, then the first 12w would have a 1/16 chance of being valid, and the second 12w would also have a 1/16 chance of being valid... for once in 16 * 16 * 256 scrambled-24word phrases 1/65536???

Nice job on implementation, documentation, and automated tests!

@jdlcdl
Copy link
Collaborator

jdlcdl commented Aug 14, 2024

... for once in 16 * 16 * 256 scrambled-24word phrases 1/65536???

Answering my second question above with brute-force -- might add "and ignorance" ;)

import random
from embit import bip39

tries = 0
found = 0

while tries < 2**24:
    tries += 1
    words24 = random.choices(bip39.WORDLIST, k=24)
    if (
        bip39.mnemonic_is_valid(" ".join(words24[:12]))
        and bip39.mnemonic_is_valid(" ".join(words24[12:]))
        and bip39.mnemonic_is_valid(" ".join(words24))
    ):
        print(f"double: \"{' '.join(words24)}\" found on try: {tries}")
        found += 1

if found:
    print(f"found {found} / tried {tries}: 1/{tries/found}")
else:
    print(f"found none, tried {tries}")

Which reported:

...
found 267 / tried 16777216: 1/62836.01498127341

...while I was expecting to find 254; It's "in the ballpark" to me.

@tadeubas
Copy link
Contributor Author

... for once in 16 * 16 * 256 scrambled-24word phrases 1/65536???

Answering my second question above with brute-force -- might add "and ignorance" ;)

import random
from embit import bip39

can't believe you've used embit version to do this brute-force 🤣

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let "validate_position" default to True, this is default behavior.

@odudex
Copy link
Member

odudex commented Aug 16, 2024

Great work!
This PR goes beyond the cool feature and brings many other good optimizations. I liked them all!
Only minor detail I would change is restore default touch behavior, that is to validate position(ignoring touches with incomplete or invalid metadata) as default, for stability reasons.
Other than that I believe we could move on and merge it!

@tadeubas
Copy link
Contributor Author

Thank you!
Regarding the change in the default touch behavior, what are the stability reasons? Every call that was passing the False/True argument is behaving the same way as before, only the default has been changed to cover most cases (hence the default).

@jdlcdl
Copy link
Collaborator

jdlcdl commented Aug 16, 2024

Storing notes about my findings of krux.bip39, which replaces embit.bip39.mnemonic_to_bytes() as 1) a complete rewrite using big integers that I'll argue is easier to read/understand, and more importantly 2) which has significant performance enhancements that pay-off when brute-forcing mnemonics.

I had originally considered submitting a pr to embit with these changes (even though I find it arguably scandalous for an anonymous contributor to rewrite such a well reviewed/tested-in-production/stable function from a well respected developer) but I needed testing/measuring of the performance differences on at least a small set of hardware devices.

There are two major differences in the implementation of krux.bip39.mnemonic_to_bytes() which explain the significant speedup.

  • 2x-3x speedup: via try: index = wordlist.index(word) with an except leg that raises the ValueError, instead of if word not in wordlist: raising the same.
  • 2x-25x speedup: via using a memory-hungry WORDLIST dict with word keys that map to an integer index.

The rest of the changes in krux.bip39.mnemonic_to_bytes() implementation might gain an extra 2-8% increase in performance, but not more than that on any of the devices I tested.

Because embit is geared towards limited-resource micro-controllers, I will plan only to submit a "safe" pull-request to embit which addresses the first performance boost above via the try/except block. As for the second WORDLIST dict, while it succeeded without memory errors in all of my device tests, I still believe it will be a contentious change for embit.

But, I have good news. Both embit.bip39.mnemonic_is_valid() and embit.bip39.mnemonic_to_bytes() accept an optional wordlist parameter and 3rd party apps, like krux's double-mnemonic code, can pass in a more efficient structure if they need added performance. If/when embit implements the first speedup, we can ditch krux.bip39 and use this trick.

# this dict is more efficient than calling list.index(word)
class WordIndex(dict):
    def index(self, word):
        return self[word]

# if device has enough memory and is brute-forcing, then use a WordIndex object
wordindex = WordIndex({word: i for i, word in enumerate(bip39.WORDLIST)})
bip39.mnemonic_is_valid(mnemonic, wordlist=wordindex)
bip39.mnemonic_to_bytes(mnemonic, wordlist=wordindex)

Notes related to my testing included below:

krux.bip39 using WORDINDEX returns in X% of the time and is Y times faster than embit currently:
* ubuntu amd64-Xeon:     ~2%, ~50x faster
* ubuntu arm64-pi4:      ~3%, ~30x faster
* raspios arm-pi0W:      ~2%, ~57x faster
* MaixPy k210-Amigo:    ~18%,  ~6x faster
* upython esp32s3wroom:  ~5%, ~22x faster
* upython rp2040-picoW: ~16%,  ~6x faster

krux.bip39 using WORDINDEX returns in X% of the time and is Y times faster than embit using `try/except` speedup:
* ubuntu amd64-Xeon:     ~4%, ~26x faster
* ubuntu arm64-pi4:      ~6%, ~16x faster
* raspios arm-pi0W:      ~3%, ~30x faster
* MaixPy k210-Amigo:    ~56%,  ~2x faster
* upython esp32s3wroom: ~20%,  ~5x faster
* upython rp2040-picoW: ~45%,  ~2x faster

krux.bip39 w/o WORDINDEX returns in X% of the time and is Y times faster than embit currently:
* ubuntu amd64-Xeon:    ~50%,  ~2x faster
* ubuntu arm64-pi4:     ~55%,  ~2x faster
* raspios arm-pi0W:     ~50%,  ~2x faster
* MaixPy k210-Amigo:    ~30%,  ~3x faster
* upython esp32s3wroom: ~21%,  ~5x faster
* upython rp2040-picoW: ~34%,  ~3x faster

krux.bip39 w/o WORDINDEX returns in X% of the time and is Y times faster than embit using `try/except` speedup:
* ubuntu amd64-Xeon:    ~95%,  insignificant
* ubuntu arm64-pi4:     ~93%,  insignificant
* raspios arm-pi0W:     ~94%,  insignificant
* MaixPy k210-Amigo:    ~96%,  insignificant
* upython esp32s3wroom: ~97%,  insignificant
* upython rp2040-picoW: ~98%,  insignificant

Since I still believe the rewrite is scandalous and heresy, I'll plan instead to pr the "safe" try/except speedup to embit.

@odudex
Copy link
Member

odudex commented Aug 17, 2024

Thank you! Regarding the change in the default touch behavior, what are the stability reasons? Every call that was passing the False/True argument is behaving the same way as before, only the default has been changed to cover most cases (hence the default).

The majority of touch handling code necessitates a valid position. Failing to provide one can result in unpredictable behavior, which is why I consider position validation to be default behavior.

@tadeubas
Copy link
Contributor Author

Returned the previous behavior to input validate_position

@odudex odudex merged commit 3cdb8ec into selfcustody:develop Aug 20, 2024
7 checks passed
@odudex
Copy link
Member

odudex commented Aug 20, 2024

Thank you!

@tadeubas tadeubas deleted the double-mnemonic-318 branch September 4, 2024 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants