-
Notifications
You must be signed in to change notification settings - Fork 353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement llvm.x86.aesni.*
intrinsics
#3101
Conversation
Thanks for the PR! Now we are getting into territory where we'll have to make some non-trivial decisions. I am honestly not super excited about having to maintain an AES implementation. I cannot review this code, and implementing crypto is always dicey. So, in other words I see this PR as coming with a pretty high long-term cost, much more so than the other x86 intrinsic PRs we had so far. How strong is the need to have these intrinsics in Miri, compared to the cost? Can we avoid carrying our own AES implementation, can we import a (pure-Rust) crypto library instead? @rust-lang/miri what do you think? |
Has aes::hazmat been considered? Presumably with aes = { version = "0.8", features = ["aes_force_soft"] } |
|
doh. |
src/shims/x86/aesni.rs
Outdated
|
||
/// SubWord - S-box transformation on 4 bytes | ||
fn sub_word(w: u32) -> u32 { | ||
u32::from_ne_bytes(w.to_ne_bytes().map(|byte| SBOX[usize::from(byte)])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would strongly caution against any implementation of AES which uses lookup tables like this. They are notorious for cache timing and other microarchitectural sidechannels:
https://eprint.iacr.org/2018/1002.pdf
Modern implementations of AES, such as the one in the aes
crate, use techniques like bitslicing/fixslicing to avoid such sidechannels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nobody should ever run Miri on secret data anyway. So if this ever becomes relevant something already went seriously wrong.
But indeed that's one of the reasons why I don't want to maintain an AES implementation.^^
@tarcieri so
That sounds like we should ask the crate author to turn it into a crate feature? |
I'm the author (or rather, translator), so take it with a grain of salt. However, it is using one of the best, most modern techniques available.
It was a crate feature in the past. If one (potentially transitive) dependency enabled the feature, it would disable the use of AES-NI for every other application in the same binary, a problem which came up in practice. As it subtracts functionality, rather than adding it, it does not map well to Cargo features. We switched to the use of a cfg attribute to move this decision to the toplevel binary, rather than one which could be made by any dependency with no way for the developer of the toplevel binary to disable it without patching their dependencies. |
Ah, good to know, thanks! That still give me confidence. :)
It's not subtractive as far as the API is concerned, right? I view it as the additive guarantee that no AES hardware instructions will be used. But maybe that's a somewhat stretched position. ;)
Miri as the toplevel binary still doesn't have a good way to make this decision though. It can't be set in Cargo.toml, I think? We'd have to make sure that every way of building Miri (both in this repo and as part of the rustc distribution) sets the |
Is there a specific reason you need to disable support for hardware intrinsics? The |
Yeah I guess in the end we don't really care how |
+1 on using an external crate instead of implementing including our own implementation. I implemented I still couldn't remove the S-Box transformation because it is needed by |
You can find a description of the steps performed by Unfortunately that's the one part we don't have a portable wrapper for yet, though it would be something interesting for us to provide. |
@eduardosm can you restrict this PR to the ones where we can use the |
Definitely meant to respond to this earlier, but in any case I approve of the reasoning and direction here. If I could choose, I would rather we have all the sort of data processing/vectorization intrinsics than these special purpose ones. I am more excited about being able to execute a crate like simd-json or atoi_simd in Miri than running cryptography code. But I'm also just happy @eduardosm is on this intrinsics tear. I never really expected anyone would turn up and just do so many of them. Hero work. |
@tarcieri since you're here anyway, could I ask you to take a look at how we are invoking |
@RalfJung the portable low-level API we implement is specifically designed to resemble the AES-NI API, so it should be a fairly straightforward mapping. What I skimmed looked fine, but I can review in depth later. |
@tarcieri it looks like some hacks are needed where the operation does more than we need: "undo the InvMixColumns with MixColumns". But maybe that's not as hacky as it looks to me. :) |
@RalfJung yeah, that's another gap in the API. We can try to provide a fully AES-NI compatible portable API, but... The portability story is rather tricky, because we try to provide an API which is fast on both Intel and ARM, but Intel and ARM chose to implement AES in two completely different ways: ARM uses standard AES decryption and Intel uses a technique called the equivalent inverse cipher. On top of that, we also fall back to a portable software implementation, so everything needs to work on all three backends. What we have implements the equivalent inverse cipher (i.e. AES-NI-like API) along with exposing various operations like mix columns and inverse mix columns, and while the latter are native instructions on ARM, on Intel we have to emulate mix columns on Intel by invoking inverse mix columns three times, and to provide all of the instructions like I would just suggest filing issues about the API gaps and we can decide if first-class support makes sense in the |
Okay, so the fact that we do some emulation by "un-inv-mixing" seems fine then, thanks. Looks like the keygenassist operation cannot be emulated (if I understood @eduardosm correctly); we'll just not support that intrinsic for now. |
☔ The latest upstream changes (presumably #3108) made this pull request unmergeable. Please resolve the merge conflicts. |
a2c3544
to
7d24fc7
Compare
//@ignore-target-s390x | ||
//@ignore-target-thumbv7em | ||
//@ignore-target-wasm32 | ||
//@compile-flags: -C target-feature=+aes,+vaes,+avx512f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avx512f
should also be asserted below then.
7d24fc7
to
c01e7a9
Compare
I can only agree, thanks a ton for all your work. :) @bors r+ |
☀️ Test successful - checks-actions |
Thanks everyone for your reviews and suggestions! |
No description provided.