Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Document bounds and remove redundant reductions (#208)
* Remove redundant `poly_reduce()` calls The typical flow of arithmetic data is as follows: - Forward NTT - Basemul - One of: * Remove Montgomery factor + invNTT * Remove Montgomery factor - Reduction - Serialization At present, `poly_reduce()` features in multple places in this flow, serving multiple purposes: First, it is used after the fwd NTT and after the base mul. The purpose of those `poly_reduce()` calls is to merely keep the data small, with details of canonicity and sign being irrelevant. Second, `poly_reduce()` is used ahead of serialization: Here, we do need _unsigned_ canonical representatives. The C and AArch64 versions of `poly_reduce()` produce _signed_ canonical Barret reduction, forcing the serialization functions to complete the normalization through a conditional addition. The AVX2 code uses an _unsigned_ 'almost-canonical' `poly_reduce()`, giving a representative in [0,q] (inclusive upper bound!) which are subject to conditional _subtraction_ during serialization. In a nutshell, conceptually we'd like an unsigned canonical reduction ahead of serialization, and non-strict (faster) reduction everywhere else. Looking closer, the uses of `poly_reduce()` after the forward NTT and after the basemul appear unnecessary. Reduction after basemul: - Basemul output fed into `poly_tomont()` is reduced through the Montgomery multiplication that's done in `poly_tomont()`. - Basemul output fed into inv NTT is potentially problematic if not reduced: The invNTT does currently _not_ reduce its input. _However_, the invNTT does need a scaling by `mont^2/128`. This scaling is currently happening at the _end_ of the invNTT. Instead moving it to the _start_ of the invNTT reduces the input in the same way as `poly_tomont()` does. This change affects both the C reference NTT as well as the AArch64 NTT. Reduction after fwd NTT: A reduction after the forward NTT is not needed since base multiplication does not overflow provided _one_ factor is bound by q, which is always the case in MLKEM. Signed-off-by: Hanno Becker <[email protected]> * Document and runtime-check bounds for arithmetic data This commit documents bounds on arithmetic data as it flows through common operations like NTT, invNTT and basemul. Moreover, it introduces debug macros like POLY_BOUND or POLYVEC_BOUND which can be used in debug builds to check the documented bounds at runtime. Moreover, at a later point those assertions should be proved by converting them to CBMC assertions. Finally, the bounds show that canonicity of the result of the NTT is not needed, and this commit removes the Barrett reduction at the end of the AArch64 ASM NTT. Signed-off-by: Hanno Becker <[email protected]> * Reduce pk & sk after unpacking It's not standards-required for the unpacked sk to be be reduced (its coefficients can be up to 2^12 in size), but our bounds reasoning assumes it to be. For pk, it must be checked at the top-level that the byte stream unpacks to coefficients in bound, which however has not yet been implemented. Until that's done, reduce the pk after unpacking, so that the lower level routines are only called with canonical pk. Signed-off-by: Hanno Becker <[email protected]> * Add Python script confirming Barrett/Montgomery relation and bound Signed-off-by: Hanno Becker <[email protected]> * Remove TODO in AArch64 intt_clean.S Signed-off-by: Hanno Becker <[email protected]> * Add bound for fwd NTT and static non-overflow assertion in keygen Signed-off-by: Hanno Becker <[email protected]> * Fix formatting Signed-off-by: Hanno Becker <[email protected]> * Address further review feedback Signed-off-by: Hanno Becker <[email protected]> * Introduce single contractual bound for [inv]NTT output Reasoning about safety of the C 'frontend' should not depend on details of the arithmetic backend (ref/native). We thus introduce single bounds NTT_BOUND and INVNTT_BOUND on the absolute values of the output of the [inv]NTT that any implementation needs to satisfy. For every specific implementation, we still define and check (in debug builds) for tighter bounds, plus add a static assertion that the implementation-specific bound is smaller than the contractual bound. Signed-off-by: Hanno Becker <[email protected]> * Rewrite bounds-checking macros to work with poly and poly_mulcache Signed-off-by: Hanno Becker <[email protected]> * Fix typo in ntt.c Signed-off-by: Hanno Becker <[email protected]> * Document output bounds for poly_mulcache_compute() Signed-off-by: Hanno Becker <[email protected]> * Document+Check input bound for polyvec_basemul_acc_montgomery_cached Signed-off-by: Hanno Becker <[email protected]> * Add input bounds check for NTT Signed-off-by: Hanno Becker <[email protected]> * Make wording of bounds estimates more accessible Signed-off-by: Hanno Becker <[email protected]> * Document implementation-defined C behaviour in montgomery_reduce() See pq-crystals/kyber#77 Signed-off-by: Hanno Becker <[email protected]> * Run functional CI in debugging mode Signed-off-by: Hanno Becker <[email protected]> --------- Signed-off-by: Hanno Becker <[email protected]>
- Loading branch information