Crystals aes #1

Closed
wants to merge 15 commits into from
5 changes: 4 additions & 1 deletion .CMake/alg_support.cmake
@@ -15,7 +15,10 @@ if(NOT WIN32)
else()
option(OQS_USE_OPENSSL "Enable OpenSSL usage" OFF)
endif()
cmake_dependent_option(OQS_USE_AES_OPENSSL "" ON "OQS_USE_OPENSSL" OFF)

# Use OpenSSL's AES only if neither AES-NI nor an x86-64 dist build is used.
# Reason: the AES-NI implementation is a better fit for our incremental API.
cmake_dependent_option(OQS_USE_AES_OPENSSL "" ON "OQS_USE_OPENSSL; NOT OQS_DIST_X86_64_BUILD; NOT OQS_USE_AES_INSTRUCTIONS" OFF)
cmake_dependent_option(OQS_USE_SHA2_OPENSSL "" ON "OQS_USE_OPENSSL" OFF)
# Disable OpenSSL's SHA3 by default. The implementation is not complete
# enough to support our incremental API.
4 changes: 3 additions & 1 deletion .CMake/gcc_clang_intrinsics.cmake
@@ -12,7 +12,9 @@ if(NOT RUN_RESULT EQUAL 0)
message(FATAL_ERROR ".CMake/detect_gcc_clang_intrinsics.c returned exit code: " ${RUN_RESULT})
endif()
foreach(CPU_EXTENSION ${RUN_OUTPUT})
set(OQS_USE_${CPU_EXTENSION}_INSTRUCTIONS ON)
if (NOT DEFINED OQS_USE_${CPU_EXTENSION}_INSTRUCTIONS)
set(OQS_USE_${CPU_EXTENSION}_INSTRUCTIONS ON)
endif()
endforeach()
if(OQS_USE_AVX512BW_INSTRUCTIONS AND
OQS_USE_AVX512DQ_INSTRUCTIONS AND
9 changes: 7 additions & 2 deletions CONFIGURE.md
@@ -7,6 +7,7 @@ The following options can be passed to CMake before the build file generation pr
- [OQS_ENABLE_KEM_\<ALG\>/OQS_ENABLE_SIG_\<ALG\>](#OQS_ENABLE_KEM_\<ALG\>/OQS_ENABLE_SIG_\<ALG\>)
- [OQS_MINIMAL_BUILD](#OQS_MINIMAL_BUILD)
- [OQS_DIST_BUILD](#OQS_DIST_BUILD)
- [OQS_USE_\<CPU_FEATURE\>_INSTRUCTIONS](#OQS_USE_\<CPU_FEATURE\>_INSTRUCTIONS)
- [OQS_USE_OPENSSL](#OQS_USE_OPENSSL)
- [OQS_OPT_TARGET](#OQS_OPT_TARGET)
- [OQS_SPEED_USE_ARM_PMU](#OQS_SPEED_USE_ARM_PMU)
@@ -57,15 +58,19 @@ When built for distribution, the library will run on any CPU of the target archi

When built for use on a single machine, the library will only include the best available code for the target micro-architecture (see [OQS_OPT_TARGET](#OQS_OPT_TARGET)).

## OQS_USE_\<CPU_FEATURE\>_INSTRUCTIONS

These can be set to `ON` or `OFF` and take effect if liboqs is built for use on a single machine. By default, the CPU features are detected automatically and set to `ON` or `OFF` based on the CPU features available on the build system. The default values can be overridden by providing CMake build options. The available options on x86-64 are: `OQS_USE_ADX_INSTRUCTIONS`, `OQS_USE_AES_INSTRUCTIONS`, `OQS_USE_AVX_INSTRUCTIONS`, `OQS_USE_AVX2_INSTRUCTIONS`, `OQS_USE_AVX512_INSTRUCTIONS`, `OQS_USE_BMI1_INSTRUCTIONS`, `OQS_USE_BMI2_INSTRUCTIONS`, `OQS_USE_PCLMULQDQ_INSTRUCTIONS`, `OQS_USE_VPCLMULQDQ_INSTRUCTIONS`, `OQS_USE_POPCNT_INSTRUCTIONS`, `OQS_USE_SSE_INSTRUCTIONS`, `OQS_USE_SSE2_INSTRUCTIONS` and `OQS_USE_SSE3_INSTRUCTIONS`. The available options on ARM64v8 are: `OQS_USE_ARM_AES_INSTRUCTIONS`, `OQS_USE_ARM_SHA2_INSTRUCTIONS`, `OQS_USE_ARM_SHA3_INSTRUCTIONS` and `OQS_USE_ARM_NEON_INSTRUCTIONS`.
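
The override rule can be pictured with a small model. The C sketch below is not liboqs code: `feature_setting` and `resolve_feature` are hypothetical names introduced only for illustration, and the sketch assumes that a value passed explicitly as a CMake build option always takes precedence over auto-detection.

```c
#include <stdbool.h>

/* Hypothetical tri-state for an OQS_USE_<CPU_FEATURE>_INSTRUCTIONS option:
 * not passed on the command line, passed as OFF, or passed as ON. */
typedef enum { FEATURE_UNSET, FEATURE_OFF, FEATURE_ON } feature_setting;

/* Assumed rule: an explicit setting wins; otherwise fall back to what
 * was detected on the build system. */
static bool resolve_feature(feature_setting user, bool detected_on_build_system) {
    if (user == FEATURE_UNSET) {
        return detected_on_build_system;
    }
    return user == FEATURE_ON;
}
```

Under this model, passing `OFF` for a feature that the build system supports disables it, which is the case this option is mainly useful for.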

## OQS_USE_OPENSSL

This can be set to `ON` or `OFF`. When `ON`, the additional options `OQS_USE_AES_OPENSSL`, `OQS_USE_SHA2_OPENSSL`, and `OQS_USE_SHA3_OPENSSL` are made available to control whether liboqs uses OpenSSL's AES, SHA-2, and SHA-3 implementations. By default, `OQS_USE_AES_OPENSSL` and `OQS_USE_SHA2_OPENSSL` are `ON` while `OQS_USE_SHA3_OPENSSL` is `OFF`.
This can be set to `ON` or `OFF`. When `ON`, the additional options `OQS_USE_AES_OPENSSL`, `OQS_USE_SHA2_OPENSSL`, and `OQS_USE_SHA3_OPENSSL` are made available to control whether liboqs uses OpenSSL's AES, SHA-2, and SHA-3 implementations. By default, `OQS_USE_AES_OPENSSL` is `ON` (on x86-64, only if neither `OQS_DIST_BUILD` nor `OQS_USE_AES_INSTRUCTIONS` is set), `OQS_USE_SHA2_OPENSSL` is `ON`, and `OQS_USE_SHA3_OPENSSL` is `OFF`.

When `OQS_USE_OPENSSL` is `ON`, CMake also scans the filesystem to find the minimum version of OpenSSL required by liboqs (which happens to be 1.1.1). The `OPENSSL_ROOT_DIR` option can be set to aid CMake in its search.
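
As a rough illustration of the `OQS_USE_AES_OPENSSL` default described above, the following C sketch restates the condition as a boolean function. The function name `aes_openssl_default` is hypothetical, and the sketch only models the default value; it does not account for a user explicitly overriding the option while the condition holds.

```c
#include <stdbool.h>

/* Models the default of OQS_USE_AES_OPENSSL: ON only when OpenSSL is enabled
 * and neither an x86-64 dist build nor the AES instructions are in use. */
static bool aes_openssl_default(bool use_openssl,
                                bool dist_x86_64_build,
                                bool use_aes_instructions) {
    return use_openssl && !dist_x86_64_build && !use_aes_instructions;
}
```

For example, a single-machine build with OpenSSL enabled on a CPU with AES instructions would, under this model, use the built-in AES code rather than OpenSSL's.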

## OQS_OPT_TARGET

An optimization target. Only has an effect if the compiler is GCC or Clang and `OQS_DIST_BUILD=OFF`. Can take any valid input to the `-march` (on x86_64) or `-mcpu` (on ARM32v7 or ARM64v8) option for `CMAKE_C_COMPILER`. Can also be set to one of the following special values.
An optimization target. Only has an effect if the compiler is GCC or Clang and `OQS_DIST_BUILD=OFF`. Can take any valid input to the `-march` (on x86-64) or `-mcpu` (on ARM32v7 or ARM64v8) option for `CMAKE_C_COMPILER`. Can also be set to one of the following special values.
- `auto`: Use `-march=native` or `-mcpu=native` (if the compiler supports it).
- `generic`: Use `-march=x86-64` on x86-64, or `-mcpu=cortex-a5` on ARM32v7, or `-mcpu=cortex-a53` on ARM64v8.
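
The mapping from `OQS_OPT_TARGET` to a compiler flag can be sketched as follows. This is an illustrative model in C, not the build system's implementation: `target_arch` and `opt_target_flag` are hypothetical names, and the sketch skips the check for whether the compiler supports `native`.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical architecture tag, for illustration only. */
typedef enum { ARCH_X86_64, ARCH_ARM32V7, ARCH_ARM64V8 } target_arch;

/* Writes the flag that corresponds to opt_target into out and returns out.
 * "auto" and "generic" are the special values; anything else is passed
 * through unchanged as the -march/-mcpu argument. */
static const char *opt_target_flag(target_arch arch, const char *opt_target,
                                   char *out, size_t out_len) {
    const char *option = (arch == ARCH_X86_64) ? "-march" : "-mcpu";
    const char *value = opt_target;

    if (strcmp(opt_target, "auto") == 0) {
        value = "native";
    } else if (strcmp(opt_target, "generic") == 0) {
        switch (arch) {
        case ARCH_X86_64:  value = "x86-64";     break;
        case ARCH_ARM32V7: value = "cortex-a5";  break;
        case ARCH_ARM64V8: value = "cortex-a53"; break;
        }
    }
    snprintf(out, out_len, "%s=%s", option, value);
    return out;
}
```

Any value other than the two special ones is assumed to be a valid `-march` or `-mcpu` argument for the compiler in use; the sketch does not validate it.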

6 changes: 2 additions & 4 deletions scripts/copy_from_upstream/copy_from_upstream.yml
@@ -15,18 +15,16 @@ upstreams:
git_branch: master
git_commit: faf5c3fe33e0b61c7c8a7888dd862bf5def17ad2
kem_meta_path: '{pretty_name_full}_META.yml'
common_meta_path: 'Common_META.yml'
kem_scheme_path: '.'
patches: [pqcrystals-kyber-yml.patch, pqcrystals-kyber-ref-shake.patch, pqcrystals-kyber-avx2-shake.patch]
patches: [pqcrystals-kyber-yml.patch, pqcrystals-kyber-ref-shake-aes.patch, pqcrystals-kyber-avx2-shake-aes.patch]
-
name: pqcrystals-dilithium
git_url: https://github.com/pq-crystals/dilithium.git
git_branch: master
git_commit: 61b51a71701b8ae9f546a1e5d220e1950ed20d06
sig_meta_path: '{pretty_name_full}_META.yml'
common_meta_path: 'Common_META.yml'
sig_scheme_path: '.'
patches: [pqcrystals-dilithium-yml.patch, pqcrystals-dilithium-ref-shake.patch, pqcrystals-dilithium-avx2-shake.patch]
patches: [pqcrystals-dilithium-yml.patch, pqcrystals-dilithium-ref-shake-aes.patch, pqcrystals-dilithium-avx2-shake-aes.patch]
kems:
-
name: classic_mceliece
@@ -124,10 +124,61 @@ index 0e9e988..bb268fd 100644

/*************************************************
diff --git a/avx2/sign.c b/avx2/sign.c
index 3dee7a6..408f0ba 100644
index 3dee7a62..8c254f07 100644
--- a/avx2/sign.c
+++ b/avx2/sign.c
@@ -197,7 +197,7 @@ int crypto_sign_signature(uint8_t *sig, size_t *siglen, const uint8_t *m, size_t
@@ -97,17 +97,18 @@ int crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {

/* Sample short vectors s1 and s2 */
#ifdef DILITHIUM_USE_AES
- aes256ctr_init(&aesctx, rhoprime, 0);
+ aes256ctr_init_u64(&aesctx, rhoprime, 0);
for(i = 0; i < L; ++i) {
nonce = i;
- aesctx.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&aesctx, nonce);
poly_uniform_eta_preinit(&s1.vec[i], &aesctx);
}
for(i = 0; i < K; ++i) {
nonce = L + i;
- aesctx.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&aesctx, nonce);
poly_uniform_eta_preinit(&s2.vec[i], &aesctx);
}
+ aes256_ctx_release(&aesctx);
#elif K == 4 && L == 4
poly_uniform_eta_4x(&s1.vec[0], &s1.vec[1], &s1.vec[2], &s1.vec[3], rhoprime, 0, 1, 2, 3);
poly_uniform_eta_4x(&s2.vec[0], &s2.vec[1], &s2.vec[2], &s2.vec[3], rhoprime, 4, 5, 6, 7);
@@ -134,7 +135,7 @@ int crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
polyvecl_ntt(&s1);

#ifdef DILITHIUM_USE_AES
- aes256ctr_init(&aesctx, rho, 0);
+ aes256ctr_init_u64(&aesctx, rho, 0);
#endif

for(i = 0; i < K; i++) {
@@ -142,7 +143,7 @@ int crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
#ifdef DILITHIUM_USE_AES
for(unsigned int j = 0; j < L; j++) {
nonce = (i << 8) + j;
- aesctx.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&aesctx, nonce);
poly_uniform_preinit(&row->vec[j], &aesctx);
poly_nttunpack(&row->vec[j]);
}
@@ -164,6 +165,10 @@ int crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
polyt0_pack(sk + 3*SEEDBYTES + (L+K)*POLYETA_PACKEDBYTES + i*POLYT0_PACKEDBYTES, &t0);
}

+#ifdef DILITHIUM_USE_AES
+ aes256_ctx_release(&aesctx);
+#endif
+
/* Compute H(rho, t1) and store in secret key */
shake256(sk + 2*SEEDBYTES, SEEDBYTES, pk, CRYPTO_PUBLICKEYBYTES);

@@ -197,7 +202,7 @@ int crypto_sign_signature(uint8_t *sig, size_t *siglen, const uint8_t *m, size_t
polyvecl y;
polyveck w0;
} tmpv;
@@ -136,7 +187,7 @@ index 3dee7a6..408f0ba 100644

rho = seedbuf;
tr = rho + SEEDBYTES;
@@ -207,11 +207,11 @@ int crypto_sign_signature(uint8_t *sig, size_t *siglen, const uint8_t *m, size_t
@@ -207,11 +212,11 @@ int crypto_sign_signature(uint8_t *sig, size_t *siglen, const uint8_t *m, size_t
unpack_sk(rho, tr, key, &t0, &s1, &s2, sk);

/* Compute CRH(tr, msg) */
@@ -153,7 +204,24 @@ index 3dee7a6..408f0ba 100644

#ifdef DILITHIUM_RANDOMIZED_SIGNING
randombytes(rhoprime, CRHBYTES);
@@ -268,11 +268,11 @@ rej:
@@ -227,14 +232,14 @@ int crypto_sign_signature(uint8_t *sig, size_t *siglen, const uint8_t *m, size_t

#ifdef DILITHIUM_USE_AES
aes256ctr_ctx aesctx;
- aes256ctr_init(&aesctx, rhoprime, 0);
+ aes256ctr_init_u64(&aesctx, rhoprime, 0);
#endif

rej:
/* Sample intermediate vector y */
#ifdef DILITHIUM_USE_AES
for(i = 0; i < L; ++i) {
- aesctx.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&aesctx, nonce);
nonce++;
poly_uniform_gamma1_preinit(&z.vec[i], &aesctx);
}
@@ -268,11 +273,11 @@ rej:
polyveck_decompose(&w1, &tmpv.w0, &w1);
polyveck_pack_w1(sig, &w1);

@@ -170,15 +238,19 @@ index 3dee7a6..408f0ba 100644
poly_challenge(&c, sig);
poly_ntt(&c);

@@ -317,6 +317,7 @@ rej:
@@ -317,6 +322,11 @@ rej:
hint[OMEGA + i] = pos = pos + n;
}

+#ifdef DILITHIUM_USE_AES
+ aes256_ctx_release(&aesctx);
+#endif
+
+ shake256_inc_ctx_release(&state);
/* Pack z into signature */
for(i = 0; i < L; i++)
polyz_pack(sig + SEEDBYTES + i*POLYZ_PACKEDBYTES, &z.vec[i]);
@@ -380,18 +381,19 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
@@ -380,18 +390,19 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
polyvecl *row = rowbuf;
polyvecl z;
poly c, w1, h;
@@ -204,7 +276,34 @@ index 3dee7a6..408f0ba 100644

/* Expand challenge */
poly_challenge(&c, sig);
@@ -454,11 +456,12 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
@@ -404,7 +415,7 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
}

#ifdef DILITHIUM_USE_AES
- aes256ctr_init(&aesctx, pk, 0);
+ aes256ctr_init_u64(&aesctx, pk, 0);
#endif

for(i = 0; i < K; i++) {
@@ -412,7 +423,7 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
#ifdef DILITHIUM_USE_AES
for(j = 0; j < L; j++) {
nonce = (i << 8) + j;
- aesctx.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&aesctx, nonce);
poly_uniform_preinit(&row->vec[j], &aesctx);
poly_nttunpack(&row->vec[j]);
}
@@ -449,16 +460,21 @@ int crypto_sign_verify(const uint8_t *sig, size_t siglen, const uint8_t *m, size
polyw1_pack(buf.coeffs + i*POLYW1_PACKEDBYTES, &w1);
}

+#ifdef DILITHIUM_USE_AES
+ aes256_ctx_release(&aesctx);
+#endif
+
/* Extra indices are zero for strong unforgeability */
for(j = pos; j < OMEGA; ++j)
if(hint[j]) return -1;

/* Call random oracle and verify challenge */
@@ -222,18 +321,46 @@ index 3dee7a6..408f0ba 100644
for(i = 0; i < SEEDBYTES; ++i)
if(buf.coeffs[i] != sig[i])
return -1;
diff --git a/avx2/polyvec.c b/avx2/polyvec.c
index 1d9c2e70..5ce1d887 100644
--- a/avx2/polyvec.c
+++ b/avx2/polyvec.c
@@ -25,16 +25,17 @@ void polyvec_matrix_expand(polyvecl mat[K], const uint8_t rho[SEEDBYTES]) {
uint64_t nonce;
aes256ctr_ctx state;

- aes256ctr_init(&state, rho, 0);
+ aes256ctr_init_u64(&state, rho, 0);

for(i = 0; i < K; i++) {
for(j = 0; j < L; j++) {
nonce = (i << 8) + j;
- state.n = _mm_loadl_epi64((__m128i *)&nonce);
+ aes256ctr_init_iv_u64(&state, nonce);
poly_uniform_preinit(&mat[i].vec[j], &state);
poly_nttunpack(&mat[i].vec[j]);
}
}
+ aes256_ctx_release(&state);
}

#elif K == 4 && L == 4
diff --git a/avx2/symmetric.h b/avx2/symmetric.h
index 7eb6f98..ed476d1 100644
--- a/avx2/symmetric.h
+++ b/avx2/symmetric.h
@@ -17,29 +17,33 @@ typedef aes256ctr_ctx stream256_state;
@@ -15,31 +15,35 @@ typedef aes256ctr_ctx stream256_state;
#define STREAM128_BLOCKBYTES AES256CTR_BLOCKBYTES
#define STREAM256_BLOCKBYTES AES256CTR_BLOCKBYTES

#define stream128_init(STATE, SEED, NONCE) aes256ctr_init(STATE, SEED, NONCE)
-#define stream128_init(STATE, SEED, NONCE) aes256ctr_init(STATE, SEED, NONCE)
+#define stream128_init(STATE, SEED, NONCE) aes256ctr_init_u64(STATE, SEED, NONCE)
#define stream128_squeezeblocks(OUT, OUTBLOCKS, STATE) aes256ctr_squeezeblocks(OUT, OUTBLOCKS, STATE)
+#define stream128_release(STATE)
#define stream256_init(STATE, SEED, NONCE) aes256ctr_init(STATE, SEED, NONCE)
-#define stream256_init(STATE, SEED, NONCE) aes256ctr_init(STATE, SEED, NONCE)
+#define stream128_release(STATE) aes256_ctx_release(STATE)
+#define stream256_init(STATE, SEED, NONCE) aes256ctr_init_u64(STATE, SEED, NONCE)
#define stream256_squeezeblocks(OUT, OUTBLOCKS, STATE) aes256ctr_squeezeblocks(OUT, OUTBLOCKS, STATE)
+#define stream256_release(STATE)
+#define stream256_release(STATE) aes256_ctx_release(STATE)

#else

@@ -196,16 +196,18 @@ diff --git a/ref/symmetric.h b/ref/symmetric.h
index 0b34fb6..13c88da 100644
--- a/ref/symmetric.h
+++ b/ref/symmetric.h
@@ -24,25 +24,27 @@ void dilithium_aes256ctr_init(aes256ctr_ctx *state,
@@ -24,25 +24,29 @@ void dilithium_aes256ctr_init(aes256ctr_ctx *state,
dilithium_aes256ctr_init(STATE, SEED, NONCE)
#define stream128_squeezeblocks(OUT, OUTBLOCKS, STATE) \
aes256ctr_squeezeblocks(OUT, OUTBLOCKS, STATE)
+#define stream128_release(STATE)
+#define stream128_release(STATE) \
+ aes256_ctx_release(STATE)
#define stream256_init(STATE, SEED, NONCE) \
dilithium_aes256ctr_init(STATE, SEED, NONCE)
#define stream256_squeezeblocks(OUT, OUTBLOCKS, STATE) \
aes256ctr_squeezeblocks(OUT, OUTBLOCKS, STATE)
+#define stream256_release(STATE)
+#define stream256_release(STATE) \
+ aes256_ctx_release(STATE)

#else

@@ -228,7 +230,7 @@ index 0b34fb6..13c88da 100644
const uint8_t seed[CRHBYTES],
uint16_t nonce);

@@ -53,10 +55,12 @@ void dilithium_shake256_stream_init(keccak_state *state,
@@ -53,10 +57,12 @@ void dilithium_shake256_stream_init(keccak_state *state,
dilithium_shake128_stream_init(STATE, SEED, NONCE)
#define stream128_squeezeblocks(OUT, OUTBLOCKS, STATE) \
shake128_squeezeblocks(OUT, OUTBLOCKS, STATE)