This repository has been archived by the owner on Feb 19, 2024. It is now read-only.

Commit

ig we push these
Hanting Zhang committed Jan 12, 2024
1 parent ef54fc2 commit 3f183e6
Showing 28 changed files with 450 additions and 394 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -1,3 +1,2 @@
 /target
-Cargo.lock
-/plots
+Cargo.lock
51 changes: 51 additions & 0 deletions benches/data.md
@@ -0,0 +1,51 @@
# Benchmarks

## Table of Contents

- [Benchmark Results](#benchmark-results)
    - [GPU](#gpu)
    - [lurkrs](#lurkrs)

## Benchmark Results

### GPU

| | `random` | `lurkrs` |
|:-------------------------------------|:---------------------------|:----------------------------------|
| **`witness 1, 7941351 scalars`** | `119.79 ms` (✅ **1.00x**) | `552.09 ms` (❌ *4.60x slower*) |
| **`T 2, 9699051 scalars`** | `144.74 ms` (✅ **1.00x**) | `143.87 ms` (✅ **1.00x faster**) |
| **`witness 5, 7941351 scalars`** | `118.40 ms` (✅ **1.00x**) | `560.24 ms` (❌ *4.74x slower*) |
| **`T 6, 9699051 scalars`** | `143.99 ms` (✅ **1.00x**) | `323.89 ms` (❌ *2.25x slower*) |
| **`witness 9, 7941351 scalars`** | `120.42 ms` (✅ **1.00x**) | `561.95 ms` (❌ *4.67x slower*) |
| **`T 10, 9699051 scalars`** | `143.02 ms` (✅ **1.00x**) | `350.17 ms` (❌ *2.44x slower*) |
| **`witness 13, 7941351 scalars`** | `111.75 ms` (✅ **1.00x**) | `560.65 ms` (❌ *5.04x slower*) |
| **`T 14, 9699051 scalars`** | `119.85 ms` (✅ **1.00x**) | `468.23 ms` (❌ *3.93x slower*) |
| **`witness 17, 7941351 scalars`** | `119.13 ms` (✅ **1.00x**) | `560.10 ms` (❌ *4.66x slower*) |
| **`T 18, 9699051 scalars`** | `143.52 ms` (✅ **1.00x**) | `564.23 ms` (❌ *3.94x slower*) |
| **`witness 21, 7941351 scalars`** | `121.13 ms` (✅ **1.00x**) | `558.80 ms` (❌ *4.65x slower*) |
| **`T 22, 9699051 scalars`** | `145.63 ms` (✅ **1.00x**) | `614.23 ms` (❌ *4.23x slower*) |
| **`witness 25, 7941351 scalars`** | `118.84 ms` (✅ **1.00x**) | `557.14 ms` (❌ *4.72x slower*) |
| **`T 26, 9699051 scalars`** | `141.11 ms` (✅ **1.00x**) | `679.76 ms` (❌ *4.81x slower*) |
| **`witness 29, 7941351 scalars`** | `119.40 ms` (✅ **1.00x**) | `557.92 ms` (❌ *4.68x slower*) |
| **`T 30, 9699051 scalars`** | `142.80 ms` (✅ **1.00x**) | `702.32 ms` (❌ *4.94x slower*) |

### lurkrs

| | `lurkrs/fibonacci` | `pasta-msm/benches/lurkrs` |
|:-------------------------------------|:---------------------------|:----------------------------------|
| **`witness 1, 7941351 scalars`** | `559.07 ms` (✅ **1.00x**) | `552.09 ms` (✅ **1.01x faster**) |
| **`T 2, 9699051 scalars`** | `135.27 ms` (✅ **1.00x**) | `143.87 ms` (✅ **1.05x slower**) |
| **`witness 5, 7941351 scalars`** | `547.40 ms` (✅ **1.00x**) | `560.24 ms` (✅ **1.02x slower**) |
| **`T 6, 9699051 scalars`** | `316.52 ms` (✅ **1.00x**) | `323.89 ms` (✅ **1.02x slower**) |
| **`witness 9, 7941351 scalars`** | `544.42 ms` (✅ **1.00x**) | `561.95 ms` (✅ **1.03x slower**) |
| **`T 10, 9699051 scalars`** | `344.99 ms` (✅ **1.00x**) | `350.17 ms` (✅ **1.01x slower**) |
| **`witness 13, 7941351 scalars`** | `552.01 ms` (✅ **1.00x**) | `560.65 ms` (✅ **1.01x slower**) |
| **`T 14, 9699051 scalars`** | `457.50 ms` (✅ **1.00x**) | `468.23 ms` (✅ **1.02x slower**) |
| **`witness 17, 7941351 scalars`** | `545.15 ms` (✅ **1.00x**) | `560.10 ms` (✅ **1.02x slower**) |
| **`T 18, 9699051 scalars`** | `556.29 ms` (✅ **1.00x**) | `564.23 ms` (✅ **1.01x slower**) |
| **`witness 21, 7941351 scalars`** | `548.88 ms` (✅ **1.00x**) | `558.80 ms` (✅ **1.02x slower**) |
| **`T 22, 9699051 scalars`** | `607.14 ms` (✅ **1.00x**) | `614.23 ms` (✅ **1.01x slower**) |
| **`witness 25, 7941351 scalars`** | `550.40 ms` (✅ **1.00x**) | `557.14 ms` (✅ **1.01x slower**) |
| **`T 26, 9699051 scalars`** | `680.36 ms` (✅ **1.00x**) | `679.76 ms` (✅ **1.00x faster**) |
| **`witness 29, 7941351 scalars`** | `549.54 ms` (✅ **1.00x**) | `557.92 ms` (✅ **1.01x slower**) |
| **`T 30, 9699051 scalars`** | `692.39 ms` (✅ **1.00x**) | `702.32 ms` (✅ **1.02x slower**) |
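
The slowdown figures in these tables are ratios of the two timings in each row. As a rough sketch, the label for a row can be recomputed from the displayed milliseconds as below. The `compare` and `label` helpers are hypothetical names for illustration, not part of criterion-table or this repository, and the published ratios may differ in the last digit because they are presumably derived from unrounded measurements.

```rust
/// Compares a candidate timing against a baseline timing (both in milliseconds).
/// Returns the speed ratio (always >= 1.0) and whether the candidate is slower.
fn compare(baseline_ms: f64, candidate_ms: f64) -> (f64, bool) {
    if candidate_ms >= baseline_ms {
        (candidate_ms / baseline_ms, true)
    } else {
        (baseline_ms / candidate_ms, false)
    }
}

/// Formats the comparison in the style used by the tables, e.g. "2.25x slower".
fn label(baseline_ms: f64, candidate_ms: f64) -> String {
    let (ratio, slower) = compare(baseline_ms, candidate_ms);
    format!("{:.2}x {}", ratio, if slower { "slower" } else { "faster" })
}
```

For the `T 6` row of the GPU table, `label(143.99, 323.89)` yields `2.25x slower`, which agrees with the published entry.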
269 changes: 139 additions & 130 deletions benches/msm.rs
@@ -2,160 +2,169 @@
 // Licensed under the Apache License, Version 2.0, see LICENSE for details.
 // SPDX-License-Identifier: Apache-2.0

-use criterion::{criterion_group, criterion_main, Criterion};
-use pasta_msm::utils::{gen_points, gen_scalars};
+#![allow(dead_code)]
+#![allow(unused_imports)]
+#![allow(unused_mut)]
+
+use std::io::Read;
+
+use abomonation::Abomonation;
+use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId};
+
+use pasta_curves::{group::ff::{PrimeField, Field}, pallas};
+use pasta_msm::{self, utils::CommitmentKey};
+use rand::thread_rng;

 #[cfg(feature = "cuda")]
-use pasta_msm::cuda_available;
-use pasta_curves::{pallas, group::ff::Field};
+extern "C" {
+    fn cuda_available() -> bool;
+}

-fn criterion_benchmark(c: &mut Criterion) {
-    let bench_npow: usize = std::env::var("BENCH_NPOW")
-        .unwrap_or("18".to_string())
-        .parse()
-        .unwrap();
-    let npoints: usize = 1 << bench_npow;
+fn read_abomonated<T: Abomonation + Clone>(name: String) -> std::io::Result<T> {
+    use std::fs::OpenOptions;
+    use std::io::BufReader;

-    // println!("generating {} random points, just hang on...", npoints);
-    let mut points = gen_points(npoints);
-    let mut scalars = gen_scalars(npoints);
+    let arecibo = home::home_dir().unwrap().join(".arecibo_witness");

-    #[cfg(feature = "cuda")]
-    {
-        unsafe { pasta_msm::CUDA_OFF = true };
-    }
+    let data = OpenOptions::new()
+        .read(true)
+        .write(true)
+        .create(true)
+        .open(arecibo.join(name))?;
+    let mut reader = BufReader::new(data);
+    let mut bytes = vec![];
+    reader.read_to_end(&mut bytes)?;

-    let mut group = c.benchmark_group("CPU");
-    group.sample_size(10);
+    let (data, _) = unsafe { abomonation::decode::<T>(&mut bytes).unwrap() };

-    group.bench_function(format!("2**{} points", bench_npow), |b| {
-        b.iter(|| {
-            let _ = pasta_msm::pallas(&points, &scalars, npoints);
-        })
-    });
+    Ok(data.clone())
+}
+
+fn criterion_benchmark(c: &mut Criterion) {
+    let npoints: usize = 10_000_000;
+    let scalars = pasta_msm::utils::gen_scalars(npoints);
+    // let nonuniform_scalars = pasta_msm::utils::generate_nonuniform_scalars(npoints);
+    let points = pasta_msm::utils::gen_points(npoints);

-    group.finish();
+    let mut rng = thread_rng();

     #[cfg(feature = "cuda")]
     if unsafe { cuda_available() } {
         unsafe { pasta_msm::CUDA_OFF = false };

-        const EXTRA: usize = 5;
-        let bench_npow = bench_npow + EXTRA;
-        let npoints: usize = 1 << bench_npow;
-
-        while points.len() < npoints {
-            points.append(&mut points.clone());
-        }
-        scalars.append(&mut gen_scalars(npoints - scalars.len()));
-
         let mut group = c.benchmark_group("GPU");
         group.sample_size(20);

-        let context = pasta_msm::pallas_init(&points, npoints);
+        for i in 0..32 {
+            let witness_i = read_abomonated::<
+                Vec<<pallas::Scalar as PrimeField>::Repr>,
+            >(i.to_string())
+            .unwrap();
+            let mut witness_i = unsafe {
+                std::mem::transmute::<Vec<_>, Vec<pallas::Scalar>>(witness_i)
+            };

-        for nnz in [1.0, 0.75, 0.5, 0.25, 0.10, 0.01] {
-            let zeros = ((1.0 - nnz) * npoints as f64) as usize;
+            let witness_n = witness_i.len();

-            for i in 0..zeros {
-                scalars[i] = pallas::Scalar::ZERO;
+            if witness_n < 1_000_000 {
+                continue;
             }

-            group.bench_function(format!("2**{} points {}", bench_npow, nnz), |b| {
-                b.iter(|| {
-                    let _ = pasta_msm::pallas_with(&context, npoints, npoints - zeros, &scalars);
-                })
-            });
-
+            let name = format!("#{i}, {} scalars", witness_n);
+
+            // group.bench_function(BenchmarkId::new("random", &name), |b| {
+            //     b.iter(|| {
+            //         let _ = pasta_msm::pallas(
+            //             &points[..witness_n],
+            //             &scalars[..witness_n],
+            //             witness_n,
+            //         );
+            //     })
+            // });
+
+            group.bench_function(
+                BenchmarkId::new("lurkrs", &name),
+                |b| {
+                    b.iter(|| {
+                        let _ = pasta_msm::pallas(
+                            &points[..witness_n],
+                            &witness_i,
+                            witness_n,
+                        );
+                    })
+                },
+            );
+
+            for w_i in witness_i.iter_mut() {
+                if w_i.is_zero_vartime() {
+                    *w_i = pallas::Scalar::random(&mut rng);
+                }
+            }
+            group.bench_function(
+                BenchmarkId::new("lurkrs no zeros", &name),
+                |b| {
+                    b.iter(|| {
+                        let _ = pasta_msm::pallas(
+                            &points[..witness_n],
+                            &witness_i,
+                            witness_n,
+                        );
+                    })
+                },
+            );
         }

-        group.finish();
+        // group.bench_function(format!("biased {} scalars", npoints), |b| {
+        //     b.iter(|| {
+        //         let _ =
+        //             pasta_msm::pallas(&points, &nonuniform_scalars, npoints);
+        //     })
+        // });
+
+        // for npoints in [7941351, 9699051] {
+        //     group.bench_function(format!("random {} scalars", npoints), |b| {
+        //         b.iter(|| {
+        //             let _ = pasta_msm::pallas(&points, &scalars, npoints);
+        //         })
+        //     });
+        // }
+
+        // let context = pasta_msm::pallas_init(&points, npoints);
+
+        // group.bench_function(format!("preallocated lurkrs {} scalars", npoints), |b| {
+        //     b.iter(|| {
+        //         let _ = pasta_msm::pallas_with(&context, npoints, npoints, &witness_primary);
+        //     })
+        // });
+
+        // group.bench_function(
+        //     format!("preallocated biased {} scalars", npoints),
+        //     |b| {
+        //         b.iter(|| {
+        //             let _ = pasta_msm::pallas_with(
+        //                 &context,
+        //                 npoints,
+        //                 npoints,
+        //                 &nonuniform_scalars,
+        //             );
+        //         })
+        //     },
+        // );
+
+        // group.bench_function(
+        //     format!("preallocated random {} scalars", npoints),
+        //     |b| {
+        //         b.iter(|| {
+        //             let _ = pasta_msm::pallas_with(
+        //                 &context, npoints, npoints, &scalars,
+        //             );
+        //         })
+        //     },
+        // );
+
+        // group.finish();
     }
 }

 criterion_group!(benches, criterion_benchmark);
 criterion_main!(benches);
-
-
-
-// // Copyright Supranational LLC
-// // Licensed under the Apache License, Version 2.0, see LICENSE for details.
-// // SPDX-License-Identifier: Apache-2.0
-
-// #![allow(dead_code)]
-// #![allow(unused_imports)]
-// #![allow(unused_mut)]
-
-// use std::io::Read;
-
-// use abomonation::Abomonation;
-// use criterion::{criterion_group, criterion_main, Criterion};
-
-// use pasta_curves::{pallas, group::ff::PrimeField};
-// use pasta_msm::{self, utils::CommitmentKey};
-
-// #[cfg(feature = "cuda")]
-// extern "C" {
-//     fn cuda_available() -> bool;
-// }
-
-// include!("../src/tests.rs");
-
-// fn read_abomonated<T: Abomonation + Clone>(name: String) -> std::io::Result<T> {
-//     use std::fs::OpenOptions;
-//     use std::io::BufReader;
-
-//     let arecibo = home::home_dir().unwrap().join(".arecibo");
-
-//     let data = OpenOptions::new()
-//         .read(true)
-//         .write(true)
-//         .create(true)
-//         .open(arecibo.join(name))?;
-//     let mut reader = BufReader::new(data);
-//     let mut bytes = vec![];
-//     reader.read_to_end(&mut bytes)?;
-
-//     let (data, _) = unsafe { abomonation::decode::<T>(&mut bytes).unwrap() };
-
-//     Ok(data.clone())
-// }
-
-// fn criterion_benchmark(c: &mut Criterion) {
-//     let witness_primary = read_abomonated::<
-//         Vec<<pallas::Scalar as PrimeField>::Repr>,
-//     >("witness_primary".into())
-//     .unwrap();
-//     let witness_primary = unsafe {
-//         std::mem::transmute::<Vec<_>, Vec<pallas::Scalar>>(witness_primary)
-//     };
-//     let npoints: usize = witness_primary.len();
-
-//     let scalars = crate::tests::gen_scalars(npoints);
-//     let points = crate::tests::gen_points(npoints);
-
-//     #[cfg(feature = "cuda")]
-//     if unsafe { cuda_available() } {
-//         unsafe { pasta_msm::CUDA_OFF = false };
-
-//         let mut group = c.benchmark_group("GPU");
-//         group.sample_size(20);
-
-//         group.bench_function(format!("lurkrs {} scalars", npoints), |b| {
-//             b.iter(|| {
-//                 let _ = pasta_msm::pallas(&points, &witness_primary);
-//             })
-//         });
-
-//         group.bench_function(format!("random {} scalars", npoints), |b| {
-//             b.iter(|| {
-//                 let _ = pasta_msm::pallas(&points, &scalars);
-//             })
-//         });
-
-//         group.finish();
-//     }
-// }
-
-// criterion_group!(benches, criterion_benchmark);
-// criterion_main!(benches);
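
Review note on `read_abomonated`: it opens each witness file with `read(true).write(true).create(true)`, so a missing file under `~/.arecibo_witness` would be created empty and the failure would presumably only surface later, at the `unwrap()` on the decode result. A minimal sketch of a read-only alternative follows, using only the standard library. The `read_bytes` helper is a hypothetical name, and the abomonation decoding step is deliberately left out.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Reads an entire file into memory without creating or modifying it.
/// A missing file is reported as `io::ErrorKind::NotFound` rather than
/// silently producing an empty buffer.
fn read_bytes(path: &Path) -> io::Result<Vec<u8>> {
    let mut bytes = Vec::new();
    // `File::open` is read-only and never creates the file.
    File::open(path)?.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

With this shape the caller can propagate the I/O error with `?` and skip witness indices that were never dumped, instead of panicking inside the benchmark loop.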
33 changes: 33 additions & 0 deletions benchmark_cmp_zeros.md
@@ -0,0 +1,33 @@
# Benchmarks

## Table of Contents

- [Benchmark Results](#benchmark-results)
    - [GPU](#gpu)

## Benchmark Results

### GPU

| | `lurkrs` | `lurkrs no zeros` |
|:---------------------------|:--------------------------|:--------------------------------- |
| **`#1, 7941351 scalars`** | `550.19 ms` (✅ **1.00x**) | `566.97 ms` (✅ **1.03x slower**) |
| **`#2, 9699051 scalars`** | `148.77 ms` (✅ **1.00x**) | `176.85 ms` (❌ *1.19x slower*) |
| **`#5, 7941351 scalars`** | `558.14 ms` (✅ **1.00x**) | `572.18 ms` (✅ **1.03x slower**) |
| **`#6, 9699051 scalars`** | `330.08 ms` (✅ **1.00x**) | `363.44 ms` (✅ **1.10x slower**) |
| **`#9, 7941351 scalars`** | `559.80 ms` (✅ **1.00x**) | `571.99 ms` (✅ **1.02x slower**) |
| **`#10, 9699051 scalars`** | `350.76 ms` (✅ **1.00x**) | `394.22 ms` (❌ *1.12x slower*) |
| **`#13, 7941351 scalars`** | `557.43 ms` (✅ **1.00x**) | `569.30 ms` (✅ **1.02x slower**) |
| **`#14, 9699051 scalars`** | `465.93 ms` (✅ **1.00x**) | `501.17 ms` (✅ **1.08x slower**) |
| **`#17, 7941351 scalars`** | `556.75 ms` (✅ **1.00x**) | `569.25 ms` (✅ **1.02x slower**) |
| **`#18, 9699051 scalars`** | `562.11 ms` (✅ **1.00x**) | `596.44 ms` (✅ **1.06x slower**) |
| **`#21, 7941351 scalars`** | `557.79 ms` (✅ **1.00x**) | `572.85 ms` (✅ **1.03x slower**) |
| **`#22, 9699051 scalars`** | `615.48 ms` (✅ **1.00x**) | `645.91 ms` (✅ **1.05x slower**) |
| **`#25, 7941351 scalars`** | `558.32 ms` (✅ **1.00x**) | `569.35 ms` (✅ **1.02x slower**) |
| **`#26, 9699051 scalars`** | `681.24 ms` (✅ **1.00x**) | `708.16 ms` (✅ **1.04x slower**) |
| **`#29, 7941351 scalars`** | `556.30 ms` (✅ **1.00x**) | `570.89 ms` (✅ **1.03x slower**) |
| **`#30, 9699051 scalars`** | `704.70 ms` (✅ **1.00x**) | `731.01 ms` (✅ **1.04x slower**) |

---
Made with [criterion-table](https://github.com/nu11ptr/criterion-table)
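
The `lurkrs no zeros` column is produced by the benchmark in `benches/msm.rs`, which overwrites every zero scalar in a witness with a random one before timing the MSM again. The step can be sketched without the curve libraries as below. Here `u64` stands in for `pallas::Scalar`, and both `replace_zeros` and the `xorshift64` generator are hypothetical helpers chosen only to keep the example dependency-free; the benchmark itself uses `is_zero_vartime()` and `pallas::Scalar::random`.

```rust
/// Replaces every zero entry with a value drawn from `next_value`
/// and returns how many entries were replaced.
fn replace_zeros(scalars: &mut [u64], mut next_value: impl FnMut() -> u64) -> usize {
    let mut replaced = 0;
    for s in scalars.iter_mut() {
        if *s == 0 {
            *s = next_value();
            replaced += 1;
        }
    }
    replaced
}

/// Small xorshift64 generator; with a nonzero seed it never returns zero.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}
```

Returning the replacement count also gives the number of zeros in each witness, which may help when interpreting why the zero-free runs in the table are slightly slower.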
