Improve readme to better emphasize on considerations
ogxd committed Nov 5, 2024
1 parent 5171a82 commit 9eb19b0
Showing 1 changed file with 47 additions and 39 deletions.
86 changes: 47 additions & 39 deletions README.md
@@ -5,28 +5,7 @@

GxHash is a [**blazingly fast**](#performance) and [**robust**](#robustness) non-cryptographic hashing algorithm.

## Usage
```bash
cargo add gxhash
```
Used directly as a hash function:
```rust
let bytes: &[u8] = "hello world".as_bytes();
let seed = 1234;
println!(" 32-bit hash: {:x}", gxhash::gxhash32(&bytes, seed));
println!(" 64-bit hash: {:x}", gxhash::gxhash64(&bytes, seed));
println!("128-bit hash: {:x}", gxhash::gxhash128(&bytes, seed));
```

GxHash provides an implementation of the [`Hasher`](core::hash::Hasher) trait.
For convenience, this crate also provides the type aliases `gxhash::HashMap` and `gxhash::HashSet`.

```rust
use gxhash::{HashMap, HashMapExt};

let mut map: HashMap<&str, i32> = HashMap::new();
map.insert("answer", 42);
```
**[Features](#features) | [Considerations](#important-considerations) | [Usage](#usage) | [Benchmarks](#benchmarks) | [Contributing](#contributing)**

## Features

@@ -44,21 +23,61 @@ Check out the [paper](https://github.com/ogxd/gxhash-rust/blob/main/article/arti
### 0 Dependencies 📦
GxHash has no Cargo dependencies. The `Hasher` and `HashSet`/`HashMap` convenience types require the standard library, which is enabled by default through the `std` feature.

## Portability

> **Important**
> Because GxHash relies on `aes` hardware acceleration, you must make sure the `aes` feature is enabled when building (otherwise it won't build). This can be done by setting the `RUSTFLAGS` environment variable to `-C target-feature=+aes` or `-C target-cpu=native` (the latter should work if your CPU is properly recognized by rustc, which is the case most of the time).
## Important Considerations

### Architecture Compatibility
GxHash is compatible with:
### Hardware Acceleration
GxHash requires a few specific hardware acceleration features, which are supported on *most* modern processors, but not all of them.
- X86 processors with `AES-NI` & `SSE2` intrinsics
- ARM processors with `AES` & `NEON` intrinsics
> **Warning**
> Other platforms are currently not supported (there is no fallback). GxHash will not build on these platforms.
If you build gxhash without the required features enabled, the crate fails to build with an error message like the following, even if your target CPU supports those features:
```
Gxhash requires aes and sse2 intrinsics. Make sure the processor supports it and build with RUSTFLAGS="-C target-cpu=native" or RUSTFLAGS="-C target-feature=+aes,+sse2"
```

To fix this, simply follow the instructions in the error message. Setting `RUSTFLAGS` to `-C target-cpu=native` should work if your CPU is properly recognized by rustc, which is the case most of the time.
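
As a minimal sketch of what "built with the required features" means, the following standalone program reports whether the current build has them enabled at compile time. It uses only the standard `cfg!` macro; the function name `built_with_required_features` is hypothetical and is not part of the gxhash API, and the per-architecture feature pairs are taken from the list above.

```rust
/// Hypothetical helper (not part of gxhash): returns true when this binary
/// was compiled with AES plus the SIMD baseline listed in the README
/// (SSE2 on x86, NEON on ARM) enabled as target features.
fn built_with_required_features() -> bool {
    if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
        cfg!(all(target_feature = "aes", target_feature = "sse2"))
    } else if cfg!(any(target_arch = "arm", target_arch = "aarch64")) {
        cfg!(all(target_feature = "aes", target_feature = "neon"))
    } else {
        // Other architectures are not in the supported list.
        false
    }
}

fn main() {
    if built_with_required_features() {
        println!("required target features are enabled for this build");
    } else {
        println!("required target features are NOT enabled; try RUSTFLAGS=\"-C target-cpu=native\"");
    }
}
```

Note that `cfg!` reflects the flags the compiler was given, not what the CPU running the binary supports, which is why the build can fail on a machine whose processor does have the features.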

### Hashes Stability
All generated hashes for a given version of GxHash are stable, meaning that for a given input the output hash will be the same across all supported platforms.

### Consistency of Hashes When Using the `Hasher` Trait
The `Hasher` trait defines methods to hash specific types. This allows the implementation to skip some of the tricks needed when the input size is unknown. For this reason, hashing four `u32` values with a `Hasher` returns a different hash than calling `gxhash128` directly on the same four `u32` values represented as 16 `u8`. The rationale is that `Hasher` (mostly used for things like `HashMap` or `HashSet`) and `gxhash128` serve two different scenarios. Each is still independently stable.
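
To illustrate why the two paths can diverge, here is a small sketch using a hypothetical toy hasher built on the standard library only. `ToyHasher` is an assumption made for illustration and is NOT the GxHash algorithm: it only shows how a typed method such as `write_u32` can take a shortcut that the byte-slice path, which has to account for an unknown length, cannot.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical toy hasher for illustration only; this is NOT GxHash.
#[derive(Default)]
struct ToyHasher {
    state: u64,
}

impl ToyHasher {
    fn mix(&mut self, value: u64) {
        self.state = self.state.rotate_left(5) ^ value;
    }
}

impl Hasher for ToyHasher {
    /// Byte-slice path: the size is not known in advance, so every byte is
    /// mixed in and the length is folded in at the end.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.mix(b as u64);
        }
        self.mix(bytes.len() as u64);
    }

    /// Typed path: the size is known (4 bytes), so the whole word is mixed
    /// in a single step and no length handling is needed.
    fn write_u32(&mut self, i: u32) {
        self.mix(i as u64);
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// Hashes four `u32` values through the typed `Hasher` methods.
fn hash_typed(values: &[u32; 4]) -> u64 {
    let mut hasher = ToyHasher::default();
    for v in values {
        v.hash(&mut hasher); // dispatches to `write_u32`
    }
    hasher.finish()
}

/// Hashes the same four values as 16 raw little-endian bytes.
fn hash_bytes(values: &[u32; 4]) -> u64 {
    let mut bytes = Vec::with_capacity(16);
    for v in values {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    let mut hasher = ToyHasher::default();
    hasher.write(&bytes);
    hasher.finish()
}

fn main() {
    let values = [1u32, 2, 3, 4];
    println!("typed path: {:x}", hash_typed(&values));
    println!("bytes path: {:x}", hash_bytes(&values));
}
```

Both functions return the same value every time they are called with the same input, yet they disagree with each other, which mirrors the behaviour described above.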

### Unsafety
To achieve this level of performance, this crate contains unsafe code, including a [trick](https://ogxd.github.io/articles/unsafe-read-beyond-of-death/) that some people consider "undefined behavior". For this reason, this crate is not intended for safety-critical applications, but rather for applications that require extreme hashing performance and are less concerned about this aspect.

### Security
GxHash is seeded (with seed randomization) to improve DOS resistance and uses a wide (128-bit) internal state to improve multicollision resistance. However, these are only basic safeguards and do not make GxHash secure against all attacks.

Also, it is important to note that GxHash is not a cryptographic hash function and should not be used for cryptographic purposes.

## Usage
```bash
cargo add gxhash
```
Used directly as a hash function:
```rust
let bytes: &[u8] = "hello world".as_bytes();
let seed = 1234;
println!(" 32-bit hash: {:x}", gxhash::gxhash32(&bytes, seed));
println!(" 64-bit hash: {:x}", gxhash::gxhash64(&bytes, seed));
println!("128-bit hash: {:x}", gxhash::gxhash128(&bytes, seed));
```

GxHash provides an implementation of the [`Hasher`](core::hash::Hasher) trait.
For convenience, this crate also provides the type aliases `gxhash::HashMap` and `gxhash::HashSet`.

```rust
use gxhash::{HashMap, HashMapExt};

let mut map: HashMap<&str, i32> = HashMap::new();
map.insert("answer", 42);
```

## Flags

### `no_std`

The `std` feature flag enables the `HashMap`/`HashSet` container convenience type aliases. This is on by default. Disable to make the crate `no_std`:
@@ -101,17 +120,6 @@ Throughput is measured as the number of bytes hashed per second.
![x86_64](./benches/throughput/x86_64.svg)
![x86_64-hybrid](./benches/throughput/x86_64-hybrid.svg)

## Security

### DOS Resistance
GxHash is a seeded hashing algorithm, meaning that depending on the seed used, it will generate completely different hashes. The default `HasherBuilder` (`GxHasherBuilder::default()`) uses seed randomization, making any `HashMap`/`HashSet` more DOS resistant, as it will make it much more difficult for attackers to be able to predict which hashes may collide without knowing the seed used. This does not mean however that it is completely DOS-resistant (currently, it's probably not). This has to be analyzed further.

### Multicollisions Resistance
GxHash uses a 128-bit internal state. This makes GxHash [a widepipe construction](https://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_construction#Wide_pipe_construction) when generating hashes of 64 bits or fewer, which, among other properties, makes it inherently more resistant to multicollision attacks. See [this paper](https://www.iacr.org/archive/crypto2004/31520306/multicollisions.pdf) for more details.

### Cryptographic Properties
GxHash is a non-cryptographic hashing algorithm, so it should not be used as a cryptographic algorithm (it is not a replacement for SHA). Whether GxHash is preimage resistant, and how difficult it is to reverse, has not been assessed.

## Contributing

- Feel free to submit PRs