This directory contains the block level I/O performance benchmarks for the different devices, and their collected data. While not all of the benchmarks were used in the end as some failed to show the intended results, they all remain here. Numerous of the benchmarks were initially thought to be successful however, turned out to be wrong on non-logical setups. We hope the mistakes made here and lessons learned will help others, hence we left the invalid benchmarks and provide explanations on what is invalid below. We have the following benchmarks
This benchmark measures the random write performance with fio on an increasing number of active zones, under block size of 4KiB
and a queue depth of 16
. This benchmark was not used as it primarily measures the random write performance, which ZNS devices do not support, and hence it mainly measures fio's implementation of handling random writing to ZNS devices.
This benchmark measures the random read performance of the ZNS device for the mq-deadline
and none
schedulers. With mq-deadline
it increases the iodepth up to 14 with a single active zone (since mq-deadline can hold back I/Os for the device if iodepth is larger than 1), and none
uses a single iodepth with increasing number of concurrent threads (up to 14), each reading at a 3 zone offset. This benchmark was also not used, as it is measuring I/O performance of read, where there are no device constraints, hence even none scheduler could have deep I/O queues.
This benchmark repeats what was done in the prior fio_concur_read_rand benchmark only with sequential reading.
This benchmark again compares the mq-deadline
and none
schedulers under sequential reading. Here we increase the iodepth for both with a single zone, as there are no read ordering constraints no I/Os need to be held back. We use a block size of 4KiB
.
This benchmark again compared the mq-deadline
and none
scheduler with the same command, where both are writing with an iodepth of 1 and block size of 4KiB
, and the number of concurrent threads is increased up to 14. Each thread writes at a 3 zone offset, and writing is sequential.
This benchmark compares the mq-deadline
and none
schedulers under writing workload. Similarly, we increase the iodepth for mq-deadline on a single zone, as it can hold back write I/Os and now we have a sequential write constraint. And to enforce there is no on-device ordering (see section 3.4.1 of the NVMe 2.0 specification) the scheduler needs to hold I/Os back. The none scheduler does not do this, hence it can only have an iodepth of 1. Therefore, this benchmark increases the number of concurrent threads for none (up to 14, since we have 14 maximum active zones), with each an iodepth of 1.
This benchmark runs a number of read/write (random and sequential) benchmark with an increasing block size from 4KiB
up to 128KiB
in powers of 2) with an iodepth of 4. This shows the maximum achievable bandwidth of a device. This benchmark is run on the conventional and ZNS devices.
This benchmark similarly uses a block size of 4KiB
however increases the iodepth from 1
to 1024
in powers of 2. Based on this we can retrieve the maximum achievable IOPs for the devices (conventional and ZNS).