PHPLIB-1187: Run benchmark on Evergreen #1185

Merged 5 commits on Oct 30, 2023
13 changes: 13 additions & 0 deletions .evergreen/config/functions.yml
@@ -482,3 +482,16 @@ functions:
binary: bash
args:
- .evergreen/compile-extension.sh

# Run benchmarks. The filter skips the benchAmpWorkers subjects as they fail due to socket exceptions
"run benchmark":
- command: shell.exec
type: test
params:
working_dir: "src/benchmark"
script: |
${PREPARE_SHELL}
export PATH="${PHP_PATH}/bin:$PATH"

php ../composer.phar install --no-suggest
vendor/bin/phpbench run --report=env --report=evergreen --report=aggregate --output html --filter='bench(?!AmpWorkers)'
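For reference, phpbench's --filter value is a regular expression matched against benchmark subject names, so the negative lookahead above keeps every bench* subject except the benchAmpWorkers ones. A quick sanity check of that pattern in plain PHP (illustration only, not part of the PR):

<?php

$pattern = '/bench(?!AmpWorkers)/';

var_dump(preg_match($pattern, 'benchSequential')); // int(1): kept
var_dump(preg_match($pattern, 'benchFork'));       // int(1): kept
var_dump(preg_match($pattern, 'benchAmpWorkers')); // int(0): skipped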
1 change: 1 addition & 0 deletions .evergreen/config/php.ini
@@ -1 +1,2 @@
extension=mongodb.so
memory_limit=-1
Member:

You should increase the limit. Otherwise, we won't know how much memory was used if it's too much. What is the memory limit of the job runner?

Member Author:

For some reason it aborted when it hit a previous 128M limit, apparently ignoring the 1G limit defined in the runner config. We can also report memory usage from the benchmarks using perf.send, which would be a better indicator than failing the build when it hits some limit. Want me to add those numbers?

Member:

perf.send is too late if the process crashes, no?

Member Author:

Correct, perf.send would not be executed in that case. With -1 the memory would be unlimited, so the only case in which it would crash is if it used up all memory including the page file, which I'd consider highly unlikely.
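(Side note, hedged: phpbench can record peak memory per iteration, and a benchmark process can also surface the figure itself. A minimal illustration of the kind of number that could be fed to perf.send; this is not code from the PR:)

<?php

// Illustration only: report peak memory after a run, so a crash-free
// benchmark still yields a usable memory figure even with memory_limit=-1.
$peakBytes = memory_get_peak_usage(true); // "real" allocated memory
printf("peak memory: %.1f MiB\n", $peakBytes / (1024 ** 2));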

12 changes: 12 additions & 0 deletions .evergreen/config/test-tasks.yml
@@ -20,3 +20,15 @@ tasks:
commands:
- func: "bootstrap mongohoused"
- func: "run atlas data lake test"

- name: "run-benchmark"
exec_timeout_secs: 3600
commands:
- func: "bootstrap mongo-orchestration"
vars:
TOPOLOGY: "server"
MONGODB_VERSION: "v6.0-perf"
- func: "run benchmark"
- command: perf.send
params:
file: src/benchmark/.phpbench/results.json
15 changes: 15 additions & 0 deletions .evergreen/config/test-variants.yml
@@ -114,3 +114,18 @@ buildvariants:
tasks:
- "test_atlas_task_group"
- ".csfle"

# Run benchmarks
- name: benchmark-rhel90
tags: ["benchmark", "rhel", "x64"]
display_name: "Benchmark: RHEL 9.0, MongoDB 6.0"
run_on: rhel90-dbx-perf-large
expansions:
FETCH_BUILD_VARIANT: "build-rhel90"
FETCH_BUILD_TASK: "build-php-8.2"
PHP_VERSION: "8.2"
depends_on:
- variant: "build-rhel90"
name: "build-php-8.2"
tasks:
- "run-benchmark"
3 changes: 1 addition & 2 deletions benchmark/phpbench.json.dist
@@ -6,6 +6,5 @@
"runner.file_pattern": "*Bench.php",
"runner.path": "src",
"runner.php_config": { "memory_limit": "1G" },
"runner.iterations": 3,
"runner.revs": 10
Member Author:

Removed this in favour of increasing revs only for benchmarks that run in the microsecond range. For anything that takes milliseconds, the precision is usually good enough with a single rev.

"runner.iterations": 3
}
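With the global runner.revs gone, revs are pinned per class via phpbench's #[Revs] attribute on the microsecond-range BSON benchmarks (see the DocumentBench and PackedArrayBench hunks below). A minimal sketch of the pattern, with a made-up class name:

<?php

use PhpBench\Attributes\Revs;

// Each iteration executes every subject 10 times and reports the mean,
// smoothing out timer noise for microsecond-range operations.
#[Revs(10)]
final class FastOperationBench
{
    public function benchDecode(): void
    {
        // ... microsecond-range operation under test ...
    }
}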
2 changes: 2 additions & 0 deletions benchmark/src/BSON/DocumentBench.php
@@ -5,13 +5,15 @@
use MongoDB\Benchmark\Fixtures\Data;
use MongoDB\BSON\Document;
use PhpBench\Attributes\BeforeMethods;
+ use PhpBench\Attributes\Revs;
use PhpBench\Attributes\Warmup;
use stdClass;

use function file_get_contents;
use function iterator_to_array;

#[BeforeMethods('prepareData')]
+ #[Revs(10)]
#[Warmup(1)]
final class DocumentBench
{
2 changes: 2 additions & 0 deletions benchmark/src/BSON/PackedArrayBench.php
@@ -5,12 +5,14 @@
use MongoDB\Benchmark\Fixtures\Data;
use MongoDB\BSON\PackedArray;
use PhpBench\Attributes\BeforeMethods;
+ use PhpBench\Attributes\Revs;
use PhpBench\Attributes\Warmup;

use function array_values;
use function iterator_to_array;

#[BeforeMethods('prepareData')]
+ #[Revs(10)]
#[Warmup(1)]
final class PackedArrayBench
{
36 changes: 15 additions & 21 deletions benchmark/src/DriverBench/ParallelMultiFileExportBench.php
@@ -15,7 +15,6 @@
use PhpBench\Attributes\BeforeClassMethods;
use PhpBench\Attributes\Iterations;
use PhpBench\Attributes\ParamProviders;
- use PhpBench\Attributes\Revs;
use RuntimeException;

use function array_chunk;
@@ -44,7 +43,6 @@
#[AfterClassMethods('afterClass')]
#[AfterMethods('afterIteration')]
#[Iterations(1)]
- #[Revs(1)]
final class ParallelMultiFileExportBench
{
public static function beforeClass(): void
Expand Down Expand Up @@ -74,15 +72,15 @@ public function afterIteration(): void
* Using a single thread to export multiple files.
* By executing a single Find command for multiple files, we can reduce the number of roundtrips to the server.
*
- * @param array{chunk:int} $params
+ * @param array{chunkSize:int} $params
Member Author:

Decided to rename this since I got confused as to what chunk meant: I thought chunk: 1 meant a single chunk of files, while it actually meant 100 chunks of 1 file each (see the short array_chunk illustration after this hunk).

*/
#[ParamProviders(['provideChunkParams'])]
public function benchSequential(array $params): void
{
- foreach (array_chunk(self::getFileNames(), $params['chunk']) as $i => $files) {
+ foreach (array_chunk(self::getFileNames(), $params['chunkSize']) as $i => $files) {
self::exportFile($files, [], [
- 'limit' => 5_000 * $params['chunk'],
- 'skip' => 5_000 * $params['chunk'] * $i,
+ 'limit' => 5_000 * $params['chunkSize'],
+ 'skip' => 5_000 * $params['chunkSize'] * $i,
]);
}
}
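(To make the renamed parameter concrete: array_chunk splits the 100 file names into chunks of chunkSize elements, so chunkSize => 1 yields 100 chunks of one file each. A short illustration, not part of the diff:)

<?php

$files = range(1, 100);                   // stand-in for the 100 file names
var_dump(count(array_chunk($files, 1)));  // int(100): 100 chunks of 1 file
var_dump(count(array_chunk($files, 4)));  // int(25):  25 chunks of 4 files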
@@ -103,12 +101,12 @@ public function benchFork(array $params): void
Utils::reset();

// Create a child process for each chunk of files
- foreach (array_chunk(self::getFileNames(), $params['chunk']) as $i => $files) {
+ foreach (array_chunk(self::getFileNames(), $params['chunkSize']) as $i => $files) {
$pid = pcntl_fork();
if ($pid === 0) {
self::exportFile($files, [], [
- 'limit' => 5_000 * $params['chunk'],
- 'skip' => 5_000 * $params['chunk'] * $i,
+ 'limit' => 5_000 * $params['chunkSize'],
+ 'skip' => 5_000 * $params['chunkSize'] * $i,
]);

// Exit the child process
@@ -133,21 +131,21 @@ public function benchFork(array $params): void
/**
* Using amphp/parallel with worker pool
*
- * @param array{chunk:int} $params
+ * @param array{chunkSize:int} $params
*/
#[ParamProviders(['provideChunkParams'])]
public function benchAmpWorkers(array $params): void
{
- $workerPool = new ContextWorkerPool(ceil(100 / $params['chunk']), new ContextWorkerFactory());
+ $workerPool = new ContextWorkerPool(ceil(100 / $params['chunkSize']), new ContextWorkerFactory());

$futures = [];
- foreach (array_chunk(self::getFileNames(), $params['chunk']) as $i => $files) {
+ foreach (array_chunk(self::getFileNames(), $params['chunkSize']) as $i => $files) {
$futures[] = $workerPool->submit(
new ExportFileTask(
files: $files,
options: [
- 'limit' => 5_000 * $params['chunk'],
- 'skip' => 5_000 * $params['chunk'] * $i,
+ 'limit' => 5_000 * $params['chunkSize'],
+ 'skip' => 5_000 * $params['chunkSize'] * $i,
],
),
)->getFuture();
@@ -160,13 +158,9 @@ public function benchAmpWorkers(array $params): void

public static function provideChunkParams(): Generator
{
- yield 'by 1' => ['chunk' => 1];
- yield 'by 2' => ['chunk' => 2];
- yield 'by 4' => ['chunk' => 4];
- yield 'by 8' => ['chunk' => 8];
- yield 'by 13' => ['chunk' => 13];
- yield 'by 20' => ['chunk' => 20];
- yield 'by 100' => ['chunk' => 100];
+ yield '100 chunks' => ['chunkSize' => 1];
+ yield '25 chunks' => ['chunkSize' => 4];
+ yield '10 chunks' => ['chunkSize' => 10];
}

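The benchFork subjects here and in ParallelMultiFileImportBench share one pattern: fork a child per chunk, do the work in the child, and have the parent wait on every PID. A stripped-down sketch of that pattern; doWork() is a placeholder for the real export/import helpers, and this is not the PR's exact code:

<?php

// Minimal fork/join skeleton, as used by the benchFork subjects.
$chunks = array_chunk(range(1, 100), 10);

$pids = [];
foreach ($chunks as $chunk) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('Failed to fork');
    }

    if ($pid === 0) {
        doWork($chunk); // child: process its share of the files...
        exit(0);        // ...then exit so it never re-enters the parent loop
    }

    $pids[] = $pid;
}

// Parent: the iteration is only done once every child has exited.
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}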
36 changes: 8 additions & 28 deletions benchmark/src/DriverBench/ParallelMultiFileImportBench.php
@@ -16,7 +16,6 @@
use PhpBench\Attributes\BeforeMethods;
use PhpBench\Attributes\Iterations;
use PhpBench\Attributes\ParamProviders;
- use PhpBench\Attributes\Revs;
use RuntimeException;

use function array_chunk;
@@ -47,7 +46,6 @@
#[AfterClassMethods('afterClass')]
#[BeforeMethods('beforeIteration')]
#[Iterations(1)]
- #[Revs(1)]
final class ParallelMultiFileImportBench
{
public static function beforeClass(): void
@@ -73,20 +71,6 @@ public function beforeIteration(): void
$database->createCollection(Utils::getCollectionName());
}

- /**
-  * Using Driver's BulkWrite in a single thread.
-  * The number of files to import in each iteration is controlled by the "chunk" parameter.
-  *
-  * @param array{chunk:int} $params
-  */
- #[ParamProviders(['provideChunkParams'])]
- public function benchBulkWrite(array $params): void
Member Author:

This subject essentially tests the same thing as benchInsertMany, so I decided to skip it in order to shave ~6 minutes off the run time.

Member:

We could keep this one and remove the other, which is a little less efficient.

Member Author:

The reason I removed bulkWrite and kept insertMany is that benchmarking the latter also exposes performance regressions introduced in the library, while the other only tests the extension.

- {
-     foreach (array_chunk(self::getFileNames(), $params['chunk']) as $files) {
-         self::importFile($files);
-     }
- }
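(For readers following the thread, a hedged sketch of the two code paths being compared; this is not taken from the diff. BulkWrite exercises only the extension, while Collection::insertMany also goes through the library's option handling and BSON type mapping, so library-level regressions only show up in the latter:)

<?php

$manager = new MongoDB\Driver\Manager('mongodb://127.0.0.1');
$collection = (new MongoDB\Client('mongodb://127.0.0.1'))->selectCollection('test', 'coll');
$documents = [['x' => 1], ['x' => 2]];

// Extension-level path: MongoDB\Driver\BulkWrite talks to libmongoc directly.
$bulk = new MongoDB\Driver\BulkWrite();
foreach ($documents as $document) {
    $bulk->insert($document);
}
$manager->executeBulkWrite('test.coll', $bulk);

// Library-level path: Collection::insertMany wraps the same primitive, adding
// the library's option handling and type mapping on top.
$collection->insertMany($documents);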

/**
* Using library's Collection::insertMany in a single thread
*/
@@ -116,7 +100,7 @@ public function benchInsertMany(): void
* Using multiple forked threads. The number of threads is controlled by the "chunk" parameter,
* which is the number of files to import in each thread.
*
- * @param array{chunk:int} $params
+ * @param array{chunkSize:int} $params
*/
#[ParamProviders(['provideChunkParams'])]
public function benchFork(array $params): void
@@ -128,7 +112,7 @@ public function benchFork(array $params): void
// of a new libmongoc client.
Utils::reset();

- foreach (array_chunk(self::getFileNames(), $params['chunk']) as $files) {
+ foreach (array_chunk(self::getFileNames(), $params['chunkSize']) as $files) {
$pid = pcntl_fork();
if ($pid === 0) {
self::importFile($files);
@@ -155,16 +139,16 @@ public function benchFork(array $params): void
/**
* Using amphp/parallel with worker pool
*
- * @param array{processes:int} $params
+ * @param array{chunkSize:int} $params
*/
#[ParamProviders(['provideChunkParams'])]
public function benchAmpWorkers(array $params): void
{
- $workerPool = new ContextWorkerPool(ceil(100 / $params['chunk']), new ContextWorkerFactory());
+ $workerPool = new ContextWorkerPool(ceil(100 / $params['chunkSize']), new ContextWorkerFactory());

$futures = array_map(
fn ($files) => $workerPool->submit(new ImportFileTask($files))->getFuture(),
- array_chunk(self::getFileNames(), $params['chunk']),
+ array_chunk(self::getFileNames(), $params['chunkSize']),
);

foreach (Future::iterate($futures) as $future) {
@@ -176,13 +160,9 @@

public function provideChunkParams(): Generator
{
- yield 'by 1' => ['chunk' => 1];
- yield 'by 2' => ['chunk' => 2];
- yield 'by 4' => ['chunk' => 4];
- yield 'by 8' => ['chunk' => 8];
- yield 'by 13' => ['chunk' => 13];
- yield 'by 20' => ['chunk' => 20];
- yield 'by 100' => ['chunk' => 100];
+ yield '100 chunks' => ['chunkSize' => 1];
+ yield '25 chunks' => ['chunkSize' => 4];
+ yield '10 chunks' => ['chunkSize' => 10];
}

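The ImportFileTask/ExportFileTask classes submitted to the worker pool belong to the benchmark harness and are not part of this diff; roughly, such a task looks like the sketch below (assuming amphp/parallel v2's Task interface; the details are guessed):

<?php

use Amp\Cancellation;
use Amp\Parallel\Worker\Task;
use Amp\Sync\Channel;

// Hypothetical shape of the task the pool executes; the real class lives in
// the benchmark harness and is not shown in this PR.
final class ImportFileTask implements Task
{
    public function __construct(private readonly array $files)
    {
    }

    public function run(Channel $channel, Cancellation $cancellation): mixed
    {
        // Runs inside a worker process: each worker has its own MongoDB
        // client, so no connection state is shared between workers.
        foreach ($this->files as $file) {
            // ... import one file into the collection ...
        }

        return count($this->files);
    }
}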
3 changes: 0 additions & 3 deletions benchmark/src/DriverBench/SingleDocBench.php
@@ -9,7 +9,6 @@
use MongoDB\Driver\Command;
use PhpBench\Attributes\BeforeMethods;
use PhpBench\Attributes\ParamProviders;
- use PhpBench\Attributes\Revs;

use function array_map;
use function file_get_contents;
@@ -45,7 +44,6 @@
*/
#[BeforeMethods('beforeFindOneById')]
#[ParamProviders('provideFindOneByIdParams')]
- #[Revs(1)]
public function benchFindOneById(array $params): void
{
$collection = Utils::getCollection();
@@ -79,7 +77,6 @@
* @param array{document: object|array, repeat: int, options?: array} $params
*/
#[ParamProviders('provideInsertOneParams')]
- #[Revs(1)]
public function benchInsertOne(array $params): void
{
$collection = Utils::getCollection();