Add micro benchmarks. #537

Merged 12 commits on Oct 13, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Added point-in-time APIs (create_pit, delete_pit, delete_all_pits, get_all_pits) and Security Client APIs (health and update_audit_configuration) ([#502](https://github.com/opensearch-project/opensearch-py/pull/502))
- Added new guide for using index templates with the client ([#531](https://github.com/opensearch-project/opensearch-py/pull/531))
- Added `pool_maxsize` for `Urllib3HttpConnection` ([#535](https://github.com/opensearch-project/opensearch-py/pull/535))
- Added benchmarks ([#537](https://github.com/opensearch-project/opensearch-py/pull/537))
### Changed
- Generate `tasks` client from API specs ([#508](https://github.com/opensearch-project/opensearch-py/pull/508))
- Generate `ingest` client from API specs ([#513](https://github.com/opensearch-project/opensearch-py/pull/513))
2 changes: 1 addition & 1 deletion README.md
@@ -26,7 +26,7 @@ For more information, see [opensearch.org](https://opensearch.org/) and the [API

## User Guide

To get started with the OpenSearch Python Client, see [User Guide](https://github.com/opensearch-project/opensearch-py/blob/main/USER_GUIDE.md).
To get started with the OpenSearch Python Client, see [User Guide](https://github.com/opensearch-project/opensearch-py/blob/main/USER_GUIDE.md). This repository also contains [working samples](https://github.com/opensearch-project/opensearch-py/tree/main/samples) and [benchmarks](https://github.com/opensearch-project/opensearch-py/tree/main/benchmarks).

## Compatibility with OpenSearch

63 changes: 63 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,63 @@
- [Benchmarks](#benchmarks)
  - [Start OpenSearch](#start-opensearch)
  - [Install Prerequisites](#install-prerequisites)
  - [Run Benchmarks](#run-benchmarks)

## Benchmarks

Python client benchmarks using [richbench](https://github.com/tonybaloney/rich-bench).

### Start OpenSearch

```
docker run -p 9200:9200 -e "discovery.type=single-node" opensearchproject/opensearch:latest
```

### Install Prerequisites

Install [poetry](https://python-poetry.org/docs/), then install package dependencies.

```
poetry install
```

Benchmarks use the code in this repository by specifying the dependency as `opensearch-py = { path = "..", develop=true, extras=["async"] }` in [pyproject.toml](pyproject.toml).

### Run Benchmarks

Run all available benchmarks as follows.

```
poetry run richbench . --repeat 1 --times 1
```

This prints one result row per benchmark.

```
Benchmarks, repeat=1, number=1
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Benchmark                         ┃ Min     ┃ Max     ┃ Mean    ┃ Min (+)         ┃ Max (+)         ┃ Mean (+)        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ 1 client vs. more clients (async) │ 1.640   │ 1.640   │ 1.640   │ 1.102 (1.5x)    │ 1.102 (1.5x)    │ 1.102 (1.5x)    │
│ 1 thread vs. 32 threads (sync)    │ 5.526   │ 5.526   │ 5.526   │ 1.626 (3.4x)    │ 1.626 (3.4x)    │ 1.626 (3.4x)    │
│ 1 thread vs. 32 threads (sync)    │ 4.639   │ 4.639   │ 4.639   │ 3.363 (1.4x)    │ 3.363 (1.4x)    │ 3.363 (1.4x)    │
│ sync vs. async (8)                │ 3.198   │ 3.198   │ 3.198   │ 0.966 (3.3x)    │ 0.966 (3.3x)    │ 0.966 (3.3x)    │
└───────────────────────────────────┴─────────┴─────────┴─────────┴─────────────────┴─────────────────┴─────────────────┘
```
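The parenthesized factor in the `(+)` columns appears to be the baseline time divided by the comparison time. A minimal sketch that recomputes the factors from the table above under that assumption (the `speedup` helper is hypothetical and not part of richbench):

```python
def speedup(baseline: float, contender: float) -> str:
    """Format baseline / contender as a factor string such as '1.5x'."""
    return f"{baseline / contender:.1f}x"


# (baseline mean, comparison mean) copied from the table above.
rows = [
    ("1 client vs. more clients (async)", 1.640, 1.102),
    ("1 thread vs. 32 threads (sync)", 5.526, 1.626),
    ("1 thread vs. 32 threads (sync)", 4.639, 3.363),
    ("sync vs. async (8)", 3.198, 0.966),
]

for name, base, other in rows:
    print(f"{name}: {speedup(base, other)}")
```

With `--repeat 1 --times 1` each row is a single measurement, so Min, Max, and Mean coincide and the factors are only a rough indication.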

Collaborator

Could you please put the commands within a single code block for easier copying? The output can be placed in a separate block or outside of the code block. Thank you!

Member Author

Done.

Run a specific benchmark, e.g. [bench_sync.py](bench_sync.py), by specifying `--benchmark [name]`.

```
poetry run richbench . --repeat 1 --times 1 --benchmark sync
```

This prints the results of that benchmark only.

```
Benchmarks, repeat=1, number=1
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Benchmark                      ┃ Min     ┃ Max     ┃ Mean    ┃ Min (+)         ┃ Max (+)         ┃ Mean (+)        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ 1 thread vs. 32 threads (sync) │ 6.804   │ 6.804   │ 6.804   │ 3.409 (2.0x)    │ 3.409 (2.0x)    │ 3.409 (2.0x)    │
└────────────────────────────────┴─────────┴─────────┴─────────┴─────────────────┴─────────────────┴─────────────────┘
```
101 changes: 101 additions & 0 deletions benchmarks/bench_async.py
@@ -0,0 +1,101 @@
#!/usr/bin/env python

# SPDX-License-Identifier: Apache-2.0
#
# The OpenSearch Contributors require contributions made to
# this file be licensed under the Apache-2.0 license or a
# compatible open source license.

import asyncio
import uuid

from opensearchpy import AsyncHttpConnection, AsyncOpenSearch

host = "localhost"
port = 9200
auth = ("admin", "admin")
index_name = "test-index-async"
item_count = 100


async def index_records(client, item_count):
    # Issue all index requests for this client concurrently.
    await asyncio.gather(
        *[
            client.index(
                index=index_name,
                body={
                    "title": "Moneyball",
                    "director": "Bennett Miller",
                    "year": "2011",
                },
                id=uuid.uuid4(),
            )
            for _ in range(item_count)
        ]
    )


async def test_async(client_count=1, item_count=1):
    clients = []
    for i in range(client_count):
        clients.append(
            AsyncOpenSearch(
                hosts=[{"host": host, "port": port}],
                http_auth=auth,
                use_ssl=True,
                verify_certs=False,
                ssl_show_warn=False,
                connection_class=AsyncHttpConnection,
                pool_maxsize=client_count,
            )
        )

    if await clients[0].indices.exists(index_name):
        await clients[0].indices.delete(index_name)

    await clients[0].indices.create(index_name)

    await asyncio.gather(
        *[index_records(clients[i], item_count) for i in range(client_count)]
    )

    await clients[0].indices.refresh(index=index_name)
    print(await clients[0].count(index=index_name))

    await clients[0].indices.delete(index_name)

    await asyncio.gather(*[client.close() for client in clients])


def test(client_count=1, item_count=1):
    # Parameter order matches test_async and the positional callers below.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(test_async(client_count, item_count))
    loop.close()


def test_1():
    test(1, 32 * item_count)


def test_2():
    test(2, 16 * item_count)


def test_4():
    test(4, 8 * item_count)


def test_8():
    test(8, 4 * item_count)


def test_16():
    test(16, 2 * item_count)


def test_32():
    test(32, item_count)


__benchmarks__ = [(test_1, test_8, "1 client vs. more clients (async)")]
93 changes: 93 additions & 0 deletions benchmarks/bench_info_sync.py
@@ -0,0 +1,93 @@
#!/usr/bin/env python

# SPDX-License-Identifier: Apache-2.0
#
# The OpenSearch Contributors require contributions made to
# this file be licensed under the Apache-2.0 license or a
# compatible open source license.

import logging
import sys
import time

from thread_with_return_value import ThreadWithReturnValue

from opensearchpy import OpenSearch

host = "localhost"
port = 9200
auth = ("admin", "admin")
request_count = 250


root = logging.getLogger()
# root.setLevel(logging.DEBUG)
# logging.getLogger("urllib3.connectionpool").setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
root.addHandler(handler)


def get_info(client, request_count):
    # Returns the total time spent in client.info() calls, in milliseconds.
    tt = 0
    for n in range(request_count):
        start = time.time() * 1000
        rc = client.info()
        total_time = time.time() * 1000 - start
        tt += total_time
    return tt


def test(thread_count=1, request_count=1, client_count=1):
    clients = []
    for i in range(client_count):
        clients.append(
            OpenSearch(
                hosts=[{"host": host, "port": port}],
                http_auth=auth,
                use_ssl=True,
                verify_certs=False,
                ssl_show_warn=False,
                pool_maxsize=thread_count,
            )
        )

    threads = []
    for thread_id in range(thread_count):
        thread = ThreadWithReturnValue(
            target=get_info, args=[clients[thread_id % len(clients)], request_count]
        )
        threads.append(thread)
        thread.start()

    latency = 0
    for t in threads:
        latency += t.join()

    print(f"latency={latency}")


def test_1():
    test(1, 32 * request_count, 1)


def test_2():
    test(2, 16 * request_count, 2)


def test_4():
    # One client per thread, matching the other test_N functions.
    test(4, 8 * request_count, 4)


def test_8():
    test(8, 4 * request_count, 8)


def test_32():
    test(32, request_count, 32)


__benchmarks__ = [(test_1, test_32, "1 thread vs. 32 threads (sync)")]
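`ThreadWithReturnValue` is imported from `thread_with_return_value`, which is not shown in this part of the diff. A minimal sketch of what such a helper might look like, assuming only what the call sites above imply: it accepts `target` and `args`, and `join()` returns the target's return value. The attribute names are hypothetical and the actual module may differ:

```python
from threading import Thread


class ThreadWithReturnValue(Thread):
    """Thread whose join() returns the value produced by the target."""

    def __init__(self, target, args=(), kwargs=None):
        super().__init__()
        # Own attribute names, so Thread's internal state is left untouched.
        self._fn = target
        self._fn_args = args
        self._fn_kwargs = kwargs or {}
        self._result = None

    def run(self):
        self._result = self._fn(*self._fn_args, **self._fn_kwargs)

    def join(self, timeout=None):
        super().join(timeout)
        return self._result


# Usage: sum the return values of four threads, as test() does with latencies.
threads = [ThreadWithReturnValue(target=pow, args=[2, n]) for n in range(4)]
for t in threads:
    t.start()
total = sum(t.join() for t in threads)
print(total)  # prints 15 (1 + 2 + 4 + 8)
```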