Commit

onnx service
stelios-ritual committed Oct 4, 2024
1 parent e1a1959 commit 3d38858
Showing 11 changed files with 145 additions and 1,110 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
- ##### The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- ##### This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.0.0] - XXXX-XX-XX

### Changed
- Simplified examples to the minimum core functionality necessary and removed all dependencies on `infernet-ml`.
- Updated images used for deploying the Infernet Node.

## [1.0.1] - 2024-07-31

### Fixed
9 changes: 6 additions & 3 deletions projects/gpt4/container/README.md
@@ -1,7 +1,9 @@
# GPT 4
In this example, we run a minimalist container that makes use of the OpenAI [completions API](https://platform.openai.com/docs/api-reference/chat) to serve text generation requests.

In this example, we will run a minimalist container that makes use of the OpenAI [completions API](https://platform.openai.com/docs/api-reference/chat) to serve text generation requests.

## Requirements

To use the model you'll need to have an OpenAI API key. Get one on [OpenAI](https://openai.com/)'s website.

## Run the Container
@@ -24,7 +26,8 @@ curl -X POST localhost:3000/service_output -H "Content-Type: application/json" \

## Next steps

This container is for demonstration purposes only, and is purposefully simplified for readability and ease of comprehension. For a production-ready version of this code, check out:
This container is for demonstration purposes only, and is purposefully simplified for
readability and ease of comprehension. For a production-ready version of this code, check out:

- The [CSS Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/css_inference_workflow/): A Python class that supports multiple API providers, including OpenAI, that can be used to build production-ready containers.
- The [CSS Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/css_inference_workflow/): A Python class that supports multiple API providers, including OpenAI, and can be used to build production-ready containers.
- The [CSS Inference Service](https://infernet-services.docs.ritual.net/reference/css_inference_service/): A production-ready, [Infernet](https://docs.ritual.net/infernet/node/introduction)-compatible container that works out-of-the-box with minimal configuration, and serves inference using the `CSS Inference Workflow`.
18 changes: 9 additions & 9 deletions projects/hello-world/hello-world.md
@@ -3,12 +3,12 @@
Welcome to the first tutorial of Infernet! In this tutorial we will guide you through the process of setting up and
running an Infernet Node, and then demonstrate how to create and monitor off-chain compute jobs and on-chain subscriptions.

To interact with infernet, one could either create a job by accessing an infernet node directly through it's API (we'll
To interact with infernet, one could either create a job by accessing an Infernet Node directly through its API (we'll
refer to this as an off-chain job), or by creating a subscription on-chain (we'll refer to this as an on-chain job).

## Requesting an off-chain job: Hello World!

This project is a simple [flask-app](container/src/app.py) that is compatible with `infernet`, and simply
This project is a simple [flask-app](container/src/app.py) that is compatible with Infernet, and simply
[echoes what you send to it](container/src/app.py#L16).

### Install Docker & Verify Installation
@@ -42,11 +42,11 @@ make build-container project=hello-world

Then, from the top-level project directory, run the following make command:

```
```bash
make deploy-container project=hello-world
```

This will deploy an infernet node along with the `hello-world` image.
This will deploy an Infernet Node along with the `hello-world` image.

### Creating an off-chain job through the API
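
The request itself is a plain HTTP call to the node's REST API. As a rough illustration only (the `/api/jobs` path and the payload keys are assumptions here, not part of this diff), such a request might look like the following sketch:

```python
# Illustrative sketch only: the endpoint path and payload keys are assumptions.
import requests

# The node's API is published on port 4000 (see the container listing below).
job = requests.post(
    "http://127.0.0.1:4000/api/jobs",
    json={"containers": ["hello-world"], "data": {"prompt": "hello"}},
    timeout=10,
)
print(job.json())  # typically returns an identifier for the queued job
```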

Expand Down Expand Up @@ -107,11 +107,11 @@ In another terminal, run `docker container ls`, you should see something like th

```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c2ca0ffe7817 ritualnetwork/infernet-anvil:0.0.0 "anvil --host 0.0.0.…" 9 seconds ago Up 8 seconds 0.0.0.0:8545->3000/tcp anvil-node
c2ca0ffe7817 ritualnetwork/infernet-anvil:1.0.0 "anvil --host 0.0.0.…" 9 seconds ago Up 8 seconds 0.0.0.0:8545->3000/tcp infernet-anvil
0b686a6a0e5f ritualnetwork/hello-world-infernet:0.0.2 "gunicorn app:create…" 9 seconds ago Up 8 seconds 0.0.0.0:3000->3000/tcp hello-world
28b2e5608655 ritualnetwork/infernet-node:0.1.1 "/app/entrypoint.sh" 10 seconds ago Up 10 seconds 0.0.0.0:4000->4000/tcp deploy-node-1
03ba51ff48b8 fluent/fluent-bit:latest "/fluent-bit/bin/flu…" 10 seconds ago Up 10 seconds 2020/tcp, 0.0.0.0:24224->24224/tcp deploy-fluentbit-1
a0d96f29a238 redis:latest "docker-entrypoint.s…" 10 seconds ago Up 10 seconds 0.0.0.0:6379->6379/tcp deploy-redis-1
28b2e5608655 ritualnetwork/infernet-node:1.3.1 "/app/entrypoint.sh" 10 seconds ago Up 10 seconds 0.0.0.0:4000->4000/tcp deploy-node-1
03ba51ff48b8 fluent/fluent-bit:latest "/fluent-bit/bin/flu…" 10 seconds ago Up 10 seconds 2020/tcp, 0.0.0.0:24224->24224/tcp infernet-fluentbit
a0d96f29a238 redis:latest "docker-entrypoint.s…" 10 seconds ago Up 10 seconds 0.0.0.0:6379->6379/tcp infernet-redis
```

You can see that the anvil node is running on port `8545`, and the infernet
@@ -125,7 +125,7 @@ All this contract does is to request a job from the infernet node, and upon rece
the result, it will use the `forge` console to print the result.

**Anvil Logs**: First, it's useful to look at the logs of the anvil node to see what's going on. In
a new terminal, run `docker logs -f anvil-node`.
a new terminal, run `docker logs -f infernet-anvil`.

**Deploying the contracts**: In another terminal, run the following command:

57 changes: 24 additions & 33 deletions projects/onnx-iris/container/README.md
@@ -1,23 +1,10 @@
# Iris Classification via ONNX Runtime
# Running an ONNX model

This example uses a pre-trained model to classify iris flowers. The code for the model
is located at
our [simple-ml-models](https://github.com/ritual-net/simple-ml-models/tree/main/iris_classification)
repository.
In this example, we will serve a pre-trained model to classify iris flowers via the ONNX runtime. The code for the model
is located at our [simple-ml-models](https://github.com/ritual-net/simple-ml-models/tree/main/iris_classification) repository.

## Overview

We're making use of
the [ONNXInferenceWorkflow](https://github.com/ritual-net/infernet-ml/blob/main/src/ml/workflows/inference/onnx_inference_workflow.py)
class to run the model. This is one of many workflows that we currently support in our
[infernet-ml](https://github.com/ritual-net/infernet-ml). Consult the library's
documentation for more info on workflows that
are supported.

## Building & Running the Container in Isolation

Note that this container is meant to be started by the infernet-node. For development &
Testing purposes, you can run the container in isolation using the following commands.
This container is meant to be started by the Infernet Node. For development and
testing purposes, you can run the container in isolation using the following commands.

### Building the Container

@@ -44,7 +31,7 @@ Run the following command to run an inference:
```bash
curl -X POST http://127.0.0.1:3000/service_output \
-H "Content-Type: application/json" \
-d '{"source":1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}}'
-d '{"source": 1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}}'
```
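
The same request can also be issued from Python; the sketch below simply mirrors the `curl` call above using the `requests` library:

```python
# Mirrors the curl command above: an off-chain (source=1) request carrying one
# pre-scaled iris sample as a 1x4 input matrix.
import requests

resp = requests.post(
    "http://127.0.0.1:3000/service_output",
    json={"source": 1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}},
    timeout=10,
)
print(resp.json())
```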

#### Note Regarding the Input
@@ -63,27 +50,23 @@ Putting this input into a vector and scaling it, we get the following scaled inp
[1.0380048, 0.5586108, 1.1037828, 1.712096]
```

Refer
to [this function in the model's repository](https://github.com/ritual-net/simple-ml-models/blob/03ebc6fb15d33efe20b7782505b1a65ce3975222/iris_classification/iris_inference_pytorch.py#L13)
for more information on how the input is scaled.
Refer to [this function in the model's repository](https://github.com/ritual-net/simple-ml-models/blob/03ebc6fb15d33efe20b7782505b1a65ce3975222/iris_classification/iris_inference_pytorch.py#L13) for more information on how the input
is scaled.

For more context on the Iris dataset, refer to
the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris).
For more context on the Iris dataset, refer to the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris).
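
In other words, each feature is standardized against statistics from the model's training data before inference. A minimal sketch of that step (the mean and standard deviation values below are placeholders, not the model's actual statistics):

```python
# Standard scaling sketch: z = (x - mean) / std, applied per feature.
# The mean/std values are placeholders; the real ones come from the training
# data used in the simple-ml-models repository.
import numpy as np

feature_mean = np.array([5.84, 3.06, 3.76, 1.20])  # placeholder values
feature_std = np.array([0.83, 0.44, 1.77, 0.76])   # placeholder values

def scale(sample: list[float]) -> list[float]:
    """Standardize one [sepal length, sepal width, petal length, petal width] sample."""
    return ((np.array(sample) - feature_mean) / feature_std).tolist()
```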

### Output

By running the above command, you should get a response similar to the following:

```json
[
[
[
0.0010151526657864451,
0.014391022734344006,
0.9845937490463257
]
{
"output": [
0.0010151526657864451,
0.014391022734344006,
0.9845937490463257
]
]
}
```

The response corresponds to the model's prediction for each of the classes:
@@ -93,4 +76,12 @@ The response corresponds to the model's prediction for each of the classes:
```

In this case, the model predicts that the input corresponds to the class `virginica`with
a probability of `0.9845937490463257`(~98.5%).
a probability of `0.9845937490463257` (~98.5%).
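
A small sketch of mapping that response to a label (assuming the conventional Iris class order, consistent with the prediction above):

```python
# Pick the class with the highest predicted probability.
labels = ["setosa", "versicolor", "virginica"]
output = [0.0010151526657864451, 0.014391022734344006, 0.9845937490463257]

best = max(range(len(output)), key=lambda i: output[i])
print(labels[best], output[best])  # virginica 0.9845937490463257
```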

## Next steps

This container is for demonstration purposes only, and is purposefully simplified for readability and ease of comprehension. For a production-ready version of this code, check out:

- The [ONNX Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/onnx_inference_workflow): A Python class that can run any ONNX model from a variety of storage sources.
- The [ONNX Inference Service](https://infernet-services.docs.ritual.net/reference/onnx_inference_service): A production-ready, [Infernet](https://docs.ritual.net/infernet/node/introduction)-compatible container that works out-of-the-box
with minimal configuration, and serves ONNX inference using the `ONNX Inference Workflow`.
140 changes: 76 additions & 64 deletions projects/onnx-iris/container/src/app.py
@@ -1,19 +1,11 @@
import logging
from typing import Any, cast, List

from infernet_ml.utils.common_types import TensorInput
import numpy as np
from eth_abi import decode, encode # type: ignore
from infernet_ml.utils.model_loader import (
HFLoadArgs,
ModelSource,
)
from infernet_ml.utils.service_models import InfernetInput, JobLocation
from infernet_ml.workflows.inference.onnx_inference_workflow import (
ONNXInferenceWorkflow,
ONNXInferenceInput,
ONNXInferenceResult,
)
from huggingface_hub import hf_hub_download # type: ignore
import numpy as np
import onnx
from onnxruntime import InferenceSession # type: ignore
from quart import Quart, request
from quart.json.provider import DefaultJSONProvider

@@ -33,14 +25,10 @@ def default(obj: Any) -> Any:
def create_app() -> Quart:
Quart.json_provider_class = NumpyJsonEncodingProvider
app = Quart(__name__)
# we are downloading the model from the hub.
# model repo is located at: https://huggingface.co/Ritual-Net/iris-dataset

workflow = ONNXInferenceWorkflow(
model_source=ModelSource.HUGGINGFACE_HUB,
load_args=HFLoadArgs(repo_id="Ritual-Net/iris-dataset", filename="iris.onnx"),
)
workflow.setup()
# Model repo is located at: https://huggingface.co/Ritual-Net/iris-dataset
REPO_ID = "Ritual-Net/iris-dataset"
FILENAME = "iris.onnx"

@app.route("/")
def index() -> str:
@@ -51,64 +39,88 @@ def index() -> str:

@app.route("/service_output", methods=["POST"])
async def inference() -> Any:
req_data = await request.get_json()
"""
InfernetInput has the format:
Input data has the format:
source: (0 on-chain, 1 off-chain)
destination: (0 on-chain, 1 off-chain)
data: dict[str, Any]
"""
infernet_input: InfernetInput = InfernetInput(**req_data)

match infernet_input:
case InfernetInput(source=JobLocation.OFFCHAIN):
web2_input = cast(dict[str, Any], infernet_input.data)
values = cast(List[List[float]], web2_input["input"])
case InfernetInput(source=JobLocation.ONCHAIN):
web3_input: List[int] = decode(
["uint256[]"], bytes.fromhex(cast(str, infernet_input.data))
)[0]
values = [[float(v) / 1e6 for v in web3_input]]
req_data: dict[str, Any] = await request.get_json()
onchain_source = True if req_data.get("source") == 0 else False
onchain_destination = True if req_data.get("destination") == 0 else False
data = req_data.get("data")

"""
The input to the onnx inference workflow needs to conform to ONNX runtime's
input_feed format. For more information refer to:
https://docs.ritual.net/ml-workflows/inference-workflows/onnx_inference_workflow
"""
_input = ONNXInferenceInput(
inputs={"input": TensorInput(shape=(1, 4), dtype="float", values=values)},
if onchain_source:
"""
For on-chain requests, the prompt is sent as a generalized hex-string
which we will decode to the appropriate format.
"""
web3_input: List[int] = decode(
["uint256[]"], bytes.fromhex(cast(str, data))
)[0]
values = [[float(v) / 1e6 for v in web3_input]]
else:
"""For off-chain requests, the input is sent as is."""
web2_input = cast(dict[str, Any], data)
values = cast(list[list[float]], web2_input["input"])

# Prepare the input data for the model
dtype = cast(np.dtype[np.float32], "float32")
shape = (len(values), len(values[0]))

# Download the model from the hub
path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, force_download=False)
model = onnx.load(path)
onnx.checker.check_model(model)
session = InferenceSession(path)
output_names = [output.name for output in model.graph.output]

# Run the model
outputs = session.run(
output_names,
{
"input": np.array(
values,
dtype=dtype,
).reshape(shape)
},
)
result: ONNXInferenceResult = workflow.inference(_input)

match infernet_input:
case InfernetInput(destination=JobLocation.OFFCHAIN):
"""
In case of an off-chain request, the result is returned as is.
"""
return result
case InfernetInput(destination=JobLocation.ONCHAIN):
"""
In case of an on-chain request, the result is returned in the format:

# Get the predictions
output = outputs[0]
predictions = {
"values": output.flatten(),
"dtype": "float32",
"shape": output.shape,
}

# Depending on the destination, the result is returned in a different format.
if onchain_destination:
"""
For on-chain requests, the result is returned in the format:
{
"raw_input": str,
"processed_input": str,
"raw_output": str,
"processed_output": str,
"proof": str,
}
refer to: https://docs.ritual.net/infernet/node/advanced/containers for
more info.
"""
predictions = result[0]
predictions_normalized = [int(p * 1e6) for p in predictions.values]
return {
"raw_input": "",
"processed_input": "",
"raw_output": encode(["uint256[]"], [predictions_normalized]).hex(),
"processed_output": "",
"proof": "",
}
case _:
raise ValueError("Invalid destination")
refer to: https://docs.ritual.net/infernet/node/advanced/containers for more
info.
"""
predictions_normalized = [int(p * 1e6) for p in predictions["values"]]
return {
"raw_input": "",
"processed_input": "",
"raw_output": encode(["uint256[]"], [predictions_normalized]).hex(),
"processed_output": "",
"proof": "",
}
else:
"""
For off-chain request, the result is returned as is.
"""
return {"output": predictions["values"]}

return app
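
For reference, the on-chain path above round-trips values through ABI-encoded `uint256[]` arrays with a fixed-point scale of `1e6`. A short, illustrative sketch of that convention from a caller's perspective (the helper names here are not part of this change):

```python
# Sketch of the 1e6 fixed-point + uint256[] ABI convention used by the service.
from eth_abi import decode, encode

def encode_input(features: list[float]) -> str:
    """Scale floats to integers and ABI-encode them as a uint256[] hex string."""
    return encode(["uint256[]"], [[int(v * 1e6) for v in features]]).hex()

def decode_output(raw_output_hex: str) -> list[float]:
    """Decode the service's raw_output back into floating-point probabilities."""
    (values,) = decode(["uint256[]"], bytes.fromhex(raw_output_hex))
    return [v / 1e6 for v in values]

payload = encode_input([1.0380048, 0.5586108, 1.1037828, 1.712096])
print(payload)  # hex string suitable for the on-chain `data` field
```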

6 changes: 4 additions & 2 deletions projects/onnx-iris/container/src/requirements.txt
@@ -1,4 +1,6 @@
huggingface-hub==0.17.3
numpy==1.26.4
onnx==1.16.1
onnxruntime==1.18.0
quart==0.19.4
infernet-ml==1.0.0
infernet-ml[onnx_inference]==1.0.0
web3==6.15.0