Commit

onnx service
stelios-ritual committed Oct 4, 2024
1 parent e1a1959 commit 3d38858
Showing 11 changed files with 145 additions and 1,110 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
- ##### The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- ##### This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.0.0] - XXXX-XX-XX

### Changed
- Simplified examples to the minimum core functionality necessary and removed all dependencies on `infernet-ml`.
- Updated images used for deploying the Infernet Node.

## [1.0.1] - 2024-07-31

### Fixed
9 changes: 6 additions & 3 deletions projects/gpt4/container/README.md
@@ -1,7 +1,9 @@
# GPT 4
In this example, we run a minimalist container that makes use of the OpenAI [completions API](https://platform.openai.com/docs/api-reference/chat) to serve text generation requests.

In this example, we will run a minimalist container that makes use of the OpenAI [completions API](https://platform.openai.com/docs/api-reference/chat) to serve text generation requests.

## Requirements

To use the model you'll need to have an OpenAI API key. Get one on [OpenAI](https://openai.com/)'s website.

## Run the Container
@@ -24,7 +26,8 @@ curl -X POST localhost:3000/service_output -H "Content-Type: application/json" \

## Next steps

This container is for demonstration purposes only, and is purposefully simplified for readability and ease of comprehension. For a production-ready version of this code, check out:
This container is for demonstration purposes only, and is purposefully simplified for
readability and ease of comprehension. For a production-ready version of this code, check out:

- The [CSS Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/css_inference_workflow/): A Python class that supports multiple API providers, including OpenAI, that can be used to build production-ready containers.
- The [CSS Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/css_inference_workflow/): A Python class that supports multiple API providers, including OpenAI, and can be used to build production-ready containers.
- The [CSS Inference Service](https://infernet-services.docs.ritual.net/reference/css_inference_service/): A production-ready, [Infernet](https://docs.ritual.net/infernet/node/introduction)-compatible container that works out-of-the-box with minimal configuration, and serves inference using the `CSS Inference Workflow`.
18 changes: 9 additions & 9 deletions projects/hello-world/hello-world.md
@@ -3,12 +3,12 @@
Welcome to the first tutorial of Infernet! In this tutorial we will guide you through the process of setting up and
running an Infernet Node, and then demonstrate how to create and monitor off-chain compute jobs and on-chain subscriptions.

To interact with infernet, one could either create a job by accessing an infernet node directly through it's API (we'll
To interact with infernet, one could either create a job by accessing an Infernet Node directly through its API (we'll
refer to this as an off-chain job), or by creating a subscription on-chain (we'll refer to this as an on-chain job).

## Requesting an off-chain job: Hello World!

This project is a simple [flask-app](container/src/app.py) that is compatible with `infernet`, and simply
This project is a simple [flask-app](container/src/app.py) that is compatible with Infernet, and simply
[echoes what you send to it](container/src/app.py#L16).

### Install Docker & Verify Installation
@@ -42,11 +42,11 @@ make build-container project=hello-world

Then, from the top-level project directory, run the following make command:

```
```bash
make deploy-container project=hello-world
```

This will deploy an infernet node along with the `hello-world` image.
This will deploy an Infernet Node along with the `hello-world` image.

### Creating an off-chain job through the API
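
The request itself is a plain HTTP call to the node's REST API. As a rough illustration only (the `/api/jobs` path and the payload keys are assumptions here, not part of this diff), such a request might look like the following sketch:

```python
# Illustrative sketch only: the endpoint path and payload keys are assumptions.
import requests

# The node's API is published on port 4000 (see the container listing below).
job = requests.post(
    "http://127.0.0.1:4000/api/jobs",
    json={"containers": ["hello-world"], "data": {"prompt": "hello"}},
    timeout=10,
)
print(job.json())  # typically returns an identifier for the queued job
```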

Expand Down Expand Up @@ -107,11 +107,11 @@ In another terminal, run `docker container ls`, you should see something like th

```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c2ca0ffe7817 ritualnetwork/infernet-anvil:0.0.0 "anvil --host 0.0.0.…" 9 seconds ago Up 8 seconds 0.0.0.0:8545->3000/tcp anvil-node
c2ca0ffe7817 ritualnetwork/infernet-anvil:1.0.0 "anvil --host 0.0.0.…" 9 seconds ago Up 8 seconds 0.0.0.0:8545->3000/tcp infernet-anvil
0b686a6a0e5f ritualnetwork/hello-world-infernet:0.0.2 "gunicorn app:create…" 9 seconds ago Up 8 seconds 0.0.0.0:3000->3000/tcp hello-world
28b2e5608655 ritualnetwork/infernet-node:0.1.1 "/app/entrypoint.sh" 10 seconds ago Up 10 seconds 0.0.0.0:4000->4000/tcp deploy-node-1
03ba51ff48b8 fluent/fluent-bit:latest "/fluent-bit/bin/flu…" 10 seconds ago Up 10 seconds 2020/tcp, 0.0.0.0:24224->24224/tcp deploy-fluentbit-1
a0d96f29a238 redis:latest "docker-entrypoint.s…" 10 seconds ago Up 10 seconds 0.0.0.0:6379->6379/tcp deploy-redis-1
28b2e5608655 ritualnetwork/infernet-node:1.3.1 "/app/entrypoint.sh" 10 seconds ago Up 10 seconds 0.0.0.0:4000->4000/tcp deploy-node-1
03ba51ff48b8 fluent/fluent-bit:latest "/fluent-bit/bin/flu…" 10 seconds ago Up 10 seconds 2020/tcp, 0.0.0.0:24224->24224/tcp infernet-fluentbit
a0d96f29a238 redis:latest "docker-entrypoint.s…" 10 seconds ago Up 10 seconds 0.0.0.0:6379->6379/tcp infernet-redis
```

You can see that the anvil node is running on port `8545`, and the infernet
@@ -125,7 +125,7 @@ All this contract does is to request a job from the infernet node, and upon rece
the result, it will use the `forge` console to print the result.

**Anvil Logs**: First, it's useful to look at the logs of the anvil node to see what's going on. In
a new terminal, run `docker logs -f anvil-node`.
a new terminal, run `docker logs -f infernet-anvil`.

**Deploying the contracts**: In another terminal, run the following command:

57 changes: 24 additions & 33 deletions projects/onnx-iris/container/README.md
@@ -1,23 +1,10 @@
# Iris Classification via ONNX Runtime
# Running an ONNX model

This example uses a pre-trained model to classify iris flowers. The code for the model
is located at
our [simple-ml-models](https://github.com/ritual-net/simple-ml-models/tree/main/iris_classification)
repository.
In this example, we will serve a pre-trained model to classify iris flowers via the ONNX runtime. The code for the model
is located at our [simple-ml-models](https://github.com/ritual-net/simple-ml-models/tree/main/iris_classification) repository.

## Overview

We're making use of
the [ONNXInferenceWorkflow](https://github.com/ritual-net/infernet-ml/blob/main/src/ml/workflows/inference/onnx_inference_workflow.py)
class to run the model. This is one of many workflows that we currently support in our
[infernet-ml](https://github.com/ritual-net/infernet-ml). Consult the library's
documentation for more info on workflows that
are supported.

## Building & Running the Container in Isolation

Note that this container is meant to be started by the infernet-node. For development &
Testing purposes, you can run the container in isolation using the following commands.
This container is meant to be started by the Infernet Node. For development and
testing purposes, you can run the container in isolation using the following commands.

### Building the Container

@@ -44,7 +31,7 @@ Run the following command to run an inference:
```bash
curl -X POST http://127.0.0.1:3000/service_output \
-H "Content-Type: application/json" \
-d '{"source":1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}}'
-d '{"source": 1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}}'
```
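
The same request can also be issued from Python; the sketch below simply mirrors the `curl` call above using the `requests` library:

```python
# Mirrors the curl command above: an off-chain (source=1) request carrying one
# pre-scaled iris sample as a 1x4 input matrix.
import requests

resp = requests.post(
    "http://127.0.0.1:3000/service_output",
    json={"source": 1, "data": {"input": [[1.0380048, 0.5586108, 1.1037828, 1.712096]]}},
    timeout=10,
)
print(resp.json())
```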

#### Note Regarding the Input
@@ -63,27 +50,23 @@ Putting this input into a vector and scaling it, we get the following scaled inp
[1.0380048, 0.5586108, 1.1037828, 1.712096]
```

Refer
to [this function in the model's repository](https://github.com/ritual-net/simple-ml-models/blob/03ebc6fb15d33efe20b7782505b1a65ce3975222/iris_classification/iris_inference_pytorch.py#L13)
for more information on how the input is scaled.
Refer to [this function in the model's repository](https://github.com/ritual-net/simple-ml-models/blob/03ebc6fb15d33efe20b7782505b1a65ce3975222/iris_classification/iris_inference_pytorch.py#L13) for more information on how the input
is scaled.

For more context on the Iris dataset, refer to
the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris).
For more context on the Iris dataset, refer to the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris).
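
In other words, each feature is standardized against statistics from the model's training data before inference. A minimal sketch of that step (the mean and standard deviation values below are placeholders, not the model's actual statistics):

```python
# Standard scaling sketch: z = (x - mean) / std, applied per feature.
# The mean/std values are placeholders; the real ones come from the training
# data used in the simple-ml-models repository.
import numpy as np

feature_mean = np.array([5.84, 3.06, 3.76, 1.20])  # placeholder values
feature_std = np.array([0.83, 0.44, 1.77, 0.76])   # placeholder values

def scale(sample: list[float]) -> list[float]:
    """Standardize one [sepal length, sepal width, petal length, petal width] sample."""
    return ((np.array(sample) - feature_mean) / feature_std).tolist()
```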

### Output

By running the above command, you should get a response similar to the following:

```json
[
[
[
0.0010151526657864451,
0.014391022734344006,
0.9845937490463257
]
{
"output": [
0.0010151526657864451,
0.014391022734344006,
0.9845937490463257
]
]
}
```

The response corresponds to the model's prediction for each of the classes:
@@ -93,4 +76,12 @@ The response corresponds to the model's prediction for each of the classes:
```

In this case, the model predicts that the input corresponds to the class `virginica`with
a probability of `0.9845937490463257`(~98.5%).
a probability of `0.9845937490463257` (~98.5%).
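
A small sketch of mapping that response to a label (assuming the conventional Iris class order, consistent with the prediction above):

```python
# Pick the class with the highest predicted probability.
labels = ["setosa", "versicolor", "virginica"]
output = [0.0010151526657864451, 0.014391022734344006, 0.9845937490463257]

best = max(range(len(output)), key=lambda i: output[i])
print(labels[best], output[best])  # virginica 0.9845937490463257
```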

## Next steps

This container is for demonstration purposes only, and is purposefully simplified for readability and ease of comprehension. For a production-ready version of this code, check out:

- The [ONNX Inference Workflow](https://infernet-ml.docs.ritual.net/reference/infernet_ml/workflows/inference/onnx_inference_workflow): A Python class that can run any ONNX model from a variety of storage sources.
- The [ONNX Inference Service](https://infernet-services.docs.ritual.net/reference/onnx_inference_service): A production-ready, [Infernet](https://docs.ritual.net/infernet/node/introduction)-compatible container that works out-of-the-box
with minimal configuration, and serves ONNX inference using the `ONNX Inference Workflow`.
140 changes: 76 additions & 64 deletions projects/onnx-iris/container/src/app.py
@@ -1,19 +1,11 @@
import logging
from typing import Any, cast, List

from infernet_ml.utils.common_types import TensorInput
import numpy as np
from eth_abi import decode, encode # type: ignore
from infernet_ml.utils.model_loader import (
HFLoadArgs,
ModelSource,
)
from infernet_ml.utils.service_models import InfernetInput, JobLocation
from infernet_ml.workflows.inference.onnx_inference_workflow import (
ONNXInferenceWorkflow,
ONNXInferenceInput,
ONNXInferenceResult,
)
from huggingface_hub import hf_hub_download # type: ignore
import numpy as np
import onnx
from onnxruntime import InferenceSession # type: ignore
from quart import Quart, request
from quart.json.provider import DefaultJSONProvider

@@ -33,14 +25,10 @@ def default(obj: Any) -> Any:
def create_app() -> Quart:
Quart.json_provider_class = NumpyJsonEncodingProvider
app = Quart(__name__)
# we are downloading the model from the hub.
# model repo is located at: https://huggingface.co/Ritual-Net/iris-dataset

workflow = ONNXInferenceWorkflow(
model_source=ModelSource.HUGGINGFACE_HUB,
load_args=HFLoadArgs(repo_id="Ritual-Net/iris-dataset", filename="iris.onnx"),
)
workflow.setup()
# Model repo is located at: https://huggingface.co/Ritual-Net/iris-dataset
REPO_ID = "Ritual-Net/iris-dataset"
FILENAME = "iris.onnx"

@app.route("/")
def index() -> str:
@@ -51,64 +39,88 @@ def index() -> str:

@app.route("/service_output", methods=["POST"])
async def inference() -> Any:
req_data = await request.get_json()
"""
InfernetInput has the format:
Input data has the format:
source: (0 on-chain, 1 off-chain)
destination: (0 on-chain, 1 off-chain)
data: dict[str, Any]
"""
infernet_input: InfernetInput = InfernetInput(**req_data)

match infernet_input:
case InfernetInput(source=JobLocation.OFFCHAIN):
web2_input = cast(dict[str, Any], infernet_input.data)
values = cast(List[List[float]], web2_input["input"])
case InfernetInput(source=JobLocation.ONCHAIN):
web3_input: List[int] = decode(
["uint256[]"], bytes.fromhex(cast(str, infernet_input.data))
)[0]
values = [[float(v) / 1e6 for v in web3_input]]
req_data: dict[str, Any] = await request.get_json()
onchain_source = True if req_data.get("source") == 0 else False
onchain_destination = True if req_data.get("destination") == 0 else False
data = req_data.get("data")

"""
The input to the onnx inference workflow needs to conform to ONNX runtime's
input_feed format. For more information refer to:
https://docs.ritual.net/ml-workflows/inference-workflows/onnx_inference_workflow
"""
_input = ONNXInferenceInput(
inputs={"input": TensorInput(shape=(1, 4), dtype="float", values=values)},
if onchain_source:
"""
For on-chain requests, the prompt is sent as a generalized hex-string
which we will decode to the appropriate format.
"""
web3_input: List[int] = decode(
["uint256[]"], bytes.fromhex(cast(str, data))
)[0]
values = [[float(v) / 1e6 for v in web3_input]]
else:
"""For off-chain requests, the input is sent as is."""
web2_input = cast(dict[str, Any], data)
values = cast(list[list[float]], web2_input["input"])

# Prepare the input data for the model
dtype = cast(np.dtype[np.float32], "float32")
shape = (len(values), len(values[0]))

# Download the model from the hub
path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, force_download=False)
model = onnx.load(path)
onnx.checker.check_model(model)
session = InferenceSession(path)
output_names = [output.name for output in model.graph.output]

# Run the model
outputs = session.run(
output_names,
{
"input": np.array(
values,
dtype=dtype,
).reshape(shape)
},
)
result: ONNXInferenceResult = workflow.inference(_input)

match infernet_input:
case InfernetInput(destination=JobLocation.OFFCHAIN):
"""
In case of an off-chain request, the result is returned as is.
"""
return result
case InfernetInput(destination=JobLocation.ONCHAIN):
"""
In case of an on-chain request, the result is returned in the format:

# Get the predictions
output = outputs[0]
predictions = {
"values": output.flatten(),
"dtype": "float32",
"shape": output.shape,
}

# Depending on the destination, the result is returned in a different format.
if onchain_destination:
"""
For on-chain requests, the result is returned in the format:
{
"raw_input": str,
"processed_input": str,
"raw_output": str,
"processed_output": str,
"proof": str,
}
refer to: https://docs.ritual.net/infernet/node/advanced/containers for
more info.
"""
predictions = result[0]
predictions_normalized = [int(p * 1e6) for p in predictions.values]
return {
"raw_input": "",
"processed_input": "",
"raw_output": encode(["uint256[]"], [predictions_normalized]).hex(),
"processed_output": "",
"proof": "",
}
case _:
raise ValueError("Invalid destination")
refer to: https://docs.ritual.net/infernet/node/advanced/containers for more
info.
"""
predictions_normalized = [int(p * 1e6) for p in predictions["values"]]
return {
"raw_input": "",
"processed_input": "",
"raw_output": encode(["uint256[]"], [predictions_normalized]).hex(),
"processed_output": "",
"proof": "",
}
else:
"""
For off-chain request, the result is returned as is.
"""
return {"output": predictions["values"]}

return app
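
For reference, the on-chain path above round-trips values through ABI-encoded `uint256[]` arrays with a fixed-point scale of `1e6`. A short, illustrative sketch of that convention from a caller's perspective (the helper names here are not part of this change):

```python
# Sketch of the 1e6 fixed-point + uint256[] ABI convention used by the service.
from eth_abi import decode, encode

def encode_input(features: list[float]) -> str:
    """Scale floats to integers and ABI-encode them as a uint256[] hex string."""
    return encode(["uint256[]"], [[int(v * 1e6) for v in features]]).hex()

def decode_output(raw_output_hex: str) -> list[float]:
    """Decode the service's raw_output back into floating-point probabilities."""
    (values,) = decode(["uint256[]"], bytes.fromhex(raw_output_hex))
    return [v / 1e6 for v in values]

payload = encode_input([1.0380048, 0.5586108, 1.1037828, 1.712096])
print(payload)  # hex string suitable for the on-chain `data` field
```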

6 changes: 4 additions & 2 deletions projects/onnx-iris/container/src/requirements.txt
@@ -1,4 +1,6 @@
huggingface-hub==0.17.3
numpy==1.26.4
onnx==1.16.1
onnxruntime==1.18.0
quart==0.19.4
infernet-ml==1.0.0
infernet-ml[onnx_inference]==1.0.0
web3==6.15.0