
Overview

The arize-otel package provides a lightweight wrapper around OpenTelemetry primitives with Arize-aware defaults and options. It is meant to be a very lightweight convenience package to help set up OpenTelemetry for tracing LLM applications and send the traces to Arize.

Installation

Install arize-otel using pip

pip install arize-otel

Quickstart

The arize.otel module provides a high-level register function that configures OpenTelemetry tracing and returns a TracerProvider. The register function can also configure headers and choose whether spans are processed one at a time or in batches.

The following examples show how to use register to set up OpenTelemetry so that traces are sent to a collector. However, this is NOT the same as instrumenting your application. To instrument it, you can use any of our OpenInference AutoInstrumentators. For instance, with the OpenAI AutoInstrumentation, call instrument() after register:

from arize.otel import register
# Setup OTel via our convenience function
tracer_provider = register(
    # See details in examples below...
)

# Instrument your application using OpenInference AutoInstrumentators
from openinference.instrumentation.openai import OpenAIInstrumentor
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

The above code snippet will yield a fully set up and instrumented application. It is worth noting that this package is completely optional: it exists for convenience only, and you can set up OpenTelemetry and send traces to Arize without installing this or any other package from Arize.

The following sections show examples of how to use the register function:

Send traces to Arize

To send traces to Arize you need to authenticate via the Space ID and API Key. You can find them in the Space Settings page in the Arize platform. In addition, you'll need to specify the project name, a unique name to identify your project in the Arize platform.

from arize.otel import register
register(
    space_id = "your-arize-space-id",
    api_key = "your-arize-api-key",
    project_name = "your-project-name",
)

If you are located in the European Union, you'll need to specify the corresponding Endpoint (the default endpoint is Endpoint.ARIZE):

from arize.otel import register, Endpoint
register(
    endpoint=Endpoint.ARIZE_EUROPE,
    space_id = "your-arize-space-id",
    api_key = "your-arize-api-key",
    project_name = "your-project-name",
)

If you would like to configure your tracing using environment variables instead of passing arguments, read Using Environment Variables.

Send traces to Custom Endpoint

Sending traces to a collector on a custom endpoint is simple: you just need to provide the endpoint as a string. The default exporter is a GRPCSpanExporter; to use an HTTPSpanExporter instead, see Specify exporter type.

from arize.otel import register
register(
    endpoint = "https://my-custom-endpoint",
    # any other options...
)

Specify exporter type

If you're using an endpoint from the Endpoint enum, you do not need to do this, since the matching exporter is already known. However, if you're using a custom endpoint, the default is to use a GRPCSpanExporter. If you'd like to use an HTTPSpanExporter instead, specify the transport as shown below:

from arize.otel import register, Transport
register(
    endpoint = "https://my-custom-endpoint",
    transport = Transport.HTTP,
    # any other options...
)

Turn off batch processing of spans

We default to using BatchSpanProcessor from OpenTelemetry because it is non-blocking in case telemetry goes down. In contrast, "SimpleSpanProcessor processes spans as they are created." This can be helpful in development. You can use SimpleSpanProcessor with the option batch=False.

from arize.otel import register
register(
    # other options...
    batch=False
)
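To make the difference between the two processors concrete, here is a toy model in plain Python. ToySimpleProcessor and ToyBatchProcessor are hypothetical stand-ins written for this illustration, not OpenTelemetry or arize-otel classes. The real BatchSpanProcessor also exports from a background thread on a timer, which this sketch leaves out.

```python
from typing import Callable, List

# An "exporter" here is just a callable that receives a list of span names.
Export = Callable[[List[str]], None]


class ToySimpleProcessor:
    """Hypothetical stand-in: exports each span as soon as it ends."""

    def __init__(self, export: Export) -> None:
        self._export = export

    def on_end(self, span: str) -> None:
        self._export([span])  # one export call per span


class ToyBatchProcessor:
    """Hypothetical stand-in: buffers spans and exports them in groups."""

    def __init__(self, export: Export, max_batch_size: int = 3) -> None:
        self._export = export
        self._max_batch_size = max_batch_size
        self._buffer: List[str] = []

    def on_end(self, span: str) -> None:
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._export(self._buffer)
            self._buffer = []  # start a fresh buffer for the next batch


simple_calls: List[List[str]] = []
batch_calls: List[List[str]] = []
simple = ToySimpleProcessor(simple_calls.append)
batch = ToyBatchProcessor(batch_calls.append, max_batch_size=3)

for name in ["a", "b", "c", "d"]:
    simple.on_end(name)
    batch.on_end(name)
batch.flush()  # export whatever is still buffered

print(len(simple_calls))  # 4 export calls, one per span
print(len(batch_calls))   # 2 export calls: ["a", "b", "c"] and ["d"]
```

Fewer export calls is why batching suits production traffic, while immediate export makes spans show up right away during development.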

Debug

As you're setting up your tracing, it is helpful to print the created spans to the console. You can achieve this by setting log_to_console=True.

from arize.otel import register
register(
    # other options...
    log_to_console=True
)

Using Environment Variables

The register function will read from environment variables if the arguments are not passed:

from arize.otel import register
# With no arguments, register falls back to these environment variables:
#   space_id     <- ARIZE_SPACE_ID
#   api_key      <- ARIZE_API_KEY
#   project_name <- ARIZE_PROJECT_NAME
#   endpoint     <- ARIZE_COLLECTOR_ENDPOINT (defaults to Endpoint.ARIZE)
register()
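When relying on environment variables, a quick pre-flight check can catch a missing value before any tracing is configured. The sketch below uses only the standard library. The missing_env_vars helper is hypothetical (it is not part of arize-otel), and treating an empty string as missing is an assumption of this example. ARIZE_COLLECTOR_ENDPOINT is left out of the check because it has a default.

```python
import os
from typing import List, Mapping

# Variable names taken from the snippet above.
REQUIRED_ENV_VARS = ("ARIZE_SPACE_ID", "ARIZE_API_KEY", "ARIZE_PROJECT_NAME")


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Hypothetical helper: return required names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these before calling register():", ", ".join(missing))
```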

If an environment variable is set and a different value is passed as an argument, the argument takes precedence and the environment variable is ignored.
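The precedence rule can be sketched in a few lines of plain Python. This is only an illustration of the behavior described above, not the package's actual implementation. The resolve_setting helper and the fake_env dictionary are hypothetical names introduced for this example.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    argument: Optional[str],
    env_var: str,
    default: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper: explicit argument, then env var, then default."""
    if argument is not None:
        return argument  # an explicit argument always wins
    return environ.get(env_var, default)


# A fake environment keeps the example free of side effects.
fake_env = {"ARIZE_PROJECT_NAME": "from-env"}

print(resolve_setting("from-arg", "ARIZE_PROJECT_NAME", environ=fake_env))  # from-arg
print(resolve_setting(None, "ARIZE_PROJECT_NAME", environ=fake_env))        # from-env
print(resolve_setting(None, "ARIZE_SPACE_ID", default="unset", environ=fake_env))  # unset
```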

Using OTel Primitives

For more granular tracing configuration, these wrappers can be used as drop-in replacements for OTel primitives:

from opentelemetry import trace as trace_api
from arize.otel import HTTPSpanExporter, TracerProvider, SimpleSpanProcessor

tracer_provider = TracerProvider()
span_exporter = HTTPSpanExporter(endpoint=...)
span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
tracer_provider.add_span_processor(span_processor)
trace_api.set_tracer_provider(tracer_provider)

Wrappers have Arize-aware defaults to greatly simplify the OTel configuration process. A special endpoint keyword argument can be passed to a TracerProvider, SimpleSpanProcessor, or BatchSpanProcessor, which then automatically infers which SpanExporter to use.

Specifying the endpoint directly

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider

tracer_provider = TracerProvider(endpoint="https://your-desired-endpoint.com")
trace_api.set_tracer_provider(tracer_provider)

Configuring resources

# export ARIZE_COLLECTOR_ENDPOINT=https://your-desired-endpoint.com

from opentelemetry import trace as trace_api
from arize.otel import Resource, PROJECT_NAME, TracerProvider

tracer_provider = TracerProvider(resource=Resource({PROJECT_NAME: "my-project"}))
trace_api.set_tracer_provider(tracer_provider)

Using a BatchSpanProcessor

# export ARIZE_COLLECTOR_ENDPOINT=https://your-desired-endpoint.com

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider, BatchSpanProcessor

tracer_provider = TracerProvider()
batch_processor = BatchSpanProcessor()
tracer_provider.add_span_processor(batch_processor)

Specifying a custom GRPC endpoint

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider, BatchSpanProcessor, GRPCSpanExporter

tracer_provider = TracerProvider()
batch_processor = BatchSpanProcessor(
    span_exporter=GRPCSpanExporter(endpoint="https://your-desired-endpoint.com")
)
tracer_provider.add_span_processor(batch_processor)

Questions?

Find us in our Slack Community or email [email protected]

Copyright, Patent, and License

Copyright 2024 Arize AI, Inc. All Rights Reserved.

This software is licensed under the terms of the 3-Clause BSD License. See LICENSE.