RichardBruskiewich committed May 15, 2024
2 parents 0f05ff6 + 65d6902 commit 022e9b1
Showing 1 changed file with 39 additions and 1 deletion.
40 changes: 39 additions & 1 deletion docs/deployment-guide/monitoring.md
@@ -11,4 +11,42 @@ T.B.A. (Tim Putnam to elaborate)

# Telemetry

Because the Biomedical Translator knowledge processing components are distributed, tracing the flow of queries through the system is a challenge. Observability is the practice of inferring the state of a system from the outputs of its components. [OpenTelemetry](https://opentelemetry.io/) is an open-source observability framework. Applying elements of OpenTelemetry to Translator will help with query auditing, whether for quality assurance or for performance analysis. An overview of OpenTelemetry concepts is provided [here](https://docs.google.com/presentation/d/1OjcE1gVhx8u9EvvHGn6h50otBKmpd-9HidlTNppXXy0/edit#slide=id.g27ee40efb83_0_3), and a small Translator demo of the concept using the Jaeger telemetry collector is [here](https://github.com/TranslatorSRI/Jaeger-demo).

### Telemetry Frequently Asked Questions
#### When I instrument my application, should I trace incoming or outgoing requests?
In a microservices environment such as the Translator system, tracing both incoming and outgoing requests is key. Incoming tracing reveals user journeys, while outgoing tracing uncovers dependencies between services. Both are crucial for comprehensive visibility and issue diagnosis.

For example, ARAGORN is an ARA that receives requests from the ARS and sends subsequent requests to downstream components. It uses FastAPI instrumentation to trace incoming requests and httpx instrumentation to trace outgoing requests. This [code snippet](https://github.com/ranking-agent/aragorn/blob/main/src/otel_config.py) shows how ARAGORN traces both.
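
The general pattern can be sketched as follows. This is a minimal illustration rather than ARAGORN's actual code (see the linked file for that). It assumes the `opentelemetry-sdk`, `opentelemetry-exporter-jaeger-thrift`, `opentelemetry-instrumentation-fastapi`, and `opentelemetry-instrumentation-httpx` packages are installed; the function name `configure_tracing`, its parameters, and its defaults are hypothetical.

```python
def configure_tracing(app, service_name, agent_host="localhost", agent_port=6831):
    """Trace incoming FastAPI requests and outgoing httpx requests.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    Imports are kept inside the function so that this module still loads
    where the OpenTelemetry packages are not installed.
    """
    from opentelemetry import trace
    from opentelemetry.exporter.jaeger.thrift import JaegerExporter
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
    from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
    from opentelemetry.sdk.resources import SERVICE_NAME, Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Tag every span with the service name that will appear in the Jaeger UI.
    provider = TracerProvider(resource=Resource.create({SERVICE_NAME: service_name}))

    # Export spans in batches to the Jaeger agent over UDP.
    exporter = JaegerExporter(agent_host_name=agent_host, agent_port=agent_port)
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

    # Incoming: a server span for each request handled by the FastAPI app.
    FastAPIInstrumentor.instrument_app(app, tracer_provider=provider)

    # Outgoing: a client span for each request sent through httpx.
    HTTPXClientInstrumentor().instrument(tracer_provider=provider)
    return provider
```

Calling `configure_tracing(app, "my-service")` once at application startup, before any requests are served, should be enough for both directions to be traced. Because the httpx instrumentation propagates trace context in request headers, downstream services that are also instrumented can attach their spans to the same trace.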

#### When tracing outgoing requests, should requests bound for external services be logged?
Logging outgoing requests bound to external services can be essential for capturing communication details, aiding in troubleshooting, performance monitoring, and understanding dependencies outside your system.

#### In a development environment with no provisioned Jaeger instance, what is the best way to test my OTEL implementation?

To use a local Jaeger instance for testing your OpenTelemetry implementation, you can follow these general steps:

1. **Install Jaeger:** Download and install Jaeger locally. You can use Docker to quickly set up a Jaeger instance:
```bash
docker run -d --name jaeger -p 16686:16686 -p 6831:6831/udp jaegertracing/all-in-one:latest
```
This command pulls the latest Jaeger image and runs it, exposing the Jaeger UI on port 16686 and the Jaeger agent on UDP port 6831.
2. **Configure OpenTelemetry SDK:** Use the OpenTelemetry SDK in your application to send traces to your local Jaeger instance. Configure your exporter to send data to the Jaeger agent at localhost on UDP port 6831. Port 16686 serves the Jaeger UI and does not receive trace data.

3. **Instrument Your Code:** Instrument your application using OpenTelemetry APIs to create traces. Please make sure you've set up the instrumentation to export traces to your local Jaeger instance.

4. **Verify Traces:** Execute your application's workflows or requests that should generate traces. Then, access the Jaeger UI at http://localhost:16686 in your browser to view the traces generated by your application.


Remember to adapt the OpenTelemetry SDK configuration in your code to use the address and ports where your local Jaeger instance is running.

This approach provides a local environment for testing OpenTelemetry traces with Jaeger without needing a remote Jaeger instance.
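
Besides looking at the UI, you can check from a script that your service has reported traces. The sketch below queries `/api/services`, the endpoint the Jaeger UI itself calls to list services. That endpoint is not a documented stable API, so treat its path and response shape as assumptions that may vary between Jaeger versions. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def services_url(ui_base="http://localhost:16686"):
    """Build the URL of the service-list endpoint used by the Jaeger UI."""
    return ui_base.rstrip("/") + "/api/services"


def parse_services(payload):
    """Extract service names from the JSON body; `data` may be null."""
    return sorted(json.loads(payload).get("data") or [])


def reported_services(ui_base="http://localhost:16686", timeout=5):
    """Return the names of services that have sent traces to Jaeger."""
    with urlopen(services_url(ui_base), timeout=timeout) as response:
        return parse_services(response.read().decode("utf-8"))
```

After exercising your application, your service name should be among the values returned by `reported_services()`. Spans are exported in batches, so a short delay between the request and the check may be needed.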

Once your application is ready for deployment, Jaeger can be found at `jaeger-otel-agent.sri:6831` in the ITRB-managed environments. This address is the same in all ITRB-managed environments and points to the instance in the environment where your application is deployed.
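
One way to switch between the local and ITRB endpoints without changing code is to read the agent address from the environment. The sketch below defaults to the ITRB address given above; the variable names `JAEGER_HOST` and `JAEGER_PORT` and the function `jaeger_agent_endpoint` are hypothetical, so use whatever your deployment configuration actually defines.

```python
import os

# Address of the Jaeger agent in ITRB-managed environments.
ITRB_AGENT_HOST = "jaeger-otel-agent.sri"
AGENT_PORT = 6831


def jaeger_agent_endpoint(environ=None):
    """Return (host, port) for the Jaeger agent.

    Reads the hypothetical JAEGER_HOST and JAEGER_PORT variables and
    falls back to the ITRB address when they are not set.
    """
    env = os.environ if environ is None else environ
    host = env.get("JAEGER_HOST", ITRB_AGENT_HOST)
    port = int(env.get("JAEGER_PORT", AGENT_PORT))
    return host, port
```

For local testing you would set `JAEGER_HOST=localhost` before starting the application, then pass the returned host and port to your exporter configuration.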

#### Where can I see my traces in ITRB environments once deployed?
The Jaeger UI can be accessed at the following links:
* [CI Jaeger](https://translator-otel.ci.transltr.io/search)
* [Test Jaeger](https://translator-otel.test.transltr.io/search)
* [Prod Jaeger](https://translator-otel.transltr.io/search)
