Commit ec44f94
maeddes committed Apr 16, 2024
2 parents 3d1edad + 6d182b8
Showing 5 changed files with 43 additions and 59 deletions.
46 changes (20 additions, 26 deletions) in tutorial/content/labs/instrumentation/_index.md

draft = false
weight = 20
+++

The OpenTelemetry project is organized as a set of telemetry signals.
Every signal is developed as a stand-alone component (but OpenTelemetry also defines ways to integrate them with one another).
Currently, OpenTelemetry incorporates (but isn't limited to) signals for tracing, metrics, logging, and baggage.

{{< figure src="images/otel_architecture_spec.drawio.svg" width=700 caption="cross-language specification" >}}

At its core, every signal is defined by a **language-*agnostic* specification**, which ensures consistency and interoperability across various programming languages.
First, the specification includes definitions of terms to establish shared understanding and common vocabulary within the OpenTelemetry ecosystem.
Second, each telemetry signal provides an API specification.
The API specification defines the interface that all OpenTelemetry implementations must adhere to.
This includes the methods that can be used to generate, process, and export telemetry data.
By following the API specification, OpenTelemetry implementations ensure that they are compatible with each other and can be used to collect and analyze telemetry data from a variety of sources.

The API specification is complemented by the SDK specification.
It serves as a guide for developers, defining the requirements an implementation of the API must meet to be compliant.
Here, OpenTelemetry specifies concepts around the configuration, processing, and exporting of telemetry data.
In addition, the specification contains a section that defines aspects of the telemetry data itself.
This includes the concept of semantic conventions, which aim to standardize the meaning and interpretation of telemetry data.
Ensuring that telemetry is interpreted consistently, regardless of the vendors involved, fosters interoperability.
Finally, the specification also defines the OpenTelemetry Protocol (OTLP).

Using the SDK, telemetry data can be generated within applications.
This can be accomplished in two ways: automatically and manually.
With automatic instrumentation, predefined metrics, traces, and logs are collected within a library or framework.
This yields a standard set of telemetry data that helps you get started quickly with observability.
Auto-instrumentation is either already added to a library or framework by its authors or can be added using agents, which we will cover later.
With manual instrumentation, more specific telemetry data can be generated.
Manual instrumentation usually requires modifying the source code, except when you use an agent like [inspectIT Ocelot](https://inspectit.rocks/) that can inject instrumentation code into your application.
This allows for greater control and lets you collect telemetry data tailored to your needs.
Manual instrumentation is a big part of the following labs chapter.
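To make the distinction concrete, here is a minimal sketch of manual instrumentation using the OpenTelemetry Python API; the span name, attribute, and function are illustrative, not part of the labs.

```python
from opentelemetry import trace

# Acquire a tracer from the globally registered provider (illustrative scope name).
tracer = trace.get_tracer("shop.checkout")

def process_order(order_id: str) -> None:
    # Manually wrap the business logic we care about in a span.
    with tracer.start_as_current_span("process-order") as span:
        # Attach a domain-specific attribute auto-instrumentation could not know about.
        span.set_attribute("order.id", order_id)
        ...  # actual business logic goes here
```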

The benefit of instrumenting code with OpenTelemetry is that correlating the previously mentioned signals becomes simpler, since all signals carry metadata.
Correlating telemetry data enables you to connect and analyze data from various sources, providing a comprehensive view of your system's behavior.
By setting a unique correlation ID for each telemetry item and propagating it across network boundaries, you can track the flow of data and identify dependencies between different components.
OpenTelemetry's trace ID can also be leveraged for correlation, ensuring that telemetry data from the same request or transaction is associated with the same trace.
Correlation engines can further enhance this process by matching data based on correlation IDs, trace IDs, or other attributes like timestamps, allowing for efficient aggregation and analysis.
Correlated telemetry data provides valuable insights for troubleshooting, performance monitoring, optimization, and gaining a holistic understanding of your system's behavior.
In the labs chapter you will see what correlated data looks like.
Traditionally, correlation had to be done by hand or by matching timestamps, which was tedious.
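As a small, hedged illustration of trace-based correlation in Python: the active span's context can be queried and its trace ID stamped onto other telemetry, for example a log line. The helper function below is our own, not part of the API.

```python
from opentelemetry import trace

def current_trace_id_hex() -> str:
    """Return the active trace ID as zero-padded hex, or all zeros if no span is active."""
    span_context = trace.get_current_span().get_span_context()
    return format(span_context.trace_id, "032x")  # 128-bit trace ID as hex
```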

To ensure that telemetry data can be collected consistently across different frameworks, libraries, and programming languages, a vendor-neutral protocol was put in place.
The OpenTelemetry Protocol (OTLP) is an open-source protocol for collecting and transmitting telemetry data to back-end systems for analysis and storage.
It defines a standardized data model, encoding format, and transport mechanisms to enable interoperability between telemetry tools and services from different vendors.
By standardizing the way telemetry data is collected and transported, OTLP simplifies the integration of telemetry tools and services, improves data consistency, and facilitates data analysis and visualization across multiple technologies and environments.

OTLP offers three transport mechanisms for transmitting telemetry data: HTTP/1.1, HTTP/2, and gRPC.
When using OTLP, the choice of transport mechanism depends on application requirements, considering factors such as performance, reliability, and security.
OTLP data is often encoded using the Protocol Buffers (Protobuf) binary format, which is compact and efficient for network transmission and supports schema evolution, allowing future changes to the data model without breaking compatibility.
Data can also be encoded as JSON, which is human-readable at the cost of higher network traffic and larger payloads.
The protocol is described in the OpenTelemetry Protocol Specification.
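As a sketch of what this looks like in practice, the snippet below wires an OTLP/gRPC span exporter into the Python SDK; the endpoint is a placeholder for whatever collector you run.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Export spans via OTLP over gRPC to a (placeholder) collector endpoint.
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
```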
38 changes (14 additions, 24 deletions) in tutorial/content/labs/instrumentation/manual/_index.md

<!-- This lab looks at how to manually instrument an application by directly using OpenTelemetry.
In doing so, we explore how each signal works.
Thereby, we hope you gain an understanding of the fundamental concepts and terminology used by OpenTelemetry. -->


{{< figure src="images/otel_architecture_impl.drawio.svg" width=700 caption="placeholder" >}}

The specification is realized through a collection of **language-*specific* implementations**.
OpenTelemetry supports a wide range of popular [programming languages](https://opentelemetry.io/docs/instrumentation/#status-and-releases).
The implementation of a telemetry signal is mainly divided into two separate parts: the *API* and the *SDK*.
The API provides the set of interfaces to embed vendor-agnostic instrumentation into our applications and libraries.
These interfaces adhere to what was defined in OpenTelemetry's specification.
Then, there are *providers*, which implement the API.
A provider contains the logic to generate, process/aggregate and transmit the telemetry for the programming language of choice.
On startup, the application registers a provider for every type of signal.
Thereby, all API calls will be forwarded to the designated provider.
OpenTelemetry provides an SDK for every language it supports.
This SDK contains a set of official providers that serve as the reference implementation of the API.
There are several reasons why OpenTelemetry separates the API and SDK like this.
First, let's consider the case of an open-source developer who wants to embed instrumentation in their code.
The implementation of a telemetry signal likely relies on a number of other dependencies.
Forcing these dependencies onto your users is problematic, as it may cause dependency conflicts in their environments.
In contrast, the API merely consists of a set of interfaces and constants.
By separating the two, open-source developers can depend on the lightweight API, while users are free to choose an implementation that doesn't conflict with their specific software stack.
Another benefit of this design is that it allows us to embed observability into software without users having to pay the runtime cost if they don't need it.
Whenever we don't register a provider for a signal, OpenTelemetry defaults to a special provider that translates API calls into no-ops.
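A minimal sketch of this split in Python, assuming the `opentelemetry-api` and `opentelemetry-sdk` packages; the tracer and span names are illustrative:

```python
from opentelemetry import trace  # lightweight API package

tracer = trace.get_tracer("demo")
with tracer.start_as_current_span("before-setup"):
    pass  # no provider registered yet, so this call is effectively a no-op

# Register the SDK's reference implementation as the global provider.
from opentelemetry.sdk.trace import TracerProvider  # SDK package

trace.set_tracer_provider(TracerProvider())
with tracer.start_as_current_span("after-setup"):
    pass  # API calls are now forwarded to the registered provider
```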

However, OpenTelemetry comes with its own set of trade-offs.
Implementing it can introduce complexity to an application and, when configured incorrectly, may impact performance.
As a relatively new project, it may face challenges with adoption and compatibility, and while it aims to be vendor-agnostic, heavy investment in a specific implementation still carries a risk of lock-in.
Customization and flexibility may be limited compared to solutions tailored to specific use cases, and there is a learning curve associated with understanding OpenTelemetry's concepts and APIs.
Maintenance and support, particularly for organizations that rely on open-source projects, may require additional investment.
Integration with existing systems can be challenging and may require extra effort.
Costs may also be incurred depending on the scale of the implementation and the need for additional services or support.
Lastly, while OpenTelemetry has a growing community, it may not yet have the same level of community support or ecosystem of tools and integrations as more established projects.
Additionally, alternative implementations might offer better performance, as the SDK is designed to be extensible and general-purpose rather than optimized for every scenario.
It is essential to weigh these trade-offs against the benefits of OpenTelemetry to determine whether it is the right fit for a particular application or organization.
Used in the right way and configured well, however, the benefits typically outweigh the costs.
2 changes (2 additions, 0 deletions) in tutorial/content/labs/instrumentation/manual/logs/index.md

Since no tracing is set up, the `trace_id` and `span_id` are `0`. There are many
* https://github.com/open-telemetry/opentelemetry-python
* https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/logging/logging.html
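One way to see non-zero IDs is the logging instrumentation linked above; as a hedged sketch, the instrumentor can rewrite the default log format to include the trace context:

```python
import logging

from opentelemetry.instrumentation.logging import LoggingInstrumentor

# Opt in to a log format that includes otelTraceID and otelSpanID fields.
LoggingInstrumentor().instrument(set_logging_format=True)

logging.warning("hello")  # the record now carries the active trace and span IDs
```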

<!--
### Knowledge check
1. **Question**: OpenTelemetry provides a standardized way to collect and export logs.
* Learn how to log complex structures
* Understand the integration of OpenTelemetry logging into Python applications through manual and automatic instrumentation.
* Review the structure of OpenTelemetry logs and the role of trace and span IDs in log entries.
-->
11 changes (5 additions, 6 deletions) in tutorial/content/labs/instrumentation/manual/metrics/index.md

If we pass `DropAggregation`, the SDK will ignore all measurements from the matching instruments.
You have now seen some basic examples of how Views let us match instruments and customize the metrics stream.
Feel free to add these code snippets to `create_views` and observe the changes in the output.
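For reference, a hedged sketch of such a View in the Python SDK; the wildcard pattern is illustrative:

```python
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.view import DropAggregation, View

# Drop all measurements from instruments whose name matches the pattern.
drop_debug_metrics = View(
    instrument_name="debug.*",  # illustrative wildcard
    aggregation=DropAggregation(),
)

provider = MeterProvider(views=[drop_debug_metrics])
```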

<!--
## quiz
{{< quizdown >}}
- [ ] D. Use an `UpDownCounter` instrument to track the number of requests
{{< /quizdown >}}
-->


## finish

Congratulations on successfully completing the lab on metrics!

<!--
### push and pull-based exporter
So far, we have seen how the `ConsoleMetricExporter` can be a useful tool when debugging output generated by the SDK.
The reason is that in Flask's debug mode the code is reloaded after the Flask server starts.
As a result, the Prometheus server tries to bind to a port a second time, which the OS prevents.
To fix this, set the debug parameter to False.
Start the app and open [localhost:8000](http://localhost:8000) in your browser.
You should now see the metrics being exported in Prometheus text-based exposition format.
-->
5 changes (2 additions, 3 deletions) in tutorial/content/labs/instrumentation/manual/traces/index.md

Now, the service should recognize the tracing header of the incoming request and continue the trace.
Finally, context propagation is working as expected.
If we were to export spans to a tracing backend, it could analyze the SpanContext of the individual objects and piece together a distributed trace.
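As a hedged sketch of the receiving side, the incoming headers can be turned back into a context with the propagation API; the handler and header dict are illustrative:

```python
from opentelemetry import trace
from opentelemetry.propagate import extract

tracer = trace.get_tracer("receiving-service")

def handle_request(headers: dict) -> None:
    # Rebuild the remote SpanContext from the incoming traceparent header.
    ctx = extract(headers)
    # Spans started under this context join the caller's trace.
    with tracer.start_as_current_span("handle-request", context=ctx):
        ...  # request handling (illustrative)
```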

<!--
## quiz
{{< quizdown >}}
This makes it trivial to share it across different telemetry signals and services.
To implement one, inherit from the `ResourceDetector` class and override the `detect` method.
Finally, call the `merge` method on the `Resource` object, which combines both and returns a new object.
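A minimal sketch of such a detector; the attribute key and value are illustrative placeholders:

```python
from opentelemetry.sdk.resources import Resource, ResourceDetector

class DeploymentDetector(ResourceDetector):
    """Illustrative detector that derives a deployment attribute."""

    def detect(self) -> Resource:
        # A real detector would inspect the environment here.
        return Resource({"deployment.environment": "lab"})

# Merge the detected attributes into a base resource; the merged copy wins on conflicts.
resource = Resource.create({"service.name": "demo"}).merge(DeploymentDetector().detect())
```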
## finish

Congratulations on successfully completing the lab on distributed tracing with OpenTelemetry.
-->
