diff --git a/.chloggen/add-synthetic-source.yaml b/.chloggen/add-synthetic-source.yaml new file mode 100644 index 0000000000..935eb5529e --- /dev/null +++ b/.chloggen/add-synthetic-source.yaml @@ -0,0 +1,22 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: enhancement + +# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) +component: user_agent + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Add the `user_agent.synthetic.type` attribute to specify the category of synthetic traffic, such as tests or bots. + +# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. +# The values here must be integers. +issues: [1127] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. +subtext: diff --git a/docs/attributes-registry/user-agent.md b/docs/attributes-registry/user-agent.md index 37be97223f..3952005cd9 100644 --- a/docs/attributes-registry/user-agent.md +++ b/docs/attributes-registry/user-agent.md @@ -14,8 +14,20 @@ Describes user-agent attributes. |---|---|---|---|---| | `user_agent.name` | string | Name of the user-agent extracted from original. Usually refers to the browser's name. [1] | `Safari`; `YourApp` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `user_agent.original` | string | Value of the [HTTP User-Agent](https://www.rfc-editor.org/rfc/rfc9110.html#field.user-agent) header sent by the client. | `CERN-LineMode/2.15 libwww/2.17b3`; `Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1`; `YourApp/1.0.0 grpc-java-okhttp/1.27.2` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| `user_agent.version` | string | Version of the user-agent extracted from original. Usually refers to the browser's version [2] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `user_agent.synthetic.type` | string | Specifies the category of synthetic traffic, such as tests or bots. [2] | `bot`; `test` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `user_agent.version` | string | Version of the user-agent extracted from original. Usually refers to the browser's version [3] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `user_agent.name`:** [Example](https://www.whatsmyua.info) of extracting browser's name from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant name SHOULD be selected. In such a scenario it should align with `user_agent.version` -**[2] `user_agent.version`:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name` +**[2] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + +**[3] `user_agent.version`:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name` + +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | diff --git a/docs/http/http-metrics.md b/docs/http/http-metrics.md index 60020b3a9d..a5136e6d5e 100644 --- a/docs/http/http-metrics.md +++ b/docs/http/http-metrics.md @@ -89,6 +89,7 @@ of `[ 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 | [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as tests or bots. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `http.request.method`:** HTTP request method value SHOULD be "known" to the instrumentation. By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods) @@ -143,6 +144,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin > Since this attribute is based on HTTP headers, opting in to it may allow an attacker > to trigger cardinality limits, degrading the usefulness of the metric. +**[10] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + --- `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -168,6 +171,15 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin | `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + @@ -270,6 +282,7 @@ This metric is optional. | [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as tests or bots. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `http.request.method`:** HTTP request method value SHOULD be "known" to the instrumentation. By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods) @@ -324,6 +337,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin > Since this attribute is based on HTTP headers, opting in to it may allow an attacker > to trigger cardinality limits, degrading the usefulness of the metric. +**[10] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + --- `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -349,6 +364,15 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin | `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + @@ -382,6 +406,7 @@ This metric is optional. | [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as tests or bots. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `http.request.method`:** HTTP request method value SHOULD be "known" to the instrumentation. By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods) @@ -436,6 +461,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin > Since this attribute is based on HTTP headers, opting in to it may allow an attacker > to trigger cardinality limits, degrading the usefulness of the metric. +**[10] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + --- `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -461,6 +488,15 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin | `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + diff --git a/docs/http/http-spans.md b/docs/http/http-spans.md index 045a69a408..37781049ea 100644 --- a/docs/http/http-spans.md +++ b/docs/http/http-spans.md @@ -162,6 +162,7 @@ For an HTTP client span, `SpanKind` MUST be `CLIENT`. | [`url.scheme`](/docs/attributes-registry/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`url.template`](/docs/attributes-registry/url.md) | string | The low-cardinality template of an [absolute path reference](https://www.rfc-editor.org/rfc/rfc3986#section-4.2). [14] | `/users/{id}`; `/users/:id`; `/users?id={id}` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`user_agent.original`](/docs/attributes-registry/user-agent.md) | string | Value of the [HTTP User-Agent](https://www.rfc-editor.org/rfc/rfc9110.html#field.user-agent) header sent by the client. | `CERN-LineMode/2.15 libwww/2.17b3`; `Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1`; `YourApp/1.0.0 grpc-java-okhttp/1.27.2` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as tests or bots. [15] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `http.request.method`:** HTTP request method value SHOULD be "known" to the instrumentation. By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods) @@ -245,6 +246,8 @@ The attribute value MUST consist of either multiple header values as an array of **[14] `url.template`:** The `url.template` MUST have low cardinality. It is not usually available on HTTP clients, but may be known by the application or specialized HTTP instrumentation. +**[15] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + The following attributes can be important for making sampling decisions and SHOULD be provided **at span creation time** (if provided at all): @@ -290,6 +293,15 @@ and SHOULD be provided **at span creation time** (if provided at all): | `udp` | UDP | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | `unix` | Unix domain socket | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + @@ -410,6 +422,7 @@ For an HTTP server span, `SpanKind` MUST be `SERVER`. | [`network.local.address`](/docs/attributes-registry/network.md) | string | Local socket address. Useful in case of a multi-IP host. | `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.local.port`](/docs/attributes-registry/network.md) | int | Local socket port. Useful in case of a multi-port host. | `65123` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.transport`](/docs/attributes-registry/network.md) | string | [OSI transport layer](https://wikipedia.org/wiki/Transport_layer) or [inter-process communication method](https://wikipedia.org/wiki/Inter-process_communication). [17] | `tcp`; `udp` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as tests or bots. [18] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1] `http.request.method`:** HTTP request method value SHOULD be "known" to the instrumentation. By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods) @@ -491,6 +504,8 @@ The attribute value MUST consist of either multiple header values as an array of **[17] `network.transport`:** Generally `tcp` for `HTTP/1.0`, `HTTP/1.1`, and `HTTP/2`. Generally `udp` for `HTTP/3`. Other obscure implementations are possible. +**[18] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + The following attributes can be important for making sampling decisions and SHOULD be provided **at span creation time** (if provided at all): @@ -541,6 +556,15 @@ and SHOULD be provided **at span creation time** (if provided at all): | `udp` | UDP | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | `unix` | Unix domain socket | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +--- + +`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + diff --git a/model/http/metrics.yaml b/model/http/metrics.yaml index 78283fdf75..90f74a7656 100644 --- a/model/http/metrics.yaml +++ b/model/http/metrics.yaml @@ -18,6 +18,8 @@ groups: > **Warning** > Since this attribute is based on HTTP headers, opting in to it may allow an attacker > to trigger cardinality limits, degrading the usefulness of the metric. + - ref: user_agent.synthetic.type + requirement_level: opt_in - id: metric_attributes.http.client type: attribute_group brief: 'HTTP client attributes' diff --git a/model/http/spans.yaml b/model/http/spans.yaml index 2698415ce3..86ee3154c4 100644 --- a/model/http/spans.yaml +++ b/model/http/spans.yaml @@ -45,6 +45,8 @@ groups: requirement_level: opt_in - ref: http.response.body.size requirement_level: opt_in + - ref: user_agent.synthetic.type + requirement_level: opt_in - id: span.http.server type: span @@ -113,3 +115,5 @@ groups: requirement_level: opt_in - ref: http.response.body.size requirement_level: opt_in + - ref: user_agent.synthetic.type + requirement_level: opt_in diff --git a/model/user-agent/registry.yaml b/model/user-agent/registry.yaml index 2b856ecdf5..369aad35cd 100644 --- a/model/user-agent/registry.yaml +++ b/model/user-agent/registry.yaml @@ -35,3 +35,22 @@ groups: using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name` + - id: user_agent.synthetic.type + stability: experimental + brief: > + Specifies the category of synthetic traffic, such as tests or bots. + note: > + This attribute MAY be derived from the contents of the `user_agent.original` attribute. + Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. + This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result + of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests. + type: + members: + - id: bot + value: "bot" + brief: 'Bot source.' + stability: experimental + - id: test + value: "test" + brief: 'Synthetic test source.' + stability: experimental