Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(app): Backend frame count metrics #3308

Merged
merged 8 commits into from
Nov 20, 2024
Merged

feat(app): Backend frame count metrics #3308

merged 8 commits into from
Nov 20, 2024

Conversation

cratelyn
Copy link
Collaborator

@cratelyn cratelyn commented Oct 30, 2024

☁️ overview

this introduces a new tower middleware for Prometheus metrics, used for instrumenting HTTP and gRPC
response bodies, and observing (a) the number of frames yielded by a body, and (b) the number of bytes
included in body frames.

this middleware allows operators to reason about how large or small the packets being served in a backend's
response bodies are.

a route-level middleware that instruments request bodies will be added in a follow-on PR.

📝 changes

an overview of changes made here:

  • the linkerd-http-prom has a new body_data submodule. it exposes request and response halves, to
    be explicit about which body is being instrumented on a tower::Service.

  • the linkerd-http-prom crate now has a collection of new dependencies: bytes is added as a dependency
    in order to inspect the data chunk when the inner body yields a new frame. futures-util and http-body
    are added as dev-dependencies for the accompanying test coverage.

  • body metrics are affixed to the RouteBackendMetrics<L> structure, and registered at startup.

🔗 related

@cratelyn cratelyn self-assigned this Oct 30, 2024
@cratelyn cratelyn force-pushed the kate/record-body-data branch 5 times, most recently from 044cc73 to 41d2c9c Compare October 31, 2024 21:18
@cratelyn cratelyn changed the title feat(app): work in progress body frame metrics feat(app): Backend frame count metric Nov 1, 2024
@cratelyn cratelyn force-pushed the kate/record-body-data branch 2 times, most recently from f9d86cb to c08a6b5 Compare November 1, 2024 14:44
@cratelyn cratelyn marked this pull request as ready for review November 1, 2024 14:55
@cratelyn cratelyn requested a review from a team as a code owner November 1, 2024 14:55
Copy link
Collaborator Author

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍞 breadcrumbs for review...

@cratelyn cratelyn changed the title feat(app): Backend frame count metric feat(app): Backend frame count metrics Nov 1, 2024
@sfleen sfleen self-requested a review November 1, 2024 16:10
@cratelyn cratelyn requested a review from olix0r November 4, 2024 18:52
cratelyn added a commit that referenced this pull request Nov 10, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added

Signed-off-by: katelyn martin <[email protected]>
@cratelyn cratelyn force-pushed the kate/record-body-data branch from cd2d1b9 to 3b1db24 Compare November 10, 2024 22:33
cratelyn added a commit that referenced this pull request Nov 10, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Nov 10, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Nov 10, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

Signed-off-by: katelyn martin <[email protected]>
@cratelyn cratelyn removed their assignment Nov 10, 2024
@cratelyn
Copy link
Collaborator Author

this pr has been rebased onto main as of release v2.263.1.

linkerd/http/prom/src/body_data/metrics.rs Outdated Show resolved Hide resolved
linkerd/http/prom/src/body_data/metrics.rs Outdated Show resolved Hide resolved
linkerd/http/prom/src/body_data/response.rs Outdated Show resolved Hide resolved
linkerd/http/prom/src/body_data/metrics.rs Outdated Show resolved Hide resolved
linkerd/http/prom/src/body_data/metrics.rs Outdated Show resolved Hide resolved
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I through together a quick example of what a histogram implementation here looks like: kate/record-body-data...ver/kate/record-body-data (It compiles, though I didn't check the test status).

This replaces the counter/total metrics pair: the frame_size counters give us the total number of frames processed, and the sum gives us the total number of bytes. Getting in the proper metrics in a first pass saves us a deprecation cycle.

I chose a fairly constrained set of buckets to give us orders of magnitude: 128, 1024, 10240 (and we'll get an +Inf).

I also tried to handle the wrapping case a little more gracefully.

linkerd/http/prom/src/body_data/metrics.rs Outdated Show resolved Hide resolved
@cratelyn cratelyn force-pushed the kate/record-body-data branch from 52f55c6 to cc04760 Compare November 20, 2024 19:23
@cratelyn
Copy link
Collaborator Author

📝 NB: c874231 was lacking a DCO signoff, so i've added one on your behalf to satiate CI, @olix0r. that's now 225674b.

see; https://github.com/linkerd/linkerd2-proxy/runs/33282345859

cratelyn and others added 8 commits November 20, 2024 14:37
this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC response bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this middleware allows operators to reason about how large or small the
packets being served in a backend's response bodies are.

a route-level middleware that instruments request bodies will be added
in a follow-on PR.

 ### 📝 changes

an overview of changes made here:

* the `linkerd-http-prom` has a new `body_data` submodule. it exposes
  `request` and `response` halves, to be explicit about which body is
  being instrumented on a `tower::Service`.

* the `linkerd-http-prom` crate now has a collection of new
  dependencies: `bytes` is added as a dependency in order to inspect the
  data chunk when the inner body yields a new frame. `futures-util` and
  `http-body` are added as dev-dependencies for the accompanying test
  coverage.

* body metrics are affixed to the `RouteBackendMetrics<L>` structure,
  and registered at startup.

Signed-off-by: katelyn martin <[email protected]>
continuing this theme of inlining, we inline the passthrough methods on
`Body` as well.

Signed-off-by: katelyn martin <[email protected]>
see:

* prometheus/client_rust#242
* prometheus/client_rust#241

for now, refactor this test so that it gates all use of the (proposed)
`sum()` and `count()` accessors behind a temporary feature gate.

Signed-off-by: katelyn martin <[email protected]>
@cratelyn cratelyn force-pushed the kate/record-body-data branch from 95e796f to c728e18 Compare November 20, 2024 19:38
@cratelyn
Copy link
Collaborator Author

this pr has been rebased onto main as of release v2.666.0.

@cratelyn cratelyn force-pushed the kate/record-body-data branch from c728e18 to 33f5f60 Compare November 20, 2024 19:47
@cratelyn
Copy link
Collaborator Author

I through together a quick example of what a histogram implementation here looks like: kate/record-body-data...ver/kate/record-body-data (It compiles, though I didn't check the test status).

This replaces the counter/total metrics pair: the frame_size counters give us the total number of frames processed, and the sum gives us the total number of bytes. Getting in the proper metrics in a first pass saves us a deprecation cycle.

I chose a fairly constrained set of buckets to give us orders of magnitude: 128, 1024, 10240 (and we'll get an +Inf).

I also tried to handle the wrapping case a little more gracefully.

changes from this branch have been incorporated here. i've added a DCO signoff, here.

@cratelyn cratelyn requested a review from olix0r November 20, 2024 19:53
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

@cratelyn cratelyn merged commit 049e01b into main Nov 20, 2024
16 checks passed
@cratelyn cratelyn deleted the kate/record-body-data branch November 20, 2024 20:45
cratelyn added a commit that referenced this pull request Nov 21, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Nov 21, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Nov 21, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Nov 21, 2024
### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants