From c3801a1ad4c0f1b427b8a798737ac48d2435a967 Mon Sep 17 00:00:00 2001
From: Micah Nagel
Date: Mon, 23 Sep 2024 08:37:34 -0600
Subject: [PATCH] chore: pr feedback

---
 src/vector/Dockerfile | 15 ---------------
 src/vector/README.md  | 22 +++++++++++-----------
 2 files changed, 11 insertions(+), 26 deletions(-)
 delete mode 100644 src/vector/Dockerfile

diff --git a/src/vector/Dockerfile b/src/vector/Dockerfile
deleted file mode 100644
index 566b2457d..000000000
--- a/src/vector/Dockerfile
+++ /dev/null
@@ -1,15 +0,0 @@
-ARG BASE_REGISTRY=registry1.dso.mil
-ARG BASE_IMAGE=ironbank/google/distroless/static
-ARG BASE_TAG=nonroot
-
-FROM docker.io/timberio/vector:0.40.1 AS upstream
-
-FROM ${BASE_REGISTRY}/${BASE_IMAGE}:${BASE_TAG}
-
-COPY --from=upstream /usr/local/bin/* /usr/local/bin/
-COPY --from=upstream /var/lib/vector /var/lib/vector
-COPY --from=upstream /etc/vector/vector.yaml /etc/vector/vector.yaml
-
-HEALTHCHECK NONE
-
-ENTRYPOINT ["/usr/local/bin/vector"]
diff --git a/src/vector/README.md b/src/vector/README.md
index 7c424cb33..d09241ca4 100644
--- a/src/vector/README.md
+++ b/src/vector/README.md
@@ -15,23 +15,23 @@ One of the main issues that has arisen with Promtail is its limited output/expor
 ### Goals and Options

 In choosing an alternative to Promtail we have a few primary objectives:
-1. Chosen tool must be capable of gathering host and pod logs: This has been our primary usage of Promtail in the past - gathering pods logs and host logs (to include k8s audit logs, controlplane logs, etc).
-1. Provide a tool that has numerous export options to cover specific needs for environments: Current known requirements include Loki, S3, and SIEM tools like Elastic and Splunk. Ideally the tool of choice supports all of these and more, allowing for expansion as new environments require it.
-1. Choose a tool that does not require major changes in our logging stack, but is flexible for future adjustments to the stack: As we do have active users of our product we want to be careful in switching tools, so ideally we would like a tool that is a "drop-in" replacement. However, we don't want to rule out future changes to other pieces of the stack (i.e. Loki) so choosing a tool that doesn't lock us into Loki is important.
-1. Focus on the log collection/shipping problem: While there are a number of tools that offer far more than just logging pipelines (metrics, traces, etc), we don't currently see a need to focus on these tools. These features are seen as a nice to have, but not being evaluated as the focus here.
+- Chosen tool must be capable of gathering host and pod logs: This has been our primary usage of Promtail in the past - gathering pods logs and host logs (to include k8s audit logs, controlplane logs, etc).
+- Provide a tool that has numerous export options to cover specific needs for environments: Current known requirements include Loki, S3, and SIEM tools like Elastic and Splunk. Ideally the tool of choice supports all of these and more, allowing for expansion as new environments require it.
+- Choose a tool that does not require major changes in our logging stack, but is flexible for future adjustments to the stack: As we do have active users of our product we want to be careful in switching tools, so ideally we would like a tool that is a "drop-in" replacement. However, we don't want to rule out future changes to other pieces of the stack (i.e. Loki) so choosing a tool that doesn't lock us into Loki is important.
+- Focus on the log collection/shipping problem: While there are a number of tools that offer far more than just logging pipelines (metrics, traces, etc), we don't currently see a need to focus on these tools. These features are seen as a nice to have, but not being evaluated as the focus here.

 Three tools in the space of log collection were considered:
-1. [Vector](https://vector.dev/): Opensource and maintained by Datadog, Vector provides input integrations with Kubernetes logs, arbitrary files, and [other sources](https://vector.dev/docs/reference/configuration/sources/). It has the necessary export integrations with Loki, S3, Elastic, Splunk and a [number of other sinks](https://vector.dev/docs/reference/configuration/sinks/). Vector is a newer tool that has not yet reached a 1.0 release, but has risen in popularity due to its performance improvements over other tools.
-1. [FluentBit](https://fluentbit.io/): Fluentbit was historically used in Big Bang and supports file based inputs as well as [other inputs](https://docs.fluentbit.io/manual/pipeline/inputs). It also supports the necessary output integrations (Loki, S3, Elastic, Splunk and [others](https://docs.fluentbit.io/manual/pipeline/outputs)). FluentBit is a CNCF graduated project and is relatively mature. Fluentbit fell out of favor with Big Bang due to some of the complexities around managing it at scale, specifically with its buffering.
-1. [Grafana Alloy](https://grafana.com/docs/alloy/latest/): Alloy is a distribution of the OpenTelemetry Collector, opensource and maintained by Grafana Labs. It supports the necessary [inputs and outputs](https://grafana.com/docs/alloy/latest/reference/components/) (local file/k8s logs, Loki and S3). As a distribution of OTel it supports vendor-agnostic output formats and can be integrated with numerous other tools through the OTel ecosystem. While Alloy itself is relatively new, it is built on the previous codebase of Grafana Agent and the existing OTel framework. Notably it does not have any direct integrations with Splunk or Elastic, and its S3 integration is noted as experimental.
+- [Vector](https://vector.dev/): Opensource and maintained by Datadog, Vector provides input integrations with Kubernetes logs, arbitrary files, and [other sources](https://vector.dev/docs/reference/configuration/sources/). It has the necessary export integrations with Loki, S3, Elastic, Splunk and a [number of other sinks](https://vector.dev/docs/reference/configuration/sinks/). Vector is a newer tool that has not yet reached a 1.0 release, but has risen in popularity due to its performance improvements over other tools.
+- [FluentBit](https://fluentbit.io/): Fluentbit was historically used in Big Bang and supports file based inputs as well as [other inputs](https://docs.fluentbit.io/manual/pipeline/inputs). It also supports the necessary output integrations (Loki, S3, Elastic, Splunk and [others](https://docs.fluentbit.io/manual/pipeline/outputs)). FluentBit is a CNCF graduated project and is relatively mature. Fluentbit fell out of favor with Big Bang due to some of the complexities around managing it at scale, specifically with its buffering.
+- [Grafana Alloy](https://grafana.com/docs/alloy/latest/): Alloy is a distribution of the OpenTelemetry Collector, opensource and maintained by Grafana Labs. It supports the necessary [inputs and outputs](https://grafana.com/docs/alloy/latest/reference/components/) (local file/k8s logs, Loki and S3). As a distribution of OTel it supports vendor-agnostic output formats and can be integrated with numerous other tools through the OTel ecosystem. While Alloy itself is relatively new, it is built on the previous codebase of Grafana Agent and the existing OTel framework. Notably it does not have any direct integrations with Splunk or Elastic, and its S3 integration is noted as experimental.

 ### Decision and Impact

 Vector has been chosen as our replacement for Promtail. Primary motivations include:
-1. Vector has an extensive "component" catalog for inputs and outputs, with complete coverage of all currently desired export locations (and all are noted as "stable" integrations).
-1. Vector's configuration is simple and works well in helm/with UDS helm overrides (easy to add additional export locations via bundle overrides for example).
-1. Despite being a newer project, Vector's community is very active - with the most active contributors and GitHub stars compared to the other two tools.
-1. Vector is [significantly more performant](https://github.com/vectordotdev/vector?tab=readme-ov-file#performance) than other tooling in the space on most categories of metrics.
+- Vector has an extensive "component" catalog for inputs and outputs, with complete coverage of all currently desired export locations (and all are noted as "stable" integrations).
+- Vector's configuration is simple and works well in helm/with UDS helm overrides (easy to add additional export locations via bundle overrides for example).
+- Despite being a newer project, Vector's community is very active - with the most active contributors and GitHub stars compared to the other two tools.
+- Vector is [significantly more performant](https://github.com/vectordotdev/vector?tab=readme-ov-file#performance) than other tooling in the space on most categories of metrics.

 As with any decisions of tooling in core this can always be reevaluated in the future as different tools or factors affect how we look at our logging stack.