Merge branch 'main' into resource-sample
xogoodnow authored Dec 13, 2024
2 parents 95696fb + 5d5c055 commit 66e1615
Showing 73 changed files with 633 additions and 172 deletions.
23 changes: 23 additions & 0 deletions .github/workflows/deploy-pr-preview.yml
@@ -0,0 +1,23 @@
name: Deploy pr preview

on:
  pull_request:
    types:
      - opened
      - synchronize
      - closed
    paths:
      - "docs/sources/**"

jobs:
  deploy-pr-preview:
    uses: grafana/writers-toolkit/.github/workflows/deploy-preview.yml@main
    with:
      sha: ${{ github.event.pull_request.head.sha }}
      branch: ${{ github.head_ref }}
      event_number: ${{ github.event.number }}
      title: ${{ github.event.pull_request.title }}
      repo: alloy
      website_directory: content/docs/alloy/latest
      relative_prefix: /docs/alloy/latest/
      index_file: true
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -28,6 +28,8 @@ Main (unreleased)

- Add `otelcol.receiver.influxdb` to convert influx metric into OTEL. (@EHSchmitt4395)

- Add a new `/-/healthy` endpoint which returns HTTP 500 if one or more components are unhealthy. (@ptodev)

### Enhancements

- Add second metrics sample to the support bundle to provide delta information (@dehaansa)
@@ -50,9 +52,13 @@ Main (unreleased)

- Use a forked `github.com/goccy/go-json` module which reduces the memory consumption of an Alloy instance by 20MB.
If Alloy is running certain otelcol components, this reduction will not apply. (@ptodev)

- Update `prometheus.write.queue` library for performance increases in cpu. (@mattdurham)

### Bugfixes

- Fixed issue with automemlimit logging bad messages and trying to access cgroup on non-linux builds (@dehaansa)

- Fixed issue with reloading configuration and prometheus metrics duplication in `prometheus.write.queue`. (@mattdurham)

- Updated `prometheus.write.queue` to fix issue with TTL comparing different scales of time. (@mattdurham)
1 change: 1 addition & 0 deletions CODEOWNERS
@@ -9,6 +9,7 @@
* @grafana/grafana-alloy-maintainers

#`make docs` procedure and related workflows are owned by @jdbaldry.
/.github/workflows/deploy-pr-preview.yml @jdbaldry
/.github/workflows/publish-technical-documentation-next.yml @jdbaldry
/.github/workflows/publish-technical-documentation-release.yml @jdbaldry
/.github/workflows/update-make-docs.yml @jdbaldry
2 changes: 1 addition & 1 deletion docs/sources/reference/_index.md
@@ -1,6 +1,6 @@
---
canonical: https://grafana.com/docs/alloy/latest/reference/
description: The reference-level documentaiton for Grafana Aloy
description: The reference-level documentation for Grafana Alloy
menuTitle: Reference
title: Grafana Alloy Reference
weight: 600
2 changes: 2 additions & 0 deletions docs/sources/reference/cli/environment-variables.md
@@ -20,6 +20,7 @@ The following environment variables are supported:
* `PPROF_BLOCK_PROFILING_RATE`
* `GOMEMLIMIT`
* `AUTOMEMLIMIT`
* `AUTOMEMLIMIT_EXPERIMENT`
* `GOGC`
* `GOMAXPROCS`
* `GOTRACEBACK`
@@ -80,6 +81,7 @@ For example, if you want to keep memory usage below `10GiB`, use `GOMEMLIMIT=9Gi
The `GOMEMLIMIT` environment variable is either automatically set to 90% of an available `cgroup` value using the [`automemlimit`][automemlimit] module, or you can explicitly set the `GOMEMLIMIT` environment variable before you run {{< param "PRODUCT_NAME" >}}.
You can also change the 90% ratio by setting the `AUTOMEMLIMIT` environment variable to a float value between `0` and `1.0`.
No changes occur if the limit can't be determined and you didn't explicitly define a `GOMEMLIMIT` value.
You can set the `AUTOMEMLIMIT_EXPERIMENT` variable to `system` to use the [`automemlimit`][automemlimit] module's System provider, which sets `GOMEMLIMIT` based on the same ratio applied to the total system memory.
Because `cgroup` is a Linux-specific concept, this is the only way to use the `automemlimit` module to automatically set `GOMEMLIMIT` on non-Linux operating systems.
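
The ratio rule described above can be sketched as plain arithmetic.
The following is a minimal illustration, not the `automemlimit` module's actual code: the `expected_gomemlimit` function is a hypothetical helper that multiplies a memory size in bytes by a ratio, defaulting to the 90% ratio mentioned above.

```shell
#!/bin/sh
# Illustrative arithmetic only. expected_gomemlimit is a hypothetical helper,
# not part of Alloy or the automemlimit module.

# expected_gomemlimit MEMORY_BYTES [RATIO]
# Prints MEMORY_BYTES * RATIO, rounded to the nearest byte.
# RATIO defaults to 0.9, the default ratio described above.
expected_gomemlimit() {
  memory_bytes=$1
  ratio=${2:-0.9}
  awk -v m="$memory_bytes" -v r="$ratio" 'BEGIN { printf "%.0f\n", m * r }'
}

# A 10 GiB limit with the default ratio.
expected_gomemlimit 10737418240

# The same limit with a ratio of 0.5, as if AUTOMEMLIMIT=0.5 were set.
expected_gomemlimit 10737418240 0.5
```

With a 10 GiB limit, the default ratio yields `9663676416` bytes, which is 9 GiB, and a ratio of `0.5` yields `5368709120` bytes.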

## GOGC

78 changes: 78 additions & 0 deletions docs/sources/reference/http/_index.md
@@ -0,0 +1,78 @@
---
canonical: https://grafana.com/docs/alloy/latest/reference/http/
description: Learn about HTTP endpoints exposed by Grafana Alloy
title: The Grafana Alloy HTTP endpoints
menuTitle: HTTP endpoints
weight: 700
---

# The {{% param "FULL_PRODUCT_NAME" %}} HTTP endpoints

{{< param "FULL_PRODUCT_NAME" >}} has several HTTP endpoints that are available by default, regardless of which components you have configured.
You can use these HTTP endpoints to monitor, health check, and troubleshoot {{< param "PRODUCT_NAME" >}}.

The HTTP server which exposes them is configured via the [http block](../config-blocks/http)
and the `--server.` [command line arguments](../cli/run).
For example, if you set the `--server.http.listen-addr` command line argument to `127.0.0.1:12345`,
you can query the `127.0.0.1:12345/metrics` endpoint to see the internal metrics of {{< param "PRODUCT_NAME" >}}.

### /metrics

The `/metrics` endpoint returns the internal metrics of {{< param "PRODUCT_NAME" >}} in the Prometheus exposition format.

### /-/ready

An {{< param "PRODUCT_NAME" >}} instance is ready once it has loaded its initial configuration.
If the instance is ready, the `/-/ready` endpoint returns `HTTP 200 OK` and the message `Alloy is ready.`
Otherwise, if the instance is not ready, the `/-/ready` endpoint returns `HTTP 503 Service Unavailable` and the message `Alloy is not ready.`

### /-/healthy

When all {{< param "PRODUCT_NAME" >}} components are working correctly, all components are considered healthy.
If all components are healthy, the `/-/healthy` endpoint returns `HTTP 200 OK` and the message `All Alloy components are healthy.`
Otherwise, if any of the components are not working correctly, the `/-/healthy` endpoint returns `HTTP 500 Internal Server Error` and an error message.
You can also monitor component health through the {{< param "PRODUCT_NAME" >}} [UI](../../troubleshoot/debug#alloy-ui).

```shell
$ curl localhost:12345/-/healthy
All Alloy components are healthy.
```

```shell
$ curl localhost:12345/-/healthy
unhealthy components: math.add
```

{{< admonition type="note" >}}
The `/-/healthy` endpoint isn't suitable for a [Kubernetes liveness probe][k8s-liveness].

An {{< param "PRODUCT_NAME" >}} instance that reports as unhealthy should not necessarily be restarted.
For example, a component may be unhealthy due to an invalid configuration or an unavailable external resource.
In this case, restarting {{< param "PRODUCT_NAME" >}} would not fix the problem.
A restart may make it worse, because it could stop the flow of telemetry in healthy pipelines.

[k8s-liveness]: https://kubernetes.io/docs/concepts/configuration/liveness-readiness-startup-probes/
{{< /admonition >}}
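
A monitoring script can tell the readiness and health states apart by looking only at the HTTP status codes described above.
The following is a minimal sketch: `http_status` and `check_alloy` are hypothetical helper names, and the address `localhost:12345` is taken from the examples in this topic and may differ in your environment.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Alloy.
# ALLOY_ADDR is an assumed variable name for the HTTP listen address.
ALLOY_ADDR=${ALLOY_ADDR:-localhost:12345}

# http_status PATH
# Prints only the HTTP status code of a GET request to PATH.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "http://${ALLOY_ADDR}$1"
}

# check_alloy
# Returns 0 when the instance is ready and all components are healthy,
# 1 when the instance isn't ready, and 2 when it's ready but unhealthy.
check_alloy() {
  ready=$(http_status /-/ready)
  if [ "$ready" != "200" ]; then
    echo "not ready (HTTP $ready)"
    return 1
  fi
  healthy=$(http_status /-/healthy)
  if [ "$healthy" != "200" ]; then
    echo "ready, but unhealthy components reported (HTTP $healthy)"
    return 2
  fi
  echo "ready and healthy"
}
```

You could call `check_alloy` from an alerting job and treat exit code `2` as a reason to investigate rather than a reason to restart the instance.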

### /-/reload

The `/-/reload` endpoint reloads the {{< param "PRODUCT_NAME" >}} configuration file.
If the configuration file can't be reloaded, the `/-/reload` endpoint returns `HTTP 400 Bad Request` and an error message.

```shell
$ curl localhost:12345/-/reload
config reloaded
```

```shell
$ curl localhost:12345/-/reload
error during the initial load: /Users/user1/Desktop/git.alloy:13:1: Failed to build component: loading custom component controller: custom component config not found in the registry, namespace: "math", componentName: "add"
```
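
In automation, you may want to fail a deployment step when a reload is rejected.
The following is a minimal sketch that assumes a successful reload returns a 2xx status code. This topic only documents the `HTTP 400 Bad Request` failure case, and `reload_alloy` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper, not part of Alloy.

# reload_alloy [ADDRESS]
# Requests a configuration reload and returns non-zero if it's rejected.
reload_alloy() {
  addr=${1:-localhost:12345}
  # -w appends the status code on its own line after the response body.
  response=$(curl -s -w '\n%{http_code}' "http://${addr}/-/reload")
  status=$(printf '%s\n' "$response" | tail -n 1)
  body=$(printf '%s\n' "$response" | sed '$d')
  case "$status" in
    2??)
      echo "reload succeeded: $body"
      ;;
    *)
      echo "reload failed (HTTP $status): $body" >&2
      return 1
      ;;
  esac
}
```

For example, `reload_alloy localhost:12345 || exit 1` stops a script when the new configuration can't be loaded.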

### /-/support

The `/-/support` endpoint returns a [support bundle](../../troubleshoot/support_bundle) that contains information about your {{< param "PRODUCT_NAME" >}} instance. You can use this information as a baseline when debugging an issue.

### /debug/pprof

The `/debug/pprof` endpoint returns a pprof Go [profile](../../troubleshoot/profile) that you can use to visualize and analyze profiling data.
165 changes: 165 additions & 0 deletions docs/sources/set-up/install/openshift.md
@@ -0,0 +1,165 @@
---
canonical: https://grafana.com/docs/alloy/latest/set-up/install/openshift/
description: Learn how to deploy Grafana Alloy on OpenShift
menuTitle: OpenShift
title: Deploy Grafana Alloy on OpenShift
weight: 530
---

# Deploy {{% param "FULL_PRODUCT_NAME" %}} on OpenShift

You can deploy {{< param "PRODUCT_NAME" >}} on the Red Hat OpenShift Container Platform (OCP).

## Before you begin

* These steps assume you have a working OCP environment.
* You can adapt the suggested policies and configuration to meet your specific needs and security policies.

## Configure RBAC

You must configure Role-Based Access Control (RBAC) to allow secure access to Kubernetes and OCP resources.

1. Download the [rbac.yaml][] configuration file. This configuration file defines the OCP verbs and permissions for {{< param "PRODUCT_NAME" >}}.
1. Review the `rbac.yaml` file and adapt it as needed for your local environment. Refer to the [Managing Role-based Access Control (RBAC)][rbac] topic in the OCP documentation for more information about updating and managing your RBAC configurations.

## Run {{% param "PRODUCT_NAME" %}} as a non-root user

You must configure {{< param "PRODUCT_NAME" >}} to [run as a non-root user][nonroot].
This ensures that {{< param "PRODUCT_NAME" >}} complies with your OCP security policies.

## Apply security context constraints

OCP uses Security Context Constraints (SCC) to control Pod permissions.
Refer to [Managing security context constraints][scc] for more information about how you can define and enforce these permissions.
This ensures that the pods running {{< param "PRODUCT_NAME" >}} comply with OCP security policies.

{{< admonition type="note" >}}
The security context is only configured at the container level, not at the container and deployment level.
{{< /admonition >}}

You can apply the following SCCs when you deploy {{< param "PRODUCT_NAME" >}}.

{{< admonition type="note" >}}
Not all of these SCCs are required for each use case.
You can adapt the SCCs to meet your local requirements and needs.
{{< /admonition >}}

* `RunAsUser`: Specifies the user ID under which {{< param "PRODUCT_NAME" >}} runs.
You must configure this constraint to allow a non-root user ID.
* `SELinuxContext`: Configures the SELinux context for containers.
If you run {{< param "PRODUCT_NAME" >}} as root, you must configure this constraint to make sure that SELinux policies don't block {{< param "PRODUCT_NAME" >}}.
This SCC is generally not required to deploy {{< param "PRODUCT_NAME" >}} as a non-root user.
* `FSGroup`: Specifies the fsGroup IDs for file system access.
You must configure this constraint to give {{< param "PRODUCT_NAME" >}} group access to the files it needs.
* `Volumes`: Specifies the persistent volumes used for storage.
You must configure this constraint to give {{< param "PRODUCT_NAME" >}} access to the volumes it needs.

## Example DaemonSet configuration

The following example shows a DaemonSet configuration that deploys {{< param "PRODUCT_NAME" >}} as a non-root user:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: alloy-logs
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: alloy-logs
  template:
    metadata:
      labels:
        app: alloy-logs
    spec:
      containers:
      - name: alloy-logs
        image: grafana/alloy:<ALLOY_VERSION>
        ports:
        - containerPort: 12345
        # The security context configuration
        securityContext:
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          runAsUser: 473
          runAsGroup: 473
          fsGroup: 1000
      volumes:
      - name: log-volume
        emptyDir: {}
```

Replace the following:

* _`<ALLOY_VERSION>`_: Set to the specific {{< param "PRODUCT_NAME" >}} version you are deploying. For example, `1.5.1`.

{{< admonition type="note" >}}
This example uses the simplest volume type, `emptyDir`. In this example configuration, if your node restarts, your data will be lost. Make sure you set the volume type to a persistent storage location for production environments. Refer to [Using volumes to persist container data](https://docs.openshift.com/container-platform/latest/nodes/containers/nodes-containers-volumes.html) in the OpenShift documentation for more information.
{{< /admonition >}}

## Example SCC definition

The following example shows an SCC definition for deploying {{< param "PRODUCT_NAME" >}} as a non-root user:

```yaml
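kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: scc-alloy
runAsUser:
  type: MustRunAs
  uid: 473
fsGroup:
  type: MustRunAs
  # The fsGroup strategy takes ID ranges rather than a single uid.
  ranges:
  - min: 1000
    max: 1000
volumes:
- '*'
users:
- my-admin-user
groups:
- my-admin-group
seLinuxContext:
  type: MustRunAs
  # SELinux labels are nested under seLinuxOptions so the two type keys don't collide.
  seLinuxOptions:
    user: <SYSTEM_USER>
    role: <SYSTEM_ROLE>
    type: <CONTAINER_TYPE>
    level: <LEVEL>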
```

Replace the following:

* _`<SYSTEM_USER>`_: The user for your SELinux context.
* _`<SYSTEM_ROLE>`_: The role for your SELinux context.
* _`<CONTAINER_TYPE>`_: The container type for your SELinux context.
* _`<LEVEL>`_: The level for your SELinux context.

Refer to [SELinux Contexts][selinux] in the Red Hat documentation for more information on the SELinux context configuration.

{{< admonition type="note" >}}
This example sets `volumes:` to `*`. In a production environment, you should set `volumes:` to only the volumes that are necessary for the deployment. For example:

```yaml
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- secret
```

{{< /admonition >}}

Refer to [Deploy {{< param "FULL_PRODUCT_NAME" >}}][deploy] for more information about deploying {{< param "PRODUCT_NAME" >}} in your environment.

## Next steps

* [Configure {{< param "PRODUCT_NAME" >}}][Configure]

[rbac.yaml]: https://github.com/grafana/alloy/blob/main/operations/helm/charts/alloy/templates/rbac.yaml
[rbac]: https://docs.openshift.com/container-platform/latest/authentication/using-rbac.html
[nonroot]: ../../../configure/nonroot/
[scc]: https://docs.openshift.com/container-platform/latest/authentication/managing-security-context-constraints.html
[Configure]: ../../../configure/linux/
[deploy]: ../../deploy/
[selinux]: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/security-enhanced_linux/chap-security-enhanced_linux-selinux_contexts#chap-Security-Enhanced_Linux-SELinux_Contexts
14 changes: 7 additions & 7 deletions go.mod
@@ -10,7 +10,7 @@ require (
github.com/Azure/go-autorest/autorest v0.11.29
github.com/DATA-DOG/go-sqlmock v1.5.2
github.com/IBM/sarama v1.43.3
github.com/KimMachineGun/automemlimit v0.6.0
github.com/KimMachineGun/automemlimit v0.6.1
github.com/Lusitaniae/apache_exporter v0.11.1-0.20220518131644-f9522724dab4
github.com/Masterminds/sprig/v3 v3.2.3
github.com/PuerkitoBio/rehttp v1.4.0
@@ -73,7 +73,7 @@ require (
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc
github.com/grafana/tail v0.0.0-20230510142333-77b18831edf0
github.com/grafana/vmware_exporter v0.0.5-beta
github.com/grafana/walqueue v0.0.0-20241202135041-6ec70efeec94
github.com/grafana/walqueue v0.0.0-20241211144301-2b91b7dd6e08
github.com/hashicorp/consul/api v1.29.5
github.com/hashicorp/go-discover v0.0.0-20230724184603-e89ebd1b2f65
github.com/hashicorp/go-multierror v1.1.1
@@ -248,13 +248,13 @@ require (
go.uber.org/goleak v1.3.0
go.uber.org/multierr v1.11.0
go.uber.org/zap v1.27.0
golang.org/x/crypto v0.29.0
golang.org/x/crypto v0.31.0
golang.org/x/crypto/x509roots/fallback v0.0.0-20240208163226-62c9f1799c91
golang.org/x/exp v0.0.0-20240909161429-701f63a606c0
golang.org/x/net v0.31.0
golang.org/x/oauth2 v0.23.0
golang.org/x/sys v0.27.0
golang.org/x/text v0.20.0
golang.org/x/sys v0.28.0
golang.org/x/text v0.21.0
golang.org/x/time v0.6.0
golang.org/x/tools v0.25.0
google.golang.org/api v0.188.0
@@ -803,8 +803,8 @@ require (
go4.org/netipx v0.0.0-20230125063823-8449b0a6169f // indirect
golang.org/x/arch v0.7.0 // indirect
golang.org/x/mod v0.21.0 // indirect
golang.org/x/sync v0.9.0
golang.org/x/term v0.26.0 // indirect
golang.org/x/sync v0.10.0
golang.org/x/term v0.27.0 // indirect
golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2 // indirect
gomodules.xyz/jsonpatch/v2 v2.4.0 // indirect
gonum.org/v1/gonum v0.15.1 // indirect