Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* tezos signer forwarder chart The last remaining piece of https://github.com/midl-dev/tezos-on-gke/ to move into tezos-k8s, tezos-signer-forwarder is a terminating pod for ssh tunnels exposing a tezos signing endpoint from an on-prem location.
* support for HA signers
* support for loadbalancerip instead of annotation
* instead of 2 service monitors, relabel the alerts from signer enable cold standby
* support for selection of the signer port
* set port for service as well
* set scrape timeout for remote signers to 20s
* add toggle for signer metrics
* name ports in statefulset as well
* make sure signer-forwarder pod restarts when endpoint config changes
* replace janky python injection script with `targetLabels` I didn't know about `targetLabels` but it seems more natural to do it this way.
* last part - replace ad-hoc relabeling with proper ServiceMonitor config
* lint
* better explanation for sidecar
* remove namespace from serviceMonitor (bc it's not set anywhere else)
* midl => tezos
* pin alpine to more stable
* add -D and -e to CMD in signerForwarder dockerfile does not do anything since we use entrypoint in chart
* move signer forwarder image into tezos_k8s_images
* values: uncomment and make ""
* load balancer ip: uncomment and set to ""
* Update charts/tezos-signer-forwarder/templates/statefulset.yaml Co-authored-by: Aryeh Harris <[email protected]>
* simplify enumeration
* Update charts/tezos-signer-forwarder/scripts/signer_exporter.py Co-authored-by: Aryeh Harris <[email protected]>
* add readonly for the ssh secrets
* default mode 400 for more config files
* remove range and add enumeration in service
* only expose metrics port in service when enabled in values
* Revert "only expose metrics port in service when enabled in values" This reverts commit 49cf3a9.
* grab endpoint port straight from values.yaml instead of going thru a cm
* re-add missing quotes
* revert some of the perm changes to make it work
* add comment why container runs as root
* handle readiness probe timeout just like for the node
* do not hardcode pulumi annotation
---------
Co-authored-by: Aryeh Harris <[email protected]>
1 parent 9d1750c, commit 4d858ba
Showing 13 changed files with 567 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
apiVersion: v2 | ||
name: tezos-signer-forwarder | ||
description: A chart for tezos-signer-forwarder | ||
type: application | ||
version: 0.0.0 | ||
appVersion: "10.0" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
#!/bin/sh | ||
|
||
/usr/sbin/sshd -D -e -p ${TUNNEL_ENDPOINT_PORT} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
#!/usr/bin/env python | ||
import os | ||
from flask import Flask | ||
import requests | ||
|
||
import logging | ||
log = logging.getLogger('werkzeug') | ||
log.setLevel(logging.ERROR) | ||
|
||
application = Flask(__name__) | ||
|
||
readiness_probe_path = os.getenv("READINESS_PROBE_PATH") | ||
signer_port = os.getenv("SIGNER_PORT") | ||
signer_metrics = os.getenv("SIGNER_METRICS") == "true" | ||
|
||
# https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ | ||
# The readiness probe is configured with timeoutSeconds of 5s, so time out the synchronous signer request just before that. | ||
SIGNER_CONNECT_TIMEOUT = 4.5 | ||
|
||
@application.route('/metrics', methods=['GET']) | ||
def prometheus_metrics(): | ||
    ''' | ||
    Prometheus endpoint | ||
    This combines: | ||
    * the metrics from the signer, which themselves are a combination of the | ||
      prometheus node-exporter and custom probes (power status, etc) | ||
    * the `unhealthy_signers_total` metric exported by this script, verifying | ||
      whether the signer URL configured upstream returns a 200 OK | ||
    ''' | ||
|
||
    try: | ||
        probe = requests.get(f"http://localhost:{signer_port}{readiness_probe_path}", timeout=SIGNER_CONNECT_TIMEOUT) | ||
    except requests.exceptions.ConnectTimeout: | ||
        # Timed out connecting to the signer | ||
        probe = None | ||
    except requests.exceptions.ReadTimeout: | ||
        # Timed out reading from the signer | ||
        probe = None | ||
    except requests.exceptions.RequestException: | ||
        probe = None | ||
    if probe and signer_metrics: | ||
        try: | ||
            # Bound this request as well so a hung signer cannot stall the scrape | ||
            healthz = requests.get(f"http://localhost:{signer_port}/healthz", timeout=SIGNER_CONNECT_TIMEOUT).text | ||
        except requests.exceptions.RequestException: | ||
            healthz = None | ||
    else: | ||
        healthz = None | ||
    return '''# number of unhealthy signers - should be 0 or 1 | ||
unhealthy_signers_total %s | ||
%s''' % (0 if probe else 1, healthz or "") | ||
|
||
if __name__ == "__main__": | ||
    application.run(host="0.0.0.0", port=31732, debug=False) |
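The exporter above folds two sources into one scrape response: the locally computed `unhealthy_signers_total` value and, when the probe succeeds and signer metrics are enabled, the text returned by the signer itself. A minimal sketch of just that assembly step, separated from Flask and `requests` so it can be exercised on its own, might look like the following. The function name `render_metrics` and its parameters are hypothetical and do not appear in the commit; unlike the original, the sketch always ends the payload with a newline.

```python
from typing import Optional


def render_metrics(
    signer_reachable: bool,
    signer_metrics_text: Optional[str] = None,
    include_signer_metrics: bool = True,
) -> str:
    """Build a Prometheus text payload similar to the exporter's response.

    Hypothetical helper for illustration only; not part of the commit.
    """
    lines = [
        "# number of unhealthy signers - should be 0 or 1",
        f"unhealthy_signers_total {0 if signer_reachable else 1}",
    ]
    # Signer metrics are appended only when the probe succeeded and the
    # toggle is on, mirroring `if probe and signer_metrics` in the script.
    if signer_reachable and include_signer_metrics and signer_metrics_text:
        lines.append(signer_metrics_text.rstrip("\n"))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics(True, "power 0\nwired_network 0\n"), end="")
    print(render_metrics(False), end="")
```

Keeping the formatting in a pure function like this is one way to unit-test the payload without a running signer; the commit itself builds the string inline.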
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
{{/* | ||
Expand the name of the chart. | ||
*/}} | ||
{{- define "tezos-signer-forwarder.name" -}} | ||
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }} | ||
{{- end }} | ||
|
||
{{/* | ||
Create a default fully qualified app name. | ||
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). | ||
If release name contains chart name it will be used as a full name. | ||
*/}} | ||
{{- define "tezos-signer-forwarder.fullname" -}} | ||
{{- if .Values.fullnameOverride }} | ||
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }} | ||
{{- else }} | ||
{{- $name := default .Chart.Name .Values.nameOverride }} | ||
{{- if contains $name $.Release.Name }} | ||
{{- .Release.Name | trunc 63 | trimSuffix "-" }} | ||
{{- else }} | ||
{{- printf "%s-%s" $.Release.Name $name | trunc 63 | trimSuffix "-" }} | ||
{{- end }} | ||
{{- end }} | ||
{{- end }} | ||
|
||
{{/* | ||
Create chart name and version as used by the chart label. | ||
*/}} | ||
{{- define "tezos-signer-forwarder.chart" -}} | ||
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} | ||
{{- end }} | ||
|
||
{{/* | ||
Common labels | ||
*/}} | ||
{{- define "tezos-signer-forwarder.labels" -}} | ||
helm.sh/chart: {{ include "tezos-signer-forwarder.chart" . }} | ||
{{ include "tezos-signer-forwarder.selectorLabels" . }} | ||
{{- if .Chart.AppVersion }} | ||
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }} | ||
{{- end }} | ||
app.kubernetes.io/managed-by: {{ .Release.Service }} | ||
{{- end }} | ||
|
||
{{/* | ||
Selector labels | ||
*/}} | ||
{{- define "tezos-signer-forwarder.selectorLabels" -}} | ||
app.kubernetes.io/name: {{ include "tezos-signer-forwarder.name" . }} | ||
app.kubernetes.io/instance: {{ .Release.Name }} | ||
{{- end }} | ||
|
||
{{/* | ||
Create the name of the service account to use | ||
*/}} | ||
{{- define "tezos-signer-forwarder.serviceAccountName" -}} | ||
{{- if .Values.serviceAccount.create }} | ||
{{- default (include "tezos-signer-forwarder.fullname" .) .Values.serviceAccount.name }} | ||
{{- else }} | ||
{{- default "default" .Values.serviceAccount.name }} | ||
{{- end }} | ||
{{- end }} |
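The `tezos-signer-forwarder.fullname` helper above follows the usual Helm scaffold: an explicit override wins, a release name that already contains the chart name is used alone, and otherwise the two are joined, always clipped to 63 characters with one trailing dash removed. As a rough illustration of that decision order, here is a Python mirror of the template logic. The function names and keyword arguments are hypothetical stand-ins for `.Release.Name`, `.Chart.Name`, `.Values.nameOverride` and `.Values.fullnameOverride`.

```python
def _clip(value: str) -> str:
    """Mimic `trunc 63 | trimSuffix "-"`: cut to 63 chars, drop one trailing dash."""
    value = value[:63]
    return value[:-1] if value.endswith("-") else value


def fullname(
    release_name: str,
    chart_name: str,
    name_override: str = "",
    fullname_override: str = "",
) -> str:
    """Hypothetical Python mirror of the fullname template; for illustration only."""
    if fullname_override:
        return _clip(fullname_override)
    # `default .Chart.Name .Values.nameOverride`: an empty override falls back to the chart name
    name = name_override or chart_name
    # `contains $name $.Release.Name`: is the name a substring of the release name?
    if name in release_name:
        return _clip(release_name)
    return _clip(f"{release_name}-{name}")


if __name__ == "__main__":
    print(fullname("signer", "tezos-signer-forwarder"))
    print(fullname("my-tezos-signer-forwarder", "tezos-signer-forwarder"))
```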
63 changes: 63 additions & 0 deletions
charts/tezos-signer-forwarder/templates/alertmanagerconfig.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
{{- if .Values.alertmanagerConfig.enabled }} | ||
{{- range .Values.signers }} | ||
{{- if .monitoring_email }} | ||
{{ $signer := . }} | ||
{{- range .endpoints }} | ||
{{- if .alert_when_down }} | ||
apiVersion: monitoring.coreos.com/v1alpha1 | ||
kind: AlertmanagerConfig | ||
metadata: | ||
  name: tezos-signer-{{ $signer.name }}-{{ .alias }}-email | ||
  labels: | ||
    {{- toYaml $.Values.alertmanagerConfig.labels | nindent 4 }} | ||
spec: | ||
  route: | ||
    groupBy: ['job'] | ||
    groupWait: 30s | ||
    groupInterval: 5m | ||
    repeatInterval: 12h | ||
    receiver: 'email_{{ $signer.name }}' | ||
    matchers: | ||
    - name: service | ||
      value: tezos-remote-signer-{{ $signer.name }} | ||
      regex: false | ||
    - name: alertType | ||
      value: tezos-remote-signer-alert | ||
      regex: false | ||
    - name: tezos_endpoint_name | ||
      value: {{ .alias }} | ||
      regex: false | ||
    continue: false | ||
|
||
  receivers: | ||
  - name: 'email_{{ $signer.name }}' | ||
    emailConfigs: | ||
    - to: "{{ $signer.monitoring_email }}" | ||
      sendResolved: true | ||
      headers: | ||
      - key: subject | ||
        value: '{{`[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .CommonLabels.alertname }}`}}' | ||
      html: >- | ||
        {{`{{ if eq .Status "firing" }} | ||
        Attention Required for Tezos Remote Signer: | ||
        {{ else }} | ||
        Resolved Alert for Tezos Remote Signer: | ||
        {{ end }} | ||
        {{ range .Alerts -}} | ||
        {{ .Annotations.summary }} | ||
        {{ end }}`}} | ||
      text: >- | ||
        {{`{{ if eq .Status "firing" }} | ||
        Attention Required for Tezos Remote Signer: | ||
        {{ else }} | ||
        Resolved Alert for Tezos Remote Signer: | ||
        {{ end }} | ||
        {{ range .Alerts -}} | ||
        {{ .Annotations.summary }} | ||
        {{ end }}`}} | ||
--- | ||
{{- end }} | ||
{{- end }} | ||
{{- end }} | ||
{{- end }} | ||
{{- end }} |
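The `subject` header in the template above is an Alertmanager template wrapped in a backtick-quoted raw string, so Helm emits it verbatim and Alertmanager evaluates it at notification time. To make the resulting format concrete, the sketch below reproduces the same string logic in Python. The function `email_subject` and its arguments are hypothetical stand-ins for `.Status`, `.Alerts.Firing | len` and `.CommonLabels.alertname`.

```python
def email_subject(status: str, firing_count: int, alertname: str) -> str:
    """Hypothetical mirror of the subject template; for illustration only."""
    prefix = status.upper()
    # The firing count is appended only while the alert group is firing.
    if status == "firing":
        prefix += f":{firing_count}"
    return f"[{prefix}] {alertname}"


if __name__ == "__main__":
    print(email_subject("firing", 2, "NoRemoteSigner"))    # [FIRING:2] NoRemoteSigner
    print(email_subject("resolved", 0, "NoRemoteSigner"))  # [RESOLVED] NoRemoteSigner
```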
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{{- range .Values.signers }} | ||
{{- $name := .name }} | ||
{{- range $i, $endpoint := .endpoints }} | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
  name: tezos-signer-forwarder-config-{{ $name }}-{{ $i }} | ||
data: | ||
  authorized_keys: "{{ $endpoint.ssh_pubkey }} signer" | ||
--- | ||
{{- end }} | ||
{{- end }} |
51 changes: 51 additions & 0 deletions
charts/tezos-signer-forwarder/templates/prometheusrule.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
{{- if .Values.prometheusRule.enabled }} | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: PrometheusRule | ||
metadata: | ||
  labels: | ||
    {{- toYaml .Values.prometheusRule.labels | nindent 4 }} | ||
  name: tezos-remote-signer-rules | ||
spec: | ||
  groups: | ||
  - name: tezos-remote-signer.rules | ||
    rules: | ||
    - alert: SignerPowerLoss | ||
      annotations: | ||
        description: 'Remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" has lost power' | ||
        summary: 'Remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" has lost power' | ||
      expr: power{namespace="{{ .Release.Namespace }}"} != 0 | ||
      for: 1m | ||
      labels: | ||
        severity: critical | ||
        alertType: tezos-remote-signer-alert | ||
    - alert: SignerWiredNetworkLoss | ||
      annotations: | ||
        description: 'Remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" has lost wired internet connection' | ||
        summary: 'Tezos remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" has lost wired internet connection' | ||
      expr: wired_network{namespace="{{ .Release.Namespace }}"} != 0 | ||
      for: 1m | ||
      labels: | ||
        severity: critical | ||
        alertType: tezos-remote-signer-alert | ||
--- | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: PrometheusRule | ||
metadata: | ||
  labels: | ||
    {{- toYaml .Values.prometheusRule.labels | nindent 4 }} | ||
  name: tezos-remote-signer-reachability-rules | ||
spec: | ||
  groups: | ||
  - name: tezos-remote-signer.rules | ||
    rules: | ||
    - alert: NoRemoteSigner | ||
      annotations: | ||
        description: 'Remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" is down' | ||
        summary: 'Remote signer "{{`{{ $labels.tezos_endpoint_name }}`}}" for baker "{{`{{ $labels.tezos_baker_name }}`}}" is down or unable to sign.' | ||
      expr: unhealthy_signers_total{namespace="{{ .Release.Namespace }}"} != 0 | ||
      for: 1m | ||
      labels: | ||
        severity: critical | ||
        alertType: tezos-remote-signer-alert | ||
--- | ||
{{- end }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
  name: tezos-signer-forwarder-secret-{{ .Values.name }} | ||
data: | ||
  ssh_host_ecdsa_key: | | ||
{{ println .Values.secrets.signer_target_host_key | b64enc | indent 4 -}} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
  name: tezos-remote-signer-ssh-ingress-{{ .Values.name }} | ||
  annotations: | ||
{{ toYaml .Values.service_annotations | indent 4 }} | ||
spec: | ||
  type: LoadBalancer | ||
  selector: | ||
    app.kubernetes.io/name: tezos-signer-forwarder | ||
  ports: | ||
{{- range .Values.signers }} | ||
{{- $name := .name }} | ||
  # undocumented k8s feature that makes a service route to different pods | ||
  # based on the port - allows reusing the same public IP on all cloud | ||
  # providers. For it to work, ports need to have names. | ||
  # https://github.com/kubernetes/kubernetes/issues/24875#issuecomment-794596576 | ||
  {{- range $i, $endpoint := .endpoints }} | ||
  - port: {{ $endpoint.tunnel_endpoint_port }} | ||
    name: ssh-{{ trunc 9 $name }}-{{ $i }} | ||
    targetPort: ssh-{{ trunc 9 $name }}-{{ $i }} | ||
  {{- end }} | ||
{{- end }} | ||
  # ensures that remote signers can always ssh | ||
  publishNotReadyAddresses: true | ||
{{ if .Values.load_balancer_ip }} | ||
  loadBalancerIP: {{ .Values.load_balancer_ip }} | ||
{{ end }} | ||
--- | ||
{{- range .Values.signers }} | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
  name: tezos-remote-signer-{{ .name }} | ||
  labels: | ||
    app.kubernetes.io/name: tezos-signer-forwarder | ||
    tezos_baker_name: {{ .name }} | ||
spec: | ||
  selector: | ||
    app.kubernetes.io/name: tezos-signer-forwarder | ||
    tezos_baker_name: {{ .name }} | ||
  ports: | ||
  - port: {{ .signer_port }} | ||
    name: signer | ||
  - port: 31732 | ||
    name: metrics | ||
  # make sure that the service always targets the same signer when HA is in use. | ||
  sessionAffinity: ClientIP | ||
--- | ||
{{- end }} |
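The ingress Service above gives every tunnel endpoint a named port of the form `ssh-<first 9 characters of the signer name>-<index>` and uses the same string as `targetPort`, so the Service port resolves to whichever pod declares a container port with that name. Kubernetes limits port names to 15 characters, which is presumably why the signer name is truncated to 9 (4 for `ssh-`, 9 for the name, 2 for `-<index>` with a single-digit index). The sketch below enumerates the ports the template would produce for a given `signers` list. The helper names and the sample values are hypothetical; only the key names `name`, `endpoints` and `tunnel_endpoint_port` come from the chart.

```python
from typing import Any, Dict, List

MAX_PORT_NAME_LEN = 15  # Kubernetes limit for named ports


def ssh_port_name(signer_name: str, endpoint_index: int) -> str:
    """Mirror of `ssh-{{ trunc 9 $name }}-{{ $i }}` (hypothetical helper)."""
    return f"ssh-{signer_name[:9]}-{endpoint_index}"


def ssh_ports(signers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Enumerate the Service ports the template would render for `signers`."""
    ports = []
    for signer in signers:
        for index, endpoint in enumerate(signer["endpoints"]):
            name = ssh_port_name(signer["name"], index)
            ports.append(
                {
                    "port": endpoint["tunnel_endpoint_port"],
                    "name": name,
                    "targetPort": name,
                }
            )
    return ports


if __name__ == "__main__":
    # Sample values are made up for illustration.
    example = [
        {
            "name": "mybakerlongname",
            "endpoints": [
                {"tunnel_endpoint_port": 50000},
                {"tunnel_endpoint_port": 50001},
            ],
        }
    ]
    for port in ssh_ports(example):
        print(port, len(port["name"]) <= MAX_PORT_NAME_LEN)
```

Note that under this scheme a signer with ten or more endpoints and a name of nine or more characters would exceed the 15-character limit; whether the chart needs to guard against that is left open here.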
25 changes: 25 additions & 0 deletions
charts/tezos-signer-forwarder/templates/servicemonitor.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
{{- if .Values.serviceMonitor.enabled }} | ||
{{- range .Values.signers }} | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: ServiceMonitor | ||
metadata: | ||
  labels: | ||
    app.kubernetes.io/name: tezos-signer-forwarder | ||
  name: tezos-remote-signer-monitoring-{{ .name }} | ||
spec: | ||
  endpoints: | ||
  - port: metrics | ||
    path: /metrics | ||
    # default scrape timeout of 10s can be too small for remote raspberry pis | ||
    scrapeTimeout: "20s" | ||
  selector: | ||
    matchLabels: | ||
      app.kubernetes.io/name: tezos-signer-forwarder | ||
      tezos_baker_name: {{ .name }} | ||
  targetLabels: | ||
  - tezos_baker_name | ||
  podTargetLabels: | ||
  - tezos_endpoint_name | ||
--- | ||
{{- end }} | ||
{{- end }} |
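The ServiceMonitor above relies on `targetLabels` and `podTargetLabels` to copy the `tezos_baker_name` label from the Service and the `tezos_endpoint_name` label from the Pod onto every scraped series, which is what lets the alert rules reference `$labels.tezos_baker_name` and `$labels.tezos_endpoint_name`. The sketch below is a deliberately simplified model of that label transfer, not the Prometheus Operator's implementation. The function name and the sample label values are hypothetical, and it assumes the pods carry a `tezos_endpoint_name` label (set in the statefulset, which is not shown here).

```python
from typing import Dict, List


def transferred_labels(
    service_labels: Dict[str, str],
    pod_labels: Dict[str, str],
    target_labels: List[str],
    pod_target_labels: List[str],
) -> Dict[str, str]:
    """Simplified model: pick the listed Service and Pod labels for the target."""
    labels = {}
    for key in target_labels:
        if key in service_labels:
            labels[key] = service_labels[key]
    for key in pod_target_labels:
        if key in pod_labels:
            labels[key] = pod_labels[key]
    return labels


if __name__ == "__main__":
    # Sample label values are made up for illustration.
    print(
        transferred_labels(
            service_labels={
                "app.kubernetes.io/name": "tezos-signer-forwarder",
                "tezos_baker_name": "mybaker",
            },
            pod_labels={"tezos_endpoint_name": "primary"},
            target_labels=["tezos_baker_name"],
            pod_target_labels=["tezos_endpoint_name"],
        )
    )
```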