
Architecture overview

The main idea behind this program is to test the PagerDuty integration on your infrastructure.

Diagram

pd-checker service

Description

pd-checker service is a master process that runs on your monitoring infrastructure (outside of the monitored infrastructure). Its main purpose is to scan all PagerDuty services for a specific pd-checker event and to trigger an alert if such an incident has not occurred within a defined time.

To integrate with PagerDuty, pd-checker service requires a user auth token.

It can be used as a CLI command or as a Docker service.

Database structure

pd-checker service lists incidents for all available services and saves to the database the last incident per service that was created by the pd-checker event. If an incident for a given service already exists, only the ID, Title, CreateAt and Timer values are updated.

// Incident is the structure for incidents stored in the database.
type Incident struct {
	ID          string // PagerDuty incident ID
	Title       string // PagerDuty incident title
	ServiceID   string // PagerDuty service ID related to the created incident
	ServiceName string // PagerDuty service name related to the created incident
	CreateAt    string // PagerDuty incident creation time
	Timer       string // Additional information defined in the pd-checker event details
	Alert       string // If "Y", create a new alert for the service
	ToCheck     string // If "Y", check for new incidents
	Trigger     string // If "Y", the alert has already been triggered
}
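The update rule described above (only ID, Title, CreateAt and Timer change for an existing service) could look roughly like the following. This is a minimal sketch that uses an in-memory map keyed by service ID instead of the real sqlite database; the `upsert` function is hypothetical.

```go
package main

import "fmt"

// Incident repeats the fields of the structure shown above.
type Incident struct {
	ID, Title, ServiceID, ServiceName, CreateAt, Timer string
	Alert, ToCheck, Trigger                            string
}

// upsert stores in under its service ID. If a record for that service already
// exists, only ID, Title, CreateAt and Timer are overwritten; the remaining
// fields (names and flags) are kept. The map stands in for the database.
func upsert(db map[string]Incident, in Incident) {
	cur, ok := db[in.ServiceID]
	if !ok {
		db[in.ServiceID] = in
		return
	}
	cur.ID, cur.Title, cur.CreateAt, cur.Timer = in.ID, in.Title, in.CreateAt, in.Timer
	db[in.ServiceID] = cur
}

func main() {
	db := map[string]Incident{}
	upsert(db, Incident{ID: "A", Title: "PD CHECKER - OK", ServiceID: "S1", ServiceName: "web", Trigger: "Y"})
	upsert(db, Incident{ID: "B", Title: "PD CHECKER - OK", ServiceID: "S1", Timer: "12h"})
	fmt.Printf("%+v\n", db["S1"])
}
```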

All available PagerDuty services are stored in the Service database.

// Service is the structure for services stored in the database.
type Service struct {
	ID   string // PagerDuty service ID
	Name string // PagerDuty service name
}

Usage

CLI

Set the PagerDuty user auth token using an environment variable

	export PAGERDUTY_AUTH_TOKEN=xxx

Run pd-checker service with default parameters (check all services for new alerts every 6h)

pd-checker service server

Run pd-checker service to scan services and check for new alerts every 24h

pd-checker service server -t 24h
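Interval values such as `6h` or `24h` have the same form that Go's standard `time.ParseDuration` accepts. The sketch below shows how such a value could be validated; whether pd-checker parses the `-t` value this way is an assumption, and `parseInterval` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// parseInterval converts a string such as "24h" into a time.Duration and
// rejects zero or negative intervals. Hypothetical helper for illustration.
func parseInterval(s string) (time.Duration, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid interval %q: %w", s, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("interval must be positive, got %q", s)
	}
	return d, nil
}

func main() {
	for _, s := range []string{"6h", "24h", "-1h", "soon"} {
		d, err := parseInterval(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "=", d)
	}
}
```

Note that `time.ParseDuration` has no day unit, so a one-day interval is written as `24h`.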

Docker

Can be used as a Docker service (the preferred way).

Pull the image from Docker Hub

docker pull jlubzinski/pd-checker

Prepare a Docker Compose file and define PAGERDUTY_AUTH_TOKEN

Metrics

pd-checker event

Description

pd-checker event triggers (from inside your infrastructure) and instantly resolves a single PagerDuty incident, always with the same payload:

Summary:  "PD CHECKER - OK",
Severity: "info",
Source:   "localhost",
Details:  triggerEvery,

A new event can be created manually or, in server mode, once every triggerEvery interval.

Next, pd-checker service scans all available services every triggerEvery interval and, if it finds a new event named PD CHECKER - OK, registers it in the local database.

Integration with PagerDuty requires an Events API v2 service integration.
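For orientation, the trigger/resolve pair could be expressed as two Events API v2 JSON bodies like the ones built below. This is a sketch, not the project's actual code: the struct and function names and the `pd-checker` dedup key are hypothetical, and the details value is passed through as a plain string to mirror the field list above. The JSON field names follow the Events API v2 event format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventPayload holds the fixed pd-checker payload fields.
type eventPayload struct {
	Summary  string      `json:"summary"`
	Source   string      `json:"source"`
	Severity string      `json:"severity"`
	Details  interface{} `json:"custom_details,omitempty"`
}

// event is the envelope sent to the Events API v2 enqueue endpoint.
type event struct {
	RoutingKey  string        `json:"routing_key"`
	EventAction string        `json:"event_action"`
	DedupKey    string        `json:"dedup_key,omitempty"`
	Payload     *eventPayload `json:"payload,omitempty"`
}

// buildEvents returns a trigger event and a matching resolve event that
// share one dedup key, so the resolve closes the incident just opened.
func buildEvents(routingKey, dedupKey, triggerEvery string) (event, event) {
	trigger := event{
		RoutingKey:  routingKey,
		EventAction: "trigger",
		DedupKey:    dedupKey,
		Payload: &eventPayload{
			Summary:  "PD CHECKER - OK",
			Source:   "localhost",
			Severity: "info",
			Details:  triggerEvery,
		},
	}
	resolve := event{RoutingKey: routingKey, EventAction: "resolve", DedupKey: dedupKey}
	return trigger, resolve
}

func main() {
	trigger, resolve := buildEvents("xxx", "pd-checker", "12h")
	for _, e := range []event{trigger, resolve} {
		b, err := json.MarshalIndent(e, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```

The sketch only builds and prints the request bodies; sending them over HTTP is left out.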

Usage

CLI

Set the PagerDuty integration key using an environment variable

	export PAGERDUTY_INTEGRATION_KEY=xxx

Run pd-checker event with default parameters (trigger/resolve a new alert every 12h)

pd-checker event server

Run pd-checker event to trigger/resolve a new alert every 12h

pd-checker service event -r 12h

Docker

Can be used as a Docker service (the preferred way).

Pull the image from Docker Hub

docker pull jlubzinski/pd-checker

Prepare a Docker Compose file and define PAGERDUTY_INTEGRATION_KEY

Metrics

FAQ

A new PagerDuty service was added. How do I refresh the pd-checker service database?

pd-checker service scans for new PagerDuty services every 12h (default) and adds them to the sqlite database.

The pd-checker service container was recreated and the sqlite database was lost

On its initial run, pd-checker service scans for all PagerDuty services and adds them to the sqlite database.

How often will pd-checker service trigger a PagerDuty alert for a non-working integration?

pd-checker service triggers a real PagerDuty alert only once and keeps information about triggered alerts in the sqlite database.

How do I add a PagerDuty service to pd-checker?

Add pd-checker event to the newly monitored infrastructure. A new pd-checker event will then be created every 12h (default). pd-checker service adds the new service after it scans the new alert.
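The "only once" behaviour can be pictured with the Alert and Trigger flags of the Incident structure. The sketch below is one possible reading of those flags, based only on their comments; `shouldTrigger` and `markTriggered` are hypothetical names.

```go
package main

import "fmt"

// Incident keeps only the fields needed for this sketch.
type Incident struct {
	ServiceName string
	Alert       string // "Y" means an alert should be created for the service
	Trigger     string // "Y" means the alert has already been triggered
}

// shouldTrigger is true only when an alert is due and has not been sent yet.
func shouldTrigger(inc Incident) bool {
	return inc.Alert == "Y" && inc.Trigger != "Y"
}

// markTriggered records that the alert was sent, so later scans skip it.
func markTriggered(inc *Incident) {
	inc.Trigger = "Y"
}

func main() {
	inc := Incident{ServiceName: "web", Alert: "Y"}
	for scan := 1; scan <= 3; scan++ {
		if shouldTrigger(inc) {
			fmt.Printf("scan %d: trigger alert for %s\n", scan, inc.ServiceName)
			markTriggered(&inc)
		} else {
			fmt.Printf("scan %d: nothing to do\n", scan)
		}
	}
}
```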

TODO:

  • check if alert was already triggered on PagerDuty