Operation Method

Operations manages how code is deployed, configured, and monitored, as well as the availability, latency, change management, emergency response, and capacity management of services in production.

Clone this repo and document your specific choice here:

``

Content

Tips and hints

SRE principles

TAM

ITIL

See:

SRE Practices Without SREs
Do you have an SRE team yet? How to start and assess your journey
The Site Reliability Workbook: Practical Ways to Implement SRE
Site Reliability Engineering: How Google Runs Production Systems

Tips and hints

Use SRE as a basis
Define reasonable SLOs (service level objectives)
Have a monitoring strategy implemented and measure the SLIs (service level indicators)
Have a rollback strategy implemented
Have an incident management procedure in place
Create a culture of authoring blameless postmortems
As a start, have regular development teams assign SRE engineers who spend 50% of their time in a specialized horizontal SRE team
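The SLO and SLI items above can be made concrete with a small calculation. The following is a minimal sketch, assuming an availability SLI defined as the fraction of successful requests. The function names `availability_sli` and `meets_slo`, the request counts, and the 99.9% target are illustrative assumptions, not something this template prescribes.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the fraction of requests that succeeded (the measured SLI)."""
    if total_requests == 0:
        # Assumption: with no traffic there is nothing to violate the objective.
        return 1.0
    return good_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """Return True when the measured SLI is at or above the SLO target."""
    return sli >= slo_target


if __name__ == "__main__":
    # Hypothetical numbers for one measurement window.
    sli = availability_sli(good_requests=99_950, total_requests=100_000)
    print(f"SLI: {sli:.4f}, SLO met: {meets_slo(sli, slo_target=0.999)}")
```

In practice the request counts would come from whatever monitoring system the team has chosen; the point is only that an SLO is a target and an SLI is the measurement compared against it.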

SRE principles

Principle #1: SRE needs SLOs with consequences.

Principle #2: SREs must have time to make tomorrow better than today.

Principle #3: SRE teams have the ability to regulate their workload.
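One common way to give an SLO consequences, as Principle #1 asks, is an error budget: the SLO tolerates a small number of failures, and using up that allowance triggers an agreed response. The sketch below assumes a request-based SLO and uses a pause on feature releases as the consequence. Both function names and the release-pause policy are hypothetical choices for illustration, not part of this template.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        # Assumption: no traffic means none of the budget has been spent.
        return 1.0
    allowed_failures = (1.0 - slo_target) * total
    actual_failures = total - good
    return 1.0 - actual_failures / allowed_failures


def release_allowed(budget_remaining: float) -> bool:
    """Hypothetical policy: feature releases pause once the budget is used up."""
    return budget_remaining > 0.0


if __name__ == "__main__":
    # Hypothetical window: 99.9% target, 50 failures out of 100,000 requests.
    remaining = error_budget_remaining(0.999, good=99_950, total=100_000)
    print(f"Budget remaining: {remaining:.0%}, release allowed: {release_allowed(remaining)}")
```

The actual consequence is a team decision. What matters is that it is agreed on before the budget runs out.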

ITIL

TAM

Technical application management (TAM) is the more traditional way of doing operations. The TAM engineer traditionally focuses on execution rather than prevention, whereas SRE covers prevention as well.
