Fork me on GitHub

Reading Notes: The Site Reliability Workbook: Practical Ways to Implement SRE

Chapter 8 On Call

Google’s on-call practice:

  • Make sure never achieve reliability at the expense of an on-call engineer’s health.
  • SRE work should be a healthy mix of duties: on-call and project work.
  • Google SRE target a maximum of two incidents per on-call shift.
  • cash compensation for on-call.

The responsibilities of on-call engineers:

  • The on-call engineer minimizes user impact first, then makes sure the issues are fully addressed.
  • At the start of each shift, the on-call engineer reads the handoff from the previous shift.
  • At the end of the shift, the on-call engineer sends a handoff email to the next engineer on-call.
← Previous Post: Reading Notes: Peopleware: Productive Projects and Teams Next Post: Labs: GitHub Push触发Jenkins自动构建 →

All Tags

mindset

devops

book

ci

communication

jenkins

自媒体

hometown

geek life

cloud