0x19. Postmortem

Background Context

Any software system will eventually fail, and that failure can stem from a wide range of factors: bugs, traffic spikes, security issues, hardware failures, natural disasters, human error… Failing is normal, and it is actually a great opportunity to learn and improve. Any great software engineer must learn from their mistakes to make sure that they won’t happen again. Failing is fine, but failing twice because of the same issue is not.

A postmortem is a tool widely used in the tech industry. After any outage, the team(s) in charge of the system write a summary that has two main goals:

  • To provide the rest of the company’s employees with easy access to information detailing the cause of the outage. Outages can have a huge impact on a company, so managers and executives have to understand what happened and how it will impact their work.
  • To ensure that the root cause(s) of the outage have been discovered and that measures are taken to fix them.

Resources

Read or watch:

More Info

Manual QA Review

It is your responsibility to request a review for your postmortem from a peer before the project’s deadline. If no peer has reviewed it, you should request a review from a TA or staff member.