"The first five minutes are not about fixing the system. They are about ensuring the system can still be fixed later."
Incident response deserves careful thought, and this idea captures the core of it. The instinct when something breaks is to jump straight into fixing it. But the first few minutes should be about something else entirely: containment over correctness, reversibility over impact, protecting state before touching services.
The best incident responders resist the urge to act immediately. They pause, assess the blast radius, and make sure that whatever they do next doesn't make the situation harder to recover from. A premature restart can destroy the evidence you need to understand root cause. A hasty rollback can cascade into something worse.
What good first five minutes look like
The priority is preserving your options. That means capturing logs before they rotate, taking snapshots before you modify state, and communicating early, even if all you can say is "we're aware and investigating."
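To make "capturing logs before they rotate" concrete, here is a minimal Python sketch of what that step could look like. It is an illustration under assumptions, not a prescribed tool: the function name `preserve_logs`, the incident ID format, and the directory layout are all hypothetical, and a real setup would likely also cover snapshots and remote storage.

```python
import shutil
import time
from pathlib import Path


def preserve_logs(log_dir, evidence_root, incident_id):
    """Copy every regular file in log_dir into a fresh, timestamped
    evidence directory and return that directory's path.

    Hypothetical helper: the names and layout are assumptions made
    for illustration only.
    """
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    dest = Path(evidence_root) / f"{incident_id}-{stamp}"
    # exist_ok=False: refuse to overwrite evidence captured earlier.
    dest.mkdir(parents=True, exist_ok=False)
    for src in sorted(Path(log_dir).iterdir()):
        if src.is_file():
            # copy2 also tries to preserve modification times,
            # which matter when reconstructing a timeline.
            shutil.copy2(src, dest / src.name)
    return dest
```

Calling something like `preserve_logs("/var/log/myapp", "/srv/evidence", "inc-42")` (both paths are placeholders) before any restart leaves the originals untouched and gives you a copy that rotation cannot remove.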
It also means having a plan before the incident happens. If your first five minutes involve working out who to call, where the runbooks are, and which Slack channel to use, you've already lost valuable time.
Worth reading
This short piece on the first five minutes doctrine is a good articulation of the principle. Three minutes of your time, well spent.