How to Set Up an On-Call Rotation That Doesn't Suck

Most on-call setups fail the same way: too many noisy alerts, no clear handoff process, and an engineer who's exhausted by Thursday. Here's how to build a rotation that's actually sustainable.


Why most on-call rotations burn people out

On-call burnout rarely happens because of real incidents. It happens because of bad configuration: alerts that fire for things that don't matter, no clear escalation path, and engineers who have no idea what they're expected to do when something breaks.

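Before you build the rotation schedule, you need three things in place:

  1. Alerts that matter: pages fire only for problems that need someone to act
  2. A clear escalation path: the on-call engineer knows who to hand an incident to when they can't resolve it alone
  3. Clear expectations: everyone on the rotation knows what they're supposed to do when something breaks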

Without these, the rotation schedule doesn't matter. You're just spreading misery evenly.


Choosing a rotation structure

The most common rotation structures are weekly, follow-the-sun, and a weekday/weekend split. Each works for different team sizes and geographies.

Weekly rotation

One engineer carries the pager for a full week, then hands off. Simple to schedule, but you need at least 3–4 people so that nobody's turn comes around too often. Works best when incidents are rare.

Follow-the-sun

Engineers cover business hours in their timezone. Eliminates overnight coverage for any single person. Requires distributed teams — doesn't work if everyone is in the same timezone.
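
One way to check whether a team is distributed enough for follow-the-sun is to look for hours that nobody's workday covers. The sketch below is illustrative only: the team, the fixed UTC offsets, and the 09:00 to 17:00 business hours are all assumptions, and fixed offsets ignore daylight saving time (a real schedule would use proper IANA timezones).

```python
# Hypothetical team: name -> fixed UTC offset in hours (ignores daylight saving).
TEAM_UTC_OFFSETS = {"ana": 1, "bo": -8, "cy": 8}

# Assumed local business hours: 09:00 up to, but not including, 17:00.
SHIFT_START, SHIFT_END = 9, 17


def covering(utc_hour, team=TEAM_UTC_OFFSETS):
    """Return the engineers whose local business hours include this UTC hour."""
    return [
        name
        for name, offset in team.items()
        if SHIFT_START <= (utc_hour + offset) % 24 < SHIFT_END
    ]


def coverage_gaps(team=TEAM_UTC_OFFSETS):
    """Return the UTC hours during which nobody is inside business hours."""
    return [hour for hour in range(24) if not covering(hour, team)]


# UTC hours with no coverage for the sample team.
print(coverage_gaps())

# A team in a single timezone leaves most of the day uncovered.
print(len(coverage_gaps({"ana": 1, "dee": 1})))
```

Any hour that shows up as a gap needs either a stretched shift or a conventional overnight on-call to fill it.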

Weekday/weekend split

Primary on-call covers weekdays; a separate engineer takes weekends. Reduces weekend disruptions. Works well when weekend volume is lower than weekday volume.

For most early-stage SaaS teams, a weekly rotation with a primary and a secondary (the escalation target) is the right starting point. It's simple and doesn't require complex tooling.
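
A weekly primary/secondary rotation is simple enough to generate with a few lines of code. The sketch below is one possible approach, not a prescribed one: the engineer names and start date are made up, and making this week's secondary next week's primary is a design choice assumed here so the incoming primary already has a week of context.

```python
from datetime import date, timedelta


def weekly_rotation(engineers, start, weeks):
    """Return a list of (week_start, primary, secondary) tuples.

    The secondary for one week becomes the primary for the next.
    """
    if len(engineers) < 2:
        raise ValueError("need at least two engineers for primary + secondary")
    schedule = []
    for i in range(weeks):
        primary = engineers[i % len(engineers)]
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append((start + timedelta(weeks=i), primary, secondary))
    return schedule


# Hypothetical four-person team, starting on a Monday.
team = ["ana", "bo", "cy", "dee"]
schedule = weekly_rotation(team, date(2024, 1, 1), 5)

for week_start, primary, secondary in schedule:
    print(f"{week_start.isoformat()}  primary={primary}  secondary={secondary}")
```

With four people, each engineer is primary one week in four and secondary one week in four.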


Configuring alerts for on-call

Alert fatigue is the on-call killer. If your on-call engineer gets 12 alerts a night, half of which are flapping monitors or non-critical warnings, they stop paying attention — and miss the one real incident buried in the noise.

The rule of thumb: if an alert doesn't require action, it shouldn't page.
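
As a rough sketch of that rule in code: route an alert to the pager only when it requires action and its monitor isn't flapping, and send everything else to a non-paging destination. The `Alert` fields, the flapping threshold, and the `"page"` / `"channel"` destinations are assumptions made for illustration, not the behavior of any particular alerting tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    name: str
    requires_action: bool  # does a human need to do something right now?
    recent_states: List[str] = field(default_factory=list)  # e.g. ["ok", "fail", "ok"]


def is_flapping(states, max_changes=3):
    """Treat a monitor as flapping if its state changed more than max_changes times."""
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return changes > max_changes


def route(alert):
    """Return "page" only for actionable, non-flapping alerts; otherwise "channel"."""
    if alert.requires_action and not is_flapping(alert.recent_states):
        return "page"
    return "channel"


# Hypothetical alerts.
db_down = Alert("primary db unreachable", True, ["ok", "ok", "fail"])
disk_warn = Alert("disk at 70%", False, ["ok", "warn"])
flappy = Alert("healthcheck timeout", True, ["ok", "fail", "ok", "fail", "ok"])

for alert in (db_down, disk_warn, flappy):
    print(alert.name, "->", route(alert))
```

A flapping monitor still points at something worth fixing, but it belongs in a queue for working hours rather than on the pager at 3 a.m.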

Before adding anything to your on-call alert channel:


The handoff process

A rotation without a handoff is just on-call theater. The handoff is where context gets transferred — and without it, the incoming on-call engineer starts cold.

A minimal but effective handoff includes:

  1. Ongoing issues: Anything that's not fully resolved, even if it's been stable for a few days
  2. Recent changes: Deploys, config changes, or dependency updates in the last week that could cause issues
  3. Known flaky monitors: Alerts that have been firing spuriously so the incoming engineer doesn't waste time investigating
  4. What to watch: Anything that's been trending in the wrong direction

Keep the handoff async — a Slack message or short doc is fine. It doesn't need to be a meeting.
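
If it helps to keep handoffs consistent, the four items above can be captured in a small template that renders to a plain-text message. This is a minimal sketch: the `Handoff` class, its field names, and the sample entries are hypothetical, and the output is ordinary text you could paste into Slack or a doc.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Handoff:
    outgoing: str
    incoming: str
    ongoing_issues: List[str] = field(default_factory=list)
    recent_changes: List[str] = field(default_factory=list)
    flaky_monitors: List[str] = field(default_factory=list)
    what_to_watch: List[str] = field(default_factory=list)

    def render(self):
        """Render the handoff as a plain-text message, one section per item type."""
        sections = [
            ("Ongoing issues", self.ongoing_issues),
            ("Recent changes", self.recent_changes),
            ("Known flaky monitors", self.flaky_monitors),
            ("What to watch", self.what_to_watch),
        ]
        lines = [f"On-call handoff: {self.outgoing} -> {self.incoming}"]
        for title, items in sections:
            lines.append(f"{title}:")
            # Write "none" explicitly so an empty section isn't mistaken for a skipped one.
            lines.extend(f"  - {item}" for item in (items or ["none"]))
        return "\n".join(lines)


# Hypothetical example entries.
note = Handoff(
    outgoing="ana",
    incoming="bo",
    ongoing_issues=["Queue lag on the billing worker, stable since Tuesday"],
    flaky_monitors=["Staging healthcheck times out intermittently"],
    what_to_watch=["Database disk usage is climbing"],
).render()

print(note)
```

Printing empty sections as "none" makes it obvious that the outgoing engineer considered each item rather than forgot it.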


Compensating fairly

On-call is work. If engineers are expected to respond to incidents outside business hours, that needs to be reflected in compensation, time off, or both.

Common approaches:

What doesn't work: pretending on-call is just "part of the job" with no acknowledgment. That approach leads to quiet resentment and eventual attrition.


The on-call setup checklist

Before going on-call

At each handoff

On-call doesn't have to be dreaded. With the right alert hygiene, clear processes, and fair compensation, it becomes a manageable part of operating a reliable product — rather than a source of burnout that drives good engineers away.
