
Response Time Monitoring: Why Speed Matters as Much as Availability

A monitor that only checks whether your site returns 200 misses a whole category of real problems. A page that takes 8 seconds to load is functionally down from a user's perspective — even though your uptime monitor is green.

Binary up/down monitoring is table stakes. Response time monitoring is the next level — and it catches a distinct class of incidents that availability checks miss entirely.


What response time monitoring actually measures

Every time your monitor makes an HTTP request to your URL, it records how long the whole request took, from the start of DNS resolution to the last byte received. This is the total response time: DNS resolution + TCP connection + TLS handshake + server processing + data transfer.
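To make that breakdown concrete, here is a rough sketch of how a single HTTPS check could be timed phase by phase using only Python's standard library. The `PhaseTimings` class and the `measure` function are hypothetical names made up for this illustration, and the code does not describe how any particular monitoring service measures requests. It assumes an `https://` URL and a server that honors `Connection: close`.

```python
import socket
import ssl
import time
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class PhaseTimings:
    """Per-phase durations in milliseconds for one HTTPS check."""
    dns_ms: float
    tcp_ms: float
    tls_ms: float
    server_ms: float    # request sent -> first response byte
    transfer_ms: float  # first response byte -> last response byte

    @property
    def total_ms(self) -> float:
        return (self.dns_ms + self.tcp_ms + self.tls_ms
                + self.server_ms + self.transfer_ms)


def measure(url: str, timeout: float = 10.0) -> PhaseTimings:
    """Time each phase of a single HTTPS GET (illustrative helper)."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 443
    path = (parts.path or "/") + (f"?{parts.query}" if parts.query else "")

    t0 = time.perf_counter()
    sockaddr = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4]
    t1 = time.perf_counter()                       # DNS resolved
    raw = socket.create_connection(sockaddr[:2], timeout=timeout)
    t2 = time.perf_counter()                       # TCP connected
    tls = ssl.create_default_context().wrap_socket(raw, server_hostname=host)
    t3 = time.perf_counter()                       # TLS handshake finished
    try:
        request = (f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"
                   "Connection: close\r\n\r\n")
        tls.sendall(request.encode("ascii"))
        tls.recv(1)                                # blocks until the first byte
        t4 = time.perf_counter()
        while tls.recv(65536):                     # drain until the server closes
            pass
        t5 = time.perf_counter()
    finally:
        tls.close()

    def ms(start: float, end: float) -> float:
        return (end - start) * 1000

    return PhaseTimings(ms(t0, t1), ms(t1, t2), ms(t2, t3), ms(t3, t4), ms(t4, t5))
```

Calling `measure("https://example.com/")` returns the five phases, and `total_ms` is their sum. Splitting the total this way helps when a slowdown needs explaining: a jump in `dns_ms` points somewhere very different from a jump in `server_ms`.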

For most web endpoints, a healthy response time is under 500ms. For APIs, under 200ms. These aren't hard rules — they depend on what the endpoint does — but they're reasonable baselines. When response times climb past 2–3 seconds, users notice. Past 5 seconds, many abandon.


What causes response time spikes?

Response time often climbs before an outage. A service that's about to fail usually degrades first (slower queries, memory pressure, connection pool exhaustion) before it falls over completely, so response time monitoring gives you early warning.
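One simple way to turn "degrades first" into a signal is to compare the most recent checks against the earlier ones. The sketch below is an assumption for illustration, not a standard algorithm: the function name `is_degrading`, the window of 5 checks, and the 2× factor are all made-up defaults. It uses medians so that a single slow check does not trigger it.

```python
from statistics import median


def is_degrading(samples_ms, recent=5, factor=2.0):
    """Return True when the recent median is well above the earlier median.

    samples_ms: response times in milliseconds, oldest first.
    recent:     how many of the latest checks count as "recent" (assumed default).
    factor:     how much slower "recent" must be to count (assumed default).
    """
    if len(samples_ms) < 2 * recent:
        return False  # not enough history to compare
    baseline = median(samples_ms[:-recent])
    current = median(samples_ms[-recent:])
    return current > factor * baseline


steady = [200] * 20 + [210, 190, 205, 200, 195]
climbing = [200] * 20 + [350, 420, 480, 560, 700]
one_blip = [200] * 20 + [200, 200, 3000, 200, 200]

print(is_degrading(steady))    # False
print(is_degrading(climbing))  # True
print(is_degrading(one_blip))  # False
```

A sustained climb trips the check while an isolated blip does not, which is roughly the difference between "something is going wrong" and normal variance.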

Common causes of response time spikes worth monitoring for:

| Cause | Pattern | What to look for |
|---|---|---|
| Database slowdown | Gradual climb or sudden spike | Affects all pages that query DB; correlates with traffic peaks |
| Memory pressure / GC | Periodic spikes, then recovery | Regular pattern, every N minutes |
| Upstream API latency | Correlated with third-party outages | Only affects endpoints that call external services |
| N+1 query bug | Grows linearly with data size | Gets worse over time as database grows |
| Caching failure | Sudden jump when cache invalidated | Coincides with deploys or cache expiry events |
| Traffic spike | Spike correlated with visitor spike | Often after launch, marketing email, or social mention |

How to set a response time threshold

A response time threshold alert fires when a check takes longer than N milliseconds. Set it too low and you'll get alerts on normal variance. Set it too high and you'll miss real degradation.

The right process:

  1. Establish a baseline. Look at your monitor's response time history over the last 30 days. What's the typical range? What's the P95 (the value that 95% of checks fall under)?
  2. Set the threshold at 2–3× P95. If your P95 is 300ms, a threshold of 600–900ms catches real spikes without firing on normal variance.
  3. Tune after the first false positive. If you get alerted for something that wasn't actually a problem, raise the threshold slightly. If you have an incident and response time didn't alert, lower it.
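The first two steps above can be sketched in a few lines. This is a minimal illustration, assuming you can export your response time history as a list of milliseconds. The function names and the default multiplier of 2.5 (the middle of the 2–3× range) are choices made for this example. P95 is computed with the nearest-rank method; other percentile definitions give slightly different values on small samples.

```python
import math


def p95(samples_ms):
    """Nearest-rank 95th percentile: the smallest sample that at least
    95% of checks fall at or under."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]


def suggest_threshold(samples_ms, multiplier=2.5):
    """Suggest an alert threshold as a multiple of P95 (2-3x per the steps above)."""
    return round(p95(samples_ms) * multiplier)


# Made-up history of 100 checks: mostly fast, a few slow outliers.
history = [150] * 60 + [220] * 30 + [300] * 5 + [450] * 3 + [900] * 2

print(p95(history))                # 300
print(suggest_threshold(history))  # 750
```

Step 3 then becomes a matter of nudging `multiplier` up after a false positive or down after a missed incident.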

Example thresholds by endpoint type

| Endpoint type | Threshold |
|---|---|
| Marketing homepage (cached) | 500ms |
| API health endpoint | 300ms |
| Dashboard / authenticated app pages | 2000ms |
| Checkout / payment flow | 1500ms |
| Search endpoint | 1000ms |
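If you manage monitors in code, the same thresholds fit naturally into a small lookup. The sketch below is illustrative only: the dictionary keys and the `exceeds_threshold` helper are hypothetical names, not part of any particular tool's configuration format.

```python
# Example thresholds from the list above, in milliseconds.
# The keys are made-up identifiers for this sketch.
THRESHOLDS_MS = {
    "marketing_homepage": 500,
    "api_health": 300,
    "dashboard": 2000,
    "checkout": 1500,
    "search": 1000,
}


def exceeds_threshold(endpoint_type, response_ms):
    """True when a check took longer than the threshold for its endpoint type."""
    return response_ms > THRESHOLDS_MS[endpoint_type]


print(exceeds_threshold("api_health", 350))  # True
print(exceeds_threshold("dashboard", 350))   # False
```

The same 350ms response is an alert for the health endpoint and perfectly normal for a dashboard page, which is why a single global threshold rarely works.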

Using response time history

Response time graphs are as valuable as the alerts. Even when you're not in an incident, the historical chart is worth checking.

PingBase stores 90 days of response time data per monitor. The graph in the monitor detail view shows both average response time and uptime bars, so you can correlate slowdowns with outages.
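If you keep your own check logs, a similar view can be approximated by rolling checks up per day. The sketch below is a guess at the general idea and does not describe PingBase's implementation. The `Check` record, the `daily_summary` function, and the choice to average only successful checks are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Optional


@dataclass
class Check:
    at: datetime
    up: bool
    response_ms: Optional[float]  # None when the check failed outright


def daily_summary(checks, today, retention_days=90):
    """Per-day average response time and uptime %, limited to the retention window."""
    cutoff = today - timedelta(days=retention_days)
    by_day = defaultdict(list)
    for check in checks:
        if check.at.date() > cutoff:
            by_day[check.at.date()].append(check)

    summary = {}
    for day in sorted(by_day):
        day_checks = by_day[day]
        # Average only successful checks so failures don't skew the latency line.
        times = [c.response_ms for c in day_checks
                 if c.up and c.response_ms is not None]
        summary[day] = {
            "avg_ms": sum(times) / len(times) if times else None,
            "uptime_pct": 100 * sum(c.up for c in day_checks) / len(day_checks),
        }
    return summary
```

Plotting `avg_ms` as a line and `uptime_pct` as bars on the same time axis gives the kind of chart where a slow morning followed by an afternoon outage is easy to spot.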


Response time vs availability: both matter

Some teams treat response time monitoring as a nice-to-have layered on top of availability monitoring. It's better to think of them as two different failure modes that require separate checks: the service doesn't respond or returns errors, or it responds but too slowly to be usable.

A service can fail either way. Running both gives you full coverage of the two most common production problem types.
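Put together, a single check result can land in one of three states. The sketch below is one possible way to express that, with assumptions labeled: the `Status` enum and `classify` function are hypothetical names, and treating any 2xx or 3xx status code as "available" is a simplification you may want to tighten.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    OK = "ok"
    SLOW = "slow"    # reachable, but over the response time threshold
    DOWN = "down"    # unreachable or returned an error status


def classify(status_code: Optional[int],
             response_ms: Optional[float],
             threshold_ms: float) -> Status:
    """Classify one check. Pass None for both values if the request failed outright."""
    if status_code is None or response_ms is None:
        return Status.DOWN
    if not 200 <= status_code < 400:  # assumption: 2xx/3xx count as available
        return Status.DOWN
    if response_ms > threshold_ms:
        return Status.SLOW
    return Status.OK


print(classify(200, 8000, threshold_ms=2000))    # Status.SLOW
print(classify(503, 120, threshold_ms=2000))     # Status.DOWN
print(classify(200, 180, threshold_ms=2000))     # Status.OK
```

The first call is the 8-second page from the introduction: an availability-only check would report it as healthy, while this one flags it as slow.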

Monitor both availability and response time

PingBase tracks response times on every check and alerts you when they exceed your threshold. Free for up to 5 monitors.
