SLA Monitoring: How to Track and Report Uptime Guarantees

Promising 99.9% uptime is easy. Knowing in real time whether you're on track, reporting that to customers with credibility, and getting warned before you breach — that's the work that makes an SLA real.

An uptime SLA (Service Level Agreement) is a contractual commitment to your customers: your service will be available X% of the time. If you fall below that threshold, customers may be entitled to service credits or compensation.

Most SaaS products with enterprise customers have some form of uptime SLA. Many smaller products include them in their Terms of Service without much thought. The problem comes when you actually need to honor one — or defend one. Was the outage 47 minutes or 62 minutes? Does scheduled maintenance count? What if your monitoring probe was down when the outage happened?

SLA monitoring turns your uptime data into a continuous, auditable record of whether you're meeting your commitments. Here's how it works and how to do it right.


The gap between promising an SLA and tracking one

Many companies add "99.9% uptime SLA" to their pricing page without a clear system for tracking it. When an incident happens, the conversation becomes adversarial: the customer thinks the outage was two hours, you think it was forty minutes. Neither of you has reliable, objective data.

Proper SLA monitoring closes this gap by establishing a continuous, auditable record of downtime that both sides can check.


How to calculate your uptime percentage

Uptime percentage is calculated over a rolling or calendar period — usually monthly:

// Monthly uptime calculation (30-day month)
total_minutes = 30 × 24 × 60 = 43,200
downtime_minutes = (sum of all confirmed outage durations)
uptime_pct = (total_minutes - downtime_minutes) / total_minutes × 100
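As a minimal sketch, here is the same calculation in Python. It uses the actual number of days in the calendar month rather than a fixed 30, which matters slightly for 28- and 31-day months. The function name and its inputs are illustrative assumptions, not part of any particular tool.

```python
import calendar

def monthly_uptime_pct(year: int, month: int, outage_minutes: list[float]) -> float:
    """Uptime percentage for a calendar month, given confirmed outage durations."""
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    downtime_minutes = sum(outage_minutes)
    return (total_minutes - downtime_minutes) / total_minutes * 100

# Example: two confirmed outages (12 and 31 minutes) in June 2024, a 30-day month
pct = monthly_uptime_pct(2024, 6, [12, 31])
print(f"{pct:.3f}%")
```

With 43 minutes of downtime in a 30-day month, the result sits just above 99.9%, which shows how little room a three-nines target leaves.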

For a 30-day month, the downtime budget at each SLA tier is:

SLA tier   Monthly budget   Annual budget   Remaining after a 10-minute incident
99.9%      43.2 min/mo      8.8 hrs/yr      33.2 min remaining
99.95%     21.6 min/mo      4.4 hrs/yr      11.6 min remaining
99.99%     4.3 min/mo       52.6 min/yr     Already breached

This is why real-time SLA tracking matters. A 10-minute incident looks small in isolation, but against a 99.99% SLA that single incident consumes more than twice your entire monthly downtime budget (10 minutes against 4.3).
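The budget figures for a 30-day month can be reproduced with a few lines of Python. This is a sketch under the same 30-day assumption; the function name is illustrative.

```python
def downtime_budget_minutes(sla_pct: float, period_minutes: float = 30 * 24 * 60) -> float:
    """Allowed downtime in minutes for a target such as 99.9 over the period."""
    return period_minutes * (100 - sla_pct) / 100

for target in (99.9, 99.95, 99.99):
    budget = downtime_budget_minutes(target)
    remaining = budget - 10  # after a single 10-minute incident
    if remaining >= 0:
        status = f"{remaining:.1f} min remaining"
    else:
        status = f"breached by {-remaining:.1f} min"
    print(f"{target}%: budget {budget:.1f} min, {status}")
```

Passing a different `period_minutes` (for example 365 × 24 × 60) gives the annual budget for the same target.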


What counts as downtime

Before you can track your SLA, you need a precise definition of "downtime." This belongs in your ToS or SLA document and it matters more than you might think.

Consecutive check failures

A single failed check from one location could be a transient network hiccup. Most SLAs define downtime as two or more consecutive failures from multiple monitoring locations — not a single blip. PingBase uses multi-region consensus: an incident opens only when multiple independent probes confirm the failure simultaneously.

Define this in your SLA: "Downtime means the service is unreachable from at least two independent monitoring locations for two or more consecutive check intervals."
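To make that definition concrete, here is a minimal Python sketch that applies it to raw check results. The data shape (one dictionary of location results per check interval) and the function name are assumptions for illustration; this is not how PingBase or any other tool is implemented. As a simplification, it counts any two failing locations per interval rather than requiring the same two throughout.

```python
def downtime_intervals(
    checks: list[dict[str, bool]],
    min_locations: int = 2,
    min_consecutive: int = 2,
) -> list[tuple[int, int]]:
    """Return (start, end_exclusive) index runs that qualify as downtime.

    checks[i] maps a location name to True if the check at interval i succeeded.
    """
    runs = []
    run_start = None
    for i, result in enumerate(checks):
        failed_locations = sum(1 for ok in result.values() if not ok)
        if failed_locations >= min_locations:
            if run_start is None:
                run_start = i  # a candidate downtime run begins here
        else:
            if run_start is not None and i - run_start >= min_consecutive:
                runs.append((run_start, i))
            run_start = None
    # Close a run that is still open at the end of the data
    if run_start is not None and len(checks) - run_start >= min_consecutive:
        runs.append((run_start, len(checks)))
    return runs

checks = [
    {"us": True, "eu": True, "ap": True},
    {"us": False, "eu": True, "ap": True},    # single-location blip, ignored
    {"us": False, "eu": False, "ap": True},   # two locations fail
    {"us": False, "eu": False, "ap": False},  # second consecutive failing interval
    {"us": True, "eu": True, "ap": True},
    {"us": False, "eu": False, "ap": True},   # only one interval, ignored
    {"us": True, "eu": True, "ap": True},
]
print(downtime_intervals(checks))  # [(2, 4)]
```

Multiplying the length of each run by your check interval gives the downtime duration to count against the SLA.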

Scheduled maintenance

Planned maintenance windows are usually excluded from SLA calculations. But "usually" doesn't protect you — it needs to be written explicitly: "Scheduled maintenance communicated 48 hours in advance is excluded from downtime calculations."

In PingBase, you can create scheduled maintenance windows. During these windows, monitors are paused and no incidents are opened — the time is excluded from your uptime percentage calculations automatically.
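If you compute SLA figures yourself from raw outage data, the exclusion amounts to subtracting any overlap between an outage and a maintenance window. The sketch below assumes timezone-consistent timestamps and maintenance windows that do not overlap each other; the names are illustrative. Whether maintenance time also comes out of the total period is a contract decision that this sketch leaves aside.

```python
from datetime import datetime, timedelta

Interval = tuple[datetime, datetime]  # (start, end)

def overlap(a: Interval, b: Interval) -> timedelta:
    """Length of time two intervals share (zero if they do not intersect)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(end - start, timedelta(0))

def countable_downtime(outages: list[Interval], maintenance: list[Interval]) -> timedelta:
    """Total outage time minus any portion inside a maintenance window."""
    total = timedelta(0)
    for outage in outages:
        excluded = sum((overlap(outage, w) for w in maintenance), timedelta(0))
        total += (outage[1] - outage[0]) - excluded
    return total

# A 30-minute outage that starts 10 minutes before a maintenance window opens
outages = [(datetime(2024, 6, 10, 2, 50), datetime(2024, 6, 10, 3, 20))]
maintenance = [(datetime(2024, 6, 10, 3, 0), datetime(2024, 6, 10, 4, 0))]
print(countable_downtime(outages, maintenance))  # 0:10:00
```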

Partial degradation

What if your app loads but is extremely slow? What if one feature is broken but others work? Your SLA should address partial availability. Many SLAs use tiered credits: severe degradation counts as partial downtime weighted at 50%. Define your tiers in the contract, and make sure your monitoring captures response time — not just availability — so you can classify incidents appropriately.
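One way to express tiered weighting in code is sketched below. The 50% weight follows the example above; the tier names, the 5-second slowness threshold, and the function names are assumptions you would replace with whatever your contract defines.

```python
from dataclasses import dataclass

# Hypothetical tier weights. The 50% weight for severe degradation follows
# the example in the text; real weights come from your contract.
WEIGHTS = {"outage": 1.0, "severe_degradation": 0.5}

@dataclass
class Incident:
    kind: str       # a key of WEIGHTS
    minutes: float  # confirmed duration

def classify(reachable: bool, response_ms: float, slow_ms: float = 5000.0) -> str:
    """Label one check result; slow_ms is an assumed threshold, not a standard."""
    if not reachable:
        return "outage"
    if response_ms > slow_ms:
        return "severe_degradation"
    return "healthy"

def weighted_downtime(incidents: list[Incident]) -> float:
    """Sum incident minutes, scaled by the weight of each tier."""
    return sum(WEIGHTS[i.kind] * i.minutes for i in incidents)

incidents = [Incident("outage", 20), Incident("severe_degradation", 30)]
print(weighted_downtime(incidents))  # 35.0
```

Here a 20-minute outage plus 30 minutes of severe degradation counts as 35 minutes against the budget.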


Early warning: the SLA budget alarm

The most underused SLA feature is the budget alarm — an alert that fires before you breach, not after.

If your SLA allows 43 minutes of downtime per month and you've had 35 minutes of confirmed downtime by the 15th, you have 8 minutes of budget remaining for the rest of the month. You should know that. Your on-call engineer should know that. One more incident lasting longer than 8 minutes would breach your SLA and trigger customer credits.

PingBase's SLA tracking shows your current-month downtime budget in real time: how much you've consumed, how much remains, and a projection based on your recent incident rate. You can set an alert threshold — "notify me when I've consumed 75% of my monthly SLA budget" — so you get early warning before the breach, not a support ticket after.
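The logic behind a budget alarm is simple enough to sketch. The version below uses a naive linear projection (downtime keeps accruing at the month-to-date rate), which is an assumption for illustration and not necessarily how PingBase projects; the function and field names are hypothetical.

```python
def budget_status(
    budget_minutes: float,
    consumed_minutes: float,
    day_of_month: int,
    days_in_month: int,
    alert_threshold: float = 0.75,
) -> dict:
    """Summarize downtime budget usage and decide whether to alert."""
    consumed_fraction = consumed_minutes / budget_minutes
    # Linear projection: assume the month-to-date downtime rate continues
    projected = consumed_minutes / day_of_month * days_in_month
    return {
        "consumed_fraction": consumed_fraction,
        "remaining_minutes": budget_minutes - consumed_minutes,
        "projected_month_end_minutes": projected,
        "alert": consumed_fraction >= alert_threshold,
        "projected_breach": projected > budget_minutes,
    }

# 35 minutes consumed of a 43.2-minute budget by the 15th of a 30-day month
status = budget_status(43.2, 35, day_of_month=15, days_in_month=30)
for key, value in status.items():
    print(key, value)
```

In this example about 81% of the budget is gone at mid-month, so a 75% threshold would already have fired, and the projection points to a breach if nothing changes.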


Reporting SLA performance to customers

There are three ways customers typically consume SLA data, each appropriate for different relationships:

Public status page with uptime history

Your public status page should show historical uptime percentages — typically the last 90 days broken into daily or weekly bars. This is the lowest-friction SLA reporting: customers can check it without contacting you, and it creates accountability. If you claimed 99.9% and your status page shows three multi-hour incidents last month, customers will notice.

PingBase status pages display 90-day uptime history with daily resolution. Each bar represents one day's uptime percentage. Incidents are listed with start time, end time, duration, and update timeline.
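If you build this kind of history yourself, each daily bar is just that day's uptime percentage, with incidents that cross midnight split between the two days. A minimal sketch, assuming naive timestamps in a single timezone and incidents that do not overlap each other (the function name is illustrative):

```python
from datetime import date, datetime, time, timedelta

def daily_uptime(
    incidents: list[tuple[datetime, datetime]],
    first_day: date,
    num_days: int = 90,
) -> dict[date, float]:
    """Map each day to its uptime percentage."""
    result = {}
    for offset in range(num_days):
        day = first_day + timedelta(days=offset)
        day_start = datetime.combine(day, time.min)
        day_end = day_start + timedelta(days=1)
        down = timedelta(0)
        for start, end in incidents:
            # Portion of this incident that falls inside the current day
            inside = min(end, day_end) - max(start, day_start)
            if inside > timedelta(0):
                down += inside
        result[day] = 100 * (1 - down / timedelta(days=1))
    return result

# One hour-long incident that straddles midnight
incidents = [(datetime(2024, 6, 1, 23, 30), datetime(2024, 6, 2, 0, 30))]
bars = daily_uptime(incidents, date(2024, 6, 1), num_days=3)
for day, pct in bars.items():
    print(day, f"{pct:.2f}%")
```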

Monthly SLA reports

For enterprise customers, monthly SLA reports are standard. These are formal documents, delivered by email or made available in the customer portal, that summarize the month's uptime against the SLA target.

PingBase scheduled reports can be configured to deliver monthly uptime summaries automatically — to you, your team, or directly to customer email addresses. The report data comes from your monitoring history, not from manual compilation.

Real-time SLA dashboard via API

Larger customers may want to pull your uptime data directly into their own systems. PingBase's public API exposes monitor uptime data and incident history so customers can integrate it into their vendor management dashboards without depending on your status page.


Setting up SLA tracking in PingBase

SLA tracking in PingBase is configured per monitor or monitor group:

  1. Open a monitor's settings and scroll to the SLA section
  2. Set your SLA target (e.g., 99.9%)
  3. Configure your SLA window — rolling 30 days or calendar month
  4. Set a budget alert threshold — we recommend alerting at 75% budget consumed
  5. Optionally, configure scheduled maintenance windows to exclude from calculations

Once configured, your monitor dashboard shows a live SLA gauge: current uptime percentage, downtime budget consumed, and budget remaining. At month-end, the data rolls into your history for audit purposes.


Common SLA tracking mistakes

Measuring from a single location

A single monitoring probe gives you one data point. If that probe has a network hiccup, it looks like your service is down — or conversely, if your service is down in one region, the probe might not see it. Multi-region monitoring from 3+ independent locations is the minimum for credible SLA data.

Not separating your monitoring infrastructure from your product

If your monitoring runs on the same servers as your product, a major outage will take down both simultaneously. External monitoring services like PingBase are hosted completely independently from your infrastructure — so your monitoring data is reliable even when your product is fully down.

Discovering a breach from a customer complaint

If a customer is the first to tell you that you've breached your SLA, your monitoring setup failed. Proactive SLA alerts, such as "you've consumed 75% of your monthly downtime budget," are the difference between managing your SLA and reacting to it.

Not including SLA data in incident postmortems

Every significant incident should include a section on SLA impact: how many minutes of downtime were incurred, what percentage of the monthly budget this consumed, and whether the incident triggered a credit obligation. This connects operational work to customer commitments and helps prioritize future reliability improvements.

Track your SLA commitments automatically

PingBase SLA tracking shows your real-time uptime percentage and remaining budget, and sends early warnings before you breach. Free for up to 5 monitors.

