
Operations April 5, 2026 12 min read

Build a Complete Monitoring Stack for Under $50/Month

Enterprise observability stacks cost thousands per month and take weeks to configure. But an indie hacker or small team can assemble a complete monitoring stack (uptime, logs, errors, and performance) for under $50/month, most of it free. Here's exactly what to use and how to connect it all.

Observability is the practice of understanding what your system is doing from the outside — through the signals it emits. There are four main pillars: uptime monitoring (is it up?), logging (what happened?), error tracking (what broke?), and performance monitoring (how fast?). Each covers different failure modes, and together they form a complete picture.

Large engineering organizations spend five figures per month on tools like Datadog, Splunk, and New Relic. But most of what those tools do is available on free tiers of focused tools — if you know which ones to combine.

This guide assembles a full stack for a typical indie SaaS running on a modern edge-first or serverless architecture (Cloudflare Workers, Vercel, Railway, Fly.io, or a small VPS). Adjust to your infrastructure as needed.

Total monthly cost

Tool	Purpose	Monthly cost
PingBase Pro	Uptime + status page	$9
Grafana Cloud	Logs, metrics, dashboards	$0
Sentry	Error tracking	$0
Checkly	Synthetic end-to-end checks	$0
Total		$9/month

Free tiers are sufficient for most indie products under ~10k MAU. Costs increase as you scale.

Layer 1: Uptime monitoring — PingBase ($9/month)

Uptime monitoring is the foundation of any observability stack. It answers the most basic question: is my service reachable? Without it, you're blind to the most visible failure mode — complete unavailability.

What PingBase covers

HTTP monitors — checks your URLs every minute, verifies status codes and response content
SSL certificate monitoring — alerts you 30 days before a certificate expires
Heartbeat / cron monitoring — detects when scheduled jobs stop running
TCP / port monitoring — checks that ports are accepting connections
Public status page — gives users a place to check your status and subscribe to updates
Multi-region checks — eliminates false positives from single-location network issues
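
Heartbeat monitoring inverts the usual check: the job reports in, and the monitor alerts when the pings stop. Here is a minimal sketch of the job side, assuming Node 18+ (for the built-in fetch) and a ping URL supplied by your monitoring tool. The URL below is a placeholder, not a documented PingBase endpoint, and `runWithHeartbeat` is a hypothetical helper name.

```javascript
// Placeholder heartbeat URL; use the one your monitoring tool gives you.
const HEARTBEAT_URL = 'https://example.com/heartbeat/your-monitor-id';

// Runs a job and pings the heartbeat URL only if the job succeeds.
// If the job throws, no ping is sent, so the monitor sees a missed heartbeat.
async function runWithHeartbeat(job, pingUrl = HEARTBEAT_URL, fetchImpl = fetch) {
  await job(); // errors propagate on purpose: a failed job must not ping
  const res = await fetchImpl(pingUrl, { method: 'GET' });
  return res.ok;
}
```

A nightly backup script could call `runWithHeartbeat(runBackup)` as its last step, where `runBackup` is your own async function. Pinging only after success means a crashed job and a job that never started look the same to the monitor, which is usually what you want.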

What to set up on day one

HTTP monitor on your homepage
HTTP monitor on your API health endpoint (/health or /api/health)
HTTP monitor on your login page
SSL monitor on your main domain
Heartbeat monitor on your cron job (backup, invoice generation, etc.)
Public status page linked from your app footer

Alerts go to email (immediate) and Slack (team visibility). The free tier covers 5 monitors — enough to start. Pro ($9/month) gives you unlimited monitors, custom domain for your status page, and 1-minute check frequency.
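
If your API does not have a health endpoint yet, the sketch below shows one way to add it using only Node's built-in http module. The function names and the shape of the JSON body are assumptions for illustration, and `runChecks` stands in for whatever dependency checks you write yourself (a database ping, for example).

```javascript
const http = require('node:http');

// Turns dependency check results into an HTTP status and JSON body.
// `checks` maps a dependency name to true (healthy) or false (failing).
function buildHealthResponse(checks) {
  const failing = Object.keys(checks).filter((name) => !checks[name]);
  return {
    status: failing.length === 0 ? 200 : 503,
    body: { ok: failing.length === 0, failing },
  };
}

// Creates (but does not start) a server that answers GET /health.
// `runChecks` is a hypothetical async function you supply.
function createHealthServer(runChecks) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'GET' || req.url !== '/health') {
      res.writeHead(404).end();
      return;
    }
    let checks;
    try {
      checks = await runChecks();
    } catch (err) {
      // A check that throws counts as a failure rather than crashing the server.
      checks = { runChecks: false };
    }
    const { status, body } = buildHealthResponse(checks);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

To try it locally: `createHealthServer(async () => ({ database: true })).listen(3000)`. Returning 503 when a dependency fails lets a status-code monitor catch partial outages that a bare "server is up" response would hide.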

For a full breakdown of what to monitor, see The Ultimate Website Monitoring Checklist for 2026.

Layer 2: Logs — Grafana Cloud (free tier)

Logs answer "what happened?" Uptime monitoring tells you that your API returned a 500 at 2:13am — but logs tell you which request it was, what parameters it had, what line of code it hit, and what the stack trace was. You need both.

What Grafana Cloud offers on the free tier

14 days of log retention
50 GB of log ingestion per month
Grafana Loki for log querying (LogQL syntax)
Grafana dashboards for visualization
Hosted Prometheus for metrics
3 users included

50 GB/month is generous for a small product. Even with verbose logging, most indie SaaS apps produce well under 1 GB of log data per month at early stages.

How to ship logs to Grafana Loki

The exact integration depends on your runtime:

Node.js / Express / Fastify. Use the winston-loki transport or pino-loki. These push log entries directly to the Loki HTTP API.
Cloudflare Workers. Workers don't have persistent processes, so you can't run a log agent. Instead, use the Cloudflare Logpush feature to forward Worker logs to a Loki-compatible endpoint, or use the Grafana Cloud integration in the Cloudflare dashboard.
Vercel / serverless. Vercel has a native log drain integration that can forward to an HTTP endpoint. Point it at a Grafana Alloy instance running on a small VPS, or use a log proxy service.
VPS / Docker. Run Grafana Alloy (formerly Grafana Agent) as a sidecar or system service. It tails log files and ships to Loki automatically.
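
To see what these transports send, or to ship logs from a runtime where none of them fit, you can call Loki's push API yourself. The sketch below assumes Node 18+ and basic-auth credentials; `baseUrl`, `user`, and `token` are placeholders for the values shown in your own Grafana Cloud stack, and the helper names are hypothetical.

```javascript
// Builds a request body for Loki's push API (POST /loki/api/v1/push).
// `labels` is an object of stream labels; `entries` is a list of
// { timeMs, line } pairs. Loki expects nanosecond timestamps as strings.
function buildLokiPayload(labels, entries) {
  return {
    streams: [
      {
        stream: labels,
        values: entries.map(({ timeMs, line }) => [
          (BigInt(timeMs) * 1000000n).toString(),
          line,
        ]),
      },
    ],
  };
}

// Sends the payload using HTTP basic auth and throws on a non-2xx response.
async function pushToLoki(baseUrl, user, token, payload, fetchImpl = fetch) {
  const auth = Buffer.from(`${user}:${token}`).toString('base64');
  const res = await fetchImpl(`${baseUrl}/loki/api/v1/push`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Basic ${auth}`,
    },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`Loki push failed with status ${res.status}`);
}
```

In practice you would batch entries and push every few seconds rather than once per log line. Keep the label set small (app, environment) and put request-specific detail in the log line itself.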

What to log

Every HTTP request: method, path, status code, duration, user ID (if authenticated)
Every error: message, stack trace, request context
Every background job: start time, end time, outcome, any errors
Authentication events: login, logout, failed login attempts
Payment events: charge attempt, success, failure, refund

Don't log sensitive data (passwords, full credit card numbers, PII) — but log enough to reconstruct what a user did and what the system returned.
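
One way to hold both rules at once is to build each log line through a single function that masks known-sensitive fields. This is a sketch under assumptions: the key list, the field names, and the helper names are illustrative, not a complete PII policy.

```javascript
// Keys whose values should never reach the log pipeline (compared in lowercase).
// This list is an illustrative assumption; extend it for your own data.
const SENSITIVE_KEYS = new Set(['password', 'cardnumber', 'token', 'authorization']);

// Returns a shallow copy with sensitive values masked.
function redact(fields) {
  const out = {};
  for (const [key, value] of Object.entries(fields)) {
    out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value;
  }
  return out;
}

// Builds one JSON log line for a completed HTTP request.
function requestLogLine({ method, path, status, durationMs, userId = null, extra = {} }) {
  return JSON.stringify({
    level: status >= 500 ? 'error' : 'info',
    time: new Date().toISOString(),
    method,
    path,
    status,
    durationMs,
    userId,
    ...redact(extra),
  });
}
```

Emitting JSON rather than free text pays off later, because Loki can parse the fields at query time and filter on `level` or `status` directly.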

How logs and uptime monitoring complement each other

When PingBase alerts you that your API returned a 500, your first step is to open Grafana Loki and query for error-level logs in the minute surrounding the alert timestamp. Within 30 seconds, you'll have the stack trace, the affected endpoint, and the likely cause. Without logs, you're debugging blind.
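
The "minute surrounding the alert" query can be prepared ahead of time. The sketch below builds parameters for Loki's range query endpoint (GET /loki/api/v1/query_range). It assumes your logs carry an `app` label and are JSON lines with a `level` field; both are assumptions about your setup, and `buildIncidentQuery` is a hypothetical helper.

```javascript
// Builds query parameters covering a window on either side of an alert time.
function buildIncidentQuery(alertTimeIso, windowSeconds = 60, app = 'api') {
  const alertMs = Date.parse(alertTimeIso);
  if (Number.isNaN(alertMs)) throw new Error(`Invalid timestamp: ${alertTimeIso}`);
  // Loki accepts start and end as Unix epoch nanoseconds.
  const toNs = (ms) => (BigInt(ms) * 1000000n).toString();
  return new URLSearchParams({
    // LogQL: select the app's stream, parse JSON, keep error-level lines.
    query: `{app="${app}"} | json | level="error"`,
    start: toNs(alertMs - windowSeconds * 1000),
    end: toNs(alertMs + windowSeconds * 1000),
    limit: '200',
  });
}
```

Appending `buildIncidentQuery('2026-04-05T02:13:00Z').toString()` to your Loki base URL plus `/loki/api/v1/query_range?` gives a request you can run with the same credentials used for log shipping. The same LogQL expression can be pasted into the Grafana Explore view.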

Layer 3: Error tracking — Sentry (free tier)

Logs capture what happened in your system. Error tracking captures what broke in your code — with a user-facing view, a stack trace with source map resolution, breadcrumbs showing what the user did before the error, and a count of how many users were affected.

What Sentry offers on the free tier

5,000 errors per month
1 user (free) — fine for solo founders
JavaScript, Python, Go, Rust, and most other languages via SDKs
Source map support for minified JavaScript
Performance monitoring for up to 10 transactions/second (sampling)
30 days of error retention

5,000 errors per month is plenty for an early-stage product. If you're generating more errors than that, you have a code quality problem that's more urgent than monitoring costs.

Setting up Sentry

Sentry integration typically takes 10 to 15 minutes:

Create a Sentry project and get your DSN
Install the Sentry SDK: npm install @sentry/node for Node.js, @sentry/react for React frontend
Initialize Sentry at the entry point of your app with your DSN
For React: wrap your root component in Sentry.ErrorBoundary
Upload source maps during your build process for human-readable stack traces
Set up Sentry alerts to post to your Slack channel for new error types
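
The initialization step can be sketched as a small options builder. `dsn`, `environment`, `tracesSampleRate`, and `beforeSend` are standard Sentry SDK options; the environment variable names, the 10% sample rate, and the scrubbing rules are assumptions to adapt.

```javascript
// Builds the options object to pass to Sentry.init().
function buildSentryOptions(env = process.env) {
  return {
    dsn: env.SENTRY_DSN,
    environment: env.NODE_ENV || 'development',
    // Sample 10% of transactions to limit performance-monitoring volume.
    tracesSampleRate: 0.1,
    // Strip cookies and the Authorization header before an event is sent.
    beforeSend(event) {
      if (event.request) {
        delete event.request.cookies;
        const headers = event.request.headers || {};
        for (const name of Object.keys(headers)) {
          if (name.toLowerCase() === 'authorization') delete headers[name];
        }
      }
      return event;
    },
  };
}

// In your entry point (requires `npm install @sentry/node`):
//   const Sentry = require('@sentry/node');
//   Sentry.init(buildSentryOptions());
```

Keeping the options in a plain function makes the scrubbing logic easy to unit test without sending anything to Sentry.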

Sentry vs logs: which to check first?

Sentry is better for JavaScript exceptions and user-facing errors — it gives you the full React component tree, the user actions that led up to the error (breadcrumbs), and whether the same error is affecting 1 user or 1,000. Logs are better for server-side request-level debugging. In practice, you'll usually start with the PingBase alert, open Sentry to check if there's a correlated error spike, then open Loki if you need more context.

Layer 4: Synthetic end-to-end checks — Checkly (free tier)

Uptime monitoring checks that your endpoints respond. Synthetic monitoring checks that your user flows work — that a user can sign up, log in, create a resource, and complete a checkout from start to finish. These are different problems.

What Checkly offers on the free tier

3 browser checks (Playwright-based)
5 API checks
Checks run every 10 minutes
Alert notifications via email and Slack

Three browser checks are enough to cover your most critical user flows: signup, login, and your core product action (create first resource, start trial, etc.).

Setting up Checkly

Checkly uses Playwright syntax for browser checks. A basic login check looks like:

const { test, expect } = require('@playwright/test');

test('User can log in', async ({ page }) => {
  // Open the login page
  await page.goto('https://app.yourproduct.com/login');
  // Fill in test-account credentials read from environment variables
  await page.fill('[data-testid="email"]', process.env.CHECK_EMAIL);
  await page.fill('[data-testid="password"]', process.env.CHECK_PASSWORD);
  await page.click('[data-testid="submit"]');
  // A successful login should redirect to the dashboard and render its heading
  await expect(page).toHaveURL(/dashboard/);
  await expect(page.locator('h1')).toContainText('Dashboard');
});

Checkly runs this check from multiple locations every 10 minutes, alerts you when it fails, and shows you a screenshot and video recording of the failure. This catches a class of issues that HTTP monitoring can't: when your server returns 200 but the page content is wrong, the JavaScript errors out on render, or the login form submits but the redirect fails.

What flows to check

Signup flow. Can a new user register, verify email, and land on the dashboard?
Login flow. Can an existing user log in?
Core action. Can a logged-in user perform the primary action your product exists for?

These three checks cover the majority of "the product is broken" scenarios.

Connecting the stack: the incident workflow

Having four tools is only useful if they work together. Here's the incident workflow this stack enables:

PingBase detects failure. Your API health endpoint returns 500 at 2:13am. PingBase fires after 2 consecutive failures — alert lands in Slack at 2:14am.
Check Sentry. Open Sentry and look at the error spike around 2:13am. You see 47 errors of the same type: DatabaseConnectionError: connection pool exhausted.
Check Grafana Loki. Query for error logs around 2:13am. See the exact query that triggered the pool exhaustion: a poorly indexed query that started running at 2:10am and held connections open.
Fix and deploy. Kill the offending query, add the index, deploy.
PingBase sends recovery alert. 2:31am — service is back. 18 minutes of downtime captured with exact timestamps.
Post incident update to status page. Update the PingBase status page with a timeline. Users who subscribed to status notifications get an email: "The incident affecting API performance has been resolved."

From first alert to resolution, every step has supporting data. You never debug blind.
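
Two pieces of arithmetic in this timeline are worth making explicit: why the alert arrives a minute after the first failure, and how the 18 minutes is derived. The sketch below illustrates the general "N consecutive failures" rule; it is not PingBase's actual implementation, and the date in the usage note is arbitrary.

```javascript
// Returns true once the most recent `threshold` check results are all failures.
// `results` is ordered oldest to newest; true means the check passed.
function shouldAlert(results, threshold = 2) {
  if (results.length < threshold) return false;
  return results.slice(-threshold).every((passed) => !passed);
}

// Whole minutes between two ISO 8601 timestamps.
function downtimeMinutes(startIso, endIso) {
  return Math.round((Date.parse(endIso) - Date.parse(startIso)) / 60000);
}
```

With one-minute checks, a failure at 2:13 gives `shouldAlert([true, false])` as false, and the second failure at 2:14 gives `shouldAlert([true, false, false])` as true, which matches the alert landing at 2:14am. `downtimeMinutes('2026-04-05T02:13:00Z', '2026-04-05T02:31:00Z')` returns 18.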

Optional: metrics — Grafana Prometheus (included in Grafana Cloud)

Grafana Cloud's free tier includes Prometheus for metrics storage and Grafana dashboards for visualization. If you're running a Node.js server, you can expose a /metrics endpoint using prom-client and scrape it with Grafana Alloy.

Useful metrics to track:

Request rate (requests/second per endpoint)
Error rate (% of requests returning 4xx or 5xx)
P95 and P99 response time by endpoint
Database connection pool utilization
Background job queue depth and processing rate
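
To make the error rate and P95 definitions concrete, here is a plain JavaScript sketch that computes both from raw samples. It is only meant to show what the numbers mean. It is not how prom-client or Prometheus calculate them, since Prometheus histograms estimate percentiles from buckets rather than from individual samples.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('No samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Percentage of responses with a 4xx or 5xx status code.
function errorRate(statusCodes) {
  if (statusCodes.length === 0) return 0;
  const errors = statusCodes.filter((code) => code >= 400).length;
  return (errors / statusCodes.length) * 100;
}
```

For example, `errorRate([200, 200, 404, 500])` is 50. You may prefer to chart 5xx separately from 4xx, since client errors often reflect bad input rather than a broken service.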

For serverless or edge runtimes that don't support persistent Prometheus scraping, you can push metrics directly to the Grafana Cloud Prometheus remote write endpoint on each request. The overhead is small.

When to upgrade beyond free tiers

The free tier stack handles most indie products comfortably to $5k–$10k MRR. Here's when you'll need to upgrade:

Tool	When to upgrade	Paid tier starts at
PingBase	When you need more than 5 monitors or custom domain for status page	$9/month
Grafana Cloud	When you need >50GB logs/month, >14 days retention, or >3 users	~$8/month (pay as you go)
Sentry	When you need >5k errors/month, multiple team members, or longer retention	$26/month (Team)
Checkly	When you need >3 browser checks or 5-minute check intervals	$20/month

At full paid tiers, this stack costs roughly $63/month — still dramatically less than a comparable Datadog setup at similar scale, which would run $150–$400+/month for the same coverage.
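
The two totals can be checked with a few lines of arithmetic. The prices are the ones quoted in this article; the Grafana Cloud figure is approximate because that tier is pay as you go.

```javascript
// Monthly prices in USD: what the starting stack costs, and the entry paid tier.
const tools = [
  { name: 'PingBase', starting: 9, paid: 9 },
  { name: 'Grafana Cloud', starting: 0, paid: 8 },
  { name: 'Sentry', starting: 0, paid: 26 },
  { name: 'Checkly', starting: 0, paid: 20 },
];

// Sums one price column across all tools.
const total = (key) => tools.reduce((sum, tool) => sum + tool[key], 0);

console.log(`Starting stack: $${total('starting')}/month`);
console.log(`All paid tiers: $${total('paid')}/month`);
```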

The 30-minute setup plan

Here's a realistic order of operations to get the full stack live in one session:

PingBase (5 min): Sign up, add your homepage + API health monitors, configure email alerts, create a status page. Link the status page from your app footer.
Sentry (10 min): Create project, install SDK, add to your app entry point. Deploy. Trigger a test error to confirm it arrives.
Grafana Cloud (10 min): Create free account, configure a log shipping integration for your runtime, verify logs are arriving in Loki. Set up one dashboard showing error rate and request count.
Checkly (5 min): Create free account, write a login check using the Playwright editor, enable Slack notifications. Verify the check runs successfully.

That's it. In 30 minutes you go from zero observability to a production-grade stack that will catch the vast majority of real-world issues before users escalate them to you.

Start the stack with PingBase — it's free

PingBase is the uptime monitoring and status page layer of this stack. Free for up to 5 monitors. Takes 5 minutes to set up. No credit card required.
