Guided Lab Brief

Monitoring & Observability

Add the three pillars of observability (metrics, logs, and traces) to your distributed system.

Overview

You can't fix what you can't see.

You will work through 5 architecture steps that model production dependencies.

You will run 1 failure experiment to observe bottlenecks and recovery behavior.

Success target: Metrics, logs, and traces are all connected, and alerts are configured for key SLOs.

Learning Objectives

  • Understand the three pillars of observability: metrics, logs, and traces
  • Know how to set up alerting that is actionable, not noisy
  • Learn the trade-offs of trace sampling
  • Experience the difference between monitored and unmonitored systems

Experiments

  1. Set the tracing sample rate to 100% to see the overhead

Failure Modes to Trigger

  • Trigger: Set the tracing sample rate to 100% to see the overhead

    Observe: At 100% sampling and 5,000 rps, you generate 5,000 traces per second, and each trace has multiple spans. Storage grows rapidly, the network can saturate shipping trace data, and the tracing backend itself may overload.
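
To get a feel for the numbers before running the experiment, the arithmetic above can be sketched in a few lines of Python. This is a rough back-of-the-envelope estimate, not a measurement. The 5,000 rps figure comes from the lab; the spans-per-trace and bytes-per-span defaults (8 and 500) and the names `TraceLoad` and `estimate_trace_load` are hypothetical values chosen for illustration, so substitute what your own system emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceLoad:
    """Estimated tracing volume produced by a service."""
    traces_per_sec: float
    spans_per_sec: float
    bytes_per_sec: float

    @property
    def gib_per_day(self) -> float:
        # 86,400 seconds per day, 2**30 bytes per GiB.
        return self.bytes_per_sec * 86_400 / 2**30


def estimate_trace_load(
    rps: float,
    sample_rate: float,
    spans_per_trace: int = 8,    # ASSUMPTION: illustrative, not from the lab
    bytes_per_span: int = 500,   # ASSUMPTION: illustrative, not from the lab
) -> TraceLoad:
    """Estimate trace, span, and byte throughput for a given sample rate."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    traces = rps * sample_rate
    spans = traces * spans_per_trace
    return TraceLoad(traces, spans, spans * bytes_per_span)


if __name__ == "__main__":
    # Compare full sampling with two reduced rates at the lab's 5,000 rps.
    for rate in (1.0, 0.1, 0.01):
        load = estimate_trace_load(rps=5000, sample_rate=rate)
        print(
            f"sample_rate={rate:>4}: "
            f"{load.traces_per_sec:,.0f} traces/s, "
            f"{load.spans_per_sec:,.0f} spans/s, "
            f"{load.bytes_per_sec / 1e6:,.2f} MB/s, "
            f"{load.gib_per_day:,.1f} GiB/day"
        )
```

Under these assumed sizes, 100% sampling at 5,000 rps works out to 40,000 spans per second and about 20 MB per second of trace data, while every tenfold cut in the sample rate cuts that volume tenfold. The trade-off is that lower rates also drop most traces for rare, slow, or failing requests, which is the tension the experiment is meant to surface.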