How Do You Know It's Broken?

Monitoring, Alerting, and Freshness SLAs

The interviewer asks this to test whether you've actually operated a pipeline in production. The trap: talking about Grafana dashboards. The senior answer names four pillars of monitoring (freshness, volume, schema, and quality) and explains which failure each one catches. Start with: 'I monitor four things: is the data fresh, is the volume expected, has the schema changed, and are the values valid?'

The follow-up will be: 'What's your most important monitor?' The answer is freshness. A freshness SLA is a contract: 'this table will be updated within N hours of the source data landing.' It's the simplest monitor to add, catches the most common failure (the pipeline didn't run), and gives you credibility when you say you've operated real systems. The interv