Observability & Monitoring for .NET Apps

Observability is what lets you ask a question about a production system that nobody anticipated when the dashboard was built. A monitoring stack that answers only the questions its authors thought to ask is not observability — it is a fixed query set with autocomplete. The articles in this collection treat the three signals — traces, structured logs, metrics — as the minimum surface a .NET workload has to expose for that arbitrary-question property to hold.

The logging articles start from the assumption that ILogger output can look correct while silently lying. BeginScope that does nothing because the provider does not implement scopes or has them switched off. Structured templates that emit plain strings because the sink was misconfigured. Exception chains flattened to a single message because the formatter swallowed InnerException. The content names each failure mode and shows the configuration that produces actual structured output downstream.
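A minimal sketch of those three failure modes side by side, assuming a console app that references the Microsoft.Extensions.Logging.Console package. The category name "Orders", the TenantId scope value, and the order data are hypothetical and exist only for illustration.

```csharp
// Assumes the Microsoft.Extensions.Logging.Console NuGet package is referenced.
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

using ILoggerFactory factory = LoggerFactory.Create(builder =>
    builder.AddJsonConsole(options =>
    {
        // Scopes are off by default for the console formatters. Without this
        // line BeginScope still returns a disposable, but nothing is written.
        options.IncludeScopes = true;
    }));

ILogger logger = factory.CreateLogger("Orders"); // hypothetical category name

int orderId = 42;          // hypothetical values for illustration
double elapsedMs = 17.5;

using (logger.BeginScope(new Dictionary<string, object> { ["TenantId"] = "contoso" }))
{
    // Message template: OrderId and ElapsedMs reach the sink as named properties.
    logger.LogInformation("Processed order {OrderId} in {ElapsedMs} ms", orderId, elapsedMs);

    // Interpolated string: the sink only ever sees the finished text.
    logger.LogInformation($"Processed order {orderId} in {elapsedMs} ms");

    try
    {
        throw new InvalidOperationException("Payment failed",
            new TimeoutException("Gateway did not respond"));
    }
    catch (Exception ex)
    {
        // Pass the exception object, not ex.Message, so the formatter
        // has the inner exception available to render.
        logger.LogError(ex, "Order {OrderId} failed", orderId);
    }
}
```

Comparing the first two entries in the JSON output is the quickest check: only the templated call should carry OrderId and ElapsedMs as separate fields. Whether those fields survive further downstream still depends on the sink you actually ship to.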

Distributed tracing content focuses on what System.Diagnostics.Activity and the OpenTelemetry .NET SDK actually emit, and what they do not. W3C trace context that survives a Service Bus hop only if the producer and consumer were both instrumented. Sampling decisions taken at the edge that decide which spans the back-end ever sees. Baggage propagation that leaks information to every downstream service, third parties included, unless something at the egress boundary is set up to strip it.
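To make the "both sides instrumented" point concrete, here is a rough sketch that uses only System.Diagnostics and assumes .NET 6 or later, where Activity IDs default to the W3C format. The dictionary stands in for a message's application properties; the "traceparent" key, the source name, and the span names are assumptions for illustration, not what any particular messaging SDK does on your behalf.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Without a listener, StartActivity returns null and nothing is traced.
using var listener = new ActivityListener
{
    ShouldListenTo = _ => true,
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

using var source = new ActivitySource("Demo.Messaging"); // hypothetical source name

// Stand-in for the application properties of a queued message.
var messageHeaders = new Dictionary<string, string>();

// Producer side: start a span and copy its W3C id onto the message.
Activity? send = source.StartActivity("send order", ActivityKind.Producer);
messageHeaders["traceparent"] = send!.Id!;
send.Dispose(); // ends the producer span

// Consumer side: rebuild the parent context from the header. If the producer
// never wrote it, TryParse fails and the consumer starts a brand-new trace.
bool parsed = ActivityContext.TryParse(
    messageHeaders["traceparent"], null, out ActivityContext parentContext);

Activity? receive = source.StartActivity("process order", ActivityKind.Consumer, parentContext);
receive!.Dispose(); // ends the consumer span

Console.WriteLine($"Same trace: {receive.TraceId == send.TraceId}");
Console.WriteLine($"Consumer parent is producer span: {receive.ParentSpanId == send.SpanId}");
```

The Sample callback is also where a sampling decision would be made. Returning ActivitySamplingResult.None there means the span is never created, which is the mechanism behind edge sampling deciding what the back-end ever sees.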

Metrics get the same treatment as the other two. System.Diagnostics.Metrics and the OpenTelemetry exporter replace the legacy performance counter model, but the cardinality budget is still finite. Articles cover which dimensions are worth emitting per request, which belong on a histogram, and which become a per-tenant counter that costs more in storage than the workload itself.
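The cardinality point can be seen with nothing more than System.Diagnostics.Metrics. The sketch below assumes .NET 6 or later; the meter name, instrument names, and tag names are hypothetical. It uses a MeterListener to count how many distinct tag combinations (that is, time series) a counter produces with and without a per-tenant tag.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

using var meter = new Meter("Demo.Orders"); // hypothetical meter name

Counter<long> ordersProcessed = meter.CreateCounter<long>("orders.processed");
Histogram<double> orderDuration = meter.CreateHistogram<double>("orders.duration", unit: "ms");

// Every distinct combination of instrument name and tag values is one time series.
var seriesSeen = new HashSet<string>();

using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Orders") l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    var parts = new List<string> { instrument.Name };
    foreach (KeyValuePair<string, object?> tag in tags) parts.Add($"{tag.Key}={tag.Value}");
    seriesSeen.Add(string.Join(",", parts));
});
listener.Start();

static KeyValuePair<string, object?> Tag(string key, object? value) => new(key, value);

string[] tenants = { "tenant-a", "tenant-b", "tenant-c" }; // hypothetical tenants

// Bounded tags: route and outcome each have a handful of possible values.
foreach (string tenant in tenants)
    ordersProcessed.Add(1, Tag("route", "/orders"), Tag("outcome", "ok"));
int boundedSeries = seriesSeen.Count;

seriesSeen.Clear();

// Unbounded tag: one series per tenant, growing with the customer base.
foreach (string tenant in tenants)
    ordersProcessed.Add(1, Tag("route", "/orders"), Tag("tenant", tenant));
int perTenantSeries = seriesSeen.Count;

// Latency goes on a histogram with bounded tags rather than a counter per request.
orderDuration.Record(12.5, Tag("route", "/orders"));

Console.WriteLine($"bounded tags: {boundedSeries} series, per-tenant tag: {perTenantSeries} series");
```

Three tenants produce three series here, against one series for the bounded tags. The same multiplication applies to every other tag on the instrument, which is where the storage cost comes from.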

The recurring point: observability is not a dashboard. It is a property of the telemetry pipeline that determines whether the dashboard you build next month will have data to draw.

Why Your Logging Strategy Fails in Production

Let me tell you what I’ve learned over the years from watching teams deploy logging strategies that looked great on paper and failed spectacularly at 3 AM when production burned.

It’s not that they didn’t know the theory. They’d read the Azure documentation. They’d seen the structured logging samples. They’d studied distributed tracing. The real problem was different: they knew what to do but had no idea why it mattered until production broke catastrophically.