Observability & Monitoring for .NET Apps
Observability is what lets you ask a question about a production system that nobody anticipated when the dashboard was built. A monitoring stack that answers only the questions its authors thought to ask is not observability — it is a fixed query set with autocomplete. The articles in this collection treat the three signals — traces, structured logs, metrics — as the minimum surface a .NET workload has to expose for that arbitrary-question property to hold.
Logging articles assume that ILogger looks correct and is silently lying. BeginScope that does nothing because the provider does not implement scopes. Structured templates that emit strings because the sink was misconfigured. Exception chains flattened to a single message because the formatter swallowed InnerException. The content names each failure mode and shows the configuration that produces actual structured output downstream.
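One of those failure modes, scopes that never reach the output, can be illustrated with the built-in JSON console formatter, which only writes scopes when asked to. This is a minimal sketch, assuming the Microsoft.Extensions.Logging.Console package is referenced; the category name Demo.Orders and the OrderId scope key are hypothetical.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// IncludeScopes defaults to false on the console formatters, so BeginScope
// data is dropped from the output unless it is switched on explicitly.
using ILoggerFactory factory = LoggerFactory.Create(builder =>
    builder.AddJsonConsole(options => options.IncludeScopes = true));

// "Demo.Orders" is a hypothetical category name.
ILogger logger = factory.CreateLogger("Demo.Orders");

// A dictionary scope keeps OrderId as a key/value pair instead of a string.
using (logger.BeginScope(new Dictionary<string, object> { ["OrderId"] = 42 }))
{
    // Named placeholders stay structured; string interpolation would
    // collapse them into a single pre-formatted message.
    logger.LogInformation("Charged {Amount} {Currency}", 19.99m, "EUR");
}
```

Whether the scope and the placeholders survive as separate fields further downstream still depends on the sink, so the output is worth checking at the destination, not just on the console.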
Distributed tracing content focuses on what System.Diagnostics.Activity and the OpenTelemetry .NET SDK actually emit — and what they do not. W3C trace context that survives a Service Bus hop only if the producer and consumer were both instrumented. Sampling decisions taken at the edge that decide which spans the back-end ever sees. Baggage propagation that introduces information leakage if the egress firewall is not aware of it.
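The trace-context hop can be sketched with nothing but System.Diagnostics. This is an illustration under assumptions, not how any particular messaging SDK does it: the source name Demo.Messaging, the header dictionary standing in for message properties, and the in-process listener (a role the OpenTelemetry SDK normally plays) are all hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class TraceContextHop
{
    // Hypothetical source name for this sketch.
    private static readonly ActivitySource Source = new("Demo.Messaging");

    static TraceContextHop()
    {
        // Without a listener StartActivity returns null. In a real service
        // the OpenTelemetry SDK registers one; here it is done by hand.
        ActivitySource.AddActivityListener(new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllDataAndRecorded,
        });
    }

    // Producer side: copy the W3C traceparent value into the message headers.
    public static Dictionary<string, string> Send()
    {
        using Activity? activity = Source.StartActivity("send", ActivityKind.Producer);
        var headers = new Dictionary<string, string>();
        if (activity?.Id is string traceparent)
        {
            headers["traceparent"] = traceparent;
        }
        return headers;
    }

    // Consumer side: continue the producer's trace if the header arrived,
    // otherwise a new, unrelated trace starts.
    public static string? Receive(IReadOnlyDictionary<string, string> headers)
    {
        headers.TryGetValue("traceparent", out string? traceparent);
        ActivityContext.TryParse(traceparent, null, out ActivityContext parent);
        using Activity? activity = Source.StartActivity("process", ActivityKind.Consumer, parent);
        return activity?.TraceId.ToHexString();
    }
}
```

If either side skips its half, the consumer silently starts a new trace and the two halves never join up in the back-end.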
Metrics get the same treatment as the other two. System.Diagnostics.Metrics and the OpenTelemetry exporter replace the legacy performance counter model, but the cardinality budget is still finite. Articles cover which dimensions are worth emitting per request, which belong on a histogram, and which become a per-tenant counter that costs more in storage than the workload itself.
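A rough sketch of keeping tag cardinality bounded with System.Diagnostics.Metrics follows. The meter and instrument names (Demo.Requests, demo.requests, demo.request.duration) and the choice of tags are assumptions made for illustration, not recommendations taken from the articles.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class RequestMetrics
{
    // Hypothetical meter and instrument names.
    private static readonly Meter Meter = new("Demo.Requests");
    private static readonly Counter<long> Requests =
        Meter.CreateCounter<long>("demo.requests");
    private static readonly Histogram<double> Duration =
        Meter.CreateHistogram<double>("demo.request.duration", unit: "ms");

    // Collapse status codes into a handful of classes so the tag has
    // five possible values instead of one per status code.
    public static string StatusClass(int statusCode) => statusCode switch
    {
        >= 200 and < 300 => "2xx",
        >= 300 and < 400 => "3xx",
        >= 400 and < 500 => "4xx",
        >= 500 and < 600 => "5xx",
        _ => "other",
    };

    // routeTemplate should be the route pattern (for example "/orders/{id}"),
    // not the raw path, or every order id becomes its own time series.
    public static void Record(string routeTemplate, int statusCode, double elapsedMs)
    {
        var route = new KeyValuePair<string, object?>("route", routeTemplate);
        var status = new KeyValuePair<string, object?>("status_class", StatusClass(statusCode));

        Requests.Add(1, route, status);
        Duration.Record(elapsedMs, route);
    }
}
```

Tenant or user identifiers are deliberately absent from the tags here; each distinct tag value multiplies the number of time series the back-end has to store.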
The recurring point: observability is not a dashboard. It is a property of the telemetry pipeline that determines whether the dashboard you build next month will have data to draw.

Six Ways ILogger Silently Fails in Production

Privacy Health Checks: Beyond Database Connectivity

Green Dashboard, Dead Application
Your application just crashed in production. Azure App Service kept routing traffic to the failing instance for ninety seconds. Users saw timeouts. Your monitoring dashboard stayed green because the health endpoint kept returning HTTP 200 while the database connection pool was exhausted.
I’ve watched this exact scenario play out at three different organizations in the past year. Each time, the post-mortem revealed the same root cause: health checks that verified the process was breathing without checking whether it could actually do its job. ISO/IEC 27001:2013 Annex A control A.17.2.1 (availability of information processing facilities) exists for precisely this reason: availability is a security control, not an operational afterthought.
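A health check that exercises a dependency instead of only proving the process is alive might look like the sketch below. It assumes the Microsoft.Extensions.Diagnostics.HealthChecks package; the class name DependencyHealthCheck and the probe delegate are hypothetical, and the probe is whatever cheap operation stands in for real work, such as opening a pooled connection and running a trivial query.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: healthy only if the supplied probe completes in time.
public sealed class DependencyHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task> _probe;
    private readonly TimeSpan _timeout;

    public DependencyHealthCheck(Func<CancellationToken, Task> probe, TimeSpan timeout)
    {
        _probe = probe;
        _timeout = timeout;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // Cancel the probe after the timeout so an exhausted pool reports
        // unhealthy instead of hanging. This only works if the probe
        // honors the token it is given.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(_timeout);

        try
        {
            await _probe(cts.Token);
            return HealthCheckResult.Healthy();
        }
        catch (Exception ex)
        {
            // Any failure, including a timeout, marks the instance unhealthy
            // so the platform can stop routing traffic to it.
            return HealthCheckResult.Unhealthy("Dependency probe failed.", ex);
        }
    }
}
```

In an ASP.NET Core application a check like this would typically be registered with AddHealthChecks() and exposed through MapHealthChecks, with the platform's health probe pointed at that path.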

Observability in AKS CNI Overlay: When Pod IPs Hide Behind Nodes
