Observability & Monitoring for .NET Apps

Observability is what lets you ask a question about a production system that nobody anticipated when the dashboard was built. A monitoring stack that answers only the questions its authors thought to ask is not observability — it is a fixed query set with autocomplete. The articles in this collection treat the three signals — traces, structured logs, metrics — as the minimum surface a .NET workload has to expose for that arbitrary-question property to hold.

Logging articles assume that ILogger looks correct and is silently lying. BeginScope that does nothing because the provider does not implement scopes. Structured templates that emit strings because the sink was misconfigured. Exception chains flattened to a single message because the formatter swallowed InnerException. The content names each failure mode and shows the configuration that produces actual structured output downstream.

Distributed tracing content focuses on what System.Diagnostics.Activity and the OpenTelemetry .NET SDK actually emit — and what they do not. W3C trace context that survives a Service Bus hop only if the producer and consumer were both instrumented. Sampling decisions taken at the edge that decide which spans the back-end ever sees. Baggage propagation that introduces information leakage if the egress firewall is not aware of it.
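As a rough illustration of why both sides of a hop need instrumenting, the sketch below carries the current Activity's W3C id across a message boundary by hand. The `MessageTracing` class, the `Sample.Messaging` source name, and the use of a plain string dictionary with a `traceparent` key are assumptions made for this example; a real messaging SDK may use its own property names and do this for you.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class MessageTracing
{
    // Hypothetical source name; it must match whatever the tracing pipeline listens to.
    public static readonly ActivitySource Source = new("Sample.Messaging");

    // Producer side: copy the current trace context into the outgoing message headers.
    public static void Inject(Activity? activity, IDictionary<string, string> headers)
    {
        if (activity?.Id is { } id)
        {
            // With the default W3C id format, Activity.Id is the traceparent value.
            headers["traceparent"] = id;
        }

        if (!string.IsNullOrEmpty(activity?.TraceStateString))
        {
            headers["tracestate"] = activity.TraceStateString;
        }
    }

    // Consumer side: start a span parented to the producer's span when a header is present.
    // StartActivity returns null when nothing is listening to the source.
    public static Activity? StartConsumer(IDictionary<string, string> headers)
    {
        return headers.TryGetValue("traceparent", out var parentId)
            ? Source.StartActivity("process", ActivityKind.Consumer, parentId)
            : Source.StartActivity("process", ActivityKind.Consumer);
    }
}
```

If only the producer calls `Inject`, the header travels with the message but no consumer span is ever parented to it, so the trace ends at the hop.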

Metrics get the same treatment as the other two. System.Diagnostics.Metrics and the OpenTelemetry exporter replace the legacy performance counter model, but the cardinality budget is still finite. Articles cover which dimensions are worth emitting per request, which belong on a histogram, and which become a per-tenant counter that costs more in storage than the workload itself.
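A minimal sketch of keeping dimensions inside the cardinality budget with System.Diagnostics.Metrics follows. The meter name, instrument name, and tag keys are assumptions chosen for illustration. The idea is to tag with a route template and a status class, both of which have a small fixed set of values, rather than a raw URL or a per-tenant id.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class RequestMetrics
{
    // Hypothetical meter and instrument names.
    private static readonly Meter OrdersMeter = new("Sample.Orders");

    private static readonly Histogram<double> Duration =
        OrdersMeter.CreateHistogram<double>("request.duration", unit: "ms");

    // Collapse the status code to five possible values instead of dozens.
    public static string StatusClass(int statusCode) => statusCode switch
    {
        >= 200 and < 300 => "2xx",
        >= 300 and < 400 => "3xx",
        >= 400 and < 500 => "4xx",
        >= 500 and < 600 => "5xx",
        _ => "other"
    };

    // routeTemplate should be the template ("/orders/{id}"), never the concrete path.
    public static void Record(string routeTemplate, int statusCode, double elapsedMs) =>
        Duration.Record(
            elapsedMs,
            new KeyValuePair<string, object?>("route", routeTemplate),
            new KeyValuePair<string, object?>("status_class", StatusClass(statusCode)));
}
```

Each distinct combination of tag values becomes its own time series, so the series count here is bounded by the number of routes times five.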

The recurring point: observability is not a dashboard. It is a property of the telemetry pipeline that determines whether the dashboard you build next month will have data to draw.

Six Ways ILogger Silently Fails in Production

I lost half a day because BeginScope silently did nothing in production: no error, no warning, just a flat stream of undifferentiated log entries. ILogger is a façade over a pipeline full of opt-in behaviour that looks enabled by default. Scopes, structured properties, minimum levels, exception chains, timestamps: all have failure modes that compile cleanly and fail quietly.
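One concrete instance of that opt-in behaviour, sketched under the assumption that the built-in console provider with the simple formatter is in use: scopes are only written when `IncludeScopes` is switched on. The category name `Orders` and the `OrderId` scope are hypothetical; other providers and sinks have their own switches.

```csharp
using Microsoft.Extensions.Logging;

// Assumption: Microsoft.Extensions.Logging.Console is referenced (.NET 5 or later).
using var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddSimpleConsole(options =>
    {
        // Opt-in: when this is left at its default (false), BeginScope still
        // returns an IDisposable and compiles cleanly, but no scope is written.
        options.IncludeScopes = true;
    }));

ILogger logger = loggerFactory.CreateLogger("Orders");

// Hypothetical scope; the template and value are for illustration only.
using (logger.BeginScope("OrderId:{OrderId}", 42))
{
    logger.LogInformation("Processing order");
}
```

Flipping `IncludeScopes` to `false` and rerunning is a quick way to see the silent variant: the same code runs without error and the scope disappears from the output.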

Privacy Health Checks: Beyond Database Connectivity

Your health checks verify database connectivity every 30 seconds. Great. But do they know that 15% of your users have expired consents? Privacy compliance isn’t a documentation exercise—it’s an operational discipline. Same IHealthCheck interface, different questions. Two queries, one ratio, three possible outcomes. Here’s how to build privacy health checks that turn audit questions into dashboard demos.
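The "two queries, one ratio, three possible outcomes" shape might look like the sketch below. The class name, the two count delegates, and the 5% and 15% thresholds are assumptions for illustration, not values from the article; in practice the delegates would wrap whatever queries count users and expired consents.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public sealed class ConsentHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task<long>> _countUsers;
    private readonly Func<CancellationToken, Task<long>> _countExpiredConsents;

    public ConsentHealthCheck(
        Func<CancellationToken, Task<long>> countUsers,
        Func<CancellationToken, Task<long>> countExpiredConsents)
    {
        _countUsers = countUsers;
        _countExpiredConsents = countExpiredConsents;
    }

    // One ratio, three outcomes. Thresholds are hypothetical defaults.
    public static HealthStatus Classify(
        long users, long expired, double degradedAt = 0.05, double unhealthyAt = 0.15)
    {
        if (users <= 0) return HealthStatus.Healthy; // nothing to be non-compliant about
        double ratio = (double)expired / users;
        if (ratio >= unhealthyAt) return HealthStatus.Unhealthy;
        if (ratio >= degradedAt) return HealthStatus.Degraded;
        return HealthStatus.Healthy;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        // Two queries.
        long users = await _countUsers(cancellationToken);
        long expired = await _countExpiredConsents(cancellationToken);

        double ratio = users > 0 ? (double)expired / users : 0;
        return new HealthCheckResult(
            Classify(users, expired),
            $"{ratio:P1} of users have expired consents");
    }
}
```

Keeping `Classify` as a pure static method means the thresholds can be unit tested without a database.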

Green Dashboard, Dead Application

Your application just crashed in production. Azure App Service kept routing traffic to the failing instance for ninety seconds. Users saw timeouts. Your monitoring dashboard stayed green because the web server responded with HTTP 200 while the database connection pool was exhausted.

I’ve watched this exact scenario play out at three different organizations in the past year. Each time, the post-mortem revealed the same root cause: health checks that verified the process was breathing without checking whether it could actually do its job. ISO/IEC 27001:2013 Annex A control A.17.2.1 (availability of information processing facilities) exists precisely for this reason: availability is a security control, not an operational afterthought.
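A health check that asks whether the application can do its job, rather than whether the process answers, could be sketched as follows. `DependencyHealthCheck` and its probe delegate are hypothetical names for this example; the probe would typically open a connection and run a trivial query against the real dependency.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public sealed class DependencyHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task> _probe;
    private readonly TimeSpan _timeout;

    public DependencyHealthCheck(Func<CancellationToken, Task> probe, TimeSpan timeout)
    {
        _probe = probe;
        _timeout = timeout;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        // Bound the probe so an exhausted pool reports failure instead of hanging.
        // This only helps if the probe honours the token it is given.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(_timeout);

        try
        {
            await _probe(cts.Token);
            return HealthCheckResult.Healthy("Dependency probe succeeded");
        }
        catch (Exception ex)
        {
            // Any failure, including a timeout, marks the instance as unable to serve.
            return HealthCheckResult.Unhealthy("Dependency probe failed", ex);
        }
    }
}
```

Wired to the endpoint the platform polls, a failing probe returns a non-200 status, which is what lets the load balancer stop routing traffic to the instance.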

Observability in AKS CNI Overlay: When Pod IPs Hide Behind Nodes

CNI Overlay masks pod IPs behind node IPs through SNAT, breaking traditional observability. Network logs show nodes, application logs show pods. Without Container Insights, correlation IDs, and distributed tracing, you’re debugging blind. SNAT port exhaustion mimics network failures, and timestamp-based correlation is fragile. The cost of proper monitoring is trivial compared to debugging outbound connectivity at 3 AM without visibility.

Audit Logging That Survives Your Next Security Incident

Your audit logs probably won’t survive a real security incident. Most implementations log too much, protect too little, and provide zero value when something breaks at 2 AM. Here’s how to fix that with structured logging that actually works.