Recently while examining a slow request issue (I have a plan to describe this investigation in a seperate post) it came to me that every time I open the Thread Time view it takes a moment to understand what this view actually contains. I decided to write this post for me and for any of you who share the same feeling about this window :).
Posts related to Event Tracing for Windows
Diagnosing a Windows Service timeout with PerfView
Today I would like to share with you an interesting (I hope) diagnostics case in one of our system services. The IngestService (that is its name) was not starting properly for the first time – it was being killed because of exceeding the default 30s timeout. But the second try was always successful. No exception was thrown and no logs could be found in the event logs. It’s a situation when ETW traces might shed some light on what’s going on. As it was a .NET service I used PerfView to record the trace file. An important checkbox to select when diagnosing thread wait times is the Thread Time box:
After collecting the traces on production, I merged them and copied to my developer machine.
LowLevelDesign.NLog.Ext and ETW targets for NLog
UPDATE 2019.01.30: All the features described in this post, as well as some other improvements, are available in the official NLog.Etw package. Please use it in place of my custom package.
I really like the NLog library and I use it pretty often in my projects. Some time ago I wrote a post in which I showed you my preferred debug and production configuration. Other day I presented you a simple layout renderer for assembly versions.
Today, I would like to inform you that all those goodies 😉 are available in my brand new LowLevelDesign.NLog.Ext Nuget package.
Additionally, you may find in it two ETW NLog targets. ETW (Event Tracing for Windows) is a very effective way of logging and its support in kernel makes it a great choice for verbose/trace/debug logs. Moreover, if you are using Windows Performance Toolkit in your performance analysis, providing your own ETW messages will help you correlate system events with methods in your application. ETW infrastructure is highly customizable (check Semantic Logging Application Block to see how your logs might look like and how they might be consumed:)).
Diagnosing ADO.NET with ETW traces
The majority of modern applications use ORMs as a way to connect to a database. Those libraries simplify the usage of the underlying ADO.NET API in a way that we might even forget that under the hood we are dealing with a relational model. This object-oriented wonderland usually lasts till the moment when the first SQL exceptions crop up. If we configured our ORM correctly we should be able to diagnose database problems with just logs that it provides. But what if we didn’t or if the problem lies deep in the ADO.NET layer? We might then try to use a debugger or a SQL Server Profiler. Both those tools, although great, are usually too heavy (or too invasive) for a production environment. Fortunately there is still one option left – ADO.NET ETW tracing. In today’s post I will show you how to turn on this type of tracing and how to use it to quickly diagnose some database problems.
A managed ETW provider and the 15002 error
I have been playing recently with the ETW (Event Tracing for Windows). One of my aims was to write a managed provider and try the ETW infrastructure in my application. Everything seemed to be well explained on the MSDN and not very hard to implement (especially in my simple case). Unfortunately not all things went smoothly and in this post I’m going to show you an issue I run into as well as some general path when diagnosing broken ETW providers.