Let’s go with a problem, identification and analysis approach. I am using Perfmon counters to detect the symptoms and later use Windbg to analyze the root cause. A sample application is used to simulate the issue.
Problem
Once a
support request was received that the performance of a .NET application was
lowering intermittently though the application workload was normal those
times.
Tool
introduction
If an issue
is reproducible, that is a huge relief for the support engineer. It is fairly
common that support engineers receive troubleshooting data from customers that does not contain
issue context. Analysis on that data is useless.
Here issue could
be reproduced. Along with some other data collections related to project, few specific
perfmon counters were also monitored that gave a clue regarding the problem.
Before we
move on, few words about Perfmon.
Frankly
speaking, I never used perfmon in my early (mid?) development days until I got
a chance to work as member of a support team. It is one of the favorite tool
of the support personnel and very powerful tool too. Almost every OS resource and service can be
monitored with perfmon counters.
Perfmon has
ability to monitor .NET CLR related sections, I would say it is a good practice
for the .NET developer to do occasional health checking of his applications using
perfmon counters during development stages to identify symptoms and prevent
them early.
Identification
We
collected the Perfmon data. Two CLR memory counters were particular interest there.
Also counters for Processor time for all threads of the Application were
also fetched.- Finalization Survivors
- % time in GC
- Processor time for threads
We could
observe that when performance lowering situations were occurring, processing
time for one thread was peaking .Another interesting thing was that, thousands
of finalization survivors were getting created. % time in GC was also relatively high.
Higher
values for [% time in GC] means Application has memory pressure, frequent
garbage collections are occurring on that application. There are many causes that leading to high
memory pressures in .NET applications, some of them can be discussed with later
blogs.
In this scenario,
the perfmon data observations gave some starting troubleshooting directions. To
validate assumptions we checked the dumps which had the issue context. The most
time consuming thread was finalizer thread. Then tracked the finalization survival
objects and their types.
The cause
There were
few time consuming code lines in the destructor of a class other than object
destruction. That led finalize thread to spent more time in destructor (finalize
method). Application was also creating high number of objects of this class. There is only one finalizer thread which execute finalizer methods. So if the number of objects ( those implemented finalize method) are more then this single thread has huge work executing finalize methods of every object during destruction. This was causing
the Finalizer thread busy and eating CPU cycles.
Let’s
have a Demo
In this
demo, the issue is simulated. In the first part we will execute the
application, reproduce the issue and then identify issue using perfmon. Next
session we will analyze further using Windbg.
Figure (1)
As you
could see in the above pic, demo app has two buttons
- Button to create 10000 objects of a class that has implemented a finalize method
- Button to Force a GC collection (GC.Collect). [This is not a recommended procedure.CLR will invoke GC as and when it is required. We only use it for our demo]
In c# , finalize method is implemented implicitly (by implementing a
destructor)
Do’s and don’ts
- Implement destructor ( finalize ) only when it is necessary ( if class has unmanaged members)
- If it obvious that you wanted to have a destructor, follow dispose pattern code.( Explained later)
- Do not put business logic in the finalize method (destructor). Our only aim in destructor is to clean the member variables. (In the above code i put for loops to to mimic a business code).
Steps to
set perfmon counters for an application.
Run your
application. In this case I have started
‘ExCodeData’ demo app.
1. In
the Run prompt type perfmon and OK
3. A dialog box appears, select “CLR Memory” and from below box .
Select application you wish to monitor. (In the demo i am select ‘ExCodeData’ application)
4.Under CLR memory currently we select
only two counters related to this context
(Many other counters of this group is worth to discuss. Keeping it for
later time)

5.Apply the Add button. We can see counters added in right side.
6. In the similar way, selecting Thread
counter group. Selected Processor time counter under it. In demo I have
selected all threads of the application. (figure below)
7. Apply
the Add button. We can see counters added in right side. The close the dialog
box.
This way we
can monitor perfmon counters for a running application. In some situations, the
requirement would be to collect the perfmon data. There are ways to set and collect
the perfmon data. For the demo purpose we just monitor the live data instead of
collecting.
Coming back
to our demo, first I am keeping report style in perfmon (Line is default)
We can see our added counters in the perfmon report view with 0 in all counters.
Action 1
Now, I click
the button to create 10000 objects and after few seconds apply button
to force a GC.
Then
observing the Perfmon report.
We can see
10,121 finalization survivors. 1000 belong to the new objects created. (121
would be the number of .NET finalizer objects the default winform app may have)
and thread 2 is 99.7 %
Action 2
Let’s do
one more Force GC here (apply GC button) and observe the perfmon report view.
Now
survivors are just 36 after executing GC second time. All 10,000 objects
are destroyed.
Next steps
We have
seen so many finalization survivors are getting created and destroyed
eventually after multiple GC occurring. We also saw Thread 2 was so busy.
So the next
obvious questions are,
- What are these survivor objects those huge in number? Can we identify them and its type? Then we may ask the developer to examine those objects.
-
We saw some thread is taking more
time. Can we get more insights what this
thread and it is doing.
We shall proceed with a dump analysis to check our assumptions are true. Continues in next blog.Before we wrapping up, one more observation.We have to do GC two times to clean the objects that have finalize method implemented. Do these objects survive a Garbage collection? Is that why they are termed Survivor in the perfmon reports? How the normal objects (managed class objects without finalize methods) behave in similar situations? Let’s go through the second blog for the step-by-step explanation with Windbg.









No comments:
Post a Comment