
Practical Approach to Memory Profiling of .NET Core Applications

Welcome to today’s post.

In today’s post I will be showing how we can use the built-in diagnostic tools that are available within Visual Studio 2019 to profile and analyze memory management within a .NET Core application.

When we develop and debug applications, the most common tools we use are the watch list and console output to check variable values and to output detailed diagnostics using a logger.

I will show how we can take snapshots within a debug session and analyze them by comparing differences in memory utilization, which helps us decide which C# types are the most memory-efficient for a given task.

During this discussion, I will cover memory usage of both immutable and mutable types which I discussed in an earlier post, so it may help to get a refresher of those concepts.

A Sample Application to Profile

Suppose the program I am going to analyze is shown below:

using System;
using System.Text;

namespace DevSandboxConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            MemoryStressTest();
        }

        static void MemoryStressTest()
        {
            // execute some memory intensive commands..

            // Use a string builder..
            StringBuilder sb = new StringBuilder(100, 1024000);

            for (int i = 0; i < 100; i++)
                sb.Append($"String Number {i}." + Environment.NewLine);

            Console.WriteLine("Output string builder content..");

            // Each chunk is a ReadOnlyMemory<char> over part of the builder's buffer.
            foreach (var c in sb.GetChunks())
                Console.Write(c.ToString());

            // Use a string..
            string s = "";
            for (int i = 0; i < 100; i++)
                s = s + $"String Number {i}." + Environment.NewLine;

            Console.WriteLine("Output string content..");
            Console.WriteLine(s);
        }
    }
}

The program has two parts: the first builds a string of 100 lines using StringBuilder, and the second builds the same string using string concatenation.

Introducing the Diagnostics Window

When we run an application in debug mode, the Diagnostic Tools window will be visible with the Events, Process Memory and CPU graphs. Below these graphs are four tabs: Summary, Events, Memory Usage and CPU Usage:

If our application is paused at a breakpoint, then we can take snapshots of the current state of the application. I will focus on the Memory Usage tab.

Taking Snapshots of Memory Usage

There is a camera icon (Take Snapshot) at the top left of the active tab. Clicking it takes a snapshot of the current memory usage, which is shown below:

We then set a breakpoint just after the string builder object is created, then take another snapshot:

Notice that there are an additional 40 objects created, and a further 2.4 KB of heap space taken by the string builder instance.

We then set a breakpoint just after the first iteration of the StringBuilder Append() for loop, then take another snapshot:

We then set a breakpoint just after the 100th iteration of the StringBuilder Append() for loop, then take another snapshot:

We notice that a further 12 objects are created, and the heap size increases by 6.52 KB.

When the loop that outputs the StringBuilder chunks ends, we take another snapshot.

Clicking the heap size difference link shows the structure of the instances created for the StringBuilder class, with each instance containing an allocated chunk of space for as many characters as will fit into its allocated capacity.
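The chunk structure seen in the snapshot can also be inspected from code. The following is a minimal sketch, assuming .NET Core 3.0 or later (where StringBuilder.GetChunks() is available). The class and method names (ChunkInspector, BuildSample, ChunkLengths) are hypothetical and only for illustration. The number and size of chunks depend on the runtime's internal growth policy, so no particular output is assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class ChunkInspector
{
    // Builds the same 100-line content as the sample program.
    public static StringBuilder BuildSample()
    {
        var sb = new StringBuilder(100, 1024000);
        for (int i = 0; i < 100; i++)
            sb.Append($"String Number {i}." + Environment.NewLine);
        return sb;
    }

    // Returns the number of characters held in each internal chunk.
    public static int[] ChunkLengths(StringBuilder sb)
    {
        var lengths = new List<int>();
        foreach (ReadOnlyMemory<char> chunk in sb.GetChunks())
            lengths.Add(chunk.Length);
        return lengths.ToArray();
    }

    static void Main()
    {
        StringBuilder sb = BuildSample();
        int[] lengths = ChunkLengths(sb);

        Console.WriteLine($"Total length: {sb.Length}, chunks: {lengths.Length}");
        for (int i = 0; i < lengths.Length; i++)
            Console.WriteLine($"  chunk {i}: {lengths[i]} chars");
    }
}
```

The chunk lengths should add up to sb.Length, which is a quick way to confirm that the chunks together hold the whole content.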

At the point the string variable is initialized, we then take another snapshot:

Notice that the space taken by the string is 0.48 KB but the object count has reduced by 1.

When the first string is concatenated, we take another snapshot:

The object count has increased by 2, and the heap increases by 0.12 KB.

Let the debugger run to the line just after the string concatenation loop, then take another snapshot:

Notice that the number of objects increases by 3 and the heap size increases by 7.48 KB.

This is greater than the heap size taken by the StringBuilder appends, which took 6.52 KB.

Also, the string concatenation left three additional instances on the heap, whereas the StringBuilder appends left 12. As StringBuilder is a mutable type, the 12 objects were chunks allocated to hold the appended characters. A string is an immutable type, so each concatenation allocates a new string instance containing a copy of the existing characters plus the appended text, and the variable is reassigned to that new instance. This copying is why the immutable type consumes more heap space.
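Snapshots show what is live on the heap at a breakpoint, but not the intermediate strings that were allocated and then discarded along the way. As a rough complement, the sketch below measures the total bytes allocated by each approach using GC.GetAllocatedBytesForCurrentThread(). It assumes .NET Core 3.0 or later. The class and method names are hypothetical, and the exact figures will vary by runtime version and platform.

```csharp
using System;
using System.Text;

static class AllocationComparison
{
    // Runs the action and returns how many bytes the current thread allocated while it ran.
    public static long MeasureAllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void BuildWithStringBuilder()
    {
        var sb = new StringBuilder(100, 1024000);
        for (int i = 0; i < 100; i++)
            sb.Append($"String Number {i}." + Environment.NewLine);
        GC.KeepAlive(sb.ToString());
    }

    public static void BuildWithConcatenation()
    {
        string s = "";
        for (int i = 0; i < 100; i++)
            s = s + $"String Number {i}." + Environment.NewLine;
        GC.KeepAlive(s);
    }

    static void Main()
    {
        // Warm up both paths so one-off costs do not distort the measurements.
        BuildWithStringBuilder();
        BuildWithConcatenation();

        long builderBytes = MeasureAllocatedBytes(BuildWithStringBuilder);
        long concatBytes = MeasureAllocatedBytes(BuildWithConcatenation);

        Console.WriteLine($"StringBuilder allocated: {builderBytes} bytes");
        Console.WriteLine($"Concatenation allocated: {concatBytes} bytes");
    }
}
```

Because every concatenation copies the whole string so far, the total allocated by the concatenation loop grows roughly with the square of the number of iterations, even though most of those intermediate strings are garbage by the time a snapshot is taken.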

Viewing and Comparing Heap Space Usage

From each snapshot we can view the heap space, which can be ordered by heap size as shown:

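We can check whether memory has been reclaimed by the garbage collector (GC) after the method exits. Keep in mind that the GC runs non-deterministically: objects become eligible for collection once they are no longer reachable, but they are not necessarily freed at the moment the scope exits.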

After re-running and taking snapshots at the beginning of the method block, before the string concatenation, at the final line of code in the block and at the exit of the block, we have the following snapshots of memory usage:

We can see the difference in heap size between the start of the method block (snapshot #1) and the exit of the block (snapshot #4) is 0.24 KB.
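A similar before-and-after comparison can be approximated in code with GC.GetTotalMemory(true), which forces a full collection before reporting the bytes still allocated. This is only a sketch under that assumption. The class name RetainedMemoryCheck and the helper LiveBytes are hypothetical, and the reported difference will not match the snapshot figures exactly because the runtime itself allocates in the background.

```csharp
using System;
using System.Text;

static class RetainedMemoryCheck
{
    // Forces a full collection and returns the bytes the GC still considers allocated.
    public static long LiveBytes() => GC.GetTotalMemory(forceFullCollection: true);

    // Same work as the sample program, without the console output.
    public static void MemoryStressTest()
    {
        var sb = new StringBuilder(100, 1024000);
        for (int i = 0; i < 100; i++)
            sb.Append($"String Number {i}." + Environment.NewLine);

        string s = "";
        for (int i = 0; i < 100; i++)
            s = s + $"String Number {i}." + Environment.NewLine;

        GC.KeepAlive(sb);
        GC.KeepAlive(s);
    }

    static void Main()
    {
        long before = LiveBytes();   // roughly corresponds to snapshot #1
        MemoryStressTest();
        long after = LiveBytes();    // roughly corresponds to snapshot #4

        Console.WriteLine($"Live bytes before: {before}");
        Console.WriteLine($"Live bytes after:  {after}");
        Console.WriteLine($"Difference:        {after - before}");
    }
}
```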

Select the heap snapshot #4. Compare with snapshot #1.

The object type is a StringBuilder object:

Select the instance. You will see that 128 bytes holding the final string within the StringBuilder have not been deallocated:

Comparing Heap Space Usage for Different String Concatenation Approaches

There isn’t much we can do with StringBuilder memory deallocation as it is performed by the GC.

Next, replace the string concatenation with the following code that allocates 100 strings.

// Use a string array..
string[] s = new string[100];

for (int i = 0; i < 100; i++)
    s[i] = $"String Number {i}." + Environment.NewLine;

After re-running the program and snapshotting the memory usage we can see the differences:

We can see the difference in heap size between the start of the method block and the exit of the block is 0.43 KB (448 bytes).

Select the heap snapshot #4. Compare with snapshot #1.

The object types not deallocated are a StringBuilder and a RuntimeType object:

The StringBuilder object has the same instance still in the heap as we had earlier, however the RuntimeType object has eight instances still in the heap:

If we experiment further and set the reference to the allocated string array to null as the final line of the block:

s = null;

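the memory snapshot is unchanged. This suggests that explicitly setting a local reference to null at the end of a method makes no difference: the GC already treats the array as unreachable once the local variable is no longer in use.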

Comparing Heap Space Usage for Immutable and Mutable List Collections

The final memory profile I want to show is with mutable and immutable List collections.

We will profile the memory usage of two additional methods that do much the same thing as the previous two approaches, except that we will use a List<T> collection and an immutable list collection, ImmutableList<T>, to store and output the strings. The method using a list collection is shown below:

// Requires: using System.Collections.Generic;
static void CPUStressTest3()
{
    List<string> stringList = new List<string>();

    for (int i = 0; i < 100; i++)
        stringList.Add($"String Number {i}." + Environment.NewLine);

    for (int i = 0; i < 100; i++)
        Console.WriteLine(stringList[i]);
}

The method using an immutable list collection is shown below:

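// Requires: using System.Collections.Immutable;
static void CPUStressTest4()
{
    ImmutableList<string> stringList = ImmutableList<string>.Empty;

    // Add() returns a new list, so the result must be assigned back.
    for (int i = 0; i < 100; i++)
        stringList = stringList.Add($"String Number {i}." + Environment.NewLine);

    stringList.ForEach(act => Console.WriteLine(act));
}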

We run the application until it hits the initial breakpoint as shown:

Set breakpoints at the beginning and end of the method bodies of CPUStressTest3() and CPUStressTest4(). Take memory usage snapshots of the heap at each of these breakpoints.

The memory heap usage shows the mutable list taking up an additional 7.41 KB whereas the immutable list takes up an additional 15.16 KB.
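Part of the extra cost comes from how ImmutableList<T> works: every Add() returns a new list and leaves the original untouched. The sketch below illustrates that behaviour, and also shows ImmutableList<T>.Builder, which is one way to batch the additions before producing the final immutable list. The class and method names are hypothetical, and whether the builder reduces memory in a given application is something to confirm with snapshots rather than assume.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

static class ImmutableListSketch
{
    // Mirrors the loop in the post: each Add() produces a new list instance.
    public static ImmutableList<string> BuildWithAdd(int count)
    {
        ImmutableList<string> list = ImmutableList<string>.Empty;
        for (int i = 0; i < count; i++)
            list = list.Add($"String Number {i}.");
        return list;
    }

    // Alternative: mutate a builder, then freeze it once at the end.
    public static ImmutableList<string> BuildWithBuilder(int count)
    {
        ImmutableList<string>.Builder builder = ImmutableList.CreateBuilder<string>();
        for (int i = 0; i < count; i++)
            builder.Add($"String Number {i}.");
        return builder.ToImmutable();
    }

    static void Main()
    {
        ImmutableList<string> empty = ImmutableList<string>.Empty;
        ImmutableList<string> one = empty.Add("first");

        // The original list is unchanged by Add().
        Console.WriteLine($"empty.Count = {empty.Count}, one.Count = {one.Count}");

        bool same = BuildWithAdd(100).SequenceEqual(BuildWithBuilder(100));
        Console.WriteLine($"Both approaches produce the same items: {same}");
    }
}
```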

Drilling into the final memory snapshot and ordering by the Inclusive Size Differential shows the extent of object usage:

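Selecting the instances of the immutable list node shows the reference count and a detected cycle. A cycle in the reference graph is not in itself a memory leak in .NET, because the GC reclaims any objects that are unreachable from a root, whether or not they reference each other. It is still worth checking which root, if any, is keeping these nodes alive.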

Advance the debugger outside of the method into the calling scope in Main(), and take another snapshot.

Notice that the latest snapshot shows only 10.15 KB has been reclaimed instead of the full 15.16 KB. This does not necessarily mean the remainder has leaked. The GC runs non-deterministically, so some of the nodes allocated while appending may simply not have been collected yet, and some of the remaining memory may belong to objects that outlive the method, such as runtime type information or the static ImmutableList<string>.Empty instance.
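To check how the GC treats reference cycles, we can create two objects that point at each other, drop every strong reference to them, and then ask a WeakReference whether they survived a forced collection. This is a minimal sketch with hypothetical names (CycleCollectionSketch, Node). Forcing collections with GC.Collect() is done here purely for the experiment and is not a recommendation for production code. On a typical run the cycle is expected to be reported as collected, although the runtime does not formally guarantee when collection happens.

```csharp
using System;
using System.Runtime.CompilerServices;

static class CycleCollectionSketch
{
    sealed class Node
    {
        public Node Other;
    }

    // Allocates two objects that reference each other and returns only a weak
    // reference, so no strong reference survives once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnreachableCycle()
    {
        var a = new Node();
        var b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference(a);
    }

    // Forces a full collection and reports whether the cyclic pair was reclaimed.
    public static bool IsCycleCollected()
    {
        WeakReference weak = CreateUnreachableCycle();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        return !weak.IsAlive;
    }

    static void Main()
    {
        Console.WriteLine($"Cycle collected: {IsCycleCollected()}");
    }
}
```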

We have seen how to use the Visual Studio 2019 diagnostic tools to identify and compare differences in memory usage at different points during debugging, including before and after a memory-intensive operation. This allows us to compare different C# types and determine which offer better memory usage.

That is all for today’s post.

In the next post I will show how to use the diagnostic tools within Visual Studio to analyze and compare application performance.

I hope you have found this post useful and informative.
