Writing Allocation-Free Code in .NET
How to reduce GC pressure by eliminating unnecessary heap allocations — covering value types, Span<T>, pooling, boxing, closures, and async.
01 — Why Allocations Matter
Every object allocated on the managed heap must eventually be collected by the garbage collector. GC pauses — even in the background — introduce latency spikes and reduce throughput. In hot paths (game loops, high-frequency trading, network servers, parsers), the cumulative cost of allocations is often the primary bottleneck.
The goal is not zero allocations everywhere. It is zero allocations in the hot path. Startup code, one-time initialization, and infrequent code paths are irrelevant to this concern.
| GC Generation | Trigger | Cost |
|---|---|---|
| Gen 0 | Gen 0 allocation budget exhausted (the budget is sized dynamically by the runtime) | Sub-millisecond, frequent |
| Gen 1 | Gen 1 budget exhausted by objects that survived a Gen 0 collection | Low, but higher than Gen 0 |
| Gen 2 | Long-lived objects, full GC | Can cause multi-ms stop-the-world |
| LOH | Objects ≥ 85,000 bytes | Never compacted by default, fragmentation |
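The generations in the table can be observed from code. The following is only a rough sketch: GC.GetGeneration, GC.CollectionCount, and GC.MaxGeneration are BCL APIs, while the GenerationProbe class and its method names are made up for illustration.

```csharp
using System;

public static class GenerationProbe
{
    // Generation the runtime reports for a freshly allocated small object.
    public static int GenerationOfSmallObject() => GC.GetGeneration(new object());

    // Arrays of 85,000 bytes or more are placed on the large object heap,
    // which the runtime reports as the oldest generation.
    public static int GenerationOfLargeArray() => GC.GetGeneration(new byte[100_000]);

    // Number of Gen 0 collections that happened while 'work' was running.
    public static int Gen0CollectionsDuring(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }
}
```

Wrapping a hot path in Gen0CollectionsDuring is a cheap first check: a path that triggers Gen 0 collections at a steady rate is allocating, even if each individual allocation looks harmless.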
02 — Stack vs. Heap
Stack allocations are free. Incrementing the stack pointer is a single CPU instruction. The stack frame is reclaimed automatically when the method returns, with no GC involvement. Stack memory is also cache-hot because it is used in LIFO order.
Heap allocations are not free. The CLR must find space in the managed heap, write object headers (method table pointer, sync block index), potentially trigger a collection, and eventually run finalizers if applicable.
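Value types (struct, primitives, enum) are stored inline wherever they are declared: in the stack frame or a register when they are plain locals, and inside the containing object or struct when they are fields. A value-type field of a class therefore lives on the heap with its owner, and a local that is captured by a lambda or hoisted into an async state machine can end up there too. Value types are copied by value on assignment. Instances of reference types (class, record class, arrays, delegates) live on the managed heap and are accessed through an object reference.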
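To make the difference measurable, here is a minimal sketch that counts the bytes allocated by a loop over struct values and by the same loop over class instances. The type and method names (PointStruct, PointClass, StackVsHeapDemo) are hypothetical; GC.GetAllocatedBytesForCurrentThread is the BCL API also used in the tooling section.

```csharp
using System;

public struct PointStruct { public int X, Y; }
public sealed class PointClass { public int X, Y; }

public static class StackVsHeapDemo
{
    // Bytes allocated on the current thread by one call to 'action'.
    public static long BytesAllocatedBy(Action action)
    {
        action(); // warm-up so one-time JIT and initialization work is not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static int SumStructs(int count)
    {
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            var p = new PointStruct { X = i, Y = i }; // stored in the frame or in registers
            sum += p.X + p.Y;
        }
        return sum;
    }

    public static int SumClasses(int count)
    {
        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            // Normally one heap object per iteration; some newer JITs may
            // stack-allocate objects they can prove never escape.
            var p = new PointClass { X = i, Y = i };
            sum += p.X + p.Y;
        }
        return sum;
    }
}
```

Calling BytesAllocatedBy with a static lambda around SumStructs should report zero bytes, while the SumClasses figure depends on the runtime and JIT tier, which is one more reason to measure rather than assume.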
03 — Structs & Value Types
Prefer structs for small, short-lived data that has value semantics and does not need identity.
readonly struct
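Declare a struct readonly when none of its fields are mutated after construction. When a non-readonly struct is reached through an in parameter or a readonly field, the compiler cannot prove that calling one of its methods or properties leaves the value unchanged, so it invokes the member on a hidden defensive copy. That copy is not a heap allocation, but for large structs in hot loops it is a real and easily missed cost. Plain field reads never trigger it.
// Non-readonly struct: calling Sum() through 'in' makes the compiler copy p first
public struct Point { public int X, Y; public int Sum() => X + Y; }
public int Process(in Point p) => p.Sum(); // hidden defensive copy
// readonly struct: members cannot mutate state, so no defensive copy is needed
public readonly struct ReadOnlyPoint { public int X { get; } public int Y { get; } public int Sum() => X + Y; }
public int Process(in ReadOnlyPoint p) => p.Sum(); // no copy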
ref struct
ref struct types are stack-only. They cannot be boxed, stored in fields of reference types, or assigned to object/dynamic, and they cannot be used as generic type arguments unless the type parameter opts in with the allows ref struct anti-constraint (C# 13). Span<T> and ReadOnlySpan<T> are ref struct types. Use ref struct when a type must never escape to the heap.
ref struct ParseState
{
public ReadOnlySpan<char> Remaining;
public int Position;
}
in, ref, out parameters
Pass large structs with in (read-only reference) to avoid copies. Use ref for read-write access. Neither causes a heap allocation — both pass a managed reference to the existing memory location.
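As a small illustration of the three modifiers, consider the sketch below. Particle and ParticleMath are hypothetical names chosen for the example; the struct is 48 bytes, large enough that passing it by value would copy noticeably more than a reference.

```csharp
using System;

public struct Particle
{
    public double X, Y, Z;
    public double VelX, VelY, VelZ;
}

public static class ParticleMath
{
    // 'in': read-only reference. No 48-byte copy, and the callee cannot assign to p.
    // Only fields are read here, so no defensive copy is made either.
    public static double Speed(in Particle p) =>
        Math.Sqrt(p.VelX * p.VelX + p.VelY * p.VelY + p.VelZ * p.VelZ);

    // 'ref': writable reference. The caller's storage is updated in place.
    public static void Step(ref Particle p, double dt)
    {
        p.X += p.VelX * dt;
        p.Y += p.VelY * dt;
        p.Z += p.VelZ * dt;
    }

    // 'out': also a reference, but the callee must assign it before returning.
    public static bool TryCreate(double speed, out Particle p)
    {
        p = default;
        if (speed < 0) return false;
        p.VelX = speed;
        return true;
    }
}
```

A caller writes ParticleMath.Step(ref particles[i], dt) to update an array element in place; none of the three calls touches the managed heap.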
04 — Span<T> and Memory<T>
Span<T> is a stack-allocated view over a contiguous region of memory — an array, a stack-allocated buffer, or native memory. It carries a pointer and a length. No heap allocation. It replaces many patterns that previously required creating sub-arrays or substrings.
// Before: allocates a new string
string line = input.Substring(0, newline);
// After: zero allocation
ReadOnlySpan<char> line = input.AsSpan(0, newline);
// Slicing an array without allocation
byte[] buffer = new byte[4096];
Span<byte> header = buffer.AsSpan(0, 16); // no copy, no alloc
Span<byte> payload = buffer.AsSpan(16);
Memory<T> and ReadOnlyMemory<T>
Memory<T> is the heap-compatible counterpart. Use it when you need to store the slice reference in a field, pass it across async boundaries, or use it as a generic type argument — all of which Span<T> prohibits. Obtain a Span<T> from it only at the point of use via .Span.
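A minimal sketch of that division of labour follows. FrameProcessor and its members are hypothetical names; the point is only that the slice is held as ReadOnlyMemory<byte> in a field and converted to a span at the last moment, after the await.

```csharp
using System;
using System.Threading.Tasks;

public sealed class FrameProcessor
{
    // Memory<T>/ReadOnlyMemory<T> can be stored in a field of a class; Span<T> cannot.
    private readonly ReadOnlyMemory<byte> _payload;

    public FrameProcessor(byte[] frame, int headerLength)
    {
        // Slicing creates a new view over the same array, not a copy.
        _payload = frame.AsMemory(headerLength);
    }

    public async Task<int> ChecksumAsync()
    {
        await Task.Yield();        // stand-in for real asynchronous work
        return Sum(_payload.Span); // take the span only at the point of use
    }

    private static int Sum(ReadOnlySpan<byte> bytes)
    {
        int total = 0;
        foreach (byte b in bytes) total += b;
        return total;
    }
}
```

Keeping the span inside the synchronous helper Sum means no Span<T> value has to live across the await.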
Many BCL APIs have Span<T> and Memory<T> overloads. Check MemoryExtensions, System.Text.Unicode.Utf8, System.Buffers.Text.Utf8Parser, and System.IO.Pipelines before writing custom parsing code.
05 — ArrayPool<T> and MemoryPool<T>
When a temporary array is needed, rent from ArrayPool<T>.Shared instead of allocating. The pool maintains per-thread buckets of arrays sized to powers of two. Rented arrays must be returned.
byte[] rented = ArrayPool<byte>.Shared.Rent(minLength: 256);
try
{
Span<byte> span = rented.AsSpan(0, 256); // rented array may be larger
DoWork(span);
}
finally
{
ArrayPool<byte>.Shared.Return(rented, clearArray: false);
}
The rented array is at least minLength long but may be larger. Always use the actual required length as the upper bound when slicing, not rented.Length.
MemoryPool<T> provides an IMemoryOwner<T> handle that returns the buffer to the pool on disposal, which is cleaner for async code where try/finally is awkward.
using IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(256);
await ProcessAsync(owner.Memory.Slice(0, 256));
ObjectPool<T>
For reference types that are expensive to construct (e.g., StringBuilder, parsers, connection objects), use Microsoft.Extensions.ObjectPool.ObjectPool<T>. The default implementation retains a bounded number of instances and creates a new one whenever the pool is empty, so Get never blocks.
06 — stackalloc
stackalloc allocates a contiguous block on the stack and returns a Span<T>. No GC involvement. The memory is freed when the method returns.
Span<byte> buf = stackalloc byte[128];
FillHeader(buf);
Using stackalloc for large or variable-size buffers risks StackOverflowException. Use a threshold and fall back to ArrayPool when the size is not statically known.
const int StackThreshold = 256;
byte[] rented = null;
Span<byte> buf = length <= StackThreshold
? stackalloc byte[StackThreshold]
: (rented = ArrayPool<byte>.Shared.Rent(length));
try { DoWork(buf.Slice(0, length)); }
finally { if (rented != null) ArrayPool<byte>.Shared.Return(rented); }
07 — String Allocations
Strings are immutable reference types. Any transformation — concatenation, substring, format — produces a new heap object.
String interpolation and +
In .NET 6+, interpolated strings are lowered to DefaultInterpolatedStringHandler, which appends each hole into a pooled or caller-supplied buffer and avoids boxing the arguments and allocating an intermediate object[]. The final string is still a new heap object, though, and so is every result of +. In hot paths, avoid building strings at all, or format directly into a Span<char>.
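One way to keep interpolation syntax without producing a string is MemoryExtensions.TryWrite (.NET 6+), which formats an interpolated string straight into a caller-supplied Span<char>. The sketch below assumes a hypothetical LogFormatter helper and message layout.

```csharp
using System;
using System.Globalization;

public static class LogFormatter
{
    // Writes "id=<id> count=<count>" into 'destination' without creating a string.
    // Returns false (and the buffer contents are unspecified) if it does not fit.
    public static bool TryFormat(Span<char> destination, int id, int count, out int charsWritten) =>
        destination.TryWrite(CultureInfo.InvariantCulture, $"id={id} count={count}", out charsWritten);
}
```

A typical caller declares Span<char> buf = stackalloc char[64], calls TryFormat, and then hands buf.Slice(0, charsWritten) to a writer that accepts ReadOnlySpan<char>.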
Span-based alternatives
Use MemoryExtensions.AsSpan(), MemoryExtensions.StartsWith(), MemoryExtensions.IndexOf(), and related methods to operate on string data without creating substrings.
// Allocates
bool ok = input.Substring(0, 4) == "HTTP";
// Zero allocation
bool ok = input.AsSpan(0, 4).SequenceEqual("HTTP");
Utf8 strings
System.Text.Unicode.Utf8 and System.Buffers.Text.Utf8Parser operate on ReadOnlySpan<byte> directly. Parsing numbers, dates, and GUIDs from a UTF-8 byte stream without converting to a string eliminates both the string allocation and the encoding conversion.
Utf8Parser.TryParse(utf8Bytes, out int value, out int bytesConsumed);
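The bytesConsumed result is what makes it possible to walk a buffer field by field. As a sketch, the hypothetical Utf8Csv.TrySum helper below sums comma-separated integers directly from UTF-8 bytes, never materializing a string or a substring.

```csharp
using System;
using System.Buffers.Text;

public static class Utf8Csv
{
    // Sums comma-separated integers in a UTF-8 buffer such as "10,20,30".
    // Returns false on malformed input. Empty input yields 0, and a single
    // trailing comma is tolerated.
    public static bool TrySum(ReadOnlySpan<byte> utf8, out long sum)
    {
        sum = 0;
        while (!utf8.IsEmpty)
        {
            if (!Utf8Parser.TryParse(utf8, out int value, out int consumed))
                return false;                  // not a number at this position
            sum += value;
            utf8 = utf8.Slice(consumed);       // advance past the digits
            if (utf8.IsEmpty) break;
            if (utf8[0] != (byte)',') return false;
            utf8 = utf8.Slice(1);              // advance past the separator
        }
        return true;
    }
}
```

Because slicing a span only adjusts a pointer and a length, the loop performs no heap allocation regardless of how many fields the input contains.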
String.Create
When a string must be produced, string.Create<TState>(length, state, action) allocates exactly once and fills the buffer via a Span<char>. Prefer this over StringBuilder when the final length is known.
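For instance, a hex encoder knows its output length up front (two characters per byte), so it can be written against string.Create. HexFormatter below is a hypothetical helper meant to show the shape of the call; the BCL's Convert.ToHexString already covers this particular job in .NET 5+.

```csharp
using System;

public static class HexFormatter
{
    // Produces a lowercase hex string with exactly one allocation: the result itself.
    public static string ToHex(byte[] bytes)
    {
        // The array is passed as state, so the static lambda captures nothing.
        return string.Create(bytes.Length * 2, bytes, static (chars, state) =>
        {
            const string Digits = "0123456789abcdef";
            for (int i = 0; i < state.Length; i++)
            {
                chars[2 * i] = Digits[state[i] >> 4];      // high nibble
                chars[2 * i + 1] = Digits[state[i] & 0xF]; // low nibble
            }
        });
    }
}
```

The callback writes into the string's own character buffer before the string is published, which is why no intermediate char[] or StringBuilder is needed.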
08 — Boxing
Boxing wraps a value type in a heap-allocated object. It is invisible in source code and easily overlooked.
| Pattern | Boxes? |
|---|---|
| int x = 5; object o = x; | Yes — implicit conversion to object |
| interface IFoo; struct S : IFoo | Yes — S cast to IFoo allocates |
| string.Format("{0}", intValue) | Yes — the argument is passed as object |
| Enum key in Dictionary<TEnum, T> | Only on older runtimes; see Enum comparisons below |
| Generic method with T : struct | No — JIT specializes per value type |
| Span<T> with T : struct | No |
Interface dispatch on structs
Casting a struct to an interface boxes. Pass structs as generic type parameters constrained to the interface instead:
// Boxes S on each call
void Process(IProcessor p) { ... }
// No boxing — JIT generates specialized code for each T
void Process<T>(T p) where T : struct, IProcessor { ... }
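Filled in with concrete types, the two signatures might look like the sketch below. IProcessor here is given a single hypothetical Process(int) member, and Doubler and Pipeline are made-up names; only the shape of the generic constraint matters.

```csharp
using System;

public interface IProcessor
{
    int Process(int input);
}

public struct Doubler : IProcessor
{
    public int Process(int input) => input * 2;
}

public static class Pipeline
{
    // Interface-typed parameter: passing a Doubler converts it to IProcessor,
    // which boxes the struct at the call site.
    public static int RunBoxed(IProcessor p, int x) => p.Process(x);

    // Constrained generic: the JIT compiles a body specialized for Doubler and
    // calls Process directly on the value, so no box is created.
    public static int RunGeneric<T>(T p, int x) where T : struct, IProcessor => p.Process(x);
}
```

Both calls return the same result; the difference only shows up in an allocation profile. The generic version also gives the JIT a chance to inline Process, which the interface call usually prevents.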
Enum comparisons
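Using an enum as a dictionary key with the default comparer boxed on older .NET Framework versions, because the lookup went through Object.Equals and Object.GetHashCode. On .NET Core and .NET 5+, EqualityComparer<T>.Default is specialized for enums and the boxing is gone. If you must target a runtime where it still boxes, pass a custom IEqualityComparer<TEnum> that compares the underlying integer values.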
09 — Collections
Pre-size collections
Every internal array resize in List<T>, Dictionary<K,V>, and HashSet<T> allocates a new backing array and copies. If the final count is known or estimable, pass the capacity to the constructor.
var list = new List<Record>(expectedCount);
var dict = new Dictionary<int, string>(capacity: 64);
Avoid LINQ in hot paths
LINQ methods allocate enumerators, state machines, and intermediate collections. Replace with imperative loops in hot paths. A foreach over a List<T> or array is allocation-free when the variable is declared with the concrete type (not IEnumerable<T>).
// Allocates: IEnumerable wrapper + enumerator
foreach (var x in list.Where(x => x.Active)) { ... }
// Zero allocation
foreach (var x in list) { if (x.Active) { ... } }
CollectionsMarshal
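System.Runtime.InteropServices.CollectionsMarshal provides low-level access to collection internals without allocation. CollectionsMarshal.AsSpan(list) returns a Span<T> over the internal array of a List<T> for direct iteration or mutation. The span is only valid while the list keeps its current backing array, so do not add or remove items while you hold it.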
Span<Item> span = CollectionsMarshal.AsSpan(items);
for (int i = 0; i < span.Length; i++) { span[i].Process(); }
10 — Closures & Delegates
Every lambda that captures a variable from the enclosing scope causes the compiler to generate a closure class. Creating that delegate allocates the closure object plus the delegate itself.
int threshold = 42;
var result = list.FindAll(x => x.Value > threshold); // allocates closure + delegate
Strategies to avoid this:
Static lambdas — In C# 9+, mark a lambda static to prevent accidental capture. The compiler will error if a capture is attempted. A static lambda with no captures is cached as a single delegate instance.
list.FindAll(static x => x.IsActive); // no allocation after first call
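Pass state explicitly — Use overloads that accept a state argument so the lambda does not need to capture:
// ConcurrentDictionary.GetOrAdd with a factory argument: prefix is passed in, not captured
string label = cache.GetOrAdd(id, static (key, prefix) => prefix + key, prefix);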
Cached delegates — For instance methods used as callbacks repeatedly, store the delegate in a field to avoid repeated allocation:
private readonly Action<Packet> _onReceive;
public Handler() { _onReceive = OnReceive; } // allocate once
public void Register() { bus.Subscribe(_onReceive); } // no allocation
11 — async/await
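Each async method compiles into a state machine (a struct in Release builds). If every awaited operation has already completed, the method runs synchronously to the end and the state machine never leaves the stack. For Task-returning methods the builder can then return a cached completed task, but only for Task and a few common Task<T> results such as bool and small integers; other results still allocate a Task<T>. When the method actually suspends, the state machine is moved to the heap.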
ValueTask
Use ValueTask and ValueTask<T> for methods that frequently complete synchronously. ValueTask is a struct that avoids the heap allocation when synchronous completion is the common case. When it must suspend, it falls back to allocating a Task or using an IValueTaskSource.
// Completes synchronously from buffered data; no Task is created in that case
public ValueTask<int> ReadAsync(Memory<byte> buffer)
{
if (_buffer.Length > 0)
{
int n = Drain(buffer.Span);
return new ValueTask<int>(n); // no allocation
}
return ReadSlowAsync(buffer); // allocates only on actual I/O
}
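A ValueTask must be consumed at most once: never await it twice, never await it concurrently from multiple callers, and never block on .GetAwaiter().GetResult() before it has completed. Do not cache or share ValueTask instances. If you need to await multiple times, call .AsTask() once and use the resulting Task.
IValueTaskSource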
For high-performance scenarios (e.g., System.IO.Pipelines, socket I/O), implement IValueTaskSource<T> to reuse the completion source object between operations. This allows completely allocation-free async I/O after warm-up.
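The BCL ships ManualResetValueTaskSourceCore<T> to do most of the bookkeeping. The sketch below wraps it in a hypothetical ReusableSignal class that supports one pending wait at a time; it is a simplified illustration of the reuse pattern, not how Pipelines or sockets implement it.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical single-consumer signal: at most one WaitAsync may be pending,
// and each returned ValueTask must be awaited exactly once.
public sealed class ReusableSignal : IValueTaskSource<int>
{
    // Mutable struct holding result, continuation, and version; must not be readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    // Hands out a ValueTask backed by this object instead of a new Task.
    public ValueTask<int> WaitAsync() => new ValueTask<int>(this, _core.Version);

    // Completes the pending operation with a result.
    public void Signal(int value) => _core.SetResult(value);

    int IValueTaskSource<int>.GetResult(short token)
    {
        try { return _core.GetResult(token); }
        finally { _core.Reset(); } // bump the version so the object can be reused
    }

    ValueTaskSourceStatus IValueTaskSource<int>.GetStatus(short token) =>
        _core.GetStatus(token);

    void IValueTaskSource<int>.OnCompleted(Action<object?> continuation, object? state,
        short token, ValueTaskSourceOnCompletedFlags flags) =>
        _core.OnCompleted(continuation, state, token, flags);
}
```

The version token is what enforces the consume-once rule: after GetResult resets the core, a stale ValueTask carries an old token and the core throws instead of returning another operation's result.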
ConfigureAwait and context
Using ConfigureAwait(false) avoids capturing the SynchronizationContext, which itself can involve allocations and cross-thread marshaling overhead in UI or ASP.NET Framework contexts.
12 — Tooling
Allocation-free code is only worth writing where it actually matters. Measure first.
| Tool | Use |
|---|---|
| dotMemory | Heap snapshot diffing, allocation call trees, class retention |
| BenchmarkDotNet | Micro-benchmarks with MemoryDiagnoser; reports bytes allocated per operation |
| PerfView / ETW | GC event traces, allocation sampling, large-scale profiling on Windows |
| Roslyn Analyzers | Detects boxing, closure captures, missing readonly — runs in-IDE |
| dotnet-trace | Cross-platform event tracing, GC heap stats |
| Allocation-aware unit tests | Assert zero allocations in hot paths using GC.GetAllocatedBytesForCurrentThread() |
// Assert allocation budget in tests
HotPath(input); // warm-up call so JIT and one-time initialization allocations are not counted
long before = GC.GetAllocatedBytesForCurrentThread();
HotPath(input);
long after = GC.GetAllocatedBytesForCurrentThread();
Assert.Equal(before, after); // strict zero-alloc assertion
Add [MemoryDiagnoser] to BenchmarkDotNet benchmarks to get per-invocation allocation figures in the results table alongside throughput numbers. This makes allocation regressions visible in CI.