Nanite vs Traditional LODs

Architecture Overview

Traditional LOD Pipeline

The conventional approach requires artists or DCCs to produce discrete mesh levels — typically four to six — at progressively reduced triangle counts. The engine selects among them at runtime based on screen-space pixel coverage, trading detail for rasterization cost as distance increases. Transitions are managed via crossfades or hard swaps, and every LOD level must be stored in GPU memory simultaneously for the currently visible meshes.

LOD selection in UE4/5 is controlled by a ScreenSize threshold per level. The engine computes the mesh's projected size as a fraction of the viewport and, because thresholds decrease with each level, picks the coarsest LOD whose threshold the mesh still falls below. A specific LOD index can be forced globally with r.ForceLOD or per component with the Forced LOD Model setting in the Details panel.

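// Simplified illustration of screen-size LOD selection (not the actual engine source).
// Assumes LODData is ordered LOD0..LODn with descending ScreenSize thresholds.
int32 ComputeLODForMeshes(
    const TArray<FStaticMeshLODInfo>& LODData,
    float ScreenSize,
    float LODBias   // treated here as a multiplicative scale on each threshold
) {
    int32 LODIndex = 0;
    for (int32 i = 0; i < LODData.Num(); ++i) {
        if (ScreenSize < LODData[i].ScreenSize * LODBias) {
            LODIndex = i;   // still below this threshold: a coarser LOD is acceptable
        } else {
            break;          // thresholds only shrink from here, so stop searching
        }
    }
    return LODIndex;
}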
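
To make the threshold comparison concrete, the sketch below estimates a screen-size value from a bounding-sphere radius, camera distance, and field of view, then picks a LOD. The projection formula is a common approximation (sphere diameter over the view extent at that distance), not the engine's exact computation, and the function names and threshold values are hypothetical.

```python
import math

def screen_size(sphere_radius: float, distance: float, fov_degrees: float = 90.0) -> float:
    """Approximate fraction of the viewport spanned by a bounding sphere's diameter."""
    if distance <= 0.0:
        return 1.0  # camera is at or inside the sphere centre: treat as full screen
    half_fov = math.radians(fov_degrees) / 2.0
    half_extent = distance * math.tan(half_fov)  # half the view extent at this distance
    return min(1.0, sphere_radius / half_extent)  # diameter / full extent, clamped

def select_lod(size: float, thresholds: list[float]) -> int:
    """Return the coarsest LOD whose threshold the mesh still falls below.

    thresholds must be ordered LOD0..LODn and descending (assumed values below).
    """
    lod = 0
    for i, threshold in enumerate(thresholds):
        if size < threshold:
            lod = i
        else:
            break
    return lod

# Hypothetical thresholds for a four-level chain.
THRESHOLDS = [1.0, 0.5, 0.25, 0.1]
near_lod = select_lod(screen_size(100.0, 300.0), THRESHOLDS)
far_lod = select_lod(screen_size(100.0, 2000.0), THRESHOLDS)
```

With these assumed numbers the mesh at distance 300 lands on LOD 1 and the mesh at distance 2000 on LOD 3, which mirrors the descending-threshold behaviour described above.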

Nanite's Virtualized Geometry

Nanite replaces the per-mesh LOD chain with a single asset that stores the source mesh as a BVH of clusters, each containing ~128 triangles, with progressively simplified cluster groups toward the root. Each frame, a GPU visibility pass traverses the hierarchy and selects only the clusters whose screen-space error would otherwise be perceptible; a software rasterizer handles small triangles and hardware rasterization handles large ones. There are no per-mesh discrete levels: detail is chosen per cluster from the precomputed hierarchy, so it varies smoothly with screen coverage rather than being tessellated at runtime.
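
The error-driven selection can be illustrated with a small tree walk. This is a minimal sketch under simplifying assumptions: a plain tree instead of Nanite's actual hierarchy, a single camera distance for every node, and a hypothetical `pixels_per_unit_at_1` projection factor. It is not Nanite's implementation, which runs in parallel on the GPU.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    error: float  # geometric error of this simplified group, in world units
    children: list["Node"] = field(default_factory=list)

def projected_error_px(error: float, distance: float, pixels_per_unit_at_1: float) -> float:
    """Project a world-space error to pixels with a simple perspective divide."""
    return error * pixels_per_unit_at_1 / max(distance, 1e-6)

def select_cut(node: Node, distance: float, pixels_per_unit_at_1: float, max_px: float = 1.0) -> list[str]:
    """Descend while a node's error would be visible; draw the first node that is not."""
    px = projected_error_px(node.error, distance, pixels_per_unit_at_1)
    if px <= max_px or not node.children:
        return [node.name]
    selected: list[str] = []
    for child in node.children:
        selected.extend(select_cut(child, distance, pixels_per_unit_at_1, max_px))
    return selected

# Hypothetical three-level hierarchy: leaves carry the source triangles (error 0).
example = Node("root", 4.0, [
    Node("a", 1.0, [Node("a0", 0.0), Node("a1", 0.0)]),
    Node("b", 1.0, [Node("b0", 0.0), Node("b1", 0.0)]),
])
```

Calling `select_cut(example, d, 1000.0)` at decreasing distances returns progressively finer node sets, which is the continuous behaviour that discrete per-mesh LODs cannot provide.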

Memory & Streaming

Traditional LODs generally keep every level of a loaded mesh resident in VRAM. For a scene with ten thousand unique meshes each carrying five LODs, the footprint compounds quickly. UE's texture streaming budgets cover material textures; mesh LOD streaming exists (r.MeshStreaming) but works at whole-LOD granularity, so geometry for loaded cells cannot be paged in pieces smaller than a full LOD.
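
A back-of-the-envelope calculation shows how the chain compounds. The per-level reduction ratio (0.5) and the bytes-per-triangle figure (48) are assumptions chosen for illustration, not values from the engine.

```python
def lod_chain_triangles(lod0_triangles: int, levels: int, reduction: float = 0.5) -> int:
    """Total triangles stored across a LOD chain where each level keeps
    `reduction` of the previous level's triangles (assumed ratio)."""
    return sum(int(lod0_triangles * reduction ** i) for i in range(levels))

def footprint_mb(meshes: int, lod0_triangles: int, levels: int,
                 bytes_per_triangle: int = 48, reduction: float = 0.5) -> float:
    """Rough geometry footprint in MiB for `meshes` unique meshes (assumed byte cost)."""
    total_triangles = meshes * lod_chain_triangles(lod0_triangles, levels, reduction)
    return total_triangles * bytes_per_triangle / (1024 ** 2)

# Ten thousand unique meshes, 10,000 triangles at LOD0, five levels each.
chain = lod_chain_triangles(10_000, 5)
overhead = footprint_mb(10_000, 10_000, 5) / footprint_mb(10_000, 10_000, 1)
```

Under these assumptions the five-level chain stores about 1.94 times the LOD0 triangle count, and all of it stays resident for every loaded mesh.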

Nanite streams geometry data as clusters on demand. Only clusters contributing visible pixels are fetched. The system maintains a GPU-side page table analogous to virtual texture streaming: geometry pages are evicted and loaded at sub-mesh granularity based on visibility. This decouples asset complexity from VRAM cost in a way discrete LODs cannot.
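
The page-table idea can be sketched as a fixed-budget cache with least-recently-used eviction. This is an illustrative model only: the class and its eviction policy are hypothetical, and the 128 KB page size is the figure quoted in this article.

```python
from collections import OrderedDict

PAGE_SIZE_BYTES = 128 * 1024  # page size quoted in this article

class GeometryPageCache:
    """Toy model of a geometry page pool with LRU eviction (assumed policy)."""

    def __init__(self, budget_bytes: int):
        self.capacity = max(1, budget_bytes // PAGE_SIZE_BYTES)
        self.resident: OrderedDict[int, bool] = OrderedDict()  # oldest entry first
        self.loads = 0  # number of pages fetched from disk

    def request(self, visible_pages: list[int]) -> None:
        """Mark the pages needed this frame, loading and evicting as required."""
        for page in visible_pages:
            if page in self.resident:
                self.resident.move_to_end(page)  # refresh recency
                continue
            if len(self.resident) >= self.capacity:
                self.resident.popitem(last=False)  # evict least recently used
            self.resident[page] = True
            self.loads += 1

cache = GeometryPageCache(budget_bytes=3 * PAGE_SIZE_BYTES)
cache.request([1, 2, 3])  # frame 1: three pages become resident
cache.request([1, 4])     # frame 2: page 1 is reused, page 4 evicts page 2
```

The point of the model is that the budget is expressed in pages, not meshes: a mesh with millions of source triangles only occupies as many pages as its visible clusters need.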

Cluster Size: ~128 triangles per Nanite cluster
Page Size: 128 KB per Nanite geometry streaming page
LOD Disk Cost: ~1.5× typical source mesh overhead for 4 LODs
Visibility Pass: GPU (async compute; no CPU round-trip)

Caveat

Nanite's cooked data is larger on disk than a conventional mesh without LODs. The BVH and cluster hierarchy add overhead. The advantage is runtime streaming granularity, not raw storage size.

Runtime Performance

Triangle Budget

The fundamental premise of traditional LODs is reducing rasterization work proportionally to screen coverage. At scale, this works well if scenes have low unique mesh counts and predictable viewing distances. It breaks down in dense open worlds where thousands of meshes sit at intermediate distances, all rendering mid-LOD geometry that is neither cheap nor detailed.

Nanite's visibility pass costs a near-fixed GPU overhead regardless of total scene triangle count, because it only rasterizes what is visible at pixel resolution. In benchmarks from Epic's internal Fortnite data, scenes exceeding one billion source triangles rendered at comparable GPU cost to the same scene with hand-authored LODs at ~10M visible triangles. The bottleneck shifts from triangle count to cluster count and depth complexity.

Draw Call Overhead

Traditional rendering submits at least one draw call per mesh section for each mesh and LOD in view. With ten thousand mesh instances, even with GPU instancing, draw submission becomes a CPU-side bottleneck. UE4's Instanced Static Mesh (ISM) and Hierarchical Instanced Static Mesh (HISM) components mitigate this but require explicit authoring decisions.

Nanite consolidates all visible clusters into a single indirect draw dispatch after the visibility pass. CPU-side draw call cost for Nanite geometry is effectively O(1) relative to instance count. This is the primary performance advantage in scenes with massive unique mesh populations.
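
A toy count makes the scaling difference visible. The model below is an assumption-laden sketch: traditional draws are batched by (mesh, LOD) pair when instancing is on, and the Nanite side is modeled as a constant purely to reflect the O(1) claim above. Real frame costs depend on materials, passes, and engine version.

```python
def traditional_draw_calls(visible: list[tuple[str, int]], instanced: bool = True) -> int:
    """visible holds one (mesh, lod) pair per visible instance."""
    if instanced:
        return len(set(visible))  # one call per unique (mesh, lod) batch
    return len(visible)           # one call per instance

def nanite_draw_submissions(visible: list[tuple[str, int]], fixed_cost: int = 1) -> int:
    """Modeled as constant, following the O(1) claim in the text (assumption)."""
    return fixed_cost

# Hypothetical scene: 10,000 rocks spread over 3 LODs, 5,000 trees over 4 LODs.
visible = [("rock", i % 3) for i in range(10_000)] + [("tree", i % 4) for i in range(5_000)]
```

Instancing collapses this scene from 15,000 calls to 7, but the batched count still grows with every additional unique mesh and LOD level, whereas the modeled Nanite submission count does not.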

Factor	Traditional LOD	Nanite
Draw calls	Proportional to instances	~O(1) indirect dispatch
Triangle cost	LOD-bounded, not scene-bounded	Pixel-bounded via cluster cull
CPU overhead	LOD selection per actor, per frame	GPU visibility pass; minimal CPU
Depth complexity	Managed by LOD simplification	Still expensive; overdraw applies
Skinned meshes	Fully supported	Not supported (UE 5.4)
Transparency	Full blend mode support	Masked only; no translucency
Shadow rendering	Separate shadow LODs required	Virtual Shadow Maps integrate natively
Deformation / WPO	Full vertex shader access	Supported but costly — disables fast path

Shadow Cost

In traditional pipelines, shadow maps require shadow-specific LODs to keep rasterization cost tractable for deep shadow cascades. Nanite integrates with UE5's Virtual Shadow Maps, which tile and cache shadow pages independently. Shadow clusters are culled the same way as primary visibility clusters. This removes the need for shadow proxy meshes and substantially reduces artist overhead.

Foliage & Landscape Considerations

Foliage is among the most LOD-intensive content categories in large scenes. Traditional foliage uses HISM with explicit LODs and imposter atlases at far distances. The transition between LODs at medium range is frequently visible and requires careful offset tuning.

Nanite supports foliage as of UE 5.1, when masked materials and WPO became available on Nanite meshes; it is enabled on the static mesh asset that the foliage type references. Each instance is treated as an independent Nanite mesh. However, foliage presents a specific challenge: with WPO wind deformation enabled, Nanite must evaluate per-cluster bounds conservatively, which makes culling less effective and disables the fastest rasterization path. For dense vegetation at scale, the tradeoff is non-trivial.

// Console variables affecting Nanite WPO cost and cluster selection.
// Names differ between engine versions; confirm each with the console's autocomplete.
// r.Nanite.AllowWPODistanceDisable: honor the per-instance WPO disable distance (1 = on)
// r.Nanite.MaxPixelsPerEdge: cluster selection quality, default 1.0
// The cutoff itself is the "World Position Offset Disable Distance" property
// on the component or foliage type (world units; 10000 = 100 m).

r.Nanite.AllowWPODistanceDisable 1
r.Nanite.MaxPixelsPerEdge        2.0     // coarser cluster selection, better perf
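
The effect of a WPO cutoff distance can be pictured as a per-instance distance test. The function below is a hypothetical illustration of that rule, not engine code; treating a distance of 0 as "never disable" is an assumption of this sketch.

```python
import math

def wpo_enabled(instance_pos: tuple[float, float, float],
                camera_pos: tuple[float, float, float],
                disable_distance: float) -> bool:
    """True if WPO should still be evaluated for this instance.

    Distances are in world units (centimetres in Unreal, so 10000 = 100 m).
    """
    if disable_distance <= 0.0:
        return True  # assumed convention: 0 means the cutoff is off
    return math.dist(instance_pos, camera_pos) < disable_distance

camera = (0.0, 0.0, 0.0)
near_tree = wpo_enabled((5_000.0, 0.0, 0.0), camera, 10_000.0)   # 50 m away
far_tree = wpo_enabled((20_000.0, 0.0, 0.0), camera, 10_000.0)   # 200 m away
```

Instances beyond the cutoff render as static geometry, which lets them use tighter bounds and the cheaper rasterization path.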

Practical Note

For background foliage beyond a few hundred meters, disabling WPO on Nanite foliage and using a static impostor system remains the most performant configuration. Nanite's advantage is largest in the 5–150m range where LOD transitions are most visible.

Landscape

By default, UE5's Landscape system does not render through Nanite; it uses its own heightfield LOD system. Landscape LOD is separate from Nanite and must be tuned independently via r.LandscapeLODBias and the LOD distribution settings. Recent engine versions (UE 5.3 onward) can optionally build a Nanite representation of a landscape, so check the status of that option in your engine version before relying on it. Nanite meshes placed on top of landscape (rocks, props, ruins) benefit from Nanite as usual.

Material Constraints

Nanite imposes material restrictions that have direct pipeline implications:

Only Opaque and Masked blend modes are supported. Translucent or additive materials fall back to non-Nanite rendering for those meshes.
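Pixel Depth Offset (PDO) support depends on engine version: early UE5 releases did not support it on Nanite meshes, where it causes visual artifacts. Verify against your target version before applying PDO to Nanite meshes.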
Two-sided materials are supported but increase cluster cost because front and back faces must both be considered during visibility.
World Position Offset is supported but, as noted, degrades performance by disabling the software rasterizer fast path. Culling relies on bounds that must conservatively cover the displaced geometry, so large or high-frequency WPO either inflates those bounds or risks clusters being culled incorrectly.
Custom Depth pass works with Nanite but requires explicit opt-in per primitive.

Pipeline Risk

Asset packs authored for UE4 frequently contain translucent materials, PDO, and WPO combinations that break or degrade Nanite rendering. Validate material compatibility before enabling Nanite on bulk-imported assets.
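
A validation pass of the kind suggested above could look like the following sketch. It works on plain dictionaries with a hypothetical schema (`blend_mode`, `uses_pdo`, `uses_wpo`, `two_sided`); a real pipeline would have to populate those fields from the editor's material data, which is not shown here.

```python
SUPPORTED_BLEND_MODES = {"opaque", "masked"}

def nanite_material_issues(material: dict) -> list[str]:
    """Return the constraints from the list above that this material trips."""
    issues = []
    if material.get("blend_mode", "opaque").lower() not in SUPPORTED_BLEND_MODES:
        issues.append("unsupported blend mode")
    if material.get("uses_pdo", False):
        issues.append("pixel depth offset")
    if material.get("uses_wpo", False):
        issues.append("world position offset (performance)")
    if material.get("two_sided", False):
        issues.append("two-sided (performance)")
    return issues

def split_by_compatibility(materials: dict[str, dict]) -> tuple[list[str], dict[str, list[str]]]:
    """Partition material names into clean ones and ones needing review."""
    clean, flagged = [], {}
    for name, material in materials.items():
        issues = nanite_material_issues(material)
        if issues:
            flagged[name] = issues
        else:
            clean.append(name)
    return clean, flagged

# Hypothetical material descriptions from a bulk import.
pack = {
    "M_Rock": {"blend_mode": "Opaque"},
    "M_Glass": {"blend_mode": "Translucent"},
    "M_Leaf": {"blend_mode": "Masked", "uses_wpo": True, "two_sided": True},
}
clean, flagged = split_by_compatibility(pack)
```

Separating hard failures (blend mode, PDO) from performance warnings (WPO, two-sided) in the report would be a natural next step for a production check.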

Authoring & Pipeline Cost

Traditional LOD Cost

Maintaining a hand-authored LOD pipeline requires generating and validating four to six discrete mesh levels per unique asset. For large-scale environments with hundreds of hero assets, this is a significant ongoing authoring cost. Auto-generated LODs (Simplygon, built-in UE reduction) frequently fail on assets with UV seams, hard normals, or complex topology, requiring manual correction.

Nanite Authoring Cost

Nanite requires only enabling the option on import and ensuring the mesh is a closed or well-formed surface. The complexity management is fully automated. This shifts artist time from LOD generation to source mesh quality — which is a more productive allocation. However, the source mesh must be Nanite-compatible: no skeletal binding, no translucent sections.

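# Enabling Nanite on an existing static mesh asset via Python scripting
# (run inside the Unreal Editor with the Python plugin enabled)
import unreal

asset_path = "/Game/Meshes/SM_RockFormation"
mesh = unreal.load_asset(asset_path)

# nanite_settings is a struct: edit the copy, then write it back to the asset
nanite_settings = mesh.get_editor_property("nanite_settings")
nanite_settings.set_editor_property("enabled", True)
nanite_settings.set_editor_property("fallback_percent_triangles", 1.0)  # keep all source triangles
nanite_settings.set_editor_property("fallback_relative_error", 0.0)     # allow no reduction error
mesh.set_editor_property("nanite_settings", nanite_settings)

unreal.EditorAssetLibrary.save_asset(asset_path)

The fallback_percent_triangles parameter controls the non-Nanite fallback mesh (used on platforms without Nanite support, e.g. mobile). Setting it to 1.0 keeps a full-resolution fallback; lower values reduce disk size at the cost of fallback fidelity. Property names follow the engine's MeshNaniteSettings struct and have changed between releases, so confirm them against your engine version.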

Culling Integration

Nanite performs hierarchical culling at the cluster BVH level. Frustum and occlusion culling happen inside the GPU visibility pass using a two-pass depth pyramid approach: the first pass renders previously-visible clusters to build an occluder depth buffer; the second pass tests new clusters against it. This is functionally similar to HZB occlusion culling in the traditional pipeline but operates at cluster granularity without CPU readback.
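
The two-pass idea can be sketched on a toy screen made of tiles. This is a simplified model under stated assumptions: each cluster fully covers its tiles, depth is a single value per cluster, and pass one draws last frame's visible set without re-testing it. The names are hypothetical and the real pass runs on the GPU against a depth pyramid.

```python
def two_pass_cull(clusters: dict[str, tuple[tuple[int, ...], float]],
                  previously_visible: set[str]) -> list[str]:
    """clusters maps id -> (covered tiles, depth); smaller depth is closer."""
    depth: dict[int, float] = {}  # tile -> nearest depth drawn so far

    def draw(cluster_id: str) -> None:
        tiles, d = clusters[cluster_id]
        for tile in tiles:
            depth[tile] = min(depth.get(tile, float("inf")), d)

    def occluded(cluster_id: str) -> bool:
        tiles, d = clusters[cluster_id]
        # Hidden only if every covered tile already holds something closer.
        return all(depth.get(tile, float("inf")) < d for tile in tiles)

    # Pass 1: draw last frame's visible clusters to build the occluder depth.
    visible = [c for c in clusters if c in previously_visible]
    for c in visible:
        draw(c)

    # Pass 2: test the remaining clusters against that depth, then draw survivors.
    survivors = [c for c in clusters if c not in previously_visible and not occluded(c)]
    for c in survivors:
        draw(c)
    return visible + survivors

scene = {
    "wall": ((0, 1), 10.0),
    "crate_behind_wall": ((0,), 20.0),
    "tree": ((2,), 15.0),
}
result = two_pass_cull(scene, previously_visible={"wall"})
```

Here the crate is rejected because the wall already covers its tile at a closer depth, while the tree survives because nothing has been drawn in its tile.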

Traditional UE5 rendering uses hardware occlusion queries or HZB tests whose results are read back on the CPU. Results arrive one or more frames late, and at high instance counts the query volume and readback add CPU cost, latency, and potential stalls. Nanite's GPU-resident culling eliminates this round-trip.

// Profiling Nanite visibility pass in RenderDoc / UE Insights
// Key passes to inspect:
//   Nanite::InitCandidateNodes
//   Nanite::PersistentCull     — main BVH traversal
//   Nanite::Rasterize          — software + hardware raster
//   Nanite::EmitGBuffer        — resolve to GBuffer targets

// Stats via stat command:
stat nanite
stat nanitestreaming

Platform Support

Nanite requires SM6 / D3D12 (PC, Series X/S, PS5). It is not available on SM5, Vulkan below 1.2 (implementation-dependent), or mobile. Consoles below current-gen fall back to the Nanite fallback mesh defined at import. For cross-platform projects targeting last-gen hardware, a hybrid approach is mandatory: Nanite assets must carry a valid fallback LOD configuration.

Platform	Nanite	Notes
PC (DX12 SM6)	Full support	Requires a GPU and driver with Shader Model 6.6 (64-bit atomics)
PS5	Full support	Native console graphics API path
Xbox Series X/S	Full support	DX12U path
PS4 / Xbox One	Not supported	Fallback mesh used
Mobile (iOS/Android)	Not supported	Fallback mesh used
Vulkan (PC/Android)	Partial	Vulkan 1.3 + driver-dependent
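
For cross-platform builds, the table reduces to a simple routing rule. The sketch below uses hypothetical platform keys and treats only the "Full support" rows as Nanite-capable; partial Vulkan support is deliberately routed to the fallback here, which is a conservative assumption.

```python
NANITE_PLATFORMS = {"pc_dx12_sm6", "ps5", "xbox_series"}  # rows marked "Full support"

def geometry_path(platform: str, has_fallback_mesh: bool) -> str:
    """Pick the geometry path for a platform, or fail if no fallback exists."""
    if platform in NANITE_PLATFORMS:
        return "nanite"
    if has_fallback_mesh:
        return "fallback_mesh"
    raise ValueError(f"{platform}: no Nanite support and no fallback mesh configured")

paths = {p: geometry_path(p, has_fallback_mesh=True)
         for p in ("pc_dx12_sm6", "ps5", "ps4", "android")}
```

Raising on a missing fallback turns the "mandatory hybrid approach" into a build-time check rather than a runtime surprise.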

Decision Guidance

Use Nanite when

Target platforms support SM6
Scene has high unique mesh count or dense instance populations
Assets are opaque static geometry (architecture, terrain props, rocks)
Eliminating LOD pop is a visual priority
Artist bandwidth for LOD generation is limited
Using Virtual Shadow Maps (native integration)
Foliage in mid-range without heavy WPO

Avoid Nanite for

Skeletal / skinned meshes (characters, cloth)
Translucent or particle-driven materials
Heavy real-time WPO at high instance counts
Cross-gen projects without platform budget for fallback authoring
Meshes with PDO-based depth effects
Landscape (use native LOD system)
Small primitive counts where traditional LODs are already optimal

Profiling & Debugging

The primary tools for Nanite debugging are the visualization modes, accessible via the viewport's View Mode menu (Nanite Visualization) or the console:

// Nanite visualization modes
r.Nanite.Visualize triangles      // rendered triangle density
r.Nanite.Visualize clusters       // cluster boundaries
r.Nanite.Visualize overdraw       // depth complexity
r.Nanite.Visualize materialid     // per-material coverage
r.Nanite.Visualize instanceoverdraw // instance depth pile-up

// Key metrics from stat nanite:
// Nanite.Clusters.Total        — clusters evaluated in visibility pass
// Nanite.Triangles.Total       — triangles rasterized this frame
// Nanite.Pages.Streamed        — cluster pages streamed from disk
// Nanite.Culled.Percent        — culled vs. submitted ratio (target: >90%)

For traditional LOD debugging, stat initviews and ProfileGPU expose visibility, occlusion query, and draw cost. The LOD coloring view mode (View Mode > Level of Detail Coloring > Mesh LODs) overlays the current LOD level per mesh in the viewport, which is essential for identifying LOD bias and transition band issues.

Performance Target

A Nanite.Culled.Percent below 85% typically indicates depth complexity problems: a scene layout with excessive overdraw, such as a dense foliage canopy or closely stacked, interpenetrating meshes. This is a scene design issue, not a Nanite configuration issue, and overdraw costs apply equally to traditional rasterization.
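
The thresholds quoted in this section can be turned into a small check for captured stats. The function names and the "watch" band between 85% and 90% are assumptions added for illustration; the input counts would come from whatever stat capture the project uses.

```python
def culled_percent(clusters_total: int, clusters_drawn: int) -> float:
    """Share of evaluated clusters that were culled, as a percentage."""
    if clusters_total <= 0:
        return 0.0
    return 100.0 * (clusters_total - clusters_drawn) / clusters_total

def culling_health(percent: float) -> str:
    """Classify against the article's figures: target above 90, concern below 85."""
    if percent > 90.0:
        return "ok"
    if percent >= 85.0:
        return "watch"
    return "investigate overdraw"

good = culling_health(culled_percent(1000, 80))
bad = culling_health(culled_percent(1000, 200))
```

Tracking this value per camera location rather than per frame makes it easier to tie a low reading back to a specific piece of scene layout.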