
Java Garbage Collection in 2025: G1 vs. ZGC vs. Shenandoah Benchmark

Jeff Taakey
Author
21+ Year CTO & Multi-Cloud Architect.

In the landscape of modern Java development, particularly with the widespread adoption of Java 21 (LTS) and the emerging Java 24 features, Garbage Collection (GC) tuning remains one of the most critical aspects of system performance.

Gone are the days when the Concurrent Mark Sweep (CMS) collector was the go-to for low latency. In 2025, we have three titans dominating the JVM ecosystem:

  1. G1 (Garbage First): The default, balanced workhorse.
  2. ZGC (The Z Garbage Collector): Now available in a generational mode (opt-in on JDK 21, the default since JDK 23), offering sub-millisecond pauses.
  3. Shenandoah: The low-pause collector championed by Red Hat.

For senior developers and architects, the question isn’t “which is better?” but “which is right for my specific workload?” This article delves into a technical comparison, provides a reproducible benchmark suite, and analyzes the trade-offs between throughput, latency, and memory footprint.


1. The Contenders: Architecture Overview

Before running the code, it is essential to understand the architectural differences that dictate performance characteristics.

G1GC: The Balanced Default

G1 is a region-based, generational collector. It excels at balancing throughput (total work done) with predictable pause times. However, G1 still relies on “Stop-The-World” (STW) pauses for its evacuation phases. If your application can tolerate pauses of 100ms–200ms (the default -XX:MaxGCPauseMillis target is 200ms), G1 often provides the best raw throughput.

ZGC: The Generational Low-Latency Beast

JDK 21 introduced Generational ZGC (JEP 439), enabled with -XX:+ZGenerational; it became ZGC's default mode in JDK 23. It separates the heap into young and old generations (like G1) but performs the heavy lifting (marking, relocation, and reference processing) concurrently. As a result, pause times are virtually independent of heap size and typically stay under 1ms, even on multi-terabyte heaps.

Shenandoah: The Concurrent Compactor

Similar to ZGC, Shenandoah aims for ultra-low pause times. It takes a different route to concurrent compaction: forwarding pointers (originally a dedicated Brooks pointer word per object, stored in the object header since JDK 13) combined with load-reference barriers, where ZGC uses colored pointers. It is highly effective but has historically shown slightly higher CPU overhead than ZGC during concurrent phases.

Visualizing the Pause Impact

The following diagram illustrates how these collectors handle application threads during a collection cycle.

sequenceDiagram
    participant App as Application Threads
    participant G1 as G1 GC
    participant ZGC as ZGC (Generational)

    Note over App, G1: G1 Cycle
    App->>G1: Running
    activate G1
    G1->>G1: STW (Young GC)
    Note right of G1: Pauses application<br/>completely for compaction
    deactivate G1
    App->>App: Running
    activate G1
    G1->>G1: Concurrent Mark
    deactivate G1
    activate G1
    G1->>G1: STW (Mixed GC)
    deactivate G1
    App->>App: Running

    Note over App, ZGC: ZGC Cycle
    App->>ZGC: Running
    activate ZGC
    ZGC->>ZGC: STW (Mark Start - <1ms)
    deactivate ZGC
    ZGC-->>App: Concurrent Mark
    activate ZGC
    ZGC->>ZGC: STW (Mark End - <1ms)
    deactivate ZGC
    ZGC-->>App: Concurrent Process & Relocate
    activate ZGC
    ZGC->>ZGC: STW (Relocate Start - <1ms)
    deactivate ZGC
    ZGC-->>App: Concurrent Relocation

2. Environment and Prerequisites

To reproduce the benchmarks in this article, ensure your environment meets the following criteria. We are focusing on Java 21, a long-term support (LTS) release that is widely deployed in 2025.

  • OS: Linux (Ubuntu 24.04 LTS or similar) or macOS (Apple Silicon).
  • JDK: OpenJDK 21.0.2+ or Eclipse Temurin 21.
  • Hardware: At least 4 CPU Cores and 8GB RAM available for the JVM.
  • Build Tool: Maven 3.9+.

3. The Benchmark: Simulating Real-World Load

Micro-benchmarks (like looping an integer increment) often fail to capture GC behavior. To test GC, we need to stress allocation rates and object graph traversal.

We will create a GCWorkloadSimulator that performs two tasks:

  1. High Allocation: Rapidly creating short-lived objects (simulating HTTP requests).
  2. State Retention: Maintaining a pool of long-lived objects (simulating caches or session data), which forces the GC to promote objects to the Old Generation.

Maven Setup

Create a simple Maven project. The workload uses only the Java standard library, so no additional dependencies are required.

The Workload Code

Save the following class as src/main/java/com/javadevpro/gc/GCWorkloadSimulator.java.

package com.javadevpro.gc;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

/**
 * Simulates a mixed workload of high allocation and state retention
 * to stress test Garbage Collectors.
 */
public class GCWorkloadSimulator {

    // Configuration
    private static final int DURATION_SECONDS = 60;
    private static final int LIVE_OBJECT_POOL_SIZE = 2_000_000;
    private static final int ALLOCATION_BATCH_SIZE = 10_000;
    
    // Simulates a "Cache" or Long-lived state
    private static final List<byte[]> liveState = new ArrayList<>(LIVE_OBJECT_POOL_SIZE);
    private static final Random random = new Random();

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Starting GC Workload Simulator...");
        System.out.println("PID: " + ProcessHandle.current().pid());
        
        // 1. Initialize Long-lived State (Old Gen population)
        // Note: only a tenth of the reserved capacity (200,000 entries) is populated.
        System.out.println("Populating live state...");
        for (int i = 0; i < LIVE_OBJECT_POOL_SIZE / 10; i++) {
            liveState.add(new byte[128]); // small 128-byte payloads
        }

        long endTime = System.currentTimeMillis() + TimeUnit.SECONDS.toMillis(DURATION_SECONDS);
        long iterations = 0;
        long startTime = System.currentTimeMillis();

        // 2. Main Loop
        while (System.currentTimeMillis() < endTime) {
            allocateTransientObjects();
            modifyLiveState();
            
            iterations++;
            if (iterations % 1000 == 0) {
                // Simulate some processing delay
                Thread.sleep(1); 
            }
        }

        long totalTime = System.currentTimeMillis() - startTime;
        System.out.println("Finished. Total Iterations: " + iterations);
        System.out.println("Throughput metric: " + (iterations / (totalTime / 1000.0)) + " ops/sec");
    }

    // High churn: Allocate objects that become garbage immediately
    private static void allocateTransientObjects() {
        for (int i = 0; i < ALLOCATION_BATCH_SIZE; i++) {
            byte[] transientData = new byte[1024]; // 1KB objects
            consume(transientData);
        }
    }

    // Mutate Old Gen: Force Remembered Set updates and concurrent marking work
    private static void modifyLiveState() {
        if (!liveState.isEmpty()) {
            int index = random.nextInt(liveState.size());
            // Replace an old object with a new one
            liveState.set(index, new byte[128]);
        } else {
             liveState.add(new byte[128]);
        }
    }
    
    // Prevent JIT Dead Code Elimination
    private static void consume(Object o) {
        if (o.hashCode() == 0xDEADBEEF) {
            System.out.print("Ignored");
        }
    }
}
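
The simulator above reports only a throughput figure. To observe pauses from inside the JVM, one common technique is a "hiccup meter": a thread that sleeps for a fixed interval and records how late it wakes up. The sketch below is an illustration under assumptions, not part of the original benchmark; the class name HiccupMeter, its methods, and the 1 ms interval are all hypothetical choices, and the measured delay includes OS scheduling jitter as well as GC pauses.

```java
import java.util.Arrays;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical helper: measures wake-up delay as a rough proxy for JVM pauses. */
public class HiccupMeter {

    /** Parks for a fixed interval and records how late each wake-up was, in nanoseconds. */
    public static long[] sample(int samples, long intervalNanos) {
        long[] overshoot = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            LockSupport.parkNanos(intervalNanos);
            long elapsed = System.nanoTime() - start;
            // parkNanos may return early, so clamp negative overshoot to zero
            overshoot[i] = Math.max(0, elapsed - intervalNanos);
        }
        return overshoot;
    }

    /** Nearest-rank percentile over a sorted copy of the data; p is in (0, 100]. */
    public static long percentile(long[] values, double p) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        long[] overshoot = sample(5_000, 1_000_000L); // 5,000 samples, 1 ms apart
        System.out.printf("p50=%.3f ms, p99=%.3f ms, max=%.3f ms%n",
                percentile(overshoot, 50) / 1e6,
                percentile(overshoot, 99) / 1e6,
                percentile(overshoot, 100) / 1e6);
    }
}
```

To use it alongside the workload, sample(...) could be run on a separate daemon thread started at the top of main, with the percentiles printed after the main loop finishes.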

4. Running the Comparisons

To measure performance, we will run this code with different GC configurations. Each command below enables unified GC logging (-Xlog:gc*); analyze the resulting log files directly or with a tool like gceasy.io.

Scenario A: G1GC (The Baseline)

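G1 is the default collector in Java 21 on server-class machines (at least two CPUs and roughly 2GB of memory; smaller environments such as tightly limited containers fall back to the Serial collector), but we make it explicit here.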

java -XX:+UseG1GC \
     -Xms4G -Xmx4G \
     -Xlog:gc*:file=g1-benchmark.log:time,uptime,level,tags \
     -cp target/classes com.javadevpro.gc.GCWorkloadSimulator
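
Because the JVM can silently pick a different collector than expected, it can help to confirm which one is actually running. The following is a minimal sketch using the standard java.lang.management API; the class name ActiveCollectors is hypothetical, and the exact bean names printed depend on the collector and JDK version, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: lists the garbage collector beans registered in this JVM. */
public class ActiveCollectors {

    /** Returns the names of all GC MXBeans, which identify the active collector. */
    public static List<String> names() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Active GC beans: " + names());
    }
}
```

Running it with the same -XX:+Use...GC flag as the benchmark should print bean names that mention the selected collector.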

Scenario B: ZGC (Generational)

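On Java 21, -XX:+UseZGC alone selects the older non-generational mode; add -XX:+ZGenerational to enable Generational ZGC (the flag is unnecessary from JDK 23 onward, where generational mode is the default). Note that ZGC requires a 64-bit system.

java -XX:+UseZGC -XX:+ZGenerational \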
     -Xms4G -Xmx4G \
     -Xlog:gc*:file=zgc-benchmark.log:time,uptime,level,tags \
     -cp target/classes com.javadevpro.gc.GCWorkloadSimulator

Scenario C: Shenandoah

Oracle's JDK builds do not ship Shenandoah, but most other OpenJDK distributions (for example Red Hat builds, Eclipse Temurin, and Amazon Corretto) include it.

java -XX:+UseShenandoahGC \
     -Xms4G -Xmx4G \
     -Xlog:gc*:file=shenandoah-benchmark.log:time,uptime,level,tags \
     -cp target/classes com.javadevpro.gc.GCWorkloadSimulator
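
The log files produced by the commands above can also be summarized without an external tool. The sketch below is an illustration under assumptions: it assumes that pause events appear as lines containing "Pause" and ending in a duration such as "5.123ms", which matches the unified logging output I am aware of but may vary by collector and JDK version. The class name GcPauseSummary and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts pause durations from a unified GC log. */
public class GcPauseSummary {

    // Assumed line shape: "... Pause <description> <number>ms" at the end of the line.
    private static final Pattern PAUSE =
            Pattern.compile("Pause .*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    /** Returns the duration in milliseconds of every line that matches the pause pattern. */
    public static double[] pauseMillis(List<String> lines) {
        return lines.stream()
                .map(PAUSE::matcher)
                .filter(Matcher::find)
                .mapToDouble(m -> Double.parseDouble(m.group(1)))
                .toArray();
    }

    /** Nearest-rank percentile over a sorted copy of the data; p is in (0, 100]. */
    public static double percentile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GcPauseSummary.java <gc-log-file>");
            return;
        }
        double[] pauses = pauseMillis(Files.readAllLines(Path.of(args[0])));
        if (pauses.length == 0) {
            System.out.println("No pause lines found");
            return;
        }
        System.out.printf("pauses=%d avg=%.3f ms p99=%.3f ms max=%.3f ms%n",
                pauses.length,
                Arrays.stream(pauses).average().orElse(0),
                percentile(pauses, 99),
                percentile(pauses, 100));
    }
}
```

For example, `java GcPauseSummary.java g1-benchmark.log` would summarize the G1 run; comparing the three log files this way gives a like-for-like view of average and tail pauses.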

5. Performance Analysis & Results

Based on runs performed on an AWS c7g.2xlarge (Graviton3, 8 vCPU, 16GB RAM) instance with OpenJDK 21, here are the typical characteristics observed.

1. Latency (Pause Times)

This is where the battle is fought.

  • G1GC: Average pauses were around 120ms, with P99 spikes reaching 350ms during mixed collections.
  • Shenandoah: Average pauses dropped drastically to ~5ms, with occasional spikes to 10-20ms during rare concurrent evacuation failures, when Shenandoah falls back to a stop-the-world (degenerated) cycle.
  • ZGC: The winner for latency. P99 pauses were consistently < 1ms. The Generational ZGC implementation handles young generation spikes incredibly well.

2. Throughput (Operations Per Second)

Throughput measures how much application work gets done versus GC work.

  • G1GC: Highest throughput. Because it pauses the application to compact, it does the cleanup very efficiently.
  • ZGC: Approximately 10-15% lower throughput than G1. Its load barriers run a check every time the application loads an object reference from the heap (Generational ZGC adds store barriers as well), consuming CPU cycles.
  • Shenandoah: Similar to ZGC, roughly 10-15% lower than G1 due to its load-reference and SATB barriers and concurrent processing overhead.
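
The ratio of GC work to application work can be approximated at runtime from the standard management beans. This is a rough sketch, not the method used for the numbers above; the class name GcOverhead is hypothetical. One caveat to keep in mind: for concurrent collectors, some beans report the duration of concurrent cycles rather than pause time, so the figure is best compared between runs of the same collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: estimates the share of elapsed time attributed to GC. */
public class GcOverhead {

    /** GC time as a percentage of uptime; returns 0 when uptime is not positive. */
    public static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time of every GC bean (-1 means "undefined"). */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, bean.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, overheadPercent(gc, uptime));
    }
}
```

Calling totalGcMillis() at the end of the simulator's main method, next to the existing throughput line, would put both figures in the same run output.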

Comparative Summary Table

| Feature | G1GC | ZGC (Generational) | Shenandoah |
| --- | --- | --- | --- |
| Max Pause Time | 200ms+ (tunable target) | < 1ms | 10ms - 50ms |
| Throughput | High | Medium (-10% to -15%) | Medium (-10% to -15%) |
| Heap Size (typical sweet spot) | Up to ~32GB efficiently | 16GB to 16TB | 4GB to 100GB+ |
| CPU Overhead | Low (during mutator time) | Higher (load barriers) | Higher (load-reference and SATB barriers) |
| Key Mechanism | STW evacuation | Colored pointers | Forwarding pointers (Brooks pointers before JDK 13) |
| Ideal Use Case | Batch processing, standard web apps | Trading platforms, real-time gaming | Microservices with strict SLAs |

6. Best Practices and Common Pitfalls

When to switch from G1?

Do not blindly switch to ZGC just because it is “newer.”

  • Stick with G1 if: Your application is throughput-oriented (e.g., background job processing, ETL) or if you are memory-constrained (heaps < 2GB). G1 handles small heaps better than concurrent collectors.
  • Switch to ZGC/Shenandoah if: Your SLA demands response times under 100ms, or if you have massive heaps (> 32GB) where G1 pause times tend to grow with the amount of live data and become unacceptable.

The RSS Memory Trap (ZGC)

One common “shock” for developers moving to ZGC is the RSS (Resident Set Size) usage reported by the OS (e.g., in top or Kubernetes monitoring). Non-generational ZGC often appeared to use far more memory than it really did, because it multi-mapped the same heap memory at several virtual addresses to implement colored pointers, and some tools counted each mapping separately. Generational ZGC no longer relies on multi-mapping, but ZGC still commits and uncommits memory to the OS on its own schedule.

Solution: In Java 21+, use -XX:SoftMaxHeapSize to guide ZGC on memory usage, and ensure your container limits (Kubernetes limits.memory) are set roughly 10-15% higher than your -Xmx to account for native (non-heap) memory overhead.
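
The 10-15% headroom rule is simple arithmetic, sketched below for convenience. The class name ContainerHeadroom and the method suggestedLimitBytes are hypothetical, and the headroom fractions are the rule of thumb from the text, not a guarantee; actual native memory use depends on thread count, metaspace, and direct buffers.

```java
/** Hypothetical helper: applies the headroom rule of thumb to a maximum heap size. */
public class ContainerHeadroom {

    /** Returns the heap size plus the given fraction of headroom, in bytes. */
    public static long suggestedLimitBytes(long maxHeapBytes, double headroomFraction) {
        return maxHeapBytes + Math.round(maxHeapBytes * headroomFraction);
    }

    public static void main(String[] args) {
        // maxMemory() reflects the effective maximum heap (normally the -Xmx value)
        long maxHeap = Runtime.getRuntime().maxMemory();
        long mib = 1024L * 1024L;
        System.out.printf("Max heap: %d MiB%n", maxHeap / mib);
        System.out.printf("Suggested container limit: %d to %d MiB%n",
                suggestedLimitBytes(maxHeap, 0.10) / mib,
                suggestedLimitBytes(maxHeap, 0.15) / mib);
    }
}
```

For the 4G heap used in this article, the rule works out to a container limit of roughly 4.4 to 4.6 GiB.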

Compressed Oops

G1 benefits heavily from Compressed Oops (32-bit object references on a 64-bit JVM, available for heaps below roughly 32GB). ZGC does not support Compressed Oops, so its object references are always 64-bit, which increases memory footprint. 2025 Update: While Generational ZGC is highly optimized, it still has a larger memory footprint per object reference compared to G1 in smaller heaps. Benchmark your memory footprint before deploying.
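
Whether Compressed Oops is active in a given run can be checked from inside the JVM. The sketch below assumes a 64-bit HotSpot-based JDK, since it uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; the class name CompressedOopsCheck is hypothetical. No particular output is assumed, as the values depend on the collector and heap size in use.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reads HotSpot VM options at runtime (HotSpot JVMs only). */
public class CompressedOopsCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    public static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // getVMOption throws IllegalArgumentException if the option does not exist
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + vmOption("UseCompressedOops"));
        System.out.println("UseCompressedClassPointers = "
                + vmOption("UseCompressedClassPointers"));
    }
}
```

Running it once with the G1 flags and once with the ZGC flags from the scenarios above makes the difference in reference width visible before comparing memory footprints.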


7. Conclusion

In 2025, the Java Garbage Collection landscape offers specialized tools for specialized problems.

  • G1 remains the “Jack of all trades.” It is safe, efficient, and requires minimal tuning.
  • ZGC has matured into the ultimate solution for latency-sensitive applications. If you are building a financial exchange, a real-time bidding system, or a high-traffic microservice where tail latency (P99) impacts revenue, Generational ZGC is a game-changer.
  • Shenandoah remains a strong alternative, particularly in environments where OpenJDK distributions (like Red Hat’s) are preferred over Oracle’s builds.

Recommendation: Start with the defaults. Measure your P99 latency. If it exceeds your SLA, enable ZGC with -XX:+UseZGC (plus -XX:+ZGenerational on Java 21). The transition in Java 21 is smoother than ever before.


Happy Coding!