Virtual Thread Mechanics and the Pinning Problem
Virtual threads are built on continuations. A continuation is a suspendable computation: it can pause at a yield point, save its stack, and resume later on any carrier thread. When a virtual thread calls a blocking operation that the JVM recognizes, the runtime yields the continuation, saves the virtual thread’s stack frames as heap objects, and schedules another virtual thread’s continuation on the same carrier thread.
This is not cooperative multitasking in the application-level sense. The yield points are inside JDK library code: SocketInputStream.read, Thread.sleep, LockSupport.park, ReentrantLock.lock. Your application code does not need to yield explicitly. The runtime does it transparently at blocking points such as I/O, sleeps, and lock waits.
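To see the effect of these library-level yield points, the following minimal sketch starts many virtual threads that each call Thread.sleep and measures the total wall-clock time. The class name ManySleepers and its parameters are illustrative, not part of the content platform, and the sketch assumes JDK 21 or later.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ManySleepers {
    // Starts `tasks` virtual threads that each sleep for `sleepMillis`
    // and returns the elapsed wall-clock time in milliseconds.
    static long runSleepers(int tasks, long sleepMillis) {
        AtomicInteger completed = new AtomicInteger();
        long start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(sleepMillis); // yield point inside the JDK
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for every submitted task to finish
        if (completed.get() != tasks) {
            throw new IllegalStateException("Not all tasks completed");
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = runSleepers(10_000, 100);
        System.out.println("10,000 sleepers of 100 ms finished in " + elapsed + " ms");
    }
}
```

If each sleep held a carrier thread, 10,000 sleeps of 100 ms on a handful of carriers would take minutes. Because the sleeping threads unmount, the run should finish in a small multiple of the sleep time.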
The Carrier Thread Pool
Virtual threads run on a ForkJoinPool configured as a carrier pool. The default size is Runtime.getRuntime().availableProcessors(). You can override it:
java -Djdk.virtualThreadScheduler.parallelism=16 \
-Djdk.virtualThreadScheduler.maxPoolSize=256 \
-jar content-platform.jar
parallelism: the target number of carrier threads. Default: number of available processors. This is the maximum number of virtual threads that can execute simultaneously. Setting this higher than the CPU count is rarely useful: a virtual thread that blocks gives up its carrier, so carriers spend their time running code rather than waiting, and extra carriers only compete for the same cores.
maxPoolSize: the maximum number of carrier threads, including temporary compensation threads. Default: 256. When a carrier is blocked by an operation that cannot unmount the virtual thread (Object.wait() inside a synchronized block is one example), the ForkJoinPool may create a temporary compensation thread to maintain parallelism. This setting caps that growth. Not every pinned blocking call triggers compensation: a pinned sleep or socket read can simply hold its carrier.
The carrier pool is a work-stealing ForkJoinPool. Each carrier thread has a local deque of virtual threads to run. When its deque is empty, it steals from another carrier’s deque. This provides good load balancing without centralized coordination.
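As a quick way to check which values a given JVM was launched with, the sketch below reads the two system properties and falls back to the defaults described above. CarrierPoolSettings is a hypothetical helper; it mirrors the documented defaults and is not the JDK's internal resolution logic.

```java
public class CarrierPoolSettings {
    // Reads an integer system property, falling back when it is unset or malformed.
    static int intProperty(String key, int fallback) {
        String raw = System.getProperty(key);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Assumed default: one carrier per available processor.
    static int configuredParallelism() {
        return intProperty("jdk.virtualThreadScheduler.parallelism",
                Runtime.getRuntime().availableProcessors());
    }

    // Assumed default: 256, or the parallelism if that is larger.
    static int configuredMaxPoolSize() {
        return intProperty("jdk.virtualThreadScheduler.maxPoolSize",
                Math.max(configuredParallelism(), 256));
    }

    public static void main(String[] args) {
        System.out.println("parallelism = " + configuredParallelism());
        System.out.println("maxPoolSize = " + configuredMaxPoolSize());
    }
}
```

Logging these two numbers at startup makes it easier to correlate later pinning reports with the carrier capacity that was actually configured.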
Mount and Unmount in Detail
When a virtual thread is scheduled to run:
- The carrier pool selects a virtual thread from its work queue
- The carrier thread mounts the virtual thread by restoring its continuation
- The virtual thread’s stack frames are restored from the heap onto the carrier’s stack (this is done lazily, a few frames at a time, so a deep stack is not copied in full on every mount)
- The carrier thread executes the virtual thread’s code
When the virtual thread hits a blocking point:
- The JDK library code (e.g., SocketInputStream.read) calls Continuation.yield()
- The virtual thread’s stack frames are saved as heap objects
- The carrier thread’s stack is unwound back to the scheduler loop
- The carrier thread picks up the next virtual thread from its work queue
The mount/unmount operation takes approximately 200-500 nanoseconds. Compare this to an OS context switch at 5,000-15,000 nanoseconds. Taken at face value, those ranges make a mount at least 10x cheaper (5,000 / 500) and often considerably more, which is why virtual threads can handle thousands of concurrent I/O operations efficiently.
// Demonstrate mount/unmount behavior by printing the thread before and after a sleep
import java.time.Duration;

public class MountUnmountDemo {
    public static void main(String[] args) throws Exception {
        Thread vt = Thread.ofVirtual().start(() -> {
            System.out.println("Before sleep, carrier: " +
                    extractCarrier(Thread.currentThread()));
            try {
                // Blocking point: the virtual thread unmounts here
                Thread.sleep(Duration.ofMillis(1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // May be a different carrier thread after remount
            System.out.println("After sleep, carrier: " +
                    extractCarrier(Thread.currentThread()));
        });
        vt.join();
    }

    private static String extractCarrier(Thread t) {
        // While mounted, a virtual thread's toString() includes its carrier
        return t.toString();
    }
}
Running this code shows the virtual thread mounting on one carrier before sleep and potentially a different carrier after. The sleep triggers unmount. When the sleep completes, the scheduler picks any available carrier for remount.
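To compare carriers without reading the full toString() output, a small parser can pull out the carrier name. This sketch assumes the format observed on JDK 21, where a mounted virtual thread prints as something like VirtualThread[#22]/runnable@ForkJoinPool-1-worker-1. That format is an implementation detail, not a specified API, so treat carrierOf as a debugging aid only.

```java
public class CarrierName {
    // Returns the text after the last '@', or a marker when no carrier is shown.
    // Assumption: the carrier name follows '@' in Thread.toString() output.
    static String carrierOf(String threadToString) {
        int at = threadToString.lastIndexOf('@');
        return at < 0 ? "<unmounted>" : threadToString.substring(at + 1);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().start(() -> {
            String before = carrierOf(Thread.currentThread().toString());
            try {
                Thread.sleep(10); // unmount, then remount on any free carrier
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            String after = carrierOf(Thread.currentThread().toString());
            System.out.println(before + " -> " + after
                    + (before.equals(after) ? " (same carrier)" : " (moved)"));
        });
        vt.join();
    }
}
```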
The Pinning Problem: JMH Proof
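Pinning occurs when a virtual thread cannot unmount because it is inside a synchronized block or method (or a native frame). On JDK 21 through 23, the JVM tracks a monitor’s owner as the carrier’s platform thread, not the virtual thread. Releasing the monitor requires that same platform thread, so the virtual thread cannot move. JDK 24 (JEP 491) changed monitors so that synchronized no longer pins; the measurements below describe the earlier behavior.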
The benchmark measures throughput of I/O-heavy tasks under synchronized versus ReentrantLock:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 5, time = 5)
@Fork(1)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class PinningBenchmark {
    private static final int VIRTUAL_THREADS = 1000;
    private static final int IO_DELAY_MS = 10;
    private final Object monitor = new Object();
    private final ReentrantLock reentrantLock = new ReentrantLock();

    @Benchmark
    public long synchronizedPinning() throws Exception {
        return runTasks(this::taskWithSynchronized);
    }

    @Benchmark
    public long reentrantLockNoPinning() throws Exception {
        return runTasks(this::taskWithReentrantLock);
    }

    @Benchmark
    public long noLock() throws Exception {
        return runTasks(this::taskWithoutLock);
    }

    // One benchmark invocation runs VIRTUAL_THREADS tasks, so
    // tasks/sec = reported ops/sec * VIRTUAL_THREADS.
    private long runTasks(Callable<Long> task) throws Exception {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < VIRTUAL_THREADS; i++) {
                futures.add(executor.submit(task));
            }
            long total = 0;
            for (var f : futures) {
                total += f.get();
            }
            return total; // returned so JMH cannot eliminate the work
        }
    }

    private long taskWithSynchronized() {
        synchronized (monitor) { // PINS carrier thread
            simulateIO();
            return computeResult();
        }
    }

    private long taskWithReentrantLock() {
        reentrantLock.lock();
        try { // Does NOT pin carrier thread
            simulateIO();
            return computeResult();
        } finally {
            reentrantLock.unlock();
        }
    }

    private long taskWithoutLock() {
        simulateIO();
        return computeResult();
    }

    // Stands in for blocking I/O: sleep is a yield point for virtual threads
    private void simulateIO() {
        try {
            Thread.sleep(IO_DELAY_MS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Small CPU-bound step after the simulated I/O
    private long computeResult() {
        long hash = 0;
        for (int i = 0; i < 1000; i++) {
            hash ^= ThreadLocalRandom.current().nextLong();
        }
        return hash;
    }
}
Results on an 8-core machine (1,000 virtual threads, 10ms I/O per task):
| Implementation | Throughput (tasks/sec) | Effective Parallelism |
|---|---|---|
| No lock | 92,000 | ~1,000 concurrent |
| ReentrantLock | 8,500 | ~85 concurrent |
| synchronized (pinned) | 780 | ~8 concurrent |
The synchronized version achieves only 780 tasks/sec because all 8 carrier threads are pinned. At most 8 virtual threads make progress at a time, and each occupies its carrier for the full 10 ms I/O duration. The remaining 992 virtual threads wait in the queue.
The ReentrantLock version achieves 8,500 tasks/sec. The lock still serializes the critical section, but a virtual thread that waits for I/O while holding the lock unmounts, and threads waiting for the lock park without occupying a carrier. Carriers stay free to run whichever virtual threads are runnable.
The no-lock version achieves 92,000 tasks/sec. All 1,000 virtual threads run concurrently with no serialization.
The lesson: I/O inside a synchronized block caps concurrency at the number of carrier threads, which is what a small platform-thread pool would give you. The benefit of virtual threads disappears. Treat every synchronized block in your codebase that contains I/O as a pinning risk.
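The "Effective Parallelism" column can be related to the throughput column with Little's law: concurrency equals throughput multiplied by per-task latency. The sketch below applies that formula to the three reported throughputs, assuming each task spends about 10 ms in simulated I/O and negligible time elsewhere. EffectiveParallelism is an illustrative helper, not part of the benchmark.

```java
public class EffectiveParallelism {
    // Little's law: average tasks in flight = arrival rate * time in system.
    static double concurrency(double tasksPerSecond, double latencyMillis) {
        return tasksPerSecond * latencyMillis / 1000.0;
    }

    // Inverse view: the throughput ceiling for a fixed number of concurrent tasks.
    static double throughputBound(int concurrentTasks, double latencyMillis) {
        return concurrentTasks * 1000.0 / latencyMillis;
    }

    public static void main(String[] args) {
        double latencyMs = 10.0; // IO_DELAY_MS in the benchmark
        double[] reported = {92_000, 8_500, 780};
        for (double tasksPerSec : reported) {
            System.out.printf("%.0f tasks/sec -> about %.1f tasks in flight%n",
                    tasksPerSec, concurrency(tasksPerSec, latencyMs));
        }
        // Eight pinned carriers, each busy for 10 ms per task
        System.out.printf("8 carriers -> at most %.0f tasks/sec%n",
                throughputBound(8, latencyMs));
    }
}
```

With 8 pinned carriers the ceiling is 8 × 100 = 800 tasks/sec, which is consistent with the 780 tasks/sec reported for the synchronized case.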
Finding Pinning in Production
Method 1: JVM flag
java -Djdk.tracePinnedThreads=short -jar content-platform.jar
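Prints a stack trace each time a virtual thread blocks while pinned; this flag has no duration threshold. The short mode limits the trace to the problematic frames, and full prints the complete stack. The property is available on JDK 21 through 23 and was removed in JDK 24. Sample output: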
Thread[#42,ForkJoinPool-1-worker-3,5,CarrierThreads]
com.platform.cache.ArticleCache.fetchWithCache(ArticleCache.java:28) <== monitors:1
com.platform.service.ArticleService.getArticle(ArticleService.java:45)
The <== monitors:1 annotation marks the synchronized block that caused pinning.
Method 2: JDK Flight Recorder
java -XX:StartFlightRecording=filename=vt.jfr,settings=profile \
-jar content-platform.jar
Then analyze:
jfr print --events jdk.VirtualThreadPinned vt.jfr
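Each event includes the pinning duration, the carrier thread identity, and the stack trace. By default JFR records this event only when the pin lasts longer than 20 ms. To catch shorter pins (for example, anything over 1 ms), lower the event's threshold in the recording settings, then filter by duration to find the problematic synchronized blocks.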
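The same filtering can be done programmatically with the JFR consumer API in the jdk.jfr module. The following is a minimal sketch: PinnedEventReport, the 1 ms cutoff, and the default file name vt.jfr are choices made for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

public class PinnedEventReport {
    static final String PINNED = "jdk.VirtualThreadPinned";

    // True for pinned-thread events that lasted longer than the cutoff.
    static boolean isLongPin(String eventName, Duration duration, Duration cutoff) {
        return PINNED.equals(eventName) && duration.compareTo(cutoff) > 0;
    }

    // Formats the top frame, which is where the thread blocked. The frame
    // holding the monitor is usually further down the same stack.
    static String topFrame(RecordedEvent event) {
        RecordedStackTrace trace = event.getStackTrace();
        if (trace == null || trace.getFrames().isEmpty()) {
            return "<no stack trace>";
        }
        RecordedFrame frame = trace.getFrames().get(0);
        return frame.getMethod().getType().getName() + "."
                + frame.getMethod().getName() + ":" + frame.getLineNumber();
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "vt.jfr");
        Duration cutoff = Duration.ofMillis(1);
        for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
            if (isLongPin(event.getEventType().getName(), event.getDuration(), cutoff)) {
                System.out.printf("%d ms pinned, blocked at %s%n",
                        event.getDuration().toMillis(), topFrame(event));
            }
        }
    }
}
```

RecordingFile.readAllEvents loads the whole recording into memory, which is fine for short captures. For large files, iterating with a RecordingFile instance and readEvent() is likely the better fit.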
Method 3: async-profiler wall mode
./asprof -e wall -t -d 30 -f pinning.html <pid>
In the flame graph, look for virtual thread stacks that show Object.wait or Unsafe.park frames inside synchronized blocks. These stacks represent pinned virtual threads whose carriers are blocked.
Systematic Pinning Elimination
For the content platform, the pinning audit found three synchronized blocks with I/O:
1. Database connection pool checkout
// SLOW: synchronized pool blocks carrier
import java.sql.Connection;
import java.util.LinkedList;
import java.util.Queue;

public class SimpleConnectionPool {
    private final Queue<Connection> available = new LinkedList<>();

    public synchronized Connection checkout() {
        while (available.isEmpty()) {
            try {
                wait(); // Pins carrier thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return available.poll();
    }

    public synchronized void checkin(Connection conn) {
        available.add(conn);
        notify(); // Wake one waiter blocked in checkout()
    }
}
// FAST: Semaphore + ConcurrentLinkedQueue, no pinning
import java.sql.Connection;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

public class VirtualThreadFriendlyPool {
    private final ConcurrentLinkedQueue<Connection> available =
            new ConcurrentLinkedQueue<>();
    private final Semaphore permits;

    public VirtualThreadFriendlyPool(List<Connection> connections) {
        connections.forEach(available::add);
        // One permit per pooled connection
        this.permits = new Semaphore(connections.size());
    }

    public Connection checkout() throws InterruptedException {
        permits.acquire(); // Virtual thread unmounts here
        Connection conn = available.poll();
        if (conn == null) {
            // A permit without a connection means checkin/checkout got out of sync
            permits.release();
            throw new IllegalStateException("Pool corrupted");
        }
        return conn;
    }

    public void checkin(Connection conn) {
        available.add(conn); // Return the connection before releasing its permit
        permits.release();
    }
}
Semaphore.acquire() uses LockSupport.park, which the virtual thread scheduler recognizes. The virtual thread unmounts while waiting for a permit, freeing the carrier.
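To exercise the Semaphore-plus-queue pattern without a database, the sketch below uses a generic pool of placeholder strings and has many virtual threads contend for a few resources. BoundedPoolDemo, its nested Pool class, and the conn-N names are hypothetical stand-ins for real connections.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class BoundedPoolDemo {
    // Same shape as the connection pool, generic over the resource type.
    static final class Pool<T> {
        private final ConcurrentLinkedQueue<T> available = new ConcurrentLinkedQueue<>();
        private final Semaphore permits;

        Pool(List<T> resources) {
            available.addAll(resources);
            permits = new Semaphore(resources.size());
        }

        T checkout() throws InterruptedException {
            permits.acquire(); // parks the virtual thread; the carrier is released
            T resource = available.poll();
            if (resource == null) {
                permits.release();
                throw new IllegalStateException("Pool corrupted");
            }
            return resource;
        }

        void checkin(T resource) {
            available.add(resource);
            permits.release();
        }
    }

    // Runs `tasks` virtual threads against a pool of `poolSize` resources and
    // returns the highest number of resources observed in use at once.
    static int maxInUse(int poolSize, int tasks) {
        List<String> resources =
                IntStream.range(0, poolSize).mapToObj(i -> "conn-" + i).toList();
        Pool<String> pool = new Pool<>(resources);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    String resource = pool.checkout();
                    try {
                        max.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(5); // simulated query
                    } finally {
                        inUse.decrementAndGet();
                        pool.checkin(resource);
                    }
                    return null;
                });
            }
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("Max resources in use: " + maxInUse(4, 200));
    }
}
```

The observed maximum never exceeds the pool size, because a task increments the counter only after acquiring a permit and decrements it before releasing one.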
2. Cached HTTP client with synchronized refresh
Replaced synchronized refresh logic with ReentrantLock (shown in Chapter 8 main text).
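For readers without the main text at hand, here is a minimal sketch of the general pattern: a cached value whose refresh is guarded by a ReentrantLock with a second staleness check inside the lock. RefreshingValue, its loader, and the TTL are hypothetical; this is not the platform's actual client code.

```java
import java.time.Duration;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class RefreshingValue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Supplier<T> loader; // assumed to perform slow, blocking I/O
    private final long ttlNanos;
    private volatile T value;
    private volatile long loadedAtNanos;
    private volatile boolean loaded;

    public RefreshingValue(Supplier<T> loader, Duration ttl) {
        this.loader = loader;
        this.ttlNanos = ttl.toNanos();
    }

    private boolean isStale() {
        return !loaded || System.nanoTime() - loadedAtNanos >= ttlNanos;
    }

    public T get() {
        if (!isStale()) {
            return value; // fast path: no lock taken
        }
        lock.lock(); // waiting virtual threads park and release their carriers
        try {
            if (isStale()) { // re-check: another thread may have refreshed already
                value = loader.get();
                loadedAtNanos = System.nanoTime();
                loaded = true;
            }
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would construct it as, say, new RefreshingValue<>(() -> fetchConfig(), Duration.ofMinutes(5)), where fetchConfig is whatever blocking call produces the value. The point of the sketch is that the loader can block inside the lock without the waiting threads each holding a carrier.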
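3. Metrics aggregation behind a synchronized method
// SLOW: every record() call takes the enclosing object's monitor to guard a plain HashMap
// (imports: java.util.HashMap, java.util.Map, java.util.concurrent.atomic.LongAdder)
private final Map<String, LongAdder> metrics = new HashMap<>();

public synchronized void record(String metric, long value) {
    metrics.computeIfAbsent(metric, k -> new LongAdder()).add(value);
}

// FAST: ConcurrentHashMap, no lock needed
// (imports: java.util.concurrent.ConcurrentHashMap, java.util.concurrent.atomic.LongAdder)
private final ConcurrentHashMap<String, LongAdder> metrics =
        new ConcurrentHashMap<>();

public void record(String metric, long value) {
    // computeIfAbsent is atomic per key; LongAdder.add is thread-safe
    metrics.computeIfAbsent(metric, k -> new LongAdder()).add(value);
}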
This case did not even need a lock. ConcurrentHashMap.computeIfAbsent handles thread safety internally. The synchronized was cargo-culted from an older codebase.
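A quick way to confirm that the lock-free version loses no updates is to hammer it from many virtual threads and compare the total with the expected count. MetricsDemo and the metric name "requests" are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

public class MetricsDemo {
    private final ConcurrentHashMap<String, LongAdder> metrics = new ConcurrentHashMap<>();

    public void record(String metric, long value) {
        metrics.computeIfAbsent(metric, k -> new LongAdder()).add(value);
    }

    public long total(String metric) {
        LongAdder adder = metrics.get(metric);
        return adder == null ? 0 : adder.sum();
    }

    // Records 1 into the same metric `perThread` times from each of `threads`
    // virtual threads and returns the final total.
    static long recordConcurrently(int threads, int perThread) {
        MetricsDemo demo = new MetricsDemo();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int t = 0; t < threads; t++) {
                executor.submit(() -> {
                    for (int i = 0; i < perThread; i++) {
                        demo.record("requests", 1);
                    }
                });
            }
        } // close() waits for all recorders to finish
        return demo.total("requests");
    }

    public static void main(String[] args) {
        System.out.println("total = " + recordConcurrently(100, 1_000));
    }
}
```

LongAdder.sum() is only guaranteed exact once concurrent updates have stopped, which is why the total is read after the executor has closed.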
Carrier Pool Tuning
The default carrier pool size (availableProcessors()) is correct for most workloads. Increase it only when:
- Pinning cannot be eliminated (third-party library uses synchronized with I/O). Increase maxPoolSize to compensate.
- CPU-bound virtual thread tasks exist alongside I/O-bound ones. The CPU tasks hold carriers without yielding.
- Native code (JNI) blocks inside virtual threads. Native frames cannot be unmounted.
# Increase carrier pool for workload with unavoidable pinning
java -Djdk.virtualThreadScheduler.parallelism=16 \
-Djdk.virtualThreadScheduler.maxPoolSize=512 \
-jar content-platform.jar
Monitor carrier pool saturation:
// The carrier pool is internal to the JDK. Thread.ofVirtual().factory() returns
// a ThreadFactory, not the scheduler, so casting it to ForkJoinPool fails with
// a ClassCastException. Monitor the pool through JFR events instead:
//   jdk.VirtualThreadSubmitFailed: the scheduler rejected a virtual thread
//   jdk.VirtualThreadPinned: a virtual thread blocked while pinned to its carrier
In JFR, jdk.VirtualThreadSubmitFailed events indicate that the scheduler rejected a virtual thread when it was started or unparked, so the thread could not be scheduled. Treat this as a critical signal: check whether the carrier pool has reached maxPoolSize, then either raise the limit or fix the pinning that triggered compensation thread creation.
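Both events can also be watched from inside the running application with a JFR RecordingStream, which avoids dumping and parsing a file. The sketch below only counts events. PinningMonitor, the counters, and the 30-second observation window are choices made for this example, and a real service would more likely export the counts to its metrics system.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

public class PinningMonitor {
    private static final AtomicLong pinned = new AtomicLong();
    private static final AtomicLong submitFailed = new AtomicLong();

    // Starts an in-process JFR stream that counts the two scheduler events.
    // The caller is responsible for closing the returned stream.
    static RecordingStream start(Duration pinThreshold) {
        RecordingStream stream = new RecordingStream();
        stream.enable("jdk.VirtualThreadPinned")
              .withThreshold(pinThreshold) // only pins longer than this are recorded
              .withStackTrace();
        stream.enable("jdk.VirtualThreadSubmitFailed");
        stream.onEvent("jdk.VirtualThreadPinned", e -> pinned.incrementAndGet());
        stream.onEvent("jdk.VirtualThreadSubmitFailed", e -> submitFailed.incrementAndGet());
        stream.startAsync(); // events are delivered on a background thread
        return stream;
    }

    static long pinnedCount() {
        return pinned.get();
    }

    static long submitFailedCount() {
        return submitFailed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        try (RecordingStream stream = start(Duration.ofMillis(1))) {
            Thread.sleep(Duration.ofSeconds(30)); // observe the application for a while
        }
        System.out.println("pinned events: " + pinnedCount());
        System.out.println("submit failures: " + submitFailedCount());
    }
}
```

JFR delivers streamed events in batches, so the counters can lag the application by a second or so.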