Use virtual threads starting from JDK 25.

Issues with virtual threads in JDK 21, along with fixes.

Virtual threads were the headline feature of Java 21, promising more efficient CPU usage and improved scalability. In theory, enabling them should improve throughput for I/O-bound workloads, but for microservices I would advise holding off until Java 25 is out.

Threads in Java #

Concurrency in Java is built on the thread model, where every piece of code is executed by a thread. Traditional threads in Java are thin wrappers over OS threads, and hence they are scheduled by the OS scheduler. To distinguish them from virtual threads, we will refer to them as platform threads from now on.

Creating and managing platform threads is expensive, both in CPU cycles and in the memory allocated for each thread. This limits the number of platform threads (and, equivalently, OS threads) that can be created, even when the CPU is nowhere near 100% usage. Additionally, when a platform thread performs a blocking operation, such as a network call or file I/O, the thread is suspended until the operation completes. The underlying OS thread cannot do any other work in the meantime, which leads to poor CPU utilization.

Java’s solution to this problem is virtual threads, which are purely a JVM construct and hence are created and scheduled by the JVM itself. Whenever code needs to run on a virtual thread, the virtual thread is mounted onto a platform thread (its carrier), which then executes the code. When a blocking operation occurs, the virtual thread unmounts from the platform thread, freeing it up to run other virtual threads. Once the virtual thread is unblocked, it is mounted onto an available platform thread, not necessarily the same one, to resume execution.
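To make the mount/unmount behaviour concrete, here is a minimal sketch, assuming JDK 21 or later. The class name `VirtualThreadDemo`, the method `runBlockingTasks`, and the 50 ms sleep standing in for a network call are all made up for illustration. Each task gets its own virtual thread, and while a task sleeps, its carrier is free to run other tasks.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadDemo {

    // Runs taskCount blocking tasks, one virtual thread per task, and
    // returns how many of them reported running on a virtual thread.
    static int runBlockingTasks(int taskCount) throws Exception {
        List<Future<Boolean>> futures = new ArrayList<>();
        // close() at the end of try-with-resources waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                futures.add(executor.submit(() -> {
                    // Hypothetical stand-in for a blocking network call.
                    Thread.sleep(Duration.ofMillis(50));
                    return Thread.currentThread().isVirtual();
                }));
            }
        }
        int ranOnVirtualThread = 0;
        for (Future<Boolean> future : futures) {
            if (future.get()) {
                ranOnVirtualThread++;
            }
        }
        return ranOnVirtualThread;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBlockingTasks(10_000) + " tasks ran on virtual threads");
    }
}
```

Spawning ten thousand platform threads for the same job would be far more costly. Here, only a small pool of carrier threads backs all of the virtual threads.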

This preserves Java’s thread-based concurrency model while improving scalability in I/O-heavy workloads. The fine-grained scheduling improves CPU utilization as well, since platform threads are released to do other work while a virtual thread does the waiting.

To learn more, I’d recommend reading JEP 444, which introduced them in JDK 21.

The issue we would have faced #

I was looking into using virtual threads in one of our services, because it handles the highest traffic while maintaining single digit latency. When I started poking into its dependencies to see if virtual threads were supported, I stumbled upon this issue raised in the Jedis repository (Redis client for Java) where threads were seemingly getting deadlocked. Diving deeper into the issue gave me more insight into the internals of virtual threads and the edge cases that I had missed while reading about them.

Pinning #

As mentioned earlier, when a blocking operation occurs, the virtual thread gets blocked while the platform thread is released to perform other tasks. But in JDK 21 this is not what happens inside synchronized blocks. Code within a synchronized block can be executed by only one thread at a time. When a virtual thread blocks inside a synchronized block, it does not release the platform thread to which it is attached. This is called pinning. It can lead to deadlocks when all the platform threads are pinned to blocked virtual threads while other virtual threads, including the ones that could unblock them, are waiting to mount onto a platform thread.
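The following sketch shows the pattern that triggers pinning on JDK 21. It is an illustration under assumptions, not code from Jedis or any other library. The names `PinningDemo` and `runSynchronizedSleepers` are hypothetical, and each task locks its own object, so the tasks never contend with each other. Any slowdown therefore comes from pinned carriers rather than from lock contention.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class PinningDemo {

    // Starts taskCount virtual threads that each block while holding a
    // monitor, and returns how many of them finished.
    static int runSynchronizedSleepers(int taskCount, long sleepMillis) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                Object lock = new Object(); // one lock per task: no contention
                executor.execute(() -> {
                    synchronized (lock) {
                        try {
                            // Blocking inside synchronized: on JDK 21 the
                            // virtual thread stays mounted on its carrier.
                            Thread.sleep(sleepMillis);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                    completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runSynchronizedSleepers(64, 200);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMillis + " ms");
    }
}
```

On JDK 21, I would expect the elapsed time to grow roughly with the number of tasks divided by the number of carrier threads, since each sleeping task holds on to a carrier. Running with `-Djdk.tracePinnedThreads=full` on JDK 21 should print a stack trace whenever a virtual thread blocks while pinned. On a JDK that includes the fix described in the next section, the same program should finish in roughly the time of a single sleep.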

Netflix and Yandex have good write-ups regarding the deadlock issues they’ve encountered in production after enabling virtual threads.

Solution #

Pinning is documented in the aforementioned JEP 444. Initially, the recommended way of solving this was to move away from the synchronized keyword and use locks from the java.util.concurrent.locks package. This is not feasible when the synchronized blocks sit deep inside dependencies that are large, legacy projects, such as apache-commons-pool, on which Jedis relies.
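For code you do own, the migration away from synchronized looks roughly like the sketch below. `LockedCounter` and its methods are hypothetical names used only for illustration. A virtual thread that blocks while holding a `ReentrantLock` can still unmount from its carrier, which avoids pinning on JDK 21.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    // Previously: public synchronized void increment() { ... }
    public void increment() {
        lock.lock();
        try {
            simulateBlockingCall(); // blocking here no longer pins the carrier
            value++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    public int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Hypothetical stand-in for I/O performed while holding the lock.
    private static void simulateBlockingCall() {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Increments one shared counter from `tasks` virtual threads.
    static int incrementConcurrently(int tasks) {
        LockedCounter counter = new LockedCounter();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(counter::increment);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("Final count: " + incrementConcurrently(100));
    }
}
```

The lock still serializes the increments, so correctness is unchanged. The only difference is how the JVM treats a virtual thread that blocks while holding it.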

This issue was fixed by JEP 491, which was delivered in JDK 24. Since most production systems only migrate between LTS versions of the JDK, it’s best to wait until JDK 25, the next LTS release, is generally available. Therefore, we have held off migrating to virtual threads, and will look into adopting them once JDK 25 is out.