TornadoVM 2.0 Brings Automatic GPU Acceleration and LLM Support to Java

TornadoVM 2.0: Heterogeneous Hardware Runtime for Java

The TornadoVM project has released version 2.0, an open-source runtime designed to automatically accelerate Java programs on CPUs, GPUs, and FPGAs. This release is especially relevant for developers building Large Language Model (LLM) solutions on the Java Virtual Machine (JVM).

While existing JVMs excel at portability and safety, they often struggle to fully utilize the potential of heterogeneous hardware. TornadoVM bridges this gap by offloading Java code to accelerators, managing memory transfers, and executing compute kernels, enabling significant performance gains for suitable workloads and reducing the cost of compute-intensive tasks.

Key Insights

Runtime Compilation: TornadoVM acts as a Just-In-Time (JIT) compiler, translating Java bytecode to OpenCL C, NVIDIA CUDA PTX, or SPIR-V binary.
Parallelism Models: Offers both a simple Loop Parallel API using annotations (@Parallel, @Reduce) and a more explicit Kernel API for GPU-style programming.
LLM Inference Library: Includes GPULlama3.java, a pure Java library for LLM inference on GPUs, removing external dependencies and simplifying setup.
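
The difference between the two programming styles can be sketched in plain Java, without any TornadoVM dependency. This is only an illustration of the idea: the class name KernelStyleSketch, the methods scaleLoop and scaleKernel, and the launch helper are hypothetical and are not part of the TornadoVM API. The parallel stream merely stands in for a device grid.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative only: emulates the two styles on the CPU with plain Java.
final class KernelStyleSketch {

    // Loop style: the programmer writes the whole loop. With TornadoVM the
    // loop variable would carry @Parallel to declare the iterations independent.
    static void scaleLoop(float[] data, float factor) {
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * factor;
        }
    }

    // Kernel style: the body handles exactly one element, selected by a
    // global index that the launcher supplies.
    static void scaleKernel(int globalIdx, float[] data, float factor) {
        data[globalIdx] = data[globalIdx] * factor;
    }

    // Hypothetical launcher standing in for a GPU grid: one call per index.
    // Each call writes a distinct element, so the calls do not interfere.
    static void launch(int globalSize, float[] data, float factor) {
        IntStream.range(0, globalSize)
                 .parallel()
                 .forEach(idx -> scaleKernel(idx, data, factor));
    }

    public static void main(String[] args) {
        float[] viaLoop = {1f, 2f, 3f, 4f};
        float[] viaKernel = {1f, 2f, 3f, 4f};
        scaleLoop(viaLoop, 2f);
        launch(viaKernel.length, viaKernel, 2f);
        System.out.println(Arrays.toString(viaLoop));   // [2.0, 4.0, 6.0, 8.0]
        System.out.println(Arrays.toString(viaKernel)); // [2.0, 4.0, 6.0, 8.0]
    }
}
```

In the loop style the runtime decides how iterations map to threads; in the kernel style the programmer reasons per thread, which is closer to how OpenCL or CUDA kernels are written.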

Working Example

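import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.arrays.FloatArray;

// @Parallel marks the loop iterations as independent, so they can be mapped to device threads.
public static void vectorMul(FloatArray a, FloatArray b, FloatArray result) {
    for (@Parallel int i = 0; i < result.getSize(); i++) {
        result.set(i, a.get(i) * b.get(i));
    }
}

// Copy the inputs to the device once; copy the result back after every run.
var taskGraph = new TaskGraph("multiply")
    .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, b)
    .task("vectorMul", Example::vectorMul, a, b, result)
    .transferToHost(DataTransferMode.EVERY_EXECUTION, result);
var snapshot = taskGraph.snapshot(); // immutable view of the task graph
new TornadoExecutionPlan(snapshot).execute();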

Practical Applications

LLM Inference: GPULlama3.java enables running LLMs like Llama 3 and Qwen3 directly within Java applications on GPUs.
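Pitfall: Workloads with loop-carried dependencies (where one iteration needs the result of a previous one) may not benefit from TornadoVM's acceleration, because the parallel loop model assumes independent iterations; careful analysis of code structure is required.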
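
The distinction between independent iterations and loop-carried dependencies can be illustrated in plain Java. This is a minimal sketch, not TornadoVM code: the class LoopDependencyDemo and its methods are hypothetical names chosen for the example, and a parallel stream stands in for an accelerator.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative only: contrasts a parallel-friendly loop with a sequential one.
final class LoopDependencyDemo {

    // Independent iterations: result[i] depends only on a[i] and b[i],
    // so the iterations may run in any order or concurrently.
    static float[] multiply(float[] a, float[] b) {
        float[] result = new float[a.length];
        IntStream.range(0, a.length)
                 .parallel()
                 .forEach(i -> result[i] = a[i] * b[i]);
        return result;
    }

    // Loop-carried dependency: out[i] needs the running total from
    // iteration i - 1, so this loop cannot be parallelised naively.
    static float[] prefixSum(float[] a) {
        float[] out = new float[a.length];
        float running = 0.0f;
        for (int i = 0; i < a.length; i++) {
            running += a[i];
            out[i] = running;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {2f, 2f, 2f, 2f};
        System.out.println(Arrays.toString(multiply(a, b))); // [2.0, 4.0, 6.0, 8.0]
        System.out.println(Arrays.toString(prefixSum(a)));   // [1.0, 3.0, 6.0, 10.0]
    }
}
```

A loop shaped like multiply is the kind of candidate the @Parallel annotation targets; a loop shaped like prefixSum would need to be restructured (for example as a parallel scan) before offloading is likely to help.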

References:

https://www.infoq.com/news/2025/12/tornadovm-20-gpu-llm/
