TornadoVM 2.0 Brings Automatic GPU Acceleration and LLM Support to Java

These articles are AI-generated summaries. Please check the original sources for full details.

TornadoVM 2.0: Heterogeneous Hardware Runtime for Java

The TornadoVM project has released version 2.0, an open-source runtime designed to automatically accelerate Java programs on CPUs, GPUs, and FPGAs. This release is especially relevant for developers building Large Language Model (LLM) solutions on the Java Virtual Machine (JVM).

While existing JVMs excel at portability and safety, they often struggle to make full use of heterogeneous hardware. TornadoVM bridges this gap by offloading Java code to accelerators, managing memory transfers, and executing compute kernels. For suitable workloads, this can yield significant performance gains and reduce the cost of compute-intensive tasks.

Key Insights

  • Runtime Compilation: TornadoVM acts as a Just-In-Time (JIT) compiler, translating Java bytecode to OpenCL C, NVIDIA CUDA PTX, or SPIR-V binary.
  • Parallelism Models: Offers both a simple Loop Parallel API using annotations (@Parallel, @Reduce) and a more explicit Kernel API for GPU-style programming.
  • LLM Inference Library: Includes GPULlama3.java, a pure Java library for LLM inference on GPUs, removing external dependencies and simplifying setup.
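
To make the idea behind a reduction such as the one @Reduce marks more concrete, here is a minimal sketch in plain Java. It does not use the TornadoVM API and is not how TornadoVM generates code; the class name ChunkedReductionSketch and the chunking scheme are assumptions chosen for illustration. It shows the general pattern a parallel reduction follows: independent partial results first, then a final combine step.

```java
import java.util.stream.IntStream;

final class ChunkedReductionSketch {

    // Sums `data` by giving each of `chunks` workers a contiguous slice,
    // then combining the per-chunk partial sums sequentially.
    static long sum(int[] data, int chunks) {
        long[] partial = new long[chunks];
        int chunkSize = (data.length + chunks - 1) / chunks; // ceiling division
        IntStream.range(0, chunks).parallel().forEach(c -> {
            int start = Math.min(c * chunkSize, data.length);
            int end = Math.min(start + chunkSize, data.length);
            long acc = 0;
            for (int i = start; i < end; i++) {
                acc += data[i];
            }
            partial[c] = acc; // each worker writes only its own slot
        });
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = IntStream.rangeClosed(1, 1000).toArray();
        System.out.println(sum(data, 8)); // prints 500500
    }
}
```

The sketch uses integer addition on purpose: it is associative, so the chunked result matches the sequential one exactly. With floating-point values the combine order can change the rounding, which is worth keeping in mind when comparing accelerated and sequential results.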

Working Example

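import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.arrays.FloatArray;

public class Example {
    // Element-wise multiply; @Parallel marks the loop iterations as independent.
    public static void vectorMul(FloatArray a, FloatArray b, FloatArray result) {
        for (@Parallel int i = 0; i < result.getSize(); i++) {
            result.set(i, a.get(i) * b.get(i));
        }
    }
    // Illustrative wrapper: inputs are copied once, the result is copied back after every run.
    public static void run(FloatArray a, FloatArray b, FloatArray result) {
        var taskGraph = new TaskGraph("multiply")
            .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, b)
            .task("vectorMul", Example::vectorMul, a, b, result)
            .transferToHost(DataTransferMode.EVERY_EXECUTION, result);
        new TornadoExecutionPlan(taskGraph.snapshot()).execute();
    }
}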

Practical Applications

  • LLM Inference: GPULlama3.java enables running LLMs like Llama 3 and Qwen3 directly within Java applications on GPUs.
  • Pitfall: Workloads with loop-carried dependencies, where one iteration reads a value written by an earlier one, may not benefit from TornadoVM’s acceleration; analyze the loop structure before offloading.
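
To make the term "loop dependency" concrete, here is a small sketch in plain Java (no TornadoVM API; the class and method names are hypothetical). It runs each loop in forward and in reverse order. A loop whose iterations are independent gives the same result either way, which is the property that makes it a candidate for an annotation like @Parallel. A loop with a loop-carried dependency does not.

```java
import java.util.Arrays;

final class LoopDependenceDemo {

    // Independent iterations: result[i] depends only on a[i] and b[i].
    static float[] multiply(float[] a, float[] b, boolean reversed) {
        int n = a.length;
        float[] result = new float[n];
        for (int k = 0; k < n; k++) {
            int i = reversed ? n - 1 - k : k;
            result[i] = a[i] * b[i];
        }
        return result;
    }

    // Loop-carried dependency: out[i] reads out[i - 1], which an earlier
    // iteration must already have written for the result to be correct.
    static float[] prefixSum(float[] a, boolean reversed) {
        int n = a.length;
        float[] out = new float[n];
        for (int k = 0; k < n; k++) {
            int i = reversed ? n - 1 - k : k;
            out[i] = (i == 0 ? 0f : out[i - 1]) + a[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {5f, 6f, 7f, 8f};
        // Order does not matter for the independent loop.
        System.out.println(Arrays.equals(multiply(a, b, false), multiply(a, b, true))); // prints true
        // Order changes the result when iterations depend on each other.
        System.out.println(Arrays.equals(prefixSum(a, false), prefixSum(a, true)));     // prints false
    }
}
```

Reordering iterations is only a stand-in for parallel execution, but it is a quick check: if a loop's output changes when its iteration order changes, running those iterations concurrently on an accelerator is unlikely to be safe without restructuring.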