Skip to main content

On This Page

Forking Apache Stateful Functions for Flink 2.x and Java 21

3 min read
Share

These articles are AI-generated summaries. Please check the original sources for full details.

Kzm Labs has launched a continuation of Apache Stateful Functions to bridge the gap between legacy dependencies and modern runtime requirements. The upstream project has not shipped a release since October 2024, leaving it pinned to Flink 1.16 and Java 11.

Why This Matters

The technical reality of modern stream processing requires features like disaggregated state and Java 21 support, which are currently absent in the upstream Apache StateFun repository. Without this continuation, engineers are forced to choose between a powerful stateful-actor model and the performance gains of Flink 2.x, such as decoupling state size from local disk ceilings through object storage caching.

Key Insights

  • Upstream Apache StateFun version 3.4.0, released in October 2024, remains locked to Java 11 and Flink 1.16.
  • Disaggregated state in Flink 2.x allows RocksDB state to reside in object storage, enabling horizontal scaling for massive stateful jobs.
  • The kzmlabs/flink-statefun continuation rewrote Kinesis ingress/egress to support the modern Flink 2.x KinesisStreamsSource and Sink APIs.
  • The build system was upgraded to Java 21 and Maven Shade Plugin 3.6.1 to handle bytecode-level relocation for Protobuf shading.
  • Security provenance is enforced via Sigstore keyless attestation and SLSA build provenance for all GHCR-published Docker images.

Working Examples

A standard hand-written keyed Flink job implementation.

public final class CounterFunction
extends KeyedProcessFunction<String, Event, Result> {
private static final long serialVersionUID = 1L;
private static final ValueStateDescriptor<Long> COUNT_DESCRIPTOR =
new ValueStateDescriptor<>("count", Types.LONG);
private transient ValueState<Long> count;
@Override
public void open(OpenContext ctx) {
this.count = getRuntimeContext().getState(COUNT_DESCRIPTOR);
}
@Override
public void processElement(Event event, Context ctx, Collector<Result> out)
throws Exception {
long next = Objects.requireNonNullElse(count.value(), 0L) + 1L;
count.update(next);
out.collect(new Result(event.id(), next));
}
}

The equivalent logic using the StateFun programming model with durable per-key state.

public final class Counter implements StatefulFunction {
private static final EgressIdentifier<Result> RESULTS = new EgressIdentifier<>("io.kzm.counter", "results", Result.class);
@Persisted
private final PersistedValue<Long> count = PersistedValue.of("count", Long.class);
@Override
public void invoke(Context context, Object input) {
if (input instanceof Event event) {
long next = Optional.ofNullable(count.get()).orElse(0L) + 1L;
count.set(next);
context.send(RESULTS, new Result(event.id(), next));
}
}
}

Practical Applications

  • Fraud detection and IoT digital twins: Implementing durable per-key state and exactly-once messaging without authoring complex Flink topologies. Pitfall: Using legacy Kinesis short names instead of ARNs in Flink 2.x will break the routing layer’s lookup table.
  • Polyglot stateful microservices: Deploying functions in Python, Go, or JavaScript while sharing state through the Flink runtime. Pitfall: Retaining Flink 1.x configuration keys like ‘state.backend’ in module.yaml will cause deployment failures on Flink 2.x.
  • Kubernetes-native streaming: Utilizing the Flink Kubernetes Operator 1.11 for automated deployments and scaling. Pitfall: Failing to update the Maven coordinate to io.github.kzmlabs.flinkstatefun will result in unresolved dependency errors for Flink 2.x environments.

References:

Continue reading

Next article

Mitigating Tokenization Drift: How Spacing and Formatting Impact LLM Performance

Related Content