Swiftly build and enhance your Kafka Streams applications.
Kstreamplify adds extra features to Kafka Streams, simplifying development so you can write applications with minimal effort and stay focused on business implementation.

- Overview
- Getting Started
- Avro Serializer and Deserializer
- Error Handling
- Web Services
- TopicWithSerde API
- Interactive Queries
- Hooks
- Deduplication
- Open Telemetry
- Swagger
- Motivation
- Contribution
Wondering what makes Kstreamplify stand out? Here are some of the key features that make it a must-have for Kafka Streams:
- 🚀 Bootstrapping: Automatically handles the startup, configuration, and initialization of Kafka Streams so you can focus on business logic instead of setup.
- 📝 Avro Serializer and Deserializer: Provides common Avro serializers and deserializers out of the box.
- ⛑️ Error Handling: Catches and routes errors to a dead-letter queue (DLQ) topic.
- ☸️ Kubernetes: Built-in readiness and liveness probes for Kubernetes deployments.
- 🤿 Interactive Queries: Easily access and interact with Kafka Streams state stores.
- 🫧 Deduplication: Remove duplicate events from your stream.
- 🧪 Testing: Automatically sets up the Topology Test Driver so you can start writing tests right away.
Kstreamplify simplifies bootstrapping Kafka Streams applications by handling startup, configuration, and initialization for you.
For Spring Boot applications, add the following dependency:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-spring-boot</artifactId>
    <version>${kstreamplify.version}</version>
</dependency>
Then, create a KafkaStreamsStarter
bean and override the KafkaStreamsStarter#topology()
method:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        // Define your topology here
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }
}
Define all your Kafka Streams properties directly from the application.yml
file, under the kafka.properties
key:
kafka:
  properties:
    application.id: 'myKafkaStreams'
    bootstrap.servers: 'localhost:9092'
    schema.registry.url: 'http://localhost:8081'
You're now ready to start your Kstreamplify Spring Boot application.
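Kstreamplify bootstraps Kafka Streams when the application starts, so the entry point stays a plain Spring Boot main class. A minimal sketch, assuming a standard Spring Boot setup (the class name is illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class MyKafkaStreamsApplication {
    public static void main(String[] args) {
        // Kstreamplify picks up the KafkaStreamsStarter bean and starts Kafka Streams automatically
        SpringApplication.run(MyKafkaStreamsApplication.class, args);
    }
}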
For simple Java applications, add the following dependency:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-core</artifactId>
    <version>${kstreamplify.version}</version>
</dependency>
Then, create a class that extends KafkaStreamsStarter
and override the KafkaStreamsStarter#topology()
method:
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        // Define your topology here
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }
}
From your main
method, create a KafkaStreamsInitializer
instance and initialize it with your KafkaStreamsStarter
child class:
public class MainKstreamplify {
    public static void main(String[] args) {
        KafkaStreamsInitializer myKafkaStreamsInitializer = new KafkaStreamsInitializer();
        myKafkaStreamsInitializer.init(new MyKafkaStreams());
    }
}
Define all your Kafka Streams properties in an application.yml
file, under the kafka.properties
key:
kafka:
  properties:
    application.id: 'myKafkaStreams'
    bootstrap.servers: 'localhost:9092'
    schema.registry.url: 'http://localhost:8081'
server:
  port: 8080
You're now ready to start your Kstreamplify Java application.
A few important notes:
- A server.port is required to enable the web services.
- The core dependency does not include a logger, so be sure to add one to your project (see the sketch below).
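For example, a minimal sketch of adding a logging backend to the core setup; Logback is just one option here, any SLF4J-compatible backend will do:

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>${logback.version}</version>
</dependency>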
Kstreamplify simplifies the use of the Topology Test Driver for testing Kafka Streams applications.
For both Java and Spring Boot applications, add the following dependency:
<dependency>
    <groupId>com.michelin</groupId>
    <artifactId>kstreamplify-core-test</artifactId>
    <version>${kstreamplify.version}</version>
    <scope>test</scope>
</dependency>
Create a test class that extends KafkaStreamsStarterTest.
Override the getKafkaStreamsStarter() method to provide your KafkaStreamsStarter implementation.
public class MyKafkaStreamsTest extends KafkaStreamsStarterTest {
    private TestInputTopic<String, KafkaUser> inputTopic;
    private TestOutputTopic<String, KafkaUser> outputTopic;

    @Override
    protected KafkaStreamsStarter getKafkaStreamsStarter() {
        return new MyKafkaStreams();
    }

    @BeforeEach
    void setUp() {
        inputTopic = testDriver.createInputTopic("input_topic", new StringSerializer(),
            SerdesUtils.<KafkaUser>getValueSerdes().serializer());
        outputTopic = testDriver.createOutputTopic("output_topic", new StringDeserializer(),
            SerdesUtils.<KafkaUser>getValueSerdes().deserializer());
    }

    @Test
    void shouldUpperCase() {
        inputTopic.pipeInput("1", user);

        List<KeyValue<String, KafkaUser>> results = outputTopic.readKeyValuesToList();

        assertEquals("FIRST NAME", results.get(0).value.getFirstName());
        assertEquals("LAST NAME", results.get(0).value.getLastName());
    }

    @Test
    void shouldFailAndRouteToDlqTopic() {
        inputTopic.pipeInput("1", user);

        List<KeyValue<String, KafkaError>> errors = dlqTopic.readKeyValuesToList();

        assertEquals("1", errors.get(0).key);
        assertEquals("Something bad happened...", errors.get(0).value.getContextMessage());
        assertEquals(0, errors.get(0).value.getOffset());
    }
}
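The test above also references a user record and a dlqTopic output topic. A hedged sketch of one way to set them up, assuming KafkaUser is an Avro-generated class with firstName and lastName fields and that the DLQ topic uses the provided KafkaError schema:

private TestOutputTopic<String, KafkaError> dlqTopic;

private final KafkaUser user = KafkaUser.newBuilder()
    .setFirstName("first name")
    .setLastName("last name")
    .build();

@BeforeEach
void setUpDlq() {
    // "dlq_topic" matches the name returned by MyKafkaStreams#dlqTopic()
    dlqTopic = testDriver.createOutputTopic("dlq_topic", new StringDeserializer(),
        SerdesUtils.<KafkaError>getValueSerdes().deserializer());
}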
Kstreamplify uses default properties for the tests.
You can provide additional properties or override the default ones by overriding the getSpecificProperties()
method:
public class MyKafkaStreamsTest extends KafkaStreamsStarterTest {
    @Override
    protected Map<String, String> getSpecificProperties() {
        return Map.of(
            STATE_DIR_CONFIG, "/tmp/kafka-streams"
        );
    }
}
When working with Avro schemas, you can use the SerdesUtils
class to easily serialize or deserialize records:
SerdesUtils.<MyAvroValue>getValueSerdes()
or
SerdesUtils.<MyAvroValue>getKeySerdes()
Here’s an example of how to use these methods in your topology:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        streamsBuilder
            .stream("input_topic", Consumed.with(Serdes.String(), SerdesUtils.<KafkaUser>getValueSerdes()))
            .to("output_topic", Produced.with(Serdes.String(), SerdesUtils.<KafkaUser>getValueSerdes()));
    }
}
Kstreamplify makes it easy to handle errors and route them to a dead-letter queue (DLQ) topic.
Override the dlqTopic()
method and return the name of your DLQ topic:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        // Define your topology here
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }
}
To catch processing errors and route them to the DLQ, use the ProcessingResult
class.
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> stream = streamsBuilder
            .stream("input_topic", Consumed.with(Serdes.String(), SerdesUtils.getValueSerdes()));

        TopologyErrorHandler
            .catchErrors(stream.mapValues(MyKafkaStreams::toUpperCase))
            .to("output_topic", Produced.with(Serdes.String(), SerdesUtils.getValueSerdes()));
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }

    private static ProcessingResult<KafkaUser, KafkaUser> toUpperCase(KafkaUser value) {
        try {
            value.setLastName(value.getLastName().toUpperCase());
            return ProcessingResult.success(value);
        } catch (Exception e) {
            return ProcessingResult.fail(e, value, "Something went wrong...");
        }
    }
}
The mapValues operation returns a ProcessingResult<V, V2>, where:
- The first type parameter (V) represents the transformed value upon success.
- The second type parameter (V2) represents the original value if an error occurs.
To mark a result as successful:
ProcessingResult.success(value);
To mark it as failed:
ProcessingResult.fail(e, value, "Something went wrong...");
Use TopologyErrorHandler#catchErrors()
to catch and route failed records to the DLQ topic. A healthy stream is returned and can be further processed as needed.
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        TopologyErrorHandler.catchErrors(
                streamsBuilder.stream("input_topic", Consumed.with(Serdes.String(), Serdes.String()))
                    .process(CustomProcessor::new)
            )
            .to("output_topic", Produced.with(Serdes.String(), Serdes.String()));
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }

    public static class CustomProcessor extends ContextualProcessor<String, String, String, ProcessingResult<String, String>> {
        @Override
        public void process(Record<String, String> record) {
            try {
                context().forward(ProcessingResult.wrapRecordSuccess(record.withValue(record.value().toUpperCase())));
            } catch (Exception e) {
                context().forward(ProcessingResult.wrapRecordFailure(e, record.withValue(record.value()), "Something went wrong..."));
            }
        }
    }
}
The process operation forwards a ProcessingResult<V, V2>, where:
- The first type parameter (V) represents the transformed value upon success.
- The second type parameter (V2) represents the original value if an error occurs.
To mark a result as successful:
ProcessingResult.wrapRecordSuccess(record);
To mark it as failed:
ProcessingResult.wrapRecordFailure(e, record, "Something went wrong...");
Use TopologyErrorHandler#catchErrors()
to catch and route failed records to the DLQ topic. A healthy stream is returned and can be further processed as needed.
Kstreamplify also provides handlers to manage production and deserialization errors by forwarding them to the DLQ.
Add the following properties to your application.yml file:
kafka:
  properties:
    default.deserialization.exception.handler: 'com.michelin.kstreamplify.error.DlqDeserializationExceptionHandler'
    default.production.exception.handler: 'com.michelin.kstreamplify.error.DlqProductionExceptionHandler'
The DLQ topic must have an associated Avro schema registered in the Schema Registry. You can find the schema here.
By default, uncaught exceptions will shut down the Kafka Streams client.
To customize this behavior, override the KafkaStreamsStarter#uncaughtExceptionHandler()
method:
@Override
public StreamsUncaughtExceptionHandler uncaughtExceptionHandler() {
    return throwable -> StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.SHUTDOWN_APPLICATION;
}
Kstreamplify exposes web services on top of your Kafka Streams application.
The /topology
endpoint returns the Kafka Streams topology description by default.
You can customize the path by setting the following property:
topology:
  path: 'custom-topology'
A set of endpoints is available to query the state stores of your Kafka Streams application. These endpoints leverage interactive queries and handle state stores across different Kafka Streams instances by providing an RPC layer.
The following state store types are supported:
- Key-Value store
- Timestamped Key-Value store
- Window store
- Timestamped Window store
Note that only state stores with String keys are supported.
Readiness and liveness probes are exposed for Kubernetes deployment, reflecting the Kafka Streams state.
These are available at /ready
and /liveness
by default.
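For instance, a minimal sketch of a Kubernetes container spec using these endpoints; the port and timing values are assumptions to adjust to your deployment:

livenessProbe:
  httpGet:
    path: /liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10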
You can customize the paths by setting the following properties:
kubernetes:
  liveness:
    path: 'custom-liveness'
  readiness:
    path: 'custom-readiness'
Kstreamplify provides an API called TopicWithSerde
that unifies all consumption and production points, simplifying the management of topics owned by different teams across multiple environments.
You can declare your consumption and production points in a separate class. This requires a topic name, a key SerDe, and a value SerDe.
public static TopicWithSerde<String, KafkaUser> inputTopic() {
    return new TopicWithSerde<>(
        "input_topic",
        Serdes.String(),
        SerdesUtils.getValueSerdes()
    );
}

public static TopicWithSerde<String, KafkaUser> outputTopic() {
    return new TopicWithSerde<>(
        "output_topic",
        Serdes.String(),
        SerdesUtils.getValueSerdes()
    );
}
Use it in your topology:
@Slf4j
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> stream = inputTopic().stream(streamsBuilder);
        outputTopic().produce(stream);
    }
}
The TopicWithSerde
API is designed to handle topics owned by different teams across various environments without changing the topology. It uses prefixes to differentiate teams and topic ownership.
In your application.yml
file, declare the prefixes in a key: value
format:
kafka:
  properties:
    prefix:
      self: 'staging.team1.'
      team2: 'staging.team2.'
      team3: 'staging.team3.'
Then, include the prefix when declaring your TopicWithSerde
:
public static TopicWithSerde<String, KafkaUser> inputTopic() {
    return new TopicWithSerde<>(
        "input_topic",
        "team1",
        Serdes.String(),
        SerdesUtils.getValueSerdes()
    );
}
The topic staging.team1.input_topic will be consumed when running the application with the staging application.yml file.
By default, if no prefix is specified, self
is used.
Kstreamplify encourages the use of fixed topic names in the topology, using the prefix feature to manage namespacing for virtual clusters and permissions. However, there are situations where you might want to reuse the same topology with different input or output topics.
In the application.yml
file, you can declare dynamic remappings in a key: value
format:
kafka:
  properties:
    topic:
      remap:
        oldTopicName: newTopicName
        foo: bar
The topic oldTopicName in the topology will be mapped to newTopicName.
This feature works with both input and output topics.
When testing, you can use the TopicWithSerde
API to create test topics with the same name as those in your topology.
TestInputTopic<String, KafkaUser> inputTopic = createInputTestTopic(inputTopic());
TestOutputTopic<String, KafkaUser> outputTopic = createOutputTestTopic(outputTopic());
Kstreamplify aims to simplify the use of interactive queries in Kafka Streams applications.
The value for the "application.server" property can be derived from various sources, following this order of priority:
- The environment variable defined by the application.server.var.name property (a deployment sketch follows after this list):
kafka:
  properties:
    application.server.var.name: 'MY_APPLICATION_SERVER'
- If not defined, it defaults to the APPLICATION_SERVER environment variable.
- If neither of the above is set, it defaults to localhost:<serverPort>.
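As an illustration, a hedged sketch of populating that environment variable in a Kubernetes deployment using the downward API; the variable name matches the application.server.var.name configured above, and the port is an assumption:

env:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: MY_APPLICATION_SERVER
    value: "$(POD_IP):8080"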
You can leverage the interactive query services provided by the web services layer to access and query the state stores of your Kafka Streams application:
@Component
public class MyService {
    @Autowired
    KeyValueStoreService keyValueStoreService;

    @Autowired
    TimestampedKeyValueStoreService timestampedKeyValueStoreService;

    @Autowired
    WindowStoreService windowStoreService;

    @Autowired
    TimestampedWindowStoreService timestampedWindowStoreService;
}
The web services layer provides a set of endpoints that allow you to query the state stores of your Kafka Streams application. You can find more details in the Interactive Queries Web Services section.
Kstreamplify provides the flexibility to execute custom code through hooks at various stages of your Kafka Streams application lifecycle.
The On Start hook allows you to execute custom code before the Kafka Streams instance starts.
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void onStart(KafkaStreams kafkaStreams) {
        // Execute code before starting the Kafka Streams instance
    }
}
Kstreamplify provides an easy way to deduplicate streams through the DeduplicationUtils class.
You can deduplicate based on various criteria and within a specified time frame.
All deduplication methods return a KStream<String, ProcessingResult<V, V2>>, which lets you handle errors and route them to the DLQ topic with TopologyErrorHandler#catchErrors().
Note: Only streams with String keys and Avro values are supported.
Deduplicate by key:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> myStream = streamsBuilder
            .stream("input_topic");

        DeduplicationUtils
            .deduplicateKeys(streamsBuilder, myStream, Duration.ofDays(60))
            .to("output_topic");
    }
}
Deduplicate by key and value:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> myStream = streamsBuilder
            .stream("input_topic");

        DeduplicationUtils
            .deduplicateKeyValues(streamsBuilder, myStream, Duration.ofDays(60))
            .to("output_topic");
    }
}
Deduplicate with a predicate that builds the deduplication key:
@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> myStream = streamsBuilder
            .stream("input_topic");

        DeduplicationUtils
            .deduplicateWithPredicate(streamsBuilder, myStream, Duration.ofDays(60),
                value -> value.getFirstName() + "#" + value.getLastName())
            .to("output_topic");
    }
}
In the predicate approach, the provided predicate is used as the key in the window store. The stream will be deduplicated based on the values derived from the predicate.
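Since every deduplication method returns a stream of ProcessingResult, it can be combined with TopologyErrorHandler#catchErrors() before producing, as in the error handling section. A hedged sketch:

@Component
public class MyKafkaStreams extends KafkaStreamsStarter {
    @Override
    public void topology(StreamsBuilder streamsBuilder) {
        KStream<String, KafkaUser> myStream = streamsBuilder
            .stream("input_topic", Consumed.with(Serdes.String(), SerdesUtils.<KafkaUser>getValueSerdes()));

        // catchErrors routes failed deduplication results to the DLQ and returns the healthy stream
        TopologyErrorHandler
            .catchErrors(DeduplicationUtils.deduplicateKeys(streamsBuilder, myStream, Duration.ofDays(60)))
            .to("output_topic", Produced.with(Serdes.String(), SerdesUtils.<KafkaUser>getValueSerdes()));
    }

    @Override
    public String dlqTopic() {
        return "dlq_topic";
    }
}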
Kstreamplify simplifies the integration of Open Telemetry with Kafka Streams applications in Spring Boot. It binds all Kafka Streams metrics to the Spring Boot registry, making monitoring and observability easier.
To run your application with the Open Telemetry Java agent, include the following JVM options:
-javaagent:/opentelemetry-javaagent.jar -Dotel.traces.exporter=otlp -Dotel.logs.exporter=otlp -Dotel.metrics.exporter=otlp
You can also add custom tags to the Open Telemetry metrics to help organize and filter them in your observability tools, like Grafana. Use the following JVM options to specify custom tags:
-Dotel.resource.attributes=environment=production,service.namespace=myNamespace,service.name=myKafkaStreams,category=orders
These tags will be included in the metrics, and you'll be able to see them in your logs during application startup, helping to track and filter metrics based on attributes like environment, service name, and category.
The Kstreamplify Spring Boot module integrates with Springdoc to automatically generate API documentation for your Kafka Streams application.
By default:
- The Swagger UI is available at http://host:port/swagger-ui/index.html.
- The OpenAPI documentation can be accessed at http://host:port/v3/api-docs.
Both the Swagger UI and the OpenAPI description can be customized using the Springdoc properties.
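For example, a hedged sketch of relocating both endpoints through standard Springdoc properties (the paths shown are illustrative):

springdoc:
  api-docs:
    path: '/api-docs'
  swagger-ui:
    path: '/swagger-ui.html'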
Developing applications with Kafka Streams can be challenging, with developers often facing various questions and obstacles. Key considerations include efficiently bootstrapping Kafka Streams applications, handling unexpected business logic issues, integrating Kubernetes probes, and more.
To assist developers in overcoming these challenges, we have built Kstreamplify. Our goal is to provide a comprehensive solution that simplifies the development process and addresses the common pain points encountered when working with Kafka Streams. By offering easy-to-use utilities, error handling mechanisms, testing support, and integration with modern tools like Kubernetes and OpenTelemetry, Kstreamplify aims to streamline Kafka Streams application development.
We welcome contributions from the community to help make Kstreamplify even better! If you'd like to contribute, please take a moment to review our contribution guide. There, you'll find our guidelines and best practices for contributing code, reporting issues, or suggesting new features.
We appreciate your support in making Kstreamplify a powerful and user-friendly tool for everyone working with Kafka Streams.