Supermetal Kafka Target
Built a Debezium-compatible Kafka target with multi-format support, optimized for low-latency replication and operational simplicity.
- rust
- kafka
- cdc
- avro
- json
Highlights
- Implemented topic routing with schema-aware payload formatting across Avro and JSON.
- Engineered high-throughput snapshot logic that cuts initial syncs that typically take days to a fraction of that time.
- Took the Kafka target from concept to code-complete implementation.
What it is
A high-throughput Kafka target for Supermetal that streams an initial snapshot plus ongoing CDC into Kafka topics. It can emit either a Debezium-compatible envelope (for drop-in integration with existing CDC consumers) or a compact Supermetal-native upsert format (to keep new pipelines minimal and consumer logic simple). The target supports JSON and Avro encoding, integrates with Confluent Schema Registry, and can optionally publish transactional writes aligned to source database transaction boundaries.
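To make the difference between the two output formats concrete, here is a minimal sketch in Rust using only the standard library. The Debezium-style function follows the publicly documented envelope fields (`before`, `after`, `source`, `op`, `ts_ms`), with `source` reduced to a single field for brevity. The compact shape (`row` plus a `deleted` flag) is purely an assumption for illustration and is not the actual Supermetal-native schema; the `Change` struct and both function names are likewise hypothetical.

```rust
/// Debezium operation codes: "c" create, "u" update, "d" delete, "r" snapshot read.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Create,
    Update,
    Delete,
    Read,
}

impl Op {
    fn code(self) -> &'static str {
        match self {
            Op::Create => "c",
            Op::Update => "u",
            Op::Delete => "d",
            Op::Read => "r",
        }
    }
}

/// One row change. `before` and `after` hold already-serialized JSON objects.
/// (Hypothetical struct for this sketch.)
struct Change<'a> {
    op: Op,
    before: Option<&'a str>,
    after: Option<&'a str>,
    table: &'a str,
    ts_ms: i64,
}

/// Renders a missing row image as JSON `null`.
fn or_null(v: Option<&str>) -> &str {
    v.unwrap_or("null")
}

/// Debezium-style envelope: both row images plus source metadata and op code.
fn debezium_envelope(c: &Change) -> String {
    format!(
        r#"{{"before":{},"after":{},"source":{{"table":"{}"}},"op":"{}","ts_ms":{}}}"#,
        or_null(c.before),
        or_null(c.after),
        c.table,
        c.op.code(),
        c.ts_ms
    )
}

/// Assumed compact upsert shape: the latest row image and a delete flag.
/// For deletes, the last known image (`before`) is carried instead of `after`.
fn compact_upsert(c: &Change) -> String {
    let deleted = c.op == Op::Delete;
    let row = if deleted { or_null(c.before) } else { or_null(c.after) };
    format!(r#"{{"row":{},"deleted":{}}}"#, row, deleted)
}
```

In this sketch an update produces a full envelope with both row images in the Debezium style, while the compact form carries only the new image, which is what keeps consumer-side upsert logic small.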
What I contributed
- Took the Kafka target from concept to a code-complete implementation (configuration surface, topic routing, producer lifecycle, and the end-to-end publish loop).
- Developed both output message formats:
  - Debezium-compatible events for existing consumers and tooling.
  - Supermetal-native events with minimal metadata to support simple upsert + dedupe semantics.
- Added multi-encoding support: JSON and Avro, including Confluent Schema Registry integration for schema registration/lookup and evolution-friendly contracts.
- Engineered high-throughput snapshot logic that keeps the target fast at scale, cutting initial syncs that typically take days to a fraction of that time.
- Implemented optional transactional producer mode so downstream consumers don’t observe partial multi-table transactions (while keeping batching and backpressure behavior safe under load).
- Built a settings-preset selector in the UI to simplify configuration.
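The transactional producer mode mentioned in the list above can be illustrated with a small sketch of the boundary logic: records arrive in source order, and each run of records sharing a source transaction ID is wrapped in exactly one begin/commit pair, so a consumer reading committed data never sees half of a multi-table transaction. The `TxnSink` trait, `Record` struct, and `LogSink` are hypothetical stand-ins written for this example; they are not the Supermetal implementation or any real Kafka client API, and batching and backpressure are left out.

```rust
/// One change record tagged with the source transaction that produced it.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, PartialEq)]
struct Record {
    txn_id: u64,
    topic: String,
    payload: String,
}

/// Minimal stand-in for a transactional producer (not a real client API).
trait TxnSink {
    fn begin(&mut self);
    fn send(&mut self, record: &Record);
    fn commit(&mut self);
}

/// Publishes records in source order, wrapping each run of records that
/// share a `txn_id` in exactly one begin/commit pair.
fn publish_by_transaction<S: TxnSink>(records: &[Record], sink: &mut S) {
    let mut current: Option<u64> = None;
    for record in records {
        if current != Some(record.txn_id) {
            // A new source transaction starts: close the previous one first.
            if current.is_some() {
                sink.commit();
            }
            sink.begin();
            current = Some(record.txn_id);
        }
        sink.send(record);
    }
    // Commit the final open transaction, if any records were seen.
    if current.is_some() {
        sink.commit();
    }
}

/// In-memory sink that logs every call, useful for checking the boundaries.
#[derive(Default)]
struct LogSink {
    log: Vec<String>,
}

impl TxnSink for LogSink {
    fn begin(&mut self) {
        self.log.push("begin".to_string());
    }
    fn send(&mut self, record: &Record) {
        self.log.push(format!("send {}={}", record.topic, record.payload));
    }
    fn commit(&mut self) {
        self.log.push("commit".to_string());
    }
}
```

Feeding `LogSink` two records from one source transaction that touch different topics, followed by one record from the next transaction, yields two begin/commit pairs, with the multi-table writes inside the first pair.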
Outcome / impact
- Enabled flexible low-latency replication: users can preserve Debezium semantics for compatibility or switch to the compact Supermetal format to reduce payload size and downstream complexity.
- Dramatically improved time-to-data for large backfills and new environments (compared with standard Debezium over Kafka Connect deployments) by accelerating initial snapshots while maintaining low-latency CDC streaming.
- Prioritized operability and correctness: clear configuration, predictable failure modes, schema-aware serialization, and an optional transactional delivery path when atomicity matters.
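As an illustration of what schema-aware serialization involves on the wire, the sketch below frames an already-encoded Avro body in the Confluent Schema Registry wire format: one magic byte (`0`), a 4-byte big-endian schema ID, then the body. The function names `frame` and `unframe` are hypothetical, and the sketch assumes the schema ID has already been obtained from the registry; it does not show registration, lookup, or Avro encoding itself.

```rust
/// Confluent wire format prefix: magic byte 0, then a 4-byte big-endian schema ID.
const MAGIC_BYTE: u8 = 0;
const HEADER_LEN: usize = 5;

/// Prepends the wire-format header to an already-encoded Avro body.
fn frame(schema_id: u32, avro_body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + avro_body.len());
    out.push(MAGIC_BYTE);
    out.extend_from_slice(&schema_id.to_be_bytes());
    out.extend_from_slice(avro_body);
    out
}

/// Splits a framed message into (schema ID, body).
/// Returns `None` if the message is too short or the magic byte is wrong.
fn unframe(message: &[u8]) -> Option<(u32, &[u8])> {
    if message.len() < HEADER_LEN || message[0] != MAGIC_BYTE {
        return None;
    }
    let id = u32::from_be_bytes([message[1], message[2], message[3], message[4]]);
    Some((id, &message[HEADER_LEN..]))
}
```

A consumer uses the embedded schema ID to fetch the writer schema from the registry before decoding the body, which is what lets producers evolve schemas without coordinating a simultaneous consumer upgrade.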
Tech (high-level)
Rust · Apache Arrow · Apache Kafka · CDC · JSON/Avro encoding · Confluent Schema Registry · Kafka transactions