I have been saying for several years that success with streaming data requires enterprises to manage data in motion alongside data at rest, rather than treating streaming as a niche activity. Software providers have also been moving in this direction. Many established data management providers have added the ability to manage, store and process streaming data alongside their existing batch data processing capabilities. At the same time, providers closely associated with streaming data, such as Confluent, have increased their support for batch processing, with the aim of delivering a holistic view of all data across an enterprise.
Confluent was founded in 2014 by the creators of the open-source Apache Kafka distributed event streaming platform.
While Confluent is still best known in relation to Apache Kafka, the company has expanded its capabilities well beyond messaging and event processing.
Confluent also recently announced the general availability of Tableflow, which automatically materializes Apache Kafka topics and schemas as Parquet files to be persisted in a data warehouse, data lake or cloud storage using open table formats. Support for Apache Iceberg is generally available, while the company also announced early access support for the Delta Lake format. The combination of batch and stream processing was also the cornerstone of the company’s announcements at its Current London event in May, including early access to a new feature called Snapshot Queries in Confluent Cloud for Apache Flink. Snapshot Queries automatically bound data sets to provide batch-style processing, enabling the use of Flink SQL to perform unified stream and batch processing. The ability to query both historical and real-time data extends to data stored in Apache Iceberg and Delta table formats via Tableflow. Also new in Confluent Cloud for Apache Flink is support for private networking on AWS and Microsoft Azure, as well as IP Filtering for Flink and Schema Registry for additional security.
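To illustrate the underlying concept of bounded versus unbounded execution, the sketch below uses standard open-source Flink SQL, where the same query can run in streaming or batch runtime mode. This is not Confluent's Snapshot Queries syntax, which may differ, and the `orders` table and its columns are hypothetical:

```sql
-- Illustrative only: standard open-source Flink SQL, not Confluent's
-- Snapshot Queries syntax. The 'orders' table is a hypothetical example.

-- Unbounded, streaming execution (the Flink SQL default): the query
-- runs continuously and emits updated results as new events arrive.
SET 'execution.runtime-mode' = 'streaming';
SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;

-- Bounded, batch-style execution over the same table: the query
-- processes the data set as a finite snapshot and then terminates.
SET 'execution.runtime-mode' = 'batch';
SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;
```

The appeal of the unified approach is that the same Flink SQL statement serves both cases; only the execution mode changes.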
While Confluent Cloud is the company’s flagship offering, Confluent Platform remains an important part of the portfolio for self-managed deployment on-premises and in the cloud. Version 8.0 of Confluent Platform was released in June and is based on version 4.0 of Apache Kafka. Both are significant updates as they remove the previous dependency on the Apache ZooKeeper distributed coordination project. While ZooKeeper was historically important for enabling the configuration and coordination of Apache Kafka in distributed environments, it has now been completely replaced by KRaft, which is an implementation of the Raft distributed consensus algorithm for Kafka. KRaft mode in Confluent Platform 8.0 provides native metadata management to deliver fault tolerance and scalability without the need to deploy and maintain ZooKeeper as a separate system. Confluent also recently introduced the latest version of Confluent Control Center, which provides operational monitoring and management for Confluent Platform deployments. Confluent Control Center has now been enhanced with Confluent Manager for Apache Flink, providing the ability to create, modify and monitor Flink environments and applications.
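As an illustration of what removing ZooKeeper means in practice, a minimal KRaft-mode configuration for open-source Apache Kafka might look like the following sketch. The node ID, host names, ports and log directory are placeholder assumptions, and Confluent Platform layers its own configuration on top:

```properties
# Minimal single-node KRaft configuration sketch (server.properties).
# Values are illustrative placeholders, not a production setup.
process.roles=broker,controller
node.id=1
# Raft quorum for metadata; no zookeeper.connect setting is required.
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/tmp/kraft-combined-logs
```

The key difference from a ZooKeeper-based deployment is that cluster metadata is managed by the Kafka controllers themselves via the Raft quorum, so there is no separate coordination system to deploy, secure and monitor.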
Confluent was rated as Exemplary in the ISG 2025 Buyers Guides for Real-Time Data, Streaming Data and Streaming Analytics, and Innovative in the 2025 ISG Buyers Guide for Messaging and Event Processing, as well as a Provider of Merit in the 2024 ISG Buyers Guide for Data Governance and Data Integration. Although capabilities such as Tableflow and Snapshot Queries have enhanced its support for long-term persistence and batch-based processing of historical event data, the company remains best known as an event and streaming data specialist. Nevertheless, I would encourage enterprises evaluating data architectures that provide a holistic view of all data—in motion and at rest—to consider streaming data platforms and Confluent alongside more traditional data platforms and providers.
Regards,
Matt Aslett