As I previously stated, although most enterprises are reliant on batch data processing, it is an artificial construct driven by the historical inability of computing systems to generate and process data at the same time without impacting performance. While real-time data processing has previously been seen as a niche requirement for low-latency applications, it is increasingly being adopted as a primary approach to data processing that enables enterprises to operate at the speed of business by acting on events as they happen. If artificial intelligence (AI) agents are to fulfill their potential and become core components of the next generation of enterprise software, they will also need to automate decisions as business events occur in real time, rather than when data related to those business events is later processed. This is the context for the recent announcement that IBM has signed a definitive agreement to acquire streaming data specialist Confluent for approximately $11 billion in cash.
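The gap between acting on an event as it happens and acting on it once its batch is processed can be shown with a small, purely illustrative Python sketch. The `Event` class, the one-hour batch interval and the 50-millisecond per-event latency are all hypothetical values chosen for the example. They are not measurements of any product.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since start, when the business event occurred
    amount: float


def batch_decision_delays(events, batch_interval):
    """Delay before each event can be acted on when processing runs per batch.

    Assumption: an event is only visible once the window containing it closes.
    """
    delays = []
    for e in events:
        # End of the batch window that contains this event.
        window_end = (int(e.timestamp // batch_interval) + 1) * batch_interval
        delays.append(window_end - e.timestamp)
    return delays


def streaming_decision_delays(events, per_event_latency):
    """Delay when each event is processed on arrival (hypothetical fixed latency)."""
    return [per_event_latency for _ in events]


# Three hypothetical business events within the same hour.
events = [Event(5.0, 120.0), Event(1800.0, 75.5), Event(3599.0, 10.0)]

batch = batch_decision_delays(events, batch_interval=3600.0)
stream = streaming_decision_delays(events, per_event_latency=0.05)

print("worst-case batch delay (s):", max(batch))
print("worst-case streaming delay (s):", max(stream))
```

In this sketch the event that occurs five seconds into the hour waits almost the full hour before a decision can be made on it under batch processing, while the event-at-a-time path responds within the assumed processing latency.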
Confluent was founded in 2014 by the creators of the open-source Apache Kafka distributed event streaming platform, which was originally developed at LinkedIn but has been widely adopted by thousands of enterprises to capture and process event data as it is generated by sensors, infrastructure and applications. Apache Kafka forms the basis of Confluent’s product portfolio, which includes Confluent Platform for self-managed deployment on-premises and in the cloud, as well as the Confluent Cloud managed service. The company has invested in security and governance capabilities with its Stream Governance suite, and Confluent added streaming analytics capabilities with the 2023 acquisition of Immerok and via support for the Apache Flink stream processing engine. In 2025 it delivered Tableflow, which enables event data to be persisted as tables in cloud storage and data lakehouse environments using open table formats such as Apache Iceberg and Delta Lake. Confluent has more than 6,500 customers and reported revenue of $964 million in fiscal 2024, an increase of 24% on the previous year, and is expecting fiscal 2025 revenue to exceed $1.1 billion.
IBM’s proposed acquisition of Confluent is driven in part by the increased adoption of streaming data and event processing, which is reflected in Confluent’s financial performance and potential for growth. I assert that by 2027, more than three-quarters of enterprises’ standard information architectures will include streaming data and event processing, allowing enterprises to be more responsive and provide better customer experiences. IBM stated that it anticipates the acquisition will be accretive to adjusted earnings within the first full year and to free cash flow in the second year.
IBM is no stranger to real-time data. The company’s IBM MQ publish and subscribe messaging system was initially launched in the early 1990s and IBM was also an early innovator in stream computing in the early 2000s with its System S research leading to the development of IBM Streams. Adoption of early stream event processing systems that required the use of dedicated programming languages, such as IBM Streams, was impacted by the explosion of open-source projects (including Apache Kafka, Apache Spark and Apache Storm) that occurred during the big data wave, and IBM Streams was eventually sold to 21CS in 2023. IBM currently offers IBM Event Streams, which is based on Apache Kafka and is available as a fully managed service on IBM Cloud or as part of IBM Event Automation and IBM Cloud Pak for Integration. IBM also acquired StreamSets in 2024 for real-time data integration. As required by SEC regulations, IBM is not disclosing any plans for the Confluent portfolio until after the completion of the acquisition. However, it seems highly likely that IBM Event Streams will be replaced by Confluent Platform and Confluent Cloud. IBM MQ and IBM StreamSets are potentially complementary to the Confluent portfolio. The acquisition provides IBM with the potential to create a platform that combines its various capabilities for messaging, streaming data and event processing, as well as real-time data and application integration and API management.
ISG’s 2025 Buyers Guides for Real-Time Data, Messaging and Event Processing, Streaming Data and Streaming Analytics illustrate why IBM made its bid to acquire Confluent. Both companies performed well, with Confluent rated as Exemplary for Real-Time Data, Streaming Data and Streaming Analytics, and Innovative for Messaging and Event Processing, and IBM rated as Exemplary for Real-Time Data, Messaging and Event Processing and Streaming Data, and a Provider of Assurance for Streaming Analytics. In terms of Capability, however, Confluent outperformed IBM in Messaging and Event Processing, Streaming Data and Streaming Analytics, with Confluent named a Leader in Messaging and Event Processing and Streaming Data. Like IBM, many other providers offer streaming products that are based on or compatible with Apache Kafka (including Aiven, Alibaba Cloud, Amazon Web Services, Google Cloud, Huawei Cloud, Microsoft, Oracle, Redpanda and Tencent Cloud), but Confluent is more than just Apache Kafka.
In recent years, Confluent undertook a major engineering project to develop its Kora engine to provide a cloud-native experience in Confluent Cloud, including tiered storage, elastic scaling, high availability and performance. Additionally, as I previously described, version 8.0 of Confluent Platform was significant in removing the previous dependency on the Apache ZooKeeper distributed coordination project, which was replaced by an implementation of the Raft distributed consensus algorithm. Acquiring Confluent provides IBM with these functional capabilities as well as the expertise that led the development of Kora and many relevant features in Apache Kafka. Given IBM’s longstanding involvement in many open-source projects, as well as its ownership of independent subsidiaries Red Hat and HashiCorp, I see no reason to think that Confluent’s relationship with the Apache Kafka project will be negatively impacted by its ownership by IBM, especially if it is similarly operated as an independent subsidiary (which remains to be seen).
As I indicated above, another key driver for the acquisition is the potential real-time data processing requirements for AI applications and agents. While historical data is used to train and tune AI models, AI agents designed to execute business processes through autonomous actions will need to do so in the context of the very latest business events and data, requiring integration with streaming and event processing. Although it is early days for the combination of AI and real-time data processing, I assert that by 2027, more than one-third of enterprises will integrate streaming and event processing with AI and GenAI inferencing to deliver interactive real-time applications. Confluent recently made significant progress in its AI strategy with the announcement of its Confluent Intelligence managed service for real-time AI applications and systems built on Apache Kafka and Apache Flink. Confluent Intelligence combines native AI/ML functions for anomaly detection, fraud prevention and forecasting with the ability to create retrieval-augmented generation workflows from Kafka topics and Flink tables, as well as Streaming Agents to build, deploy and orchestrate event-driven AI agents and a Real-Time Context Engine to integrate with AI applications and agents using Model Context Protocol.
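To give a sense of what evaluating each event against the very latest data can look like in practice, the following is a minimal Python sketch of streaming anomaly detection using a rolling z-score. It is a generic textbook technique, not a description of how Confluent Intelligence or Apache Flink implement anomaly detection. The class name `RollingAnomalyDetector`, the window size, the threshold and the sample values are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags a value that deviates strongly from a window of recent values.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window=5, threshold=3.0):
        self.values = deque(maxlen=window)  # most recent observations
        self.threshold = threshold          # z-score above which we flag

    def observe(self, value):
        """Score one event on arrival, then add it to the window."""
        is_anomaly = False
        # Only score once the window is full, so the baseline is meaningful.
        if len(self.values) == self.values.maxlen:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly


detector = RollingAnomalyDetector(window=5, threshold=3.0)

# Hypothetical stream of transaction amounts arriving one at a time.
stream = [100, 102, 98, 101, 99, 100, 500, 101]
flags = [detector.observe(v) for v in stream]

for value, flagged in zip(stream, flags):
    print(value, "ANOMALY" if flagged else "ok")
```

Each value is scored the moment it arrives, against only the most recent observations, which is the property an agent would rely on to act in the context of current events rather than waiting for a later batch run.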
The logical benefits of IBM acquiring Confluent are clear. As with any proposed acquisition, execution will have a significant role to play in making those theoretical benefits a reality. It seems highly likely that the deal will proceed as Confluent’s largest shareholders and investors representing almost two-thirds (62%) of Confluent’s outstanding common stock have agreed to vote in favor of the deal and against any alternatives. Nevertheless, with the proposed transaction expected to close by the middle of 2026 it will be some time before we begin to see the results of the combination. In the interim, I recommend that enterprises continue to include both IBM and Confluent in their evaluations of providers for real-time data use cases.
Regards,
Matt Aslett