Analyst Perspectives


Data Lakehouses Enable Data as a Product



I have previously described how data as a product was initially closely aligned with data mesh, a cultural and organizational approach to distributed data processing. As a result of data mesh's association with distributed data, many assumed that the concept was diametrically opposed to the data lake, which offered a platform for combining large volumes of data from multiple data sources. That assumption was always misguided: There was never any reason why data lakes could not be used as a data persistence and processing platform within a data mesh environment. In recent years, data as a product has gained momentum outside the context of data mesh, while data lakes have evolved into data lakehouses. It has become increasingly clear that data lakehouses and data as a product are well matched, as the data intelligence cataloging capabilities of a lakehouse environment can serve as the foundation to enable the development, sharing and management of data as a product.

The concept of the data lakehouse has become so ubiquitous that it is easy to forget that just a few years ago, it was closely associated with only a few providers. It was derided by many others, not least established purveyors of data warehouses. Today, many of those early naysayers have not only dropped their objections to the data lakehouse concept but have actively adapted to align with data lakehouse architecture to enable the unification of data from multiple sources to support various workloads, including analytics and artificial intelligence (AI). Potential adopters today would be hard-pressed to find an analytic data platform provider that does not claim at least coexistence with the data lakehouse.

Several factors influenced this change of perspective. The first is widespread adoption of cloud object storage and open file formats, which fundamentally altered the economics of storing and processing large volumes of data. Based on cloud object storage and open file formats such as Apache Parquet, Apache Avro and Apache ORC, data lakes provide a relatively inexpensive environment in which to combine data from multiple sources, especially the semi-structured and unstructured data that is not suitable for storing and processing in a traditional data warehouse. More than one-half (53%) of participants in our Analytics and Data Benchmark Research are in production with the use of object stores for analytics. Data has gravity: rather than trying to persuade enterprises to extract data from cloud object storage and open file formats and move it into their products, data warehousing providers instead opted to bring the data warehouse processing engines to the data.
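As a rough illustration of this pattern, here is a minimal sketch using the pyarrow library; the bucket name, region and path are assumptions for illustration, not drawn from any particular deployment. It writes and reads an open file format directly on object storage, with no warehouse load step involved.

```python
# A minimal sketch, assuming pyarrow is installed, AWS credentials are
# resolved from the environment, and a bucket named "analytics-lake"
# (hypothetical) exists.
import pyarrow as pa
import pyarrow.parquet as pq
from pyarrow import fs

# Connect to cloud object storage; the region is an assumption.
s3 = fs.S3FileSystem(region="us-east-1")

# Semi-structured event data that would be awkward to load into a
# traditional data warehouse.
events = pa.table({
    "event_id": [1, 2, 3],
    "payload": ['{"action": "view"}', '{"action": "click"}', '{"action": "view"}'],
})

# Persist as Apache Parquet in place on object storage.
pq.write_table(events, "analytics-lake/raw/events.parquet", filesystem=s3)

# Any Parquet-aware engine can now query the file where it sits; here we
# simply read it back to demonstrate.
print(pq.read_table("analytics-lake/raw/events.parquet", filesystem=s3).num_rows)
```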

The second factor that altered the perception of the data lakehouse was the popularization of open table formats, which provide interoperability with multiple data processing engines as well as the consistency and reliability guarantees required for processing business-critical data. As I previously explained, open table formats, namely Apache Hudi, Apache Iceberg and Delta Lake, are fundamental enablers of a data lakehouse, providing support for atomic, consistent, isolated and durable (or ACID) transactions and create, read, update and delete (or CRUD) operations. I assert that by 2027, more than 8 in 10 enterprises using data lakehouses will adopt open table formats to deliver support for ACID transactions and CRUD operations on data stored in object storage.
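By way of illustration, the sketch below shows CRUD operations that each commit as an ACID transaction on a table in object storage. It assumes PySpark with the Apache Iceberg Spark runtime and SQL extensions on the classpath; the catalog name "lake" and the warehouse path are hypothetical.

```python
# A minimal sketch, assuming the Apache Iceberg Spark runtime is available;
# catalog name and warehouse path are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-crud")
    # Iceberg's SQL extensions enable UPDATE and DELETE statements.
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://analytics-lake/warehouse")
    .getOrCreate()
)

# Create: define a namespace and an Iceberg table.
spark.sql("CREATE NAMESPACE IF NOT EXISTS lake.sales")
spark.sql("CREATE TABLE IF NOT EXISTS lake.sales.orders "
          "(id BIGINT, status STRING) USING iceberg")

# Insert, update and delete each commit atomically and in isolation.
spark.sql("INSERT INTO lake.sales.orders VALUES (1, 'open'), (2, 'open')")
spark.sql("UPDATE lake.sales.orders SET status = 'shipped' WHERE id = 1")
spark.sql("DELETE FROM lake.sales.orders WHERE id = 2")

# Read: concurrent readers always see a consistent committed snapshot.
spark.sql("SELECT * FROM lake.sales.orders").show()
```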

Of the three primary open table formats, Apache Iceberg has seen a significant uptick in community and provider support in recent years. One of the key features that encouraged data platform providers to coalesce around Apache Iceberg is the Iceberg REST Catalog, which provides a common API for interacting with any compatible Iceberg implementation. This facilitates governance and access controls for the diverse processing platforms and query engines that implement Iceberg. Interoperability via standard catalog APIs across the growing ecosystem of tools supporting Iceberg enables enterprises to use multiple engines to access and query data in a data lakehouse, including not only query engines such as Apache Spark, Trino and Presto but also the analytic databases offered by data warehousing providers.
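To make the interoperability point concrete, the configuration sketch below points Spark at an Iceberg REST Catalog; the endpoint and catalog name are hypothetical. Trino, Presto or a warehouse engine configured against the same endpoint would resolve the same tables under the same governance controls.

```python
# A minimal sketch, assuming the Iceberg Spark runtime and a REST catalog
# service at a hypothetical endpoint. Table metadata, governance and access
# controls are enforced behind the catalog's common API, not per engine.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-rest-catalog")
    .config("spark.sql.catalog.shared", "org.apache.iceberg.spark.SparkCatalog")
    # The "rest" catalog type delegates metadata operations to the service.
    .config("spark.sql.catalog.shared.type", "rest")
    .config("spark.sql.catalog.shared.uri", "https://catalog.example.com")
    .getOrCreate()
)

# Any engine registered against the same REST catalog sees the same table.
spark.sql("SELECT COUNT(*) FROM shared.sales.orders").show()
```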

The third factor that changed the perception of the data lakehouse was the widespread adoption of the medallion architecture design pattern. This involves using data operations pipelines to transform and refine data through three stages: bronze tables for raw ingested data; silver tables for cleansed, enriched and normalized data; and gold tables for curated data suitable for serving domain-oriented business requirements. When combined with the application of product thinking to data initiatives, the medallion pattern culminates in the delivery of data products suitable for sharing and consumption by business users, data scientists and agentic AI. The delivery of trusted data products has been facilitated by the incorporation into the data lakehouse of a metadata catalog layer providing data intelligence capabilities: a unified view of data in the lakehouse environment, as well as the identification and management of sensitive data, data lineage, auditing and access controls. The ability to support the delivery of data products is likely to become an increasingly important consideration.
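A compressed PySpark sketch of the pattern follows; the bucket, column and table names are hypothetical, and a production pipeline would add quality checks and orchestration around each stage.

```python
# A minimal sketch of the medallion pattern, assuming PySpark and
# hypothetical paths, columns and table names.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion").getOrCreate()

# Bronze: raw ingested events, kept as-is for auditability and replay.
bronze = spark.read.json("s3a://analytics-lake/bronze/events/")

# Silver: cleansed, normalized and deduplicated.
silver = (
    bronze.dropna(subset=["customer_id", "amount"])
    .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
    .dropDuplicates(["event_id"])
)

# Gold: a curated, domain-oriented table published as a data product.
gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("lifetime_value"))
gold.write.mode("overwrite").saveAsTable("gold.customer_lifetime_value")
```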

By nature, a data lakehouse is a complex environment. The breadth of capabilities spans multiple ISG Buyers Guides, including Analytic Data Platforms, Data Intelligence, Data Governance, Data Operations and Data Products. As such, considerable skill and resources are required to configure and maintain a data lakehouse, addressing a combination of ingestion, pipeline management, table optimization, data governance and lineage, catalog integrations and query tuning across a variety of query engines. I recommend that enterprises investigating the data lakehouse approach evaluate potential providers based on not only core data processing functionality but also the availability of additional tooling that facilitates data operations pipelines and the delivery of data as a product.

Regards,

Matt Aslett
Director of Research, Analytics and Data

Matt Aslett leads the software research and advisory for Analytics and Data at ISG Software Research, covering software that improves the utilization and value of information. His focus areas of expertise and market coverage include analytics, data intelligence, data operations, data platforms, and streaming and events.
