Combining the Best of Self-Managed and Managed Services for Data Agility
Defining “Bring Your Own Cloud”
The adoption of cloud environments for analytic workloads has been a key feature of the data platforms sector in recent years. Many enterprises adopted cloud-based data platforms with a view to improving operational efficiencies by reducing the need for upfront investment in on-premises physical infrastructure and facilitating the ability to scale cloud services up and down to match fluctuating requirements. Adoption of cloud computing has also enabled a reduction in management overhead, as data infrastructure software providers take responsibility for monitoring and managing the underlying infrastructure, thus allowing enterprises to focus on data processing and applications. But on-premises versus cloud is a false dichotomy. Most enterprises are using a combination of on-premises and cloud computing resources.
“Cloud” is itself a nebulous term. Not only will most enterprises use multiple cloud data infrastructure providers, but they will also use a range of different cloud models. The different cloud consumption approaches can be thought of as a spectrum. At one end are fully managed services, in which the provider is responsible for managing and maintaining all aspects of the product offering. At the other end is a self-managed approach, through which the enterprise is responsible for managing its use of cloud resources.
In the middle of this spectrum is “bring your own cloud” (BYOC)—an approach that is growing in popularity for data and analytics workloads and one that seeks to provide the best of both worlds. BYOC involves the deployment and management of a software provider’s product in a customer’s cloud account. BYOC is based on a shared responsibility model through which the software provider is responsible for managing service availability, upgrades and disaster recovery via a control plane that resides in the customer’s preferred cloud environment. The customer retains responsibility for managing applications and their performance, as well as user authentication and authorization. The enterprise is responsible for the data plane that resides in their own preferred cloud environment, retaining complete control of the data.
Benefits of BYOC
Retaining control of the data plane in a BYOC environment has several benefits, not least security and data governance—two of the most significant cloud challenges faced by participants in ISG’s 2024 Market Lens Cloud Study. With a fully managed service, the enterprise has little control over the location in which the data is stored and processed, other than perhaps the ability to select regions. This can make it unsuitable for workloads involving personally identifiable information that is subject to regulations such as GDPR and HIPAA. While many cloud providers have policies around protecting data privacy, emerging data sovereignty requirements place additional burdens on enterprises to not only have reassurances about how the data is stored and processed, but control over the physical infrastructure used to store and process the data, including the country in which it resides. Separating the vendor and customer environments helps address these requirements by enabling the enforcement of multiple layers of security on top of the security features provided by the software provider.
Performance and reliability are also significant potential benefits of BYOC. Managing the data plane enables an enterprise to exert more control over the performance of the data processing software. At its most extreme, this could mean the customer being able to continue operating even during a failure of the software provider’s operational control plane. More typically, it will put the enterprise in control of scale: the number and type of infrastructure resources allocated to the data plane to address performance requirements. Streaming latency and query performance can also be improved significantly because the BYOC environment is located closer to the customer’s applications.
The flip side of performance is cost. Having control of the resources used for data processing enables the customer to take responsibility for the scale and cost of the resources used. BYOC can also enable enterprises to avoid cloud egress costs that would otherwise be involved in moving data to and from the software provider’s environment. Associated networking costs can potentially be lowered when the BYOC environment is within the customer’s cloud network, and BYOC can help enterprises avoid costs associated with querying “remote” data in their streaming lakehouse using Apache Iceberg. The customer can continue to enjoy the benefits related to committed spend levels or discounts, and can avoid the unexpected runaway costs that can occur when the use of fully managed services is not properly monitored.
Challenges and Architectural Considerations
While BYOC is an increasingly popular approach for data and analytics workloads, it is not without challenges. In addition to managing the data processing workloads, the enterprise also needs to take responsibility for tasks such as network management, access control and service integration. The benefits may outweigh the overhead involved, but these factors should be considered. Some cloud data infrastructure providers do cover these aspects if the customer chooses (for example, by creating a virtual private cloud, or by setting cloud storage bucket permissions), which reduces this responsibility and the skills required to deploy. Enterprises adopting BYOC also need to ensure they have absolute clarity from their software and cloud providers as to which party is responsible for which aspects of operations and security. Any ambiguity poses a potential risk. Clarity in relation to responsibility will be particularly important should anything go wrong, to ensure that the enterprise can identify which party is responsible for troubleshooting performance or reliability issues and for maintaining and fixing each element of a data processing and analytics stack.
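One way to make this clarity concrete is to write the division of responsibility down as data and check it mechanically. The following is a minimal sketch in Python, not a description of any provider's tooling: the names `Party`, `RESPONSIBILITIES`, `find_ambiguities` and `owner_of` are hypothetical, and the assignments simply mirror the split described in this article. A real deployment would take them from the provider's contract and documentation.

```python
from enum import Enum


class Party(Enum):
    PROVIDER = "software provider"
    CUSTOMER = "customer"


# Illustrative assignments based on the split described in the article.
# These are assumptions for the sketch, not any vendor's actual terms.
RESPONSIBILITIES = {
    "service availability": {Party.PROVIDER},
    "upgrades": {Party.PROVIDER},
    "disaster recovery": {Party.PROVIDER},
    "applications and their performance": {Party.CUSTOMER},
    "user authentication and authorization": {Party.CUSTOMER},
    "network management": {Party.CUSTOMER},
    "access control": {Party.CUSTOMER},
    "service integration": {Party.CUSTOMER},
}


def find_ambiguities(matrix):
    """Return every aspect that has no owner or more than one owner."""
    return {aspect: owners for aspect, owners in matrix.items() if len(owners) != 1}


def owner_of(matrix, aspect):
    """Return the single responsible party, or raise if ownership is unclear."""
    owners = matrix.get(aspect, set())
    if len(owners) != 1:
        raise LookupError(f"ownership of {aspect!r} is ambiguous or undefined")
    return next(iter(owners))


if __name__ == "__main__":
    # An empty result means every listed aspect has exactly one owner.
    print(find_ambiguities(RESPONSIBILITIES))
    print(owner_of(RESPONSIBILITIES, "upgrades").value)
```

Any aspect that `find_ambiguities` reports, whether it is claimed by both parties or by neither, is the kind of gap that should be resolved with the provider before something goes wrong.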
BYOC adopters should also be aware that different software providers have different approaches to BYOC. Some providers offer a shared-nothing architecture in which the software provider’s control plane is completely separated from the data plane. Others offer a shared-storage architecture in which a shared and managed metadata service resides in the control plane. Proponents of the shared-nothing approach argue that the managed metadata service used in the shared-storage approach gives the data plane an unnecessary external dependency that can prevent the customer’s software from functioning properly if the control plane fails.
Conversely, proponents of the shared-storage approach argue that the separated metadata layer gives the provider insight into the operation of the data plane. Without it, the customer of a shared-nothing architecture must grant the provider access to troubleshoot potential failures on the data plane. Either way, BYOC is an increasingly attractive proposition for data workloads, enabling the customer to delegate permissions to the provider for automating provisioning, upgrades and repair. Ultimately, the customer remains in control and can revoke specific permissions if it chooses to do so. The customer also owns not only the data but also the complete audit trail, which never leaves the customer’s premises.
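The trade-off between the two architectures can be illustrated with a deliberately simplified model. The sketch below is an assumption-laden toy, not how any real product behaves: the `Deployment` class and its fields are hypothetical, and it reduces each side's argument to a single boolean rule (a data plane that depends on provider-managed metadata is treated as unavailable whenever the control plane is down, and a provider without that metadata can troubleshoot only if the customer has granted access).

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Toy model of a BYOC deployment (hypothetical, for illustration only)."""

    name: str
    # True models shared-storage (metadata service lives in the control plane);
    # False models shared-nothing (data plane has no such dependency).
    metadata_in_control_plane: bool
    # Whether the customer has delegated troubleshooting access to the provider.
    provider_access_granted: bool = False

    def data_plane_available(self, control_plane_up: bool) -> bool:
        # Simplification: a metadata dependency makes the data plane
        # unavailable for the duration of a control plane outage.
        return control_plane_up or not self.metadata_in_control_plane

    def provider_can_troubleshoot(self) -> bool:
        # The provider either sees metadata directly or needs explicit access.
        return self.metadata_in_control_plane or self.provider_access_granted


shared_nothing = Deployment("shared-nothing", metadata_in_control_plane=False)
shared_storage = Deployment("shared-storage", metadata_in_control_plane=True)

if __name__ == "__main__":
    for d in (shared_nothing, shared_storage):
        print(
            d.name,
            "| survives control plane outage:", d.data_plane_available(False),
            "| provider can troubleshoot:", d.provider_can_troubleshoot(),
        )
```

In this model the customer of the shared-nothing deployment can set `provider_access_granted` to `True` when help is needed and back to `False` afterwards, which corresponds to the delegation and revocation of permissions described above.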
The potential for improved data sharing across different workloads is enabled by increased adoption of open table formats such as Apache Iceberg and Delta Lake. Support for open table formats is now a critical feature for providers of analytic data platforms to enable the persistence and analysis of structured and unstructured data in object storage. Apache Iceberg also enables portable data sets that support multi-modal access by both lakehouses and data engineering pipelines and that are interoperable with many industry-leading data platform products and open-source technologies. Streaming data providers have also recently added support for converting streaming data into Apache Iceberg tables for long-term persistence and analysis of event data.
There has been a significant uptick in community and vendor support for Apache Iceberg in recent years, with deeper adoption by AWS, ClickHouse, Databricks, Dremio and Snowflake, among others, positioning Apache Iceberg to become the lingua franca for storing and analyzing both transactional and event data. Meanwhile, interest in BYOC is also being boosted by enterprise investments in sovereign artificial intelligence (AI). Rather than training models with enterprise data by sending it over the public internet to cloud AI services, many enterprises are looking to the BYOC approach to enable them to bring AI models to their data in VPC environments, reducing data movement and processing costs, improving AI lineage tracking, and protecting confidential and private intellectual property.
BYOC offers enterprises a flexible approach that combines the benefits of both fully managed and self-managed cloud services, allowing for greater control over data while leveraging provider expertise for software management. As cloud-based data platforms continue to evolve, enterprises should consider how to harness BYOC to enhance both data agility and data privacy while reducing costs.
About ISG Software Research
ISG Software Research provides expert market insights on vertical industries, business, AI and IT through comprehensive consulting, advisory and research services with world-class industry analysts and client experience. Our ISG Buyers Guides offer comprehensive ratings and insights into technology providers and products. Explore our research at research.isg-one.com.
About ISG Research
ISG Research provides subscription research, advisory consulting and executive event services focused on market trends and disruptive technologies driving change in business computing. ISG Research delivers guidance that helps businesses accelerate growth and create more value. For more information about ISG Research subscriptions, please email contact@isg-one.com.
About ISG
ISG (Information Services Group) (Nasdaq: III) is a leading global technology research and advisory firm. A trusted business partner to more than 900 clients, including more than 75 of the world’s top 100 enterprises, ISG is committed to helping corporations, public sector organizations, and service and technology providers achieve operational excellence and faster growth. The firm specializes in digital transformation services, including AI and automation, cloud and data analytics; sourcing advisory; managed governance and risk services; network carrier services; strategy and operations design; change management; market intelligence and technology research and analysis. Founded in 2006 and based in Stamford, Conn., ISG employs 1,600 digital-ready professionals operating in more than 20 countries—a global team known for its innovative thinking, market influence, deep industry and technology expertise, and world-class research and analytical capabilities based on the industry’s most comprehensive marketplace data.
For more information, visit isg-one.com.