What is it about human beings? We have two beautiful dog breeds, the Poodle and the Labrador, and somebody feels the need to create a Labradoodle (though they are pretty cute!). Or we invent an electric car but cannot let go of our reliance on existing infrastructure, so we invent a hybrid car. With the Labradoodle, we clearly believe a hybrid will accomplish something new or better. With the hybrid car, we want to have our cake and eat it, too: benefiting from new, innovative technology while keeping our investment in a known system.
In the case of Hybrid Cloud, well, it’s a little bit of both, isn’t it? We see the obvious benefits in a cloud architecture. We love the infinite and inexpensive capacity and the elasticity to quickly scale up and down to meet our business goals. We benefit from the agility to rapidly spin up new business capabilities. We value the ability to spend only what we need and when we need it, rather than making hefty upfront capital expenditures. Just as important, all this frees up our budgets and resources to invest in truly differentiating technical capabilities.
If we could just “forklift” our entire data center to the cloud, perhaps we would. But, alas, we cannot. Almost every company we speak with tells us they are redirecting workloads to the cloud. However, they cannot do it all at once, and they cannot do it overnight. Certain on-premises investments remain for now.
As you pursue your own journey to the cloud, you will undoubtedly find yourself with a hybrid cloud architecture. Some workloads will run in public cloud, some in private cloud, and others may remain on-premises. In fact, according to a 2015 study by 451 Research (1), nearly half of respondents are using public cloud or will be within the next six months; two-thirds are already using SaaS (Software as a Service) or will be within the next six months; and over a third are using PaaS (Platform as a Service) or plan to within the next six months.
But what about managing your data in a hybrid cloud architecture? As you might imagine, at Informatica we have a few thoughts on this dilemma. Integrating disparate data has never been easy, and the hybrid cloud paradigm has only exacerbated the problem. Data is everywhere and multiplying at staggering rates. With the movement to cloud, we are making a conscious decision to spread our data even further, across multiple cloud destinations. In fact, the nice, contained on-premises universe we had before now looks like a simple data management picture compared to what’s ahead!
As you redirect workloads to public cloud and manage your initiatives, such as hybrid data warehousing and hybrid application integration, there are four major types of data management challenges that will surface time and time again:
1) Data Connectivity: In a hybrid cloud architecture, we need to integrate and manage data from an ever-growing number and variety of data systems, which may reside in public cloud, in private cloud, or on-premises. And we need to deliver data faster than ever before. This demands a data management architecture that provides out-of-the-box connectivity to any data source and target. Our need for speed requires native, high-performance connectivity that, at the same time, abstracts the native complexity away from the developer. Realistically, how many data systems can your average developer build native expertise in? Separating integration logic from underlying sources greatly improves productivity. It also allows developers to easily reuse integration logic, such as mappings, across data sources and targets, further increasing speed and productivity. Make sure your data management solution provides robust connectivity to any data system, ranging from cloud to mainframe and anything in between.
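To make the idea of separating integration logic from underlying sources concrete, here is a minimal, hypothetical sketch. The `Connector` interface, the `InMemoryConnector` stand-in, and the mapping function are all illustrative assumptions, not any real product API; the point is only that the mapping is written once and reused across any source/target pair.

```python
from abc import ABC, abstractmethod

class Connector(ABC):
    """Abstracts a data system's native protocol behind a common interface."""

    @abstractmethod
    def read(self):
        """Yield records as dicts, hiding the source's native access details."""

    @abstractmethod
    def write(self, records):
        """Persist records, hiding the target's native load details."""

class InMemoryConnector(Connector):
    # Stand-in for any concrete connector (cloud API, RDBMS, mainframe file).
    def __init__(self, rows=None):
        self.rows = rows or []

    def read(self):
        yield from self.rows

    def write(self, records):
        self.rows.extend(records)

def uppercase_name_mapping(record):
    # The reusable integration logic: transforms a record, source-agnostic.
    return {**record, "name": record["name"].upper()}

def run_mapping(source: Connector, target: Connector, mapping):
    # The same mapping runs unchanged against any source/target pair.
    target.write(mapping(r) for r in source.read())

source = InMemoryConnector([{"id": 1, "name": "ada"}])
target = InMemoryConnector()
run_mapping(source, target, uppercase_name_mapping)
print(target.rows)  # → [{'id': 1, 'name': 'ADA'}]
```

Swapping either connector for one backed by a different system leaves `uppercase_name_mapping` and `run_mapping` untouched, which is exactly the productivity gain described above.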
2) Scalability: As data volumes continue growing, a big advantage of moving to the cloud is the ability to scale your environment virtually without limit and deliver high performance at a fraction of the cost. However, doesn’t it defeat the purpose of architecting a massive-scale data warehouse if it takes an inordinate amount of time to extract and load the data from sources into the warehouse? This is akin to filling an ocean through a straw. Similarly, moving vast amounts of data from on-premises systems to public cloud can be time-consuming without the right tools. To truly benefit from the petabyte-scale capacity of public cloud data systems, your architecture should include a data integration platform that is likewise designed for massive scalability and high performance. That platform should be inherently capable of moving large volumes of data at lightning speed, and able to dynamically scale up and down as your environment changes.
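The difference between the “straw” and a scalable pipeline is essentially bulk transfer plus parallelism. This hypothetical sketch (the `upload_chunk` stand-in and all names are assumptions for illustration) shows the shape of the idea: split the data into chunks and move them with a worker pool that can be sized up or down.

```python
from concurrent.futures import ThreadPoolExecutor

def chunks(rows, size):
    """Split rows into bulk chunks instead of moving one record at a time."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def upload_chunk(chunk):
    # Stand-in for a bulk-load call against a cloud target.
    return len(chunk)

def parallel_load(rows, chunk_size=1000, workers=8):
    # The worker count scales with the environment; each worker moves a
    # whole chunk, so throughput grows with parallelism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(upload_chunk, chunks(rows, chunk_size)))

loaded = parallel_load(list(range(10_000)), chunk_size=1000, workers=4)
print(loaded)  # → 10000
```

Tuning `chunk_size` and `workers` is the knob this paragraph alludes to: the pipeline scales up for a large initial migration and back down for steady-state loads.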
3) Data Visibility: As your data environment becomes more complex and intertwined, full visibility into data flows throughout your environment is more critical than ever. This need for data comprehension applies to a wide range of stakeholders within your organization. Business stakeholders, such as data analysts, need to fully understand the origins of the data driving their analyses. They need to know where the data came from, who touched it in the course of its journey, and how it was transformed at any point along the way. All stakeholders, business and technical alike, need a common and consistent definition for the business terms used in data management and analysis. The term ‘revenue recognized,’ for example, should mean exactly the same thing to anyone discussing end-of-quarter revenue. Developers must understand all the ways in which certain data is moved and transformed across the enterprise. This way, when a code change is required, they can analyze the risk of the change to the organization and determine how systems are impacted, in turn allowing them to plan and productively execute changes. The only way to provide data comprehension to all your stakeholders is via a metadata-driven data management platform. Metadata is the backbone of a reliable, self-documenting data management solution and the foundation of any governance initiative. In a complex hybrid cloud environment, chaos will reign without robust metadata management.
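The lineage questions above (“where did this data come from, and what touched it?”) become answerable once every integration step records metadata. This is a hypothetical, minimal sketch, with made-up dataset names, of how a metadata store can back such lineage queries:

```python
# A tiny metadata store: each integration step records what it read and wrote.
lineage = []

def record_step(name, inputs, output):
    lineage.append({"step": name, "inputs": inputs, "output": output})

# Two illustrative steps in a hybrid flow (names are invented for the example).
record_step("extract", ["crm.orders"], "staging.orders")
record_step("transform", ["staging.orders"], "warehouse.revenue")

def upstream(dataset):
    """Walk the recorded metadata to find every source feeding a dataset."""
    sources = set()
    for entry in lineage:
        if entry["output"] == dataset:
            for inp in entry["inputs"]:
                sources.add(inp)
                sources |= upstream(inp)  # recurse to the original origins
    return sources

print(sorted(upstream("warehouse.revenue")))  # → ['crm.orders', 'staging.orders']
```

An analyst asking about `warehouse.revenue` can trace it back to `crm.orders` without reading any pipeline code, and a developer can run the query in reverse to assess the blast radius of a change, which is the impact analysis described above.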
4) Operational Control: In hybrid cloud, you will have multiple, complex business processes that span your environment and touch many data systems and applications, both in cloud and on-premises. For this reason, a central point of control is more critical than ever to the success of your business. To ensure operational confidence in mission-critical data integration processes, you need the ability to orchestrate, administer, and monitor your production data as it flows through your end-to-end environment. Whether you are moving data from on-premises to public cloud, loading your hybrid data warehouse, or integrating across your hybrid application ecosystem, the ability to manage all of this from a central point of control is key.
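The core of that central point of control is an orchestrator that runs each step in order, tracks its status, and halts downstream work on failure so operators can intervene. A hypothetical sketch, with invented task names standing in for real on-premises and cloud steps:

```python
def extract_on_prem():
    # Stand-in for an on-premises extraction step.
    return "extracted"

def load_cloud_warehouse():
    # Stand-in for a cloud data warehouse load step.
    return "loaded"

def orchestrate(tasks):
    """Run tasks in order from one control point, recording each outcome."""
    status = {}
    for name, task in tasks:
        try:
            task()
            status[name] = "succeeded"
        except Exception as exc:
            status[name] = f"failed: {exc}"
            break  # halt downstream steps so operators can intervene
    return status

run = orchestrate([
    ("extract_on_prem", extract_on_prem),
    ("load_cloud_warehouse", load_cloud_warehouse),
])
print(run)  # → {'extract_on_prem': 'succeeded', 'load_cloud_warehouse': 'succeeded'}
```

The `status` dictionary is the single pane of glass in miniature: one place to see how an end-to-end hybrid flow is progressing, regardless of where each step physically runs.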
We mentioned data from 451 Research earlier in this post. Carl Lehman of 451 Research has done extensive research on this topic, and he will join Informatica in an upcoming webinar to discuss the inherent data management challenges of Hybrid Cloud and best practices for addressing them.
If you are interested in learning more, please join us for this webinar with 451 Research and Informatica.
(1) 451 Research, “Voice of the Enterprise: Cloud Computing, Worldwide & Regional Survey Results and Narratives,” Q3 2015