By Myles Suer
Published on October 15, 2024
The cloud is the de facto standard for housing data these days. There was a time not so long ago when most CIOs would never consider putting their crown jewels — AKA customer data and associated analytics — into the cloud.
But, there has been a magic quadrant for cloud databases for years, and cloud data warehousing benefits are widely known and appreciated. In fact, organizations today are using cloud data warehouses to enhance data accessibility, increase scalability and flexibility, improve performance, support AI/ML, and more.
As more enterprises migrate legacy data and systems to the cloud, two key questions continue to cause CIOs sleepless nights: How can I mitigate the risks of cloud data migration? And, what must our organization overcome to succeed at cloud data warehousing?
It is natural to assume the biggest drivers are time and money. It’s costly and time-consuming to manage on-premises data warehouses — and modern cloud data architectures can deliver business agility and innovation, among many other cloud data warehousing benefits. However, CIOs declare that agility, innovation, security, adopting new capabilities, and time to value — never cost — are the top drivers for cloud data warehousing.
“Cloud data warehouses can provide a lot of upfront agility, especially with serverless databases,” says former CIO and author Isaac Sacolick. “There are tools to replicate and snapshot data, plus tools to scale and improve performance.” Yet the cloud, according to Sacolick, doesn’t come cheap. “A misconception is cloud data warehouses and lakes are cheap or don’t require IT ops support.”
Many see the cloud as the most secure option. Much to my surprise, CIO Paige Francis claims that in her organization, “the number one driver is security, given the wide range of secure data types. I am not interested in owning that risk internally.” Improved, reliable security has pushed cloud usage well beyond early estimates, because the cloud can actually be more secure than an on-premises data center.
Data aggregation is another key benefit the cloud delivers. “Businesses we work with have so many different types of data on different systems and infrastructure, that the cloud makes sense as a single aggregation point,” shares industry analyst Dan Kirsch.
Cloud data migration can offer many other benefits to organizations, including:
Access to cloud-based tools: Organizations can access cloud-based developer tools and APIs.
Regulatory compliance: Cloud migration can help organizations comply with regulations.
Backup and recovery: Cloud migration can provide backup and recovery options.
Simplified management: Cloud migration can simplify management.
Automatic updates: The cloud can automatically download and install updates to address security vulnerabilities, bugs, and performance enhancements.
Disaster recovery: Cloud migration can reduce the time lost to technical mistakes, server lags, and other issues.
Centralized data storage: Cloud storage offers stronger security than traditional data centers because business information and data are stored centrally.
Access to data regardless of physical machinery: Data stored in the cloud can be accessed regardless of what happens to physical machinery.
Those planning their migration to realize cloud data warehousing benefits would be wise to map out a strategy. What do you migrate, how, and when?
CIOs agree that organizations should avoid lift and shift migrations, as this approach often leads to fixing what is there, fixing it again, and finally getting it re-engineered — and maybe still getting it wrong again. Analyst Dion Hinchcliffe succinctly summarizes the problem: “Lift and shift is usually the worst way to move anything to the cloud. It means you’re going to do one migration to get into the cloud. And then a second migration to get there right.”
Sacolick agrees. “Sadly, lift and shift often leads to more than two migrations.” Luckily, with the right planning, this migration can be done in one fell swoop. It’s critical IT leaders “define the problem, find the value, and architect a solution that meets the objectives,” he argues.
Migrating data from legacy systems presents a unique set of challenges that can complicate the overall cloud migration process. These older systems often lack the compatibility needed to work seamlessly with modern cloud platforms. To address this, businesses have several options, including re-platforming or refactoring legacy systems to make them cloud-compatible. In some cases, sunsetting obsolete systems or data that is no longer relevant may be the most cost-effective solution.
To ensure that no critical data is lost, businesses should first conduct a thorough audit of their legacy systems, identifying which systems and data sets need to be migrated, modernized, or retired. Migrating from legacy systems can also offer an opportunity to optimize data management practices by implementing new data governance policies and leveraging cloud-native tools. This approach not only facilitates a smooth transition but also ensures that future business operations are not hindered by outdated infrastructure.
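One way to picture the outcome of such an audit is a simple inventory in which every legacy system receives one of the three dispositions named above. The following Python sketch is illustrative only: the system names, the two yes/no attributes, and the decision rule are assumptions made for the example, not a prescribed method.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    MIGRATE = "migrate"
    MODERNIZE = "modernize"
    RETIRE = "retire"


@dataclass
class LegacySystem:
    name: str
    still_used: bool        # assumption: the audit records whether the business still relies on it
    cloud_compatible: bool  # assumption: the audit records whether it can run on the target as-is


def classify(system: LegacySystem) -> Disposition:
    """Hypothetical rule: retire unused systems, migrate compatible ones,
    and modernize (re-platform or refactor) everything else."""
    if not system.still_used:
        return Disposition.RETIRE
    if system.cloud_compatible:
        return Disposition.MIGRATE
    return Disposition.MODERNIZE


# Made-up inventory entries, for illustration only.
inventory = [
    LegacySystem("orders_db", still_used=True, cloud_compatible=True),
    LegacySystem("mainframe_billing", still_used=True, cloud_compatible=False),
    LegacySystem("fax_archive", still_used=False, cloud_compatible=False),
]

plan = {system.name: classify(system).value for system in inventory}
print(plan)
```

A real audit would weigh many more attributes, such as data owners, regulatory constraints, and downstream consumers, but even a coarse inventory like this makes the scope of the migration explicit before any data moves.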
For this reason, CIOs recommend only migrating data to the cloud if the data has downstream business value. This means organizations need to develop a cloud data warehouse plan early in the process, while establishing holistic, enterprise-level governance and management from both infrastructure and cloud data warehouse components.
As Kirsch suggests, it’s no surprise organizations should modernize their data: “To lift and shift then modernize is expensive and means you’ll be moving useless data. Clean out your closet before you move into a new house!”
Organizations are better off redesigning for the cloud and then focusing on building in small manageable chunks focused on business needs. For this redesign to succeed, it is critical to remember data governance becomes even more essential to understanding where your data is at all times.
Like any data migration, cloud data migration requires careful planning, design, and execution. But be warned: CIOs say IT leaders should not assume cloud data warehouses work like an on-premises data center. It is important to understand the unique responsibilities of a cloud data warehouse and to include data governance. Indeed, data governance ensures data is labeled – giving migration leaders a clear view of what data is useful, usable, popular – and worth migrating at all. A common big issue, says Francis, is “bringing over the same garbage data or broken integrations. Where possible, IT teams should start as clean and fresh as they can.”
In other words, CIOs can’t just wing it and migrate the entire legacy data landscape to a cloud data warehouse. They need a plan. “Incorrect scoping of the migration poses a significant risk to the migration, especially around cost,” points out CIO Anthony McMahon.
Lift and shift perpetuates the same data problems, albeit in a new location. In many cases, businesses have tons of data, but the data can’t be trusted. If you don’t have a well-defined business problem, your analytics, AI/ML, and data science projects will be expensive failures. Where the old data warehouse model was driven by feeding data into it, the new cloud data warehouse model is about providing a view across lots of data with a specific purpose.
Cloud data warehouses offer the potential to solve larger and more complex business data problems that could not be addressed via on-premises software and hardware. Cloud data should remove the infrastructure discussions and return attention to business, data, and outcomes.
With this said, Hinchcliffe summarizes the biggest cloud data migration risks as:
Source/target vendor lock-in
Consistent performance
Lower control
Data quality/wrangling
Regulatory, compliance, and other data governance issues
Cost monitoring/limiting
Ability to move out/costs of data egress
Data security is a top concern during cloud migrations. Be sure to explore best practices for cloud data migration, such as encryption techniques, access control policies, and compliance with industry-specific regulations like GDPR or HIPAA. Also, find solutions with built-in security features, and emphasize the need for a shared responsibility model between the business and provider.
Implementing robust encryption methods, both for data in transit and at rest, is essential to prevent unauthorized access during the migration process. Furthermore, companies should adopt a "zero trust" approach to security by limiting access to the migration tools and cloud environments only to authorized personnel. This can significantly reduce the risk of a security breach during the transition.
In addition to encryption, businesses should also ensure compliance with industry-specific regulations such as GDPR, HIPAA, or CCPA, depending on the nature of the data they manage. A shared responsibility model means that while cloud providers manage the infrastructure, organizations are responsible for securing the data itself, making end-to-end security a priority throughout the migration journey.
Also, do not underestimate the role of a data catalog in data security during a cloud migration. To be effective, data catalog software should include five things: data intelligence, data collaboration, guided navigation, active data governance, and broad data connectivity.
When migrating data to the cloud, ensuring compliance with relevant laws and regulations is essential, especially for industries like healthcare, finance, or government that handle sensitive information. Regulations such as the GDPR, CCPA, and HIPAA impose strict data handling, storage, and security requirements. Failing to meet these regulations can result in substantial fines and reputational damage, making compliance a top priority in cloud migrations.
Many cloud providers offer built-in compliance tools and frameworks to help businesses meet regulatory standards, but the responsibility does not end there. Companies should regularly audit their cloud environments to ensure that data is being stored, accessed, and processed in compliance with applicable regulations. It's also important to work with legal and compliance experts to understand the regional and industry-specific requirements that apply to your data. Implementing a proactive compliance strategy during the migration will mitigate risks and provide peace of mind.
AI and machine learning (ML) initiatives don’t just benefit from cloud data migrations; they are revolutionizing them by automating many of the traditionally manual and time-consuming tasks. AI tools can automatically identify and catalog data, assess relationships between datasets, recommend the best migration paths, and assist with data governance. This significantly reduces human error and speeds up the migration process, particularly when handling large volumes of data. AI-driven insights can also highlight potential roadblocks before they become issues, making the migration more predictable.
Beyond the migration process itself, AI/ML can add long-term value to a business’s cloud data warehouse. Post-migration, these technologies can be used to analyze data more efficiently, driving faster and more accurate insights. This is especially useful for companies that rely heavily on data-driven decision-making, such as those in finance, healthcare, or e-commerce. With cloud-based AI tools, companies can continuously optimize their data environment, ensuring they make the most out of their new infrastructure.
A successful cloud data warehouse migration depends on a well-structured team with clear roles and responsibilities. Key team members should include data engineers, system architects, and IT security specialists, all of whom will play a pivotal role in managing different aspects of the migration. A project manager should also be appointed to oversee the entire process and ensure that milestones are met according to the migration timeline. This manager will coordinate with stakeholders across the business, keeping communication open and ensuring that the project remains on track.
In addition to the core team, businesses should involve data analysts and business users to ensure that the migrated data meets the needs of the organization. By integrating feedback early in the process, companies can avoid costly rework post-migration. The team should also have a dedicated point of contact for the cloud service provider to handle any technical issues that may arise during the migration. Properly allocating roles helps streamline the migration, mitigate risks, and maintain business continuity.
CIOs claim that data relationships are vital, both for building the business side of the data warehouse and for understanding the resulting infrastructure requirements. Clearly, moving data isn’t free. Nor are architecting new solutions or changing how users access data.
“It’s key to understand data and data relationships, but so is data governance and data management,” says CIO Martin Davis. “Unless you understand all of these things, you will end up with issues and problems that will cause rework.”
Data discovery and relationship mapping are among the top ways to achieve high value from any kind of cloud data warehouse. “You really need to understand the metadata and data definitions around different data sets,” Kirsch says. “Packaged analytics and data warehousing solutions are getting smarter, but just dumping into a cloud data warehouse will give you a swamp.”
And once data is in the cloud data warehouse, security, risk, and compliance are critical. This brings people, skills, and culture to the foreground. In many cases, a culture change is also essential for success, which is why a data culture is so important. A new data environment introduces new responsibilities. How will your team train and transition people to make your cloud data warehouse successful?
So how do you choose what data to migrate? Migration leaders would be wise to use a clear policy to filter out the data that should not be migrated. CIO Martin Davis stresses the importance of up-front policy planning: Decisions should be made “based on business need and data integrity requirements. If the business justifies it then it’s going in, if it is integral to the end results or to maintaining relationships within the data then you need it. But you must be tough!”
Hinchcliffe says it is important to define a policy with filters that remove:
Inaccurate data
Aged-out data
Violations of compliance in privacy/regulatory
Unwieldy data badly out of scope
In this process, organizations should be guided by:
Data regulations and compliance
Cloud costs
Latency
Current business processes
Cloud provider technology
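The filter policy Hinchcliffe describes can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Dataset` fields, the 5% error-rate threshold, the three-year age limit, and the sample catalog are all hypothetical values chosen for the example. Each organization would set its own thresholds based on the guiding factors listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    name: str
    error_rate: float           # hypothetical metric: share of records failing validation
    last_used: date             # hypothetical: last time anyone queried the data
    compliance_violation: bool  # flagged by a privacy/regulatory review
    in_scope: bool              # tied to a defined business need


# Assumed policy thresholds; real values depend on the organization.
MAX_ERROR_RATE = 0.05
MAX_AGE_DAYS = 3 * 365


def exclusion_reasons(ds: Dataset, today: date) -> list[str]:
    """Return every policy filter the dataset trips; an empty list means migrate."""
    reasons = []
    if ds.error_rate > MAX_ERROR_RATE:
        reasons.append("inaccurate")
    if (today - ds.last_used).days > MAX_AGE_DAYS:
        reasons.append("aged-out")
    if ds.compliance_violation:
        reasons.append("compliance violation")
    if not ds.in_scope:
        reasons.append("out of scope")
    return reasons


today = date(2024, 10, 15)
catalog = [
    Dataset("customers", 0.01, date(2024, 9, 1), False, True),
    Dataset("old_logs", 0.0, date(2019, 1, 1), False, False),
    Dataset("survey_raw", 0.20, date(2024, 1, 1), True, True),
]

decisions = {ds.name: exclusion_reasons(ds, today) for ds in catalog}
to_migrate = [name for name, reasons in decisions.items() if not reasons]
print(decisions)
print(to_migrate)
```

Recording the reasons for each exclusion, and not just a yes/no answer, gives data owners something concrete to challenge or confirm, which fits Davis’s advice to make decisions based on business need while staying tough.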
Leaders would also be wise to envision cloud data migration, not as a mere move, but as a chance to re-architect a better data environment. According to Capgemini Chief Data Architect Steve Jones, “If we accept that data-driven business is the future, then there’s nothing left behind that has value. But that doesn’t mean you are migrating an existing data warehouse to the cloud, but rather building a new data landscape enabling the business to drive from data.”
“This is the fundamental question on cloud data warehouses,” he adds. “Are you building a better data warehouse, or is a cloud data warehouse one of the technologies you are using to surface data to the business from a collaborative data mesh? I’d argue if it’s not the latter, you might as well save your money.”
Fiserv is a leading finance technology services company, issuing more than 1.6 billion credit cards in the US for various companies. Once it realized the benefits of streamlining and federating data usage, its team revolutionized its data management processes to enable secure, centralized data sharing and enhance operational efficiency across the organization.
“During our cloud migration, part of this is understanding what you have, whether it’s locked up in people’s heads or in spreadsheets flying around in an email,” recalls Jim Haas, Chief Data Architect at Fiserv. “With Alation, we've seen a great improvement in data literacy in the company, where data that was previously locked away in silos is now available for people to learn about.”
Fiserv’s cloud data warehouse now provides data intelligence across 250,000+ distinct fields from Fiserv core systems to thousands of registered users. And, they use generative AI to convert unstructured data into enriched, documented content within the Alation platform.
Read more here: Fiserv’s Journey to Intelligent Cloud Migration and Optimization with Alation.
Understanding the return on investment (ROI) for cloud data warehouse migration is a critical first step for any organization considering the move. By transitioning from on-premises systems to a cloud-based infrastructure, companies can drastically reduce overhead associated with hardware maintenance, cooling systems, and physical data center upkeep. Additionally, businesses benefit from the scalability of cloud solutions, which allows them to adapt more efficiently to growing data volumes without significant capital expenditures. When calculating the ROI, companies should factor in not only cost savings but also the performance improvements that come from faster, more efficient data processing.
Moreover, cloud data warehouses enable better decision-making through real-time analytics and access to more advanced tools for data analysis. For instance, companies can generate faster insights from machine learning models, which can enhance their ability to react quickly to market changes. The long-term financial benefits also include improved business agility and productivity, with teams able to work with data more seamlessly across departments. By combining operational cost savings and improved business outcomes, cloud migrations often deliver a compelling ROI.
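As a rough illustration of how these factors combine, the sketch below computes a simple multi-year ROI. The formula (net benefit divided by total cost), the parameter names, and every figure are assumptions made for the example. It deliberately includes ongoing cloud running costs, since the cloud, as Sacolick notes, doesn’t come cheap.

```python
def migration_roi(
    annual_cost_savings: float,    # assumed: avoided hardware, cooling, and data center upkeep
    annual_business_value: float,  # assumed: estimated value of faster insights and agility
    migration_cost: float,         # assumed: one-time cost of planning and moving data
    annual_cloud_cost: float,      # assumed: recurring cloud platform and ops spend
    years: int,
) -> float:
    """Simple ROI over a planning horizon: (total benefit - total cost) / total cost."""
    total_benefit = (annual_cost_savings + annual_business_value) * years
    total_cost = migration_cost + annual_cloud_cost * years
    if total_cost <= 0:
        raise ValueError("total cost must be positive to compute ROI")
    return (total_benefit - total_cost) / total_cost


# Made-up figures for a three-year horizon.
roi = migration_roi(
    annual_cost_savings=400_000,
    annual_business_value=250_000,
    migration_cost=500_000,
    annual_cloud_cost=300_000,
    years=3,
)
print(f"Three-year ROI: {roi:.1%}")
```

The hardest input to estimate is usually the business value term. Tying it to a well-defined business problem keeps the calculation honest.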
Once the migration is complete, businesses should focus on optimizing their cloud data warehouse to ensure they are maximizing its value. This includes organizing data more efficiently, optimizing queries for faster processing, and leveraging cloud-native analytics tools to generate insights. Regularly reviewing the performance of the cloud environment and identifying areas for improvement can help businesses continuously fine-tune their operations. This is particularly important for companies with dynamic data needs, as optimization can lead to significant cost savings over time.
Beyond performance tuning, businesses should also explore new features and services offered by their cloud provider, such as machine learning integrations, data lakes, or serverless computing. These tools can help extract even greater value from the data stored in the cloud. Regularly revisiting and refining your cloud architecture ensures that the investment made in the migration continues to pay off in the form of faster insights, reduced costs, and greater business agility.
Cloud data warehousing is the preferred approach for most enterprises. However, launching your own cloud data warehouse requires a clearly defined business impact. Cost is no longer — if it ever has been — an adequate justification, and lift and shift is a losing strategy.
Just as important, IT leaders must make conscious decisions about which data to move. In many cases, this means ensuring the newly moved data is trustworthy and adds business value. It also needs to be done in a way that ensures data going forward is governed, protected, and supports compliance requirements. Take this approach and you’ll do more than widen access to enterprise data: you’ll launch a smarter foundation to support data-driven decision making across your entire organization.
To continue your cloud data warehouse journey, download “5 Steps to Ensure the Success of Your Cloud Data Migration.” This white paper looks at the critical starting point for cloud data migration, how to avoid common migration pitfalls, and more.