2022 Big Data Predictions from the Cloud
The pandemic marked an inflection point for the growth of cloud platforms in 2020, as organizations scrambled to keep their applications running. That general pattern has continued through 2021, and the cloud looms even larger now with 2022 knocking on the door. As we kick off annual prediction season here at Datanami, here are cloud-related predictions from 14 individuals and two companies to get things started.
The hybrid cloud conversation is now driven by public cloud vendors rather than infrastructure/on-premises vendors. But in 2022, customers should be careful in how much control they cede to the cloud providers, says Jesse Stockall, the chief architect of cloud management at Snow Software.
“For the last few years, hybrid cloud was championed by technology vendors who sold on-premises technologies, but now public cloud vendors are offering cloud-like experiences on premises,” Stockall says. “This is not a good or bad thing, but as companies decide how they will approach their hybrid cloud strategy they need to consider how much control they want to maintain. By handing their private cloud to a public cloud vendor, companies may lose some control and ability to customize, but they will gain a unified, consistent private cloud experience. Companies need to decide what will be best for their business, but overall the conversation has shifted with public cloud vendors taking the wheel.”
Customers are adopting multi-cloud strategies to avoid lock-in. But in 2022, companies will find that actually achieving a multi-cloud setup is a lot tougher than it looks on paper, says Asim Razzaq, the CEO of Yotascale.
“Almost all enterprises are now embracing multi-cloud, but it remains a challenge for teams to intimately know AWS, GCP, and Azure,” Razzaq tells us. “The skills gap makes doing multi-cloud well unrealistic. To get the most benefit from a cloud, you need to go deeper and embed core services rather than building stuff generically. Businesses will need to evaluate whether the economics of investing in more than one cloud is critical to their long-term survival.”
Where the data is stored in the cloud mattered in 2021, and it will matter even more in 2022 as the legal and regulatory environment evolves, according to the prognosticators at Forrester.
“Every major hyperscaler develops a plan for geopolitical frictions and antitrust reform,” Forrester predicts. “In 2022, major cloud providers will continue to make major adjustments to satisfy their global customer needs and new regulations. At the same time, US companies will consider adding regional players to the mix; European companies will react to the first GAIA-X policies; and state-owned enterprises will dictate the future of the cloud marketplace in China.”
Databases are proliferating in the cloud at a rapid clip at the moment. But the sprawl is getting out of control, according to SingleStore CEO Raj Verma, who sees 2022 marking the start of Database 3.0, or The Great Database Consolidation.
“The first generation of databases were the Oracles and Informix and DB2,” Verma writes. “The second was this database sprawl where you saw the influx of DB2, Couchbase went public, and the other 300. The next generation of databases is the consolidation of these data platforms and types into a database that can handle modern data, and do it in a hybrid, multi-cloud manner with extremely low latency.”
The focus on hybrid cloud will wane in 2022, while decision-makers look to multi-cloud deployments, which will mature into something seamless and make a real impact on data estates, says MinIO CEO AB Periasamy.
“The hybrid cloud got lapped in the last 12 months,” Periasamy says. “Enterprises already use nearly three public and three private clouds apiece. Hybrid cloud is still a thing, but the requirement for multi-cloud takes the conversation to the next level. It is hardly seamless, however. Unfortunately, for many enterprises, the multi-cloud is a series of silos. The ‘common denominators’ are few and far between. Yes, Kubernetes is a stunningly powerful unifying force and greases the wheels for application mobility, but seamlessly migrating security, access management, policy management, and other mission critical IT functions across clouds is challenging at best.”
Going multi-cloud has become a priority for companies. In 2022, we’ll also see independent software vendors begin to offer this as a standard deployment option, says Charles Caldwell, the vice president of product management at Logi Analytics.
“The cloud has become the primary location for software vendors to store their data over the last few years. Many have even moved their applications to cloud platforms. The trend I am seeing now and what will continue in 2022 is software vendors investing in multi-cloud deployments. Organizations will not want to be locked into a single cloud vendor due to regulations like GDPR and organizations wanting localized data.”
Multi-cloud environments are becoming a reality, says Tomer Shiran, founder and CPO of Dremio. But that doesn’t mean that hybrid cloud is going away, he says, as hybrid and multi-cloud deployments will blend together in 2022.
“Companies are not migrating to just one cloud,” Shiran says. “They are adopting AWS for certain workloads, Microsoft Azure for other workloads, while needing to keep some workloads on premise. They’re also adopting technologies such as Databricks and Dremio, Databricks and Snowflake to get the best out of these multi-cloud, hybrid environments. Most companies have ended up with different data across clouds as their businesses have organically evolved — whether it be different business units adopting different clouds at different times, acquisitions, or new leadership. When they can use the same technology across both, it simplifies everything. In 2022, we’ll see more enterprises adopt multi and hybrid cloud strategies.”
Just as the on-prem data warehouse vendors of yore benefited from lock-in, today’s top cloud data warehouse vendors are also benefiting economically from lock-in. That dismal math starts to change in 2022, says Dipti Borkar, the co-founder and Chief Product Officer (CPO) of Ahana.
“Data warehouses like Snowflake are the new Teradata–they’re locking people into proprietary formats,” Borkar says. “As users start feeling the burden of higher costs as the size of their cloud data warehouse grows, they’ll start looking for cheaper AND open options that don’t lock them into a proprietary format or technology. In 2022 it’ll be all about the Open Data Lake Analytics stack, the stack that allows for open formats, open source, open cloud – and absolutely no lock-in.”
There are multiple ways to achieve lock-in with your customers (Datanami’s favorite is providing an awesome product with extraordinary customer service at a reasonable price). But in 2022, there will be one surefire way to eliminate vendor lock-in once and for all: the rise of the multi-cloud architecture, says Steve Touw, the CTO of Immuta.
“Organizations are well underway in their shift to the cloud for cloud computing for data processing and storage,” Touw writes. “In fact, 81% project they will become 100% cloud-based or primarily cloud-based in the next 12 to 24 months. Many organizations fear being forced to lock-in to a single cloud vendor platform. Multi-cloud architecture means companies can forgo the ‘lock-in’ and instead operate across all cloud platforms. This trend is not going away, but rather accelerating, and cloud vendors must get on board as it adds a layer of backup and recovery, scalability, and improved security.”
We think about compute, data, and applications as separate entities. But in the cloud in 2022, they will blend together into something greater than the sum of their parts, and that will require a new management approach, says Anand Rao, the global AI lead for PwC.
“By itself, AI can’t do much to solve important problems,” Rao says. “It needs data and scalable computing power. That’s why leading companies are increasingly administering data, AI, and cloud (DAC) as a unified whole. We’ll see an influx of companies in 2022 take a lifecycle approach to managing these three interconnected operations and when developing AI governance. Businesses are continually looking at strategy, fine-tuning execution and enhancing operations. When data, AI and cloud work together smoothly, end-to-end, the result is a supple and powerful system that realizes more value from data and solves more problems faster.”
After years of planting, cultivating, and harvesting Apache Hadoop (anybody remember Hadoop?), 2022 will usher in the beginning of a new crop cycle with a heavy seeding in the cloud, according to Kyligence CEO Luke Han’s almanac.
“In 2022, we can expect the continued decline of the Hadoop platform, even though like some tough weeds in your garden the roots and trailers of Hadoop will be hard to completely eradicate,” Han writes. “Expect CIOs and data teams to continue to de-emphasize Hadoop and to continue the process of removing it from their production data stack.
“Also look for IT departments to continue to make their on-prem implementations look and function like the public cloud,” Han continues. “In the near term, organizations may continue to use HDFS as a storage platform until a better private cloud storage solution can be devised. In reality, to protect existing investments, and to comply with local government regulation, organizations can’t simply move all existing workloads and applications built on top of on-premise Hadoop to the public cloud. The on-premise data stack will continue to exist. A hybrid solution across the public cloud and private cloud will be a more practical approach.”
Up to this point, the public cloud may be the Internet’s greatest hit. But we haven’t seen anything yet, according to Christian Reilly, Citrix’s vice president and head of its Technology Strategy Organization.
“We will continue to see the hyperscalers build out both subsea and terrestrial capacities at unprecedented scale, encouraging business to have their network traffic carried on hyperscaler backbones,” Reilly says. “This will provide the opportunity for ‘parallel internets’ to emerge with the promise of better performance, reach and improved security over the traditional internet construct.”
As companies begin to move and build more complex and mission-critical applications in the cloud, they’ll be faced with greater security demands and a need to face down growing cyberthreats, predicts a little-known computer startup named IBM.
“As cyberthreats grow, organizations are increasingly adopting a hybrid, multi-cloud approach to mitigate vendor concentration risk,” Big Blue tells us. “With data protection top of mind enterprises will also prioritize security designed with one single point of control so they can gain access to a holistic view of threats and mitigate complexity in the year ahead. While enterprises plan for 2022, they must also remember to prepare for the even longer-term future. As quantum computing grows stronger and poses potential risks, such as the ability to quickly break encryption algorithms and access sensitive data, enterprises must look beyond near-term threats to 10, 15, and 20 years in the future.”
Companies adopted remote work policies in response to the COVID-19 pandemic in 2020. In 2022, companies will examine their data and analytic workflows to ensure they’re optimally aligned for the new way that human workers work, says Chris Bergh, CEO of DataKitchen.
“With data and tools increasingly in the cloud, data organizations are finding ways to accommodate remote work. Web conferencing helps, but chance encounters by the water cooler are non-existent. The processes and workflows that depend on individuals with tribal knowledge huddling to solve problems are nearly impossible to execute through video conferences.
“As a result, enterprises will examine their end-to-end data operations and analytics creation workflows,” Bergh continues. “Are they building up or tearing down the communication and relationships that are critical to your mission? Instead of allowing technology to be a barrier to teamwork, leading data organizations in 2022 will further expand the automation of workflows to improve and facilitate communication and coordination between the groups. In other words, they will use DataOps principles to build a platform that creates a robust, transparent, efficient, repeatable analytics process hub that unifies all workflows.”
Data piled up in clouds at a record clip in 2021. In 2022, the time will have come for cloud-native data catalogs, predicts Kyle Kirwan, CEO and co-founder of Bigeye.
“Catalogs have been around for a while but until recently the space has been largely bimodal: open source options out of companies like Lyft and LinkedIn, or enterprise-focused catalogs like Alation and Collibra,” Kirwan says. “They can both involve a long implementation process, and the former comes with self-management overhead, and the latter starts at prices that often put them out of reach for small data teams. We heard questions about catalogs from many of the teams we spoke to this past year, and I predict that in 2022 we’ll see increasing adoption of the swath of new cloud-native catalogs that offer reduced management and setup costs.”
For some, the cloud functions as a pseudo warehouse where all of their data is stored. In 2022, that data density will result in some unique analytic achievements, says Jeff Whitaker, vice president of products at Excelero.
“In 2022, new performance infrastructure in the cloud for compute, networking, and storage is being built out, and we will see the convergence of analytics environments,” Whitaker writes. “As a result, many companies will migrate their core business applications and database environments to the cloud, uniting their data in a central resource. From BI, database analytics and into the AI/ML environments, it’s now entirely possible for near-real time analysis of data to be done in the cloud, using cloud engines together with the Web-scale data platforms.”