June 17, 2024

DataOps.Live: The Data Product Assembly Line for Snowflake

(metamorworks/Shutterstock)

When it comes to building great data products, all the key ingredients are available in the cloud–big data, massive compute, and sophisticated analytics and AI tools. What’s missing is an easy way to turn all those ingredients into finished products. That’s an area that a startup called DataOps.live hopes to fill in the Snowflake environment.

About seven years ago, British consultants Justin Mullen and Guy Adams were helping clients in Europe build data products on the Snowflake cloud. The pair devised ways that enabled some fairly large customers like Disney and Booking.com to utilize time-tested DevOps techniques in their Snowflake environment.

Mullen and Adams eventually realized they were sitting on a business opportunity, and a few years later, they launched their startup, DataOps.live, to essentially productize the one-off consulting work they had been doing with their clients.

“We started DataOps.live in 2020 specifically focused on, how do we become that data product assembly line for Snowflake,” Mullen, the CEO of DataOps.live, told Datanami in a recent interview. “How do we build, test, and deploy product in Snowflake in the same way that we’ve been doing in the software development world for the last 20 years.”

DataOps.live calls itself an “assembly line” for data products on Snowflake (Image courtesy DataOps.live)

DataOps.live takes the core primitives that Snowflake provides and layers atop them a template-based environment that allows for rapid development and deployment of data products. Instead of requiring users to manually string together all of the elements that go into building and deploying a data product (which could be anything from an analytics dashboard to an LLM-based chatbot), DataOps.live brings automation to the equation.

“Whenever you’re building a data product, you’ve got a lot of infrastructure code that you need to run, in terms of setting up a tenant, setting up databases, setting up roles, setting up permissions,” Mullen said. “DataOps.live takes a declarative, sort of Terraform-type approach, to how you build and deploy all of that. That’s not a capability that Snowflake provides.”
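To make the "declarative" idea concrete, here is a minimal sketch in Python. It is not DataOps.live's actual configuration format or API, which the article does not show. The `DesiredState`, `Grant`, and `plan` names and the object names are hypothetical: the desired databases, roles, and grants are described as data, and a small planner works out which Snowflake SQL statements are still needed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """A privilege on a database given to a role (hypothetical helper type)."""
    privilege: str
    database: str
    role: str


@dataclass
class DesiredState:
    """What should exist, stated as data rather than as a script of steps."""
    databases: list[str] = field(default_factory=list)
    roles: list[str] = field(default_factory=list)
    grants: list[Grant] = field(default_factory=list)


def plan(desired: DesiredState,
         existing_databases: set[str],
         existing_roles: set[str]) -> list[str]:
    """Return the SQL statements needed to reach the desired state."""
    statements = []
    for db in desired.databases:
        if db not in existing_databases:  # only create what is missing
            statements.append(f"CREATE DATABASE {db};")
    for role in desired.roles:
        if role not in existing_roles:
            statements.append(f"CREATE ROLE {role};")
    # Grants are emitted unconditionally in this sketch; a real tool would
    # compare them against the grants that already exist.
    for g in desired.grants:
        statements.append(
            f"GRANT {g.privilege} ON DATABASE {g.database} TO ROLE {g.role};"
        )
    return statements


# Illustrative names only.
desired = DesiredState(
    databases=["SALES_PROD"],
    roles=["SALES_READER"],
    grants=[Grant("USAGE", "SALES_PROD", "SALES_READER")],
)
statements = plan(desired, existing_databases={"SALES_PROD"}, existing_roles=set())
for statement in statements:
    print(statement)
```

Because the database already exists in this example, the plan contains only the role creation and the grant. That is the general appeal of a Terraform-style approach: the user states the end result and the tool derives the steps.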

In addition to setting up the infrastructure, DataOps.live provides hooks for ETL/ELT and data transformation tools to bring live data into its data product development and deployment environment. It has about 30 data “orchestrators” for tools such as dbt, Fivetran, Matillion, and others, Mullen said.

“We orchestrate all of those elements in the same way that an Airflow might orchestrate all of those elements,” he said. “We provide all of the code management, code repository, and the Gitflow actions and all of the elements around that. And then all of the packaging elements and the deployment elements. So it really is that manufacturing line in terms of how you build those blueprints and those solution templates, and then how you deploy those into customers.”

The typical data product relies on a bunch of disparate products and code, Mullen said. They may have some open-source Airflow pushing data into a Snowflake Cortex AI large language model (LLM). They may have user interfaces created in Snowpark’s Streamlit environment, and some homegrown Python orchestrating it all. DataOps.live brings all of those components together and packages them up for effective deployment in the CI/CD manner.

“Building a data product and assembling the data product requires people to assemble a lot of different components of a data product together. We want to run some ingestion, we want to run some Python, we want to do some modeling and everything else. And we create a data app that we then deploy into production,” Mullen said.
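The assembly steps Mullen lists can be pictured as a dependency graph that an orchestrator walks in order. The sketch below uses Python's standard-library `graphlib` to show the idea. The stage names and their dependencies are assumptions made for illustration, not DataOps.live's pipeline definition.

```python
from graphlib import TopologicalSorter

# Hypothetical stages; each one maps to the set of stages it depends on.
pipeline = {
    "ingest": set(),
    "python_job": {"ingest"},
    "model": {"ingest"},
    "test": {"python_job", "model"},
    "deploy": {"test"},
}


def run_order(stages):
    """Return an execution order in which every dependency runs first."""
    return list(TopologicalSorter(stages).static_order())


order = run_order(pipeline)
for stage in order:
    print(f"running {stage}")
```

Ingestion always runs first and deployment last; the two middle stages have no dependency on each other, so an orchestrator is free to run them in either order or in parallel.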

Data and code orchestrators at DataOps.live (Image courtesy DataOps.live)

“But we’ve also then got the partners that sit around the ecosystem, the Fivetrans and the Stitches. They’re core parts of the infrastructure,” he continued. “So we bring all of that together. We’re providing this sort of factory and this assembly line for building these data apps and these data products.”

DataOps.live customers can crank out more data products per developer thanks to the automation, Mullen said. For instance, before adopting DataOps.live, the pharmaceutical company Roche generated about one data product per quarter per team, he said. Following the deployment of DataOps.live, the company’s 300 data engineers, spread across 40 teams, are deploying about five data products per month. That’s about 2,400 data product deployments per year versus 120–a huge increase in output.
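The 2,400 figure follows if the five-per-month rate is read as applying to each of the 40 teams, which is an assumption the article does not state outright. A quick check of the arithmetic, taking the 120 baseline as quoted rather than deriving it:

```python
teams = 40                 # from the article
products_per_month = 5     # assumed to be per team
months_per_year = 12

after_per_year = teams * products_per_month * months_per_year
before_per_year = 120      # baseline as quoted in the article

increase = after_per_year / before_per_year
print(after_per_year, increase)
```

On those numbers, annual output works out to 2,400 deployments, a 20-fold increase over the quoted baseline.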

Another big DataOps.live customer is Snowflake itself. Nearly 1,000 solution engineers at the company use the environment to rapidly prototype and demonstrate data product solutions for customers and prospects.

“We as a Snowflake team are building things on top of Snowflake using Snowflake core features and functionalities like Cortex, like Snowpark, like our Data Marketplace,” said Robert Guglietti, a solution development manager at Snowflake. “We are bringing those together in a way that help customers understand what they can build, what’s the art of possible, how can they leverage Snowflake to do some of these things.”

As Guglietti and his team were getting ready for the recent Data Cloud Summit, they used DataOps.live to create demos of new data products that the Snowflake sales team in charge of the marketing vertical could show at the conference. The company had a new team that went from being new hires on day one to deploying an app on DataOps.live on day four, after four days of onboarding and training.

“For me, that’s phenomenal,” Guglietti said. “That’s unheard of in the past. And this team itself was able to just get going, look at documentation, and do that type of throughput, which is exactly what we were looking for with this type of model, with this type of templating framework on top of DataOps.”

In addition to being a DataOps.live customer, Snowflake is also an investor. The company took a stake in DataOps.live as part of the startup’s $17.5 million Series A round in May 2023.

As data products become more popular in the months and years to come, tools that can eliminate some of the complexity and accelerate the deployment of vetted and tested programs will certainly have a place. And for DataOps.live, that place is currently on the Snowflake cloud, where it’s carving itself a comfortable niche.

BigDATAwire