

(Michael Vi/Shutterstock)
Google Cloud made a slew of analytics-related announcements at its Next 2025 conference this week, including a range of enhancements to BigQuery, its flagship database for analytics. BigDATAwire caught up with Yasmeen Ahmad, managing director of data analytics, to get the scoop.
Asked to identify three main areas of innovation in BigQuery and related products, Ahmad pointed to the new agents that automate data science, engineering, and analytics work; the new data processing engines in BigQuery; and advances in Google Cloud’s data foundation and its data fabric.
While the work is done by separate teams, there is a lot of functionality that crosses over into other areas, Ahmad added. “We have a lot of talented engineering teams all working on amazing things in parallel,” she said. “We just had so many amazing innovations over the past 12 months we’ve been working on, culminating at Next.”
New AI Agents
As we previously reported, Google Cloud is devoting significant resources to helping its customers build and manage AI agents. That work includes building a new Agent Development Kit (ADK), creating a new Agent-to-Agent (A2A) communication protocol that complements Anthropic’s Model Context Protocol (MCP), and launching an Agent Garden, among (many) other innovations.
The company is also embedding pre-built AI agents into its own software services, including BigQuery. There are new specialized agents for data engineering and data science tasks; new agents for building data pipelines; and new agents for performing data prep tasks, such as data transformation, data enrichment, and anomaly detection.
“That’s a game changer for the human data people who are working on data,” Ahmad said. “We really believe those agents are going to transform the way they work with data.”
The agents are powered by Gemini, Google’s flagship foundation model. The agents are making suggestions to the human data analysts, data scientists, and data engineers based in part on information collected through a new BigQuery knowledge engine that Google Cloud has built, which is currently in preview.
“The knowledge engine uses metadata, semantics, usage logs, and information from the catalog to understand business context, to understand how data items are related,” Ahmad said. “How are people using the data? How are different engines being used over that data? And the knowledge that it builds from that is what it then feeds these data agents.”
Google Cloud also unveiled new conversational analytics agent functionality in Looker, its BI and analytics platform. This new agent will allow Looker users to interact with data using natural language. The new AI-powered natural language functions in Looker will also improve the accuracy of Looker’s modeling language, LookML, which functions as Google’s semantic layer, by up to two-thirds, the company says.
“As users reference business terms like ‘revenue’ or ‘segments,’ the agent knows exactly what you mean and can calculate metrics in real-time, ensuring it delivers accurate, relevant, and trusted results,” Ahmad wrote in a blog post.
New BigQuery Engines
In addition to the new knowledge engine, Google Cloud announced that it’s developing a new AI query engine for BigQuery. The BigQuery AI query engine will enable queries to foundation models like Gemini to occur simultaneously with traditional SQL queries to the data warehouse.
Querying structured and unstructured data at the same time will open a host of new analytics and data science use cases, Google Cloud says, including building richer features for models, performing nuanced segmentation, and uncovering hard-to-reach insights.
“A data scientist can now ask questions like: ‘Which products in our inventory are primarily manufactured in countries with emerging economies?’ The foundation model inherently knows which countries are considered emerging economies,” Ahmad wrote.
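As a rough illustration of what such a mixed query might look like, the sketch below builds a BigQuery SQL statement that pairs an AI.GENERATE_BOOL-style foundation-model predicate with an ordinary SQL filter. The table name, column names, and connection ID are hypothetical placeholders, and the exact function signature may differ from what ships.

```python
# Sketch: combining a foundation-model predicate with plain SQL in BigQuery.
# Table, columns, and connection ID below are hypothetical placeholders.

def build_mixed_query(table: str, connection_id: str) -> str:
    """Build a query mixing a Gemini-backed boolean predicate
    (AI.GENERATE_BOOL-style) with a conventional SQL filter."""
    return f"""
    SELECT product_name, country_of_origin
    FROM `{table}`
    WHERE in_stock = TRUE
      AND AI.GENERATE_BOOL(
            ('Is this country an emerging economy? ', country_of_origin),
            connection_id => '{connection_id}'
          ).result
    """

sql = build_mixed_query("my_project.inventory.products", "us.gemini_conn")

# Running it would require google-cloud-bigquery and credentials, e.g.:
# from google.cloud import bigquery
# rows = bigquery.Client().query(sql).result()
print(sql)
```

The point of the pattern is that the model answers the fuzzy part of the question ("emerging economy?") per row, while the warehouse handles the relational part.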
BigQuery notebook, a data science notebook alternative to Jupyter, has also been enhanced with AI. Google Cloud is introducing “intelligent SQL cells” that understand the context of customers’ data and offer the data scientist suggestions as they write code. It’s also leveraging AI to enable new exploratory analysis and visualization capabilities.
Google Cloud has also introduced a new serverless Apache Spark engine in BigQuery. Google Cloud has supported traditional Spark environments for years as part of Dataproc, which also includes Hadoop, Flink, Presto, and many other engines. Currently in preview and being tested by customers, the serverless Spark offering is getting better, Ahmad said.
“We announced this week we’ve made a three-fold performance improvement in our serverless Spark offering,” she said. “So we’re really looking forward to getting this now into general availability, because we believe that performance is going to be market-leading performance.”
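For readers curious what a serverless Spark submission involves, the sketch below assembles the request body for a serverless batch as a plain dict; the bucket names, script path, and runtime version are hypothetical, and the actual submission call (via the google-cloud-dataproc client) is only indicated in a comment.

```python
# Sketch: request body for a serverless Spark batch on Google Cloud.
# All names below (bucket, script, version) are hypothetical placeholders.

def make_batch_request(main_python_file: str, staging_bucket: str) -> dict:
    """Assemble a serverless Spark (PySpark) batch request body."""
    return {
        "pyspark_batch": {
            "main_python_file_uri": main_python_file,
        },
        "runtime_config": {
            # Pin a runtime version rather than relying on the default.
            "version": "2.2",
        },
        "environment_config": {
            "execution_config": {
                "staging_bucket": staging_bucket,
            }
        },
    }

batch = make_batch_request("gs://my-bucket/jobs/etl.py", "my-staging-bucket")

# Submitting it would use the google-cloud-dataproc client, e.g.:
# from google.cloud import dataproc_v1
# dataproc_v1.BatchControllerClient().create_batch(parent=..., batch=batch)
print(batch["pyspark_batch"]["main_python_file_uri"])
```

The appeal of the serverless model is visible in what is absent: no cluster sizing, node counts, or autoscaling policy appear in the request.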
And while it’s not a BigQuery announcement, Google Cloud also announced the general availability of Google Cloud for Apache Kafka. While the company also offers its Pub/Sub service for streaming data, some customers just want Kafka, Ahmad said.
“We have many users using Google’s first party services, but again, we want that choice and optionality depending on where our customer is also coming from,” she said. “As we also embrace all of those customers migrating to Google, we want to embrace what they’ve already built with existing investments and built pipelines and so on.”
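For teams that "just want Kafka," pointing an existing pipeline at a managed cluster is ordinary Kafka client configuration. The sketch below assembles producer settings as a plain dict in the shape used by librdkafka-based clients such as confluent-kafka; the bootstrap address is a hypothetical placeholder, and the SASL/OAUTHBEARER authentication that a Google-managed cluster would use is only indicated as settings.

```python
# Sketch: producer configuration for a managed Kafka cluster.
# The bootstrap address is a hypothetical placeholder; a managed cluster
# would use TLS plus SASL auth, shown here only as config entries.

def make_producer_config(bootstrap: str) -> dict:
    """Assemble standard Kafka producer settings as a config dict
    (the key/value shape used by librdkafka-based clients)."""
    return {
        "bootstrap.servers": bootstrap,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "OAUTHBEARER",
        # Durability settings a migrated pipeline would typically keep:
        "acks": "all",
        "enable.idempotence": True,
    }

config = make_producer_config("bootstrap.my-cluster.us-central1.example:9092")

# With confluent-kafka installed, usage would look like:
# from confluent_kafka import Producer
# producer = Producer(config)
# producer.produce("events", value=b"hello")
print(config["security.protocol"])
```

Because only the bootstrap address and auth settings change, existing producers, consumers, and connectors carry over as-is, which is the "embrace what they've already built" point Ahmad makes above.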
Data Foundation Enhancements
Like the first two areas, the third big area of improvement in the Google Cloud analytics environment, enhancements to the data foundation (the data fabric) and data governance, touches on other areas too.
For instance, just as the AI query engine in BigQuery lets users use Gemini against their data, they can also now manage unstructured data in BigQuery through the new support for multimodal tables (structured and unstructured data).
Google Cloud is rolling out a preview of a new feature called BigQuery governance that will provide a single, unified view for data stewards and professionals to handle discovery, classification, curation, quality, usage, and sharing. It includes automated data cataloging (GA) as well as a new experimental feature, automatic metadata generation.
“We have a bigger vision around governance,” Ahmad said in the interview. “A lot of the work around catalogs, metadata, semantics, etc. has been very human and manual driven historically. You’ve got to go set up a catalog. You’ve got to go set up metadata, business glossaries–all of those things.”
Google Cloud is making a big bet that AI can help to automate much of that data governance work in its data fabric. “We showcased demos of automated semantic generation at scale, cataloging over objects or over unstructured data,” Ahmad said. “So we actually see this thing as an intelligent, living, breathing thing that is dynamic and actually powering the whole AI ecosystem around agents and any kind of agentic capability.”
As if that wasn’t enough, Google Cloud is also moving forward with its data lakehouse architecture. The company announced a preview of BigQuery tables for Apache Iceberg, which will give customers the benefits of the open table format, such as enabling a range of query engines to access the same table without fear of conflicts or data contamination.
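As a sketch of what creating such a table might look like, the snippet below builds the DDL for a BigQuery table backed by Apache Iceberg as a string; the dataset, connection, and bucket names are hypothetical, and the option names follow the Iceberg-table DDL documented during the preview, so they may differ in the final release.

```python
# Sketch: DDL for a BigQuery table in Apache Iceberg format.
# Dataset, connection, and bucket names are hypothetical placeholders.

def build_iceberg_ddl(table: str, connection: str, storage_uri: str) -> str:
    """Build a CREATE TABLE statement for an Iceberg-format table whose
    data files live in a customer-owned Cloud Storage bucket."""
    return f"""
    CREATE TABLE `{table}` (
      event_id STRING,
      event_ts TIMESTAMP,
      payload JSON
    )
    WITH CONNECTION `{connection}`
    OPTIONS (
      file_format = 'PARQUET',
      table_format = 'ICEBERG',
      storage_uri = '{storage_uri}'
    )
    """

ddl = build_iceberg_ddl(
    "my_project.lakehouse.events",
    "us.lake_conn",
    "gs://my-lakehouse-bucket/events",
)
print(ddl)
```

The storage_uri is what makes the table open: the data sits in the customer's own bucket in Iceberg format, so other engines can read it while BigQuery manages the metadata.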
Since Google Cloud first brought Iceberg into its environment six months ago, adoption has tripled, Ahmad said. In fact, she added, Google Cloud’s support for Iceberg is market-leading in terms of performance and capabilities.
For instance, customers can rely on Google to govern their Iceberg tables, she said. They can stream data straight into Iceberg, or extract AI-powered insights from Iceberg data. Google can also back up customers’ Iceberg environments.
“In fact, many customers, when they’ve actually looked at our Iceberg managed service, they’re saying, ‘Hey, you’re not just supporting it. You’re accelerating Iceberg in a way that is just a dream come true,’” Ahmad said. “So actually Deutsche Telekom, on the panel I did yesterday with them, said Iceberg has been magical for us in Google Cloud, because we truly are embracing it, because we think it’s so important for customers for that choice and flexibility they’re looking for.”
Related Items:
Google Cloud Preps for Agentic AI Era with ‘Ironwood’ TPU, New Models and Software
Google Cloud Fleshes Out its Databases at Next 2025, with an Eye to AI
Google Revs Cloud Databases, Adds More GenAI to the Mix