

Amazon Web Services used a big data conference in the backyard of some of its largest government customers to showcase its AI and machine learning tools that are helping to funnel ever-larger volumes of data into its storage and computing infrastructure.
Making a pitch for better data management tools like metadata systems, AWS executives addressing a big data conference in Tysons Corner, Va., said the public cloud giant aims to go beyond democratizing big data to “demystify” AI and machine learning.
The combination of organized data and analytics will accelerate the building and deployment of machine learning models, many of which currently never make it to production. Those that are deployed often require up to 18 months to roll out, noted Ben Snively, a solution architect at AWS (NASDAQ: AMZN).
Open source tools for model development often advance a generation or two in the time it takes many enterprises to develop, train and launch a machine learning model, he added.
Snively asserted that the combination of big data and analytics along with AI and machine learning creates a “flywheel effect” in which organized, accessible data leads to faster insights, better products and—completing the cycle—more data.
(Hence, the cloud vendor forecasts as much as 180 zettabytes of widely varied and fast-moving data by 2025.)
As the company seeks to demystify machine automation technologies and move beyond the current technology “hype phase,” AWS executives note that deployment of machine learning models and, eventually, full-blown platforms remains hard. Among the reasons is “dirty” data that must be cleansed to foster access. The company estimates that 80 percent of data lakes currently lack metadata management systems that help determine data sources, formats and other attributes needed to wrangle big data.
That makes the heavy investments in data lakes “inefficient,” stressed Alan Halamachi, a senior manager for AWS solution architectures. “If data is not in a format where it can be widely consumed and accessible,” Halamachi said, machine learning developers will find themselves in “data jail.”
Once big data is wrangled and secured (“Hackers would like nothing more than to engineer a single breach with access to all of it,” Halamachi said), it can be combined with analytics on the inference side to accelerate training of machine learning models, Snively said.
Noting that most machine learning models built by enterprises never make it to production, the AWS engineers pitched several new tools, including the company's SageMaker machine and deep learning stack introduced in November. SageMaker, described as a tool for taking the “muck” out of developing machine learning models, is also designed to free data scientists from IT chores like standing up a server for model development, Snively said.
The cloud giant is seeing more experimentation among its customers as they seek to connect big data with machine learning development. “Voice [recognition] systems are here to stay,” Snively asserted, and developers are investigating “new ways of interacting with those systems.”
“It’s really about demystifying AI and machine learning” and getting beyond the “magic box” phase, he added.