

Hortonworks today announced the acquisition of Sequence IQ, a Hungarian developer of cloud deployment automation tools for Hadoop. Hortonworks, which is hosting its Hadoop Summit Europe in Brussels this week, also shipped a maintenance release of its Hadoop distribution that includes Apache Ambari 2.0 and officially adds Apache Spark to the mix.
One of the big challenges customers face when implementing a modern Hadoop cluster is just getting it up and running. It’s not such an issue in small clusters that have fewer than 10 nodes. But as the cluster extends beyond 100 or 1,000 nodes, it quickly becomes too expensive and tedious to do the work manually.
A number of companies and open source projects are chasing this problem, including Sequence IQ. From its headquarters in Budapest, the company has developed a pair of products that fit into this space.
The first is Cloudbreak. This product makes it much easier for customers to provision and deploy Hadoop clusters, whether they live in the cloud (Amazon AWS, Microsoft Azure, and Google Cloud are supported), in Docker containers, or on bare metal. The software uses the “blueprints” functionality in Ambari, the Hadoop operations and management console, to let users easily duplicate existing Hadoop setups. Support for OpenStack is on the horizon.
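To give a rough sense of what a blueprint captures, the sketch below builds a small cluster description as a Python dictionary and serializes it to JSON. The layout is loosely modeled on the Ambari blueprint format (a stack name and version plus named host groups with component lists); the helper `build_blueprint`, the blueprint name `small-hdp`, and the specific component choices are illustrative assumptions, not taken from Cloudbreak or from Hortonworks, and field names should be checked against the Ambari documentation before use.

```python
import json


def build_blueprint(name, stack_name, stack_version, host_groups):
    """Return a dict loosely shaped like an Ambari blueprint (hypothetical helper).

    host_groups maps a group name to (cardinality, [component names]).
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group_name,
                "cardinality": str(cardinality),
                "components": [{"name": component} for component in components],
            }
            for group_name, (cardinality, components) in host_groups.items()
        ],
    }


# Illustrative layout: one master host group and three worker hosts.
blueprint = build_blueprint(
    "small-hdp",
    "HDP",
    "2.2",
    {
        "master": (1, ["NAMENODE", "RESOURCEMANAGER"]),
        "worker": (3, ["DATANODE", "NODEMANAGER"]),
    },
)

# The JSON text is what would be saved and reused to stand up a similar cluster.
blueprint_json = json.dumps(blueprint, indent=2)
```

Because the description is plain data, the same document can in principle be reused to stand up a matching cluster elsewhere, which is the duplication idea described above.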
The second Sequence IQ product that caught Hortonworks’ eye is Periscope, which provides auto-scaling capabilities for Hadoop clusters. The software, which is also integrated with Ambari, analyzes various performance metrics for the cluster and automatically adds nodes as needed, based on policies set by the user.
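The policy-driven scaling described here can be pictured with a minimal sketch. This is not Periscope's code or API; the `ScalingPolicy` class, the `nodes_to_add` function, and the `pending_containers` metric name are all hypothetical, chosen only to show how a user-set threshold and node cap might turn a cluster metric into a scale-up decision.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """A user-defined rule (hypothetical): scale up when a metric exceeds a threshold."""
    metric: str       # name of the cluster metric to watch
    threshold: float  # scale up only when the metric is above this value
    step: int         # how many nodes to add per scaling action
    max_nodes: int    # hard upper bound on cluster size


def nodes_to_add(policy, metrics, current_nodes):
    """Return how many nodes to add right now under the given policy."""
    value = metrics.get(policy.metric)
    if value is None or value <= policy.threshold:
        return 0  # metric missing or within limits: do nothing
    room = max(policy.max_nodes - current_nodes, 0)
    return min(policy.step, room)  # never grow past the configured cap


policy = ScalingPolicy(metric="pending_containers", threshold=50, step=5, max_nodes=20)

# Load is above the threshold, but only 2 nodes of headroom remain under the cap.
added = nodes_to_add(policy, {"pending_containers": 80}, current_nodes=18)
```

With the values shown, `added` is 2: the policy asks for five more nodes, but the cap of 20 leaves room for only two.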
Cloudbreak and Periscope fit very nicely together and complement the power of Ambari to streamline the management of Hadoop clusters, says Hortonworks vice president of product management Tim Hall. “You can imagine folks spinning up a 1,000 node cluster. That’s a lot of work to do. Just doing one step on every machine is too many,” he says. “So whatever we can do to help streamline automation, that’s been the focus around Ambari.”
Ambari has matured over the past year and gained more powerful capabilities, including Blueprints extensibility mechanisms and the new Alerts framework in Ambari 2.0, which Periscope can use to trigger the addition of Hadoop nodes.
“We’re seeing evidence that the community is understanding how to take advantage of the Blueprint extensibility mechanisms and really leveraging it to the maximum capability, which is awesome,” Hall says. “The Sequence IQ team has been wonderful to collaborate and work with and we’re excited to bring them into the family.”
Hortonworks plans to contribute the Sequence IQ products back to the Hadoop community, either by donating the intellectual property (IP) to an existing Apache Software Foundation project–potentially Apache Ambari itself–or by incubating a new one, Hall says. In any event, the Sequence IQ capabilities will be added to a future release of Hortonworks’ Hadoop distribution, but only for customers who have purchased Enterprise Plus support subscriptions for HDP, he says.
Hortonworks is happy to add the Sequence IQ team to its staff, and is eager to use Sequence IQ’s existing business to help create a “beachhead” in Europe, Hall says. Going forward, Hortonworks will look at ways to leverage some of the work Sequence IQ is doing to integrate Hadoop with OpenStack.
“Originally the integration between OpenStack and Hadoop was through the Sahara plugin. That worked great when Hadoop was MapReduce and HDFS as one project,” Hall says. “What we’ve seen now over time is, as there’s more and more componentry in the Hadoop ecosystem that’s being deployed as a platform, it was causing a ripple effect in terms of what needed to be exposed and managed through that Sahara plugin.”
Instead of creating dozens of individual integration points between OpenStack and Hadoop for all the various Hadoop processing engines a customer might use–Hive, HBase, Cassandra, Spark, MapReduce, Tez, etc.–the Sequence IQ team is taking a different approach, using Ambari’s blueprint API as the starting point for defining how Hadoop will deploy within OpenStack.
“From our perspective, the approach that the Sequence IQ team is taking makes more sense, given the way the Hadoop ecosystem is headed,” Hall says. “Part of it has to do with what is the right binding point into the OpenStack infrastructure for deployment of Hadoop… If it was just one Hadoop project being bound to Sahara and integrating into OpenStack, that makes some sense. But when you’re talking about 20 other components now that make up the platform, exposing the details of all 20 of those things into the Sahara plugin didn’t make sense architecturally. The churn that comes along with what’s happening through inclusion or removal of various points just meant that there was going to be a lot of investment in the Sahara plugin to keep up.”
Hortonworks also announced the first maintenance release for HDP 2.2, which shipped last fall. The new release includes Ambari 2.0, which Hortonworks unveiled last week, in addition to various other enhancements, including support for Apache Spark version 1.2.1. It’s the first time Hortonworks has officially supported the popular in-memory computing framework in its Hadoop distribution.
The company also proposed a new Apache project for the Data Governance Initiative that it started earlier this year. Apache Atlas, as the DGI project would be known, aims to help rein in some of the data chaos that occurs on Hadoop. Specifically, Atlas will provide data classification, centralized auditing, search and lineage capabilities for Hadoop, as well as security and policy engines.
The new HDP release also includes Apache Ranger, a security management tool for Hadoop that came out of Hortonworks’ previous acquisition of XA Secure. Hortonworks has also streamlined the deployment of the Kerberos authentication subsystem in HDP; it can now be up and running in just a few clicks, the company says.