

Hortonworks today announced the acquisition of Sequence IQ, a Hungarian developer of cloud deployment automation tools for Hadoop. Hortonworks, which is hosting its Hadoop Summit Europe in Brussels this week, also shipped a maintenance release of its Hadoop distribution that includes Apache Ambari 2.0 and officially adds Apache Spark to the mix.
One of the big challenges customers face when implementing a modern Hadoop cluster is just getting it up and running. It’s not such an issue in small clusters that have fewer than 10 nodes. But as the cluster extends beyond 100 or 1,000 nodes, it quickly becomes too expensive and tedious to do the work manually.
A number of companies and open source projects are chasing this problem, including Sequence IQ. From its headquarters in Budapest, the company has developed a pair of products that fit into this space.
The first is Cloudbreak. This product makes it much easier for customers to provision and deploy Hadoop clusters, whether they live in the cloud (Amazon AWS, Microsoft Azure, and Google Cloud are supported), in Docker containers, or on bare metal. The software uses the “blueprints” functionality in Ambari, the Hadoop operations and management console, to let customers easily duplicate their Hadoop setups; support for OpenStack is on the horizon.
The second Sequence IQ product that caught Hortonworks’ eye is Periscope, which provides auto-scaling capabilities for Hadoop clusters. The software, which also is integrated with Ambari, analyzes various performance metrics for the cluster, and automatically adds nodes as needed, based on policies set by the user.
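To make the policy idea concrete, here is a minimal Python sketch of a threshold-based scale-up decision. It is not Periscope’s code or API: the `ScalingPolicy` class, the `nodes_to_add` function, the `pending_containers` metric name, and all of the numbers are hypothetical, chosen only to illustrate how a user-defined policy might turn a cluster metric into a number of nodes to add.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical user-defined policy; not Periscope's real schema."""
    metric: str       # name of the cluster metric to watch
    threshold: float  # scale up when the metric exceeds this value
    step: int         # nodes to add per scaling action
    max_nodes: int    # hard cap on total cluster size


def nodes_to_add(policy: ScalingPolicy, metrics: dict, current_nodes: int) -> int:
    """Return how many nodes to add under the policy (0 if none)."""
    value = metrics.get(policy.metric)
    # No reading, or a reading at or below the threshold: do nothing.
    if value is None or value <= policy.threshold:
        return 0
    # Never grow past the cap, even if the step would allow it.
    headroom = policy.max_nodes - current_nodes
    return max(0, min(policy.step, headroom))


policy = ScalingPolicy(metric="pending_containers", threshold=50, step=5, max_nodes=100)
print(nodes_to_add(policy, {"pending_containers": 120}, current_nodes=98))
```

In this sketch the cap takes precedence over the step size, so a cluster that is two nodes short of its limit grows by two rather than five. A real auto-scaler would also need cool-down periods and scale-down rules, which are left out here.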
Cloudbreak and Periscope fit very nicely together and complement the power of Ambari to streamline the management of Hadoop clusters, says Hortonworks vice president of product management Tim Hall. “You can imagine folks spinning up a 1,000 node cluster. That’s a lot of work to do. Just doing one step on every machine is too many,” he says. “So whatever we can do to help streamline automation, that’s been the focus around Ambari.”
Ambari has matured over the past year and gained more powerful capabilities, including Blueprints extensibility mechanisms and the new Alerts framework in Ambari 2.0, which Periscope can use to trigger the addition of Hadoop nodes.
“We’re seeing evidence that the community is understanding how to take advantage of the Blueprint extensibility mechanisms and really leveraging it to the maximum capability, which is awesome,” Hall says. “The Sequence IQ team has been wonderful to collaborate and work with and we’re excited to bring them into the family.”
Hortonworks plans to contribute the Sequence IQ products back to the Hadoop community, either by donating the intellectual property (IP) to an existing Apache Software Foundation project–potentially Apache Ambari itself–or by incubating a new one, Hall says. In any event, the Sequence IQ capabilities will be added to a future release of Hortonworks’ Hadoop distribution, but only for customers who have purchased Enterprise Plus support subscriptions for HDP, he says.
Hortonworks is happy to add the Sequence IQ team to its existing staff, and is eager to use Sequence IQ’s existing business to help it create a “beachhead” in Europe, Hall says. Going forward, Hortonworks will look at ways to leverage some of the work Sequence IQ is doing to integrate Hadoop with OpenStack.
“Originally the integration between OpenStack and Hadoop was through the Sahara plugin. That worked great when Hadoop was MapReduce and HDFS as one project,” Hall says. “What we’ve seen now over time is, as there’s more and more componentry in the Hadoop ecosystem that’s being deployed as a platform, it was causing a ripple effect in terms of what needed to be exposed and managed through that Sahara plugin.”
Instead of creating dozens of individual integration points between OpenStack and Hadoop for all the various Hadoop processing engines a customer might use–Hive, HBase, Cassandra, Spark, MapReduce, Tez, etc.–the Sequence IQ team is taking a different approach, and using Ambari’s blueprint API as the starting point for defining how Hadoop will deploy within OpenStack.
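For readers unfamiliar with blueprints, the appeal is that one declarative document names every component in the cluster, so an integration only has to understand that one document. The Python sketch below builds a small blueprint-style structure and prints it as JSON. The top-level `Blueprints` and `host_groups` layout follows the Ambari blueprint format as commonly documented; the `make_blueprint` helper, the blueprint name, the host group names, and the short component lists are illustrative assumptions, and the result is too sparse to serve as a deployable cluster definition.

```python
import json


def make_blueprint(name: str, worker_count: int) -> dict:
    """Build a minimal Ambari-style blueprint as a plain dict (illustrative only)."""
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": "2.2",
        },
        "host_groups": [
            {
                # One master host running the HDFS and YARN head services.
                "name": "master",
                "cardinality": "1",
                "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
            },
            {
                # Worker hosts; the count is the only thing that varies here.
                "name": "worker",
                "cardinality": str(worker_count),
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
            },
        ],
    }


blueprint = make_blueprint("example-cluster", worker_count=3)
print(json.dumps(blueprint, indent=2))
```

Adding another engine to the cluster in this sketch means adding one more entry to a `components` list, not building a new integration point.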
“From our perspective, the approach that the Sequence IQ team is taking makes more sense, given the way the Hadoop ecosystem is headed,” Hall says. “Part of it has to do with what is the right binding point into the OpenStack infrastructure for deployment of Hadoop… If it was just one Hadoop project being bound to Sahara and integrating into OpenStack, that makes some sense. But when you’re talking about 20 other components now that makes up the platform, exposing the details of all 20 of those things into the Sahara plugin didn’t make sense architecturally. The churn that comes along with what’s happening through inclusion or removal of various points, just meant that there was going to be a lot of investment in the Sahara plugin to keep up.”
Hortonworks also announced the first maintenance release for HDP 2.2, which shipped last fall. The new release includes Ambari 2.0, which Hortonworks unveiled last week, in addition to various other enhancements, including support for Apache Spark version 1.2.1. It’s the first time Hortonworks has officially supported the popular in-memory computing framework in its Hadoop distribution.
The company also proposed a new Apache project for the Data Governance Initiative that it started earlier this year. Apache Atlas, as the DGI project would be known, aims to help rein in some of the data chaos that occurs on Hadoop. Specifically, Atlas will provide data classification, centralized auditing, search and lineage capabilities for Hadoop, as well as security and policy engines.
The new HDP release also includes Apache Ranger, a security management tool for Hadoop that came out of Hortonworks’ previous acquisition of XA Secure. Hortonworks has also streamlined the deployment of the Kerberos authentication subsystem in HDP; it can now be up and running in just a few clicks, the company says.