
Hortonworks Hocks Hadoop Upgrade
Apache Hadoop contributor Hortonworks has announced version 2 of the Hortonworks Data Platform (HDP). HDP v2 will use the most recent version of Hadoop (0.23). According to the Apache Software Foundation, curators and cultivators of Hadoop, the newest release is enterprise ready.
The Hortonworks Data Platform, which is powered by Hadoop, is the company’s scalable open source platform for handling big enterprise and research data. As with the other Hadoop distros floating around out there, the key to the platform’s success is its ability to integrate data from just about any source imaginable and provide a simpler way to make use of it.
The company describes how it differentiates itself from others offering Hadoop simplification for the enterprise, noting:
“Unlike other Hadoop solutions that lock away management features within proprietary extensions, Hortonworks Data Platform includes Ambari, an open source installation and management system out of the box. Hortonworks Data Platform also includes HCatalog, a metadata management service for simplifying data sharing between Hadoop and other enterprise information systems, along with a complete set of open APIs, including WebHDFS and those for Ambari and HCatalog, to make it easier for ISVs to integrate and extend Apache Hadoop.”
On Jan. 6, the Apache Software Foundation made news by announcing Hadoop v1.0 after six years of development, a release that brought a number of notable new features and enhancements. With the release of Hadoop version 0.23, improvements have been made to both HDFS and MapReduce, including:
- NextGen MapReduce (also known as YARN)
- HDFS Federation, which allows NameNodes to act independently, without coordinating with each other
- Splitting the MapReduce JobTracker into two components (resource management and life-cycle management)
- The ResourceManager now manages the global assignment of compute resources to applications, while a per-application ApplicationMaster manages scheduling and coordination
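For readers who want a feel for that division of labor, the following is a toy Python sketch, not the actual YARN API. The class names mirror the component names above, but the slot counts, method names and the "wordcount" job are all hypothetical, chosen only to show a global arbiter handing out capacity while a per-application coordinator drives its own tasks.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the global resource arbiter (not the real YARN API)."""
    total_slots: int
    allocated: dict = field(default_factory=dict)  # app name -> slots granted

    def free_slots(self) -> int:
        return self.total_slots - sum(self.allocated.values())

    def request(self, app: str, slots: int) -> int:
        """Grant up to `slots` from whatever cluster capacity is still free."""
        granted = min(slots, self.free_slots())
        self.allocated[app] = self.allocated.get(app, 0) + granted
        return granted

    def release(self, app: str) -> None:
        """Return all of an application's slots to the shared pool."""
        self.allocated.pop(app, None)


@dataclass
class ApplicationMaster:
    """Toy per-application coordinator: owns one job's task life cycle."""
    name: str
    tasks: list

    def run(self, rm: ResourceManager) -> list:
        """Run all tasks in waves sized by what the ResourceManager grants."""
        pending = list(self.tasks)
        finished = []
        while pending:
            granted = rm.request(self.name, len(pending))
            if granted == 0:
                raise RuntimeError("no capacity available")
            wave, pending = pending[:granted], pending[granted:]
            finished.extend(f"{task}:done" for task in wave)
            rm.release(self.name)  # hand slots back after each wave
        return finished


# Hypothetical cluster with 4 slots and a 6-task job.
rm = ResourceManager(total_slots=4)
am = ApplicationMaster(name="wordcount", tasks=[f"map-{i}" for i in range(6)])
results = am.run(rm)
print(results)
print(rm.free_slots())
```

The point of the sketch is only the separation of concerns: the resource manager knows nothing about tasks, and the application master knows nothing about other jobs on the cluster.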
According to Eric Baldeschwieler, CEO of Hortonworks, “With more than three years of development and much anticipation, Apache Hadoop 0.23 delivers important advancements in scalability, performance, high availability and data integrity.”
He continued, “Apache Hadoop 0.23 is currently being tested across hundreds of applications in the world’s largest Hadoop deployment. We are excited to make the technology advancements in Apache Hadoop 0.23 available through an easily consumable version via the Hortonworks Data Platform v2.”
HDP was created to be an extremely scalable and fully open-source platform for the storage, processing and analysis of large-scale data. Along with HDFS and MapReduce, Hortonworks Data Platform includes Pig, Hive, HBase and ZooKeeper.
Hortonworks was created by Yahoo! and Benchmark Capital to facilitate Apache Hadoop development. The company provides technical support, training and certifications for vendors, enterprises, service providers and systems integrators.