Hadoop will be the data centre's new operating system, WANdisco claims
WANdisco's share price rose 3% after the company announced Non-stop NameNode, a product designed to improve the resilience of Hadoop systems.
WANdisco was founded in 2005 by Dr. Yeturu Aahlad, a former Sun Microsystems employee, and UK IT entrepreneur David Richards. The name is short for WAN distributed computing. The first commercial opportunity they pursued was distributed software development.
The problem they set out to solve was that active-active replication did not work over wide-area networks (WANs). Aahlad found a solution by applying the Paxos algorithm, a consensus technique devised by Leslie Lamport in the 1980s. This algorithm forms the core of WANdisco's technology, known as the 'Distributed Co-ordination Engine', or DConE.
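To give a sense of the idea behind Paxos, here is a minimal single-decree sketch in Python. It is a textbook illustration only, not WANdisco's DConE, whose internals the article does not describe. The names `Acceptor` and `propose` are hypothetical, and everything runs in one process with no network, failures, or retries.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                     # highest proposal number promised so far
    accepted_n: int = -1                   # number of the proposal accepted, if any
    accepted_value: Optional[str] = None   # value of the proposal accepted, if any

    def prepare(self, n: int) -> Tuple[bool, int, Optional[str]]:
        # Phase 1: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, self.accepted_n, self.accepted_value

    def accept(self, n: int, value: str) -> bool:
        # Phase 2: accept unless a higher-numbered promise has been made.
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Try to get a value chosen; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from the acceptors.
    replies = [a.prepare(n) for a in acceptors]
    granted = [(acc_n, acc_v) for ok, acc_n, acc_v in replies if ok]
    if len(granted) < majority:
        return None

    # If any acceptor already accepted a value, adopt the highest-numbered one.
    _, prior_value = max(granted, key=lambda reply: reply[0])
    if prior_value is not None:
        value = prior_value

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None
```

The key property is that once a majority has accepted a value, any later proposer is forced to adopt that same value, so every node ends up agreeing on it. A production system would run one such agreement per operation and handle message loss and competing proposers.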
In 2009, WANdisco decided to base its new headquarters in Sheffield, with David Richards citing lower costs and a significant talent pool as reasons. This decision has proven to be a strategic move, as the company has seen steady growth since then.
In November 2012, WANdisco acquired AltoStor, a Silicon Valley big data storage start-up. AltoStor's two founders were among Hadoop's original authors during their time at Yahoo!, and they bring valuable Hadoop know-how to WANdisco.
Hadoop, underpinned by its distributed file system HDFS, has far-reaching applications across the enterprise, according to David Richards.
WANdisco's Non-stop NameNode improves the resilience of Hadoop systems by eliminating the single point of failure traditionally associated with the Hadoop NameNode.
Background: The NameNode as a Single Point of Failure
In a typical Hadoop cluster, the NameNode is the critical master node responsible for managing the metadata (file system namespace, file-to-block mapping, and block locations). Traditionally, there is one active NameNode, and if it fails or becomes unavailable, the entire HDFS cluster can become inaccessible or enter a failover state, causing downtime.
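The three kinds of metadata listed above can be pictured with a toy model. This is a sketch for illustration, assuming a much simpler structure than the real NameNode uses. `ToyNameNode` and its methods are hypothetical names, not part of Hadoop's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyNameNode:
    # Namespace plus file-to-block mapping: path -> ordered list of block IDs.
    files: Dict[str, List[str]] = field(default_factory=dict)
    # Block locations: block ID -> names of the DataNodes holding a copy.
    block_locations: Dict[str, Set[str]] = field(default_factory=dict)

    def create_file(self, path: str, block_ids: List[str]) -> None:
        self.files[path] = list(block_ids)

    def report_block(self, block_id: str, datanode: str) -> None:
        # Record that a DataNode holds a replica of the given block.
        self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, path: str) -> List[Set[str]]:
        # For each block of the file, in order, return the DataNodes holding it.
        return [self.block_locations.get(b, set()) for b in self.files[path]]


nn = ToyNameNode()
nn.create_file("/logs/day1", ["blk_1", "blk_2"])
nn.report_block("blk_1", "dn-a")
nn.report_block("blk_1", "dn-b")
nn.report_block("blk_2", "dn-c")
```

A client has to call something like `locate` before it can read any data. The blocks themselves live on the DataNodes, but without the NameNode's mapping nobody knows which blocks make up which file, which is why losing the only NameNode makes the whole file system unreachable.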
How WANdisco Non-stop NameNode Improves Resilience
- Active-Active Architecture: WANdisco Non-stop NameNode implements a truly active-active NameNode architecture. Instead of having one active and one standby (which can only take over after failover), it enables multiple NameNodes to be active simultaneously across different geographic locations or within the same data center.
- Synchronous Replication of Metadata: The product uses WANdisco’s patented Distributed Co-ordination Engine (DConE) technology to synchronously replicate all metadata updates between the active NameNodes in real time. This ensures that every active NameNode has an identical and consistent view of the file system metadata.
- Zero Downtime Failover: Because all metadata changes are mirrored synchronously, if one NameNode fails or is taken offline for maintenance, the remaining NameNodes continue to serve client requests without interruption. There is no downtime or failover lag.
- Geographically Distributed Clusters: WANdisco enables Hadoop clusters to be active-active not just within one data center but also across multiple geographically separated sites. This extends resilience to disaster recovery scenarios, allowing continuous Hadoop operation even if an entire site goes down.
- Simplified Operations: With non-stop operation, cluster administrators do not need to manage complex failover and recovery procedures. The system ensures continuous availability of the NameNode metadata service seamlessly.
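The active-active behaviour described in the list above can be sketched as a replicated log: every NameNode applies the same agreed sequence of metadata operations, so any of them can serve clients. The Python below is a simplified illustration under stated assumptions, not WANdisco's implementation. `NameNodeReplica` and `Coordinator` are hypothetical names, the coordinator stands in for the agreement layer, and the majority rule is an assumption borrowed from Paxos-style systems rather than something the article states.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]  # (kind, path), where kind is "create" or "delete"


class NameNodeReplica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.paths: Set[str] = set()   # toy namespace
        self.applied = 0               # index of the next log entry to apply

    def apply(self, log: List[Op]) -> None:
        # Apply every log entry not yet seen, strictly in log order.
        while self.applied < len(log):
            kind, path = log[self.applied]
            if kind == "create":
                self.paths.add(path)
            elif kind == "delete":
                self.paths.discard(path)
            self.applied += 1


class Coordinator:
    """Stand-in for the agreement layer: fixes one global order of operations."""

    def __init__(self, replicas: List[NameNodeReplica]) -> None:
        self.replicas = replicas
        self.log: List[Op] = []

    def submit(self, via: NameNodeReplica, op: Op) -> bool:
        # Any live replica may accept a client write (active-active).
        if not via.alive:
            return False
        live = [r for r in self.replicas if r.alive]
        # Assumption: a majority of replicas must be reachable to agree.
        if len(live) <= len(self.replicas) // 2:
            return False
        self.log.append(op)
        for replica in live:
            replica.apply(self.log)
        return True

    def recover(self, replica: NameNodeReplica) -> None:
        # A returning replica catches up by replaying the entries it missed.
        replica.alive = True
        replica.apply(self.log)
```

In this model there is no failover step at all. When one replica goes down, clients simply keep writing through another, and the failed replica replays the missed entries when it returns.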
WANdisco's Non-stop NameNode, combined with AltoStor's Hadoop know-how, addresses a critical weak point in the big data framework. The product has already won its first customer, an unnamed tier 1 telecommunications provider in the UK. WANdisco's customers also include HP, Intel, and Lockheed Martin.
In June 2012, WANdisco floated on the Alternative Investment Market (AIM) of the London Stock Exchange, raising £15 million in the IPO. However, the company reported a £3 million loss for the year, including £2.5 million in costs associated with the flotation. Despite this, the market remains confident in WANdisco's potential to make a dent in the big data market.
WANdisco's technology enables businesses to make development more effective and resilient by implementing code-sharing applications on DConE. This allows multiple instances of the same application to run on separate infrastructure with real-time data replication. With Non-stop NameNode, WANdisco is set to revolutionise big data resilience, significantly improving fault tolerance, uptime, and operational simplicity in big data environments.
Non-stop NameNode rests on DConE, which replicates NameNode metadata synchronously and so removes the single point of failure that has traditionally limited Hadoop's availability.