hadoop2 tag description

Hadoop 2 is the second generation of Apache Hadoop, the popular open-source distributed computing platform.

Apache Hadoop 2.x includes significant improvements over the previous stable release line, Hadoop 1.x. Major enhancements have been made to both of Hadoop's building blocks, HDFS and MapReduce:

  1. HDFS Federation: To scale the name service horizontally, federation uses multiple independent NameNodes, each managing its own namespace.

  2. MapReduce NextGen, aka YARN, aka MRv2: The new architecture splits the two major functions of the JobTracker, resource management and job life-cycle management, into separate components. The new ResourceManager manages the global assignment of compute resources to applications, and the per-application ApplicationMaster manages the application's scheduling and coordination. An application is either a single job in the sense of classic MapReduce jobs or a DAG of such jobs. The ResourceManager and the per-machine NodeManager daemon, which manages the user processes on that machine, form the computation fabric.
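The division of responsibilities described in item 2 can be illustrated with a toy model. This is only a sketch in plain Python, not the Hadoop API: the class names mirror the YARN components, but every method, field, and the first-fit placement policy are assumptions made for illustration. In real YARN the ApplicationMaster itself runs in a container and the protocol between the daemons is more involved.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine daemon: tracks free memory and the containers it hosts."""
    host: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id, mem_mb):
        # Refuse the container if this machine lacks the capacity.
        if mem_mb > self.free_mb:
            return False
        self.free_mb -= mem_mb
        self.containers.append((app_id, mem_mb))
        return True


class ResourceManager:
    """Global arbiter: decides only where a container goes, not what runs in it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, app_id, mem_mb):
        # Hypothetical first-fit policy; real YARN schedulers are pluggable.
        for node in self.nodes:
            if node.launch(app_id, mem_mb):
                return node.host
        return None


class ApplicationMaster:
    """Per-application coordinator: owns the task list and requests containers."""

    def __init__(self, app_id, task_mem_mb):
        self.app_id = app_id
        self.pending = list(task_mem_mb)
        self.placements = []

    def run(self, rm):
        still_pending = []
        for mem_mb in self.pending:
            host = rm.allocate(self.app_id, mem_mb)
            if host is None:
                still_pending.append(mem_mb)  # retry once capacity frees up
            else:
                self.placements.append((mem_mb, host))
        self.pending = still_pending
        return self.placements


nodes = [NodeManager("node1", 2048), NodeManager("node2", 1024)]
rm = ResourceManager(nodes)
am = ApplicationMaster("app_0001", [1024, 1024, 1024, 512])
am.run(rm)
```

Here the ResourceManager never sees the application's task list and the ApplicationMaster never tracks per-machine capacity, which is the separation that the single JobTracker in Hadoop 1.x lacked.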

For more information on Hadoop 2, see the official Hadoop 2 homepage.