with Glen Mules, Joe Byers
Data scientists, engineers, or anyone who is interested in learning about MapReduce and YARN.
Time to complete:
Apache Hadoop is one of the most popular tools for big data processing. It has been successfully deployed in production by many companies for several years. Though Hadoop is considered a reliable, scalable, and cost-effective solution, it is constantly being improved by a large community of developers. As a result, the 2.0 version offers several revolutionary features, including Yet Another Resource Negotiator (YARN), HDFS Federation, and high availability, which make the Hadoop cluster much more efficient, powerful, and reliable.
The most serious limitations of classical MapReduce relate to scalability, resource utilization, and support for workloads other than MapReduce. In the MapReduce framework, job execution is controlled by two types of processes: a single master process called the JobTracker and a number of subordinate processes called TaskTrackers. Because the one JobTracker is responsible for both cluster resource management and the scheduling and monitoring of every job, it becomes a bottleneck as clusters grow.
Apache Hadoop 2.0 includes YARN, which separates the resource management and processing components. The YARN-based architecture is not constrained to MapReduce. In YARN, MapReduce is reduced to the role of a distributed application (though still a very popular and useful one) and is now called MRv2. MRv2 is a re-implementation of the classic MapReduce engine (now called MRv1) that runs on top of YARN.
The course reviews MRv1 and provides insight into the design and implementation of YARN: a ResourceManager instead of a cluster manager, an ApplicationMaster instead of a dedicated and short-lived JobTracker, a NodeManager instead of a TaskTracker, and a distributed application instead of a MapReduce job.
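To make the division of roles above a little more concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the class and method names (`allocate`, `WordCountMaster`, `free_containers`) are hypothetical and are not the YARN API, and everything runs in a single process. The point it illustrates is that the resource manager only hands out containers, while the per-application master holds the job logic (here, a word count).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class NodeManager:
    """Per-node agent (hypothetical model): tracks free container slots."""
    name: str
    free_containers: int


@dataclass
class ResourceManager:
    """Cluster-wide allocator: hands out containers, knows nothing about MapReduce."""
    nodes: list

    def allocate(self, wanted):
        """Grant up to `wanted` containers; returns the hosting node names."""
        granted = []
        for node in self.nodes:
            while node.free_containers > 0 and len(granted) < wanted:
                node.free_containers -= 1
                granted.append(node.name)
        return granted


class WordCountMaster:
    """Per-application master: owns the job logic for one application."""

    def __init__(self, resource_manager):
        self.rm = resource_manager

    def run(self, lines):
        # Ask the resource manager for one container per input line.
        containers = self.rm.allocate(len(lines))
        if not containers:
            raise RuntimeError("no containers available")

        # "Map" phase: one task per line, placed round-robin on the containers.
        placements = []
        partial_counts = []
        for i, line in enumerate(lines):
            placements.append(containers[i % len(containers)])
            partial_counts.append(Counter(line.split()))

        # "Reduce" phase: merge the per-task counts into one result.
        total = Counter()
        for counts in partial_counts:
            total.update(counts)

        # Note: this toy never releases containers back to the cluster.
        return dict(total), placements
```

In this sketch a different kind of application would supply its own master class and reuse the same `ResourceManager` unchanged, which is the sense in which the architecture is not constrained to MapReduce.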
Big Data University has been chosen by IBM as one of the issuers of badges as part of the IBM Open Badge program. Share your achievements through LinkedIn, Facebook, Twitter, and more!
Big Data University uses the services of Pearson VUE Acclaim to assist in the administration of the IBM Open Badge program. By enrolling in this course, you agree to Big Data University sharing your details with Pearson VUE Acclaim strictly for the purpose of issuing your badge once you complete the badge criteria.
After completing this course, you should be able to:
Before taking this course, you should have the following background:
The minimum passing mark for the course is 60%, and the final test is worth 100% of the course mark. You have 3 attempts to take the test.