Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2

  • ISBN13: 9780321934505
  • ISBN10: 0321934504

  • Edition: 1st
  • Format: Paperback
  • Copyright: 3/19/2014
  • Publisher: Addison-Wesley Professional

Summary

“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.”
—From the Foreword by Raymie Stata, CEO of Altiscale


The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN

Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. In Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.

YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.

You’ll find many examples drawn from the authors’ cutting-edge experience, first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.

Coverage includes

  • YARN’s goals, design, architecture, and components—how it expands the Apache Hadoop ecosystem
  • Exploring YARN on a single node 
  • Administering YARN clusters and Capacity Scheduler 
  • Running existing MapReduce applications 
  • Developing a large-scale clustered YARN application 
  • Discovering new open source frameworks that run under YARN

Author Biography

Arun Murthy has contributed to Apache Hadoop full-time since the inception of the project in early 2006. He is a long-term Hadoop Committer and a member of the Apache Hadoop Project Management Committee. Previously, he was the architect and lead of the Yahoo! Hadoop MapReduce development team and was ultimately responsible, technically, for providing Hadoop MapReduce as a service for all of Yahoo!, currently running on nearly 50,000 machines. Arun is the Founder and Architect of Hortonworks Inc., a software company formed in June 2011 by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team to accelerate the development and adoption of Apache Hadoop. Funded by Yahoo! and Benchmark Capital, one of the preeminent technology investors, Hortonworks aims to ensure that Apache Hadoop becomes the standard platform for storing, processing, managing, and analyzing big data. Arun lives in Silicon Valley, California.

Douglas Eadline, PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC. Prior to starting and editing the popular ClusterMonkey.net web site in 2005, he served as Editor-in-Chief for ClusterWorld Magazine and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. Both clients and readers have recognized Dr. Eadline's ability to present a "technological value proposition" in a clear and accurate style. He has practical, hands-on experience in many aspects of HPC, including hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing.

Table of Contents

1. YARN Quick Start
2. YARN and the Hadoop Ecosystem
3. Functional Overview of YARN Components
4. Installing YARN
5. Running Applications with YARN
6. YARN Administration
7. YARN Architecture Guide
8. Writing a Simple YARN Application
9. Using YARN Distributed Shell
10. Accelerating Applications with Tez
11. YARN Frameworks
Appendix A. Navigating and Joining the Hadoop Ecosystem
Appendix B. YARN Software API Reference
