Data Just Right: Introduction to Large-Scale Data & Analytics

  • Edition: 1st
  • Format: Paperback
  • Copyright: 2013-12-19
  • Publisher: Addison-Wesley Professional


List Price: $39.99




Large-scale data analysis ("Big Data") is suddenly of crucial importance to virtually every enterprise. Mobile and social technologies are generating massive datasets, and distributed cloud computing is providing better ways to analyze and process that data. Accelerating technological change is turning long-accepted ideas about Big Data upside down, forcing companies to evaluate daunting new technologies, including NoSQL databases. Until now, however, most books on "Big Data" have been little more than business polemics and product catalogs. Data Just Right is different -- and utterly invaluable to every Big Data decision-maker, implementer, and strategist.

Google's Michael Manoochehri organizes this book around today's key Big Data use cases, showing how they can be best addressed by combining technologies in hybrid solutions. Drawing on his own extensive experience, Manoochehri presents the technical detail you need to implement each solution, and best practices you can apply to any Big Data project. You'll learn how to:

  • "Build for infinity," supporting rapid growth
  • Break down data silos
  • Decide what to insource and what to outsource
  • Focus on applications, not infrastructure, since that's where you can drive the most value

Throughout, Manoochehri shows how to use and integrate cutting-edge technologies including Hadoop, Hive, Pig, Tableau, R, and Google BigQuery. No other Big Data guide offers as much practical, actionable insight -- or even comes close.

Author Biography

Michael Manoochehri is an entrepreneur, writer, and optimist. With many years of experience working with enterprise, research, and non-profit organizations, his goal is to help make scalable data analytics more affordable and accessible. Michael has been a member of Google's Cloud Platform developer relations team, focusing on cloud computing and data developer products such as Google BigQuery. In addition, Michael has written for the tech blog ProgrammableWeb.com, has spent time in rural Uganda researching mobile phone use, and holds a master's degree in information management and systems from UC Berkeley's School of Information.

Table of Contents

Part I. Directives in the Big Data Era

1. The Four Guiding Principles For Data Success


Part II. Collecting and Sharing a Lot of Data

2. How to Host and Share 5 Terabytes of Data

3. Building a NoSQL-Based Web App to Collect Crowd-Sourced Data

4. Strategies for Breaking Down Data Silos


Part III. Asking Questions About Your Data

5. Using Hadoop, Hive, and Pig to Ask Questions About Large Data

6. Building a Data Dashboard With Google BigQuery

7. Preparing Big Data Sets for Visualization with Tableau


Part IV. Data Pipelines And Real Time Data

8. What is a Data Pipeline?

9. Building a Hadoop-Based Data Transformation Pipeline With Cascading

10. Analyzing Snapshots of Streaming Data with Twitter Storm


Part V. Machine Learning For Large Datasets

11. Building a Big Data Classification System with Mahout


Part VI. Statistical Analysis For Massive Datasets

12. Using R with Large Datasets

13. Build an Analytics Workflow With Pandas


Part VII. Practical Solutions To Big Data

14. When to Build, When to Buy, When to Outsource
