Introduction to Parallel Computing

by Grama, Ananth; Karypis, George; Kumar, Vipin; Gupta, Anshul
  • ISBN13: 9780201648652
  • ISBN10: 0201648652

  • Edition: 2nd
  • Format: Hardcover
  • Copyright: 2003-01-16
  • Publisher: Pearson
  • Purchase Benefits
  • Free Shipping On Orders Over $35!
    Your order must be $35 or more to qualify for free economy shipping. Bulk sales, POs, Marketplace items, eBooks, and apparel do not qualify for this offer.
  • We Buy This Book Back!
    In-Store Credit: $5.83
    Check/Direct Deposit: $5.55
    PayPal: $5.55
List Price: $173.32

Summary

Introduction to Parallel Computing, Second Edition
Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar

Increasingly, parallel processing is being seen as the only cost-effective method for the fast solution of computationally large and data-intensive problems. The emergence of inexpensive parallel computers such as commodity desktop multiprocessors and clusters of workstations or PCs has made such parallel methods generally applicable, as have software standards for portable parallel programming. This sets the stage for substantial growth in parallel software.

Data-intensive applications such as transaction processing and information retrieval, data mining and analysis, and multimedia services have provided a new challenge for the modern generation of parallel platforms. Emerging areas such as computational biology and nanotechnology have implications for algorithms and systems development, while changes in architectures, programming models, and applications have implications for how parallel platforms are made available to users in the form of grid-based services.

This book takes into account these new developments as well as covering the more traditional problems addressed by parallel computers. Where possible it employs an architecture-independent view of the underlying platforms and designs algorithms for an abstract model. Message Passing Interface (MPI), POSIX threads, and OpenMP have been selected as programming models, and the evolving application mix of parallel computing is reflected in various examples throughout the book.

  • Provides a complete end-to-end source on almost every aspect of parallel computing (architectures, programming paradigms, algorithms, and standards).
  • Covers both traditional computer science algorithms (sorting, searching, graph, and dynamic programming algorithms) as well as scientific computing algorithms (matrix computations, FFT).
  • Covers MPI, Pthreads, and OpenMP, the three most widely used standards for writing portable parallel programs.
  • The modular nature of the text makes it suitable for a wide variety of undergraduate- and graduate-level courses, including parallel computing, parallel programming, design and analysis of parallel algorithms, and high performance computing.

Ananth Grama is Associate Professor of Computer Sciences at Purdue University, working on various aspects of parallel and distributed systems and applications. Anshul Gupta is a member of the research staff at the IBM T. J. Watson Research Center; his research areas are parallel algorithms and scientific computing. George Karypis is Assistant Professor in the Department of Computer Science and Engineering at the University of Minnesota, working on parallel algorithm design, graph partitioning, data mining, and bioinformatics. Vipin Kumar is Professor in the Department of Computer Science and Engineering and the Director of the Army High Performance Computing Research Center at the University of Minnesota. His research interests are in the areas of high performance computing, parallel algorithms for scientific computing problems, and data mining.
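
The summary names MPI, Pthreads, and OpenMP as the book's programming models. As a flavor of the directive-based OpenMP style it refers to, here is a minimal sketch that is not taken from the book; the array size, fill values, and compile command are illustrative assumptions. It sums an array in parallel using a reduction clause:

    /* Illustrative OpenMP example (not from the book): each thread sums a
     * slice of the array, and reduction(+:sum) combines the per-thread
     * partial sums. Compile with, e.g., gcc -fopenmp sum.c */
    #include <stdio.h>
    #include <omp.h>

    #define N 1000000   /* assumed problem size, for illustration only */

    int main(void) {
        static double a[N];
        double sum = 0.0;

        for (int i = 0; i < N; i++)
            a[i] = 1.0;                  /* fill with known values */

        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i];                 /* expected result: N */

        printf("sum = %.0f using up to %d threads\n",
               sum, omp_get_max_threads());
        return 0;
    }

The table of contents entry "OpenMP: a Standard for Directive Based Parallel Programming" below covers this model in detail, alongside the MPI and Pthreads chapters.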

Table of Contents

Preface xvii
Acknowledgments xix
Introduction to Parallel Computing 1(10)
Motivating Parallelism 2(2)
The Computational Power Argument -- from Transistors to FLOPS 2(1)
The Memory/Disk Speed Argument 3(1)
The Data Communication Argument 4(1)
Scope of Parallel Computing 4(2)
Applications in Engineering and Design 4(1)
Scientific Applications 5(1)
Commercial Applications 5(1)
Applications in Computer Systems 6(1)
Organization and Contents of the Text 6(2)
Bibliographic Remarks 8(3)
Problems 9(2)
Parallel Programming Platforms 11(74)
Implicit Parallelism: Trends in Microprocessor Architectures 12(4)
Pipelining and Superscalar Execution 12(3)
Very Long Instruction Word Processors 15(1)
Limitations of Memory System Performance* 16(8)
Improving Effective Memory Latency Using Caches 17(1)
Impact of Memory Bandwidth 18(3)
Alternate Approaches for Hiding Memory Latency 21(2)
Tradeoffs of Multithreading and Prefetching 23(1)
Dichotomy of Parallel Computing Platforms 24(7)
Control Structure of Parallel Platforms 25(2)
Communication Model of Parallel Platforms 27(4)
Physical Organization of Parallel Platforms 31(22)
Architecture of an Ideal Parallel Computer 31(1)
Interconnection Networks for Parallel Computers 32(1)
Network Topologies 33(10)
Evaluating Static Interconnection Networks 43(1)
Evaluating Dynamic Interconnection Networks 44(1)
Cache Coherence in Multiprocessor Systems 45(8)
Communication Costs in Parallel Machines 53(10)
Message Passing Costs in Parallel Computers 53(8)
Communication Costs in Shared-Address-Space Machines 61(2)
Routing Mechanisms for Interconnection Networks 63(2)
Impact of Process-Processor Mapping and Mapping Techniques 65(9)
Mapping Techniques for Graphs 66(7)
Cost-Performance Tradeoffs 73(1)
Bibliographic Remarks 74(11)
Problems 76(9)
Principles of Parallel Algorithm Design 85(62)
Preliminaries 86(9)
Decomposition, Tasks, and Dependency Graphs 86(3)
Granularity, Concurrency, and Task-Interaction 89(4)
Processes and Mapping 93(1)
Processes versus Processors 94(1)
Decomposition Techniques 95(15)
Recursive Decomposition 95(2)
Data Decomposition 97(8)
Exploratory Decomposition 105(2)
Speculative Decomposition 107(2)
Hybrid Decompositions 109(1)
Characteristics of Tasks and Interactions 110(5)
Characteristics of Tasks 110(2)
Characteristics of Inter-Task Interactions 112(3)
Mapping Techniques for Load Balancing 115(17)
Schemes for Static Mapping 117(13)
Schemes for Dynamic Mapping 130(2)
Methods for Containing Interaction Overheads 132(7)
Maximizing Data Locality 132(2)
Minimizing Contention and Hot Spots 134(1)
Overlapping Computations with Interactions 135(1)
Replicating Data or Computations 136(1)
Using Optimized Collective Interaction Operations 137(1)
Overlapping Interactions with Other Interactions 138(1)
Parallel Algorithm Models 139(3)
The Data-Parallel Model 139(1)
The Task Graph Model 140(1)
The Work Pool Model 140(1)
The Master-Slave Model 141(1)
The Pipeline or Producer-Consumer Model 141(1)
Hybrid Models 142(1)
Bibliographic Remarks 142(5)
Problems 143(4)
Basic Communication Operations 147(48)
One-to-All Broadcast and All-to-One Reduction 149(8)
Ring or Linear Array 149(3)
Mesh 152(1)
Hypercube 153(1)
Balanced Binary Tree 153(1)
Detailed Algorithms 154(2)
Cost Analysis 156(1)
All-to-All Broadcast and Reduction 157(9)
Linear Array and Ring 158(2)
Mesh 160(1)
Hypercube 161(3)
Cost Analysis 164(2)
All-Reduce and Prefix-Sum Operations 166(1)
Scatter and Gather 167(3)
All-to-All Personalized Communication 170(9)
Ring 173(1)
Mesh 174(1)
Hypercube 175(4)
Circular Shift 179(5)
Mesh 179(2)
Hypercube 181(3)
Improving the Speed of Some Communication Operations 184(3)
Splitting and Routing Messages in Parts 184(2)
All-Port Communication 186(1)
Summary 187(1)
Bibliographic Remarks 188(7)
Problems 190(5)
Analytical Modeling of Parallel Programs 195(38)
Sources of Overhead in Parallel Programs 195(2)
Performance Metrics for Parallel Systems 197(8)
Execution Time 197(1)
Total Parallel Overhead 197(1)
Speedup 198(4)
Efficiency 202(1)
Cost 203(2)
The Effect of Granularity on Performance 205(3)
Scalability of Parallel Systems 208(10)
Scaling Characteristics of Parallel Programs 209(3)
The Isoefficiency Metric of Scalability 212(5)
Cost-Optimality and the Isoefficiency Function 217(1)
A Lower Bound on the Isoefficiency Function 217(1)
The Degree of Concurrency and the Isoefficiency Function 218(1)
Minimum Execution Time and Minimum Cost-Optimal Execution Time 218(3)
Asymptotic Analysis of Parallel Programs 221(1)
Other Scalability Metrics 222(4)
Bibliographic Remarks 226(7)
Problems 228(5)
Programming Using the Message-Passing Paradigm 233(46)
Principles of Message-Passing Programming 233(2)
The Building Blocks: Send and Receive Operations 235(5)
Blocking Message Passing Operations 236(3)
Non-Blocking Message Passing Operations 239(1)
MPI: the Message Passing Interface 240(10)
Starting and Terminating the MPI Library 242(1)
Communicators 242(1)
Getting Information 243(1)
Sending and Receiving Messages 244(4)
Example: Odd-Even Sort 248(2)
Topologies and Embedding 250(5)
Creating and Using Cartesian Topologies 251(2)
Example: Cannon's Matrix-Matrix Multiplication 253(2)
Overlapping Communication with Computation 255(5)
Non-Blocking Communication Operations 255(5)
Collective Communication and Computation Operations 260(12)
Barrier 260(1)
Broadcast 260(1)
Reduction 261(2)
Prefix 263(1)
Gather 263(1)
Scatter 264(1)
All-to-All 265(1)
Example: One-Dimensional Matrix-Vector Multiplication 266(2)
Example: Single-Source Shortest-Path 268(2)
Example: Sample Sort 270(2)
Groups and Communicators 272(4)
Example: Two-Dimensional Matrix-Vector Multiplication 274(2)
Bibliographic Remarks 276(3)
Problems 277(2)
Programming Shared Address Space Platforms 279(58)
Thread Basics 280(1)
Why Threads? 281(1)
The POSIX Thread API 282(1)
Thread Basics: Creation and Termination 282(5)
Synchronization Primitives in Pthreads 287(11)
Mutual Exclusion for Shared Variables 287(7)
Condition Variables for Synchronization 294(4)
Controlling Thread and Synchronization Attributes 298(3)
Attributes Objects for Threads 299(1)
Attributes Objects for Mutexes 300(1)
Thread Cancellation 301(1)
Composite Synchronization Constructs 302(8)
Read-Write Locks 302(5)
Barriers 307(3)
Tips for Designing Asynchronous Programs 310(1)
OpenMP: a Standard for Directive Based Parallel Programming 311(21)
The OpenMP Programming Model 312(3)
Specifying Concurrent Tasks in OpenMP 315(7)
Synchronization Constructs in OpenMP 322(5)
Data Handling in OpenMP 327(1)
OpenMP Library Functions 328(2)
Environment Variables in OpenMP 330(1)
Explicit Threads versus OpenMP Based Programming 331(1)
Bibliographic Remarks 332(5)
Problems 332(5)
Dense Matrix Algorithms 337(42)
Matrix-Vector Multiplication 337(8)
Rowwise 1-D Partitioning 338(3)
2-D Partitioning 341(4)
Matrix-Matrix Multiplication 345(7)
A Simple Parallel Algorithm 346(1)
Cannon's Algorithm 347(2)
The DNS Algorithm 349(3)
Solving a System of Linear Equations 352(19)
A Simple Gaussian Elimination Algorithm 353(13)
Gaussian Elimination with Partial Pivoting 366(3)
Solving a Triangular System: Back-Substitution 369(1)
Numerical Considerations in Solving Systems of Linear Equations 370(1)
Bibliographic Remarks 371(8)
Problems 372(7)
Sorting 379(50)
Issues in Sorting on Parallel Computers 380(2)
Where the Input and Output Sequences are Stored 380(1)
How Comparisons are Performed 380(2)
Sorting Networks 382(12)
Bitonic Sort 384(3)
Mapping Bitonic Sort to a Hypercube and a Mesh 387(7)
Bubble Sort and its Variants 394(5)
Odd-Even Transposition 395(3)
Shellsort 398(1)
Quicksort 399(13)
Parallelizing Quicksort 401(1)
Parallel Formulation for a CRCW PRAM 402(2)
Parallel Formulation for Practical Architectures 404(7)
Pivot Selection 411(1)
Bucket and Sample Sort 412(2)
Other Sorting Algorithms 414(2)
Enumeration Sort 414(1)
Radix Sort 415(1)
Bibliographic Remarks 416(13)
Problems 419(10)
Graph Algorithms 429(40)
Definitions and Representation 429(3)
Minimum Spanning Tree: Prim's Algorithm 432(4)
Single-Source Shortest Paths: Dijkstra's Algorithm 436(1)
All-Pairs Shortest Paths 437(8)
Dijkstra's Algorithm 438(2)
Floyd's Algorithm 440(5)
Performance Comparisons 445(1)
Transitive Closure 445(1)
Connected Components 446(4)
A Depth-First Search Based Algorithm 446(4)
Algorithms for Sparse Graphs 450(12)
Finding a Maximal Independent Set 451(4)
Single-Source Shortest Paths 455(7)
Bibliographic Remarks 462(7)
Problems 465(4)
Search Algorithms for Discrete Optimization Problems 469(46)
Definitions and Examples 469(5)
Sequential Search Algorithms 474(4)
Depth-First Search Algorithms 474(4)
Best-First Search Algorithms 478(1)
Search Overhead Factor 478(2)
Parallel Depth-First Search 480(16)
Important Parameters of Parallel DFS 482(3)
A General Framework for Analysis of Parallel DFS 485(3)
Analysis of Load-Balancing Schemes 488(2)
Termination Detection 490(2)
Experimental Results 492(3)
Parallel Formulations of Depth-First Branch-and-Bound Search 495(1)
Parallel Formulations of IDA* 496(1)
Parallel Best-First Search 496(5)
Speedup Anomalies in Parallel Search Algorithms 501(4)
Analysis of Average Speedup in Parallel DFS 502(3)
Bibliographic Remarks 505(10)
Problems 510(5)
Dynamic Programming 515(22)
Overview of Dynamic Programming 515(3)
Serial Monadic DP Formulations 518(5)
The Shortest-Path Problem 518(2)
The 0/1 Knapsack Problem 520(3)
Nonserial Monadic DP Formulations 523(3)
The Longest-Common-Subsequence Problem 523(3)
Serial Polyadic DP Formulations 526(1)
Floyd's All-Pairs Shortest-Paths Algorithm 526(1)
Nonserial Polyadic DP Formulations 527(3)
The Optimal Matrix-Parenthesization Problem 527(3)
Summary and Discussion 530(1)
Bibliographic Remarks 531(6)
Problems 532(5)
Fast Fourier Transform 537(28)
The Serial Algorithm 538(3)
The Binary-Exchange Algorithm 541(12)
A Full Bandwidth Network 541(7)
Limited Bandwidth Network 548(3)
Extra Computations in Parallel FFT 551(2)
The Transpose Algorithm 553(7)
Two-Dimensional Transpose Algorithm 553(3)
The Generalized Transpose Algorithm 556(4)
Bibliographic Remarks 560(5)
Problems 562(3)
Appendix A Complexity of Functions and Order Analysis 565(4)
A.1 Complexity of Functions 565(1)
A.2 Order Analysis of Functions 566(3)
Bibliography 569(42)
Author Index 611(10)
Subject Index 621

Supplemental Materials

What is included with this book?

A New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine whether it should include any access cards, study guides, lab manuals, CDs, etc.

Used, Rental, and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states that it includes access cards, study guides, lab manuals, CDs, etc.
