Parallel Computing Concepts and Terminology
Introduction | p. 3 |
Parallel Computing in Quantum Chemistry: Past and Present | p. 4 |
Trends in Hardware Development | p. 5 |
Moore's Law | p. 5 |
Clock Speed and Performance | p. 6 |
Bandwidth and Latency | p. 7 |
Supercomputer Performance | p. 8 |
Trends in Parallel Software Development | p. 10 |
Responding to Changes in Hardware | p. 10 |
New Algorithms and Methods | p. 10 |
New Programming Models | p. 12 |
References | p. 13 |
Parallel Computer Architectures | p. 17 |
Flynn's Classification Scheme | p. 17 |
Single-Instruction, Single-Data | p. 17 |
Single-Instruction, Multiple-Data | p. 18 |
Multiple-Instruction, Multiple-Data | p. 18 |
Network Architecture | p. 19 |
Direct and Indirect Networks | p. 19 |
Routing | p. 20 |
Network Performance | p. 23 |
Network Topology | p. 25 |
Crossbar | p. 26 |
Ring | p. 27 |
Mesh and Torus | p. 27 |
Hypercube | p. 28 |
Fat Tree | p. 28 |
Bus | p. 30 |
Ad Hoc Grid | p. 31 |
Node Architecture | p. 31 |
MIMD System Architecture | p. 34 |
Memory Hierarchy | p. 35 |
Persistent Storage | p. 35 |
Local Storage | p. 37 |
Network Storage | p. 37 |
Trends in Storage | p. 38 |
Reliability | p. 38 |
Homogeneity and Heterogeneity | p. 39 |
Commodity versus Custom Computers | p. 40 |
Further Reading | p. 42 |
References | p. 43 |
Communication via Message-Passing | p. 45 |
Point-to-Point Communication Operations | p. 46 |
Blocking Point-to-Point Operations | p. 46 |
Non-Blocking Point-to-Point Operations | p. 47 |
Collective Communication Operations | p. 49 |
One-to-All Broadcast | p. 50 |
All-to-All Broadcast | p. 51 |
All-to-One Reduction and All-Reduce | p. 54 |
One-Sided Communication Operations | p. 55 |
Further Reading | p. 56 |
References | p. 56 |
Multi-Threading | p. 59 |
Pitfalls of Multi-Threading | p. 61 |
Thread-Safety | p. 64 |
Comparison of Multi-Threading and Message-Passing | p. 65 |
Hybrid Programming | p. 66 |
Further Reading | p. 69 |
References | p. 70 |
Parallel Performance Evaluation | p. 71 |
Network Performance Characteristics | p. 71 |
Performance Measures for Parallel Programs | p. 74 |
Speedup and Efficiency | p. 74 |
Scalability | p. 79 |
Performance Modeling | p. 80 |
Modeling the Execution Time | p. 80 |
Performance Model Example: Matrix-Vector Multiplication | p. 83 |
Presenting and Evaluating Performance Data: A Few Caveats | p. 86 |
Further Reading | p. 90 |
References | p. 90 |
Parallel Program Design | p. 93 |
Distribution of Work | p. 94 |
Static Task Distribution | p. 95 |
Round-Robin and Recursive Task Distributions | p. 96 |
Dynamic Task Distribution | p. 99 |
Manager-Worker Model | p. 99 |
Decentralized Task Distribution | p. 101 |
Distribution of Data | p. 101 |
Designing a Communication Scheme | p. 104 |
Using Collective Communication | p. 104 |
Using Point-to-Point Communication | p. 105 |
Design Example: Matrix-Vector Multiplication | p. 107 |
Using a Row-Distributed Matrix | p. 108 |
Using a Block-Distributed Matrix | p. 109 |
Summary of Key Points of Parallel Program Design | p. 112 |
Further Reading | p. 114 |
References | p. 114 |
Applications of Parallel Programming in Quantum Chemistry
Two-Electron Integral Evaluation | p. 117 |
Basics of Integral Computation | p. 117 |
Parallel Implementation Using Static Load Balancing | p. 119 |
Parallel Algorithms Distributing Shell Quartets and Pairs | p. 119 |
Performance Analysis | p. 121 |
Determination of the Load Imbalance Factor k(p) | p. 122 |
Determination of μ and σ for Integral Computation | p. 123 |
Predicted and Measured Efficiencies | p. 124 |
Parallel Implementation Using Dynamic Load Balancing | p. 125 |
Parallel Algorithm Distributing Shell Pairs | p. 126 |
Performance Analysis | p. 128 |
Load Imbalance | p. 128 |
Communication Time | p. 128 |
Predicted and Measured Efficiencies | p. 129 |
References | p. 130 |
The Hartree-Fock Method | p. 131 |
The Hartree-Fock Equations | p. 131 |
The Hartree-Fock Procedure | p. 133 |
Parallel Fock Matrix Formation with Replicated Data | p. 135 |
Parallel Fock Matrix Formation with Distributed Data | p. 138 |
Further Reading | p. 145 |
References | p. 146 |
Second-Order Møller-Plesset Perturbation Theory | p. 147 |
The Canonical MP2 Equations | p. 147 |
A Scalar Direct MP2 Algorithm | p. 149 |
Parallelization with Minimal Modifications | p. 151 |
High-Performance Parallelization | p. 154 |
Performance of the Parallel Algorithms | p. 158 |
Further Reading | p. 164 |
References | p. 164 |
Local Møller-Plesset Perturbation Theory | p. 167 |
The LMP2 Equations | p. 167 |
A Scalar LMP2 Algorithm | p. 169 |
Parallel LMP2 | p. 170 |
Two-Electron Integral Transformation | p. 171 |
Computation of the Residual | p. 173 |
Parallel Performance | p. 174 |
References | p. 177 |
Appendices
A Brief Introduction to MPI | p. 181 |
Pthreads: Explicit Use of Threads | p. 189 |
OpenMP: Compiler Extensions for Multi-Threading | p. 195 |
Index | p. 205 |