Contents | p. vii |
Preface | p. xix |
For the Second Edition | p. xix |
Preface from the First Edition | p. xxiv |
About the Authors | p. xxxi |
Introduction | p. 1 |
Why an Availability Book? | p. 2 |
Our Approach to the Problem | p. 3 |
What's Not Here | p. 4 |
Our Mission | p. 4 |
The Availability Index | p. 5 |
Summary | p. 6 |
Organization of the Book | p. 6 |
Key Points | p. 8 |
What to Measure | p. 9 |
Measuring Availability | p. 10 |
Failure Modes | p. 20 |
Confidence in Your Measurements | p. 28 |
Key Points | p. 30 |
The Value of Availability | p. 31 |
What Is High Availability? | p. 31 |
The Costs of Downtime | p. 34 |
The Value of Availability | p. 37 |
The Availability Continuum | p. 47 |
The Availability Index | p. 51 |
The Lifecycle of an Outage | p. 52 |
Key Points | p. 60 |
The Politics of Availability | p. 61 |
Beginning the Persuasion Process | p. 61 |
Your Audience | p. 69 |
Delivering the Message | p. 70 |
After the Message Is Delivered | p. 73 |
Key Points | p. 73 |
20 Key High Availability Design Principles | p. 75 |
Don't Be Cheap | p. 76 |
Assume Nothing | p. 77 |
Remove Single Points of Failure (SPOFs) | p. 78 |
Enforce Security | p. 79 |
Consolidate Your Servers | p. 81 |
Watch Your Speed | p. 82 |
Enforce Change Control | p. 83 |
Document Everything | p. 84 |
Employ Service Level Agreements | p. 87 |
Plan Ahead | p. 88 |
Test Everything | p. 89 |
Separate Your Environments | p. 90 |
Learn from History | p. 92 |
Design for Growth | p. 93 |
Choose Mature Software | p. 94 |
Choose Mature, Reliable Hardware | p. 95 |
Reuse Configurations | p. 97 |
Exploit External Resources | p. 98 |
One Problem, One Solution | p. 99 |
K.I.S.S. (Keep It Simple...) | p. 101 |
Key Points | p. 104 |
Backups and Restores | p. 105 |
The Basic Rules for Backups | p. 106 |
Do Backups Really Offer High Availability? | p. 108 |
What Should Get Backed Up? | p. 109 |
Backup Software | p. 111 |
Backup Performance | p. 115 |
Backup Styles | p. 125 |
Handling Backup Tapes and Data | p. 141 |
Restores | p. 145 |
Summary | p. 147 |
Key Points | p. 148 |
Highly Available Data Management | p. 149 |
Four Fundamental Truths | p. 150 |
Six Independent Layers of Data Storage and Management | p. 152 |
Disk Hardware and Connectivity Terminology | p. 153 |
RAID Technology | p. 161 |
Disk Space and Filesystems | p. 176 |
Key Points | p. 182 |
SAN, NAS, and Virtualization | p. 183 |
Storage Area Networks (SANs) | p. 184 |
Network-Attached Storage (NAS) | p. 190 |
SAN or NAS: Which Is Better? | p. 191 |
Storage Virtualization | p. 196 |
Key Points | p. 202 |
Networking | p. 203 |
Network Failure Taxonomy | p. 204 |
Building Redundant Networks | p. 214 |
Load Balancing and Network Redirection | p. 228 |
Dynamic IP Addresses | p. 232 |
Network Service Reliability | p. 232 |
Key Points | p. 240 |
Data Centers and the Local Environment | p. 241 |
Data Centers | p. 242 |
Electricity | p. 252 |
Cabling | p. 255 |
Cooling and Environmental Issues | p. 257 |
System Naming Conventions | p. 259 |
Key Points | p. 261 |
People and Processes | p. 263 |
System Management and Modifications | p. 264 |
Vendor Management | p. 271 |
Security | p. 277 |
Documentation | p. 280 |
System Administrators | p. 284 |
Internal Escalation | p. 287 |
Key Points | p. 290 |
Clients and Consumers | p. 291 |
Hardening Enterprise Clients | p. 292 |
Tolerating Data Service Failures | p. 296 |
Key Points | p. 302 |
Application Design | p. 303 |
Application Recovery Overview | p. 304 |
Application Recovery from System Failures | p. 309 |
Internal Application Failures | p. 316 |
Developer Hygiene | p. 319 |
Process Replication | p. 326 |
Assume Nothing, Manage Everything | p. 330 |
Key Points | p. 331 |
Data and Web Services | p. 333 |
Network File System Services | p. 334 |
Database Servers | p. 342 |
Redundancy and Availability | p. 349 |
Web-Based Services Reliability | p. 351 |
Key Points | p. 359 |
Local Clustering and Failover | p. 361 |
A Brief and Incomplete History of Clustering | p. 362 |
Server Failures and Failover | p. 365 |
Logical, Application-centric Thinking | p. 367 |
Failover Requirements | p. 369 |
Larger Clusters | p. 385 |
Key Points | p. 386 |
Failover Management and Issues | p. 387 |
Failover Management Software (FMS) | p. 388 |
Component Monitoring | p. 389 |
Time to Manual Failover | p. 393 |
Homemade Failover Software or Commercial Software? | p. 395 |
Commercial Failover Management Software | p. 397 |
When Good Failovers Go Bad | p. 398 |
Verification and Testing | p. 404 |
Managing Failovers | p. 408 |
Other Clustering Topics | p. 411 |
Key Points | p. 414 |
Failover Configurations | p. 415 |
Two-Node Failover Configurations | p. 416 |
Service Group Failover | p. 425 |
Larger Cluster Configurations | p. 426 |
How Large Should Clusters Be? | p. 430 |
Key Points | p. 431 |
Data Replication | p. 433 |
What Is Replication? | p. 434 |
Why Replicate? | p. 435 |
Two Categories of Replication Types | p. 435 |
Other Thoughts on Replication | p. 458 |
Key Points | p. 463 |
Virtual Machines and Resource Management | p. 465 |
Partitions and Domains: System-Level VMs | p. 466 |
Containers and Jails: OS Level VMs | p. 468 |
Resource Management | p. 469 |
Key Points | p. 471 |
The Disaster Recovery Plan | p. 473 |
Should You Worry about DR? | p. 474 |
Three Primary Goals of a DR Plan | p. 475 |
What Goes into a Good DR Plan | p. 476 |
Preparing to Build the DR Plan | p. 477 |
Choosing a DR Site | p. 484 |
Distributing the DR Plan | p. 488 |
The Plan's Audience | p. 490 |
Timelines | p. 492 |
Team Assignments | p. 493 |
How Many Different Plans? | p. 495 |
Shared DR Sites | p. 496 |
Equipping the DR Site | p. 498 |
Is Your Plan Any Good? | p. 500 |
Three Types of Exercises | p. 507 |
The Effects of a Disaster on People | p. 509 |
Key Points | p. 512 |
A Resilient Enterprise* | p. 513 |
The New York Board of Trade | p. 514 |
Summary | p. 539 |
A Brief Look Ahead | p. 541 |
iSCSI | p. 541 |
InfiniBand | p. 542 |
Global Filesystem Undo | p. 543 |
Grid Computing | p. 545 |
Blade Computing | p. 547 |
Global Storage Repository | p. 548 |
Autonomic and Policy-Based Computing | p. 549 |
Intermediation | p. 551 |
Software Quality and Byzantine Reliability | p. 552 |
Business Continuity | p. 553 |
Key Points | p. 554 |
Parting Shots | p. 555 |
How We Got Here | p. 555 |
Index | p. 559 |
Table of Contents provided by Ingram. All Rights Reserved.