Designing Enterprise Solutions With Sun Cluster 3.0

  • ISBN13: 9780130084583
  • ISBN10: 0130084581
  • Edition: 1st
  • Format: Paperback
  • Copyright: 2002-01-01
  • Publisher: Prentice Hall PTR
List Price: $46.79

Summary

Understand the theory behind system and component failures, their impact, and their subsequent cost to the enterprise. Learn about the leading-edge technologies used inside the Sun Cluster 3.0 software. See how Sun Clusters enable enterprises to deploy best-practice cluster technology quickly and safely.

Designing Enterprise Solutions with Sun Cluster 3.0 is an introduction to architecting highly available systems with Sun servers, storage, and the Sun Cluster 3.0 software. Three themes recur throughout the book and at every level of systems design: failures, synchronization, and arbitration. The first chapter explains how these themes relate and how to recognize the failure modes associated with synchronization and arbitration. The second and third chapters review the building blocks and describe the Sun Cluster 3.0 software environment in detail. The remaining chapters discuss management servers and provide hypothetical case studies in wh

Table of Contents

Preface
Acknowledgements
Cluster and Complex System Design Issues
Business Reasons for Clustered Systems
Risk Assessment
Cost Estimation
Failures in Complex Systems
Fault Detection
Probes
Latent Faults
Fault Isolation
Fault Reporting
Fault Containment
Reconfiguration Around Faults
Fault Prediction
Data Synchronization
Data Uniqueness
Complexity and Reliability
Synchronization Techniques
Microprocessor Cache Coherency
Kernel-Level Synchronization
Application-Level Synchronization
Synchronization Consistency Failures
Two-Phase Commit
Locks and Lock Management
Lock Performance
Arbitration Schemes
Asymmetric Arbitration
Symmetric Arbitration
Voting and Quorum
Data Caches
Cost and Latency Trade-Off
Cache Types
Cache Synchronization
Timeouts
Stable System
Unstable System
Stability Problems
Failures in Clustered Systems
Split Brain
Multiple Instances
Amnesia
Summary
Enterprise Cluster Computing Building Blocks
Data Repositories and Infrastructure Services
File Services
NFS
SAMBA
Database Services
HA-Oracle
Oracle 8i OPS and Oracle 9i RAC
Messaging Services
Name Services
DNS
LDAP
NIS and NIS+
Business Logic and Application Service
Packaged Business Solutions
Application Servers
User Access Services: Web Farms
Compute Clusters
Distributed Clusters
Parallel Processing
High-Performance Computing
Sun HPC Clusters
Sun Cluster Runtime Environment
Sun MPI Communications Library
Sun Parallel File System
Sun Scalable Scientific Subroutine Library
Prism Parallel Development Environment
Sun Grid Engine Software
Technologies for Building Distributed Applications
CORBA
JXTA
Sun Cluster 3.0 Architecture
System Architecture
Enterprise Infrastructure
Service Point Architecture
Fault Tolerant Systems
High Availability Versus Disaster Recovery
Data Deletion and Corruption Recovery
Kernel Infrastructure
Kernel Framework
Replica Management
Mini-Transactions
System Features
Storage Topologies
Clustered Pair Topology
N+1 Topology
Pair+M Topology
Cluster Device Connectivity
Global Devices
Primary and Secondary I/O Paths
Device ID
Namespace
Practical Uses
Global File Service
Application Access
Client/Server Model
Read and Write Implementation
I/O Parallelism
File and Attribute Caches
CFS Mounting
Application Binaries, Data, and Logs
Application Performance
Node Separation Performance Impact
CFS Versus NFS
Global Networking Service
Packet Distribution Mechanisms
Advantages
Client Connection Recovery After a GIN Node Failure
Private Interconnects
Traffic
Resiliency
Protocols
Configuration Guidelines
Cluster Configuration Control
Configuration Repository
File Consistency
Amnesia and Temporally Split Configurations
Cluster Failures
Failure Detection
Failure Handling and Outage Time
Accuracy Versus Speed
Public Network Monitoring
Application Failure
Process Monitoring Facility
Recoverable Failures
Data Storage
Private Interconnects
Public Networks
Unrecoverable Failures
Failure Reporting
Synchronization
Data Services and Application Agents
Agent Application Program Interfaces
Data Service Constructs
Resource Types
Resources
Resource Groups
Resource Group Manager Daemon
Parallel Services
Arbitration
Cluster Membership
CMM Implementation
Majority Voting and Quorum Principles
CMM Reconfiguration Process
SCSI-2 and SCSI-3 Command Set Support
Quorum Disk Vote
Uneven Cluster Partitions
Disk Fencing
Failfast Driver
Cluster Reconfiguration
Management Server
Design Goals
Services
Console Services
JumpStart
Consolidated Cluster Node Messages
AnswerBook2 Documentation Server
Sun Management Center Server
Solaris Management Console
NTP Server
Sun Ray Server
Sun StorEdge SAN Surfer
Sun Explorer Data Collector
Sun Remote Services
Software Stack
Hardware Components
Network Configuration
Systems Management
Backup, Restore, and Recovery
Management Server
Tape Backup
CDs and DVDs
Directly Attached Tape Drives
Web Start Flash Technology
JumpStart Software
Summary
Case Study 1-File Server Cluster
Firm Description
Design Goals
Business Case
Server Requirements
Cluster Services
Expected Service Level
Design Priorities
Availability
Cost
Reliability
Recovery
Security
Performance, Sizing, and Capacity Planning
Serviceability
Cluster Software
Software Configuration
NFS Overview
NFS Characteristics
Arbitration
Synchronization
Network Status Monitor
Recommended Hardware Configuration
Management Server
Nodes
Options
Options Considered But Discounted
Boot Environment
Shared Storage
Options
Options Considered But Discounted
Network and Interconnects
Options
Options Considered But Discounted
Environmental
Options
Options Considered But Discounted
Backup, Restore, and Recovery
Justification
Options
Summary
Case Study 2-Database Cluster
Company Description
Information Technology Organization
Design Goals
Business Case
Requirements
Services Required
Service Level Expected
Design Priorities
Availability
Reliability
Serviceability
Security
Recovery
Cost
Performance
Cluster Software
Arbitration
Lock Mastering
Node Joining the Cluster
Node Leaving the Cluster
Crash Recovery
Automatic Lock Remastering
Synchronization
Local GCS Lock Mode Versus Global
Cache Fusion Read-Read Example
Recommended Hardware Configuration
Management Server
Nodes
Options
Options Considered But Discounted
Boot Environment
Shared Storage
Options
Options Considered But Discounted
Network Interconnects
Options
Options Considered But Discounted
Environmental Requirements
Power Sources
Ambient Temperature
Ambient Relative Humidity
Backup, Restore, and Recovery
Options
Options Considered But Discounted
Summary
A. Sun Cluster 3.0 Design Checklists
Business Case Considerations
Personnel Considerations
Top-Level Design Documentation
Environmental Design
Server Design
Shared Storage Design
Network Design
Software Environment Design
Security Considerations
Systems Management Requirements
Testing Requirements
B. Sun Cluster Technology History and Perspective
SPARCcluster PDB 1.x and SPARCcluster HA 1.x History
SPARCcluster PDB 1.x
SPARCcluster HA 1.x
Sun Cluster 2.x
Sun Cluster 2.2 and 3.0 Feature Comparison
Cluster Interconnects
Switch Management Agent
Membership Monitor
Quorum Voting
Failure Fencing
Cluster Configuration Database
Data Services
Availability
Control
Cluster Management
Summary
C. Data Center Guidelines
Hardware Platform Stability
Server Consolidation in a Common Rack
System Component Identification
Solaris Device Labeling
Interconnection Diagram
Component and Cable Labeling
AC/DC Power
System Cooling
Network Infrastructure
Security
System Installation and Configuration Documentation
Change Control Practices
Maintenance and Patch Strategy
Component Spares
New Release Upgrade Process
Support Agreement and Associated Response Time
Backup-and-Restore Testing
Cluster Recovery Procedures
Summary
D. Tools
Fault Tree Analysis
Building for Analysis
Inspecting an FTA
Reliability Block Diagram Analysis
Failure Modes and Effects Analysis
Risk Priority Number
FMEA Process
Event Tree Analysis
ETA Process
ETA Example
Acronyms, Abbreviations, and Glossary
Bibliography
Index
