ISBN-13: 9780130962881

Summary

Introduction

Back when the mainframe ruled the world, information technology (IT) practitioners quickly learned the value of a well-managed system. They understood the importance of managing problems, changes, and other issues confronting the large, mission-critical computer systems that ran an organization's most sensitive business functions.

When the popularity of mainframes waned in favor of less costly midrange and PC systems, IT organizations were caught up in the frenzy of developing and deploying new business applications at breakneck speed. Suddenly, more computing power was available to end users, who wanted to accomplish more with it than ever before. The corporate information system grew in scope, use, and importance, with no end in sight.

Now that the dust has settled somewhat, both the IT organization and the leaders of the business recognize that an unmanaged state-of-the-art computer system can be as bad as having none at all. The symptoms of unmanaged systems include ballooning IT costs, overworked and demoralized IT staff, and dissatisfied users.

This book demonstrates how to deliver maximum system availability and manageability throughout a computer system's lifecycle, from design through implementation and maintenance. We review every key technique for simplifying the management and maintenance of computer systems, including redundancy, standardization, backups, and many more. We discuss practical means of implementing these techniques to make your current and future systems far less prone to outages.

We cover both technical and management issues, since you cannot achieve long-term system availability and manageability without addressing both. We have written this book to benefit everyone in the IT organization. Technical staff will find practical operational solutions that can be implemented immediately. IT management will gain a better perspective on the end-to-end, interrelated requirements of running an IT shop. Chief Information Officers (CIOs) and other senior IT executives will find forward-looking strategies for enhancing the IT infrastructure and its contribution to the corporate bottom line.

You can manage systems better if you design them with high system availability in mind. This book will show you how to address your system availability problems, from start to finish.

Author Biography

Floyd Piedad is President of AKAsia Services Corporation, based in Manila, the Philippines. Michael Hawkins is an IT management consultant.

Table of Contents

Introduction

Acknowledgments

Today's Computing Environment

Complexity, Complexity, Complexity

Multiple Technologies and Protocols

Multiple Vendors

Varied Users

Multiple Locations

Rapid Change

Greater Business Demands

A Daunting Environment To Work In

The Total Cost of Ownership Issue

Total Cost of Ownership Defined

Industry TCO Estimates

What TCO Studies Reveal

The Underlying Reason for High TCO

A Typical Scenario: Choosing Office Systems

Availability as the Most Significant Contributor to TCO

Summary

Achieving Higher Availability

Determining User Availability Requirements

The Service Level Agreement

Helping Users Identify Their Availability Requirements

Availability Levels and Measurements

High Availability Level

Continuous Operations Level

Continuous Availability

Quantifying Availability Targets

Availability: A User Metric

Measuring End-To-End Availability

Summary

Planning for System Availability

Identifying System Components

Addressing Critical Components

The Four Elements of Availability

Summary

Preparing for Systems Management

Processes, Data, Tools, and Organization

Systems Management in the PC World (or the Lack of It)

IT Organizations: Away from Centralization, Then Back Again

Understanding the Systems To Manage

The Basics of Management: Five Phases

Setting Objectives

Planning

Execution

Measurement

Control

Identifying the Systems Management Disciplines

Implementing Service-Level Management

Service-Level Management

Process Requirements

Data and Measurement Requirements

Organization Requirements

Tools Requirements

Benefits of Service-Level Management

Problem Management

Process Requirements

Data Requirements

Organization Requirements

Tools Requirements

Benefits of Problem Management

Change Management

Process Requirements

Data Requirements

Organization Requirements

Tools Requirements

Benefits

Security Management

Process Requirements

Data Requirements

Organization Requirements

Tools Requirements

Benefits

Asset and Configuration Management

Process Requirements

Data Requirements

Organization Requirements

Tools Requirements

Availability Management

Process Requirements

Data Requirements

Organization Requirements

Tools Requirements

Benefits

From Centralized to Distributed Computing Environments

Systems Management Disciplines

The Centralized Computing Environment

The Distributed Computing Environment

Systems Management in Today's Computing Environment

Defining Appropriate Functions and Control

Choosing a Deployment Strategy

Developing a Deployment Strategy

Management by Exception

Policy-Based Management

Standardization of Performance Data

Accountability of the Distributed Systems Manager

Central Definition of Systems Management Architectures

Process Ownership

Summary

Techniques That Address Multiple Availability Requirements

Redundancy

Hardware Redundancy Examples

Software Redundancy Examples

Environmental Redundancy Example

Critical Success Factors

Backup of Critical Resources

Methods of Backup

Hardware Backup Examples

Software Backup Examples

IT Operations Backup Examples

Critical Success Factors

Clustering

Comparing Clustering and Redundancy

Hardware and Software Clustering Examples

IT Operations Clustering Examples

Environmental Clustering Examples

Critical Success Factors

Fault Tolerance

Hardware Fault Tolerance Examples

Software Fault Tolerance Examples

Environmental Fault Tolerance Examples

Critical Success Factors

Isolation or Partitioning

Hardware Isolation Examples

Software Isolation Examples

Other Benefits of Isolation

Critical Success Factors

Automated Operations

Console and Network Operations Examples

Workload Management Examples

System Resource Monitoring Examples

Problem Management Applications

Distribution of Resources Example

Backup and Restore Examples

Critical Success Factors

Access Security Mechanisms

Steps to Secure Access

Types of Security

Password Management

Critical Success Factors

Standardization

Hardware Standardization Examples

Software Standardization Examples

Network Standardization Examples

Processes and Procedures Standardization Examples

Naming Standardization Examples

Critical Success Factors

Transitioning to Standardization

Summary

Special Techniques for System Reliability

The Use of Reliable Components

Techniques for Maximizing Hardware Component Reliability

Techniques for Maximizing Software Component Reliability

Personnel-Related Techniques for Maximizing Reliability

Environment-Related Techniques for Maximizing Reliability

Some Reliability Indicators for Suppliers

Programming to Minimize Failures

Correctness

Robustness

Extensibility

Reusability

Implement Environmental Independence Measures

Use Power Generators

Use Independent Air-Conditioning Units

Use Fire Protection Systems

Use Raised Flooring

Install Equipment Wheel Locks

Locate Computer Room on the Second Floor

Utilize Fault Avoidance Measures

Analyzing Problem Trends and Statistics

Use of Advanced Hardware Technologies

Use of Software Maintenance Tools

Summary

Special Techniques for System Recoverability

Automatic Fault Recognition

Parity Checking Memory

ECC Memory

Data Validation Routines

Fast Recovery Techniques

Minimizing Use of Volatile Storage Media

Regular Database Updates to Central Storage

Automatic File-Save Features

Summary

Special Techniques for System Serviceability

Online System Redefinition

Add or Remove I/O Devices

Selectively Power Down Subsystems

Commit or Reject Changes

Informative Error Messages

Use Standard Corporate Terminology

Adopt Terms Already Used by Common Applications

Tell What, Why, Impact, and How

Implement Context-Sensitive Help

Give Options for Viewing More Detailed Error Information

Make Error Information Available After the Error Has Been Cleared

Complete Documentation

Have a Manual of Operations on Hand

Write Basic Problem Isolation and Recovery Guides

Provide System Configuration Diagrams

Label Resources

Provide a Complete Technical Library

Installation of Latest Fixes and Patches

Summary

Special Techniques for System Manageability

Use Manageable Components

Simple Network Management Protocol (SNMP)

Common Management Information Protocol (CMIP)

Desktop Management Interface (DMI)

Common Information Management Format (CIM)

Wired for Management (WfM)

Management Applications

Systems Management Issues

Automated Systems Management Capabilities

System Management Applications and Frameworks

Educate IS Personnel on Systems Management Disciplines

Business Value of the Information System

Value of Systems Management Disciplines

Principles of Management

Basic Numerical Analysis Skills

Summary

All Together Now

The Value of Systems Management Disciplines

Which One First?

Analyze Outages

Identify Single Points of Failure

Exploit What You Have

An Implementation Strategy

Summary

Appendix A Availability Features of Selected Products

Availability Features of Selected Operating Systems

Availability Features of Novell NetWare

Availability Features of Sun Solaris 8

Availability Features of AIX

Availability Features of Microsoft Windows 2000 Server and Professional

Availability Features of IBM OS/400

Availability Features of Selected Hardware Components

Availability Features of IBM S/390 Integrated Server

Availability Features of the IBM AS/400 Midrange System

Availability Features of the IBM RS/6000

Availability Features of Compaq Proliant Servers

Availability Features of Selected Software Components

Availability Features of the Oracle 8i Database

Index




High Availability: Design, Techniques and Processes

ISBN-10: 0130962880
