Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals

by Paulraj Ponniah
  • ISBN13: 9780471412540
  • ISBN10: 0471412546
  • Format: Hardcover
  • Copyright: 2001-09-01
  • Publisher: Wiley-Interscience
  • List Price: $120.00

Summary

Geared to IT professionals eager to get into the all-important field of data warehousing, this book explores all topics needed by those who design and implement data warehouses. Readers will learn about planning requirements, architecture, infrastructure, data preparation, information delivery, implementation, and maintenance. They'll also find a wealth of industry examples garnered from the author's 25 years of experience in designing and implementing databases and data warehouse applications for major corporations. Market: IT Professionals, Consultants.

Author Biography

PAULRAJ PONNIAH, PhD, is an IT professional with twenty-five years of experience who specializes in the design and implementation of data warehouse and database systems. He also teaches database and data warehousing courses.

Table of Contents

Foreword xxi
Preface xxiii
Part 1 OVERVIEW AND CONCEPTS
The Compelling Need for Data Warehousing
1(18)
Chapter Objectives
1(1)
Escalating Need for Strategic Information
2(5)
The Information Crisis
3(1)
Technology Trends
4(1)
Opportunities and Risks
5(2)
Failures of Past Decision-Support Systems
7(2)
History of Decision-Support Systems
8(1)
Inability to Provide Information
9(1)
Operational Versus Decision-Support Systems
9(3)
Making the Wheels of Business Turn
10(1)
Watching the Wheels of Business Turn
10(1)
Different Scope, Different Purposes
10(2)
Data Warehousing-The Only Viable Solution
12(1)
A New Type of System Environment
12(1)
Processing Requirements in the New Environment
12(1)
Business Intelligence at the Data Warehouse
12(1)
Data Warehouse Defined
13(2)
A Simple Concept for Information Delivery
14(1)
An Environment, Not a Product
14(1)
A Blend of Many Technologies
14(1)
Chapter Summary
15(1)
Review Questions
16(1)
Exercises
16(3)
Data Warehouse: The Building Blocks
19(20)
Chapter Objectives
19(1)
Defining Features
20(4)
Subject-Oriented Data
20(1)
Integrated Data
21(1)
Time-Variant Data
22(1)
Nonvolatile Data
23(1)
Data Granularity
23(1)
Data Warehouses and Data Marts
24(4)
How are They Different?
25(1)
Top-Down Versus Bottom-Up Approach
26(1)
A Practical Approach
27(1)
Overview of the Components
28(7)
Source Data Component
28(3)
Data Staging Component
31(2)
Data Storage Component
33(1)
Information Delivery Component
34(1)
Metadata Component
35(1)
Management and Control Component
35(1)
Metadata in the Data Warehouse
35(1)
Types of Metadata
36(1)
Special Significance
36(1)
Chapter Summary
36(1)
Review Questions
37(1)
Exercises
37(2)
Trends in Data Warehousing
39(24)
Chapter Objectives
39(1)
Continued Growth in Data Warehousing
40(3)
Data Warehousing is Becoming Mainstream
40(1)
Data Warehouse Expansion
41(1)
Vendor Solutions and Products
42(1)
Significant Trends
43(13)
Multiple Data Types
44(2)
Data Visualization
46(2)
Parallel Processing
48(1)
Query Tools
49(1)
Browser Tools
50(1)
Data Fusion
50(1)
Multidimensional Analysis
51(1)
Agent Technology
51(1)
Syndicated Data
52(1)
Data Warehousing and ERP
52(1)
Data Warehousing and KM
53(1)
Data Warehousing and CRM
54(2)
Active Data Warehousing
56(1)
Emergence of Standards
56(2)
Metadata
57(1)
OLAP
57(1)
Web-Enabled Data Warehouse
58(3)
The Warehouse to the Web
59(1)
The Web to the Warehouse
59(1)
The Web-Enabled Configuration
60(1)
Chapter Summary
61(1)
Review Questions
61(1)
Exercises
62(1)
Part 2 PLANNING AND REQUIREMENTS
Planning and Project Management
63(26)
Chapter Objectives
63(1)
Planning Your Data Warehouse
64(5)
Key Issues
64(2)
Business Requirements, Not Technology
66(1)
Top Management Support
67(1)
Justifying Your Data Warehouse
67(1)
The Overall Plan
68(1)
The Data Warehouse Project
69(5)
How is it Different?
70(1)
Assessment of Readiness
71(1)
The Life-Cycle Approach
71(2)
The Development Phases
73(1)
The Project Team
74(6)
Organizing the Project Team
75(1)
Roles and Responsibilities
75(2)
Skills and Experience Levels
77(1)
User Participation
78(2)
Project Management Considerations
80(6)
Guiding Principles
81(1)
Warning Signs
82(1)
Success Factors
82(1)
Anatomy of a Successful Project
83(1)
Adopt a Practical Approach
84(2)
Chapter Summary
86(1)
Review Questions
86(1)
Exercises
87(2)
Defining the Business Requirements
89(20)
Chapter Objectives
89(1)
Dimensional Analysis
90(3)
Usage of Information Unpredictable
90(1)
Dimensional Nature of Business Data
90(2)
Examples of Business Dimensions
92(1)
Information Packages-A New Concept
93(4)
Requirements Not Fully Determinate
93(2)
Business Dimensions
95(1)
Dimension Hierarchies/Categories
95(1)
Key Business Metrics or Facts
96(1)
Requirements Gathering Methods
97(7)
Interview Techniques
99(3)
Adapting the JAD Methodology
102(1)
Review of Existing Documentation
103(1)
Requirements Definition: Scope and Content
104(2)
Data Sources
105(1)
Data Transformation
105(1)
Data Storage
105(1)
Information Delivery
105(1)
Information Package Diagrams
106(1)
Requirements Definition Document Outline
106(1)
Chapter Summary
106(1)
Review Questions
107(1)
Exercises
107(2)
Requirements as the Driving Force for Data Warehousing
109(18)
Chapter Objectives
109(1)
Data Design
110(3)
Structure for Business Dimensions
112(1)
Structure for Key Measurements
112(1)
Levels of Detail
113(1)
The Architectural Plan
113(6)
Composition of the Components
114(1)
Special Considerations
115(3)
Tools and Products
118(1)
Data Storage Specifications
119(2)
DBMS Selection
120(1)
Storage Sizing
120(1)
Information Delivery Strategy
121(3)
Queries and Reports
122(1)
Types of Analysis
123(1)
Information Distribution
123(1)
Decision Support Applications
123(1)
Growth and Expansion
123(1)
Chapter Summary
124(1)
Review Questions
124(1)
Exercises
125(2)
Part 3 ARCHITECTURE AND INFRASTRUCTURE
The Architectural Components
127(18)
Chapter Objectives
127(1)
Understanding Data Warehouse Architecture
127(2)
Architecture: Definitions
127(1)
Architecture in Three Major Areas
128(1)
Distinguishing Characteristics
129(3)
Different Objectives and Scope
130(1)
Data Content
130(1)
Complex Analysis and Quick Response
131(1)
Flexible and Dynamic
131(1)
Metadata-driven
132(1)
Architecture Framework
132(2)
Architecture Supporting Flow of Data
132(1)
The Management and Control Module
133(1)
Technical Architecture
134(8)
Data Acquisition
135(3)
Data Storage
138(2)
Information Delivery
140(2)
Chapter Summary
142(1)
Review Questions
142(1)
Exercises
143(2)
Infrastructure as the Foundation for Data Warehousing
145(28)
Chapter Objectives
145(1)
Infrastructure Supporting Architecture
145(3)
Operational Infrastructure
147(1)
Physical Infrastructure
147(1)
Hardware and Operating Systems
148(16)
Platform Options
150(8)
Server Hardware
158(6)
Database Software
164(3)
Parallel Processing Options
164(2)
Selection of the DBMS
166(1)
Collection of Tools
167(3)
Architecture First, Then Tools
168(1)
Data Modeling
169(1)
Data Extraction
169(1)
Data Transformation
169(1)
Data Loading
169(1)
Data Quality
169(1)
Queries and Reports
170(1)
Online Analytical Processing (OLAP)
170(1)
Alert Systems
170(1)
Middleware and Connectivity
170(1)
Data Warehouse Management
170(1)
Chapter Summary
170(1)
Review Questions
171(1)
Exercises
171(2)
The Significant Role of Metadata
173(30)
Chapter Objectives
173(1)
Why Metadata is Important
173(10)
A Critical Need in the Data Warehouse
175(2)
Why Metadata is Vital for End-Users
177(2)
Why Metadata is Essential for IT
179(2)
Automation of Warehousing Tasks
181(2)
Establishing the Context of Information
183(1)
Metadata Types by Functional Areas
183(4)
Data Acquisition
184(2)
Data Storage
186(1)
Information Delivery
186(1)
Business Metadata
187(3)
Content Overview
188(1)
Examples of Business Metadata
188(1)
Content Highlights
189(1)
Who Benefits?
190(1)
Technical Metadata
190(3)
Content Overview
190(1)
Examples of Technical Metadata
191(1)
Content Highlights
192(1)
Who Benefits?
192(1)
How to Provide Metadata
193(7)
Metadata Requirements
193(1)
Sources of Metadata
194(2)
Challenges for Metadata Management
196(1)
Metadata Repository
196(2)
Metadata Integration and Standards
198(1)
Implementation Options
199(1)
Chapter Summary
200(1)
Review Questions
201(1)
Exercises
201(2)
Part 4 DATA DESIGN AND DATA PREPARATION
Principles of Dimensional Modeling
203(22)
Chapter Objectives
203(1)
From Requirements to Data Design
203(7)
Design Decisions
204(1)
Dimensional Modeling Basics
204(5)
E-R Modeling Versus Dimensional Modeling
209(1)
Use of CASE Tools
209(1)
The STAR Schema
210(8)
Review of a Simple STAR Schema
210(2)
Inside a Dimension Table
212(2)
Inside the Fact Table
214(2)
The Factless Fact Table
216(1)
Data Granularity
217(1)
STAR Schema Keys
218(2)
Primary Keys
218(1)
Surrogate Keys
219(1)
Foreign Keys
219(1)
Advantages of the STAR Schema
220(3)
Easy for Users to Understand
220(1)
Optimizes Navigation
221(1)
Most Suitable for Query Processing
222(1)
STARjoin and STARindex
223(1)
Chapter Summary
223(1)
Review Questions
224(1)
Exercises
224(1)
Dimensional Modeling: Advanced Topics
225(32)
Chapter Objectives
225(1)
Updates to the Dimension Tables
226(5)
Slowly Changing Dimensions
226(1)
Correction of Errors
227(1)
Preservation of History
228(2)
Tentative Soft Revisions
230(1)
Miscellaneous Dimensions
231(4)
Large Dimensions
231(2)
Rapidly Changing Dimensions
233(2)
Junk Dimensions
235(1)
The Snowflake Schema
235(4)
Options to Normalize
235(3)
Advantages and Disadvantages
238(1)
When to Snowflake
238(1)
Aggregate Fact Tables
239(10)
Fact Table Sizes
241(1)
Need for Aggregates
242(1)
Aggregating Fact Tables
243(4)
Aggregation Options
247(2)
Families of STARS
249(6)
Snapshot and Transaction Tables
250(1)
Core and Custom Tables
251(1)
Supporting Enterprise Value Chain or Value Circle
251(2)
Conforming Dimensions
253(1)
Standardizing Facts
254(1)
Summary of Family of STARS
254(1)
Chapter Summary
255(1)
Review Questions
255(1)
Exercises
256(1)
Data Extraction, Transformation, and Loading
257(34)
Chapter Objectives
257(1)
ETL Overview
258(4)
Most Important and Most Challenging
259(1)
Time-consuming and Arduous
260(1)
ETL Requirements and Steps
260(1)
Key Factors
261(1)
Data Extraction
262(9)
Source Identification
263(1)
Data Extraction Techniques
263(7)
Evaluation of the Techniques
270(1)
Data Transformation
271(8)
Data Transformation: Basic Tasks
272(1)
Major Transformation Types
273(2)
Data Integration and Consolidation
275(2)
Transformation for Dimension Attributes
277(1)
How to Implement Transformation
277(2)
Data Loading
279(6)
Applying Data: Techniques and Processes
280(2)
Data Refresh Versus Update
282(1)
Procedure for Dimension Tables
283(1)
Fact Tables: History and Incremental Loads
284(1)
ETL Summary
285(3)
ETL Tool Options
285(1)
Reemphasizing ETL Metadata
286(1)
ETL Summary and Approach
287(1)
Chapter Summary
288(1)
Review Questions
288(1)
Exercises
289(2)
Data Quality: A Key to Success
291(24)
Chapter Objectives
291(1)
Why is Data Quality Critical?
292(7)
What is Data Quality?
292(3)
Benefits of Improved Data Quality
295(1)
Types of Data Quality Problems
296(3)
Data Quality Challenges
299(4)
Sources of Data Pollution
299(2)
Validation of Names and Addresses
301(1)
Costs of Poor Data Quality
302(1)
Data Quality Tools
303(1)
Categories of Data Cleansing Tools
303(1)
Error Discovery Features
303(1)
Data Correction Features
303(1)
The DBMS for Quality Control
304(1)
Data Quality Initiative
304(7)
Data Cleansing Decisions
305(2)
Who Should be Responsible?
307(2)
The Purification Process
309(2)
Practical Tips on Data Quality
311(1)
Chapter Summary
311(1)
Review Questions
312(1)
Exercises
312(3)
Part 5 INFORMATION ACCESS AND DELIVERY
Matching Information to the Classes of Users
315(28)
Chapter Objectives
315(1)
Information from the Data Warehouse
316(7)
Data Warehouse Versus Operational Systems
316(2)
Information Potential
318(3)
User-Information Interface
321(2)
Industry Applications
323(1)
Who Will Use the Information?
323(6)
Classes of Users
323(3)
What They Need
326(3)
How to Provide Information
329(1)
Information Delivery
329(6)
Queries
331(1)
Reports
332(1)
Analysis
333(1)
Applications
334(1)
Information Delivery Tools
335(6)
The Desktop Environment
335(1)
Methodology for Tool Selection
335(3)
Tool Selection Criteria
338(2)
Information Delivery Framework
340(1)
Chapter Summary
341(1)
Review Questions
341(1)
Exercises
341(2)
OLAP in the Data Warehouse
343(34)
Chapter Objectives
343(1)
Demand for Online Analytical Processing
344(9)
Need for Multidimensional Analysis
344(1)
Fast Access and Powerful Calculations
345(2)
Limitations of Other Analysis Methods
347(2)
OLAP is the Answer
349(1)
OLAP Definitions and Rules
349(3)
OLAP Characteristics
352(1)
Major Features and Functions
353(10)
General Features
353(1)
Dimensional Analysis
353(4)
What are Hypercubes?
357(3)
Drill-Down and Roll-Up
360(2)
Slice-and-Dice or Rotation
362(1)
Uses and Benefits
363(1)
OLAP Models
363(5)
Overview of Variations
364(1)
The MOLAP Model
365(1)
The ROLAP Model
366(1)
ROLAP Versus MOLAP
367(1)
OLAP Implementation Considerations
368(6)
Data Design and Preparation
368(2)
Administration and Performance
370(2)
OLAP Platforms
372(1)
OLAP Tools and Products
373(1)
Implementation Steps
374(1)
Chapter Summary
374(1)
Review Questions
374(1)
Exercises
375(2)
Data Warehousing and the Web
377(22)
Chapter Objectives
377(1)
Web-Enabled Data Warehouse
378(5)
Why the Web?
378(2)
Convergence of Technologies
380(1)
Adapting the Data Warehouse for the Web
381(1)
The Web as a Data Source
382(1)
Web-Based Information Delivery
383(6)
Expanded Usage
383(2)
New Information Strategies
385(2)
Browser Technology for the Data Warehouse
387(2)
Security Issues
389(1)
OLAP and the Web
389(2)
Enterprise OLAP
389(1)
Web-OLAP Approaches
390(1)
OLAP Engine Design
390(1)
Building a Web-Enabled Data Warehouse
391(5)
Nature of the Data Webhouse
391(2)
Implementation Considerations
393(1)
Putting the Pieces Together
394(1)
Web Processing Model
394(2)
Chapter Summary
396(1)
Review Questions
396(1)
Exercises
396(3)
Data Mining Basics
399(30)
Chapter Objectives
399(1)
What is Data Mining?
400(8)
Data Mining Defined
401(1)
The Knowledge Discovery Process
402(2)
OLAP Versus Data Mining
404(2)
Data Mining and the Data Warehouse
406(2)
Major Data Mining Techniques
408(14)
Cluster Detection
409(2)
Decision Trees
411(2)
Memory-Based Reasoning
413(2)
Link Analysis
415(2)
Neural Networks
417(1)
Genetic Algorithms
418(1)
Moving into Data Mining
419(3)
Data Mining Applications
422(4)
Benefits of Data Mining
423(1)
Applications in Retail Industry
424(1)
Applications in Telecommunications Industry
425(1)
Applications in Banking and Finance
426(1)
Chapter Summary
426(1)
Review Questions
426(1)
Exercises
427(2)
Part 6 IMPLEMENTATION AND MAINTENANCE
The Physical Design Process
429(26)
Chapter Objectives
429(1)
Physical Design Steps
430(3)
Develop Standards
430(1)
Create Aggregates Plan
431(1)
Determine the Data Partitioning Scheme
431(1)
Establish Clustering Options
432(1)
Prepare an Indexing Strategy
432(1)
Assign Storage Structures
432(1)
Complete Physical Model
433(1)
Physical Design Considerations
433(5)
Physical Design Objectives
433(1)
From Logical Model to Physical Model
434(1)
Physical Model Components
435(1)
Significance of Standards
436(2)
Physical Storage
438(5)
Storage Area Data Structures
439(1)
Optimizing Storage
440(2)
Using RAID Technology
442(1)
Estimating Storage Sizes
442(1)
Indexing the Data Warehouse
443(6)
Indexing Overview
443(2)
B-Tree Index
445(1)
Bitmapped Index
446(2)
Clustered Indexes
448(1)
Indexing the Fact Table
448(1)
Indexing the Dimension Tables
449(1)
Performance Enhancement Techniques
449(3)
Data Partitioning
449(1)
Data Clustering
450(1)
Parallel Processing
450(1)
Summary Levels
451(1)
Referential Integrity Checks
451(1)
Initialization Parameters
451(1)
Data Arrays
452(1)
Chapter Summary
452(1)
Review Questions
452(1)
Exercises
453(2)
Data Warehouse Deployment
455(22)
Chapter Objectives
455(1)
Major Deployment Activities
456(6)
Complete User Acceptance
456(1)
Perform Initial Loads
457(1)
Get User Desktops Ready
458(1)
Complete Initial User Training
459(1)
Institute Initial User Support
460(1)
Deploy in Stages
460(2)
Considerations for a Pilot
462(5)
When Is a Pilot Data Mart Useful?
462(1)
Types of Pilot Projects
463(2)
Choosing the Pilot
465(1)
Expanding and Integrating the Pilot
466(1)
Security
467(3)
Security Policy
467(1)
Managing User Privileges
468(1)
Password Considerations
469(1)
Security Tools
469(1)
Backup and Recovery
470(3)
Why Back Up the Data Warehouse?
470(1)
Backup Strategy
471(1)
Setting Up a Practical Schedule
472(1)
Recovery
472(1)
Chapter Summary
473(1)
Review Questions
474(1)
Exercises
474(3)
Growth and Maintenance
477(16)
Chapter Objectives
477(1)
Monitoring the Data Warehouse
478(3)
Collection of Statistics
478(2)
Using Statistics for Growth Planning
480(1)
Using Statistics for Fine-Tuning
480(1)
Publishing Trends for Users
481(1)
User Training and Support
481(6)
User Training Content
482(1)
Preparing the Training Program
482(2)
Delivering the Training Program
484(1)
User Support
485(2)
Managing the Data Warehouse
487(3)
Platform Upgrades
487(1)
Managing Data Growth
488(1)
Storage Management
488(1)
ETL Management
489(1)
Data Model Revisions
489(1)
Information Delivery Enhancements
489(1)
Ongoing Fine-Tuning
490(1)
Chapter Summary
490(1)
Review Questions
491(1)
Exercises
491(2)
Appendix A. Project Life Cycle Steps and Checklists 493(4)
Appendix B. Critical Factors for Success 497(2)
Appendix C. Guidelines for Evaluating Vendor Solutions 499(2)
References 501(2)
Glossary 503(8)
Index 511
