Foundations of Databases The...

Summary

In a nutshell, a database management system is a software system that enables the creation, maintenance, and use of large amounts of data. In contrast with many programming applications, the logical data structure--the "database schema"--used to structure a given data set is usually much smaller than the volume of that set. Furthermore, the data is persistent, evolving over time and surviving multiple invocations of the database management software. To increase usability, concurrent access to the data is usually supported with specialized protocols that guarantee a form of noninterference between interleaved transactions. Importantly, modern database management systems embody a distinction between the logical level and the physical level. The logical level focuses on an abstract representation of the data, along with languages to create, query and modify it; the physical level focuses on the underlying implementation, including the physical layout used to store the data, the indexing and clustering schemes, and the concurrency and recovery protocols.

Database theory has developed primarily around the logical level of databases. (A notable exception is concurrency control, which is not addressed in this volume.) A core of fundamental material on the relational model has become well established. It consists primarily of three paradigms for query languages (algebraic, calculus-based, and deductive) and the theory of dependencies. The theory of query languages, including issues of expressiveness and complexity specific to databases, is well developed. The marriage between databases and logic programming produced deductive databases, with the main focus on the deductive query languages. Dependency theory focused initially on formalizing and applying the disparate integrity constraints that commonly arise in practice, and it went on to relate constraints with query optimization and to develop a unifying perspective for them.
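
As a rough illustration of the three query-language paradigms named above, the following Python sketch writes one small query in an algebraic, a calculus-like, and a rule-based (deductive) style. It is not taken from the book: the relations EMP and DEPT, the helper functions, and the query itself ("which employees work in a department housed in building b1?") are hypothetical, and the calculus version assumes quantification over the values that occur in the data.

```python
# Toy relations (assumed for illustration): EMP(name, dept), DEPT(dept, building).
EMP = {("ann", "sales"), ("bob", "research"), ("cho", "sales")}
DEPT = {("sales", "b1"), ("research", "b2")}


# --- Algebraic style: compose selection, join, and projection operators. ---
def select(rel, pred):
    """Keep the tuples of rel that satisfy pred."""
    return {t for t in rel if pred(t)}


def project(rel, cols):
    """Keep only the listed column positions of each tuple."""
    return {tuple(t[i] for i in cols) for t in rel}


def join(left, right, i, j):
    """Concatenate pairs of tuples whose columns left[i] and right[j] agree."""
    return {t + u for t in left for u in right if t[i] == u[j]}


def algebraic(emp, dept):
    in_b1 = select(dept, lambda t: t[1] == "b1")
    return project(join(emp, in_b1, 1, 0), [0])


# --- Calculus style: { x | exists d ( EMP(x, d) and DEPT(d, "b1") ) }. ---
def calculus(emp, dept):
    # Assumption: variables range over the values appearing in the relations.
    domain = {v for t in emp | dept for v in t}
    return {(x,) for x in domain
            if any((x, d) in emp and (d, "b1") in dept for d in domain)}


# --- Deductive style: one application of the rule
#         ans(X) :- emp(X, D), dept(D, "b1").
def deductive(emp, dept):
    return {(x,) for (x, d) in emp for (d2, b) in dept if d == d2 and b == "b1"}


if __name__ == "__main__":
    print(sorted(algebraic(EMP, DEPT)))
    print(sorted(calculus(EMP, DEPT)))
    print(sorted(deductive(EMP, DEPT)))
```

On this toy data the three formulations return the same set of answers. Deductive languages additionally allow rules to be recursive, which this single-rule sketch does not exercise.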

As a field, database theory draws on several areas, including mathematical logic, complexity, and programming languages. But the database context brings with it different assumptions, perspectives, and emphases. Relations can be viewed as predicates in the sense of logic, and the relational calculus as a specialization of the first-order predicate calculus. However, the database area emphasizes finite structures and has developed the notions of "domain independence" and "safety" to capture intuitive properties related to this finitude. The questions and techniques in dependency theory borrow heavily from logic, with a focus on practically motivated, relatively weak classes of sentences. Query languages provide an interesting contrast with conventional, imperative programming languages. Query languages typically embody a set-at-a-time focus as opposed to an object-at-a-time focus. Also, they are largely declarative in nature, and failing that, are more applicative than imperative. Because of the emphasis on tractability in the face of large volumes of data, there is considerable interest in query languages that do not have full computational power, which gives rise to a rich interplay between query languages and complexity theory. Specialized notions of complexity have arisen, stemming from the practical reality of large volumes of data and the theoretical interest in different query languages. Also, the important notion of "genericity," which captures a form of abstraction stemming from the separation of the logical and physical levels, has led to new perspectives on complexity theory, involving formalisms that circumvent the ordering of input data implicit in traditional Turing machines.
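
The contrast between an object-at-a-time and a set-at-a-time focus can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation EMPLOYEES and both function names are hypothetical and do not come from the book.

```python
# Hypothetical toy relation of (employee, department) pairs.
EMPLOYEES = {("ann", "sales"), ("bob", "research"), ("cho", "sales")}


def names_in_dept_object_at_a_time(relation, dept):
    """Imperative style: visit tuples one by one and accumulate the answer."""
    result = set()
    for name, d in relation:
        if d == dept:
            result.add(name)
    return result


def names_in_dept_set_at_a_time(relation, dept):
    """Declarative style: describe the answer set as a whole."""
    return {name for (name, d) in relation if d == dept}


if __name__ == "__main__":
    print(sorted(names_in_dept_object_at_a_time(EMPLOYEES, "sales")))
    print(sorted(names_in_dept_set_at_a_time(EMPLOYEES, "sales")))
```

Both functions compute the same set. The second states what the answer is without fixing an order in which tuples are visited, which is the sense in which query languages are described above as largely declarative.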

Exciting new research directions have continued to emerge in database theory, stemming primarily from unanswered questions about query languages and from an interest in expanding beyond the limitations of the relational model. Current research includes investigations motivated by connections with object-orientation, artificial intelligence, and graphics interfaces. And as the database field matures, it, in turn, influences adjacent areas in computer science, notably finite model theory, programming languages, and logic programming.

A Note on Style

This book deals with the theory that has developed around the logical level of databases. It has two main objectives: to provide a focused presentation of the core material and to present the essence of the advanced material in a unified framework. Some of the advanced material has never before been presented in book form. The presentation style is quite rigorous, in that precise definitions and statements of results are provided. However, our overriding concern was to make things simple for the reader, to get across the intuition and elegance of the concepts and proofs, rather than adhere to very strict criteria of rigor. Numerous examples, figures, and exercises should help clarify the development. Some of the proofs emphasize intuition and leave out much of the detail; we called such a proof a "crux." In this way we have tried to achieve a balance between formalism and intuition. As we went along, a two-tier style emerged, with a tendency toward more rigor in the exposition of the core material and more intuition in the presentation of advanced and tangential topics.

The book is aimed at an eclectic audience. Most broadly, it should be a useful resource for any computer scientist or mathematician who wishes to find out what database theory is about. Database researchers and practitioners should find it useful as a reference both to classical material and to advanced topics, otherwise scattered in sometimes hard-to-read papers. As a textbook, it is aimed at graduate students and seniors who would use the book as the main text in a database theory course or as complementary material in a database systems course. The book is fairly self-contained. Some needed background is provided concisely in the preliminaries, and the reader is told where more can be found.

We have attempted to make life easier for the database aficionado by adapting the material consistently to the database framework. This saves the reader the work of translating to the database framework, a task which is at best distracting and at worst tedious and confusing. Perhaps the most substantial difference from the conventional presentation arises in the case of logic programming, where the deductive database point of view has dramatic impact. Mostly, things become much simpler because there are no function symbols. However, for some questions, such as expressive power and complexity, the conventional approach is simply inapplicable. The book also maintains a strong focus on database theory issues--tangential material from adjacent areas has been kept to an absolute minimum.

Results are attributed to their sources in bibliographical notes, included at the end of each chapter.

Organization of This Book

The outline of this book is as follows. Part A contains preliminaries, and a brief introduction to databases and the relational model.

The basic material on relational query languages is developed in Part

Author Biography

Serge Abiteboul received his Ph.D. in computer science from the University of Southern California, and is now a directeur de recherche at I.N.R.I.A. in France.

About Richard Hull

Richard Hull received his Ph.D. in mathematics from the University of California, Berkeley. He is an associate professor at the University of Southern California in Computer Science, currently on leave and visiting the University of Colorado, Boulder.

Victor Vianu received his Ph.D. in computer science from the University of Southern California and is currently professor of Computer Science at the University of California, San Diego. He spends a portion of each year visiting the I.N.R.I.A.

Antechamber
Database Systems
The Main Principles
Functionalities
Complexity and Diversity
Past and Future
Ties with This Book
Bibliographic Notes
Theoretical Background
Some Basics
Languages, Computability, and Complexity
Basics from Logic
The Relational Model
The Structure of the Relational Model
Named versus Unnamed Perspectives
Notation
Bibliographic Notes
Basics: Relational Query Languages
Conjunctive Queries
Getting Started
Logic-Based Perspectives
Query Composition and Views
Algebraic Perspectives
Adding Union
Bibliographic Notes
Exercises
Adding Negation: Algebra and Calculus
The Relational Algebras
Nonrecursive Datalog with Negation
The Relational Calculus
Syntactic Restrictions for Domain Independence
Aggregate Functions
Digression: Finite Representations of Infinite Databases
Bibliographic Notes
Exercises
Static Analysis and Optimization
Issues in Practical Query Optimization
Global Optimization
Static Analysis of the Relational Calculus
Computing with Acyclic Joins
Bibliographic Notes
Exercises
Notes on Practical Languages
SQL: The Structured Query Language
Query-by-Example and Microsoft Access
Confronting the Real World
Bibliographic Notes
Exercises
Constraints
Functional and Join Dependency
Motivation
Functional and Key Dependencies
Join and Multivalued Dependencies
The Chase
Bibliographic Notes
Exercises
Inclusion Dependency
Inclusion Dependency in Isolation
Finite versus Infinite Implication
Nonaxiomatizability of fd's + ind's
Restricted Kinds of Inclusion Dependency
Bibliographic Notes
Exercises
A Larger Perspective
A Unifying Framework
The Chase Revisited
Axiomatization
An Algebraic Perspective
Bibliographic Notes
Exercises
Design and Dependencies
Semantic Data Models
Normal Forms
Universal Relation Assumption
Bibliographic Notes
Exercises
Datalog and Recursion
Datalog
Syntax of Datalog
Model-Theoretic Semantics
Fixpoint Semantics
Proof-Theoretic Approach
Static Program Analysis
Bibliographic Notes
Exercises
Evaluation of Datalog
Seminaive Evaluation
Top-Down Techniques
Magic
Two Improvements
Bibliographic Notes
Exercises
Recursion and Negation
Algebra + While
Calculus + Fixpoint
Datalog with Negation
Equivalence
Recursion in Practical Languages
Bibliographic Notes
Exercises
Negation in Datalog
The Basic Problem
Stratified Semantics
Well-Founded Semantics
Expressive Power
Negation as Failure in Brief
Bibliographic Notes
Exercises
Expressiveness and Complexity
Sizing up Languages
Queries
Compl
Table of Contents provided by Publisher. All Rights Reserved.

Excerpts

Database theory is a relative newcomer to the field of computer science. Early data management systems were based on techniques from several classical areas of computer science, ranging from hardware and operating systems to data structures and programming languages. In the early seventies, a leap of abstraction from file systems produced relational databases and their accompanying theory, with logic as the catalyst. We believe that database theory has matured--that it has emerged as an elegant and robust part of science with its own identity. As such, it embodies its own peculiar brand of wisdom that deserves to be communicated not just to insiders, but to the computer science community at large.