The accessible presentation of this book gives a general view of the entire computer vision enterprise and offers sufficient detail to build useful applications. Users learn techniques that have proven useful in first-hand experience, along with a wide range of mathematical methods. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
I. IMAGE FORMATION.
1. Radiometry—Measuring Light.
3. Sources, Shadows, and Shading.
II. IMAGE MODELS.
5. Geometric Camera Models.
6. Geometric Camera Calibration.
7. An Introduction to Probability.
III. EARLY VISION: JUST ONE IMAGE.
8. Linear Filters.
9. Edge Detection.
IV. EARLY VISION: MULTIPLE IMAGES.
11. The Geometry of Multiple Views.
13. Affine Structure from Motion.
14. Projective Structure from Motion.
V. MID-LEVEL VISION.
15. Segmentation by Clustering.
16. Segmentation by Fitting a Model.
17. Segmentation and Fitting Using Probabilistic Methods.
18. Tracking with Linear Dynamic Models.
19. Tracking with Non-Linear Dynamic Models.
VI. HIGH-LEVEL VISION: GEOMETRIC METHODS.
20. Model-Based Vision.
21. Smooth Surfaces and Their Outlines.
22. Aspect Graphs.
23. Range Data.
VII. HIGH-LEVEL VISION: PROBABILISTIC AND INFERENTIAL METHODS.
24. Finding Templates Using Classifiers.
25. Recognition by Relations between Templates.
26. Geometric Templates from Spatial Relations.
VIII. APPLICATIONS AND TOPICS.
27. Application: Finding in Digital Libraries.
28. Application: Image-Based Rendering.
Computer vision as a field is an intellectual frontier. Like any frontier, it is exciting and disorganised; there is often no reliable authority to appeal to--many useful ideas have no theoretical grounding, and some theories are useless in practice; developed areas are widely scattered, and often one looks completely inaccessible from the other. Nevertheless, we have attempted in this book to present a fairly orderly picture of the field.
We see computer vision--or just "vision"; apologies to those who study human or animal vision--as an enterprise that uses statistical methods to disentangle data using models constructed with the aid of geometry, physics and learning theory. Thus, in our view, vision relies on a solid understanding of cameras and of the physical process of image formation (part I of this book) to obtain simple inferences from individual pixel values (part II), combine the information available in multiple images into a coherent whole (part III), impose some order on groups of pixels to separate them from each other or infer shape information (part IV), and recognize objects using geometric information (part V) or probabilistic techniques (part VI). Computer vision has a wide variety of applications, old (e.g., mobile robot navigation, industrial inspection, and military intelligence) and new (e.g., human computer interaction, image retrieval in digital libraries, medical image analysis, and the realistic rendering of synthetic scenes in computer graphics). We discuss some of these applications in part VII.
WHY STUDY VISION?
Computer vision's great trick is extracting descriptions of the world from pictures or sequences of pictures. This is unequivocally useful. Taking pictures is usually non-destructive and sometimes discreet. It is also easy and (now) cheap. The descriptions that users seek can differ widely between applications.
For example, a technique known as structure from motion makes it possible to extract a representation of what is depicted and how the camera moved from a series of pictures. People in the entertainment industry use these techniques to build three-dimensional (3D) computer models of buildings, typically keeping the structure and throwing away the motion. These models are used where real buildings cannot be; they are set fire to, blown up, etc. Good, simple, accurate and convincing models can be built from quite small sets of photographs. People who wish to control mobile robots usually keep the motion and throw away the structure. This is because they generally know something about the area where the robot is working, but don't usually know the precise robot location in that area. They can determine it from information about how a camera bolted to the robot is moving.
There are a number of other, important applications of computer vision. One is in medical imaging: One builds software systems that can enhance imagery, or identify important phenomena or events, or visualize information obtained by imaging. Another is in inspection: One takes pictures of objects to determine whether they are within specification. A third is in interpreting satellite images, both for military purposes--a program might be required to determine what militarily interesting phenomena have occurred in a given region recently; or what damage was caused by a bombing--and for civilian purposes--what will this year's maize crop be? How much rainforest is left? A fourth is in organizing and structuring collections of pictures. We know how to search and browse text libraries (though this is a subject that still has difficult open questions) but don't really know what to do with image or video libraries.
Computer vision is at an extraordinary point in its development. The subject itself has been around since the 1960s, but it is only recently that it has been possible to build useful computer systems using ideas from computer vision.