Approximate Dynamic Programming: Solving the Curses of Dimensionality

  • Edition: 2nd
  • Format: Hardcover
  • Copyright: 2011-09-27
  • Publisher: Wiley





Understanding approximate dynamic programming (ADP) in large industrial settings helps develop practical, high-quality solutions to problems that involve making decisions in the presence of uncertainty. With a focus on modeling and algorithms, presented in the language of mainstream operations research, artificial intelligence, and control theory, this second edition of Approximate Dynamic Programming: Solving the Curses of Dimensionality uniquely integrates four distinct disciplines (Markov decision processes, mathematical programming, simulation, and statistics) to show students, practitioners, and researchers how to successfully model and solve a wide range of real-life problems using ADP.
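The book's first chapter motivates dynamic programming with a shortest-path example. As a minimal sketch (the graph and edge costs here are invented for illustration, not taken from the book), classical value iteration solves such a problem exactly by applying Bellman's equation over every state; it is precisely this exhaustive enumeration over states, actions, and outcomes that breaks down in high dimensions and that ADP is designed to avoid:

```python
# Value iteration on a tiny deterministic shortest-path problem.
# Edge costs: costs[node] = {successor: cost}; "D" is the destination.
costs = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.0, "D": 6.0},
    "C": {"D": 2.0},
    "D": {},
}

# V[s] = cost-to-go from s; the destination costs nothing.
V = {node: 0.0 if node == "D" else float("inf") for node in costs}

# A few full sweeps suffice on a graph this small.
for _ in range(len(costs)):
    for node, successors in costs.items():
        if successors:
            # Bellman's equation: V(s) = min over actions of c(s, s') + V(s')
            V[node] = min(c + V[nxt] for nxt, c in successors.items())

print(V)  # cost-to-go from each node to D
```

Each sweep touches every state once; with a handful of nodes that is trivial, but when the state is a high-dimensional vector the number of states (and hence the cost of each sweep) explodes, which is the first of the book's "three curses of dimensionality."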

Author Biography

WARREN B. POWELL, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell has authored more than 160 published articles on stochastic optimization, approximate dynamic programming, and dynamic resource management.

Table of Contents

Preface to the Second Edition (p. xi)
Preface to the First Edition (p. xv)
Acknowledgments (p. xvii)

1. The Challenges of Dynamic Programming (p. 1)
   A Dynamic Programming Example: A Shortest Path Problem (p. 2)
   The Three Curses of Dimensionality (p. 3)
   Some Real Applications (p. 6)
   Problem Classes (p. 11)
   The Many Dialects of Dynamic Programming (p. 15)
   What Is New in This Book? (p. 17)
   Pedagogy (p. 19)
   Bibliographic Notes (p. 22)

2. Some Illustrative Models (p. 25)
   Deterministic Problems (p. 26)
   Stochastic Problems (p. 31)
   Information Acquisition Problems (p. 47)
   A Simple Modeling Framework for Dynamic Programs (p. 50)
   Bibliographic Notes (p. 54)
   Problems (p. 54)

3. Introduction to Markov Decision Processes (p. 57)
   The Optimality Equations (p. 58)
   Finite Horizon Problems (p. 65)
   Infinite Horizon Problems (p. 66)
   Value Iteration (p. 68)
   Policy Iteration (p. 74)
   Hybrid Value-Policy Iteration (p. 75)
   Average Reward Dynamic Programming (p. 76)
   The Linear Programming Method for Dynamic Programs (p. 77)
   Monotone Policies* (p. 78)
   Why Does It Work?** (p. 84)
   Bibliographic Notes (p. 103)
   Problems (p. 103)

4. Introduction to Approximate Dynamic Programming (p. 111)
   The Three Curses of Dimensionality (Revisited) (p. 112)
   The Basic Idea (p. 114)
   Q-Learning and SARSA (p. 122)
   Real-Time Dynamic Programming (p. 126)
   Approximate Value Iteration (p. 127)
   The Post-Decision State Variable (p. 129)
   Low-Dimensional Representations of Value Functions (p. 144)
   So Just What Is Approximate Dynamic Programming? (p. 146)
   Experimental Issues (p. 149)
   But Does It Work? (p. 155)
   Bibliographic Notes (p. 156)
   Problems (p. 158)

5. Modeling Dynamic Programs (p. 167)
   Notational Style (p. 169)
   Modeling Time (p. 170)
   Modeling Resources (p. 174)
   The States of Our System (p. 178)
   Modeling Decisions (p. 187)
   The Exogenous Information Process (p. 189)
   The Transition Function (p. 198)
   The Objective Function (p. 206)
   A Measure-Theoretic View of Information** (p. 211)
   Bibliographic Notes (p. 213)
   Problems (p. 214)

6. Policies (p. 221)
   Myopic Policies (p. 224)
   Lookahead Policies (p. 224)
   Policy Function Approximations (p. 232)
   Value Function Approximations (p. 235)
   Hybrid Strategies (p. 239)
   Randomized Policies (p. 242)
   How to Choose a Policy? (p. 244)
   Bibliographic Notes (p. 247)
   Problems (p. 247)

7. Policy Search (p. 249)
   Background (p. 250)
   Gradient Search (p. 253)
   Direct Policy Search for Finite Alternatives (p. 256)
   The Knowledge Gradient Algorithm for Discrete Alternatives (p. 262)
   Simulation Optimization (p. 270)
   Why Does It Work?** (p. 274)
   Bibliographic Notes (p. 285)
   Problems (p. 286)

8. Approximating Value Functions (p. 289)
   Lookup Tables and Aggregation (p. 290)
   Parametric Models (p. 304)
   Regression Variations (p. 314)
   Nonparametric Models (p. 316)
   Approximations and the Curse of Dimensionality (p. 325)
   Why Does It Work?** (p. 328)
   Bibliographic Notes (p. 333)
   Problems (p. 334)

9. Learning Value Function Approximations (p. 337)
   Sampling the Value of a Policy (p. 337)
   Stochastic Approximation Methods (p. 347)
   Recursive Least Squares for Linear Models (p. 349)
   Temporal Difference Learning with a Linear Model (p. 356)
   Bellman's Equation Using a Linear Model (p. 358)
   Analysis of TD(0), LSTD, and LSPE Using a Single State (p. 364)
   Gradient-Based Methods for Approximate Value Iteration* (p. 366)
   Least Squares Temporal Differencing with Kernel Regression* (p. 371)
   Value Function Approximations Based on Bayesian Learning* (p. 373)
   Why Does It Work?** (p. 376)
   Bibliographic Notes (p. 379)
   Problems (p. 381)

10. Optimizing While Learning (p. 383)
    Overview of Algorithmic Strategies (p. 385)
    Approximate Value Iteration and Q-Learning Using Lookup Tables (p. 386)
    Statistical Bias in the Max Operator (p. 397)
    Approximate Value Iteration and Q-Learning Using Linear Models (p. 400)
    Approximate Policy Iteration (p. 402)
    The Actor-Critic Paradigm (p. 408)
    Policy Gradient Methods (p. 410)
    The Linear Programming Method Using Basis Functions (p. 411)
    Approximate Policy Iteration Using Kernel Regression* (p. 413)
    Finite Horizon Approximations for Steady-State Applications (p. 415)
    Bibliographic Notes (p. 416)
    Problems (p. 418)

11. Adaptive Estimation and Stepsizes (p. 419)
    Learning Algorithms and Stepsizes (p. 420)
    Deterministic Stepsize Recipes (p. 425)
    Stochastic Stepsizes (p. 433)
    Optimal Stepsizes for Nonstationary Time Series (p. 437)
    Optimal Stepsizes for Approximate Value Iteration (p. 447)
    Convergence (p. 449)
    Guidelines for Choosing Stepsize Formulas (p. 451)
    Bibliographic Notes (p. 452)
    Problems (p. 453)

12. Exploration Versus Exploitation (p. 457)
    A Learning Exercise: The Nomadic Trucker (p. 457)
    An Introduction to Learning (p. 460)
    Heuristic Learning Policies (p. 464)
    Gittins Indexes for Online Learning (p. 470)
    The Knowledge Gradient Policy (p. 477)
    Learning with a Physical State (p. 482)
    Bibliographic Notes (p. 492)
    Problems (p. 493)

13. Value Function Approximations for Resource Allocation Problems (p. 497)
    Value Functions versus Gradients (p. 498)
    Linear Approximations (p. 499)
    Piecewise-Linear Approximations (p. 501)
    Solving a Resource Allocation Problem Using Piecewise-Linear Functions (p. 505)
    The SHAPE Algorithm (p. 509)
    Regression Methods (p. 513)
    Cutting Planes* (p. 516)
    Why Does It Work?** (p. 528)
    Bibliographic Notes (p. 535)
    Problems (p. 536)

14. Dynamic Resource Allocation Problems (p. 541)
    An Asset Acquisition Problem (p. 541)
    The Blood Management Problem (p. 547)
    A Portfolio Optimization Problem (p. 557)
    A General Resource Allocation Problem (p. 560)
    A Fleet Management Problem (p. 573)
    A Driver Management Problem (p. 580)
    Bibliographic Notes (p. 585)
    Problems (p. 586)

15. Implementation Challenges (p. 593)
    Will ADP Work for Your Problem? (p. 593)
    Designing an ADP Algorithm for Complex Problems (p. 594)
    Debugging an ADP Algorithm (p. 596)
    Practical Issues (p. 597)
    Modeling Your Problem (p. 602)
    Online versus Offline Models (p. 604)
    If It Works, Patent It! (p. 606)

Bibliography (p. 607)
Index (p. 623)
Table of Contents provided by Publisher. All Rights Reserved.
