What is included with this book?
Preface to the Second Edition | p. xi |
Preface to the First Edition | p. xv |
Acknowledgments | p. xvii |
The Challenges of Dynamic Programming | p. 1 |
A Dynamic Programming Example: A Shortest Path Problem | p. 2 |
The Three Curses of Dimensionality | p. 3 |
Some Real Applications | p. 6 |
Problem Classes | p. 11 |
The Many Dialects of Dynamic Programming | p. 15 |
What Is New in This Book? | p. 17 |
Pedagogy | p. 19 |
Bibliographic Notes | p. 22 |
Some Illustrative Models | p. 25 |
Deterministic Problems | p. 26 |
Stochastic Problems | p. 31 |
Information Acquisition Problems | p. 47 |
A Simple Modeling Framework for Dynamic Programs | p. 50 |
Bibliographic Notes | p. 54 |
Problems | p. 54 |
Introduction to Markov Decision Processes | p. 57 |
The Optimality Equations | p. 58 |
Finite Horizon Problems | p. 65 |
Infinite Horizon Problems | p. 66 |
Value Iteration | p. 68 |
Policy Iteration | p. 74 |
Hybrid Value-Policy Iteration | p. 75 |
Average Reward Dynamic Programming | p. 76 |
The Linear Programming Method for Dynamic Programs | p. 77 |
Monotone Policies* | p. 78 |
Why Does It Work?** | p. 84 |
Bibliographic Notes | p. 103 |
Problems | p. 103 |
Introduction to Approximate Dynamic Programming | p. 111 |
The Three Curses of Dimensionality (Revisited) | p. 112 |
The Basic Idea | p. 114 |
Q-Learning and SARSA | p. 122 |
Real-Time Dynamic Programming | p. 126 |
Approximate Value Iteration | p. 127 |
The Post-Decision State Variable | p. 129 |
Low-Dimensional Representations of Value Functions | p. 144 |
So Just What Is Approximate Dynamic Programming? | p. 146 |
Experimental Issues | p. 149 |
But Does It Work? | p. 155 |
Bibliographic Notes | p. 156 |
Problems | p. 158 |
Modeling Dynamic Programs | p. 167 |
Notational Style | p. 169 |
Modeling Time | p. 170 |
Modeling Resources | p. 174 |
The States of Our System | p. 178 |
Modeling Decisions | p. 187 |
The Exogenous Information Process | p. 189 |
The Transition Function | p. 198 |
The Objective Function | p. 206 |
A Measure-Theoretic View of Information** | p. 211 |
Bibliographic Notes | p. 213 |
Problems | p. 214 |
Policies | p. 221 |
Myopic Policies | p. 224 |
Lookahead Policies | p. 224 |
Policy Function Approximations | p. 232 |
Value Function Approximations | p. 235 |
Hybrid Strategies | p. 239 |
Randomized Policies | p. 242 |
How to Choose a Policy? | p. 244 |
Bibliographic Notes | p. 247 |
Problems | p. 247 |
Policy Search | p. 249 |
Background | p. 250 |
Gradient Search | p. 253 |
Direct Policy Search for Finite Alternatives | p. 256 |
The Knowledge Gradient Algorithm for Discrete Alternatives | p. 262 |
Simulation Optimization | p. 270 |
Why Does It Work?** | p. 274 |
Bibliographic Notes | p. 285 |
Problems | p. 286 |
Approximating Value Functions | p. 289 |
Lookup Tables and Aggregation | p. 290 |
Parametric Models | p. 304 |
Regression Variations | p. 314 |
Nonparametric Models | p. 316 |
Approximations and the Curse of Dimensionality | p. 325 |
Why Does It Work?** | p. 328 |
Bibliographic Notes | p. 333 |
Problems | p. 334 |
Learning Value Function Approximations | p. 337 |
Sampling the Value of a Policy | p. 337 |
Stochastic Approximation Methods | p. 347 |
Recursive Least Squares for Linear Models | p. 349 |
Temporal Difference Learning with a Linear Model | p. 356 |
Bellman's Equation Using a Linear Model | p. 358 |
Analysis of TD(0), LSTD, and LSPE Using a Single State | p. 364 |
Gradient-Based Methods for Approximate Value Iteration* | p. 366 |
Least Squares Temporal Differencing with Kernel Regression* | p. 371 |
Value Function Approximations Based on Bayesian Learning* | p. 373 |
Why Does It Work?** | p. 376 |
Bibliographic Notes | p. 379 |
Problems | p. 381 |
Optimizing While Learning | p. 383 |
Overview of Algorithmic Strategies | p. 385 |
Approximate Value Iteration and Q-Learning Using Lookup Tables | p. 386 |
Statistical Bias in the Max Operator | p. 397 |
Approximate Value Iteration and Q-Learning Using Linear Models | p. 400 |
Approximate Policy Iteration | p. 402 |
The Actor-Critic Paradigm | p. 408 |
Policy Gradient Methods | p. 410 |
The Linear Programming Method Using Basis Functions | p. 411 |
Approximate Policy Iteration Using Kernel Regression* | p. 413 |
Finite Horizon Approximations for Steady-State Applications | p. 415 |
Bibliographic Notes | p. 416 |
Problems | p. 418 |
Adaptive Estimation and Stepsizes | p. 419 |
Learning Algorithms and Stepsizes | p. 420 |
Deterministic Stepsize Recipes | p. 425 |
Stochastic Stepsizes | p. 433 |
Optimal Stepsizes for Nonstationary Time Series | p. 437 |
Optimal Stepsizes for Approximate Value Iteration | p. 447 |
Convergence | p. 449 |
Guidelines for Choosing Stepsize Formulas | p. 451 |
Bibliographic Notes | p. 452 |
Problems | p. 453 |
Exploration Versus Exploitation | p. 457 |
A Learning Exercise: The Nomadic Trucker | p. 457 |
An Introduction to Learning | p. 460 |
Heuristic Learning Policies | p. 464 |
Gittins Indexes for Online Learning | p. 470 |
The Knowledge Gradient Policy | p. 477 |
Learning with a Physical State | p. 482 |
Bibliographic Notes | p. 492 |
Problems | p. 493 |
Value Function Approximations for Resource Allocation Problems | p. 497 |
Value Functions versus Gradients | p. 498 |
Linear Approximations | p. 499 |
Piecewise-Linear Approximations | p. 501 |
Solving a Resource Allocation Problem Using Piecewise-Linear Functions | p. 505 |
The SHAPE Algorithm | p. 509 |
Regression Methods | p. 513 |
Cutting Planes* | p. 516 |
Why Does It Work?** | p. 528 |
Bibliographic Notes | p. 535 |
Problems | p. 536 |
Dynamic Resource Allocation Problems | p. 541 |
An Asset Acquisition Problem | p. 541 |
The Blood Management Problem | p. 547 |
A Portfolio Optimization Problem | p. 557 |
A General Resource Allocation Problem | p. 560 |
A Fleet Management Problem | p. 573 |
A Driver Management Problem | p. 580 |
Bibliographic Notes | p. 585 |
Problems | p. 586 |
Implementation Challenges | p. 593 |
Will ADP Work for Your Problem? | p. 593 |
Designing an ADP Algorithm for Complex Problems | p. 594 |
Debugging an ADP Algorithm | p. 596 |
Practical Issues | p. 597 |
Modeling Your Problem | p. 602 |
Online versus Offline Models | p. 604 |
If It Works, Patent It! | p. 606 |
Bibliography | p. 607 |
Index | p. 623 |
Table of Contents provided by Publisher. All Rights Reserved.
The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine whether it should include any access cards, study guides, lab manuals, CDs, etc.
Used, Rental, and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states that it includes access cards, study guides, lab manuals, CDs, etc.