What is included with this book?
Preface to the Second Edition | p. xi |
Preface to the First Edition | p. xv |
Acknowledgments | p. xvii |
The Challenges of Dynamic Programming | p. 1 |
A Dynamic Programming Example: A Shortest Path Problem | p. 2 |
The Three Curses of Dimensionality | p. 3 |
Some Real Applications | p. 6 |
Problem Classes | p. 11 |
The Many Dialects of Dynamic Programming | p. 15 |
What Is New in This Book? | p. 17 |
Pedagogy | p. 19 |
Bibliographic Notes | p. 22 |
Some Illustrative Models | p. 25 |
Deterministic Problems | p. 26 |
Stochastic Problems | p. 31 |
Information Acquisition Problems | p. 47 |
A Simple Modeling Framework for Dynamic Programs | p. 50 |
Bibliographic Notes | p. 54 |
Problems | p. 54 |
Introduction to Markov Decision Processes | p. 57 |
The Optimality Equations | p. 58 |
Finite Horizon Problems | p. 65 |
Infinite Horizon Problems | p. 66 |
Value Iteration | p. 68 |
Policy Iteration | p. 74 |
Hybrid Value-Policy Iteration | p. 75 |
Average Reward Dynamic Programming | p. 76 |
The Linear Programming Method for Dynamic Programs | p. 77 |
Monotone Policies* | p. 78 |
Why Does It Work?** | p. 84 |
Bibliographic Notes | p. 103 |
Problems | p. 103 |
Introduction to Approximate Dynamic Programming | p. 111 |
The Three Curses of Dimensionality (Revisited) | p. 112 |
The Basic Idea | p. 114 |
Q-Learning and SARSA | p. 122 |
Real-Time Dynamic Programming | p. 126 |
Approximate Value Iteration | p. 127 |
The Post-Decision State Variable | p. 129 |
Low-Dimensional Representations of Value Functions | p. 144 |
So Just What Is Approximate Dynamic Programming? | p. 146 |
Experimental Issues | p. 149 |
But Does It Work? | p. 155 |
Bibliographic Notes | p. 156 |
Problems | p. 158 |
Modeling Dynamic Programs | p. 167 |
Notational Style | p. 169 |
Modeling Time | p. 170 |
Modeling Resources | p. 174 |
The States of Our System | p. 178 |
Modeling Decisions | p. 187 |
The Exogenous Information Process | p. 189 |
The Transition Function | p. 198 |
The Objective Function | p. 206 |
A Measure-Theoretic View of Information** | p. 211 |
Bibliographic Notes | p. 213 |
Problems | p. 214 |
Policies | p. 221 |
Myopic Policies | p. 224 |
Lookahead Policies | p. 224 |
Policy Function Approximations | p. 232 |
Value Function Approximations | p. 235 |
Hybrid Strategies | p. 239 |
Randomized Policies | p. 242 |
How to Choose a Policy? | p. 244 |
Bibliographic Notes | p. 247 |
Problems | p. 247 |
Policy Search | p. 249 |
Background | p. 250 |
Gradient Search | p. 253 |
Direct Policy Search for Finite Alternatives | p. 256 |
The Knowledge Gradient Algorithm for Discrete Alternatives | p. 262 |
Simulation Optimization | p. 270 |
Why Does It Work?** | p. 274 |
Bibliographic Notes | p. 285 |
Problems | p. 286 |
Approximating Value Functions | p. 289 |
Lookup Tables and Aggregation | p. 290 |
Parametric Models | p. 304 |
Regression Variations | p. 314 |
Nonparametric Models | p. 316 |
Approximations and the Curse of Dimensionality | p. 325 |
Why Does It Work?** | p. 328 |
Bibliographic Notes | p. 333 |
Problems | p. 334 |
Learning Value Function Approximations | p. 337 |
Sampling the Value of a Policy | p. 337 |
Stochastic Approximation Methods | p. 347 |
Recursive Least Squares for Linear Models | p. 349 |
Temporal Difference Learning with a Linear Model | p. 356 |
Bellman's Equation Using a Linear Model | p. 358 |
Analysis of TD(0), LSTD, and LSPE Using a Single State | p. 364 |
Gradient-Based Methods for Approximate Value Iteration* | p. 366 |
Least Squares Temporal Differencing with Kernel Regression* | p. 371 |
Value Function Approximations Based on Bayesian Learning* | p. 373 |
Why Does It Work?** | p. 376 |
Bibliographic Notes | p. 379 |
Problems | p. 381 |
Optimizing While Learning | p. 383 |
Overview of Algorithmic Strategies | p. 385 |
Approximate Value Iteration and Q-Learning Using Lookup Tables | p. 386 |
Statistical Bias in the Max Operator | p. 397 |
Approximate Value Iteration and Q-Learning Using Linear Models | p. 400 |
Approximate Policy Iteration | p. 402 |
The Actor-Critic Paradigm | p. 408 |
Policy Gradient Methods | p. 410 |
The Linear Programming Method Using Basis Functions | p. 411 |
Approximate Policy Iteration Using Kernel Regression* | p. 413 |
Finite Horizon Approximations for Steady-State Applications | p. 415 |
Bibliographic Notes | p. 416 |
Problems | p. 418 |
Adaptive Estimation and Stepsizes | p. 419 |
Learning Algorithms and Stepsizes | p. 420 |
Deterministic Stepsize Recipes | p. 425 |
Stochastic Stepsizes | p. 433 |
Optimal Stepsizes for Nonstationary Time Series | p. 437 |
Optimal Stepsizes for Approximate Value Iteration | p. 447 |
Convergence | p. 449 |
Guidelines for Choosing Stepsize Formulas | p. 451 |
Bibliographic Notes | p. 452 |
Problems | p. 453 |
Exploration Versus Exploitation | p. 457 |
A Learning Exercise: The Nomadic Trucker | p. 457 |
An Introduction to Learning | p. 460 |
Heuristic Learning Policies | p. 464 |
Gittins Indexes for Online Learning | p. 470 |
The Knowledge Gradient Policy | p. 477 |
Learning with a Physical State | p. 482 |
Bibliographic Notes | p. 492 |
Problems | p. 493 |
Value Function Approximations for Resource Allocation Problems | p. 497 |
Value Functions versus Gradients | p. 498 |
Linear Approximations | p. 499 |
Piecewise-Linear Approximations | p. 501 |
Solving a Resource Allocation Problem Using Piecewise-Linear Functions | p. 505 |
The SHAPE Algorithm | p. 509 |
Regression Methods | p. 513 |
Cutting Planes* | p. 516 |
Why Does It Work?** | p. 528 |
Bibliographic Notes | p. 535 |
Problems | p. 536 |
Dynamic Resource Allocation Problems | p. 541 |
An Asset Acquisition Problem | p. 541 |
The Blood Management Problem | p. 547 |
A Portfolio Optimization Problem | p. 557 |
A General Resource Allocation Problem | p. 560 |
A Fleet Management Problem | p. 573 |
A Driver Management Problem | p. 580 |
Bibliographic Notes | p. 585 |
Problems | p. 586 |
Implementation Challenges | p. 593 |
Will ADP Work for Your Problem? | p. 593 |
Designing an ADP Algorithm for Complex Problems | p. 594 |
Debugging an ADP Algorithm | p. 596 |
Practical Issues | p. 597 |
Modeling Your Problem | p. 602 |
Online versus Offline Models | p. 604 |
If It Works, Patent It! | p. 606 |
Bibliography | p. 607 |
Index | p. 623 |
Table of Contents provided by Publisher. All Rights Reserved.
The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine whether it should include any access cards, study guides, lab manuals, CDs, etc.
Used, Rental, and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states that it includes access cards, study guides, lab manuals, CDs, etc.