Stochastic Learning and Optimization

Stochastic learning and optimization is a multidisciplinary subject with wide applications in modern engineering, social, and financial problems, including those in Internet and wireless communications, manufacturing, robotics, logistics, biomedical systems, and investment science. This book is unique in the following aspects.

(Four areas in one book) This book covers various disciplines in learning and optimization, including perturbation analysis (PA) of discrete-event dynamic systems, Markov decision processes (MDPs), reinforcement learning (RL), and adaptive control, within a unified framework.

(A simple approach to MDPs) This book introduces MDP theory through a simple approach based on performance difference formulas. This approach leads to results for nth-bias optimality under long-run average-cost criteria and for Blackwell optimality without discounting.

(Event-based optimization) This book introduces the recently developed event-based optimization approach, which opens up a research direction for overcoming or alleviating the curse of dimensionality by exploiting the system's special features.

(Sample-path construction) This book emphasizes physical interpretations based on the sample-path construction.

Preface	p. VII
Introduction	p. 1
An Overview of Learning and Optimization	p. 1
Problem Description	p. 1
Optimal Policies	p. 5
Fundamental Limitations of Learning and Optimization	p. 12
A Sensitivity-Based View of Learning and Optimization	p. 17
Problem Formulations in Different Disciplines	p. 19
Perturbation Analysis (PA)	p. 21
Markov Decision Processes (MDPs)	p. 26
Reinforcement Learning (RL)	p. 31
Identification and Adaptive Control (I&AC)	p. 34
Event-Based Optimization and Potential Aggregation	p. 37
A Map of the Learning and Optimization World	p. 41
Terminology and Notation	p. 42
Problems	p. 44
Four Disciplines in Learning and Optimization
Perturbation Analysis	p. 51
Perturbation Analysis of Markov Chains	p. 52
Constructing a Perturbed Sample Path	p. 53
Perturbation Realization Factors and Performance Potentials	p. 57
Performance Derivative Formulas	p. 64
Gradients with Discounted Reward Criteria	p. 68
Higher-Order Derivatives and the MacLaurin Series	p. 74
Performance Sensitivities of Markov Processes	p. 83
Performance Sensitivities of Semi-Markov Processes	p. 90
Fundamentals for Semi-Markov Processes	p. 90
Performance Sensitivity Formulas	p. 95
Perturbation Analysis of Queueing Systems	p. 102
Constructing a Perturbed Sample Path	p. 105
Perturbation Realization	p. 115
Performance Derivatives	p. 121
Remarks on Theoretical Issues	p. 125
Other Methods	p. 132
Problems	p. 137
Learning and Optimization with Perturbation Analysis	p. 147
The Potentials	p. 148
Numerical Methods	p. 148
Learning Potentials from Sample Paths	p. 151
Coupling	p. 156
Performance Derivatives	p. 161
Estimating through Potentials	p. 161
Learning Directly	p. 162
Optimization with PA	p. 172
Gradient Methods and Stochastic Approximation	p. 172
Optimization with Long Sample Paths	p. 174
Applications	p. 177
Problems	p. 177
Markov Decision Processes	p. 183
Ergodic Chains	p. 185
Policy Iteration	p. 186
Bias Optimality	p. 192
MDPs with Discounted Rewards	p. 201
Multi-Chains	p. 203
Policy Iteration	p. 205
Bias Optimality	p. 216
MDPs with Discounted Rewards	p. 226
The nth-Bias Optimization	p. 228
nth-Bias Difference Formulas	p. 229
Optimality Equations	p. 232
Policy Iteration	p. 240
nth-Bias Optimal Policy Spaces	p. 244
Problems	p. 246
Sample-Path-Based Policy Iteration	p. 253
Motivation	p. 254
Convergence Properties	p. 258
Convergence of Potential Estimates	p. 259
Sample Paths with a Fixed Number of Regenerative Periods	p. 260
Sample Paths with Increasing Lengths	p. 267
"Fast" Algorithms	p. 277
The Algorithm That Stops in a Finite Number of Periods	p. 278
With Stochastic Approximation	p. 282
Problems	p. 284
Reinforcement Learning	p. 289
Stochastic Approximation	p. 290
Finding the Zeros of a Function Recursively	p. 291
Estimating Mean Values	p. 297
Temporal Difference Methods	p. 298
TD Methods for Potentials	p. 298
Q-Factors and Other Extensions	p. 308
TD Methods for Performance Derivatives	p. 313
TD Methods and Performance Optimization	p. 318
PA-Based Optimization	p. 318
Q-Learning	p. 321
Optimistic On-Line Policy Iteration	p. 325
Value Iteration	p. 327
Summary of the Learning and Optimization Methods	p. 330
Problems	p. 333
Adaptive Control Problems as MDPs	p. 341
Control Problems and MDPs	p. 342
Control Systems Modelled as MDPs	p. 342
A Comparison of the Two Approaches	p. 345
MDPs with Continuous State Spaces	p. 353
Operators on Continuous Spaces	p. 354
Potentials and Policy Iteration	p. 359
Linear Control Systems and the Riccati Equation	p. 363
The LQ Problem	p. 363
The JLQ Problem	p. 368
On-Line Optimization and Adaptive Control	p. 373
Discretization and Estimation	p. 374
Discussion	p. 379
Problems	p. 381
The Event-Based Optimization - A New Approach
Event-Based Optimization of Markov Systems	p. 387
An Overview	p. 388
Summary of Previous Chapters	p. 388
An Overview of the Event-Based Approach	p. 390
Events Associated with Markov Chains	p. 398
The Event and Event Space	p. 400
The Probabilities of Events	p. 403
The Basic Ideas Illustrated by Examples	p. 407
Classification of Three Types of Events	p. 410
Event-Based Optimization	p. 414
The Problem Formulation	p. 414
Performance Difference Formulas	p. 417
Performance Derivative Formulas	p. 420
Optimization	p. 425
Learning: Estimating Aggregated Potentials	p. 429
Aggregated Potentials	p. 429
Aggregated Potentials in the Event-Based Optimization	p. 432
Applications and Examples	p. 434
Manufacturing	p. 434
Service Rate Control	p. 438
General Applications	p. 444
Problems	p. 446
Constructing Sensitivity Formulas	p. 455
Motivation	p. 455
Markov Chains on the Same State Space	p. 456
Event-Based Systems	p. 464
Sample-Path Construction	p. 464
Parameterized Systems: An Example	p. 467
Markov Chains with Different State Spaces	p. 470
One Is a Subspace of the Other	p. 470
A More General Case	p. 478
Summary	p. 482
Problems	p. 483
Appendices: Mathematical Background
Probability and Markov Processes	p. 491
Probability	p. 491
Markov Processes	p. 498
Problems	p. 504
Stochastic Matrices	p. 507
Canonical Form	p. 507
Eigenvalues	p. 508
The Limiting Matrix	p. 511
Problems	p. 516
Queueing Theory	p. 519
Single-Server Queues	p. 519
Queueing Networks	p. 524
Some Useful Techniques	p. 536
Problems	p. 538
Notation and Abbreviations	p. 543
References	p. 547
Index	p. 563
Table of Contents provided by Ingram. All Rights Reserved.

Stochastic Learning and Optimization

9780387367873

038736787X
