
Stochastic Learning and Optimization

by Xi-Ren Cao
  • ISBN13: 9780387367873
  • ISBN10: 038736787X
  • Format: Hardcover
  • Copyright: 2007-09-21
  • Publisher: Springer-Verlag New York Inc

Summary

Stochastic learning and optimization is a multidisciplinary subject with wide applications in modern engineering, social, and financial problems, including those in Internet and wireless communications, manufacturing, robotics, logistics, biomedical systems, and investment science. This book is unique in the following respects.

  • Four areas in one book: it covers several disciplines in learning and optimization, including perturbation analysis (PA) of discrete-event dynamic systems, Markov decision processes (MDPs), reinforcement learning (RL), and adaptive control, within a unified framework.
  • A simple approach to MDPs: it introduces MDP theory through a simple approach based on performance difference formulas. This approach leads to results on nth-bias optimality with long-run average-cost criteria and Blackwell optimality without discounting.
  • Event-based optimization: it introduces the recently developed event-based optimization approach, which opens up a research direction for overcoming or alleviating the difficulties caused by the curse of dimensionality by exploiting a system's special features.
  • Sample-path construction: it emphasizes physical interpretations based on sample-path construction.
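To make the potential-based approach the summary describes concrete, here is a minimal sketch (not taken from the book) of average-reward policy iteration driven by performance potentials: the potentials g of the current policy are obtained from the Poisson equation (I - P)g = f - eta*1, and each state's action is improved by maximizing f(i,a) + sum_j P_a(i,j) g(j). The two-state, two-action MDP below is a hypothetical toy example using NumPy.

```python
# A hedged sketch of potential-based policy iteration for an
# average-reward MDP; the toy MDP data below are hypothetical.
import numpy as np

# P[a][i, j]: transition probability from state i to j under action a (ergodic)
P = [np.array([[0.9, 0.1],
               [0.2, 0.8]]),
     np.array([[0.5, 0.5],
               [0.7, 0.3]])]
# f[a][i]: one-step reward in state i under action a
f = [np.array([1.0, 0.0]),
     np.array([0.5, 2.0])]

def stationary(Pd):
    """Stationary distribution pi of an ergodic chain: pi P = pi, pi 1 = 1."""
    n = Pd.shape[0]
    A = np.vstack([Pd.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

def potentials(Pd, fd):
    """Average reward eta and potentials g solving (I - P) g = f - eta*1.

    Uses g = (I - P + 1 pi)^{-1} f, which satisfies the Poisson equation
    with the normalization pi g = eta.
    """
    n = Pd.shape[0]
    pi = stationary(Pd)
    g = np.linalg.solve(np.eye(n) - Pd + np.outer(np.ones(n), pi), fd)
    return pi @ fd, g

def policy_iteration(P, f, n_states=2):
    d = np.zeros(n_states, dtype=int)          # start with action 0 everywhere
    while True:
        # Restrict P and f to the current policy d
        Pd = np.array([P[d[i]][i] for i in range(n_states)])
        fd = np.array([f[d[i]][i] for i in range(n_states)])
        eta, g = potentials(Pd, fd)
        # Improvement step: maximize f(i,a) + sum_j P_a(i,j) g(j) in each state
        d_new = np.array([max(range(len(P)), key=lambda a: f[a][i] + P[a][i] @ g)
                          for i in range(n_states)])
        if np.array_equal(d_new, d):
            return d, eta
        d = d_new

policy, eta = policy_iteration(P, f)
```

The improvement step is justified by the performance difference formula: switching to a policy that improves f(i,a) + sum_j P_a(i,j) g(j) in some state cannot decrease the long-run average reward.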

Table of Contents

Preface, p. VII
Introduction, p. 1
An Overview of Learning and Optimization, p. 1
Problem Description, p. 1
Optimal Policies, p. 5
Fundamental Limitations of Learning and Optimization, p. 12
A Sensitivity-Based View of Learning and Optimization, p. 17
Problem Formulations in Different Disciplines, p. 19
Perturbation Analysis (PA), p. 21
Markov Decision Processes (MDPs), p. 26
Reinforcement Learning (RL), p. 31
Identification and Adaptive Control (I&AC), p. 34
Event-Based Optimization and Potential Aggregation, p. 37
A Map of the Learning and Optimization World, p. 41
Terminology and Notation, p. 42
Problems, p. 44
Four Disciplines in Learning and Optimization
Perturbation Analysis, p. 51
Perturbation Analysis of Markov Chains, p. 52
Constructing a Perturbed Sample Path, p. 53
Perturbation Realization Factors and Performance Potentials, p. 57
Performance Derivative Formulas, p. 64
Gradients with Discounted Reward Criteria, p. 68
Higher-Order Derivatives and the MacLaurin Series, p. 74
Performance Sensitivities of Markov Processes, p. 83
Performance Sensitivities of Semi-Markov Processes, p. 90
Fundamentals for Semi-Markov Processes, p. 90
Performance Sensitivity Formulas, p. 95
Perturbation Analysis of Queueing Systems, p. 102
Constructing a Perturbed Sample Path, p. 105
Perturbation Realization, p. 115
Performance Derivatives, p. 121
Remarks on Theoretical Issues, p. 125
Other Methods, p. 132
Problems, p. 137
Learning and Optimization with Perturbation Analysis, p. 147
The Potentials, p. 148
Numerical Methods, p. 148
Learning Potentials from Sample Paths, p. 151
Coupling, p. 156
Performance Derivatives, p. 161
Estimating through Potentials, p. 161
Learning Directly, p. 162
Optimization with PA, p. 172
Gradient Methods and Stochastic Approximation, p. 172
Optimization with Long Sample Paths, p. 174
Applications, p. 177
Problems, p. 177
Markov Decision Processes, p. 183
Ergodic Chains, p. 185
Policy Iteration, p. 186
Bias Optimality, p. 192
MDPs with Discounted Rewards, p. 201
Multi-Chains, p. 203
Policy Iteration, p. 205
Bias Optimality, p. 216
MDPs with Discounted Rewards, p. 226
The nth-Bias Optimization, p. 228
nth-Bias Difference Formulas, p. 229
Optimality Equations, p. 232
Policy Iteration, p. 240
nth-Bias Optimal Policy Spaces, p. 244
Problems, p. 246
Sample-Path-Based Policy Iteration, p. 253
Motivation, p. 254
Convergence Properties, p. 258
Convergence of Potential Estimates, p. 259
Sample Paths with a Fixed Number of Regenerative Periods, p. 260
Sample Paths with Increasing Lengths, p. 267
"Fast" Algorithms, p. 277
The Algorithm That Stops in a Finite Number of Periods, p. 278
With Stochastic Approximation, p. 282
Problems, p. 284
Reinforcement Learning, p. 289
Stochastic Approximation, p. 290
Finding the Zeros of a Function Recursively, p. 291
Estimating Mean Values, p. 297
Temporal Difference Methods, p. 298
TD Methods for Potentials, p. 298
Q-Factors and Other Extensions, p. 308
TD Methods for Performance Derivatives, p. 313
TD Methods and Performance Optimization, p. 318
PA-Based Optimization, p. 318
Q-Learning, p. 321
Optimistic On-Line Policy Iteration, p. 325
Value Iteration, p. 327
Summary of the Learning and Optimization Methods, p. 330
Problems, p. 333
Adaptive Control Problems as MDPs, p. 341
Control Problems and MDPs, p. 342
Control Systems Modelled as MDPs, p. 342
A Comparison of the Two Approaches, p. 345
MDPs with Continuous State Spaces, p. 353
Operators on Continuous Spaces, p. 354
Potentials and Policy Iteration, p. 359
Linear Control Systems and the Riccati Equation, p. 363
The LQ Problem, p. 363
The JLQ Problem, p. 368
On-Line Optimization and Adaptive Control, p. 373
Discretization and Estimation, p. 374
Discussion, p. 379
Problems, p. 381
The Event-Based Optimization: A New Approach
Event-Based Optimization of Markov Systems, p. 387
An Overview, p. 388
Summary of Previous Chapters, p. 388
An Overview of the Event-Based Approach, p. 390
Events Associated with Markov Chains, p. 398
The Event and Event Space, p. 400
The Probabilities of Events, p. 403
The Basic Ideas Illustrated by Examples, p. 407
Classification of Three Types of Events, p. 410
Event-Based Optimization, p. 414
The Problem Formulation, p. 414
Performance Difference Formulas, p. 417
Performance Derivative Formulas, p. 420
Optimization, p. 425
Learning: Estimating Aggregated Potentials, p. 429
Aggregated Potentials, p. 429
Aggregated Potentials in the Event-Based Optimization, p. 432
Applications and Examples, p. 434
Manufacturing, p. 434
Service Rate Control, p. 438
General Applications, p. 444
Problems, p. 446
Constructing Sensitivity Formulas, p. 455
Motivation, p. 455
Markov Chains on the Same State Space, p. 456
Event-Based Systems, p. 464
Sample-Path Construction, p. 464
Parameterized Systems: An Example, p. 467
Markov Chains with Different State Spaces, p. 470
One Is a Subspace of the Other, p. 470
A More General Case, p. 478
Summary, p. 482
Problems, p. 483
Appendices: Mathematical Background
Probability and Markov Processes, p. 491
Probability, p. 491
Markov Processes, p. 498
Problems, p. 504
Stochastic Matrices, p. 507
Canonical Form, p. 507
Eigenvalues, p. 508
The Limiting Matrix, p. 511
Problems, p. 516
Queueing Theory, p. 519
Single-Server Queues, p. 519
Queueing Networks, p. 524
Some Useful Techniques, p. 536
Problems, p. 538
Notation and Abbreviations, p. 543
References, p. 547
Index, p. 563
Table of Contents provided by Ingram. All Rights Reserved.

