Preface xv

Acronyms xix

Introduction xxiii

I.1 Background and Motivation : : : : : : : : : : : : : : : : : : : : : xxiii

I.2 Literature Review : : : : : : : : : : : : : : : : : : : : : : : : : : : xxix

1 Nonlinear Systems Analysis 1

1.1 Notation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 1

1.2 Nonlinear Dynamical Systems : : : : : : : : : : : : : : : : : : : : 3

1.2.1 Remarks on Existence, Uniqueness, and Continuation of Solutions : : : : : : : : : : : : : : : : : : : : : 3

1.3 Lyapunov Analysis of Stability : : : : : : : : : : : : : : : : : : : : 5

1.4 Stability Analysis of Discrete-Time Dynamical Systems : : : : : 11

1.5 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 15

2 Optimal Control 17

2.1 Problem Formulation : : : : : : : : : : : : : : : : : : : : : : : : : 17

2.2 Dynamic Programming : : : : : : : : : : : : : : : : : : : : : : : : 19

2.2.1 Principle of Optimality : : : : : : : : : : : : : : : : : 19

2.2.2 Hamilton–Jacobi–Bellman Equation : : : : : : : : : : 22

2.2.3 A Sufficient Condition for Optimality : : : : : : : : : 23

2.2.4 Infinite-Horizon Problems : : : : : : : : : : : : : : : : 25

2.3 Linear Quadratic Regulator : : : : : : : : : : : : : : : : : : : : : 28

2.3.1 Differential Riccati Equation : : : : : : : : : : : : : : 28

2.3.2 Algebraic Riccati Equation : : : : : : : : : : : : : : : 36

2.3.3 Convergence of Solutions to the Differential Riccati Equation : : : : : : : : : : : : : : : : : : : : : : : : : 40

2.3.4 Forward Propagation of the Differential Riccati Equation for Linear Quadratic Regulator : : : : : : : : : : 43

2.4 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 47

3 Reinforcement Learning 49

3.1 Control-Affine Systems with Quadratic Costs : : : : : : : : : : : 50

3.2 Exact Policy Iteration : : : : : : : : : : : : : : : : : : : : : : : : 53

3.2.1 Linear Quadratic Regulator : : : : : : : : : : : : : : : 59

3.3 Policy Iteration with Unknown Dynamics and Function Approximations : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 62

3.3.1 Linear Quadratic Regulator with Unknown Dynamics 70

3.4 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 72

4 Learning of Dynamic Models 75

4.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 75

4.2 Model Selection : : : : : : : : : : : : : : : : : : : : : : : : : : : : 77

4.2.1 Grey-Box vs. Black-Box : : : : : : : : : : : : : : : : : 77

4.2.2 Parametric vs. Non-Parametric : : : : : : : : : : : : 77

4.3 Parametric Model : : : : : : : : : : : : : : : : : : : : : : : : : : : 80

4.3.1 Model in Terms of Bases : : : : : : : : : : : : : : : : 80

4.3.2 Data Collection : : : : : : : : : : : : : : : : : : : : : 81

4.3.3 Learning of Control Systems : : : : : : : : : : : : : : 81

4.4 Parametric Learning Algorithms : : : : : : : : : : : : : : : : : : : 82

4.4.1 Least Squares : : : : : : : : : : : : : : : : : : : : : : 83

4.4.2 Recursive Least Squares : : : : : : : : : : : : : : : : 85

4.4.3 Gradient Descent : : : : : : : : : : : : : : : : : : : : 88

4.4.4 Sparse Regression : : : : : : : : : : : : : : : : : : : : 89

4.5 Persistence of Excitation : : : : : : : : : : : : : : : : : : : : : : : 90

4.6 Python Toolbox : : : : : : : : : : : : : : : : : : : : : : : : : : : : 92

4.6.1 Configurations : : : : : : : : : : : : : : : : : : : : : : 92

4.6.2 Model Update : : : : : : : : : : : : : : : : : : : : : : 93

4.6.3 Model Validation : : : : : : : : : : : : : : : : : : : : : 94

4.7 Comparison Results : : : : : : : : : : : : : : : : : : : : : : : : : : 96

4.7.1 Convergence of Parameters : : : : : : : : : : : : : : 97

4.7.2 Error Analysis : : : : : : : : : : : : : : : : : : : : : : 98

4.7.3 Runtime Results : : : : : : : : : : : : : : : : : : : : : 99

4.8 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 100

5 Structured Online Learning-Based Control of Continuous-Time Nonlinear Systems 111

5.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 111

5.2 A Structured Approximate Optimal Control Framework : : : : : 112

5.3 Local Stability and Optimality Analysis : : : : : : : : : : : : : : 117

5.3.1 Linear Quadratic Regulator : : : : : : : : : : : : : : : 118

5.3.2 SOL Control : : : : : : : : : : : : : : : : : : : : : : : 120

5.4 SOL Algorithm : : : : : : : : : : : : : : : : : : : : : : : : : : : : 121

5.4.1 ODE Solver and Control Update : : : : : : : : : : : 122

5.4.2 Identified Model Update : : : : : : : : : : : : : : : : 123

5.4.3 Database Update : : : : : : : : : : : : : : : : : : : : 124

5.4.4 Limitations and Implementation Considerations : : : 126

5.4.5 Asymptotic Convergence with Approximate Dynamics 127

5.5 Simulation Results : : : : : : : : : : : : : : : : : : : : : : : : : : 128

5.5.1 Systems Identifiable in Terms of a Given Set of Bases 129

5.5.2 Systems to Be Approximated by a Given Set of Bases 131

5.5.3 Comparison Results : : : : : : : : : : : : : : : : : : : 138

5.6 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 142

6 A Structured Online Learning Approach to Nonlinear Tracking with Unknown Dynamics 147

6.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 147

6.2 A Structured Online Learning for Tracking Control : : : : : : : 148

6.2.1 Stability and Optimality in the Linear Case : : : : : 155

6.3 Learning-based Tracking Control Using SOL : : : : : : : : : : : 160

6.4 Simulation Results : : : : : : : : : : : : : : : : : : : : : : : : : : 162

6.4.1 Tracking Control of the Pendulum : : : : : : : : : : 163

6.4.2 Synchronization of Chaotic Lorenz System : : : : : : 164

6.5 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 167

7 Piecewise Learning and Control with Stability Guarantees 171

7.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 171

7.2 Problem Formulation : : : : : : : : : : : : : : : : : : : : : : : : : 173

7.3 The Piecewise Learning and Control Framework : : : : : : : : : 173

7.3.1 System Identification : : : : : : : : : : : : : : : : : : 174

7.3.2 Database : : : : : : : : : : : : : : : : : : : : : : : : : 176

7.3.3 Feedback Control : : : : : : : : : : : : : : : : : : : : 177

7.4 Analysis of Uncertainty Bounds : : : : : : : : : : : : : : : : : : : 178

7.4.1 Quadratic Programs for Bounding Errors : : : : : : : 180

7.5 Stability Verification for Piecewise-Affine Learning and Control 185

7.5.1 Piecewise Affine Models : : : : : : : : : : : : : : : : 185

7.5.2 MIQP-based Stability Verification of PWA Systems 185

7.5.3 Convergence of ACCPM : : : : : : : : : : : : : : : : 191

7.6 Numerical Results : : : : : : : : : : : : : : : : : : : : : : : : : : : 193

7.6.1 Pendulum System : : : : : : : : : : : : : : : : : : : : 193

7.6.2 Dynamic Vehicle System with Skidding : : : : : : : : 197

7.6.3 Comparison of Runtime Results : : : : : : : : : : : : 201

7.7 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 201

8 An Application to Solar Photovoltaic Systems 203

8.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 203

8.2 Problem Statement : : : : : : : : : : : : : : : : : : : : : : : : : : 208

8.2.1 PV Array Model : : : : : : : : : : : : : : : : : : : : : 209

8.2.2 DC-DC Boost Converter : : : : : : : : : : : : : : : : 211

8.3 Optimal Control of PV Array : : : : : : : : : : : : : : : : : : : : 214

8.3.1 Maximum Power Point Tracking Control : : : : : : : 217

8.3.2 Reference Voltage Tracking Control : : : : : : : : : 226

8.3.3 Piecewise Learning Control : : : : : : : : : : : : : : : 228

8.4 Application Considerations : : : : : : : : : : : : : : : : : : : : : 229

8.4.1 Partial Derivative Approximation Procedure : : : : : 230

8.4.2 Partial Shading Effect : : : : : : : : : : : : : : : : : : 235

8.5 Simulation Results : : : : : : : : : : : : : : : : : : : : : : : : : : 236

8.5.1 Model and Control Verification : : : : : : : : : : : : 239

8.5.2 Comparative Results : : : : : : : : : : : : : : : : : : : 239

8.5.3 Model-Free Approach Results : : : : : : : : : : : : : 242

8.5.4 Piecewise Learning Results : : : : : : : : : : : : : : : 243

8.5.5 Partial Shading Results : : : : : : : : : : : : : : : : : 245

8.6 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 246

9 An Application to Low-Level Control of Quadrotors 255

9.1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 255

9.2 Quadrotor Model : : : : : : : : : : : : : : : : : : : : : : : : : : : 259

9.3 Structured Online Learning with RLS Identifier on Quadrotor : 261

9.3.1 Learning Procedure : : : : : : : : : : : : : : : : : : : 261

9.3.2 Asymptotic Convergence with Uncertain Dynamics : 269

9.3.3 Computational Properties : : : : : : : : : : : : : : : 272

9.4 Numerical Results : : : : : : : : : : : : : : : : : : : : : : : : : : : 272

9.5 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 275

10 Python Toolbox 277

10.1 Overview : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 277

10.2 User Inputs : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 278

10.2.1 Process : : : : : : : : : : : : : : : : : : : : : : : : : : 278

10.2.2 Objective : : : : : : : : : : : : : : : : : : : : : : : : : 280

10.3 SOL : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 281

10.3.1 Model Update : : : : : : : : : : : : : : : : : : : : : : 281

10.3.2 Database : : : : : : : : : : : : : : : : : : : : : : : : : 282

10.3.3 Library : : : : : : : : : : : : : : : : : : : : : : : : : : : 283

10.3.4 Control : : : : : : : : : : : : : : : : : : : : : : : : : : 284

10.4 Display and Outputs : : : : : : : : : : : : : : : : : : : : : : : : : 286

10.4.1 Graphs and Printouts : : : : : : : : : : : : : : : : : : 286

10.4.2 3D Simulation : : : : : : : : : : : : : : : : : : : : : : 288

10.5 Summary : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 289

11 Appendix 291

11.1 Supplementary Analysis of Remark 5.4 : : : : : : : : : : : : : : : 291

11.2 Supplementary Analysis of Remark 5.5 : : : : : : : : : : : : : : : 302

Bibliography 303
