
Mean-Field-Type Games for Engineers

Julian Barreiro-Gomez
Hamidou Tembine
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks
does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The
MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2022 Julian Barreiro-Gomez and Hamidou Tembine

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access [Link] or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@[Link]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.

ISBN: 9780367566128 (hbk)
ISBN: 9780367566135 (pbk)
ISBN: 9781003098607 (ebk)
ISBN: 9781032128047 (eBook+ Enhancements)

DOI: 10.1201/9781003098607

Typeset in CMR10 font by KnowledgeWorks Global Ltd.

Access the Support Material: [Link]/9780367566128

To God.
To Mayerly, Margarita, Luis, and Juan
for their permanent support.
To Jacobo for being my biggest motivation.
Julián Barreiro Gómez

To Yandai, Pama, Marie-Claire,
Jean-Pierre and Florence
for their unconditional support.
Tembine Hamidou Doumbodo
Contents

List of Figures xiii

List of Tables xxiii

Foreword xxv

Preface xxvii

Acknowledgments xxix

Author Biographies xxxi

Symbols xxxiii

I Preliminaries 1
1 Introduction 3
1.1 Linear-Quadratic Games . . . . . . . . . . . . . . . . . . . . 4
1.1.1 Structure of the Optimal Strategies and Optimal Costs 4
1.1.2 Solvability of the Linear-Quadratic Gaussian Games . 5
1.1.3 Beyond Brownian Motion . . . . . . . . . . . . . . . . 5
1.2 Linear-Quadratic Gaussian Mean-Field-Type Game . . . . . 6
1.2.1 Variance-Awareness and Higher Order Mean-Field
Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.2 The Role of the Risk in Engineering Applications . . . 8
1.2.3 Uncertainties in Engineering Applications . . . . . . . 10
1.2.4 Network of Networks/System of Systems . . . . . . . . 14
1.2.5 Optimality Systems . . . . . . . . . . . . . . . . . . . 16
1.3 Game Theoretical Solution Concepts . . . . . . . . . . . . . 19
1.3.1 Non-cooperative Game Problem . . . . . . . . . . . . 20
1.3.2 Fully-Cooperative Game Problem . . . . . . . . . . . . 21
1.3.3 Adversarial Game Problem . . . . . . . . . . . . . . . 21
1.3.4 Berge Game Problem . . . . . . . . . . . . . . . . . . 22
1.3.5 Stackelberg Game Problem . . . . . . . . . . . . . . . 23
1.3.6 Co-opetitive Game Problem . . . . . . . . . . . . . . . 24
1.3.7 Partial-Altruism and Self-Abnegation Game Problem . 25


1.4 Partial Integro-Differential System for a Mean-Field-Type Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.5 A Simple Method for Solving Mean-Field-Type Games and
Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.5.1 Continuous-Time Direct Method . . . . . . . . . . . . 29
1.5.2 Discrete-Time Direct Method . . . . . . . . . . . . . . 30
1.6 A Simple Derivation of the Itô’s Formula . . . . . . . . . . . 30
1.7 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

II Mean-Field-Free and Mean-Field Games 37


2 Mean-Field-Free Games 39
2.1 A Basic Continuous-Time Optimal Control Problem . . . . . 40
2.2 Continuous-Time Differential Game . . . . . . . . . . . . . . 46
2.3 Stochastic Mean-Field-Free Differential Game . . . . . . . . 53
2.4 A Basic Discrete-Time Optimal Control Problem . . . . . . . 57
2.5 Deterministic Difference Games . . . . . . . . . . . . . . . . 60
2.6 Stochastic Mean-Field-Free Difference Game . . . . . . . . . 65
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3 Mean-Field Games 73
3.1 A Continuous-Time Deterministic Mean-Field Game . . . . . 75
3.2 A Continuous-Time Stochastic Mean-Field Game . . . . . . 78
3.3 A Discrete-Time Deterministic Mean-Field Game . . . . . . 86
3.4 A Discrete-Time Stochastic Mean-Field Game . . . . . . . . 91
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

III One-Dimensional Mean-Field-Type Games 103


4 Continuous-Time Mean-Field-Type Games 105
4.1 Mean-Field-Type Game Set-up . . . . . . . . . . . . . . . . . 106
4.2 Semi-explicit Solution of the Mean-Field-Type Game Problem 112
4.3 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 126
4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

5 Co-opetitive Mean-Field-Type Games 139


5.1 Co-opetitive Mean-Field-Type Game Set-up . . . . . . . . . 141
5.2 Semi-explicit Solution of the Co-opetitive Mean-Field-Type
Game Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.3 Connections between the Co-opetitive Solution with the Non-
cooperative and Cooperative Solutions . . . . . . . . . . . . 150
5.3.1 Non-cooperative Relationship . . . . . . . . . . . . . . 150
5.3.2 Cooperative Relationship . . . . . . . . . . . . . . . . 151
5.4 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 152
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

6 Mean-Field-Type Games with Jump-Diffusion and Regime Switching 167
6.1 Mean-Field-Type Game Set-up . . . . . . . . . . . . . . . . . 168
6.2 Semi-explicit Solution of the Mean-Field-Type Game with
Jump-Diffusion Process and Regime Switching . . . . . . . . 172
6.3 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . 191
6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

7 Mean-Field-Type Stackelberg Games 199


7.1 Mean-Field-Type Stackelberg Game Set-up . . . . . . . . . . 200
7.2 Semi-explicit Solution of the Stackelberg Mean-Field-Type
Game with Jump-Diffusion Process and Regime Switching . 203
7.3 When Nash Solution Corresponds to Stackelberg Solution for
Mean-Field-Type Games . . . . . . . . . . . . . . . . . . . . 220
7.4 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . 221
7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

8 Berge Equilibrium in Mean-Field-Type Games 229


8.1 On the Berge Solution Concept . . . . . . . . . . . . . . . . 230
8.2 Berge Mean-Field-Type Game Problem . . . . . . . . . . . . 230
8.3 Semi-explicit Mean-Field-Type Berge Solution . . . . . . . . 234
8.4 When Berge Solution Corresponds to Co-opetitive Solution for
Mean-Field-Type Games . . . . . . . . . . . . . . . . . . . . 243
8.5 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . 244
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248

IV Matrix-Valued Mean-Field-Type Games 249


9 Matrix-Valued Mean-Field-Type Games 251
9.1 Mean-Field-Type Game Set-up . . . . . . . . . . . . . . . . . 252
9.1.1 Matrix-Valued Applications . . . . . . . . . . . . . . . 253
9.1.2 Risk-Neutral . . . . . . . . . . . . . . . . . . . . . . . 256
9.1.3 Risk-Sensitive . . . . . . . . . . . . . . . . . . . . . . . 257
9.2 Semi-explicit Solution of the Mean-Field-Type Game Problems:
Risk-Neutral Case . . . . . . . . . . . . . . . . . . . . . . . . 259
9.3 Semi-explicit Solution of the Mean-Field-Type Game Problems:
Risk-Sensitive Case . . . . . . . . . . . . . . . . . . . . . . . 271
9.4 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 277
9.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

10 A Class of Constrained Matrix-Valued Mean-Field-Type Games 291
10.1 Constrained Mean-Field-Type Game Set-up . . . . . . . . . 291
10.1.1 Auxiliary Dynamics . . . . . . . . . . . . . . . . . . . 293
10.1.2 Augmented Formulation of the Constrained MFTG . . 294

10.2 Semi-explicit Solution of the Constrained Mean-Field-Type Game Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 295
10.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296

V Discrete-Time Mean-Field-Type Games 297


11 One-Dimensional Discrete-Time Mean-Field-Type Games 299
11.1 Discrete-Time Mean-Field-Type Game Set-up . . . . . . . . 299
11.2 Semi-explicit Solution of the Discrete-Time Non-Cooperative
Mean-Field-Type Game Problem . . . . . . . . . . . . . . . . 302
11.3 Semi-explicit Solution of the Discrete-Time Cooperative Mean-
Field-Type Game Problem . . . . . . . . . . . . . . . . . . . 311
11.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

12 Matrix-Valued Discrete-Time Mean-Field-Type Games 321


12.1 Discrete-Time Mean-Field-Type Game Set-up . . . . . . . . 321
12.2 Semi-explicit Solution of the Discrete-Time Mean-Field-Type
Game Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 325
12.3 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 344
12.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

VI Learning Approaches and Applications 359


13 Constrained Mean-Field-Type Games: Stationary Case 361
13.1 Constrained Games . . . . . . . . . . . . . . . . . . . . . . . 361
13.1.1 A Constrained Deterministic Game . . . . . . . . . . . 362
13.1.2 A Constrained Mean-Field-Type Game . . . . . . . . 363
13.1.3 Constrained Variational Equilibrium . . . . . . . . . . 366
13.1.4 Potential Constrained Mean-Field-Type Game . . . . 368
13.1.5 Efficiency Analysis . . . . . . . . . . . . . . . . . . . . 369
13.1.5.1 Variations of the Variance . . . . . . . . . . . 370
13.1.5.2 Variations of the ε-parameters . . . . . . . . 370
13.1.5.3 Variations of the Number of Players . . . . . 370
13.1.5.4 Variations of Connectivity under Graphs . . 370
13.1.6 Learning Variational Equilibria . . . . . . . . . . . . . 371
13.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
13.3 Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . 375
13.4 Equilibrium under Migration Constraints . . . . . . . . . . . 377

14 Mean-Field-Type Model Predictive Control 379


14.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . 379
14.2 Risk-Aware Model Predictive Control Approaches . . . . . . 381
14.2.1 Chance-Constrained Model Predictive Control . . . . 381
14.2.2 Mean-Field-Type Model Predictive Control . . . . . . 381
14.2.3 Chance-Constrained vs Mean-Field Type Model Predic-
tive Control . . . . . . . . . . . . . . . . . . . . . . . . 382

14.2.4 Decomposition and Stability . . . . . . . . . . . . . . 385

15 Data-Driven Mean-Field-Type Games 389


15.1 Data-Driven Mean-Field-Type Game Problem . . . . . . . . 390
15.2 Machine Learning Philosophy . . . . . . . . . . . . . . . . . 392
15.3 Machine-Learning-Based (Linear Regression) Data-Driven Mean-
Field-Type games . . . . . . . . . . . . . . . . . . . . . . . . 394
15.3.1 Availability of Data . . . . . . . . . . . . . . . . . . . 394
15.3.2 Preparation of Data . . . . . . . . . . . . . . . . . . . 394
15.3.3 Machine-Learning Core . . . . . . . . . . . . . . . . . 395
15.4 Error and Performance Metrics . . . . . . . . . . . . . . . . . 398
15.5 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . 399

16 Applications 405
16.1 Water Distribution Systems . . . . . . . . . . . . . . . . . . 405
16.1.1 Five-Tank Water System . . . . . . . . . . . . . . . . 405
16.1.2 Barcelona Drinking Water Distribution Network . . . 410
16.2 Micro-grid Energy Storage . . . . . . . . . . . . . . . . . . . 415
16.2.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
16.2.2 Numerical Results . . . . . . . . . . . . . . . . . . . . 419
16.3 Continuous Stirred Tank Reactor . . . . . . . . . . . . . . . 420
16.3.1 Linearization-Based Scheduling and Risk-Aware Con-
trol Problem . . . . . . . . . . . . . . . . . . . . . . . 423
16.3.2 Gain-Scheduled Mean-Field-Type Control . . . . . . . 425
16.3.2.1 Design . . . . . . . . . . . . . . . . . . . . . 425
16.3.2.2 Local Stability of the Operating Points . . . 427
16.3.3 Risk-Aware Numerical Illustrative Example . . . . . . 428
16.4 Mechanism Design in Evolutionary Games . . . . . . . . . . 433
16.4.1 A Risk-Aware Approach to the Equilibrium Selection 437
16.4.1.1 Known Desired Nash Equilibrium . . . . . . 437
16.4.1.2 Unknown Desired Nash Equilibrium . . . . . 440
16.4.2 Risk-Aware Control Design . . . . . . . . . . . . . . . 441
16.4.3 Illustrative Example . . . . . . . . . . . . . . . . . . . 443
16.5 Multi-level Building Evacuation with Smoke . . . . . . . . . 445
16.5.1 Markov-Chain-Based Motion Model for Evacuation over
Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 446
16.5.1.1 Reaching Evacuation Areas . . . . . . . . . . 449
16.5.1.2 Evacuation of the Whole Area . . . . . . . . 453
16.5.1.3 Jump Intensities for Evacuation . . . . . . . 455
16.5.2 Markov-Chain-Based Modeling for Smoke Motion . . . 456
16.5.3 Mean-Field-Type Control for the Evacuation . . . . . 457
16.5.4 Single-Level Numerical Results . . . . . . . . . . . . . 461
16.6 Coronavirus Propagation Control . . . . . . . . . . . . . . . 466
16.6.1 Single-Player Problem . . . . . . . . . . . . . . . . . . 467
16.6.1.1 Control Problem of Mean-Field Type . . . . 469

16.6.2 Multiple-Decision-Maker Problem . . . . . . . . . . . 470


16.6.2.1 Non-cooperative Games . . . . . . . . . . . . 472

Bibliography 475

Index 489
List of Figures

1.1 Diversification problem with two assets. . . . . . . . . . . . . 8


1.2 Risk vs Return plot in a portfolio problem with two assets. . 9
1.3 Feedback control scheme for the temperature control system. 10
1.4 Evolution of the temperature for two different control scenar-
ios. (a) evolution of the expected temperature. (b) evolution of
the temperature. (c) variance comparison of the two scenarios. 11
1.5 Some engineering applications involving uncertainties. . . . 12
1.6 Network of networks. Interdependence among the water, traf-
fic, energy, district heating and communication systems. . . 15
1.7 Brief recent literature review on different methods to solve
mean-field-type control and game problems with finite number
of decision-makers. . . . . . . . . . . . . . . . . . . . . . . . 17
1.8 Two decision-makers illustrating a non-cooperative game
problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9 Two decision-makers illustrating a cooperative game problem. 21
1.10 Two decision-makers illustrating an adversarial game problem. 22
1.11 Two decision-makers illustrating a Berge game problem. . . 23
1.12 Two decision-makers illustrating a Stackelberg game problem. 23
1.13 Three decision-makers illustrating a co-opetitive game prob-
lem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.14 Four decision-makers illustrating a Partial-Altruism game
problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.15 Four decision-makers illustrating a Self-Abnegation game
problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.16 Control problem. Equivalently, a single decision-maker
decision-making problem. . . . . . . . . . . . . . . . . . . . . 26
1.17 General scheme corresponding to the Continuous-Time Direct
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.18 General scheme corresponding to the discrete-time direct
method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.19 Outline of the book. Arrows show the interdependence among
the chapters. . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.20 Risk doors exercise. Would you risk for a bigger reward? . . 35

2.1 A basic scalar-valued control scheme. . . . . . . . . . . . . . 40


2.2 Feedback scheme for the linear-quadratic mean-field-free optimal control. . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3 Feedback scheme for the infinite-horizon linear-quadratic
mean-field-free optimal control. . . . . . . . . . . . . . . . . 45
2.4 General scheme of the non-cooperative differential game. Dark
gray nodes represent the n players, whereas the black node
represents the common system state. . . . . . . . . . . . . . 46
2.5 Feedback scheme for the ith decision-maker in the linear-
quadratic mean-field-free differential game. . . . . . . . . . . 49
2.6 Feedback scheme for the infinite-horizon linear-quadratic
mean-field-free differential game. . . . . . . . . . . . . . . . . 53
2.7 Feedback scheme for the linear-quadratic mean-field-free op-
timal control in discrete time. . . . . . . . . . . . . . . . . . 59
2.8 Feedback scheme for the linear-quadratic mean-field-free dif-
ference game. . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.1 General Mean-Field Game Philosophy in either continuous or discrete time. Each player (black node) strategically plays against a population mass of infinite players (dark gray nodes). 74
3.2 Feedback scheme for the linear-quadratic mean-field game in
continuous time. . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.3 Feedback scheme for the linear-quadratic mean-field game in
discrete time. . . . . . . . . . . . . . . . . . . . . . . . . . . 90

4.1 General scheme of the non-cooperative mean-field-type game. Dark gray nodes represent the n players, the black node represents the system state, and the light gray node represents the stochasticity affecting the system state. . . . . . . . . . . 111
4.2 General scheme of the cooperative mean-field-type game. Dark
gray nodes represent the n players, the black node repre-
sents the system state, and the light gray node represents the
stochasticity affecting the system state. . . . . . . . . . . . . 112
4.3 Feedback scheme for the linear-quadratic mean-field-type non-
cooperative game problem. . . . . . . . . . . . . . . . . . . . 114
4.4 Feedback scheme for the linear-quadratic mean-field-type co-
operative game problem. . . . . . . . . . . . . . . . . . . . . 122
4.5 Evolution of the system state and its expectation for the
scalar-value non-cooperative scenario. . . . . . . . . . . . . . 127
4.6 Evolution of the players’ strategies and their expectation for
the scalar-value non-cooperative scenario. . . . . . . . . . . . 128
4.7 Evolution of the Riccati equations α1 , . . . , α3 for the scalar-
value non-cooperative scenario. . . . . . . . . . . . . . . . . 129
4.8 Evolution of the Riccati equations β1 , . . . , β3 for the scalar-
value non-cooperative scenario. . . . . . . . . . . . . . . . . 130

4.9 Evolution of the Riccati equations γ1 , . . . , γ3 for the scalar-value non-cooperative scenario. . . . . . . . . . . . . . . . . 130
4.10 Evolution of the Riccati equations δ1 , . . . , δ3 for the scalar-
value non-cooperative scenario. . . . . . . . . . . . . . . . . 131
4.11 Optimal cost function for each player for the scalar-value non-
cooperative scenario. . . . . . . . . . . . . . . . . . . . . . . 132
4.12 Evolution of the system state and its expectation for the
scalar-value fully-cooperative scenario. . . . . . . . . . . . . 133
4.13 Evolution of the players’ strategies and their expectation for
the scalar-value fully-cooperative scenario. . . . . . . . . . . 133
4.14 Evolution of the Riccati equation α0 for the scalar-value fully-
cooperative scenario. . . . . . . . . . . . . . . . . . . . . . . 134
4.15 Evolution of the Riccati equation β0 for the scalar-value fully-
cooperative scenario. . . . . . . . . . . . . . . . . . . . . . . 135
4.16 Evolution of the Riccati equation γ0 for the scalar-value fully-
cooperative scenario. . . . . . . . . . . . . . . . . . . . . . . 135
4.17 Evolution of the Riccati equation δ0 for the scalar-value fully-
cooperative scenario. . . . . . . . . . . . . . . . . . . . . . . 136
4.18 Cost function associated with each player for the scalar-value
fully-cooperative scenario. . . . . . . . . . . . . . . . . . . . 136

5.1 Co-opetition scheme, where λ+ij (λ−ij) means λij > 0 (λij < 0). Black players are altruistic, dark gray players are spiteful, and light gray players cooperate and compete simultaneously. (a) Fully-altruistic scenario, (b) Fully-adversarial scenario, (c) Altruistic and adversarial players, and (d) Mixture of behaviors. 140
5.2 Feedback scheme for the linear-quadratic mean-field-type co-
opetitive game problem. . . . . . . . . . . . . . . . . . . . . 145
5.3 Evolution of the system state and its expectation. . . . . . . 153
5.4 Evolution of the optimal control strategies for the partially
cooperation in a co-opetitive scenario. . . . . . . . . . . . . . 153
5.5 Evolution of the Riccati equations α1 , . . . , α5 for the partially
cooperation in a co-opetitive scenario. . . . . . . . . . . . . . 154
5.6 Evolution of the Riccati equations β1 , . . . , β5 for the partially
cooperation in a co-opetitive scenario. . . . . . . . . . . . . . 154
5.7 Evolution of the Riccati equations γ1 , . . . , γ5 for the partially
cooperation in a co-opetitive scenario. . . . . . . . . . . . . . 155
5.8 Evolution of the Riccati equations δ1 , . . . , δ5 for the partially
cooperation in a co-opetitive scenario. . . . . . . . . . . . . . 155
5.9 Optimal cost for the five players for the partially cooperation
in a co-opetitive scenario. . . . . . . . . . . . . . . . . . . . . 157
5.10 Co-opetitive parameters for the spiteful behavior in a co-
opetitive scenario. . . . . . . . . . . . . . . . . . . . . . . . . 158
5.11 Evolution of the system state and its expectation for the spite-
ful behavior in a co-opetitive scenario. . . . . . . . . . . . . 159

5.12 Evolution of the optimal strategies and their expectation for the spiteful behavior in a co-opetitive scenario. . . . . . . . . 160
5.13 Evolution of the Riccati equations α1 , . . . , α5 for the spiteful
behavior in a co-opetitive scenario. . . . . . . . . . . . . . . 161
5.14 Evolution of the Riccati equations β1 , . . . , β5 for the spiteful
behavior in a co-opetitive scenario. . . . . . . . . . . . . . . 161
5.15 Evolution of the Riccati equations γ1 , . . . , γ5 for the spiteful
behavior in a co-opetitive scenario. . . . . . . . . . . . . . . 162
5.16 Evolution of the Riccati equations δ1 , . . . , δ5 for the spiteful
behavior in a co-opetitive scenario. . . . . . . . . . . . . . . 162
5.17 Optimal cost for the five players under the co-opetitive sce-
nario with spiteful behavior. . . . . . . . . . . . . . . . . . . 163

6.1 General scheme of the non-cooperative mean-field-type game with jumps and regime switching. Dark gray nodes represent the n players, the black node represents the system state, and the light gray node represents the stochasticity affecting the system state. . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2 Brownian motion and two jumps. . . . . . . . . . . . . . . . 191
6.3 Evolution of the system state and its expectation for the
scalar-value non-cooperative scenario with Brownian motion,
Poisson jumps and regime switching. . . . . . . . . . . . . . 192
6.4 Evolution of the optimal strategies u∗1 and u∗2 for the scalar-
value non-cooperative scenario with Brownian motion, Poisson
jumps and regime switching. . . . . . . . . . . . . . . . . . . 193
6.5 Evolution of the Riccati equations α1 and α2 for the scalar-
value non-cooperative scenario with Brownian motion, Poisson
jumps and regime switching. . . . . . . . . . . . . . . . . . . 194
6.6 Evolution of the Riccati equations β1 and β2 for the scalar-
value non-cooperative scenario with Brownian motion, Poisson
jumps and regime switching. . . . . . . . . . . . . . . . . . . 195
6.7 Evolution of the Riccati equations γ1 and γ2 for the scalar-
value non-cooperative scenario with Brownian motion, Poisson
jumps and switching. . . . . . . . . . . . . . . . . . . . . . . 196
6.8 Evolution of the Riccati equations δ1 and δ2 for the scalar-
value non-cooperative scenario with Brownian motion, Poisson
jumps and switching. . . . . . . . . . . . . . . . . . . . . . . 196

7.1 General scheme of the two-player Stackelberg mean-field-type game with jump-diffusion and regime switching. . . . . . . . 200
7.2 Hierarchical order in the Stackelberg mean-field-type game. 201
7.3 Brownian motion and two jumps. . . . . . . . . . . . . . . . 221
7.4 Evolution of the system state and its expectation for the
scalar-value Stackelberg scenario with Brownian motion and
Poisson jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . 223

7.5 Evolution of the optimal control inputs u∗i and u∗j for the
scalar-value Stackelberg scenario with Brownian motion and
Poisson jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.6 Evolution of the differential equations αi and αj for the scalar-
value Stackelberg scenario with Brownian motion and Poisson
jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.7 Evolution of the differential equations βi and βj for the scalar-
value Stackelberg scenario with Brownian motion and Poisson
jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.8 Evolution of the differential equations γi and γj for the scalar-
value Stackelberg scenario with Brownian motion and Poisson
jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.9 Evolution of the differential equations δi and δj for the scalar-
value Stackelberg scenario with Brownian motion and Poisson
jumps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

8.1 General scheme of the two-player Berge mean-field-type game with jump-diffusion and regime switching. . . . . . . . . . . 232
8.2 Considered jumps and Brownian for the numerical example. 245
8.3 Evolution of the state x(t) and its expected value E[x(t)] with
initial conditions x(0) = E[x(0)] = 100. . . . . . . . . . . . . 246
8.4 Evolution of the control input u1 (t). . . . . . . . . . . . . . . 246
8.5 Evolution of the control input u2 (t). . . . . . . . . . . . . . . 247
8.6 Evolution of the differential equations α1 (t) and α2 (t). . . . 247

9.1 Example of a matrix-valued application with d = 4. . . . . . 253


9.2 Exchange matrix for six currencies: Euro, US Dollar, Aus-
tralian Dollar, Canadian Dollar, Swiss Franc, and Japanese
Yen, for five different dates. . . . . . . . . . . . . . . . . . . 255
9.3 Evolution of the system state and its expectation for the
matrix-value continuous-time non-cooperative scenario. . . . 278
9.4 Evolution of the first player strategies and their expecta-
tion for the matrix-value continuous-time non-cooperative sce-
nario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
9.5 Evolution of the second player strategies and their expecta-
tion for the matrix-value continuous-time non-cooperative sce-
nario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
9.6 Evolution of the equation P1 for the matrix-value continuous-
time non-cooperative scenario. . . . . . . . . . . . . . . . . . 280
9.7 Evolution of the equation P2 for the matrix-value continuous-
time non-cooperative scenario. . . . . . . . . . . . . . . . . . 281
9.8 Evolution of the equation P̄1 for the matrix-value continuous-
time non-cooperative scenario. . . . . . . . . . . . . . . . . . 281
9.9 Evolution of the equation P̄2 for the matrix-value continuous-
time non-cooperative scenario. . . . . . . . . . . . . . . . . . 282

9.10 Evolution of the Riccati equations δ1 and δ2 for the matrix-value continuous-time non-cooperative scenario. . . . . . . . 283
9.11 Optimal cost function for each player for the matrix-value
continuous-time non-cooperative scenario. . . . . . . . . . . 283
9.12 Evolution of the system state and its expectation for the
matrix-value continuous-time fully-cooperative scenario. . . 284
9.13 Evolution of the first player strategies and their expectation
for the matrix-value continuous-time fully-cooperative sce-
nario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
9.14 Evolution of the second player strategies and their expec-
tation for the matrix-value continuous-time fully-cooperative
scenario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
9.15 Evolution of the equation P0 for the matrix-value continuous-
time fully-cooperative scenario. . . . . . . . . . . . . . . . . 286
9.16 Evolution of the equation P̄0 for the matrix-value continuous-
time fully-cooperative scenario. . . . . . . . . . . . . . . . . 287
9.17 Evolution of the Riccati equation δ0 for the matrix-value
continuous-time fully-cooperative scenario. . . . . . . . . . . 287
9.18 Optimal cost function for each player for the matrix-value
continuous-time fully-cooperative scenario. . . . . . . . . . . 288

12.1 Evolution of the (a) system state and (b) its expectation for
the matrix-value discrete-time non-cooperative scenario. . . 345
12.2 Evolution of the (a) first player strategies and (b) their ex-
pectation for the matrix-value discrete-time non-cooperative
scenario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
12.3 Evolution of the (a) second player strategies and (b) their ex-
pectation for the matrix-value discrete-time non-cooperative
scenario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
12.4 Evolution of the equation P1 for the matrix-value discrete-time
non-cooperative scenario. . . . . . . . . . . . . . . . . . . . . 349
12.5 Evolution of the equation P2 for the matrix-value discrete-time
non-cooperative scenario. . . . . . . . . . . . . . . . . . . . . 349
12.6 Evolution of the equation P̄1 for the matrix-value discrete-time
non-cooperative scenario. . . . . . . . . . . . . . . . . . . . . 350
12.7 Evolution of the equation P̄2 for the matrix-value discrete-time
non-cooperative scenario. . . . . . . . . . . . . . . . . . . . . 350
12.8 Evolution of the Riccati equations δ1 , and δ2 for the matrix-
value discrete-time non-cooperative scenario. . . . . . . . . . 351
12.9 Optimal cost function for each player for the matrix-value
discrete-time non-cooperative scenario. . . . . . . . . . . . . 351
12.10 Evolution of the (a) system state and (b) its expectation for
the matrix-value discrete-time fully-cooperative scenario. . . 352

12.11 Evolution of the (a) first player strategies and (b) their ex-
pectation for the matrix-value discrete-time fully-cooperative
scenario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
12.12 Evolution of the (a) second player strategies and (b) their ex-
pectation for the matrix-value discrete-time fully-cooperative
scenario. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
12.13 Evolution of the equation P0 for the matrix-value discrete-time
fully-cooperative scenario. . . . . . . . . . . . . . . . . . . . 355
12.14 Evolution of the equation P̄0 for the matrix-value discrete-time
fully-cooperative scenario. . . . . . . . . . . . . . . . . . . . 355
12.15 Evolution of the Riccati equation δ0 for the matrix-value
discrete-time fully-cooperative scenario. . . . . . . . . . . . . 356
12.16 Optimal cost function for each player for the matrix-value
discrete-time fully-cooperative scenario. . . . . . . . . . . . . 356
 
13.1 Gap between var xVI = xbest−GNE and var xGlobal as different parameters εij = ε, for all i, j ∈ N , and number of players n change, for a fixed variance σ = 1. . . . . . . . . . . . . . . . 369
13.2 Different topologies for the comparison among δ−parameters
in (13.11a) with εij ≥ 0, for all i, j ∈ N . . . . . . . . . . . . 371
13.3 Evolution of the variables w and x under the learning algo-
rithm in (13.13) for the constrained MFTG presented in (13.4)
with n = 2. Figures correspond to: (a) evolution of w1 and w2 ,
and (b)-(c) evolution of x1 and x2 . . . . . . . . . . . . . . . 372

14.1 Graphical example for the sets X, Dx , and Xc . . . . . . . . . 383

15.1 General scheme for a data-driven mean-field-type game problem by using machine learning. . . . . . . . . . . . . . . . . 390
15.2 Input/output configuration for the unknown system in a two-
player mean-field-type game problem. . . . . . . . . . . . . . 391
15.3 Machine-learning scheme. . . . . . . . . . . . . . . . . . . . . 392
15.4 Machine-learning-based expected values in comparison with a
particular test. This figure corresponds to Test 7. . . . . . . 400
15.5 Data-based probability measure of x at k = 600. . . . . . . . 401
15.6 Error distribution with ek = xk − yk . . . . . . . . . . . . . . 402
15.7 Machine-learning-based parameters. . . . . . . . . . . . . . . 403

16.1 Illustrative example. Five-tank benchmark involving two players and two coupled input constraints. Players 1 and 2 correspond to the black and gray colors, respectively. . . . . . . . 406
16.2 Brownian motions. . . . . . . . . . . . . . . . . . . . . . . . 407
16.3 Evolution of the system states. . . . . . . . . . . . . . . . . . 408
16.4 Evolution of the optimal control inputs for the players. . . . 409
16.5 Water distribution network. . . . . . . . . . . . . . . . . . . 412

16.6 Control inputs variance comparison for the two different MFT-
MPC controllers. . . . . . . . . . . . . . . . . . . . . . . . . 413
16.7 Results corresponding to the proposed stochastic MFT-MPC
controllers for Scenarios 1 and 2; and behavior of the deter-
ministic MPC controller. . . . . . . . . . . . . . . . . . . . . 414
16.8 General scheme of the micro-grid involving energy storage.
(Adapted from [1].) . . . . . . . . . . . . . . . . . . . . . . . 415
16.9 Evolution of the noise applied to the system. . . . . . . . . . 416
16.10 Evolution of the system state. . . . . . . . . . . . . . . . . . 417
16.11 Evolution of the control input for the first player and its ex-
pectation, i.e., u1,k and E[u1,k ]. . . . . . . . . . . . . . . . . 417
16.12 Evolution of the control input for the second player and its
expectation, i.e., u2,k and E[u2,k ]. . . . . . . . . . . . . . . . 418
16.13 Evolution of the control input for the third player and its
expectation, i.e., u3,k and E[u3,k ]. . . . . . . . . . . . . . . . 418
16.14 Continuous stirred tank reactor. . . . . . . . . . . . . . . . . 420
16.15 Gain-scheduled Mean-Field-Type Control Diagram with n op-
eration points and θ ∈ {1, . . . , n}. . . . . . . . . . . . . . . . 425
16.16 Gain-scheduled mean-field-free control diagram with n opera-
tion points and θ ∈ {1, . . . , n}. . . . . . . . . . . . . . . . . . 426
16.17 Noise Brownian motions for both the reactant concentration
and the reactor temperature. . . . . . . . . . . . . . . . . . . 429
16.18 Performance of the GS-MFTC. Evolution of the reactant con-
centration CA and its expectation E[CA ] tracking the reference
CA ref . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
16.19 Performance of the GS-MFTC. Evolution of the reactor tem-
perature TR and its expectation E[TR ] tracking the reference
TRref . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
16.20 Evolution of the optimal control input and its expectation. . 430
16.21 (a)–(b) Evolutionary dynamics with imperfect fitness observa-
tion for a population game, and two-strategy population game.
(c)–(d) Closed-loop of the multi-layer game for the equilibrium
selection and two-strategy population game. . . . . . . . . . 436
16.22 Projection dynamics behavior for both the RSP and Zeeman
game with and without noisy fitness functions. . . . . . . . . 438
16.23 Behavior of the risk-aware controller over the projection dy-
namics for both the RSP and Zeeman game with imperfect
fitness observation. . . . . . . . . . . . . . . . . . . . . . . . 444
16.24 Representation of the space B, its respective discretization
into n regions, and an example graph G, and V = {46},
O = {43, 49}, and F = {50}. . . . . . . . . . . . . . . . . . . 447
16.25 Representation of spacial constraints such as walls and obsta-
cles in the graph G f . . . . . . . . . . . . . . . . . . . . . . . 448
16.26 Relationship between possible jump intensities in the Markov
chain and the links E in a connected graph G. . . . . . . . . 449

16.27 Mass of people motion with an initial distribution x(0) evacuating the whole space B within time T and a unique exit area. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
16.28 Mass of people motion with an initial distribution x(0) evacu-
ating the whole space B within time T and with two exit areas
O1 and O2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
16.29 Smoke spread given a fire source s(0) and covering the whole
space B within time T , i.e., smoke distribution s(T ). . . . . 456
16.30 Evolution of the smoke throughout the area B for 8.3 minutes
and with time steps 0.1 seconds. . . . . . . . . . . . . . . . . 462
16.31 Evacuation comparison with and without mean-field-type con-
troller. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
16.32 Evolution of the population mass throughout the area B with-
out risk minimization for 8.3 minutes and with time steps
0.1 seconds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
16.33 Variance comparison with and without mean-field-type con-
troller. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
16.34 Propagation of the virus in Colombia with migration con-
straints among the 32 departments without quarantine pre-
vention. Dots in the map represent infected/dead people: (a)
initial condition, (b) spread of the virus after 400 iterations
(interactions), (c) allowed air traffic. . . . . . . . . . . . . . 468
16.35 Transition rates among the finite states S. . . . . . . . . . . 469
16.36 Transition rates among three different players, i.e., P =
{1, 2, 3} corresponding to the example in (16.55). . . . . . . 471
List of Tables

8.1 Simulation parameters . . . . . . . . . . . . . . . . . . . . . 244

9.1 Exchange rates involving six currencies on October 11th, 2018 254

15.1 Summary of available data for machine-learning purposes. . 393


15.2 Summary of prepared data for machine-learning purposes. . 395
15.3 Second summary of prepared data for machine-learning pur-
poses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
15.4 Initial data D for machine-learning purposes. . . . . . . . . . 399
15.5 Correlation among the decision-makers control-inputs. . . . 399
15.6 First preparation of data for machine-learning purposes. . . 401
15.7 Error metrics between a real trajectory and machine-learning-
based estimated trajectories . . . . . . . . . . . . . . . . . . 402

16.1 Description of the variables in the model (16.5). . . . . . . . 421


16.2 Value of the variables for the case study. . . . . . . . . . . . 428
16.3 Summary of results for three different scenarios with and with-
out MFTC/smoke. . . . . . . . . . . . . . . . . . . . . . . . 463

Foreword

Mean field was first studied in physics to describe the behavior of systems with large numbers of individually negligible particles. More recently, mean-field game theory was introduced in the economics and engineering literature to study strategic decision-making by small interacting agents within huge populations. Typically, a mean-field game is described by a Fokker-Planck equation and solved through a Hamilton-Jacobi-Bellman equation, which requires that the number of agents approach infinity. This assumption limits the practical usage of mean-field game theory in engineering fields.

Thanks to my friend, Professor Hamidou Tembine, who also mentored my former Ph.D. student working on mean-field game theory, his team and collaborators, including Dr. Julian Barreiro-Gomez, have pioneered the introduction of mean-field-type game theory into engineering scenarios. Mean-field-type games differ from mean-field games in that they take higher-order statistics into account, they can be employed when dynamic programming cannot be applied, the number of interacting agents is not necessarily large, and they can handle the non-symmetric, non-negligible effect of an individual decision on the mean-field term. These significant advantages of mean-field-type game theory open the gate to solving complex engineering problems that cannot be handled by classic methods.

With such a demand from engineering audiences, this book is very timely and provides a thorough study of mean-field-type game theory. The central ambition of this book is to bridge the theoretical findings and the engineering solutions. The book introduces the basics first, and the mathematical frameworks are then elaborately explained. The engineering application examples are shown in detail, and popular learning approaches are also investigated. These advantageous characteristics will make this book a comprehensive handbook for many engineering fields for many years to come, and I will buy one when it gets published.

Zhu Han, IEEE/AAAS Fellow
John and Rebecca Moores Professor
University of Houston

Preface

If you have picked up this book, you are probably already aware of how powerful and suitable mean-field-type control and game theory is for solving risk-aware problems in the engineering framework, and of the fact that a large variety of control and dynamic game problems can be cast as particular cases of mean-field-type games.

Our main goal in this textbook is to provide a comprehensive and simple treatment of mean-field-type control and game theory, which can also be interpreted as a set of risk-aware optimal interactive decision-making techniques. To this end, we exclusively focus on the so-called direct method, in either continuous or discrete time. Our experience indicates that other existing methods reported in the literature for the class of stochastic problems we address in this book, such as partial-differential-equation-based methods, chaos expansion, dynamic programming, or the stochastic maximum principle, are not appropriate starting points for teaching beginner students in the field or early-career researchers. We recommend focusing on understanding this book before moving on to the study of other research manuscripts that follow other theoretical directions. In this regard, the contents of this book comprise an appropriate background to start working and doing research in this game-theoretical field.
To make the exposition and explanation even easier, we first study deterministic optimal control and linear-quadratic differential games. Then, we progressively add complexity, step by step and little by little, to the problem settings until we finally study and analyze mean-field-type control and game problems incorporating several stochastic processes, e.g., Brownian motions, Poisson jumps, and random coefficients.
This smooth trip, starting with a scalar-valued-state optimal control problem in continuous and discrete time, passes through the scalar-valued deterministic differential games and mean-field games, and the stochastic differential games and mean-field games with state-and-control-input-independent diffusion, until we finally address the mean-field-type games with state-and-control-input-dependent diffusion terms, incorporating Poisson jumps and random coefficients by means of switching regimes. On the other hand, we go beyond the Nash equilibrium, which provides a solution for non-cooperative games, by analyzing other game-theoretical solution concepts such as the Berge, Stackelberg, adversarial, and co-opetitive equilibria. For the mean-field-type game analysis, we provide several numerical examples, which are obtained from a MATLAB-based user-friendly toolbox that is freely available to the readers of this book.
We devote a whole part of the book to discussing learning approaches that guarantee convergence to mean-field-type solutions. In particular, we present the constrained and static mean-field-type games, where optimization algorithms such as distributed evolutionary dynamics may be applied; the receding-horizon mean-field-type control, also known as the risk-aware model predictive control technique; and the data-driven mean-field-type games, which motivate the use of artificial intelligence tools such as machine learning with either neural networks or simple linear regression. Finally, we present several engineering applications in both continuous and discrete time. Among these applications we find the following: water distribution systems, micro-grid energy storage, a continuous stirred tank reactor, mechanism design for evolutionary dynamics, a multi-level building evacuation problem, and COVID-19 propagation control.

Julian Barreiro-Gomez
Hamidou Tembine
Acknowledgments

We gratefully acknowledge support from the US Air Force and from New York University, at both the US campus (NYU) and the UAE campus (NYUAD), for the research conducted at the Learning & Game Theory Laboratory (L&G Lab) and at the Center on Stability, Instability and Turbulence (SITE). This material is based upon work supported by Tamkeen under the NYU Abu Dhabi Research Institute grant CG002.
We also acknowledge our friends, faculty members, and researchers with whom we have had several scientific discussions about mean-field-type control and game theory, and also regarding its potential for engineering applications. We especially thank Prof. Tyrone E. Duncan and Prof. Bozenna Pasik-Duncan from the mathematics department at the University of Kansas in the US, and Prof. Boualem Djehiche from the mathematics department at the KTH Royal Institute of Technology in Sweden. We finally acknowledge all our co-authors with whom we have published several articles in the mean-field-type field.

Author Biographies

Julian Barreiro-Gomez received his B.S. degree (cum laude) in Electronics Engineering from Universidad Santo Tomás (USTA), Bogota, Colombia, in 2011. He received the M.Sc. degree in Electrical Engineering and the Ph.D. degree in Engineering from Universidad de Los Andes (UAndes), Bogota, Colombia, in 2013 and 2017, respectively. He received the Ph.D. degree (cum laude) in Automatic, Robotics and Computer Vision from the Technical University of Catalonia (UPC), Barcelona, Spain, in 2017; the best Ph.D. thesis in control engineering 2017 award from the Spanish National Committee of Automatic Control (CEA) and Springer; and the EECI Ph.D. Award from the European Embedded Control Institute in recognition of the best Ph.D. thesis in Europe in the field of Control for Complex and Heterogeneous Systems 2017. He received the ISA Transactions Best Paper Award 2018 in recognition of the best paper published in the previous year. Since August 2017, he has been a Post-Doctoral Associate in the Learning & Game Theory Laboratory (L&G-Lab) at New York University Abu Dhabi (NYUAD), United Arab Emirates, and since 2019, he has also been with the Research Center on Stability, Instability and Turbulence (SITE) at New York University Abu Dhabi (NYUAD). His main research interests are: risk-aware control and games, mean-field-type games, constrained evolutionary game dynamics, distributed optimization, stochastic optimal control, and distributed predictive control.

Hamidou Tembine received the M.S. degree in applied mathematics from Ecole Polytechnique, Palaiseau, France, in 2006 and the Ph.D. degree in computer science from the University of Avignon, Avignon, France, in 2009. He is a prolific researcher with more than 150 scientific publications, including magazine articles, letters, journal papers, and conference papers. He is the author of the book Distributed Strategic Learning for Engineers (CRC Press, Taylor & Francis, 2012) and a coauthor of the book Game Theory and Learning in Wireless Networks (Elsevier Academic Press). He has been a co-organizer of several scientific meetings on game theory in networking, wireless communications, smart energy systems, and smart transportation systems. His current research interests include evolutionary games, mean-field stochastic games, and their applications. Dr. Tembine received the IEEE ComSoc Outstanding Young Researcher Award in 2014 for his promising research activities for the benefit of society. He has also received best paper awards for applications of game theory.

Symbols

Symbol    Description

x         Scalar-valued system state
xi        Scalar-valued system state of the ith decision-maker
u         Scalar-valued control input
ui        Scalar-valued control input of the ith decision-maker
X         Matrix/vector-valued system state
U         Matrix/vector-valued control input
Ui        Matrix/vector-valued control input of the ith decision-maker
b         Drift
σ         Diffusion
B         Standard Brownian motion
N         Jump process
Ñ         Compensated jump process
Θ         Set of jump sizes, or set of operating points in a gain-scheduling strategy
ν         Radon measure over Θ
s         Regime switching
S         Set of regime switching
q̃ss′      Jump intensity from regime switching s to s′
W         Discrete-time noise
mx        Mean-field term of x
mu        Mean-field term of u
m         Strategic distribution
φ         Probability measure of the system state x
L         Cost functional (control case)
ℓ         Running cost (control case)
h         Terminal cost (control case)
Li        Cost functional of the ith decision-maker
ℓi        Running cost of the ith decision-maker
hi        Terminal cost of the ith decision-maker
E[·]      Expected value
var[·]    Variance
cov[·]    Co-variance
H         Hamiltonian (control case)
f         Guess functional (control case), or fitness functions in a population game
Hi        Hamiltonian of the ith decision-maker
Fi        Guess functional of the ith decision-maker (matrix-valued problems)
fi        Guess functional of the ith decision-maker (scalar-valued problems), or the fitness of the ith strategy in a population game
BRi       Best response of the ith decision-maker
N         Set of decision-makers
N0        Set of risk-neutral decision-makers
N+        Set of risk-averse decision-makers
N−        Set of risk-seeking decision-makers
X         Feasible set of the system state
P(X)      Space of the probability measures of x
U         Feasible set of control inputs
U         Feasible control strategy
Ui        Feasible set of control inputs for the ith decision-maker
Ui        Feasible control strategy of the ith decision-maker
⟨x, y⟩    Inner product for vectors x, y
⟨A, B⟩    Trace tr(A, B) for matrices A, B
Part I

Preliminaries
1 Introduction

We truly live in a more and more interconnected and interactive world. In recent years, we have seen emerging technologies such as the Internet of Everything, collective intelligence including Artificial Intelligence (AI), blockchains, and next-generation wireless networks, among many others. The quantities-of-interest in these systems involve both volatilities and risks.
A typical example of risk concerns in the current online market is the evolution of prices of digital currencies and cryptocurrencies (e.g., bitcoin, litecoin, ethereum, dash, and other altcoins, i.e., alternatives to bitcoin). The variance serves as a base model for many risk measures. From a random-variable perspective (probability theory), the volatility can be captured by means of the variance, which is a mean-field term since it comprises the second moment and the square of the mean. Another example concerns the variations of wireless channels in multiple-input-multiple-output systems. Non-Gaussianity of wireless channels has been observed experimentally and empirically, and their variability affects the quality of the communication.
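As a quick illustration of why the variance is a mean-field (i.e., distribution-dependent) term, recall the standard identity

var[x] = E[ (x − E[x])² ] = E[x²] − (E[x])²,

so any performance index that penalizes var[x] necessarily involves the mean E[x], a statistic of the distribution of x, and not only the realized value of x itself.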
The term mean-field refers to a physics concept that attempts to describe the effect of an infinite number of particles on the motion of a single particle. Researchers began to apply the concept to the social sciences in the early 1960s to study how an infinite number of factors affect individual decisions. However, the key ingredient in a game-theoretic context is the influence of the distribution of states and/or control actions on the payoffs of the decision-makers. Notice that there is no need to have a large population of decision-makers. A mean-field-type game is a game in which the payoffs and/or the state-dynamics coefficient functions involve not only the state and action profiles but also the distribution of the state-action process (or its marginal distributions).
Games with distribution-dependent quantities-of-interest, such as the state and/or the payoffs, are particularly attractive because they capture not only the mean but also the variance and higher-order terms. The incorporation of these mean and variance terms is directly associated with the paradigm introduced by H. Markowitz, 1990 Nobel Laureate in Economics. The Markowitz paradigm, also termed the mean-variance paradigm, is often characterized as dealing with portfolio risk and (expected) returns [2–4].
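To make the connection concrete (a schematic formulation only; the weight λ > 0 is not part of the notation adopted later in the book), a Markowitz-type mean-variance decision problem reads

minimize over u:  λ var[x(u)] − E[x(u)],

where x(u) is the random return induced by the decision u: the first term penalizes risk through the variance, while the second rewards the expected return. The mean-field-type costs studied in this book generalize precisely this kind of variance-aware trade-off.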
In this book, we address variance reduction problems in which several decision-making entities take part. When the decisions made by the agents/players/decision-makers influence each other, the decision-making is said to be interactive (interdependent). Such problems are known as game-theoretical problems.

“Interactive decision theory would perhaps be a more descriptive name for the discipline usually called Game Theory”

Robert Aumann [5, page 47]

In this book, we study mean-field-type game theory, which can also be named risk-aware interactive decision-making theory.

Next, we present some basic definitions corresponding to the structure of a particular class of games, which is addressed throughout this book.

1.1 Linear-Quadratic Games


We start by defining a particular class of either deterministic or stochastic differential games, characterized by a specific structure of the system dynamics and the cost functional.
Definition 1 (Linear-Quadratic Deterministic Games) Game problems in which the state dynamics are given by a linear deterministic system and the cost functionals are quadratic in the state and in the control inputs are often called Linear-Quadratic (LQ) games.
Definition 2 (Linear-Quadratic Gaussian Games) Game problems in which the state dynamics are given by a linear stochastic system driven by a Brownian motion and the cost functionals are quadratic in the state and in the control inputs are often called Linear-Quadratic Gaussian (LQG) games. Such games also belong to the family of stochastic linear-quadratic games.
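As a minimal sketch of Definitions 1 and 2 in the scalar case (the symbols A, B_j, σ, q_i, q_iT, r_i below are generic placeholders rather than the notation fixed in later chapters), an n-player LQG game couples the linear state dynamics

dx(t) = ( A x(t) + B_1 u_1(t) + … + B_n u_n(t) ) dt + σ dB(t),  x(0) = x_0,

with cost functionals that are quadratic in the state and in the control inputs,

L_i = q_iT x(T)² + ∫_0^T ( q_i x(t)² + r_i u_i(t)² ) dt,  i = 1, …, n,

where decision-maker i chooses u_i so as to minimize E[L_i]. Setting σ = 0 (replacing the Brownian motion by the zero process) recovers the deterministic LQ game of Definition 1.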

1.1.1 Structure of the Optimal Strategies and Optimal Costs


For generic LQG game problems under perfect state observation, the optimal strategy of each decision-maker is a linear state-feedback strategy, which is identical to an optimal control for the corresponding deterministic linear-quadratic game problem in which the Brownian motion is replaced by the zero process. Moreover, the equilibrium cost only differs from the equilibrium cost of the deterministic game problem by the integral of a function of time.
process. Moreover, the equilibrium cost only differs from the deterministic
game problem’s equilibrium cost by the integral of a function of time.
However, when the diffusion (volatility) coefficient is state- and/or control-dependent, the structure of the resulting differential system as well as the equilibrium cost vector are modified. These results are widely known in the dynamic optimization, control, and game theory literature.

In this book, several structures are studied, from simple ones up to cases where the stochastic processes depend not only on the system states and control inputs, but also on the distribution of the states and/or the control inputs.

1.1.2 Solvability of the Linear-Quadratic Gaussian Games


For both LQG control and LQG zero-sum games, it can be shown that a simple
square completion method provides an explicit solution to the problem. It
was successfully developed and applied by Duncan et al. [6–11] in the mean-
field-free case (games in the absence of the distribution of the variables of
interest). Moreover, Duncan et al. have extended the direct method to more
general noises including fractional Brownian noise and some non-quadratic
cost functionals such as on spheres and torus.
Here, we follow the same method in order to solve a large variety of mean-
field-type control and game problems in both continuous and discrete time,
making the solution of this complex problem accessible for early-career re-
searchers and engineering students.

1.1.3 Beyond Brownian Motion


Inspired by applications in engineering (e.g., internet connection, battery
state, etc.) and in finance (e.g., price, stock option, multi-currency exchange,
etc.) where not only Gaussian processes but also jump processes (e.g., Poisson,
Lévy, etc.) play an important role, the question of extending the framework
to linear-quadratic games under state dynamics driven by jump-diffusion
processes was naturally posed. Adding a Poisson jump and regime switching
(random coefficients) makes it possible to capture, in particular, larger jumps
which may not be captured by just increasing diffusion coefficients. Several examples
such as multi-currency exchange or cloud-server rate allocation on blockchains
are naturally in a matrix form.
Throughout this book, we discuss several game-theoretical solution
concepts and different structures, as pointed out in Section 1.1.1,
including the analysis when processes beyond Brownian motion are taken into
consideration, e.g., see Chapter 6 where mean-field-type game problems with
jumps and regime switching (random coefficients) are studied.
1.2 Linear-Quadratic Gaussian Mean-Field-Type Game

According to Definitions 1 and 2 in Section 1.1, corresponding to deterministic
LQ and stochastic LQG games, respectively, we next introduce in
Definition 3 a class of stochastic differential games that additionally involves
the distribution of the variables of interest, such as the system states and/or
the control inputs.

Definition 3 (Linear-Quadratic Mean-Field-Type Games) Game problems, in which
• the state dynamics is given by a linear stochastic system driven by a Brownian
motion with a mean-field term (such as the expectation of the system
state and/or the expectation of the control actions/control inputs), and
• the cost functional is quadratic in the state, control, expectation of the
state and/or the expectation of the control actions/control inputs, are
often called LQG games of mean-field type, or Mean-Field-Type Linear-Quadratic
Gaussian (MFT-LQG) games.

The incorporation of mean-field terms into the stochastic differential game
problems allows taking risk terms such as the variance, and higher-order terms,
into consideration, as introduced next. As a motivation to continue studying
this book, notice that incorporating risk terms into engineering problem
solutions and system design can potentially enhance system performance.
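
As a simple illustration of why mean-field terms capture risk, recall that the variance can be written in terms of the first and second moments:
\[
\operatorname{var}(x) = \mathbb{E}\big[(x - \mathbb{E}[x])^2\big] = \mathbb{E}[x^2] - (\mathbb{E}[x])^2 .
\]
Hence, a quadratic cost term that involves both the state x and its expectation E[x], e.g., q E[(x − E[x])²] = q E[x²] − q (E[x])², is exactly a variance penalty; this is the mechanism by which MFT-LQG cost functionals become risk-aware.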

1.2.1 Variance-Awareness and Higher-Order Mean-Field Terms
Most studies illustrated mean-field game methods in the linear-quadratic game
with an infinite number of decision-makers [12–16]. These works assume
indistinguishability (also associated with homogeneity) within classes, and the cost
functionals were assumed to be identical or invariant under permutations of the
decision-makers' indexes. Note that the indistinguishability assumption is not
fulfilled for many interesting problems such as variance reduction and/or risk
quantification problems, in which decision-makers have different sensitivity
toward the risk, i.e., not all the decision-makers involved in the strategic
interaction perceive, measure, or interpret the risk in the same manner. One
typical and practical example is to consider a multi-level building in which
every resident has their own comfort-zone temperature and aims to use the
Heating, Ventilating, and Air Conditioning (HVAC) system to be in their
comfort temperature zone, i.e., maintain the temperature within their own comfort
zone. This problem clearly does not satisfy the indistinguishability assumption
used in the previous works on mean-field games (the homogeneity of decision-makers
is not satisfied). Therefore, it is reasonable to look at the problem
beyond the indistinguishability assumption.
Here we drop these assumptions and deal with the problem directly with an
arbitrary finite number of decision-makers, which can clearly be heterogeneous.
In the LQ-mean-field game problems the state process can be modeled
by a set of linear stochastic differential equations of McKean-Vlasov type, and the
preferences are formalized by quadratic, or exponential-of-integral-of-quadratic,
cost functions with mean-field terms. These game problems are of practical
interest, and a detailed exposition of this theory can be found in [17–22]. The
popularity of these game problems is due to practical considerations in (but
not limited to) consensus problems, signal processing, pattern recognition,
filtering, prediction, economics, and management science [23–26].
To some extent, most of the risk-neutral versions of these optimal controls
are analytically and numerically solvable [6, 7, 9, 11, 19]. On the other hand,
the linear quadratic robust setting naturally appears if the decision makers’
objective is to minimize the effect of a small perturbation and related variance
of the optimally controlled nonlinear process. By solving a linear quadratic
game problem of mean-field type, and using the implied optimal control ac-
tions, decision-makers can significantly reduce the variance (and also the cost)
incurred by this perturbation. The variance reduction and min-max problems
have very interesting applications in risk quantification problems under adver-
sarial attacks and in security issues with interdependent infrastructures and
networks [26–30]. To provide a few examples: in [27, 31], the control
for the evacuation of a multi-level building is designed by means of mean-field
games and mean-field-type control, and in [32], electricity price dynamics in
the smart grid are analyzed using a mean-field-type game approach under common
noise of diffusion type.

This book investigates how the Direct Method (sometimes also
known as the verification method or, for the linear-quadratic case,
as the square-completion method) can be used to solve different
types of mean-field-type game problems which are non-standard
problems [17], e.g., Non-cooperative solutions, Cooperative solutions,
Co-opetitive solutions, Adversarial solutions, Stackelberg
solutions, Berge solutions; and for scalar, vector and/or matrix-valued
states and control inputs, both in continuous and
discrete time. Besides, we study the potential applications that
this class of games has in engineering problems. To this end,
the following section discusses the role that risk terms
play in the engineering field.
FIGURE 1.1
Diversification problem with two assets. Asset 1 has expected return E[X] = r1 and standard deviation σ1, with actual return in [r1 − σ1, r1 + σ1]; Asset 2 has expected return E[X] = r2 and standard deviation σ2 = σ1 + b, b > 0, with actual return in [r2 − σ2, r2 + σ2].

1.2.2 The Role of the Risk in Engineering Applications


Before introducing the mean-field-type control and game theory, we present
the role of risk terms from the economic perspective. Let us consider two
different assets where we can invest a total economic resource that is denoted
by “$”: Asset 1 and Asset 2, as presented in Figure 1.1. The ith asset is
characterized by two main features, i.e., the expected return, denoted by
E[Xi], and its volatility, which is expressed by means of the standard deviation,
denoted by σi. Let us assume that the assets have expected returns
denoted by r1, r2 ∈ R, and volatilities σ1 < σ2. Thus, the return that can be
earned from assets 1 and 2 belongs to the ranges

[r1 − σ1, r1 + σ1] and [r2 − σ2, r2 + σ2],

respectively. For the sake of simplicity, letting r1 = r2, asset 2 can potentially
return more money than asset 1. Nevertheless, asset 2 is riskier.
This problem is known as a diversification optimization, introduced by
Harry Markowitz (Nobel Memorial Prize in Economic Sciences 1990) in [2, 3].
The objective consists in maximizing the total expected return while minimizing
the associated risk. For n assets the problem is formally stated as follows:
\[
\underset{X_1,\ldots,X_n}{\text{maximize}} \;\; \sum_{i=1}^{n} \Big( \mathbb{E}[X_i] - \sqrt{\operatorname{var}(X_i)} \Big),
\]
where Xi is the price of the ith asset. Figure 1.2 shows a two-asset portfolio-problem
example. In this example, we can observe in the diagram that asset 1
has an expected return E[X1] = r1 and volatility √var(X1) = σ1, and for
asset 2, E[X2] = r2 and √var(X2) = σ2, respectively. The interesting
result relies on the fact that there exists an optimal resource allocation
(investment split) in the portfolio problem such that the risk (volatility) is
minimized.

FIGURE 1.2
Risk vs Return plot in a portfolio problem with two assets: asset 1 at (σ1, r1), asset 2 at (σ2, r2), and the minimum-risk portfolio indicated on the curve between them.
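
As a minimal numerical sketch of this effect (the expected returns r1, r2, the volatilities sigma1, sigma2, and the correlation rho below are illustrative assumptions, not values from the text), the following Python snippet computes the expected return and volatility of a two-asset portfolio as a function of the investment split and locates the minimum-risk allocation.

import numpy as np

# Illustrative (assumed) asset parameters: expected returns, volatilities, correlation.
r1, r2 = 0.05, 0.08
sigma1, sigma2 = 0.10, 0.20
rho = 0.2

# Portfolio: a fraction w invested in asset 1 and (1 - w) in asset 2.
w = np.linspace(0.0, 1.0, 1001)
exp_return = w * r1 + (1.0 - w) * r2
variance = (w * sigma1) ** 2 + ((1.0 - w) * sigma2) ** 2 \
    + 2.0 * rho * w * (1.0 - w) * sigma1 * sigma2
volatility = np.sqrt(variance)

# Mean-variance objective: expected return minus volatility, as in the n-asset problem above.
objective = exp_return - volatility

i_risk = np.argmin(volatility)
i_best = np.argmax(objective)
print(f"minimum-risk split:       w = {w[i_risk]:.3f}, volatility = {volatility[i_risk]:.4f}")
print(f"best mean-variance split: w = {w[i_best]:.3f}, objective  = {objective[i_best]:.4f}")

Sweeping w from 0 to 1 corresponds to moving along the curve sketched in Figure 1.2: the volatility is not monotone in the split, which is why an interior minimum-risk allocation exists.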

“We next consider the rule that the investor does (or should)
consider expected return a desirable thing and variance of return
an undesirable thing.”

Harry Markowitz
[33, page 15]

Once the concept of volatility has been introduced from the economics
perspective, we extend the use of this concept to the engineering field, and
more precisely, to the automatic control and game theory context.
Mean-field-type game theory studies how to manage risk terms, i.e.,
it is concerned with quantifying and minimizing them. In order to illustrate
risk management in the context of engineering applications, let us consider
a temperature control example. The control objective consists in keeping the
temperature inside a room, represented by a system state denoted by x(t),
stable around a desired reference denoted by r = 20 (in degrees Celsius) by
means of an actuator signal denoted by u(t). Moreover, let us consider a
disturbance affecting the dynamical behavior of the temperature, e.g., windows
and doors opening and/or closing, errors in the measurements of the current
temperature, or any agent perturbing the evolution of the temperature. This
control scheme is presented in Figure 1.3.
FIGURE 1.3
Feedback control scheme for the temperature control system: the controller receives the reference r, produces the control input u(t) for the system, and the temperature state x(t), affected by a disturbance, is fed back to the controller.

The first option in order to design a temperature controller consists of looking
at the expected value of the current temperature, i.e., controlling E[x(t)]
to meet the desired reference r. As an example, consider the performance
of two different scenarios/controllers presented in Figure 1.4(a). Clearly, Scenario 1
performs much better than Scenario 2 since it reaches a steady state
faster, i.e., shorter settling time; the temperature meets the desired target,
i.e., null steady-state error; and it does not oscillate too much, i.e., it has a
small overshoot.
Nevertheless, notice that the expectation of the temperature does not allow
evaluating the risk (or volatility behavior), i.e., it does not allow observing
the real behavior of the temperature. Now, let us check the evolution of the
actual temperature x(t) for both scenarios in Figure 1.4(b). It then becomes clear
that the performance of the controller in Scenario 2 is much better than the
one exhibited in Scenario 1 since it exhibits fewer variations over time, i.e.,
Figure 1.4(b) shows the risk terms over time for the two different scenarios.
A second option in order to design the temperature controller consists of
evaluating the variance of the variable of interest over time (risk-aware
approach), i.e., E[(x(t) − E[x(t)])²]. Figure 1.4(c) compares the variance of the
variable of interest for the two scenarios. It can be seen that Scenario 2 performs
better due to the fact that the variance of x(t) is smaller.
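
The comparison in Figure 1.4 can be reproduced qualitatively with a short Monte Carlo simulation; the following sketch (the plant model, the noise levels, and the two feedback gains are illustrative assumptions) simulates a noisy first-order temperature dynamics under two proportional controllers and compares the empirical mean and variance of x(t).

import numpy as np

rng = np.random.default_rng(0)
r = 20.0            # desired reference temperature [C]
x0 = 6.0            # initial temperature [C]
dt, T = 0.1, 50.0   # time step and horizon [min]
steps, n_runs = int(T / dt), 2000

def simulate(gain, noise_std):
    # Euler-Maruyama simulation of dx = gain * (r - x) dt + noise_std dB.
    x = np.full(n_runs, x0)
    for _ in range(steps):
        x = x + gain * (r - x) * dt + noise_std * rng.normal(0.0, np.sqrt(dt), n_runs)
    return x

# Scenario 1: aggressive controller, strongly affected by the disturbance (assumed values).
# Scenario 2: moderate controller, weakly affected by the disturbance (assumed values).
xT1 = simulate(gain=1.0, noise_std=2.0)
xT2 = simulate(gain=0.3, noise_std=0.3)

for name, xT in [("Scenario 1", xT1), ("Scenario 2", xT2)]:
    print(f"{name}: E[x(T)] = {xT.mean():.2f} C, E[(x(T) - E[x(T)])^2] = {xT.var():.3f}")

Although both scenarios drive the expected temperature to the reference, the second one exhibits a much smaller variance, which is exactly the behavior highlighted in Figure 1.4(c).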
The risk-awareness analysis that was illustrated by means of the temperature
controller can be extended to a large number of other engineering
applications in which uncertainties take place. Next, we discuss such
uncertainties and point out several engineering problems of current research
interest to motivate the study of the mean-field-type control and game theory
that is presented in this book.

1.2.3 Uncertainties in Engineering Applications


Networked engineering systems are becoming larger in scale (involving
a large number of system states, control inputs, and/or variables) and highly
interconnected (several sub-systems coupled to each other). Game theory, or
interactive decision-making theory [5, 34, 35], has become quite a powerful
tool to analyze, design, and control these types of systems. Applications
include wireless networks [25, 36], water systems [37–39], power systems
[32, 39–41], traffic systems [42], blockchain [43], temperature control [39, 44],
among many others.

FIGURE 1.4
Evolution of the temperature for two different control scenarios. (a) Evolution of the expected temperature E[x(t)]. (b) Evolution of the temperature x(t). (c) Variance comparison of the two scenarios. (Temperature in [C], time in [min].)

FIGURE 1.5
Some engineering applications involving uncertainties: power systems, traffic systems, district heating, drinking water networks, social networks, and cryptocurrencies/blockchains.
All the previously mentioned engineering applications, and not limited
to them, might be affected by multiple uncertainties that can be taken into
consideration. Figure 1.5 shows a summarized diagram of some engineering
applications incorporating uncertainties where risk-aware control and
game-theoretical techniques can be implemented. To illustrate how relevant
the role of uncertainties is in engineering applications, we describe next
how these uncertainties appear and affect system performance.

• In social networks, the uncertainties are associated with the spreading of
information along the network, such as fake news, uncertainties in the
number of followers and likes, and/or the uncertainty in the reputation of the
participants. Also, there can be malicious and spiteful agents perturbing
and manipulating the overall behavior of the network. Indeed, opinion
dynamics is closely related to social networks, where several uncertainties
can affect the evolution of the overall average opinion.
• In drainage and drinking water networks, the objective consists of optimally
managing some of the inflows and outflows such that the drinking
water demand is satisfied while economic costs are minimized, or alternatively,
managing the wastewater flows in a way that the pollution and/or
the overflows are reduced or even driven to zero. In drinking water
networks, there exist uncertainties associated with the resource demand
and the weather conditions that might considerably affect the operation
of the system. In wastewater networks, there is uncertainty associated with
precipitation, which directly affects the risk of overflows throughout
the system.
• The uncertainties in power systems can be interpreted in a similar
way as for the water networks. There exists a power demand with an
associated uncertainty, and the weather conditions also directly affect the
performance of renewable resources such as wind turbines and
solar panels.
• In the transportation systems, two important goals are: (i) the road safety
and (ii) the congestion management throughout the network while mini-
mizing the travel time. There are some uncertainties associated with the
weather conditions and road incident states, which might affect the traffic
flow.
• In the context of blockchains and cryptocurrencies, two important goals
are: (i) the incentive design for users, developers, producers, and verifiers;
and (ii) the development of a consensus algorithm for the verification and
validation of the transactions in a distributed fashion. As in traditional
currencies, cryptocurrencies have many uncertainties, such as the exchange rates
over time. Besides, they add extra uncertainties related to network
security and the proof-of-verification.

• In temperature control problems, such as the air-conditioning systems many
people use at home, in the car, or at the office, the system continuously deals with
uncertainties when doors and windows are opened for a while, or simply
because the temperature measurements might have some errors related
to the sensor devices.

Extensive modeling of these systems under uncertainties has
been considered. Most often, the risk-neutral and the expected-utility
approaches have been used. However, the expected utility,
which is a linear first-order performance metric, may not capture
the risk. One possible way to incorporate risk-awareness is
through the variance of the performance, as studied throughout
this book.

1.2.4 Network of Networks/System of Systems


Engineering systems are becoming increasingly complex, of large-scale
nature (some of them are represented by means of networks), and highly
interconnected. The aggregated overall model composed of such interconnected
systems is known as a network of networks or a system of systems. In order to
clarify this concept, let us consider the existing interdependence among the
water, traffic, energy, district heating, and communication systems as shown in
Figure 1.6. It can be seen that each system has many components represented
by nodes.
First, let us consider the drinking water transportation system. The control
objectives consist of providing water to supply the demand while minimizing
the operational economic costs. Such management is made by means of valves
and pumps, whose operation requires energy, creating an interconnection with
the power system. There is an operational cost related to the economic cost of
energy to operate the active actuators in the system, and the other operational
cost is related to the economic cost of water depending on the source from
which it is obtained. Likewise, water is required in hydroelectric systems to
produce energy. This fact establishes a relationship between the water and
power networks. The interdependence is presented with links connecting the
components of each network in Figure 1.6.
Now, let us consider the heat network, also known as district heating,
which is in charge of providing comfortable temperatures.
FIGURE 1.6
Network of networks. Interdependence among the water, traffic, energy, district heating, and communication systems.

Similarly, this system requires both water and power to operate the actuators
that transport the thermo-fluids (water) throughout the network.
Regarding the traffic system, there is a large number of agents moving
along the transportation network. Notice that there is a direct relationship
between the agents' (people's) motion and the geographical distribution of the
demands for water, electrical power, and comfort temperature.
Thus, we can observe the interdependence of the traffic system with the water,
power, and heat networks illustrated in Figure 1.6. Moreover, power systems are
required in order to operate the traffic lights, which are the main actuators
that regulate the traffic.
Finally, all the aforementioned systems work utilizing a communication
network to exchange the information related to the actuators and measurements.
Such a communication network is represented in Figure 1.6 as the
connectivity among all the different components belonging to the same or
other systems.

1.2.5 Optimality Systems


For game problems considering mean-field terms, various solution methods
such as the Stochastic Maximum Principle (SMP) (see [17]) and the Dynamic
Programming Principle (DPP) with Hamilton-Jacobi-Bellman-Isaacs
and Fokker-Planck-Kolmogorov equations have been proposed [17, 45, 46].
Alternatively, to address this class of stochastic control and game problems, this
book focuses on the direct method, highlighting its advantages, among which
we find the following:
• It is easy to apply,
• It does not require solving Partial Differential Equations (PDEs), and
• It allows computing either explicit or semi-explicit solutions (i.e., solutions
in terms of ordinary differential equations).
Figure 1.7 shows a timeline of some of the recent literature on different
methods to solve mean-field-type control and game problems with a finite
number of decision-makers, i.e., in the context of atomic games where the
decisions or strategic selections cannot be neglected.
In [47] and [48], optimal control problems under systems governed by mean-
field-type Stochastic Differential Equations (SDE) are studied by applying the
SMP. Similarly, in this work, both the system dynamics and cost functional
are allowed to be of mean-field type. Further analysis on the SMP applied
to mean-field-type problems has been made in [49], where authors consider
second-order adjoint processes. Later on, solutions for mean-field-type control
problems have been discussed by using methods either based on Hamilton-
Jacobi-Bellman and Fokker-Planck coupled equations or based on SMP in [17].
Other authors have contributed by adding variations, extensions and also by
using other different methods. For instance, [50] studies discrete-time mean-
field-type control problems, and presents necessary and sufficient conditions
for their solvability. In [51], the SMP is applied to solve jump-diffusion mean-
field problems involving mean-variance terms, i.e., of mean-field type.
Other considerations have been added into these classes of problems. For
instance, information related issues have been studied. In [52], it is proposed
to solve mean-field-type control problems by applying dynamic programming,
and two examples are presented: (i) a portfolio optimization, and (ii) a sys-
temic risk model. In [53], the SMP is used to solve mean-field-type control
problems with partial observation and the results are applied to financial en-
gineering. The work in [54] addresses a partially observed optimal control
problem whose cost functional is of mean-field type and the authors solve
it by means of a maximum principle using Girsanov’s theorem and convex
variation.
FIGURE 1.7
Brief recent literature review (2011–2020) on different methods (maximum principle, dynamic programming, direct method, and others) to solve mean-field-type control and game problems with a finite number of decision-makers.
The well-posedness of mean-field-type forward-backward stochastic differential
equations is studied in [55], motivated by the work previously reported
in [56], where the same problem was discussed but in the context of mean
field (not of mean-field type). In [24], the SMP has been extended to the
risk-sensitive mean-field-type control problem, which involves not only first-
and second-moment terms, but also higher-order terms. In [57], a stochastic robust
(H2/H∞) control problem of mean-field type with state- and control-input-dependent
noise is studied. In [58], the well-posedness and solvability of an indefinite
linear-quadratic mean-field-type control problem in discrete time are discussed,
building on previous discussions in [50] by further generalizing the setting.
The partial-observation mean-field-type control problems, such as those addressed
in [53] and [54], have been extended to the risk-sensitive case in [59] and [60], and by
applying the backward separation method in [61] and [62]. Besides, the continuation
of the research performed in [52] on dynamic programming has been reported in [63].
The dynamic programming approach has later been studied in an adaptive manner for
discrete-time problems. Furthermore, a new consideration of mean-field-type
control problems incorporating common noise has been studied in [64].
Different from the aforementioned works, which study and analyze mean-
field-type control problems (i.e., with a unique decision-maker), the work
in [65] presents these ideas for game theoretical problems (i.e., with multi-
ple decision-makers), e.g., cooperative mean-field-type games. Thus, several
developments on mean-field-type theory involving both control and game ap-
proaches have been reported in the literature. Regarding the mean-field-type
game theoretical results, uncertainty quantifications have been studied in [66]
in the context of mean-field-type teams and by using the Kosambi-Karhunen-
Loève expansion (chaos expansion approach), which allows representing the
stochastic process as a linear combination of orthogonal functions. In [67], mean-field-type
games are characterized with time-inconsistent cost functionals and
the SMP is applied. Moreover, new results on risk-sensitive mean-field-type games
have been reported in [26] by extending the risk-sensitive control approach
previously analyzed in [24], and sufficient optimality equations are established
via infinite-dimension dynamic programming principle. Other classes of games
have been studied in the mean-field-type area. The two-player game scenario
has been widely studied for both the non-zero-sum and zero-sum (robust)
cases in [68] and [69], respectively. In contrast to the different methods used
before to solve mean-field-type control and game problems, e.g., dynamic pro-
graming or SMP, authors in [70] use the so-called direct method in order to
compute explicit and semi-explicit solutions for mean-field-type game prob-
lems for the linear-quadratic case. The work in [70] served as a motivation
for other results using the same method, e.g., the most recent results related
to the direct method on mean-field-type games are [71] and [72], discussing
non-linear continuous-time problems, and discrete-time problems involving
different information issues, respectively. Regarding the most recent research
in the field of mean-field-type theory, mean-field-type games with jumps and
regime switching are studied by using dynamic programming and SMP principles
in [73] and following the direct method in [74]. Other game-theoretical
solution concepts such as partial cooperation, competition, partial altruism,
etc., are studied by means of the co-opetitive mean-field-type games in [75],
following the direct method for linear-quadratic problems.
Then, results on mean-field-type games started being applied to concrete
engineering applications. For instance, mean-field-type games have been
applied to distributed power networks with prosumers in [64], to the design of
filters for big data assimilation in [76], to the network security analysis as a
public good problem in [77], and to the demand-supply management in the smart
grid in [78], where constraints have also been taken into account.
The direct method has allowed designing risk-aware control and game techniques
for a large number of engineering applications. For instance, in [79] a
mean-field-type-based pedestrian crowd model is presented. In [32], applications
on electricity price in smart grids and blockchain-based power networks
are studied by following the direct method. In [80], a class of control input
constraints is considered for linear-quadratic mean-field-type games and an
application to a water distribution system is presented. Moreover, pedestrian motion
has been studied in the context of games in [81], and other applications in
traffic, water, energy, blockchains, power systems, among others, have been
developed by using mean-field-type theory, e.g., see the works in [82], [83],
and [37].
The following section is in charge of presenting one of the methods to
solve mean-field-type control and game problems, which consists of the master
system comprising the backward Hamilton-Jacobi-Bellman and forward Fokker-Planck-Kolmogorov
equations. This method focuses on satisfying optimality
conditions presented by means of a Partial-Integro Differential Equations
(PIDE) system. The next section seeks to motivate the reader by pointing
out that this book intends to considerably reduce the complexity of solving the
underlying problems by avoiding the need to solve PDEs.

1.3 Game Theoretical Solution Concepts


In the context of decision-making theory, there are several problem statements,
information configurations, and/or strategic scenarios that can be studied.
Perhaps the most popular solution concept in game theory is the Nash equilibrium,
which provides a solution for a non-cooperative game problem. Nevertheless,
there is a rich variety of solution concepts, which we present next,
i.e., adversarial, Stackelberg, Berge, among others, and which are of interest
depending on the engineering problem we are working on.
Let us consider a general scalar-valued system involving n decision-makers
from the set N = {1, . . . , n}, whose dynamics are given by an SDE as follows:
\[
\mathrm{d}x(t) = b(x, u_1, \ldots, u_n)\,\mathrm{d}t + \sigma(x, u_1, \ldots, u_n)\,\mathrm{d}B(t), \tag{1.1}
\]
where x ∈ X denotes the scalar system state, ui ∈ Ui denotes the scalar control
input of decision-maker i ∈ N, and B denotes a standard Brownian
motion. Moreover, b : X × R × ∏_{j∈N} Uj → R and σ : X × R × ∏_{j∈N} Uj → R
are state- and control-dependent drift and diffusion terms, respectively. Each
decision-maker has a cost functional of mean-field type as follows:
\[
L_i(x, u_i) = h_i(x(T)) + \int_0^T \ell_i(x(t), u_i(t))\,\mathrm{d}t,
\]
where hi : X → R denotes the terminal cost and ℓi : X × Ui → R denotes the
running cost. Next, we present the different game-theoretical problems (i.e.,
the different game-theoretical solution concepts) that we are going to address
throughout this book.
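
As a minimal numerical sketch (the drift, diffusion, feedback strategies, and cost coefficients below are illustrative assumptions, not part of the model above), the SDE (1.1) can be simulated with an Euler-Maruyama scheme and the expected cost E[L_i] estimated by Monte Carlo:

import numpy as np

rng = np.random.default_rng(1)
T, dt = 1.0, 1e-3
steps, n_runs = int(T / dt), 5000
x0 = 1.0

# Illustrative (assumed) ingredients: linear feedback strategies, a simple drift/diffusion,
# and quadratic running/terminal costs for decision-maker 1.
k1, k2 = 0.8, 0.5          # assumed feedback gains u_i = k_i x
sigma = 0.2                # assumed constant diffusion
q, r1 = 1.0, 0.1           # assumed cost weights

x = np.full(n_runs, x0)
cost1 = np.zeros(n_runs)
for _ in range(steps):
    u1, u2 = k1 * x, k2 * x
    cost1 += (q * x ** 2 + r1 * u1 ** 2) * dt                  # running cost of decision-maker 1
    x = x + (-(u1 + u2)) * dt + sigma * rng.normal(0.0, np.sqrt(dt), n_runs)
cost1 += q * x ** 2                                            # terminal cost h_1(x(T)) = q x(T)^2

print(f"Monte Carlo estimate of E[L_1]: {cost1.mean():.4f}")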

1.3.1 Non-cooperative Game Problem


The first problem is given by a non-cooperative situation involving two
decision-makers, i.e., considering the set N = {1, 2}. Each decision-maker is
interested in minimizing the magnitude of the system state x while reducing
the used energy determined by the control input.

FIGURE 1.8
Two decision-makers illustrating a non-cooperative game problem: decision-maker 1 selects u1 to minimize L1(x, u), and decision-maker 2 selects u2 to minimize L2(x, u).

Figure 1.8 shows an illustrative example for a non-cooperative situation.
Each decision-maker solves the following problem:
\[
\underset{u_i \in U_i}{\text{minimize}} \;\; \mathbb{E}[L_i(x, u_i)],
\]
subject to (1.1) and a given initial condition for the system state x(0) ≜ x0.
The aforementioned problem represents a dilemma or a conflict since the
decision-makers want to minimize the magnitude of the state but without
making any effort (or applying the least effort possible).
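
To make the non-cooperative solution concept concrete, the following sketch computes a Nash equilibrium of a simple static quadratic game by best-response iteration (the cost functions and coefficients are illustrative assumptions and do not correspond to a specific model in this chapter):

import numpy as np

# Static quadratic game assumed purely for illustration:
#   J_i(u_1, u_2) = (x0 + u_1 + u_2)^2 + r_i * u_i^2,  i = 1, 2.
x0 = 1.0
r = np.array([1.0, 2.0])   # assumed control-effort weights

def best_response(i, u_other):
    # Setting dJ_i/du_i = 0 gives u_i = -(x0 + u_other) / (1 + r_i).
    return -(x0 + u_other) / (1.0 + r[i])

u = np.zeros(2)
for _ in range(100):       # iterate best responses; the map is a contraction here
    u = np.array([best_response(0, u[1]), best_response(1, u[0])])

costs = [(x0 + u.sum()) ** 2 + r[i] * u[i] ** 2 for i in range(2)]
print("approximate Nash equilibrium:", u)
print("equilibrium costs:", costs)

At the computed equilibrium, neither decision-maker can reduce its own cost by unilaterally changing its control, which is precisely the Nash property underlying the non-cooperative solution concept.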
1.3.2 Fully-Cooperative Game Problem

Let us now suppose that the two decision-makers cooperate with each other
to minimize the magnitude of the system state x while reducing the aggregate
weighted effort to do so, i.e., q1 u1 + q2 u2 with q1, q2 > 0. In this regard,
the decision-makers optimize jointly as a team. The fully-cooperative game
problem is given by:
\[
\underset{\{u_j \in U_j\}_{j\in\mathcal{N}}}{\text{minimize}} \;\; \sum_{j\in\mathcal{N}} \mathbb{E}[L_j(x, u_j)],
\]
subject to (1.1) and a given initial condition x(0) ≜ x0.

FIGURE 1.9
Two decision-makers illustrating a cooperative game problem: the decision-makers {1, 2} jointly select u = (u1, u2) to minimize the common cost L(x, u) = L1(x, u) + L2(x, u).

Figure 1.9 shows an illustrative example of cooperation between two
decision-makers. They act jointly pursuing the same objective. It is worth
mentioning that the cooperative game solution corresponds to the solution of a
control problem whose control input has the dimension of the joint control
inputs of all decision-makers in the game problem.

1.3.3 Adversarial Game Problem


There are other situations where the interests of the decision-makers are opposed,
i.e., what one decision-maker pursues goes against what its opponent
wants. In this case, we have an adversarial situation: one decision-maker acts
against the other. Let u ∈ U be the control input of the first decision-maker
and let v ∈ V denote the control input of the second decision-maker.
On the one hand, the first decision-maker seeks to minimize the magnitude
of the system state together with its applied energy. In contrast, the second
decision-maker seeks to maximize the magnitude of the system state while
minimizing its applied energy.
The adversarial game problem is given by:
\[
\underset{v\in V}{\text{maximize}} \;\; \underset{u\in U}{\text{minimize}} \;\; \mathbb{E}[L(x, u, v)],
\]
subject to
\[
\mathrm{d}x(t) = b(x, u, v)\,\mathrm{d}t + \sigma(x, u, v)\,\mathrm{d}B(t),
\]
and a given initial condition x(0) ≜ x0. Figure 1.10 illustrates the adversarial
situation by means of an example. It can be seen that one decision-maker
pursues an objective contrary to that of the other.

FIGURE 1.10
Two decision-makers illustrating an adversarial game problem: one decision-maker selects v to maximize L(x, u, v) while the other selects u to minimize it.

1.3.4 Berge Game Problem


Let us now analyze an altruistic behavior known as a mutual support situation.
The idea follows the saying: “I help you, and you help me.” First, it is necessary
to highlight that this situation differs from the cooperative scenario, where the
decision-makers have a common interest. Let us consider only two decision-makers
in the set N = {1, 2}. The Berge game problem is given by
\[
\underset{u_j \in U_j}{\text{minimize}} \;\; \mathbb{E}[L_i(x, u)], \qquad i \in \mathcal{N}\setminus\{j\},
\]
subject to (1.1) and a given initial condition x(0) ≜ x0. Notice that it is also
possible to consider a strategic interaction with multiple decision-makers with
a set N = {1, . . . , n}. Under such a scenario, we can still study Berge solutions
using a particular criterion. One alternative consists of minimizing the cost
of the decision-maker whose cost functional is the highest. In other words,
supporting the decision-maker who needs it the most, i.e.,
\[
\underset{u_j \in U_j}{\text{minimize}} \;\; \mathbb{E}[L_i(x, u)], \qquad i \in \underset{a\in\mathcal{N}\setminus\{j\}}{\arg\max}\,\{L_a(x, u)\},
\]
subject to (1.1) and a given initial condition x(0) ≜ x0.


Figure 1.11 shows a mutual support situation. The first decision-maker
acts to minimize the second decision-maker's cost functional and vice versa.
This problem is quite different from the non-cooperative game problem because
the control input of a decision-maker is not incorporated in the
cost functional of the other, which significantly modifies the optimal solution.

FIGURE 1.11
Two decision-makers illustrating a Berge game problem: decision-maker 1 selects u1 to minimize L2(x, u), and decision-maker 2 selects u2 to minimize L1(x, u).

1.3.5 Stackelberg Game Problem


We now analyze a strategic interaction that is played sequentially in time.
The game is composed of two stages comprising two different time
instants, known as action and reaction. Alternatively, for the two-decision-maker
game problem, it is interpreted that there is a leader denoted by j
and a follower denoted by i. The first decision-maker, known as the leader, plays;
then, the second decision-maker, known as the follower, reacts to the
leader's selection. This configuration has gained special importance for modeling
situations in which a hierarchical scheme emerges, for instance, a government
imposing rules and the population reacting, or a large and powerful
company deciding the prices of its products and small competing companies
reacting strategically to compete in the market. Hence, it is interesting
to determine whether the leader has any advantage over the follower and under
which conditions this occurs.

FIGURE 1.12
Two decision-makers illustrating a Stackelberg game problem. Stage 1 (action): the leader selects u1 to minimize L1(x, u). Stage 2 (reaction): the follower observes the leader's choice and selects u2 to minimize L2(x, u).
The Stackelberg game solution is characterized by
\[
u_j^{*} \in \underset{u_j\in U_j}{\arg\min}\,\{L_j(x, u) : u_i \in \mathrm{BR}_i(u_j)\},
\qquad
u_i^{*} \in \mathrm{BR}_i(u_j),
\]
where BR_i(u_j) = {u_i : u_i ∈ arg min L_i(x, u_i, u_j), given u_j} denotes the best
response, or best reaction, against the decision made by the leader. Figure
1.12 presents the sequential strategic situation corresponding to a Stackelberg
game problem. It can be seen that the leader selects its strategy and then the
follower reacts to it.
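
As a small worked example (the quadratic costs and coefficients below are illustrative assumptions), the following sketch computes the Stackelberg solution of a static quadratic game: the follower's best response is obtained in closed form, and the leader then optimizes its own cost anticipating that reaction.

import numpy as np

# Illustrative static Stackelberg game (assumed costs, not from this chapter):
#   leader:   L_l(u_l, u_f) = (x0 + u_l + u_f)^2 + r_l * u_l^2
#   follower: L_f(u_l, u_f) = (x0 + u_l + u_f)^2 + r_f * u_f^2
x0, r_l, r_f = 1.0, 0.5, 1.0

def follower_best_response(u_l):
    # arg min over u_f of L_f, given the leader's action u_l.
    return -(x0 + u_l) / (1.0 + r_f)

# The leader anticipates the follower's reaction and minimizes over its own action.
grid = np.linspace(-5.0, 5.0, 200001)
leader_cost = (x0 + grid + follower_best_response(grid)) ** 2 + r_l * grid ** 2
u_l_star = grid[np.argmin(leader_cost)]
u_f_star = follower_best_response(u_l_star)

print(f"leader action     u_l* = {u_l_star:.4f}")
print(f"follower reaction u_f* = {u_f_star:.4f}")
print(f"leader cost at the Stackelberg solution: {leader_cost.min():.4f}")

Note that the leader optimizes only after substituting the follower's reaction into its own cost, which is precisely the sequential (action-reaction) structure described above.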

1.3.6 Co-opetitive Game Problem


The mixture of cooperation and competition is known as co-opetition.
When there are more than two decision-makers, several combinations
may emerge. In real life, some people might help others while behaving
indifferently toward, or even to the detriment of, the rest. To provide a concrete example,
consider some elections with three candidates: a, b, and c. Any candidate, let
us say b, can act during the campaign against a candidate, let us say a, in order
to help the other candidate, let us say candidate c. In this case, we cannot
really claim that the decision-maker b is a non-cooperative or cooperative one
since its behavior is a combination of both.
In order to demonstrate this concept, let us consider a strategic interaction
involving three decision-makers. Figure 1.13 shows an illustrative example of
a co-opetitive situation.

FIGURE 1.13
Three decision-makers illustrating a co-opetitive game problem: decision-maker 1 selects u1 to minimize L3(x, u) − L2(x, u), thus cooperating with decision-maker 3 and competing with decision-maker 2, while decision-makers 2 and 3 select u2 and u3 to minimize L2(x, u) and L3(x, u), respectively.
The interaction between decision-makers 1 and 2 is given by an adversarial
situation, whereas the interaction between decision-makers 1 and 3 is characterized
by a cooperative situation. Such heterogeneity of behavior
with respect to different decision-makers in the same strategic interaction is
known as a co-opetitive game problem.

1.3.7 Partial-Altruism and Self-Abnegation Game Problem

FIGURE 1.14
Four decision-makers illustrating a partial-altruism game problem: one team minimizes L1(x, u) + L2(x, u), thereby also supporting the other team, which minimizes L2(x, u).

FIGURE 1.15
Four decision-makers illustrating a self-abnegation game problem: one team member maximizes (over u12) the cost L1(x, u) that another member minimizes (over u11), while the other team minimizes L2(x, u).

Once the co-opetitive game problem has been presented in the previous
section, we observe that there is a large variety of possibilities and situations to
analyze with game theory. To illustrate such flexibility, we consider a two-team
interactive decision-making problem. Figures 1.14 and 1.15 present a partial-altruism
scenario and a self-abnegation case, respectively. On the one hand, we observe an
altruistic behavior when one team helps the other. On the other hand, we
observe a self-abnegation case when a member of a team is sabotaging it (pushing
against those who are pulling).
1.4 Partial Integro-Differential System for a Mean-Field-Type Control

Let us consider a control problem like the one presented in Section 1.3.2,
corresponding to the fully-cooperative game problem but with a unique decision-maker,
as shown in Figure 1.16.

FIGURE 1.16
Control problem; equivalently, a single-decision-maker decision-making problem: the decision-maker selects u to minimize L(x, u).

This control problem can be solved by

computing the solution of the Hamilton-Jacobi-Bellman and Fokker-Planck-Kolmogorov
equations, which compose a Partial-Integro-Differential Equation
(PIDE) system for mean-field-type control. These equations describe optimality
conditions for the solution of the mean-field-type problem.
Let us consider a general scalar-valued system whose dynamics are given
by an SDE as follows:
\[
\mathrm{d}x(t) = b(t, x, u, \phi)\,\mathrm{d}t + \sigma(t, x, u, \phi)\,\mathrm{d}B(t), \tag{1.2}
\]
where x ∈ X denotes the scalar system state, u ∈ U denotes the scalar control
input, φ ∈ P(X) is the probability measure of x, and B denotes a standard
Brownian motion. Moreover, the mappings b : [0, T] × X × U × P(X) → R and
σ : [0, T] × X × U × P(X) → R are the state- and control-dependent drift and
diffusion terms, respectively. The control objective consists of minimizing the
following cost functional of mean-field type:
\[
L(t, x, u, \phi) = h(x(T), \phi(T)) + \int_0^T \ell(t, x(t), u(t), \phi(t))\,\mathrm{d}t,
\]
where h : X × P(X) → R denotes the terminal cost and ℓ : [0, T) × X × U ×
P(X) → R denotes the running cost. The risk-aware control problem is:
\[
\underset{u\in U}{\text{minimize}} \;\; \int_{X} L(t, x, u, \phi)\,\phi(t, \mathrm{d}x), \tag{1.3a}
\]
subject to
\[
\mathrm{d}x(t) = b(t, x, u, \phi)\,\mathrm{d}t + \sigma(t, x, u, \phi)\,\mathrm{d}B(t), \tag{1.3b}
\]


\[
x(0) \triangleq x_0. \tag{1.3c}
\]
The problem in (1.3a)–(1.3c) is of mean-field type, or equivalently, it is a
risk-aware control problem involving mean-field terms, which are computed
by using the probability measure of the system state as follows:
\[
\mathbb{E}[x(t)] = \int_{\mathbb{R}} y\,\phi(t, \mathrm{d}y),
\qquad
\mathbb{E}[u(t)] = \int_{\mathbb{R}} u(t, y, \phi)\,\phi(t, \mathrm{d}y),
\]
since the optimal control input is expressed in terms of the system state and its
probability measure. The solution of the risk-aware control problem is found
by solving the following backward-forward partial-integro-differential system:
\[
\begin{aligned}
\phi_t &= -(\phi\, b)_x + \tfrac{1}{2}(\phi\, \sigma^2)_{xx}, &&\text{(1.4a)}\\
\phi(0) &= \phi_0, &&\text{(1.4b)}\\
0 &= V_t(t, \phi) + \int_{X} H(t, x, \phi, V_{x\phi}, V_{xx\phi})\,\phi(t, \mathrm{d}x), &&\text{(1.4c)}\\
V(T, \phi) &= \int_{X} h(y, \phi)\,\phi(T, \mathrm{d}y). &&\text{(1.4d)}
\end{aligned}
\]
The initial boundary condition for φ makes the equation go forward, and the
terminal boundary condition on V makes the equation go backward. Please notice
that, for simplicity of notation, in the system (1.4a)–(1.4d) subindexes
denote partial derivatives. For instance, [·]t denotes the partial derivative with
respect to time, and [·]xx denotes the second partial derivative with respect
to x. Hence,
\[
V(t, \phi) = \underset{u\in U}{\text{minimize}} \; \int_{X} L(t, x, u, \phi)\,\phi(t, \mathrm{d}x)
\]
denotes the optimal cost from time t up to T, and
\[
H(t, x, \phi, V_{x\phi}, V_{xx\phi}) = \underset{u\in U}{\text{minimize}}\;\Big\{ \ell(t, x, u, \phi) + b(t, x, u, \phi)\,V_{x\phi} + \frac{\sigma(t, x, u, \phi)^2}{2}\,V_{xx\phi} \Big\} \tag{1.5}
\]
denotes the integrand Hamiltonian. In (1.4a)–(1.4d), the Fokker-Planck-Kolmogorov
equation is given by (1.4a), and the Hamilton-Jacobi-Bellman
equation is the one in (1.4c). The solution of the system (1.4a)–(1.4d) is
complex and normally requires the implementation of numerical methods.
In fact, for problems involving stochastic processes beyond Brownian
motion, such as Poisson jumps, the Hamiltonian (1.5) becomes more involved,
requiring another integration over the jump set in the integrand Hamiltonian
(see, e.g., [73, 74, 84]), which makes computing the solution more challenging.
Besides, the system presented in (1.4) corresponds to the unique-decision-maker
problem and must be extended to multiple decision-makers for solving game
problems.
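
To give a feel for what solving even the forward part of this system entails, the following sketch (the drift, diffusion, initial density, and discretization parameters are illustrative assumptions) integrates the Fokker-Planck-Kolmogorov equation (1.4a) with an explicit finite-difference scheme, with a fixed feedback control already absorbed into the drift; the backward equation (1.4c) would have to be solved jointly, which is what makes the master system demanding.

import numpy as np

# Grid and time discretization (assumed values; the explicit scheme needs a small dt).
L_dom, nx = 5.0, 401
x = np.linspace(-L_dom, L_dom, nx)
dx = x[1] - x[0]
dt, T = 1e-4, 1.0

drift = -1.0 * x          # assumed drift b(x) = -x (a stabilizing feedback already substituted)
sigma2 = 0.5 ** 2         # assumed constant squared diffusion

# Assumed initial density phi_0: a Gaussian centered at x = 2.
phi = np.exp(-0.5 * ((x - 2.0) / 0.3) ** 2)
phi /= phi.sum() * dx

def ddx(f):
    # Centered first derivative; boundary values kept at zero (the density vanishes there).
    g = np.zeros_like(f)
    g[1:-1] = (f[2:] - f[:-2]) / (2.0 * dx)
    return g

def d2dx2(f):
    # Centered second derivative; boundary values kept at zero.
    g = np.zeros_like(f)
    g[1:-1] = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx ** 2
    return g

for _ in range(int(T / dt)):
    # Explicit update of phi_t = -(phi b)_x + 0.5 (phi sigma^2)_xx, cf. (1.4a).
    phi = phi + dt * (-ddx(phi * drift) + 0.5 * sigma2 * d2dx2(phi))

mean = (x * phi).sum() * dx
var = ((x - mean) ** 2 * phi).sum() * dx
print(f"E[x(T)] ~ {mean:.3f}, var(x(T)) ~ {var:.3f}, total mass ~ {(phi.sum() * dx):.3f}")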
Let us consider a mean-field-type game with n decision-makers, i ∈ N =
{1, . . . , n}, n ∈ N, corresponding to a non-cooperative game problem
as shown in Section 1.3.1. The Hamilton-Jacobi-Bellman system corresponding
to this game problem incorporates as many Hamiltonians as decision-makers,
as follows:
\[
\begin{aligned}
0 &= V_{i,t}(t, \phi) + \int_{X} H_i(t, x, \phi, V_{x\phi}, V_{xx\phi})\,\phi(t, \mathrm{d}x),\\
V_i(T, \phi) &= \int_{X} h_i(y, \phi)\,\phi(T, \mathrm{d}y),
\end{aligned}
\]
where the integrand Hamiltonian is
\[
H_i(t, x, \phi, V_{x\phi}, V_{xx\phi}) = \underset{u_i\in U_i}{\text{minimize}}\;\Big\{ \ell_i(t, x, u, \phi) + b(t, x, u, \phi)\,V_{i,x\phi} + \frac{1}{2}\,\sigma(t, x, u, \phi)^2\,V_{i,xx\phi} \Big\}.
\]
In the Hamiltonian, the functions have the following mappings:
\[
b, \sigma : [0, T] \times X \times \prod_{j\in\mathcal{N}} U_j \times \mathcal{P}(X) \to \mathbb{R},
\qquad
\ell_i : [0, T) \times X \times \prod_{j\in\mathcal{N}} U_j \times \mathcal{P}(X) \to \mathbb{R},
\qquad
h_i : X \times \mathcal{P}(X) \to \mathbb{R}.
\]
An important issue to study and analyze is the existence and uniqueness
of solutions for the system (1.4a)–(1.4d). Such problems have been studied, for
instance, in [73], [85], and [86].
The following important remark emphasizes that the approach
this book follows aims to make mean-field-type game theory accessible
to engineers and early-career researchers in the field.

Important Remark
The motivation in this book is to solve mean-field-type
control and game problems while avoiding the computation of the solution of
the backward-forward partial-integro-differential system for mean-field-type
control or games. Instead, this book proposes to follow the
direct method, either in continuous or discrete time, in order to find semi-explicit
solutions for the mean-field-type problems.

This book is oriented to engineers, beginners in the mean-field-type
control and games field, and early-career researchers.
FIGURE 1.17
General scheme corresponding to the continuous-time direct method: 1. mean-field-type game problem; 2. guess functional; 3. integration formula; 4. square completion; 5. process identification.

1.5 A Simple Method for Solving Mean-Field-Type Games and Control

The mean-field-type game problems presented in this book are solved in a
semi-explicit way by means of the so-called direct method. This method can be
implemented in either continuous or discrete time. Next, we show the general
steps corresponding to the direct method.

1.5.1 Continuous-Time Direct Method


Figure 1.17 presents the general scheme of the continuous-time direct method.
In the first step, we have the mean-field-type game problem statement. Then,
inspired by the structure of both the cost functional and the system dynamics
of the mean-field-type game problem exhibited in the first step, a guess
functional is proposed in the second step. Afterward, the integration formula
is applied, which is given by Itô's formula. The fourth step is to perform square
completion (or optimization over the control-input-dependent terms) to determine
the optimal control inputs. It can be noted that the direct method has a tight
relationship with the HJB equation: this fourth step corresponds to the optimization
of the Hamiltonian with respect to the control inputs. Finally, the process
identification allows deducing the optimality of both the control inputs and the
cost functional.
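
As an elementary, generic illustration of the square-completion step (not tied to any specific model in this chapter): for a > 0,
\[
a u^2 + 2 b u + c = a\left(u + \frac{b}{a}\right)^2 + c - \frac{b^2}{a},
\]
so the control-dependent terms are minimized at u* = −b/a with optimal value c − b²/a. In the direct method, a and b gather the coefficients generated by the guess functional and Itô's formula, and matching the remaining terms is what yields the Riccati-type equations.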
FIGURE 1.18
General scheme corresponding to the discrete-time direct method: 1. mean-field-type game problem; 2. guess functional; 3. telescopic sum; 4. square completion; 5. process identification.

1.5.2 Discrete-Time Direct Method


In this book we also address discrete-time mean-field-type game problems,
corresponding to the first step of the discrete-time direct method presented in
Figure 1.18. It is worth to highlight that the discrete approach might be more
suitable for implementation considerations in real engineering applications.
Then, following the same reasoning as in the continuous-time direct method,
a guess functional is also proposed in the second step. As a third step, and dif-
ferent from the continuous-time direct method, we apply the telescopic sum.
Finally, steps four and five allow identifying the optimal control input by per-
forming square completion (or optimization over the control-input-dependent
terms), and the respective recursive equations associated with the Riccati
equations. Throughout the book, the reader will observe that the complexity
to get the solution for discrete-time problems is not high, but in contrast, the
computation becomes long.
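
The telescopic sum mentioned in the third step is the following generic identity, stated here for illustration: for a guess functional f evaluated along a discrete-time state trajectory x0, x1, . . . , xT,
\[
f(x_T) - f(x_0) = \sum_{k=0}^{T-1}\big(f(x_{k+1}) - f(x_k)\big),
\]
which plays in discrete time the role that Itô's formula plays in continuous time: it expresses the difference between the terminal and initial values of the guess functional as a sum of one-step increments, inside which the square completion of step four is performed.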

1.6 A Simple Derivation of the Itô’s Formula


One of the main ingredients of the direct method is the integration formula,
which is given by Itô's formula. Here, we present a simple derivation of
Itô's formula by means of a power series or Taylor expansion. Let us
consider the same SDE as in (1.2) in Section 1.4, i.e., dx = b dt + σ dB, where
b : X × R × U → R denotes the state- and control-dependent drift term, and
σ : X × R × U → R is the state- and control-dependent diffusion term. Let f(t, x)
be a twice-differentiable function whose Taylor expansion is as follows:
\[
\mathrm{d}f = \frac{\partial f}{\partial t}\,\mathrm{d}t + \frac{\partial f}{\partial x}\,\mathrm{d}x + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,\mathrm{d}x^2 + \text{higher-order terms}.
\]
Now, replacing dx yields
\[
\mathrm{d}f = \frac{\partial f}{\partial t}\,\mathrm{d}t + \frac{\partial f}{\partial x}(b\,\mathrm{d}t + \sigma\,\mathrm{d}B) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(b\,\mathrm{d}t + \sigma\,\mathrm{d}B)^2 + \text{higher-order terms},
\]
and
\[
\mathrm{d}f = \frac{\partial f}{\partial t}\,\mathrm{d}t + \frac{\partial f}{\partial x}b\,\mathrm{d}t + \frac{\partial f}{\partial x}\sigma\,\mathrm{d}B + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}b^2\,\mathrm{d}t^2 + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\,\mathrm{d}B^2 + \frac{\partial^2 f}{\partial x^2}b\sigma\,\mathrm{d}t\,\mathrm{d}B + \text{higher-order terms}.
\]
Considering the fact that E[B²(t)] = t, by the properties of the Brownian motion,
it follows that
\[
\mathrm{d}f = \frac{\partial f}{\partial t}\,\mathrm{d}t + \frac{\partial f}{\partial x}b\,\mathrm{d}t + \frac{\partial f}{\partial x}\sigma\,\mathrm{d}B + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\,\mathrm{d}t.
\]
Finally, since df should be linear in [dt, dB], by identification one obtains
\[
\mathrm{d}f = \Big(\frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}b + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\sigma^2\Big)\mathrm{d}t + \frac{\partial f}{\partial x}\sigma\,\mathrm{d}B,
\]
which is Itô's formula for the standard Brownian process.
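
A quick numerical sanity check of the formula (with assumed constant drift and diffusion, and f(x) = x², purely for illustration): by Itô's formula, d(x²) = (2bx + σ²) dt + 2σx dB, so E[x(T)²] = (x0 + bT)² + σ²T; the sketch below verifies this against a Monte Carlo simulation.

import numpy as np

rng = np.random.default_rng(2)
b, sigma = 0.3, 0.5          # assumed constant drift and diffusion
x0, T, dt = 1.0, 1.0, 1e-3
steps, n_runs = int(T / dt), 100000

# Euler-Maruyama simulation of dx = b dt + sigma dB.
x = np.full(n_runs, x0)
for _ in range(steps):
    x = x + b * dt + sigma * rng.normal(0.0, np.sqrt(dt), n_runs)

# Ito's formula with f(x) = x^2: d(x^2) = (2 b x + sigma^2) dt + 2 sigma x dB,
# hence E[x(T)^2] = (x0 + b T)^2 + sigma^2 T.
ito_prediction = (x0 + b * T) ** 2 + sigma ** 2 * T
print(f"Monte Carlo estimate of E[x(T)^2]: {np.mean(x ** 2):.4f}")
print(f"Ito-formula prediction:            {ito_prediction:.4f}")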

1.7 Outline
This book is divided into six main parts:
1. The first part is devoted to the introduction of preliminary works in
the field of mean-field and mean-field-type control and game theory.
The Direct Method is introduced in both its continuous-time and
discrete-time versions. Then, we highlight the purpose and objectives
of this book, oriented to engineers and early-career researchers.
Moreover, this first part also motivates the study of mean-field-type
control and games, and shows the outline of the book.
2. In the second part, we introduce both mean-field-free and mean-field
games in continuous time. For the sake of simplicity in the
explanation, this part of the book assumes the system state and
control inputs to be scalar values. These approaches are solved by
using the Direct Method. These two types of games allow us to highlight
the main differences with respect to mean-field-type games,
which are the core of this book. In addition, it is important to mention
that the two main branches in this book, i.e., the continuous-time
and discrete-time mean-field-type games, can be studied independently.
Therefore, the reader may omit the continuous-time approach and go
directly to the discrete-time analysis within Part II and Part V.
3. Afterward, the third part continues focusing on the one-dimensional
case. We introduce the simplest mean-field-type game problem,
which is solved by means of the Direct Method. In addition, we
study different solution concepts, i.e., the non-cooperative, fully-cooperative,
and co-opetitive scenarios. The reader can also refer to the
introductory section presented in Part I related to the
different game-theoretical solution concepts.
Finally, two other game-theoretical solution concepts are studied,
i.e., the Stackelberg (leader-follower) mean-field-type game, consisting
of a sequential strategic interaction, and the Berge mean-field-type
game, consisting of a mutual support consideration.
4. The study of matrix-valued mean-field-type games is presented in
the fourth part, where the proposed problems are semi-explicitly
solved by using the Direct Method. In addition, this part discusses
an alternative for considering multiple coupled input constraints by
means of auxiliary variables, without affecting the method used to
obtain semi-explicit solutions.
5. Part five presents the discrete-time version of both the scalar-valued
and matrix-valued cases. This part also studies the cooperative and
non-cooperative solution concepts that are presented in the continuous
counterpart. Note that this part does not discuss the
co-opetitive, Stackelberg, and Berge problems in discrete time
given that, although they are interesting problems, the authors
consider that they do not add extra difficulty.
6. Finally, the last part focuses on other problem settings such
as the stationary case, the risk-aware model predictive control approach,
and data-driven mean-field-type games, presenting a data-driven
mean-field-type game perspective that uses machine learning and
massive data about the dynamical behavior of an unknown system
by means of a simple linear regression. This part also presents
several engineering applications.

Figure 1.19 shows the outline of this book divided into its six main parts,
and presents the suggested interdependence among the chapters to read. Thus,
we suggest some prerequisites for each chapter.
FIGURE 1.19
Outline of the book. Arrows show the interdependence among the chapters. Part I: 1. Introduction. Part II: 2. Mean-Field-Free Games; 3. Mean-Field Games. Part III: 4. Mean-Field-Type Games; 5. Co-opetitive Mean-Field-Type Games; 6. Mean-Field-Type Games with Jump-Diffusion and Regime Switching; 7. Mean-Field-Type Stackelberg Games; 8. Berge Equilibrium in Mean-Field-Type Games. Part IV: 9. Matrix-Valued Mean-Field-Type Games; 10. Constrained Matrix-Valued Mean-Field-Type Games. Part V: 11. One-Dimensional Discrete-Time Mean-Field-Type Games; 12. Matrix-Valued Discrete-Time Mean-Field-Type Games. Part VI: 13. Constrained Mean-Field-Type Games: Stationary Case; 14. Mean-Field-Type Model Predictive Control; 15. Data-Driven Mean-Field-Type Games; 16. Applications.

Important Notation
To improve readability and ease the understanding of the proposed
method presented throughout this book, we omit some arguments of
the functions.

For instance, a drift function in a stochastic differential equation depending
on time t, a regime-switching process s, a system state x, the expectation of
the system state E[x], a control input u, and the expectation of the control
E[u], which is written b(t, s, x, E[x], u, E[u]), is simply denoted by b.

1.8 Exercises
1. Mention an application from your research or study field of interest
(different from the ones discussed in Section 1.2.3), in which the
quantification and minimization of risk is (or could be) important to
enhance the desired performance. Once you select your application,
answer the following questions:
(a) What is the control objective associated with your selected application?
(b) What kind of uncertainties are involved in your selected application?
(c) What kind of control strategies have been implemented for your
selected application?
(d) Which other systems could your engineering problem be coupled
with? See Section 1.2.4 for guidance.
2. Define the following terms:
(a) Mean
(b) Standard deviation
(c) Variance
(d) Covariance
(e) Skewness
(f) Kurtosis
3. Define the following type of decision-makers:
(a) Risk-aware decision-maker
(b) Risk-averse decision-maker
(c) Risk-neutral decision-maker
(d) Risk-seeking decision-maker
(e) Risk-sensitive decision-maker

FIGURE 1.20
Risk doors exercise. Would you risk for a bigger reward? The main decision is between Room 1, which contains 500 USD, and Room 2, which leads to Rooms 2a and 2b, one of which contains 1000 USD and the other nothing.
4. Let us suppose you have the following offer. You can select between
two rooms: 1 and 2. Once you have selected a room, you cannot go
back or change your selection.
• Room 1: You will find 500 USD.
• Room 2: You will have the option to enter either of two rooms,
2a and 2b. In one of them, you will find 1000 USD, and in the
other there is nothing.
The situation is summarized in Figure 1.20. Answer the following
questions:
(a) What would you do, i.e., take the 500 USD for sure, or try to
get the 1000 USD?
(b) Do you consider yourself a risky person?
(c) What is the expected reward for Rooms 1 and 2?
5. Let us consider the same scenario as in the previous exercise, but
with a reward of x USD in Room 1. Then answer the following
questions:
(a) For which values of x would a risk-averse decision-maker select
Room 1?
(b) For which values of x would a risk-neutral decision-maker select
randomly between Rooms 1 and 2?
(c) For which values of x would a risk-seeking decision-maker select
Room 2?
A minimal numerical sketch related to these two exercises is given
right after this exercise list.
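The following is a minimal sketch for Exercises 4 and 5, assuming that the
room behind Room 2 containing the 1000 USD is reached with probability 0.5
and using a mean-variance criterion J = E[reward] - lambda*Var[reward] as a
simple proxy for risk awareness; the probability and the weight lambda are
assumptions made only for this illustration.

% Minimal sketch (assumed values): compare Room 1 and Room 2 with a
% mean-variance criterion J = E[reward] - lambda*Var[reward].
x      = 500;                   % sure reward in Room 1 (USD); replace by x from Exercise 5
p      = 0.5;                   % assumed probability of reaching the 1000 USD room
reward = [1000; 0];             % possible outcomes behind Room 2 (USD)
prob   = [p; 1-p];
E2     = prob' * reward;                 % expected reward of Room 2
Var2   = prob' * (reward - E2).^2;       % variance of the Room 2 reward
lambda = 1e-3;                           % risk-aversion weight (assumption)
J1 = x;                                  % Room 1 is deterministic (zero variance)
J2 = E2 - lambda*Var2;
fprintf('Room 1: J = %.1f,  Room 2: J = %.1f\n', J1, J2);
% A risk-neutral decision-maker (lambda = 0) compares x with E2, whereas a
% risk-averse one (lambda > 0) prefers Room 1 whenever x > E2 - lambda*Var2.

Varying x and lambda in this sketch gives some intuition for questions (a)-(c)
of Exercise 5.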