Active inference and robot control: a case study

rsif.royalsocietypublishing.org

Research

Léo Pio-Lopez¹,², Ange Nizard¹, Karl Friston³ and Giovanni Pezzulo²

¹Pascal Institute, Clermont University, Clermont-Ferrand, France
²Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
³The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK

LP-L, 0000-0001-8081-1070

Cite this article: Pio-Lopez L, Nizard A, Friston K, Pezzulo G. 2016 Active inference and robot control: a case study. J. R. Soc. Interface 13: 20160616. http://dx.doi.org/10.1098/rsif.2016.0616

Received: 3 August 2016
Accepted: 1 September 2016

Subject Category: Life Sciences–Engineering interface
Subject Areas: systems biology, biomathematics, computational biology
Keywords: active inference, free energy, robot control

Active inference is a general framework for perception and action that is gaining prominence in computational and systems neuroscience but is less known outside these fields. Here, we discuss a proof-of-principle implementation of the active inference scheme for the control of the 7-DoF arm of a (simulated) PR2 robot. By manipulating visual and proprioceptive noise levels, we show under which conditions robot control under the active inference scheme is accurate. Besides accurate control, our analysis of the internal system dynamics (e.g. the dynamics of the hidden states that are inferred during the inference) sheds light on key aspects of the framework, such as the quintessentially multimodal nature of control and the differential roles of proprioception and vision. In the discussion, we consider the potential importance of being able to implement active inference in robots. In particular, we briefly review the opportunities for modelling psychophysiological phenomena such as sensory attenuation and related failures of gain control, of the sort seen in Parkinson's disease. We also consider the fundamental difference between active inference and optimal control formulations, showing that in the former the heavy lifting shifts from solving a dynamical inverse problem to creating deep forward or generative models with dynamics, whose attracting sets prescribe desired behaviours.
Author for correspondence: Giovanni Pezzulo, e-mail: giovanni.pezzulo@istc.cnr.it

1. Introduction

Active inference has recently acquired significant prominence in computational and systems neuroscience as a general theory of brain and behaviour [1,2]. This framework uses one single principle—surprise (or free energy) minimization—to explain perception and action. It has been applied to a variety of domains, including perception–action loops and perceptual learning [3,4]; Bayes-optimal sensorimotor integration and predictive control [5]; action selection [6,7]; and goal-directed behaviour [8–12].
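The surprise-minimization principle rests on a simple mathematical fact: variational free energy is an upper bound on surprise (negative log evidence), and the bound is tight when beliefs equal the true posterior. A minimal numerical sketch (our illustration, using an arbitrary toy joint density rather than any model from this paper):

```python
import numpy as np

# Toy check (ours) that free energy upper-bounds surprise:
# F(q) = KL[q || p(psi|s)] - ln p(s) >= -ln p(s), with equality at the posterior.
p_joint = np.array([[0.30, 0.10],   # arbitrary joint p(psi, s); rows: psi, cols: s
                    [0.20, 0.40]])
s = 1                               # the observed sensory state
p_s = p_joint[:, s].sum()           # model evidence p(s)
surprise = -np.log(p_s)
posterior = p_joint[:, s] / p_s     # exact posterior p(psi | s)

def free_energy(q):
    """F(q) = E_q[ln q(psi) - ln p(psi, s)]."""
    return float(np.sum(q * (np.log(q) - np.log(p_joint[:, s]))))
```

For the arbitrary belief q = (0.5, 0.5), the free energy (about 0.92) exceeds the surprise (about 0.69); for q equal to the posterior, the two coincide.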
Active inference starts from the fundamentals of self-organization, which suggest that any adaptive agent needs to maintain its biophysical states within limits, thereby maintaining a generalized homeostasis that enables it to resist the second law of thermodynamics [2]. To this aim, an agent's actions and perceptions both need to minimize surprise, that is, a measure of the discrepancy between the agent's current and predicted (or desired) states. Crucially, agents cannot minimize surprise directly, but they can minimize an upper bound on surprise, namely the free energy of their beliefs about the causes of sensory input [1,2].

© 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

This idea is cast in terms of Bayesian inference: the agent is endowed with priors that describe its desired states and a (hierarchical, generative) model of the world. It uses the model to generate continuous predictions that it tries to fulfil via action; that is to say, the agent actively samples the world to minimize prediction errors, so that surprise (or its upper bound, free energy) is suppressed. More formally, this is a process in which beliefs about (hidden or
latent) states of the world maximize Bayesian model evidence of observations, while observations are sampled selectively to conform to the model [13,14]. The agent has essentially two ways to reduce surprise: change its beliefs or hypotheses (perception), or change the world (action). For example, if it believes that its arm is raised, but observes that it is not, then it can either change its mind or raise the arm—either way, its prediction comes true (and free energy is minimized). As we will see, in active inference this result can be obtained by endowing a Bayesian filtering scheme with reflex arcs that enable action, such as raising a robotic arm or using it to touch a target. In this example, the agent generates a (pro-

…and x ∈ X for a particular value. The tilde notation x̃ = (x, x′, x″, …) denotes variables in generalized coordinates of motion [24]; each prime is a temporal derivative. p(X) denotes a probability density.

— Ω is the sample space from which random fluctuations ω ∈ Ω are drawn.
— Hidden states Ψ : Ψ × A × Ω → ℝ depend on action and are part of the dynamics of the world that cause sensory states.
— Sensory states S : Ψ × A × Ω → ℝ are the agent's sensations and constitute a probabilistic mapping from action and hidden states.
The minimization of free energy can be summarized as

\[
\begin{aligned}
a(t) &= \arg\min_a \, F(\tilde{s}(t), \tilde{\mu}(t))\\
\tilde{\mu}(t) &= \arg\min_{\tilde{\mu}} \, F(\tilde{s}(t), \tilde{\mu})\\
F(\tilde{s}, \tilde{\mu}) &= \langle G(\tilde{\Psi}, \tilde{s} \mid m)\rangle_q - H[q(\tilde{\Psi} \mid \tilde{\mu})]\\
&= D[\,q(\tilde{\Psi} \mid \tilde{\mu}) \,\|\, p(\tilde{\Psi} \mid \tilde{s}, m)\,] - \ln p(\tilde{s}(a) \mid m)\\
&\geq -\ln p(\tilde{s}(a) \mid m).
\end{aligned}
\tag{2.5}
\]

The term D[·‖·] is the Kullback–Leibler divergence (or cross-entropy) between two densities. The minimizations on a and μ̃ can be performed by gradient descent; the ensuing updates comprise a prediction (first) term and an update (second) term based upon free-energy gradients that, as we see below, can be expressed in terms of prediction errors (this corresponds to the basic form of a Kalman filter). D is a differential matrix operator that operates on generalized motion, and Dμ̃ describes the generalized motion of conditional expectations; generalized motion comprises vectors of velocity, acceleration, jerk, etc.

The generative model has the following hierarchical form:

\[
s = g^{(1)}(x^{(1)}, \nu^{(1)}) + \omega^{(1)}_{\nu}
\]
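The gradient descent on free energy implied by equation (2.5) can be sketched for a minimal model. In the sketch below (ours, with assumed toy settings, not the paper's implementation), the generalized expectations follow the recognition dynamics μ̃̇ = Dμ̃ − ∂F/∂μ̃:

```python
import numpy as np

# Minimal sketch (ours, with assumed toy settings; not the paper's code) of the
# recognition dynamics implied by equation (2.5): gradient descent on free
# energy in generalized coordinates, mu_dot = D mu - dF/dmu. We assume a 1-D
# hidden state with prior dynamics f(x) = -x, identity sensory mapping
# g(x) = x and unit precisions.
s = 1.0                        # a (static) sensory sample
pi_s, pi_x = 1.0, 1.0          # precisions of sensory and state fluctuations
mu = np.array([0.0, 0.0])      # generalized expectations (mu, mu')
D = np.array([[0.0, 1.0],      # differential shift operator: D(mu, mu') = (mu', 0)
              [0.0, 0.0]])
dt = 0.05
for _ in range(400):
    eps_s = s - mu[0]          # sensory prediction error
    eps_x = mu[1] - (-mu[0])   # prediction error on motion, with f(mu) = -mu
    dF = np.array([-pi_s * eps_s + pi_x * eps_x,  # dF/dmu
                   pi_x * eps_x])                 # dF/dmu'
    mu = mu + dt * (D @ mu - dF)
# The prediction term (D mu) and update term (-dF/dmu) balance at mu = (0.5, -0.5).
```

The expectations settle where the prediction and update terms cancel, illustrating how perception trades off sensory evidence against the prior dynamics encoded by the generative model.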
2.4. Action

A motor trajectory (e.g. the trajectory of raising the arm) is produced via classical reflex arcs that suppress proprioceptive prediction errors:

\[
\dot{a} = -\partial_a F = -\left(\partial_a \tilde{\varepsilon}^{(1)}_{\nu}\right)^{\mathrm{T}} \Pi^{(1)}_{\nu}\, \tilde{\varepsilon}^{(1)}_{\nu}.
\tag{2.10}
\]
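A scalar caricature of this reflex arc (our sketch, not the paper's code; it assumes a trivially simple plant in which the joint angle equals the commanded action, so the derivative of the error with respect to action is −1):

```python
# Toy sketch (ours) of the reflex arc in equation (2.10): action descends the
# free-energy gradient conveyed by precision-weighted proprioceptive errors.
Pi = 2.0      # precision of proprioceptive prediction errors
mu = 0.8      # predicted (desired) joint angle from the generative model
a = 0.0       # action; we assume the plant is x = a, so d(eps)/da = -1
dt = 0.1
for _ in range(200):
    eps = mu - a                    # proprioceptive prediction error (x = a)
    a += dt * (-(-1.0) * Pi * eps)  # a_dot = -(d eps/d a) * Pi * eps
# The reflex arc fulfils the prediction: a approaches mu = 0.8.
```

The action simply chases the proprioceptive prediction until the error is suppressed, which is how prior beliefs about a trajectory are turned into movement.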
\[
g_m = \frac{w \cdot f_2}{\lVert w\rVert\,\lVert f_2\rVert}, \qquad g_c = w \times f_2,
\tag{2.15}
\]

where the absolute value of g_c is bounded by g_max > 0. Then, the second action is defined as

\[
b_1 = b_3 = b_5 = b_6 = b_7 = 0, \qquad b_2 = g_m\, g_c\, p_{p2} \ \text{(shoulder)}, \qquad b_4 = g_m\, g_c\, p_{p4} \ \text{(elbow)},
\tag{2.16}
\]

where p_{p2} and p_{p4} are additional positive settings used to balance

In this illustration, we focus on the sample problem illustrated in figure 2, where the start position is on the bottom-centre and the desired position is the green dot. The four panels of figure 2 exemplify the robot reaching under the four scenarios that we considered. In the first scenario (figure 2a), there was no noise on proprioception or vision. In the second, third and fourth scenarios, proprioception (figure 2b), vision (figure 2c) or both (figure 2d) were noisy, respectively. We used noise with a log precision of 4. As illustrated by the figures, in the absence of noise (first
Figure 2. Reaching trajectories in three dimensions from a start to a goal location under four scenarios. (a) Scenario 1: reaching in three dimensions with 7 DoF.
(b) Scenario 2: reaching in three dimensions with noisy proprioception. (c) Scenario 3: reaching in three dimensions with noisy vision. (d) Scenario 4: reaching in
three dimensions with noisy proprioception and vision. The blue trajectory is the mean of 20 trajectories shown in grey.
4. Discussion

Our case study shows that the active inference scheme can control the seven-DoF arm of a simulated PR2 robot—focusing here on the task of reaching a desired goal location from a (predefined) start location.

Figure 3. Reaching trajectories from a common starting point. In blue, no noise (scenario 1). In black, noise on proprioception (scenario 2). In red, noise on vision (scenario 3). In yellow, noise on vision and proprioception (scenario 4). The trajectories are the mean of 20 simulations from the same start and goal (green) locations.

…corresponds to an internal degree of freedom that does not have any effect on the trajectory. The figures show some slight oscillations after 20 iterations, which are due to the fact that the arm is moving in the proximity of the target. In the noisy proprioception scenario (figure 5), hidden states are significantly more uncertain compared with the reference case with no noise (figure 4). Yet, despite the uncertainty about joint angles, the robot can still rely on (intact) vision to infer where the arm is in space, and is thus ultimately able to reach the target—although it follows largely suboptimal trajectories (in relation to its prior preferences). Multimodal integration or compensation is impossible if both vision and proprioception are sufficiently degraded (figure 7). In the noisy vision scenario (figure 6), noise has some effect on inferred causes but affects hidden states (and ultimately action selection) only to a minor extent.
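The compensation between modalities described above can be illustrated with the standard precision weighting of two Gaussian cues (a generic sketch, ours; the paper's scheme realizes this implicitly through precision-weighted prediction errors):

```python
# Generic sketch (ours, not the paper's code) of precision-weighted multimodal
# fusion: an estimate of arm position combines proprioceptive and visual cues
# weighted by their precisions (inverse variances), so degrading one modality
# shifts reliance towards the other.
def fuse(mu_prop, pi_prop, mu_vis, pi_vis):
    """Posterior mean and precision for two independent Gaussian cues."""
    pi_post = pi_prop + pi_vis
    mu_post = (pi_prop * mu_prop + pi_vis * mu_vis) / pi_post
    return mu_post, pi_post

# Intact proprioception (high precision) dominates the estimate; noisy
# proprioception (low precision) hands the estimate over to intact vision.
mu_intact, _ = fuse(mu_prop=0.2, pi_prop=100.0, mu_vis=0.6, pi_vis=1.0)
mu_noisy, _ = fuse(mu_prop=0.2, pi_prop=0.01, mu_vis=0.6, pi_vis=1.0)
```

With intact proprioception the fused estimate sits near the proprioceptive cue; once proprioceptive precision collapses, the same rule pulls the estimate towards the visual cue, mirroring the behaviour seen in the noisy-proprioception scenario.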
Figure 4. Dynamics of the model's internal variables in the normal case. The conditional predictions and expectations are shown as functions of the iterations. (a) The panel shows the conditional predictions (coloured lines) and the corresponding prediction errors (red lines), based upon the expected states on the upper right. (b) The coloured lines represent the expected hidden states causing sensory predictions. These can be thought of as displacements in the articulatory state space. In this panel and throughout, the grey areas denote 90% Bayesian confidence intervals. (c) The dotted lines represent the true expectation and the solid lines show the conditional expectation of the hidden cause. (d) Action (solid line) and true causes (dotted line).
This pattern of results shows that control is quintessentially multimodal and based on both vision and proprioception; adding noise to either modality can be (partially) compensated for by appealing to the other, more precise dimension. However, proprioception and vision play differential roles in this scheme. Proprioception has a direct effect on hidden states and action selection; this is because action dynamics depend on reflex arcs that suppress proprioceptive (not visual) prediction errors (see §2.4). If the robot has poor proprioceptive information, then it can use multimodal integration and the visual modality to compensate and restore efficient control. However, if both modalities are degraded with noise, then multimodal integration becomes imprecise, and the robot cannot reduce error accurately—at least in the simplified control scheme assumed here, which (on purpose) does not include any additional corrective mechanism. Adding noise to vision is less problematic, given that in the (reaching) task considered here it plays a more ancillary role. Indeed, our reaching task does not pose strong demands on the estimation of hidden causes for accurate control; the situation may be different if one needs, for example, to estimate the pose of a to-be-grasped object.

The above-mentioned results are consistent with a large body of studies showing the importance of proprioception for control tasks. Patients with impaired proprioception can still execute motor tasks such as reaching, grasping and locomotion, but their performance is suboptimal [35–38]. In principle, the scheme proposed here may be used to explain human data under impaired proprioception [4] or other deficits in motor control—or even to help design rehabilitation therapies. From this perspective, an interesting issue that we have not addressed pertains to the attenuation of proprioceptive prediction errors during movement. Heuristically, this sensory attenuation is necessary to allow the prior beliefs of the generative model to supervene over the sensory evidence that movement has not yet occurred (or, in other words, to prevent internal states from encoding the fact that there is no movement). This speaks to a dynamic gain control that mediates the attenuation of the precision of prediction errors during movement. In the example shown above, we simply reduced the precision of ascending proprioceptive prediction
Figure 5. (a – d) Dynamics of the model internal variables in the noisy proprioception case. The layout of the figure is the same as figure 4. Please see the previous
figure legends for details.
errors to enable movement. Had we increased their precision, the ensuing failure of sensory attenuation would have subverted movement; perhaps in a similar way to the poverty of movement seen in Parkinson's disease—a disease that degrades motor performance profoundly [39]. This aspect of precision or gain control suggests that being able to implement active inference in robots will allow us to perform simulated psychophysical experiments to illustrate sensory attenuation and its impairments. Furthermore, it suggests that a robotic model of Parkinson's disease is within reach, providing an interesting opportunity for simulating pathophysiology.

Clearly, some of our choices when specifying the generative model are heuristic—or appeal to established notions. For example, adding a derivative term to equation (2.14) could change the dynamics in an interesting way. In general, the framework shown above accommodates questions about alternative models and dynamics through Bayesian model comparison. In principle, we have an objective function (variational free energy) that scores the quality of any generative model entertained by a robot—in relation to its embodied exchange with the environment. This means we could change the generative model and assess the quality of the ensuing behaviour using variational free energy—and select the best generative model in exactly the same way that people characterize experimental data by comparing the evidence for different models in Bayesian model comparison. We hope to explore this in future work.

We next hope to port the scheme to a real robot. This will be particularly interesting, because there are several facets of active inference that are more easily demonstrated in a real-world artefact. These aspects include robustness to exogenous perturbations. For example, the movement trajectory should gracefully recover from any exogenous forces applied to the arm during movement. Furthermore, theoretically, the active inference scheme is also robust to differences between the true motor plant and the various kinematic constants in the generative model. This robustness follows from the fact that the movement is driven by (fictive) forces whose fixed points do not change with exogenous perturbations—or with many parameters of the generative model (or process). Another interesting advantage of real-world implementations will be the opportunity to examine robustness to sensorimotor delays. Although not necessarily a problem from a purely robotics perspective, biological robots suffer non-trivial delays in the signalling of ascending sensory signals and descending motor predictions. In principle, these delays can be absorbed into the generative model—as has been illustrated in the context of oculomotor control [40]. At present, these
Figure 6. (a–d) Dynamics of the model internal variables in the noisy vision case. The layout of the figure is the same as figure 4. Please see the previous figure
legends for details.
proposals for how the brain copes with sensorimotor delays in oculomotor tracking remain hypothetical. It would be extremely useful to see if they could be tested in a robotics setting.

As noted above, active inference shares many similarities with the passive movement paradigm (PMP, [33,34]). Although, strictly speaking, active inference is a corollary of the free-energy principle, it inherits the philosophy of the PMP in the following sense. Active inference is equipped with a generative model that maps from causes to consequences. In the setting of motor control, the causes are forces that have some desired fixed point or orbit. It is then a simple matter to predict the sensory consequences of those forces—as sensed by proprioception or robotic sensors. These sensory predictions can then be realized through open-loop control (e.g. peripheral servos or reflex arcs), thereby realizing the desired fixed point (cf. the equilibrium point hypothesis [32]). However, unlike the equilibrium point hypothesis, active inference is not open loop. This is because its motor predictions are informed by deep generative models that are sensitive to input from all modalities (including proprioception). The fact that action realizes the (sensory) consequences of (prior) causes explains why there is no need for an inverse model.

Optimal motor control formulations [15,16] are fundamentally different. Briefly, optimal control operates by minimizing some cost function in order to compute motor commands for a robot performing a particular motor task. Optimal control theory requires a mechanism for state estimation as well as two internal models: an inverse and a forward model. This scheme also assumes that the appropriate optimality equation can be solved [41]. In contrast, active inference uses prior beliefs about the movement (in an extrinsic frame of reference) instead of optimal control signals for movements (in an intrinsic frame of reference). In active inference, there is no inverse model or cost function, and the resulting trajectories are Bayes optimal. This contrasts with optimal control, which calls on the inverse model to finesse problems incurred by sensorimotor noise and delays. Inverse models are not required in active inference, because the robot's generative (or forward) model is inverted during the inference. Active inference also dispenses with cost functions, as these are replaced by the robot's (prior) beliefs (of note, there is a general duality between control and inference [15,16]). In brief, replacing the cost function with prior beliefs means that minimizing cost corresponds to maximizing the marginal likelihood of a generative model [42–44]. A formal correspondence between cost functions and prior beliefs can be established with the complete class theorem [45,46], according to which there is at least one prior belief and cost function that can produce a Bayes-optimal motor behaviour. In sum, optimal control formulations start with a desired endpoint (consequence) and
Figure 7. (a –d) Dynamics of the model internal variables in the ‘all noisy’ case. The layout of the figure is the same as figure 4. Please see the previous figure
legends for details.
try to reverse engineer the forces (causes) that produce the desired consequences. It is this construction that poses a difficult inverse problem, with solutions that are not generally robust—and are often problematic in robot control. Active inference finesses this problem by starting with the causes of movement, as opposed to the consequences.

Accordingly, one can see the solutions offered under optimal control as special cases of the solutions available under an (active) inferential scheme. This is because some policies cannot be specified using cost functions but can be described using priors; specifically, this is the case for solenoidal movements, whose cost is equal for every part of the trajectory [47]. This follows from variational calculus, which says that a trajectory or policy has several components: a curl-free component that changes value and a divergence-free component that does not. The divergence-free motion can only be specified by a prior, not by a cost function. Discussing the relative benefits of control schemes with or without cost functions and inverse models is beyond the scope of this article. Here, it suffices to say that inverse models are generally hard to learn for robots, and cost functions sometimes need to be defined in an ad hoc manner for robot control tasks. By avoiding these constraints, active inference may offer a promising alternative to optimal control schemes. For a more detailed discussion of the links between optimal control and active inference, see [47].

Although active inference resolves many problems that attend optimal control schemes, there is no free lunch. In active inference, all the heavy lifting is done by the generative model—and in particular, by the priors that define desired set-points or orbits. The basic idea is to induce these attractors by specifying appropriate equations of motion within the generative model of the robot. This means that the art of generating realistic and purposeful behaviour reduces to creating equations of motion that have the desired attractors. These can be simple fixed-point attractors, as in the example above. They could also be much more complicated, producing quasi-periodic motion (as in walking) or fluent sequences of movements specified by heteroclinic cycles. All the more interesting examples in the theoretical literature to date rest upon some form of itinerant dynamics inherent in the generative model, which sometimes has deep structure. A nice example of this is the handwriting example in [48], which used Lotka–Volterra equations to specify a sequence of saddle points—producing a series of movements. Simpler examples could use both attracting and repelling fixed points, corresponding to contact points and collision points, respectively, to address more practical issues in robotics. Irrespective of the particular repertoire of attractors implicit in the generative model, the hierarchical aspect of the generative models that underlie active inference enables the composition of movements, sequences of movements and sequences of sequences [49–51]. In other words, provided one can write down (or learn) a deep generative model with itinerant dynamics, there is the possibility of simulating realistic movements that inherit deep temporal structure and context sensitivity.

In conclusion, in this article we have presented a proof-of-concept implementation of robot control using active inference, a biologically motivated scheme that is gaining prominence in computational and systems neuroscience. The results discussed here demonstrate the feasibility of the scheme; having said this, further work is necessary to fully demonstrate how this scheme works in more challenging settings and, more generally, to consider future (predicted) and not only currently sensed contingencies [17,52–56]. Planning mechanisms have been described under the active inference scheme and can solve challenging problems such as the mountain-car problem [5]; they can thus be seamlessly integrated into the model presented here—speaking to the scalability of the active inference scheme. Finally, one reason for using a biologically realistic model such as active inference is that it may be possible to directly map the internal dynamics generated by the robot simulator (e.g. of hidden states) to brain signals (e.g. EEG signals reflecting predictions and prediction errors) generated during equivalent action planning or performance.
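The itinerant dynamics invoked in the discussion (e.g. the Lotka–Volterra sequences used for handwriting in [48]) can be caricatured in a few lines. The sketch below (ours, not the model of [48]) shows a generalized Lotka–Volterra system whose heteroclinic cycle visits three saddle points in a fixed order, i.e. a minimal "movement sequence" generator:

```python
import numpy as np

# Toy sketch (ours; not the handwriting model of [48]) of itinerant dynamics:
# a generalized Lotka-Volterra system whose heteroclinic cycle visits three
# saddle points in a fixed order, serving as a minimal sequence generator.
rho = np.array([[1.0, 1.5, 0.5],
                [0.5, 1.0, 1.5],
                [1.5, 0.5, 1.0]])   # asymmetric competition matrix
x = np.array([1.0, 0.02, 0.01])     # start near the first saddle
dt, order = 0.01, []
for _ in range(10000):
    x = x + dt * x * (1.0 - rho @ x)   # dx_i/dt = x_i (1 - sum_j rho_ij x_j)
    dominant = int(np.argmax(x))
    if not order or order[-1] != dominant:
        order.append(dominant)         # record each change of dominant unit
# The dominant unit visits the saddles in the itinerant order 0 -> 1 -> 2 -> ...
```

Each "dominant" unit could stand in for one movement primitive; embedding such equations of motion in the generative model is one way to prescribe fluent sequences without any cost function.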
References
1. Friston K. 2010 The free-energy principle: a unified brain theory? Nat. Rev. Neurosci. 11, 127–138. (doi:10.1038/nrn2787)
2. Friston K, Kilner J, Harrison L. 2006 A free energy principle for the brain. J. Physiol.-Paris 100, 70–87. (doi:10.1016/j.jphysparis.2006.10.001)
3. Friston K, Kiebel S. 2009 Predictive coding under the free-energy principle. Phil. Trans. R. Soc. B 364, 1211–1221. (doi:10.1098/rstb.2008.0300)
4. Friston K et al. 2012 Dopamine, affordance and active inference. PLoS Comput. Biol. 8, e1002327. (doi:10.1371/journal.pcbi.1002327)
5. Friston KJ, Daunizeau J, Kilner J, Kiebel SJ. 2010 Action and behavior: a free-energy formulation. Biol. Cybern. 102, 227–260. (doi:10.1007/s00422-010-0364-z)
6. Friston K, Schwartenbeck P, FitzGerald T, Moutoussis M, Behrens T, Dolan RJ. 2013 The anatomy of choice: active inference and agency. Front. Hum. Neurosci. 7, 598. (doi:10.3389/fnhum.2013.00598)
7. FitzGerald T, Schwartenbeck P, Moutoussis M, Dolan RJ, Friston K. 2015 Active inference, evidence accumulation, and the urn task. Neural Comput. 27, 306–328. (doi:10.1162/NECO_a_00699)
8. Friston KJ, Daunizeau J, Kiebel SJ. 2009 Reinforcement learning or active inference? PLoS ONE 4, e6421. (doi:10.1371/journal.pone.0006421)
9. Friston K, Rigoli F, Ognibene D, Mathys C, FitzGerald T, Pezzulo G. 2015 Active inference and epistemic value. Cogn. Neurosci. 6, 187–214. (doi:10.1080/17588928.2015.1020053)
10. Friston K, FitzGerald T, Rigoli F, Schwartenbeck P, O'Doherty J, Pezzulo G. 2016 Active inference and learning. Neurosci. Biobehav. Rev. 68, 862–879. (doi:10.1016/j.neubiorev.2016.06.022)
11. Pezzulo G, Cartoni E, Rigoli F, Pio-Lopez L, Friston K. 2016 Active inference, epistemic value, and vicarious trial and error. Learn. Mem. 23, 322–338. (doi:10.1101/lm.041780.116)
12. Pezzulo G, Rigoli F, Friston K. 2015 Active inference, homeostatic regulation and adaptive behavioural control. Prog. Neurobiol. 134, 17–35. (doi:10.1016/j.pneurobio.2015.09.001)
13. Dayan P, Hinton GE, Neal RM, Zemel RS. 1995 The Helmholtz machine. Neural Comput. 7, 889–904. (doi:10.1162/neco.1995.7.5.889)
14. Knill DC, Pouget A. 2004 The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 27, 712–719. (doi:10.1016/j.tins.2004.10.007)
15. Todorov E, Jordan MI. 2002 Optimal feedback control as a theory of motor coordination. Nat. Neurosci. 5, 1226–1235. (doi:10.1038/nn963)
16. Todorov E. 2008 General duality between optimal control and estimation. In 47th IEEE Conf. on Decision and Control (CDC), 2008, pp. 4286–4292. IEEE.
17. Botvinick M, Toussaint M. 2012 Planning as inference. Trends Cogn. Sci. 16, 485–488. (doi:10.1016/j.tics.2012.08.006)
18. Donnarumma F, Maisto D, Pezzulo G. 2016 Problem solving as probabilistic inference with subgoaling: explaining human successes and pitfalls in the tower of Hanoi. PLoS Comput. Biol. 12, e1004864. (doi:10.1371/journal.pcbi.1004864)
19. Pezzulo G, Rigoli F. 2011 The value of foresight: how prospection affects decision-making. Front. Neurosci. 5, 79. (doi:10.3389/fnins.2011.00079)
20. Pezzulo G, Rigoli F, Chersi F. 2013 The mixed instrumental controller: using value of information to combine habitual choice and mental simulation. Front. Psychol. 4, 92. (doi:10.3389/fpsyg.2013.00092)
21. Todorov E. 2006 Linearly-solvable Markov decision problems. Adv. Neural Inf. Process Syst. 19, 1369–1376.
22. Maisto D, Donnarumma F, Pezzulo G. 2015 Divide et impera: subgoaling reduces the complexity of probabilistic inference and problem solving. J. R. Soc. Interface 12, 20141335. (doi:10.1098/rsif.2014.1335)
23. Friston K, Adams R, Perrinet L, Breakspear M. 2012 Perceptions as hypotheses: saccades as experiments. Front. Psychol. 3, 151.
24. Friston K, Stephan K, Li B, Daunizeau J. 2010 Generalised filtering. Math. Probl. Eng. 2010, 621670. (doi:10.1155/2010/621670)
25. Ashby WR. 1947 Principles of the self-organizing dynamic system. J. Gen. Psychol. 37, 125–128. (doi:10.1080/00221309.1947.9918144)
26. Feynman RP. 1998 Statistical mechanics: a set of lectures (advanced book classics). Boulder, CO: Westview Press Incorporated.
27. Hinton GE, Van Camp D. 1993 Keeping the neural networks simple by minimizing the description length of the weights. In Proc. of the Sixth Annual Conf. on Computational Learning Theory, pp. 5–13. ACM.
28. Zeki S, Shipp S. 1988 The functional logic of cortical connections. Nature 335, 311–317. (doi:10.1038/335311a0)
29. Quigley M et al. 2009 ROS: an open-source robot operating system. In Proc. Open-Source Software Workshop, Int. Conf. on Robotics and Automation, Kobe, Japan, May, vol. 3, p. 5.
30. Körding KP, Wolpert DM. 2004 Bayesian integration in sensorimotor learning. Nature 427, 244–247. (doi:10.1038/nature02169)
31. Diedrichsen J, Verstynen T, Hon A, Zhang Y, Ivry RB. 2007 Illusions of force perception: the role of sensori-motor predictions, visual information, and motor errors. J. Neurophysiol. 97, 3305–3313. (doi:10.1152/jn.01076.2006)
32. Feldman AG, Levin MF. 2009 The equilibrium-point hypothesis—past, present and future. In Progress in motor control, pp. 699–726. Berlin, Germany: Springer.
33. Mussa-Ivaldi F. 1988 Do neurons in the motor cortex encode movement direction? An alternative hypothesis. Neurosci. Lett. 91, 106–111. (doi:10.
39. Konczak J, Corcos DM, Horak F, Poizner H, Shapiro M, Tuite P, Volkmann J, Maschke M. 2009 Proprioception and motor control in Parkinson's disease. J. Motor Behav. 41, 543–552. (doi:10.3200/35-09-002)
40. Perrinet LU, Adams RA, Friston KJ. 2014 Active inference, eye movements and oculomotor delays. Biol. Cybern. 108, 777–801. (doi:10.1007/s00422-014-0620-8)
41. Bellman R. 1952 On the theory of dynamic programming. Proc. Natl Acad. Sci. USA 38, 716–719. (doi:10.1073/pnas.38.8.716)
49. … Comput. Biol. 5, e1000464. (doi:10.1371/journal.pcbi.1000464)
50. Pezzulo G. 2012 An active inference view of cognitive control. Front. Psychol. 3, 478. (doi:10.3389/fpsyg.2012.00478)
51. Pezzulo G, Donnarumma F, Iodice P, Prevete R, Dindo H. 2015 The role of synergies within generative models of action execution and recognition: a computational perspective. Comment on "Grasping synergies: a motor-control approach to the mirror neuron mechanism" by A. D'Ausilio et al. Phys. Life Rev. 12, 114–117. (doi:10.1016/j.plrev.