Models of Randomness
Part I: a sketchy survey
Fritz Obermeyer
Department of Mathematics
Carnegie-Mellon University
2008:04:08
Outline
Programs with randomness
Abstract probability theories
Finite sets
Probability measures
Pointless probability
Probability on dcpos
Probability on lattices
Abstract probability algebras
Summary and prospects
Motivational outline
Want to prove equivalence between randomized
algorithms.
Need a programming language with a random monad,
and a domain-theoretic model of this language.
This talk surveys some attempts at models.
Later Part II tries to find a random monad
in a type-as-ambiguity framework (closures).
Bayesian networks
Consider entirely first-order programs, with no
looping/recursion, e.g.,
sample x from unif(0, 1) in
sample y from unif(0, x) in
sample z from unif(x, 0) in
if y + z < 1/2 then 0 else 1
(notice we can forget x after sampling y,z)
This is a Bayesian network (see picture).
All Bayesian networks are so expressible.
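As a rough illustration, the program above can be run as a forward sampler. This is only a sketch: the function names and the seed are invented here, and the third draw follows the slide literally as unif(x, 0), read as the interval between 0 and x (an assumption about what was intended).

```python
import random

def program(rng):
    """One run of the slide's first-order program; returns 0 or 1."""
    x = rng.uniform(0.0, 1.0)   # sample x from unif(0, 1)
    y = rng.uniform(0.0, x)     # sample y from unif(0, x)
    z = rng.uniform(x, 0.0)     # sample z from unif(x, 0), read as the interval [0, x] (assumption)
    # x is not used past this point: the result depends only on y and z.
    return 0 if y + z < 0.5 else 1

def estimate_prob_zero(n, seed=0):
    """Monte Carlo estimate of P(result = 0) from n independent runs."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if program(rng) == 0) / n
```

Each `sample ... in` becomes one draw whose parameters may mention earlier variables, which is exactly the parent relation of the network.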
Markov chains
Consider iterative loops, e.g.
x ← sample x0 from normal(0, 1) in x0.
while . . . :
sample n from normal(0, 1) in
let x0 = 1 + x/2 in
x ← x0 + n
This is a Markov chain (see picture).
All Markov chains are so expressible.
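A minimal simulation of this loop might look as follows. The loop condition is elided on the slide, so the sketch runs a fixed number of steps; `num_steps`, `run_chain`, and `step` are hypothetical names, not part of the original.

```python
import random

def step(x, rng):
    """One loop iteration: x <- (1 + x/2) + n with n ~ normal(0, 1)."""
    n = rng.gauss(0.0, 1.0)      # sample n from normal(0, 1)
    x0 = 1.0 + x / 2.0           # let x0 = 1 + x/2
    return x0 + n                # x <- x0 + n

def run_chain(num_steps, seed=0):
    """Return the trajectory [x_0, ..., x_num_steps].

    num_steps stands in for the elided loop condition (assumption).
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)      # initial state drawn from normal(0, 1)
    path = [x]
    for _ in range(num_steps):
        x = step(x, rng)
        path.append(x)
    return path
```

The next state depends only on the current x and fresh noise, which is the Markov property. Taking expectations of the update gives m = 1 + m/2, so the long-run mean of this particular chain is 2.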
Prototype for a probability theory
- set of states/points (maybe with structure)
- space of events/predicates
- morphisms between state-event spaces
- random monad = probability distributions
- extra structure: products, exponentials, ...
We start with finite sets,
generalize to probability measures, then
weaken the event language,
add structure among points, and
end with fully algebraic approaches.
Finite sets: states, morphisms, extra structure
Start with a finite set X of states.
No need for an event space.
Morphisms are just functions;
products and exponentials are standard.
NO recursive types, NO infinite types
Finite sets: Random functor?
Like the powerset, Rand is functorial,
On objects: $\mathrm{Rand}((X, F)) = (X', F')$ where
- $X'$ = probability measures over $(X, F)$
- $F'$ = generated by (right?)
$\{\, \{ f \in X' \mid f^{-1} B \subseteq A \} \mid A \in F,\ B \in G \,\}$
On arrows: for $h : (X, F) \to (Y, G)$,
$\mathrm{Rand}(h) = h' : (X', F') \to (Y', G')$ where, for $p : X'$ a probability
measure on $(X, F)$ and $B \in G$,
$\mathrm{Rand}(h)(p)(B) = p(h^{-1} B)$
But random functor doesn’t land in finite sets.
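For finitely supported distributions, the action on arrows is just a pushforward of mass. The sketch below assumes distributions are stored as dictionaries from states to exact probabilities; `rand_map`, `prob`, and the die/parity example are illustrative names, not from the original.

```python
from fractions import Fraction

def rand_map(h, p):
    """Rand(h)(p): push the finite distribution p (dict state -> probability) forward along h."""
    q = {}
    for x, px in p.items():
        y = h(x)
        q[y] = q.get(y, Fraction(0)) + px   # the mass of x lands on h(x)
    return q

def prob(p, event):
    """p(B) for an event B given as a set of states."""
    return sum((px for x, px in p.items() if x in event), Fraction(0))

def parity(k):
    """Example arrow h : {1..6} -> {"even", "odd"}."""
    return "even" if k % 2 == 0 else "odd"

# A fair die pushed forward along parity.
die = {k: Fraction(1, 6) for k in range(1, 7)}
pushed = rand_map(parity, die)
```

Evaluating `prob(pushed, B)` agrees with `prob(die, preimage of B)`, which is the defining equation for Rand(h) on events.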
Random monad (pieces)
The probability functor forms a monad with natural transformations
always : ∀a. a → Rand a
mix : ∀a. Rand(Rand a) → Rand a
equivalently a Kleisli triple with ’always’ and
sample : ∀a, b. Rand a → (a → Rand b) → Rand b
In finite sets, the semantics is
$[\text{always } x](y) = \delta_{x,y}$
$[\text{mix } p](x) = \int [p](q)\, q(x)\, dq$   (oops: $p$ infinite)
$[\text{sample } p\ f](y) = \sum_x [p](x)\, [f](x)(y)$
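Restricted to finitely supported distributions, the three operations can be written down directly. This is a sketch under that restriction (it sidesteps the "p infinite" problem by only allowing finitely many inner distributions in `mix`); the dictionary representation and the coin/die example are assumptions made for illustration.

```python
from fractions import Fraction

def always(x):
    """Point mass at x: probability 1 on x and 0 elsewhere."""
    return {x: Fraction(1)}

def sample(p, f):
    """Bind: weight each f(x) by p(x) and add up the results."""
    out = {}
    for x, px in p.items():
        for y, qy in f(x).items():
            out[y] = out.get(y, Fraction(0)) + px * qy
    return out

def mix(pp):
    """Flatten a finitely supported distribution over distributions.

    Inner distributions are tuples of (state, probability) pairs so they can serve as dict keys.
    """
    return sample(pp, lambda q: dict(q))

def roll(c):
    """Heads rolls a 2-sided die, tails a 3-sided one (hypothetical example)."""
    sides = 2 if c == "H" else 3
    return {k: Fraction(1, sides) for k in range(1, sides + 1)}

coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
result = sample(coin, roll)
```

Here `mix` is defined from `sample`, matching the equivalence between the monad presentation and the Kleisli-triple presentation.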
Random monad (properties)
Being a monad requires equations
sample x from p in always x = p,
sample x from always y in f x = f y,
sample y from (sample x from p in f x) in g y
= sample x from p in
sample y from f x in
g y
Being a computational monad (a la Moggi) requires also:
- ’always’ is mono
- the monad plays nicely with products and sums
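The three monad equations can be spot-checked on finite distributions with exact arithmetic. The sketch below is a check on one hypothetical choice of p, f, and g, not a proof; the dictionary representation is an assumption.

```python
from fractions import Fraction

def always(x):
    """Point mass at x."""
    return {x: Fraction(1)}

def sample(p, f):
    """sample x from p in f x, for finitely supported p and f(x)."""
    out = {}
    for x, px in p.items():
        for y, qy in f(x).items():
            out[y] = out.get(y, Fraction(0)) + px * qy
    return out

# Hypothetical test data: a distribution p on {0, 1, 2} and two kernels f, g.
p = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
f = lambda x: {x: Fraction(1, 4), x + 1: Fraction(3, 4)}
g = lambda y: {y % 2: Fraction(2, 3), "stop": Fraction(1, 3)}

# sample x from p in always x = p
right_unit = sample(p, always) == p
# sample x from always y in f x = f y   (with y = 1)
left_unit = sample(always(1), f) == f(1)
# nested sampling on the left equals sequential sampling on the right
assoc = sample(sample(p, f), g) == sample(p, lambda x: sample(f(x), g))
```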
Probability measures: states, events, morphisms
Start with a state set X (unstructured).
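The event space is a sigma-algebra $F$ of $X$,
a structure $\langle X,\ F \subseteq \mathcal{P}(X),\ \neg,\ \bigvee^{\le\omega} \rangle$,
where $\bot = \bigvee \emptyset$, $\top = \bigvee F$ are definable.
A morphism is a sigma-algebra hom, a measurable
function $f : X \to Y$, whose preimage induces a hom
$f' : \langle F, \neg, \biguplus \rangle \leftarrow \langle G, \neg, \biguplus \rangle$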
Measures: Random states
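A random state is a probability measure, a hom
$\langle F, \emptyset, X, \biguplus^{\omega} \rangle \longrightarrow \langle [0, 1], 0, 1, \sum^{\omega} \rangle$
i.e., functions $p$ satisfying
$p(\bot) = 0$
$p(\top) = 1$
$p(\biguplus_i A_i) = \sum_i p(A_i)$
and hence $p(\neg A) = 1 - p(A)$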
Question: is this equivalent to additivity + continuity?
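The "and hence" step for complements can be spelled out. It uses only the axioms above: apply additivity to the disjoint pair $A$, $\neg A$ (padding the countable family with $\bot$, which has measure 0), whose union is $\top$.

```latex
\begin{align*}
1 &= p(\top) = p(A \uplus \neg A) = p(A) + p(\neg A), \\
p(\neg A) &= 1 - p(A).
\end{align*}
```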
Measures: extra structure
Product sigma-algebras are generated by rectangles
Exponentials have pointwise sigma-algebra (right?)
NO untyped/unityped model of lambda-calculus
Probability valuations
Relax event logic from sigma-algebra to topology;
abstract away points to frames/locales/CHAs.
Definition
A probability valuation on F is a monotone p : F → [0, 1]
satisfying
$p(\bot) = 0$, $p(\top) = 1$
$p(A) + p(B) = p(A \sqcap B) + p(A \sqcup B)$
we also assume continuity (some authors don’t).
(analogous to countable additivity?)
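On a finite frame the definition can be checked by brute force. The sketch below uses a hypothetical example, the open (upper) sets of the three-point order bot below tr and fa, with a valuation induced by arbitrarily chosen point masses. Continuity is automatic here because every directed family in a finite frame has a largest element.

```python
from fractions import Fraction

# Open sets of a small space (hypothetical example): the upper sets of bot <= tr, fa.
OPENS = [frozenset(s) for s in ([], ["tr"], ["fa"], ["tr", "fa"], ["bot", "tr", "fa"])]

# Point masses chosen arbitrarily for illustration.
MASS = {"bot": Fraction(1, 2), "tr": Fraction(1, 3), "fa": Fraction(1, 6)}

def p(u):
    """Valuation of an open set: the total mass it contains."""
    return sum((MASS[x] for x in u), Fraction(0))

def is_probability_valuation(opens, val):
    """Check normalization, monotonicity, and the modular law on every pair of opens."""
    bottom = frozenset()
    top = frozenset().union(*opens)
    if val(bottom) != 0 or val(top) != 1:
        return False
    for a in opens:
        for b in opens:
            if a <= b and val(a) > val(b):                       # monotone
                return False
            if val(a) + val(b) != val(a & b) + val(a | b):       # modular law
                return False
    return True
```

Meet and join in this frame are set intersection and union, so the modular law is the familiar inclusion-exclusion identity.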
...extension theorems e.g.
Theorem
(Jones) Every continuous valuation on a continuous dcpo
Directed-complete partial orders.
Start with a dcpo of states.
Events are Scott-open sets (a frame).
Morphisms are Scott-continuous functions
(preserving directed joins, inducing frame-homs).
Directed joins of random things are random things
so dcpo is closed under random monad.
dcpo is also closed under products, function spaces,
coinductive types, ...
Continuous dcpos
...but dcpos don’t have enough structure to be “domains”.
Continuous domains have more.
CONT is closed under probability monad.
But NOT closed under function spaces.
Lattices I: states, randomness
Start with a lattice X (e.g. real line).
Let event space be the upper sets.
To each valuation $p$, define a cpdf
$p'(x) = p(\mathrm{upper}\ x)$
Dually, each cpdf $d$ extends to a unique valuation
$d'(\mathrm{upper}\ x) = d(x)$
Try again: (no event space)
Morphisms are lattice homs.
Random states are cpdfs, but...
NO random monad
A space of random lattice elements need not itself be a
lattice.
Hence Rand◦Rand need not exist.
Example
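the square lattice $\bot \sqsubseteq \mathrm{tr}, \mathrm{fa} \sqsubseteq \top$,
$\bot + \mathrm{tr} \mid \bot + \mathrm{fa} \sqsubseteq \bot + \top,\ \mathrm{tr} + \mathrm{fa}$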
no unique minimal upper bound
The max of two cdfs may lead to negative densities.
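One concrete instance of the negative-density problem can be computed on the square lattice. The sketch assumes a cpdf assigns to each element the total mass of its upper set, and recovers point masses by working down from the top; the function names and the choice of two point masses are illustrative.

```python
from fractions import Fraction

# The square lattice bot <= tr, fa <= top, given by each element's upper set.
UPPER = {
    "bot": {"bot", "tr", "fa", "top"},
    "tr": {"tr", "top"},
    "fa": {"fa", "top"},
    "top": {"top"},
}

def cpdf(mass):
    """d(x) = total mass of the upper set of x."""
    return {x: sum((mass.get(y, Fraction(0)) for y in up), Fraction(0))
            for x, up in UPPER.items()}

def density(d):
    """Invert cpdf: recover point masses from d."""
    m = {}
    # Each element is visited after everything strictly above it.
    for x in ("top", "tr", "fa", "bot"):
        m[x] = d[x] - sum((m[y] for y in UPPER[x] if y != x), Fraction(0))
    return m

d_tr = cpdf({"tr": Fraction(1)})                      # point mass at tr
d_fa = cpdf({"fa": Fraction(1)})                      # point mass at fa
d_max = {x: max(d_tr[x], d_fa[x]) for x in UPPER}     # pointwise max of the two cpdfs
m_max = density(d_max)                                # assigns mass -1 to bot
```

The pointwise max is 1 at bot, tr, and fa and 0 at top, so inverting it puts mass 1 on each of tr and fa and forces mass 1 - 2 = -1 onto bot.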
Abstract probability algebras
Start with a dcpo with ⊥.
Generate initial “R-algebra” with binary mixing x + y
subject to monotonicity and
x + x = x    idempotence
x + y = y + x    commutativity
(ω + x) + (y + z) = (ω + z) + (y + x)    associativity
- equivalent to arbitrary real mixing
- equivalent to valuations
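The three equations can be checked in one familiar model, where x + y is the equal-weight average of two finite distributions. This sketch ignores the order structure and the ⊥ element; the dictionary representation and the names `mix` and `point` are assumptions, and ω is written `w` in the code.

```python
from fractions import Fraction

def mix(x, y):
    """Binary mixing x + y: the equal-weight average of two finite distributions."""
    keys = set(x) | set(y)
    return {k: (x.get(k, Fraction(0)) + y.get(k, Fraction(0))) / 2 for k in keys}

def point(a):
    """Point mass at a."""
    return {a: Fraction(1)}

w, x, y, z = point("w"), point("x"), point("y"), point("z")

idempotent = mix(x, x) == x
commutative = mix(x, y) == mix(y, x)
# The third equation: (w + x) + (y + z) = (w + z) + (y + x).
third_law = mix(mix(w, x), mix(y, z)) == mix(mix(w, z), mix(y, x))

# Nesting fair mixes reaches other dyadic weights: x + (x + y) = 3/4 x + 1/4 y.
three_quarters = mix(x, mix(x, y))
```

Finite nesting only produces dyadic weights; other real weights would have to arise as limits, which this finite sketch does not model.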
Compare with initial join-semilattice (“J-algebra”)
x|x=x
x|y=y|x
x | (y | z) = (x | y) | z
Probability and lattices
Problem: the join/meet of two random things
may not be a random thing.
R-algebra models randomness.
J-algebra models parallelism.
JR-algebra with distributivity.
(x + y) | z = (x | z) + (y | z)
models parallelism with randomness,
allows random normal form,
sampling semantics
JR-algebra models... nothing nice,
NO random normal form
Next time: can JR-algebras be made to work?
Summary and prospects
(again)
We started with finite sets,
generalized to probability measures, then
weakened the event language,
added structure among points, and
ended with fully algebraic approaches.