Module 2

Knowledge and Reasoning


Knowledge and Reasoning

• Logical agents
• Propositional logic
• First-order logic (predicate logic)
• Inference techniques in first-order logic
• Acting under uncertainty
• Basic probability theory
• Bayes Rule
• Naïve Bayes Model
Knowledge bases

• Knowledge base = set of sentences in a formal language

• Declarative approach to building an agent (or other system):
– Tell it what it needs to know
– Then it can Ask itself what to do; answers should follow from the KB

• Agents can be viewed at the knowledge level
– i.e., what they know, regardless of how implemented
• Or at the implementation level
– i.e., data structures in KB and algorithms that manipulate them
INFERENCE METHODS

• Unification (prerequisite)

• Forward Chaining

• Backward Chaining
– Logic Programming (Prolog)

• Resolution
– Transform to CNF (conjunctive normal form)
– Generalization of Prop. Logic resolution
Knowledge-based agent
•A knowledge-based agent includes a knowledge base and an inference
system.

•A knowledge base is a set of representations of facts of the world.

•Each individual representation is called a sentence.

•The sentences are expressed in a knowledge representation language.

•The agent operates as follows:

1. It TELLs the knowledge base what it perceives.

2. It ASKs the knowledge base what action it should perform.

3. It performs the chosen action.


A simple knowledge-based agent

• The agent must be able to:
– Represent states, actions, etc.
– Incorporate new percepts
– Update internal representations of the world
– Draw logical conclusions about hidden properties of the world
– Draw logical conclusions about appropriate actions
Example 1

• Given:
– “The red block is above the blue block”
– “The green block is above the red block”
• Infer:
– “The green block is above the blue block”
– “The blocks form a tower”
Example 2
• Given:
• If it is sunny today, then the sun shines on the
screen. If the sun shines on the screen, the
blinds are brought down. The blinds are not
down.
• Find out:
• Is it sunny today?
A KR language needs to be
• expressive
• unambiguous
• flexible
The inference procedures need to be
• Correct (sound)
• Complete
• Efficient
Candidates (for now)
• English (natural language)
• Java (programming language)
• Logic (special KR language)
Knowledge Representations
• Propositional logic
• Predicate Logic (First-order logic)
Propositional logic
• The symbols of propositional calculus are the
propositional symbols:
P, Q, R, S, …
• the truth symbols:
true, false
• and the connectives:
¬, ∧, ∨, →, ≡
Propositional Calculus Sentences
• Every propositional symbol and truth symbol
is a sentence.
Examples: true, P, Q, R.
• The negation of a sentence is a sentence.
Examples: ¬P, ¬false.
• The conjunction, or and, of two sentences is a
sentence.
Example: P ∧ ¬P
Propositional Calculus Sentences (cont’d)

• The disjunction, or or, of two sentences is a sentence.
Example: P ∨ ¬P
• The implication of one sentence from another is a
sentence.
Example: P → Q
• The equivalence of two sentences is a sentence.
Example: P ∨ Q ≡ R
• Legal sentences are also called well-formed formulas
or WFFs.
Propositional calculus semantics
• An interpretation of a set of propositions is the assignment of a truth value,
either T or F to each propositional symbol.

• The symbol true is always assigned T, and the symbol false is assigned F.

• The truth assignment of negation, ¬P, where P is any propositional symbol,
is F if the assignment to P is T, and is T if the assignment to P is F.

• The truth assignment of conjunction, ∧, is T
only when both conjuncts have truth value T; otherwise it is F.
Propositional calculus semantics (cont’d)
• The truth assignment of disjunction, ∨, is F
only when both disjuncts have truth value F; otherwise it is T.

• The truth assignment of implication, →, is F
only when the premise (the symbol before the implication) is T and the truth
value of the consequent (the symbol after the implication) is F; otherwise it is T.

• The truth assignment of equivalence, ≡, is T
only when both expressions have the same truth assignment for all
possible interpretations; otherwise it is F.
For propositional expressions P, Q, R

Fig. 2.1: Truth table for the operator →

P  Q  |  P → Q
T  T  |    T
T  F  |    F
F  T  |    T
F  F  |    T

Fig. 2.2: Truth table demonstrating the
equivalence of P → Q and ¬P ∨ Q
Proofs in propositional calculus
• If it is sunny today, then the sun shines on the screen. If the sun shines on
the screen, the blinds are brought down. The blinds are not down.

• Prove: Is it sunny today? (T or F)

• P: It is sunny today.
• Q: The sun shines on the screen.
• R: The blinds are down.
• Premises: P → Q, Q → R, ¬R
• Question: ¬P
Prove using a truth table

Variables     Premises            Trial conclusion
P  Q  R  |  P→Q   Q→R   ¬R  |  ¬P
T  T  T  |   T     T     F  |   F
T  T  F  |   T     F     T  |   F
T  F  T  |   F     T     F  |   F
T  F  F  |   F     T     T  |   F
F  T  T  |   T     T     F  |   T
F  T  F  |   T     F     T  |   T
F  F  T  |   T     T     F  |   T
F  F  F  |   T     T     T  |   T

Only the last row makes all three premises true, and there ¬P is also true,
so the premises entail ¬P: it is not sunny today.
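The table can be checked mechanically. A small Python sketch that enumerates all eight interpretations, exactly as the truth table does, and tests whether the premises entail ¬P:

from itertools import product

def implies(a, b):
    return (not a) or b

entailed = True
for P, Q, R in product([True, False], repeat=3):
    premises_hold = implies(P, Q) and implies(Q, R) and (not R)
    if premises_hold and P:        # a model of the premises where ¬P fails?
        entailed = False
print(entailed)   # True: the premises entail ¬P, so it is not sunny today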
Propositional calculus is cumbersome
• Given:
– If it is sunny today, then the sun shines on the screen.
– If the sun shines on the screen, the blinds are brought down.
– The blinds are not down.

• Prove : Is it sunny today?


• ---

• Conclusion: to generalize beyond today, we would need separate propositions:
– If it is sunny on a particular day, then the sun shines on the screen.
– If the sun shines on the screen on a particular day, the blinds are brought
down.
– The blinds are not down today.

• Is it sunny today? Every new day needs its own propositional symbols, because
propositional logic cannot quantify over days.
Problems with Propositional Logic
Propositional logic is a weak language
• Hard to identify “individuals” (e.g., Mary, 3yrs)

• Can’t directly talk about properties of individuals or relations between
individuals (e.g., “Bill is tall”)

• Generalizations, patterns, regularities can’t easily be represented (e.g., “all
triangles have 3 sides”)

• First-Order Logic (abbreviated FOL or FOPC) is expressive enough to
concisely represent this kind of information

• FOL adds relations, variables, and quantifiers, e.g.,
• “Every elephant is gray”: ∀x (elephant(x) → gray(x))
• “There is a white alligator”: ∃x (alligator(x) ∧ white(x))
Example
• Consider the problem of representing the following information:
– Every person is mortal.
– Confucius is a person.
– Confucius is mortal.
• How can these sentences be represented so that we can infer the third
sentence from the first two?

First-order logic (Predicate logic)
• First-order logic (FOL) models the world in terms of
– Objects, which are things with individual identities
– Properties of objects that distinguish them from other objects
– Relations that hold among sets of objects
– Functions, which are a subset of relations where there is only one “value”
for any given “input”
• Examples:
– Objects: Students, lectures, companies, cars ...
– Relations: Brother-of, bigger-than, outside, part-of, has-color, occurs-after,
owns, visits, precedes, ...
– Properties: blue, oval, even, large, ...
– Functions: father-of, best-friend, second-half, one-more-than ...

Syntax of FOL: Basic elements
• Constants John, 2, DIT,...
• Predicates Brother, >,...
• Functions Sqrt, LeftLegOf,...
• Variables x, y, a, b,...
• Connectives ¬, ∧, ∨, →, ≡
• Equality =
• Quantifiers ∀, ∃
Quantifiers

• Universal quantification
– (∀x)P(x) means that P holds for all values of x in the domain
associated with that variable
– E.g., (∀x) dolphin(x) → mammal(x)

• Existential quantification
– (∃x)P(x) means that P holds for some value of x in the
domain associated with that variable
– E.g., (∃x) mammal(x) ∧ lays-eggs(x)
– Permits one to make a statement about some object
without naming it
Quantifiers
• Universal quantifiers are often used with “implies” to form “rules”:
(∀x) student(x) → smart(x)   “All students are smart”

• Universal quantification is rarely used to make blanket statements about every
individual in the world:
(∀x) student(x) ∧ smart(x) means “Everyone in the world is a student and is smart”

• Existential quantifiers are usually used with “and” to specify a list of properties
about an individual:
(∃x) student(x) ∧ smart(x)   “There is a student who is smart”

• A common mistake is to represent this English sentence as the FOL sentence:
(∃x) student(x) → smart(x)
Quantifier Scope

• Switching the order of universal quantifiers does not change the meaning:
– (∀x)(∀y)P(x,y) = (∀y)(∀x)P(x,y)

• Similarly, you can switch the order of existential quantifiers:
– (∃x)(∃y)P(x,y) = (∃y)(∃x)P(x,y)

• Switching the order of universals and existentials does change the meaning:
– Everyone likes someone: (∀x)(∃y) likes(x,y)
– Someone is liked by everyone: (∃y)(∀x) likes(x,y)
Translating English to FOL

Every gardener likes the sun.
∀x gardener(x) → likes(x, Sun)
You can fool some of the people all of the time.
∃x ∀t person(x) ∧ time(t) → can-fool(x,t)
You can fool all of the people some of the time.
∀x (person(x) → ∃t (time(t) ∧ can-fool(x,t)))    Equivalent
∀x ∃t (person(x) → time(t) ∧ can-fool(x,t))
All purple mushrooms are poisonous.
∀x (mushroom(x) ∧ purple(x)) → poisonous(x)
No purple mushroom is poisonous.
¬∃x purple(x) ∧ mushroom(x) ∧ poisonous(x)    Equivalent
∀x (mushroom(x) ∧ purple(x)) → ¬poisonous(x)
Clinton is not tall.
¬tall(Clinton)
X is above Y iff X is directly on top of Y or there is a pile of one or more other
objects directly on top of one another starting with X and ending with Y.
∀x ∀y above(x,y) ↔ (on(x,y) ∨ ∃z (on(x,z) ∧ above(z,y)))
Inferencing Techniques

• Forward Chaining
• Backward Chaining
• Resolution
• Unification
Forward Chaining

• Forward Chaining
– Start with atomic sentences in the KB and apply
Modus Ponens in the forward direction, adding
new atomic sentences, until no further inferences
can be made.
– P → Q and P is asserted to be true, so
therefore Q must be true
Forward Chaining
• Given a new fact, generate all consequences
• Assumes all rules are of the form
– C1 and C2 and C3 and…. --> Result
• Each rule & binding generates a new fact
• This new fact will “trigger” other rules
• Keep going until the desired fact is generated
• (Semi-decidable, as is FOL in general)
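A minimal propositional sketch of this loop in Python, with rules represented as (body, head) pairs; it is an illustration, not the textbook algorithm:

def forward_chain(facts, rules, goal):
    """Fire any rule whose whole body is known, add its head, repeat."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in facts and all(b in facts for b in body):
                facts.add(head)          # Modus Ponens in the forward direction
                changed = True
                if head == goal:
                    return True
    return goal in facts

# Example (the weather rules used later in the resolution section):
rules = [({'cold', 'precipitation'}, 'snow'),
         ({'january'}, 'cold'),
         ({'clouds'}, 'precipitation')]
print(forward_chain({'january', 'clouds'}, rules, 'snow'))   # True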
FC: Example Knowledge Base
• The law says that it is a crime for an American to sell weapons
to hostile nations. The country Nono, an enemy of America, has
some missiles, and all of its missiles were sold to it by Col.
West, who is an American.

• Prove that Col. West is a criminal.
FC: Example Knowledge Base
• …it is a crime for an American to sell weapons to hostile
nations
American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) → Criminal(x)

• Nono…has some missiles
∃x Owns(Nono, x) ∧ Missile(x)
Owns(Nono, M1) and Missile(M1)

• …all of its missiles were sold to it by Col. West
∀x Missile(x) ∧ Owns(Nono, x) → Sells(West, x, Nono)

• Missiles are weapons
Missile(x) → Weapon(x)
FC: Example Knowledge Base
• An enemy of America counts as “hostile”
Enemy(x, America) → Hostile(x)

• Col. West, who is an American…
American(West)

• The country Nono, an enemy of America…
Enemy(Nono, America)
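With the variables pre-grounded by hand (x = West, y = M1, z = Nono; in a full first-order forward chainer, unification would produce these bindings automatically), the propositional chainer sketched earlier derives the conclusion:

facts = {'American(West)', 'Owns(Nono,M1)', 'Missile(M1)', 'Enemy(Nono,America)'}
rules = [
    ({'Missile(M1)', 'Owns(Nono,M1)'}, 'Sells(West,M1,Nono)'),
    ({'Missile(M1)'}, 'Weapon(M1)'),
    ({'Enemy(Nono,America)'}, 'Hostile(Nono)'),
    ({'American(West)', 'Weapon(M1)', 'Sells(West,M1,Nono)', 'Hostile(Nono)'},
     'Criminal(West)'),
]
print(forward_chain(facts, rules, 'Criminal(West)'))   # True: West is a criminal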
FC: Example Knowledge Base

[Figures: the forward-chaining proof tree for this knowledge base, built up layer by layer from the known facts to Criminal(West)]
Efficient Forward Chaining
• Order conjuncts appropriately
– E.g. most constrained variable
• Don’t generate redundant facts; each new fact
should depend on at least one newly
generated fact.
– Production systems
– RETE matching
– CLIPS

Forward Chaining Algorithm

[Figure: pseudocode of the forward-chaining algorithm]
Backward Chaining
• Consider the item to be proven a goal
• Find a rule whose head is the goal (and bindings)
• Apply bindings to the body, and prove these
(subgoals) in turn
• If you prove all the subgoals, increasing the binding
set as you go, you will prove the item.
• Logic Programming (Prolog)
Backward Chaining Example

[Figures: the backward-chaining proof tree for the crime example, expanded goal by goal from Criminal(West) down to the known facts]
Backward Chaining Algorithm

[Figure: pseudocode of the backward-chaining algorithm]
Properties of Backward Chaining
• Depth-first recursive proof search: space is linear in
size of proof
• Incomplete due to infinite loops
– Fix by checking current goal with every subgoal on the
stack
• Inefficient due to repeated subgoals (both success
and failure)
– Fix using caching of previous results (extra space)
• Widely used without improvements for logic
programming

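A minimal propositional backward chainer matching the description above; the stack argument implements the loop check just mentioned (an illustrative sketch only):

def backward_chain(goal, facts, rules, stack=()):
    if goal in facts:
        return True
    if goal in stack:                    # already trying this goal: a loop
        return False
    for body, head in rules:
        if head == goal and all(
                backward_chain(g, facts, rules, stack + (goal,)) for g in body):
            return True
    return False

rules = [({'cold', 'precipitation'}, 'snow'),
         ({'january'}, 'cold'),
         ({'clouds'}, 'precipitation')]
print(backward_chain('snow', {'january', 'clouds'}, rules))   # True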
Inference Methods

• Unification (prerequisite)

• Forward Chaining

• Backward Chaining
– Logic Programming (Prolog)

• Resolution
– Transform to CNF (conjunctive normal form)
– Generalization of Prop. Logic resolution

Resolution for Propositional Logic
• Convert everything to CNF
(conjunctive normal form / clausal form)
• If resolution is successful, the proof succeeds

• If there is a variable in the item to prove, return
the variable’s value from the unification bindings
Resolution
• Resolution allows a complete inference mechanism (search-
based) using only one rule of inference
• Resolution rule:
Complementary literals P1 and ¬P1 “cancel out”

• To prove a proposition F by resolution,
– Start with ¬F
– Resolve with a rule from the knowledge base (that contains F)
– Repeat until all propositions have been eliminated
– If this can be done, a contradiction has been derived and the original
proposition F must be true.
Rules

• Eliminate implications and equivalences
– this will eliminate all occurrences of → and ≡

Resolution in Propositional Logic
Propositional Resolution Example

• Rules
– cold ∧ precipitation → snow
¬cold ∨ ¬precipitation ∨ snow
– january → cold
¬january ∨ cold
– clouds → precipitation
¬clouds ∨ precipitation

• Facts
– january, clouds

• Prove
– snow
Propositional Resolution Example

¬snow                        ¬cold ∨ ¬precipitation ∨ snow

¬cold ∨ ¬precipitation       ¬january ∨ cold

¬january ∨ ¬precipitation    ¬clouds ∨ precipitation

¬january ∨ ¬clouds           january

¬clouds                      clouds

□ (the empty clause: a contradiction, so snow is proved)
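The refutation above can be replayed mechanically. A minimal propositional resolution sketch in Python, with clauses as sets of literals and '~' for negation (illustrative, not optimized):

def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolve(c1, c2):
    """All resolvents of two clauses on complementary literals."""
    return [(c1 - {lit}) | (c2 - {negate(lit)})
            for lit in c1 if negate(lit) in c2]

clauses = {frozenset(c) for c in [
    {'~cold', '~precipitation', 'snow'},   # cold & precipitation -> snow
    {'~january', 'cold'},                  # january -> cold
    {'~clouds', 'precipitation'},          # clouds -> precipitation
    {'january'}, {'clouds'},               # facts
    {'~snow'},                             # negated goal
]}
found = False
while not found:
    new = set()
    for c1 in clauses:
        for c2 in clauses:
            for r in resolve(c1, c2):
                if not r:                  # empty clause: contradiction derived
                    found = True
                new.add(frozenset(r))
    if new <= clauses:                     # no progress: goal not provable
        break
    clauses |= new
print(found)   # True: 'snow' follows from the knowledge base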
Resolution Theorem Proving (FOL / Predicate Logic)

• Convert everything to CNF

• Resolve, with unification (negate the goal and add it first)
– Save bindings as you go!
• If resolution is successful, the proof succeeds
• If there is a variable in the item to prove,
return the variable’s value from the unification
bindings
Converting to CNF (Rules)

1. Replace implication (A → B) by ¬A ∨ B
2. Move ¬ “inwards”
• ¬∀x P(x) is equivalent to ∃x ¬P(x) & vice versa
3. Standardize variables
• ∀x P(x) ∨ ∃x Q(x) becomes ∀x P(x) ∨ ∃y Q(y)
4. Skolemize
• ∃x P(x) becomes P(A) for a new constant A
5. Drop universal quantifiers
• Since all quantifiers are now ∀, we don’t need them
6. Distribute ∨ over ∧
• A ∨ (B ∧ C) becomes (A ∨ B) ∧ (A ∨ C)
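For the propositional part (steps 1 and 6), sympy can cross-check the conversions; this is only a sanity check, since steps 2-5 concern quantifiers and must be done by hand first:

from sympy.abc import A, B, C
from sympy import Implies
from sympy.logic.boolalg import to_cnf

print(to_cnf(Implies(A, B)))    # equivalent to ¬A ∨ B   (step 1)
print(to_cnf(A | (B & C)))      # (A ∨ B) ∧ (A ∨ C)      (step 6)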
RESOLUTION IN PREDICATE LOGIC

• Two literals are contradictory if one can be unified with the
negation of the other.

• For example, man(x) and ¬man(Himalayas) are contradictory,
since man(x) and man(Himalayas) can be unified.

• In predicate logic, the unification algorithm is used to locate
pairs of literals that cancel out.

• It is important that if two instances of the same variable
occur, then they must be given identical substitutions.


The resolution algorithm for predicate logic is as follows

• Let F be a set of given statements and S a statement to be proved.

1. Convert all the statements of F to clause form.

2. Negate S and convert the result to clause form. Add it to the set of clauses obtained in 1.

3. Repeat until either a contradiction is found, no progress can be made, or a predetermined
amount of effort has been expended:

a) Select two clauses. Call them parent clauses.

b) Resolve them together. The resolvent is the disjunction of all the literals of both parent
clauses, with the following exception: if there is a pair of literals T1 and ¬T2 such that one
parent clause contains T1 and the other contains ¬T2, and if T1 and T2 are unifiable, then
neither T1 nor ¬T2 appears in the resolvent. T1 and ¬T2 are called complementary literals.

c) If the resolvent is the empty clause, then a contradiction has been found. If it is not, then
add it to the set of clauses available to the procedure.


Example: Using resolution to produce a proof is illustrated with the
following statements.

1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
Questions (Goals)

• Was Marcus a Roman?


• Was Marcus loyal to Caesar?
• Who was Marcus loyal to?
• Was Marcus a ruler?

• Step-1:Examples for Conversion from Natural Language
Sentences to Predicate Logic
• Step 2: Convert to Clausal Form
• Step 3: Negate query and start
• Step 4:Resolve with a rule from the knowledge base (that
contains F)
• Step 5: Repeat until either a contradiction is found or no
progress can be made
Step-1:Examples for Conversion from Natural
Language Sentences to Predicate Logic
• 1. Marcus was a man
Man(Marcus)
• 2. Marcus was a Pompeian
Pompeian(Marcus)
• 3. All Pompeians were Romans
∀x [Pompeian(x) → Roman(x)]
• 4. Caesar was a ruler
Ruler(Caesar)
5. All Romans were either loyal to Caesar or hated him
∀x [Roman(x) → (LoyalTo(x, Caesar) ∨ Hate(x, Caesar))]
6. Everyone is loyal to someone
∀x ∃y LoyalTo(x,y)
7. People only try to assassinate rulers they aren't loyal to
∀x ∀y [(Person(x) ∧ Ruler(y) ∧ TryAssassinate(x,y)) → ¬LoyalTo(x,y)]
8. Marcus tried to assassinate Caesar
TryAssassinate(Marcus, Caesar)

• Can you prove: ¬LoyalTo(Marcus, Caesar)?


Step 2: Convert to Clausal Form

[Figures: the Marcus axioms in clause form, and the negated query]

Another Resolution Example

[Figure omitted]
What is Unification?

• Unification is a process of making two


different logical atomic expressions identical
by finding a substitution. Unification depends
on the substitution process.
• It takes two literals as input and makes them
identical using substitution.
Unifier
• Let S1 and S2 be two atomic sentences and P be a unifier
such that S1·P = S2·P; then P can be expressed
as UNIFY(S1, S2).

• Example: Find the MGU for Unify{King(x), King(John)}

• Let S1 = King(x), S2 = King(John).

• Substitution θ1 = {John/x} is a unifier for these atoms;
applying this substitution makes both expressions
identical.
• MGU (Most General Unifier)
UNIFY algorithm
• The UNIFY algorithm is used for unification, which takes two atomic
sentences and returns a unifier for those sentences (If any exist).

• Unification is a key component of all first-order inference


algorithms.

• It returns fail if the expressions do not match with each other.

• The substitution variables are called Most General Unifier or MGU.

• E.g. Let's say there are two different expressions,


• P(x, y), and P(a, f(z)).
Example 2
• E.g. Let's say there are two different expressions,
P(x, y), and P(a, f(z)).
• In this example, we need to make both above statements identical to each
other. For this, we will perform the substitution.
P(x, y)......... (i)
P(a, f(z))......... (ii)
• Substitute x with a, and y with f(z) in the first expression, and it will be
represented as:
(a/x) and (f(z)/y)
• With both the substitutions, the first expression will be identical to the
second expression and the
• substitution set will be: [a/x, f(z)/y].
• P(a, f(z))......... (i)
P(a, f(z))......... (ii)
Conditions for Unification:

• Following are some basic conditions for unification:

1. Predicate symbol must be same, atoms or expression


with different predicate symbol can never be unified.
(P(x, y) P is Predicate symbol )
2. Number of Arguments in both expressions must be
identical. (P(x, y) , P(x, y, z) can not be unified)
3. Unification will fail if a variable must be bound to a term
that contains that same variable (the occurs check).
P(x)......... (i)
P(f(x))......... (ii)
(Sentences (i) and (ii) cannot unify: x = f(x) would create an
infinite term)
Algorithm
Step. 1: If S1 or S2 is a variable or constant, then:
• a) If S1 and S2 are identical, then return NIL.
• b) Else if S1 is a variable,
• a. then if S1 occurs in S2, then return FAILURE
• b. Else return { (S2/ S1)}.
• c) Else if S2 is a variable,
• a. If S2 occurs in S1 then return FAILURE,
• b. Else return {(S1/ S2)}.
• d) Else return FAILURE.

• Step.2: If the initial Predicate symbol in S1 and S2 are not same, then return FAILURE.
• Step. 3: IF S1 and S2 have a different number of arguments, then return FAILURE.
• Step. 4: Set Substitution set(SUBST) to NIL.
• Step. 5: For i=1 to the number of elements in S1.
• a) Call the Unify function with the ith element of S1 and the ith element of S2, and put the
result into Sub.
• b) If Sub = FAILURE then return FAILURE.
• c) If Sub ≠ NIL then:
• a. Apply Sub to the remainder of both S1 and S2.
• b. SUBST = APPEND(Sub, SUBST).
• Step.6: Return SUBST.
Implementation of the Algorithm

Step.1: Initialize the substitution set to be empty.


Step.2: Recursively unify atomic sentences:
• Check for Identical expression match.
• If one expression is a variable vi, and the other is a term
ti which does not contain variable vi, then:
– Substitute ti / vi in the existing substitutions
– Add ti /vi to the substitution setlist.
– If both the expressions are functions, then function name must be
similar, and the number of arguments must be the same in both the
expression.
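The steps above translate almost directly into code. A compact Python sketch (one possible rendering, not the only one): variables are upper-case strings, compound terms are tuples such as ('p', ('f', 'a'), ('g', 'Y')), the substitution is a dict, and FAILURE is returned as None.

def is_var(t):
    return isinstance(t, str) and t[0].isupper()

def substitute(t, subst):
    """Apply (and chase) the substitution inside term t."""
    if is_var(t):
        return substitute(subst[t], subst) if t in subst else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, subst) for a in t[1:])
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear inside term t?"""
    t = substitute(t, subst)
    if v == t:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(s1, s2, subst=None):
    subst = {} if subst is None else subst
    s1, s2 = substitute(s1, subst), substitute(s2, subst)
    if s1 == s2:
        return subst                               # identical: nothing to do
    if is_var(s1):
        return None if occurs(s1, s2, subst) else {**subst, s1: s2}
    if is_var(s2):
        return None if occurs(s2, s1, subst) else {**subst, s2: s1}
    if (isinstance(s1, tuple) and isinstance(s2, tuple)
            and s1[0] == s2[0] and len(s1) == len(s2)):
        for a, b in zip(s1[1:], s2[1:]):           # unify the argument lists
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None    # different predicate/function symbols or arities

# The worked examples below, checked with this sketch:
print(unify(('p', ('f', 'a'), ('g', 'Y')), ('p', 'X', 'X')))   # None (fails)
print(unify(('p', 'b', 'X', ('f', ('g', 'Z'))),
            ('p', 'Z', ('f', 'Y'), ('f', 'Y'))))
# {'Z': 'b', 'X': ('f', 'Y'), 'Y': ('g', 'b')}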
Example: 1

• For each pair of the following atomic sentences, find the
most general unifier (if it exists).

1. Find the MGU of {p(f(a), g(Y)) and p(X, X)}

• Sol: S0 => Here, L1 = p(f(a), g(Y)),
• L2 = p(X, X)

SUBST θ = {f(a)/X}
S1 => L1 = p(f(a), g(Y)), and L2 = p(f(a), f(a))

Now g(Y) must unify with f(a), but the function symbols g and f
differ, so unification fails.

• Unification is not possible for these expressions.


• Example 2. Find the MGU of {p(b, X, f(g(Z))) and p(Z, f(Y), f(Y))}

• Here, L1 = p(b, X, f(g(Z))) and L2 = p(Z, f(Y), f(Y))

S0 => { p(b, X, f(g(Z))); p(Z, f(Y), f(Y)) }

1. SUBST θ = {b/Z}
S1 => { p(b, X, f(g(b))); p(b, f(Y), f(Y)) }

2. SUBST θ = {f(Y)/X}
S2 => { p(b, f(Y), f(g(b))); p(b, f(Y), f(Y)) }

3. SUBST θ = {g(b)/Y}
S3 => { p(b, f(g(b)), f(g(b))); p(b, f(g(b)), f(g(b))) }

• Unified successfully.
Unifier = { b/Z, f(Y)/X, g(b)/Y }.
• 3. Find the MGU of {p(X, X) and p(Z, f(Z))}
• Here, Ψ1 = p(X, X) and Ψ2 = p(Z, f(Z))
S0 => {p(X, X), p(Z, f(Z))}
SUBST θ = {Z/X}
S1 => {p(Z, Z), p(Z, f(Z))}
SUBST θ = {f(Z)/Z} fails the occurs check (Z occurs
inside f(Z)), so unification failed.
• Hence, unification is not possible for these
expressions.
• 4. Find the MGU of UNIFY(prime(11),
prime(y))
• Here, Ψ1 = prime(11) and Ψ2 = prime(y)
S0 => {prime(11), prime(y)}
SUBST θ = {11/y}
• S1 => {prime(11), prime(11)}: successfully
unified.
Unifier: {11/y}.
Acting under uncertainty
A logical agent uses propositions that are true, false or unknown.
When the logical agent knows enough facts about its
environment, it derives plans that are guaranteed to work.
Unfortunately, agents almost never have access to the whole
truth about their environment, and therefore agents must act
under uncertainty.
Uncertainty
Let action At = leave for the airport t minutes before the flight. Will At get me there on
time? Problems:
1. partial observability (road state, other drivers' plans, noisy sensors)
2. uncertainty in action outcomes (flat tire, etc.)
3. immense complexity of modeling and predicting traffic
Hence a purely logical approach either
• risks falsehood: “A25 will get me there on time”, or
• leads to conclusions that are too weak for decision making:
“A25 will get me there on time if there's no accident on the bridge and it doesn't rain and my
tires remain intact, etc.” (A1440 might reasonably be said to get me there on time, but I'd
have to stay overnight in the airport…)
Probability to the Rescue
Probability
◦ models the agent's degree of belief, given the available
evidence.
◦ A25 will get me there on time with probability 0.04.

Probability in AI models our ignorance, not the true state of
the world.

The statement “With probability 0.7 I have a cavity” means:
I either have a cavity or not, but I don’t have all the
necessary information to know this for sure.
Probability
1. All probability statements must indicate the evidence with
respect to which the probability is being assessed
2. As the agent receives new percepts, the probability assessments
are updated to reflect new evidence
3. Before the evidence is obtained, we have prior or
unconditional probability
4. After the evidence is obtained, we have posterior or
conditional probability
5. Generally, the agent will have some evidence from its percepts
and will be interested in computing the posterior probabilities of
the outcomes it cares about


Probability
Subjective probability:

Probabilities relate propositions to the agent's own state of
knowledge, e.g., P(A25 | no reported accidents at 3 a.m.) = 0.06

Probabilities of propositions change with new evidence:
e.g., P(A25 | no reported accidents at 5 a.m.) = 0.15
Making decisions under uncertainty

Suppose I believe the following:
P(A25 gets me there on time | …) = 0.04
P(A90 gets me there on time | …) = 0.70
P(A120 gets me there on time | …) = 0.95
P(A1440 gets me there on time | …) = 0.9999

Which action to choose?
Depends on my preferences for missing the flight vs. time spent
waiting, etc.
◦ Utility theory is used to represent and infer preferences
◦ Decision theory = probability theory + utility theory
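A toy illustration of that combination in Python, maximizing expected utility; the utilities here are made-up assumptions, purely for illustration:

p_on_time = {'A25': 0.04, 'A90': 0.70, 'A120': 0.95, 'A1440': 0.9999}
wait_minutes = {'A25': 25, 'A90': 90, 'A120': 120, 'A1440': 1440}

def expected_utility(action, u_catch_flight=1000, cost_per_minute=0.5):
    # hypothetical preferences: catching the flight is worth 1000,
    # each minute spent waiting costs 0.5
    return (p_on_time[action] * u_catch_flight
            - wait_minutes[action] * cost_per_minute)

best = max(p_on_time, key=expected_utility)
print(best)   # 'A120' under these (hypothetical) preferences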
Probability Basics

[Figures: basic definitions and axioms of probability]
Syntax
Atomic event: a complete specification of the state of the world about which the agent
is uncertain (i.e., a full assignment of values to all variables in the universe, a unique
single world).
E.g., if the world consists of only two Boolean variables Cavity and Toothache, then there
are 4 distinct atomic events:
Cavity = false ∧ Toothache = false
Cavity = false ∧ Toothache = true
Cavity = true ∧ Toothache = false
Cavity = true ∧ Toothache = true
If some atomic event is true, then all other atomic events are false, and there is always
some atomic event true. Hence, exactly one atomic event is true.
Atomic events are mutually exclusive and the set of all possible atomic events is
exhaustive.
Prior probability

Prior or unconditional probabilities of propositions,
e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72, correspond to belief prior to arrival of any (new)
evidence.

Probability distribution gives values for all possible assignments:
P(Weather) = <0.72, 0.1, 0.08, 0.1> (normalized, i.e., sums to 1)

Joint probability distribution for a set of random variables gives the probability of every atomic event of those
random variables.
P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =        sunny   rain   cloudy   snow
Cavity = true    0.144   0.02   0.016    0.02
Cavity = false   0.576   0.08   0.064    0.08

Every question about a domain can be answered by the joint distribution.
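For instance, marginal and conditional probabilities can be read off the joint table above with a few lines of Python:

joint = {  # (weather, cavity) -> probability, from the table above
    ('sunny', True): 0.144, ('rain', True): 0.02,
    ('cloudy', True): 0.016, ('snow', True): 0.02,
    ('sunny', False): 0.576, ('rain', False): 0.08,
    ('cloudy', False): 0.064, ('snow', False): 0.08,
}
p_cavity = sum(p for (w, c), p in joint.items() if c)            # marginal
p_sunny = sum(p for (w, c), p in joint.items() if w == 'sunny')  # marginal
p_cavity_given_sunny = joint[('sunny', True)] / p_sunny          # conditional
print(p_cavity, p_cavity_given_sunny)   # ~0.2 and 0.144/0.72 = 0.2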
Conditional probability

Definition of conditional probability:
P(a | b) = P(a ∧ b) / P(b) if P(b) > 0

Product rule gives an alternative formulation:
P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)

Bayes Rule: P(a|b) = P(b|a) P(a) / P(b)

A general version holds for whole distributions, e.g.,
P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
(View as a set of 4 × 2 equations, not matrix multiplication)

Chain rule is derived by successive application of the product rule:
P(X1, …, Xn) = P(X1, …, Xn-1) P(Xn | X1, …, Xn-1)
             = P(X1, …, Xn-2) P(Xn-1 | X1, …, Xn-2) P(Xn | X1, …, Xn-1)
             = …
             = ∏i=1..n P(Xi | X1, …, Xi-1)
Independence
A and B are independent iff P(A|B) = P(A), or P(B|A) = P(B), or P(A, B) = P(A) P(B)

P(Toothache, Catch, Cavity, Weather)
= P(Toothache, Catch, Cavity) P(Weather)
32 entries reduced to 12;
for n independent biased coins, O(2^n) → O(n)

Absolute independence is powerful but rare.
Dentistry is a large field with hundreds of
variables, none of which are independent.
What to do?
Bayes' Rule
Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
⇒ Bayes' rule: P(a | b) = P(b | a) P(a) / P(b)
or in distribution form
P(Y|X) = P(X|Y) P(Y) / P(X) = α P(X|Y) P(Y)

◦ E.g., let m be meningitis and s be stiff neck, with P(s | m) = 0.8,
P(m) = 0.0001, P(s) = 0.1. Then P(m|s) = P(s|m) P(m) / P(s) = 0.8 × 0.0001 / 0.1 = 0.0008

◦ Note: even though the probability of having a stiff neck given meningitis is very
large (0.8), the posterior probability of meningitis given a stiff neck is still very small

◦ P(s|m) is more ‘robust’ than P(m|s): if a new disease appeared that also
caused a stiff neck, P(m|s) would change but P(s|m) would not.
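The meningitis computation, spelled out in Python:

p_s_given_m, p_m, p_s = 0.8, 0.0001, 0.1
p_m_given_s = p_s_given_m * p_m / p_s    # Bayes' rule
print(p_m_given_s)                       # ≈ 0.0008: tiny, despite P(s|m) being large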
Bayes' Rule and conditional independence
This is an example of a naïve Bayes model:
P(Cause, Effect1, …, Effectn) = P(Cause) ∏i P(Effecti | Cause)

Total number of parameters is linear in n

A naive Bayes classifier computes: P(cause | effect1,
effect2, ...)
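A minimal naïve Bayes classifier along these lines, with assumed numbers for a toy cavity/toothache/catch domain (the probabilities are illustrative, not from the slides):

p_cause = {'cavity': 0.2, 'no_cavity': 0.8}          # prior P(Cause), assumed
p_effect = {                                         # P(effect = true | cause), assumed
    'cavity':    {'toothache': 0.6, 'catch': 0.9},
    'no_cavity': {'toothache': 0.1, 'catch': 0.2},
}

def posterior(observed_effects):
    # score(c) = P(c) * prod_i P(effect_i | c), then normalize (the alpha above)
    score = dict(p_cause)
    for c in score:
        for e in observed_effects:
            score[c] *= p_effect[c][e]
    z = sum(score.values())
    return {c: s / z for c, s in score.items()}

print(posterior(['toothache', 'catch']))   # cavity becomes far more likely (~0.87)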

• Thank you
