Lecture 2 - Alphabets-Strings, Languages and Grammars
STRINGS AND LANGUAGES
Alphabet and Strings
► An alphabet (Σ) is a finite, non-empty set of symbols.
► {0,1 } is a binary alphabet.
► { A, B, …, Z, a, b, …, z } is an English alphabet.
► A string (word) over an alphabet Σ is a finite sequence of symbols
chosen from Σ.
► 0, 1, 11, 00, and 01101 are strings over {0, 1 }.
► Cat, CAT, and compute are strings over the English alphabet.
Empty String
► An empty string, denoted by ε / λ (epsilon / lambda), is a
string containing no symbol.
► ε / λ is a string over any alphabet.
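The definition of a string over an alphabet can be checked mechanically. The following is a minimal sketch, assuming symbols are single characters and an alphabet is a Python set; the name `is_string_over` is hypothetical and used only for illustration.

```python
def is_string_over(word: str, alphabet: set) -> bool:
    """Return True if every symbol of `word` belongs to `alphabet`."""
    return all(symbol in alphabet for symbol in word)


BINARY = {"0", "1"}  # the binary alphabet from the slide

print(is_string_over("01101", BINARY))  # True
print(is_string_over("", BINARY))       # True: ε is a string over any alphabet
print(is_string_over("012", BINARY))    # False: 2 is not in the alphabet
```

The empty Python string `""` plays the role of ε here, which is why the check succeeds for it over every alphabet.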
Length of a string
► The length of a string x, denoted by length(x) or |x|, is the number
of symbols (counted with repetition) in the string.
Let Σ = {a, b, …, z}
length(automata) = 8
length(computation) = 11
length(ε) = 0
► x(i) denotes the symbol in the i-th position of a string x, for 1 ≤ i ≤
length(x).
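A short sketch of these two notions in Python, assuming strings are ordinary `str` values. The helper `symbol_at` is a hypothetical name; it only translates the 1-based position x(i) into Python's 0-based indexing.

```python
def symbol_at(x: str, i: int) -> str:
    """Return x(i), the symbol at 1-based position i of the string x."""
    if not 1 <= i <= len(x):
        raise IndexError("position must satisfy 1 <= i <= length(x)")
    return x[i - 1]  # Python indexes strings from 0


print(len("automata"))            # 8
print(len("computation"))         # 11
print(len(""))                    # 0, the length of the empty string
print(symbol_at("automata", 1))   # a
```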
Powers of an Alphabet
• Σ^k is defined to be the set of strings of length k, each of whose symbols is in Σ.
Σ* (* - Kleene Closure)
► The set of all strings over an alphabet Σ is denoted by Σ*.
► That is, Σ* = ⋃_{i=0}^{∞} Σ^i.
► Let Σ = {0, 1}.
► Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
► {0, 1}* = {0, 1}^0 ∪ {0, 1}^1 ∪ {0, 1}^2 ∪ …
= {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, … }.
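Since Σ* is infinite, a program can only list it up to some length bound. The sketch below enumerates Σ^k with `itertools.product` and collects Σ^0 through Σ^n; the function names `power` and `kleene_up_to` are illustrative assumptions, not standard terminology.

```python
from itertools import product


def power(alphabet, k):
    """Return Σ^k: all strings of length k over `alphabet`, in sorted order."""
    return ["".join(symbols) for symbols in product(sorted(alphabet), repeat=k)]


def kleene_up_to(alphabet, max_len):
    """Return the strings of Σ* whose length is at most `max_len`."""
    strings = []
    for k in range(max_len + 1):
        strings.extend(power(alphabet, k))
    return strings


print(power({"0", "1"}, 0))         # [''], i.e. Σ^0 = {ε}
print(kleene_up_to({"0", "1"}, 2))  # ['', '0', '1', '00', '01', '10', '11']
```

Note that Σ^k contains |Σ|^k strings, so the bounded listing grows exponentially with the bound.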
Σ+ (+ - Positive Closure)
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ … (includes ε / λ)
Σ+ = Σ^1 ∪ Σ^2 ∪ … (doesn’t include ε / λ)
Σ+ = Σ* - {ε}
String Operations
► Concatenation
► Suffix, Prefix
► Substring
► Reversal
Concatenation
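► The concatenation (⋅) of strings x and y, denoted by x⋅y or xy,
is the string z formed by the symbols of x followed by the symbols of y, so that:
► |z| = |x|+|y| (the length of z is the sum of the lengths of x and y)
► In general, xy is not equal to yx (concatenation is not commutative)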
► Example
► automata⋅computation = automatacomputation
Concatenation
The concatenation of a string x with itself n times, where n ≥ 0, is denoted
by x^n
► x^0 = ε
► x^1 = x
► x^2 = x x
► x^3 = x x x
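Concatenation and string powers map directly onto Python's `+` on strings. The following sketch spells out the power by repeated concatenation; `string_power` is a hypothetical helper (Python's built-in `x * n` gives the same result).

```python
def string_power(x: str, n: int) -> str:
    """Return x concatenated with itself n times (n >= 0); n = 0 gives ε."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = ""  # the empty string ε
    for _ in range(n):
        result = result + x
    return result


x, y = "automata", "computation"
z = x + y
print(z)                            # automatacomputation
print(len(z) == len(x) + len(y))    # True
print(x + y == y + x)               # False: concatenation is not commutative here
print(string_power("ab", 3))        # ababab
```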
Binary relations on strings
1. Prefix - u is a prefix of v if there is a w such that v = uw.
► Examples:
► ε is a prefix of 0 since 0 = ε0
► apple is a prefix of appleton since appleton = apple.ton
2. Suffix - u is a suffix of v if there is a w such that v = wu.
► Examples:
► 0 is a suffix of 0 since 0 = ?
► ton is a suffix of appleton since ?
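The two relations can be tested by following the definitions literally: u is a prefix of v when v starts with the symbols of u, and a suffix when v ends with them. This is a minimal sketch with hypothetical helper names; Python's built-in `str.startswith` and `str.endswith` perform the same checks.

```python
def is_prefix(u: str, v: str) -> bool:
    """True if there is a w with v = u + w."""
    return v[:len(u)] == u


def is_suffix(u: str, v: str) -> bool:
    """True if there is a w with v = w + u."""
    return len(u) <= len(v) and v[len(v) - len(u):] == u


print(is_prefix("", "0"))              # True: ε is a prefix of every string
print(is_prefix("apple", "appleton"))  # True
print(is_suffix("ton", "appleton"))    # True
print(is_suffix("apple", "appleton"))  # False
```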
Substring
Let x and y be strings over an alphabet Σ
The string x is a substring of y if there exist strings w and z over Σ such
that y = w x z.
► ε is a substring of every string.
► For every string x, x is a substring of x itself.
Example
► ε, comput and computation are substrings of computation.
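The substring definition says that y can be split as w x z. A sketch that searches for such a split by trying every starting position is given below; `is_substring` is an illustrative name, and Python's `x in y` gives the same answer.

```python
def is_substring(x: str, y: str) -> bool:
    """True if y = w + x + z for some strings w and z."""
    # Try every position where x could start inside y.
    return any(y[i:i + len(x)] == x for i in range(len(y) - len(x) + 1))


print(is_substring("", "computation"))             # True
print(is_substring("comput", "computation"))       # True
print(is_substring("computation", "computation"))  # True
print(is_substring("automata", "computation"))     # False
```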
Reversal
Let x be a string over an alphabet Σ
The reversal of the string x, denoted by x^R, is a
string such that
► If x is ε, then x^R is ε.
► If a is in Σ, y is in Σ* and x = a y, then x^R = y^R a.
► Example: Let x = “computation”
x^R = “noitatupmoc”
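The recursive definition of reversal translates almost line by line into code. This is a sketch that mirrors the two cases above; the function name `reverse` is an assumption, and in practice the slice `x[::-1]` does the same job.

```python
def reverse(x: str) -> str:
    """Reverse x by following the recursive definition of reversal."""
    if x == "":
        return ""           # the reversal of ε is ε
    a, y = x[0], x[1:]      # split x as a y, with a a single symbol
    return reverse(y) + a   # reversal of (a y) is (reversal of y) followed by a


print(reverse("computation"))  # noitatupmoc
```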
LANGUAGES
• A set of strings all of which are chosen from some Σ* where Σ is a
particular alphabet is called a language. If Σ is an alphabet, and L ⊆ Σ*
then L is a language over Σ .
• Notice that a language over Σ need not include strings with all the
symbols of Σ, so once we have established that L is a language over Σ,
we also know it is a language over any alphabet that is a superset of
Σ.
► Let Σ = {0, 1} be the alphabet.
► Le = {ω ∈ Σ* | the number of 1’s in ω is even}.
► ε, 0, 00, 11, 000, 110, 101, 011, 0000, 1100, 1010, 1001, 0110, 0101,
0011, … are in Le
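A language can be represented in code by a membership test. The sketch below checks membership in Le, assuming strings over {0, 1}; the name `in_Le` is hypothetical.

```python
def in_Le(w: str) -> bool:
    """True if w is a string over {0, 1} with an even number of 1's."""
    if not set(w) <= {"0", "1"}:
        return False  # not a string over the alphabet at all
    return w.count("1") % 2 == 0


for w in ["", "0", "11", "101", "1", "10"]:
    print(repr(w), in_Le(w))
# '' True, '0' True, '11' True, '101' True, '1' False, '10' False
```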
Operations on Languages
► Complementation
► Union
► Intersection
► Concatenation
► Reversal
► Closure
Complementation
Let L be a language over an alphabet Σ.
The complementation of L, denoted by L̄, is Σ* – L.
Example:
Let Σ = {0, 1} be the alphabet.
Le = {ω∈Σ* | the number of 1’s in ω is even}.
L̄e = {ω∈Σ* | the number of 1’s in ω is not even}
= {ω∈Σ* | the number of 1’s in ω is odd}.
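Because Σ* is infinite, the complement can only be listed up to a length bound. The following sketch splits all binary strings of length at most 2 into Le and its complement; the helper names are illustrative assumptions.

```python
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length at most `max_len`."""
    return ["".join(t) for k in range(max_len + 1)
            for t in product(alphabet, repeat=k)]


def has_even_ones(w):
    return w.count("1") % 2 == 0


universe = strings_up_to("01", 2)
Le_part = [w for w in universe if has_even_ones(w)]
complement_part = [w for w in universe if not has_even_ones(w)]

print(Le_part)          # ['', '0', '00', '11']
print(complement_part)  # ['1', '01', '10']
```

Every string in the bounded universe lands in exactly one of the two lists, which is the defining property of a complement.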
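Union
Let L1 and L2 be languages over an alphabet Σ.
The union of L1 and L2, denoted by L1 ∪ L2, is {w | w is in L1 or w is in L2}.
Intersection
The intersection of L1 and L2, denoted by L1 ∩ L2, is {w | w is in L1 and w is in L2}.
Example:
{ x∈{0,1}* | x begins with 0} ∩ { x∈{0,1}* | x ends with 0}
= { x∈{0,1}* | x begins and ends with 0}
L1 = {0, 01, 1}, L2 = {1, 11}
L1 ∩ L2 = {1}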
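For finite languages, the set operations are exactly Python's set operations. A small sketch using the sets L1 and L2 from the example:

```python
L1 = {"0", "01", "1"}
L2 = {"1", "11"}

print(L1 & L2)          # {'1'}, the intersection
print(sorted(L1 | L2))  # ['0', '01', '1', '11'], the union
```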
Concatenation
Let L1 and L2 be languages over an alphabet Σ.
The concatenation of L1 and L2, denoted by L1⋅L2, is {w1⋅w2| w1 is in L1 and
w2 is in L2}.
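For finite languages the concatenation can be computed by pairing every string of L1 with every string of L2. The sketch below reuses the sets L1 = {0, 01, 1} and L2 = {1, 11} from the intersection example; `concat_languages` is a hypothetical name.

```python
def concat_languages(L1, L2):
    """Return L1⋅L2 = {w1 w2 | w1 in L1 and w2 in L2} for finite languages."""
    return {w1 + w2 for w1 in L1 for w2 in L2}


L1 = {"0", "01", "1"}
L2 = {"1", "11"}
print(sorted(concat_languages(L1, L2)))
# ['01', '011', '0111', '11', '111']
```

The result has five strings rather than six because 0⋅11 and 01⋅1 both give 011, so the size of L1⋅L2 can be smaller than |L1| times |L2|.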
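Closure
The Kleene closure of L is L* = L^0 ∪ L^1 ∪ L^2 ∪ …, where L^0 = {ε} and L^(i+1) = L^i⋅L.
The positive closure of L is L+ = L^1 ∪ L^2 ∪ …
Example:
Le = {ω∈Σ* | the number of 1’s in ω is even}
Le+ = {ω∈Σ* | the number of 1’s in ω is even} = Le*
Why?
L* = L+ ∪ {ε} ?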
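The closure of a language is usually infinite, so the sketch below computes only the strings up to a length bound by repeatedly concatenating with strings of the language. The function `closure_up_to` and its `positive` flag are assumptions made for illustration. It is used here to check, on all strings of length at most 3, that the even-number-of-1's language equals both its positive closure and its Kleene closure.

```python
from itertools import product


def closure_up_to(language, max_len, positive=False):
    """Strings of length <= max_len in L* (or in L+ if positive=True)."""
    base = {w for w in language if len(w) <= max_len}
    result = set(base) if positive else {""} | base
    frontier = set(base)
    while frontier:
        # Extend the newest strings by one more factor taken from the language.
        new = {u + w for u in frontier for w in base
               if len(u + w) <= max_len} - result
        result |= new
        frontier = new
    return result


print(sorted(closure_up_to({"ab"}, 4)))                 # ['', 'ab', 'abab']
print(sorted(closure_up_to({"ab"}, 4, positive=True)))  # ['ab', 'abab']

# All binary strings of length <= 3 with an even number of 1's.
Le3 = {"".join(t) for k in range(4) for t in product("01", repeat=k)
       if t.count("1") % 2 == 0}
print(closure_up_to(Le3, 3, positive=True) == Le3)  # True
print(closure_up_to(Le3, 3) == Le3)                 # True
```

A bounded check like this is evidence, not a proof; the argument for all lengths rests on the facts that ε has zero 1's and that concatenating two strings with an even number of 1's gives an even number of 1's.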
Grammars
► Just like English, languages can be described by grammars.
► Example 1:
S → Noun Verb-Phrase
Verb-Phrase → Verb Noun
Noun → { Reeta, Puppy }
Verb → { loves, cares }
► The sentences formed from the grammar:
Reeta loves Reeta
Reeta loves Puppy
Reeta cares Reeta
Reeta cares Puppy
Puppy loves Reeta
Puppy loves Puppy
Puppy cares Reeta
Puppy cares Puppy
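Every sentence of Example 1 has the shape Noun Verb Noun, so the whole language can be listed by trying each combination. The following is a minimal sketch using `itertools.product`; the variable names are illustrative.

```python
from itertools import product

NOUNS = ["Reeta", "Puppy"]
VERBS = ["loves", "cares"]

# S → Noun Verb-Phrase and Verb-Phrase → Verb Noun together give Noun Verb Noun.
sentences = [f"{subject} {verb} {obj}"
             for subject, verb, obj in product(NOUNS, VERBS, NOUNS)]

for sentence in sentences:
    print(sentence)
print(len(sentences))  # 8
```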
► Example 2 continues:
► A derivation of “the boy sleeps”:
► A derivation of “ a dog runs”:
► Language of the grammar:
L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
Notation
► Variable or Non-terminal: symbols of the vocabulary
► Terminal: symbols of the vocabulary
► Production rule
Grammar – Formal Definition
► G = (V,T,S,P) is a 4-tuple, in which:
► V is a finite set of objects called variables (nonterminals).
► T is a finite set of objects called terminals.
► S ∈ V is a special variable (nonterminal), the start symbol.
► In Example 2 the start symbol was “sentence”.
► P is a set of productions (to be defined).
► Rules for substituting one sentence fragment for another.
► Every production rule must contain at least one variable on its left side.
Example 3:
► Grammar: V = {S}, T = {a, b}, P =
► Derivation of sentence aabb:
► Derivation of sentence:
► Derivation of other sentences:
Grammar DERIVATION
► EXAMPLE 4:
► Let G = (V, T, S, P), where
► V = {A, B, S}
► T = {a, b},
► S is a start symbol
► P = {S → ABa, A → BB, B → ab, A → Bb}.
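The grammar of Example 4 is small enough to explore by program. The sketch below applies productions to the leftmost variable of each sentential form, breadth-first, and collects the forms that contain only terminals. The dictionary layout, the function name `generate`, and the `limit` safeguard are assumptions made for illustration; the approach relies on each variable being a single uppercase character.

```python
from collections import deque

# Productions of Example 4, grouped by the variable on the left side.
PRODUCTIONS = {"S": ["ABa"], "A": ["BB", "Bb"], "B": ["ab"]}
VARIABLES = set(PRODUCTIONS)


def generate(start="S", limit=100):
    """Collect terminal strings reachable by leftmost derivations."""
    found = set()
    queue = deque([start])
    steps = 0
    while queue and steps < limit:  # limit guards against endless grammars
        form = queue.popleft()
        steps += 1
        # Position of the leftmost variable, or None if the form is all terminals.
        idx = next((i for i, s in enumerate(form) if s in VARIABLES), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            queue.append(form[:idx] + rhs + form[idx + 1:])
    return found


print(sorted(generate()))  # ['abababa', 'abbaba']
```

One leftmost derivation found this way is S ⇒ ABa ⇒ BBBa ⇒ abBBa ⇒ ababBa ⇒ abababa. Since no production is recursive, the grammar generates only these two strings.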
CHOMSKY HIERARCHY
Language | Grammar | Machine | Example
Type 3: Regular Language | Regular Grammar (right-linear grammar, left-linear grammar) | Deterministic or Nondeterministic Finite-state acceptor | a*