
Dr. Emad A. Mohammed          Information Theory and Coding          4th Year

Information Theory
This subject deals with information and data transmission from one point to another.
The block diagram of a communication system is shown below.

[Block diagram: Source of information (audio, video, telex, computer) → Encoder (modulation, ciphering, error detection, data compression) → Channel (wireless, coaxial, optical fiber; noise + jamming added) → Decoder → Receiver]
The concept of information is related to probability. Any signal that conveys information
must be random, but not vice versa, i.e. not every random signal conveys information (noise is a
random signal that conveys no information).

Self Information
Suppose that a source of information produces a finite set of messages x1, x2, ..., xn with
probabilities P(x1), P(x2), ..., P(xn). The amount of information gained from knowing that the
source produced the message xi (symbol) is related to P(xi) as follows:
1. Information is zero if P(xi)=1 (certain event).
2. Information increases as P(xi) decreases to zero.
3. Information is a positive quantity.
The function that relates P(xi) to the information of xi is called the self-information of xi, I(xi):
I(xi) = -loga P(xi)
The units of I(xi) depend on a.
If a=2 (this is mostly used) then I(xi) has the units of bits.
If a=e=2.718, I(xi) has units of nats.
If a=10, I(xi) has units of Hartley.
Note: loga P = ln P / ln a.
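As a small illustration of this definition and the three unit systems, here is a hedged Python sketch (the function name is illustrative, not part of the notes):

```python
import math

def self_information(p, base=2):
    """Self-information I = -log_a(p); base 2 -> bits, e -> nats, 10 -> Hartleys."""
    return -math.log(p, base)

# A fair die: knowing that a 4 appeared carries -log2(1/6) of information
print(self_information(1/6))          # ≈ 2.585 bits
print(self_information(1/6, math.e))  # ≈ 1.792 nats
print(self_information(1/6, 10))      # ≈ 0.778 Hartleys
```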


Example1: A fair die is thrown; find the amount of information gained if a 4 appears.
Solution:
I(4) = -log2 P(4) = -log2(1/6) = ln 6 / ln 2 = 2.585 bits

Example2: A coin has P(head) = 0.3. Find the amount of information gained if a tail
appears.
Solution:
P(tail) = 1 - P(head) = 0.7
I(tail) = -log2 0.7 = -ln 0.7 / ln 2 = 0.5146 bits

Example3: Find the amount of information contained in a TV picture which has 2×10^5 pixels,
where each pixel has 8 equiprobable levels of color.
Solution:
P(each level) = 1/8
Information/pixel = -log2 P(level) = -log2(1/8) = 3 bits/pixel
Information/picture = 3 × 2×10^5 = 6×10^5 bits = 600 kbit.
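A one-line check of this arithmetic (values taken from the example):

```python
import math

pixels = 2 * 10**5                           # pixels per picture
levels = 8                                   # equiprobable color levels per pixel

bits_per_pixel = -math.log2(1 / levels)      # = 3 bits/pixel
bits_per_picture = bits_per_pixel * pixels   # = 6e5 bits = 600 kbit
print(bits_per_pixel, bits_per_picture)      # 3.0 600000.0
```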

H.W: Find the amount of information contained in a TV picture which has 2×10^5 pixels, where
each pixel has 8 equiprobable levels of brightness and 16 equiprobable levels of color.

Source Entropy
In practical communication systems, we usually transmit long sequences of symbols from
an information source. Thus we are more interested in the average information that a source
produces than in the information content of a single symbol. This average is called the source
entropy and is denoted H(x).
H(x) = Σ_{i=1}^{n} P(xi) I(xi)

Or H(x) = -Σ_{i=1}^{n} P(xi) log2 P(xi)   bits/symbol

Example: Find the entropy of a source that produces four symbols with the following
probabilities:
P(x) = [0.25 0.1 0.15 0.5]
Solution:
H(x) = -Σ_{i=1}^{n} P(xi) log2 P(xi)
     = -(1/ln 2)(0.25 ln 0.25 + 0.1 ln 0.1 + 0.15 ln 0.15 + 0.5 ln 0.5)

H(x)=1.7427 bits/symbol.
Note: H(x) is maximum when all symbol probabilities are equal; H(x) = log2 n for a source that
produces n equiprobable symbols. Prove this.
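A minimal sketch of the entropy formula, checked against the example above and against the equiprobable case (the function name is illustrative):

```python
import math

def entropy(probs):
    """Source entropy H(x) = -sum P(xi) log2 P(xi), in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.1, 0.15, 0.5]))    # ≈ 1.7427 bits/symbol, as in the example
print(entropy([0.25] * 4), math.log2(4))  # equiprobable source: both give 2.0
```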

Source Entropy Rate (Rx)


It represents the average amount of information produced per second.
R(x) = H(x) · r   bit/sec
where r is the rate of producing symbols (symbols/sec).
Or R(x) = H(x) / τ
where τ = Σ_{i=1}^{n} τi P(xi) is the average time duration of the symbols and τi is the time duration of xi.


Example1: A source produces dots and dashes with P(dot) = 0.65. If the time duration of a dot is
200 ms and that of a dash is 800 ms, find the average source entropy rate.
Solution:
P(dash)=1-0.65=0.35
H(x) = -[0.65 log2 0.65 + 0.35 log2 0.35] = 0.934 bit/symbol

τ = Σ_{i=1}^{n} τi P(xi) = 200 × 0.65 + 800 × 0.35 = 410 ms
R(x) = H(x) / τ = 0.934 / (410 × 10^-3) = 2.28 bit/sec.

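A short numeric check of the dot/dash example above (probabilities and durations are those given in the example):

```python
import math

p   = [0.65, 0.35]      # P(dot), P(dash)
tau = [0.200, 0.800]    # symbol durations in seconds

H_x     = -sum(pi * math.log2(pi) for pi in p)      # ≈ 0.934 bit/symbol
tau_avg = sum(ti * pi for ti, pi in zip(tau, p))    # = 0.410 s average symbol duration
R_x     = H_x / tau_avg                             # ≈ 2.28 bit/sec
print(H_x, tau_avg, R_x)
```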
Example2: In a telex link, information is arranged in blocks of 8 characters. The 1st position
(character) in each block is always kept the same for synchronization purposes. The
remaining 7 positions of the block are filled from the English alphabet with equal
probability. If the system produces 400 blocks/second, find the source entropy rate.
Solution:
[Block illustration: sync character + 7 random characters, e.g. Z R W B G T S]

Each of the 7 positions behaves as a source that randomly produces one of the 26 English
letters with equal probability 1/26.
Information/position = -log2(1/26) = 4.7 bits/character
Information/block = 7 × 4.7 = 32.9 bits/block
R(x) = (Information/block) × (blocks/sec) = 32.9 × 400 = 13160 bit/sec.
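A quick check of this calculation (26 letters, 7 random positions, 400 blocks/sec, as stated above):

```python
import math

bits_per_char  = -math.log2(1 / 26)      # ≈ 4.70 bits/character
bits_per_block = 7 * bits_per_char       # ≈ 32.9 bits/block (the sync character carries no information)
R_x            = bits_per_block * 400    # ≈ 13160 bit/sec
print(bits_per_char, bits_per_block, R_x)
```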

Mutual Information
Consider the set of symbols x1, x2, ..., xn that the source may produce. The receiver may receive
y1, y2, ..., ym. Theoretically, if the noise and jamming are zero, then the set x = the set y and n = m.
However, due to noise and jamming, there will be a conditional probability P(y/x). The
amount of information that yj provides about xi is called the mutual information between xi and
yj. It is given by:
I(xi, yj) = log2 [P(xi/yj) / P(xi)]

I(xi, yj) = log2 [P(yj/xi) / P(yj)] = I(yj, xi)

i.e. the mutual information is symmetric.

Note: P(yj/xi) ≠ P(xi/yj) in general.
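A small numeric illustration of the pairwise mutual information and its symmetry (the probabilities below are assumed for illustration, not taken from the notes):

```python
import math

# Assumed toy numbers: P(x1) = 0.5, P(y1) = 0.4, joint P(x1, y1) = 0.3
p_x, p_y, p_xy = 0.5, 0.4, 0.3

# I(x1, y1) = log2[ P(x1/y1) / P(x1) ] = log2[ P(y1/x1) / P(y1) ]
i_from_x_side = math.log2((p_xy / p_y) / p_x)   # via P(x/y) = P(x,y)/P(y)
i_from_y_side = math.log2((p_xy / p_x) / p_y)   # via P(y/x) = P(x,y)/P(x)
print(i_from_x_side, i_from_y_side)             # both ≈ 0.585 bits (symmetric)
```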


Properties of mutual information

1) it is symmetric i.e I ( y j , x i ) = I ( x i , y j )

2) I ( x i , y j ) >0 if P(y/x)>P(x) i.e y provide positive information about x.

3) I ( x i , y j ) <0 if P(y/x)<P(x) i.e y adds ambiguity.

4) I ( x i , y j ) =0 if P(y/x)=P(x) i.e statistical independent y gives no information about x.


Transinformation (average mutual information)
It is the statistical average of I(xi, yj) over all pairs, i = 1, 2, ..., n, j = 1, 2, ..., m.
Transinformation = I(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} I(xi, yj) P(xi, yj)

I(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(xi, yj) log2 [P(xi/yj) / P(xi)]

I(Y, X) = Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 [P(yj/xi) / P(yj)]

Marginal entropies
This term is usually used to denote both the source entropy H(x) and the receiver entropy H(y), i.e. the margins of the channel.
[Diagram: Tx → x → Channel → y → Rx]
H(y) = -Σ_{j=1}^{m} P(yj) log2 P(yj)

Joint and conditional entropies


The average amount of information associated with the pair (xi, yj) is called the joint (or
system) entropy.

H(x, y) = H(xy) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi, yj)   bits/symbol

The average amounts of information associated with the pairs (xi/yj) and (yj/xi) are called the
conditional entropies.
H(y/x) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(yj/xi)   bits/symbol   (noise entropy)

H(x/y) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi/yj)   bits/symbol   (losses entropy)

Example1: Show that H(x,y)=H(x)+H(y/x).


Solution:
H(x, y) = H(xy) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi, yj)
But P(xi, yj) = P(xi) P(yj/xi), so
H(x, y) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi) - Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(yj/xi)
Since Σ_{j=1}^{m} P(xi, yj) = P(xi),
H(x, y) = -Σ_{i=1}^{n} P(xi) log2 P(xi) - Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(yj/xi)
Then
H(x,y)=H(x)+H(y/x).
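As a quick numeric sanity check of this identity (the 2×2 joint matrix below is illustrative, not from the notes):

```python
import math

p_xy = [[0.4, 0.1],
        [0.1, 0.4]]                            # illustrative joint matrix, rows index x
p_x  = [sum(row) for row in p_xy]              # marginal P(xi)

H_x   = -sum(p * math.log2(p) for p in p_x)
H_xy  = -sum(p * math.log2(p) for row in p_xy for p in row)
H_y_x = -sum(p * math.log2(p / p_x[i]) for i, row in enumerate(p_xy) for p in row)

print(H_xy, H_x + H_y_x)   # both ≈ 1.722, confirming H(x,y) = H(x) + H(y/x)
```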

H.W: show that H(x,y)=H(y)+H(x/y).

Example2: Show that I(X,Y)=H(x)-H(x/y)


I(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(xi, yj) log2 [P(xi/yj) / P(xi)]

I(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} P(xi, yj) log2 P(xi/yj) - Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi)
The first double sum equals -H(x/y), and since Σ_{j=1}^{m} P(xi, yj) = P(xi), the second term equals -Σ_{i=1}^{n} P(xi) log2 P(xi) = H(x). Then

I(X,Y)=H(x)-H(x/y).
Note: the above identity indicates that the transinformation I(X,Y) is the net average
information gained at the Rx, which is the difference between the average information
produced by the source, H(x), and the information lost in the channel, H(x/y) (losses
entropy), due to noise and jamming.
H.W: Show that I(X,Y)=H(y)-H(y/x).

Example3: Show that I(X,Y)=0 for an extremely noisy channel.


Solution: For an extremely noisy channel, yj gives no information about xi (the Rx cannot
decide anything about xi; even if we transmit a deterministic signal xi, the receiver receives a
noise-like signal yj that has no correlation with xi). Then xi and yj are independent and
P(xi/yj) = P(xi) for all i and j, so
I(xi,yj)=log21=0
Then I(X,Y)= average of I(xi,yj)=0
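A brief numeric illustration of this result: if the joint distribution factors as P(xi, yj) = P(xi)P(yj) (the marginals below are assumed for illustration), every pairwise term vanishes:

```python
import math

p_x = [0.7, 0.3]                                  # assumed source probabilities
p_y = [0.5, 0.5]                                  # assumed receiver probabilities
p_xy = [[px * py for py in p_y] for px in p_x]    # independence: P(xi,yj) = P(xi)P(yj)

I = sum(p_xy[i][j] * math.log2((p_xy[i][j] / p_y[j]) / p_x[i])
        for i in range(2) for j in range(2))
print(I)   # 0.0 — an extremely noisy (independent) channel conveys no information
```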
Example4: The joint probability is given by:
            | 0.5      0.25   |
P(x, y) =   | 0        0.125  |
            | 0.0625   0.0625 |

1) Find the marginal entropies.


2) Find the joint (system) entropies.
3) Find the losses and noise entropies.
4) Find the mutual information I(x1, y2).
5) Find the transinformation.
6) Draw the channel model.
Solution:
1) P(x) = [0.75 0.125 0.125] and P(y) = [0.5625 0.4375]

Then H(x)=-[0.75log2 0.75+0.125 log2 0.125+0.125 log2 0.125]=1.06127 bit/symbol
H(y)=-[0.5625log20.5625+0.4375log20.4375]=0.9887 bit/symbol.
2) H(x, y) = -Σ_{j=1}^{m} Σ_{i=1}^{n} P(xi, yj) log2 P(xi, yj)
   = -[0.5 log2 0.5 + 0.25 log2 0.25 + 0.125 log2 0.125 + 2(0.0625 log2 0.0625)]

H(xy) = 1.875 bit/symbol
3) H(y/x)=H(x,y)-H(x)=1.875-1.06127=0.81373 bit/symbol
H(x/y)=H(x,y)-H(y)=1.875-0.9887=0.8863 bit/symbol
4) I(x1, y2) = log2 [P(x1/y2) / P(x1)],  but P(x1/y2) = P(x1, y2) / P(y2)

I(x1, y2) = log2 [P(x1/y2) / P(x1)] = log2 [0.25 / (0.75 × 0.4375)] = -0.3923 bits.
That means y2 gives ambiguity about x1.


5) I(X,Y)=H(x)-H(x/y)=1.06127-0.8863=0.17497 bits/symbol
6) To draw the channel model, we find the P(y/x) matrix from the P(x,y) matrix (dividing each row by P(xi)):
            | 0.5/0.75        0.25/0.75     |   | 2/3   1/3 |
P(y/x) =    | 0/0.125         0.125/0.125   | = | 0     1   |
            | 0.0625/0.125    0.0625/0.125  |   | 1/2   1/2 |

[Channel model: x1 → y1 (2/3), x1 → y2 (1/3); x2 → y2 (1); x3 → y1 (1/2), x3 → y2 (1/2)]
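The numbers in Example4 can be reproduced with a short script; this is a sketch (variable names are mine), using only the joint matrix given above:

```python
import math

p_xy = [[0.5,    0.25],
        [0.0,    0.125],
        [0.0625, 0.0625]]                     # joint matrix P(x,y): rows x1..x3, columns y1, y2

p_x = [sum(row) for row in p_xy]              # [0.75, 0.125, 0.125]
p_y = [sum(col) for col in zip(*p_xy)]        # [0.5625, 0.4375]

H = lambda ps: -sum(p * math.log2(p) for p in ps if p > 0)
H_x, H_y = H(p_x), H(p_y)                     # ≈ 1.0613, 0.9887  (marginal entropies)
H_xy = H(p for row in p_xy for p in row)      # = 1.875           (joint entropy)
H_y_given_x = H_xy - H_x                      # ≈ 0.8137          (noise entropy)
H_x_given_y = H_xy - H_y                      # ≈ 0.8863          (losses entropy)
I_x1_y2 = math.log2((p_xy[0][1] / p_y[1]) / p_x[0])   # ≈ -0.3923 bits
I_XY = H_x - H_x_given_y                      # ≈ 0.1750 bits/symbol (transinformation)

# Channel model: P(y/x), each row of P(x,y) divided by P(xi)
p_y_given_x = [[p / p_x[i] for p in row] for i, row in enumerate(p_xy)]
print(H_x, H_y, H_xy, H_y_given_x, H_x_given_y, I_x1_y2, I_XY, p_y_given_x)
```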

H.W: For the channel model shown, with I(x1) = 2 bits:
[Channel model: x1 → y1 (0.9), x1 → y2 (0.1); x2 → y2 (0.2), x2 → y3 (0.8)]
1) Find the source entropy rate if τx1 = 1 msec and τx2 = 2 msec.
2) Find the transinformation.