

the PGF G′, G′′ etc denote derivatives of G(z).

4.4 Application: Entropy and Source Coding

In Chapter 2 we discussed the topics of information theory and coding. Recall that the information associated with an event $A_i$ is defined as

$$I(A_i) = \log \frac{1}{\Pr[A_i]} = -\log \Pr[A_i]$$

where the logarithm is taken with base 2 and the information is therefore measured in ‘bits.’ Now reconsider the problem where the ‘events’ represent the transmission of symbols over a communication channel. As such, the events $A_i$ are mutually exclusive and collectively exhaustive.

With the concept of expectation developed in this chapter, we are in a position to formally define the entropy H for the set of symbols as the average information, that is,

$$H = \sum_i \Pr[A_i]\, I(A_i) = -\sum_i \Pr[A_i] \log \Pr[A_i] \qquad (4.26)$$

Now suppose that a binary code is associated with each symbol to be transmitted. Let the discrete random variable L represent the length of a code word and define the average length of the code as

$$m_L = E\{L\} = \sum_l l \, \Pr[L = l] \qquad (4.27)$$

Shannon’s source coding theorem can be stated as follows [12].

Theorem. Given any discrete memoryless source with entropy H, the average length $m_L$ of any lossless code for encoding the source satisfies

$$m_L \ge H \qquad (4.28)$$

Here the term “lossless” refers to the property that the output of the source can be completely reconstructed without any errors or substitutions. The theorem can be illustrated by a continuation of Example 2.10 of Chapter 2.

Example 4.9: Recall that in the coding of the message “ELECTRICAL ENGINEERING” the probabilities of the letters (excluding the space) are taken to be their relative frequencies of occurrence, as listed in Example 2.14. The codeword and the length of the codeword associated with each letter by the Shannon-Fano algorithm are also shown in the figure.

It can be seen that the random variable L representing the length of the codeword can take on only three possible values. The PMF describing the random variable can be constructed by adding up the probabilities of the codewords of each length. This PMF is shown below:

The average length of the codewords is computed from (4.27) and is given by¹²

Now, let us compute the source entropy. From (4.26) and Fig. 2.16, the entropy is given by

The result shows that the bound (4.28) in Shannon’s theorem is satisfied. The average code length produced by the Shannon-Fano procedure is, however, very close to the bound, which shows that the Shannon-Fano code is very close to an optimal code for this message.
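As a rough numerical check of this example, the sketch below recomputes the source entropy (4.26) directly from the letter counts of the message and compares it with an average codeword length computed as in (4.27). The codeword lengths in `assumed_lengths` are an assumption: they correspond to one possible Shannon-Fano splitting of the letters and are not taken from Fig. 2.16, so the average length obtained here need not match the value in the text. All function and variable names are illustrative.

```python
from collections import Counter
from math import log2


def letter_pmf(message):
    """Relative frequency of each letter in the message, ignoring spaces."""
    letters = [ch for ch in message if ch != " "]
    counts = Counter(letters)
    total = len(letters)
    return {ch: n / total for ch, n in counts.items()}


def entropy_bits(pmf):
    """Entropy H = -sum p*log2(p) in bits, as in (4.26)."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def average_length(pmf, lengths):
    """Average codeword length m_L = sum p*l, as in (4.27)."""
    return sum(p * lengths[ch] for ch, p in pmf.items())


pmf = letter_pmf("ELECTRICAL ENGINEERING")

# ASSUMPTION: hypothetical codeword lengths from one possible Shannon-Fano
# splitting; the lengths in the book's figure may differ.
assumed_lengths = {"E": 2, "I": 3, "N": 3, "C": 3, "G": 3,
                   "L": 4, "R": 4, "A": 4, "T": 4}

H = entropy_bits(pmf)
m_L = average_length(pmf, assumed_lengths)

print(f"H   = {H:.3f} bits")
print(f"m_L = {m_L:.3f} bits")
print("bound (4.28) satisfied:", m_L >= H)
```

Because the entropy depends only on the letter frequencies, the value of `H` does not depend on the assumed code; only `m_L` changes if different codeword lengths are substituted.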

¹² This could also be computed by multiplying the length of each codeword in Fig. 2.16 by its probability and adding.

The original definition of entropy is in terms of source and channel coding. The basic definition, however, extends to any random variable (not just codewords). The entropy for any discrete random variable K is defined as

$$H_K = -\sum_k \Pr[K = k] \log \Pr[K = k] \qquad (4.29)$$

while the entropy for a continuous random variable X is given by

$$H_X = -\int_{-\infty}^{\infty} f_X(x) \log f_X(x)\, dx \qquad (4.30)$$

These formulas have many uses in engineering problems, and a number of optimal design procedures are based on minimizing or maximizing entropy.
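To make (4.29) and (4.30) concrete, the following sketch evaluates both for two standard cases chosen here for illustration: a uniform PMF over eight values, whose entropy is 3 bits, and a Gaussian PDF, whose entropy has the well-known closed form (1/2) log2(2πeσ²). The choice of examples, the integration limits, and the step count are assumptions of this sketch, and the integral is approximated with a simple midpoint rule.

```python
from math import e, exp, log2, pi, sqrt


def discrete_entropy(pmf_values):
    """Entropy in bits of a discrete random variable, as in (4.29)."""
    return -sum(p * log2(p) for p in pmf_values if p > 0)


def continuous_entropy(pdf, lo, hi, steps=100_000):
    """Midpoint-rule approximation of (4.30) in bits over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        fx = pdf(x)
        if fx > 0:  # skip points where the PDF underflows to zero
            total -= fx * log2(fx) * dx
    return total


sigma = 2.0  # assumed standard deviation for the Gaussian example


def gaussian_pdf(x):
    """Zero-mean Gaussian PDF with standard deviation sigma."""
    return exp(-x * x / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


H_K = discrete_entropy([1 / 8] * 8)
H_X = continuous_entropy(gaussian_pdf, -12 * sigma, 12 * sigma)
H_X_closed = 0.5 * log2(2 * pi * e * sigma ** 2)

print(f"H_K (uniform over 8 values) = {H_K:.3f} bits")
print(f"H_X (numerical)             = {H_X:.6f} bits")
print(f"H_X (closed form)           = {H_X_closed:.6f} bits")
```

Unlike the discrete entropy, the continuous entropy can be negative (for example, for a Gaussian with a sufficiently small σ), so it is not an average codeword length in the sense of (4.28).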

4.5 Summary

The expectation E{X} of a random variable is a sum or integral over all possible values of the random variable weighted by their associated probabilities. Further, for any function g(X), the expected value E{g(X)} can be computed by summing or integrating g(X) weighted by the PMF or PDF of X.

The moments of a random variable are expected values of powers of X. Thus $m_X = E\{X\}$ is the first moment (also called the mean), $E\{X^2\}$ is the second moment, and so on. Central moments are expectations of powers of the term $(X - m_X)$. The quantity $\sigma_X^2 = E\{(X - m_X)^2\}$ is known as the variance of the distribution and measures how the random variable is spread about the mean.
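As a small illustration of these definitions, the sketch below computes the mean, the second moment, and the variance of a fair six-sided die from its PMF. The die example and the helper name `expectation` are choices made here, not taken from the chapter.

```python
def expectation(pmf, g=lambda x: x):
    """E{g(X)} for a discrete PMF given as a dict {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())


# PMF of a fair six-sided die (illustrative example)
die = {k: 1 / 6 for k in range(1, 7)}

m_X = expectation(die)                               # first moment (mean)
second_moment = expectation(die, lambda x: x ** 2)   # E{X^2}
variance = expectation(die, lambda x: (x - m_X) ** 2)  # second central moment

print(f"mean          = {m_X:.4f}")
print(f"second moment = {second_moment:.4f}")
print(f"variance      = {variance:.4f}")
```

The same value of the variance is obtained from the moments as E{X²} − m_X², which is a quick way to check the computation.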

The moment generating function is defined as $E\{e^{sX}\}$ where s is a complex-valued parameter. The MGF can also be interpreted as the Laplace transform of the PDF. The power series expansion of the MGF reveals the moments of the distribution as coefficients. Thus, knowing all the moments of a distribution is equivalent to knowing the distribution. Some results and properties for random variables are discovered more easily from the MGF than by using the PDF.
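As a brief worked illustration of how the power series of the MGF exposes the moments, consider an exponential random variable with parameter λ. This example and the symbol $M_X(s)$ for the MGF are assumptions introduced here for illustration.

```latex
% Exponential PDF: f_X(x) = \lambda e^{-\lambda x}, \; x \ge 0
M_X(s) = E\{e^{sX}\}
       = \int_0^{\infty} \lambda e^{-\lambda x} e^{sx}\, dx
       = \frac{\lambda}{\lambda - s},
       \qquad \operatorname{Re}(s) < \lambda

% Power series expansion about s = 0 (valid for |s| < \lambda)
M_X(s) = \sum_{n=0}^{\infty} \left(\frac{s}{\lambda}\right)^{n}
       = \sum_{n=0}^{\infty} \frac{n!}{\lambda^{n}} \, \frac{s^{n}}{n!}

% Matching terms with M_X(s) = \sum_n E\{X^n\}\, s^n / n! gives
E\{X^n\} = \frac{n!}{\lambda^{n}},
\qquad\text{so}\qquad
E\{X\} = \frac{1}{\lambda}, \quad E\{X^2\} = \frac{2}{\lambda^{2}}
```

The variance then follows as 2/λ² − (1/λ)² = 1/λ², without evaluating any further integrals.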

For discrete random variables, a number of operations are carried out more conveniently using the probability generating function. The PGF is defined as $E\{z^K\}$ and is the z-transform of the PMF.

The last section of this chapter discusses the concept of entropy as average information. Shannon’s theorem states that entropy provides a lower bound on the average number of bits needed to code a message without loss.

References

[1] William Feller. An Introduction to Probability Theory and Its Applications, Volume I. John Wiley & Sons, New York, second edition, 1957.

[2] Athanasios Papoulis and S. Unnikrishna Pillai. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, New York, fourth edition, 2002.

[3] Henry Stark and John W. Woods. Probability, Random Processes, and Estimation Theory for Engineers. Prentice Hall, Inc., Upper Saddle River, New Jersey, third edition, 2002.

[4] Wilbur B. Davenport, Jr. Probability and Random Processes. McGraw-Hill, New York, 1970.

[5] William Feller. An Introduction to Probability Theory and Its Applications, Volume II. John Wiley & Sons, New York, second edition, 1971.

[6] Daniel Zwillinger. CRC Standard Mathematical Tables and Formulae. CRC Press, Boca Raton, 2003.

[7] David R. Brillinger. Time Series: Data Analysis and Theory. Holden-Day, Oakland, California, expanded edition, 1981.

[8] Murray Rosenblatt. Stationary Sequences and Random Fields. Birkhäuser, Boston, 1985.

[9] Chrysostomos L. Nikias and Athina P. Petropulu. Higher-Order Spectra Analysis: A Nonlinear Signal Processing Framework. Prentice Hall, Inc., Upper Saddle River, New Jersey, 1993.

[10] Sheldon M. Ross. A First Course in Probability. Prentice Hall, Inc., Upper Saddle River, New Jersey, sixth edition, 2002.

[11] Ruel V. Churchill and James Ward Brown. Complex Variables and Applications. McGraw-Hill Book Company, New York, fourth edition, 1984.

[12] Claude E. Shannon and Warren Weaver. The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, 1963.

[13] Alvin W. Drake. Fundamentals of Applied Probability Theory. McGraw-Hill, New York, 1967.

Problems

Expectation of a random variable

4.1 Rolf is an engineering student at the Technical University. Rolf plans to ask one of