Dynamic Range - Digital Theory - Sound System Engineering.pdf

Digital Theory

5.2 Dynamic Range

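The theoretical dynamic range for a given number of quantization bits equals:

$$\text{SNR} = 10\log_{10}\left(6 \times 2^{(2 \times \text{bits}) - 2}\right) \tag{5-2}$$

For 24-bit quantization this gives 146.25dB, and for 16-bit quantization it gives 98.09dB.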

To find the number of bits when the SNR is known use:

$$\text{bits} = \frac{\ln\left(\dfrac{10^{\text{SNR}/10}}{6}\right)}{2\ln 2} + 1 \tag{5-3}$$
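The two relationships can be checked numerically. The sketch below assumes that Eq. 5-2 has the form 10 log₁₀(6 × 2^(2·bits − 2)), which reproduces the 146.25dB and 98.09dB figures quoted above, and that Eq. 5-3 is its algebraic inverse. The function names are illustrative only.

```python
import math


def dynamic_range_db(bits):
    """Theoretical SNR in dB for a given bit depth (assumed form of Eq. 5-2)."""
    return 10.0 * math.log10(6.0 * 2.0 ** (2 * bits - 2))


def bits_for_snr(snr_db):
    """Bit depth needed for a given SNR in dB (inverse of the function above)."""
    return math.log(10.0 ** (snr_db / 10.0) / 6.0) / (2.0 * math.log(2.0)) + 1.0


if __name__ == "__main__":
    for depth in (16, 24):
        print(depth, "bits:", round(dynamic_range_db(depth), 2), "dB")
    print("bits for 98.09 dB:", round(bits_for_snr(98.09), 2))
```

Each added bit raises the result by about 6.02dB, which is why the 16-bit and 24-bit figures differ by roughly 48dB.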

Indeed, digital recordings and electronics can in theory deliver increased dynamic range. Like distortion figures, dynamic ranges in real-life sound reinforcement systems, while desirable, are usually constrained by the acoustic possibilities.

Consumer digital devices (cell phones, music players, etc.) are low-cost, throw-away consumer items in today's marketplace. In professional sound systems that operate in the presence of live audiences, battery-operated analog backup systems, at least for safety purposes such as voice communication during a power failure, are still necessary adjuncts to a fully digital system. Latency problems also deserve special attention, as many digital devices introduce significant signal delay.

The digital world still falls short of what man can do, inasmuch as the human brain has a storage capacity of 10¹⁵ to 10¹⁷ bits of information and a processing rate of 100,000 teraflops.

Modern computing devices have reached the stage of being on a par with the brain of a guppy, Fig. 5-5. The computer approach is a linear approach to a very nonlinear system called thinking.

Humans have the very real ability to process very nonlinear information, distorted information, and even the ability to draw correct conclusions from false information.

Figure 5-2. Information Theory.

Figure 5-3. Effects of sampling rates on quality. (Axes: sample rate, higher and higher; bit depth, more and more. Regions: large error, less error, no error.)

Figure 5-4. ‘Shannon Space’ for human hearing. (Axes: frequency, 0Hz to 40kHz; SPL, −40 to 140. Labels: Threshold NSD; 11 bits @ 52kHz; CD Channel; 18.2 bits @ 96kHz noise shaper.)

In discussing neural networks, David J. C. MacKay, in his book Information Theory, Inference, and Learning Algorithms, points out that digital devices suffer from:

1. Address space memory is not associative.

2. Address space memory is not robust or fault tolerant.

3. Address-based memories are not distributed.

In the case of biological memory:

1. Biological memory is associative memory. Recall is content addressable.

2. Biological memory is error tolerant and robust.

For example: “An American politician who was very intellectual and whose political father did not like broccoli” leads many people to think of President Bush (remember the author of this book is British) even though one of the cues contains an error.

3. Hardware faults can be tolerated. Memory often persists through brain damage.

4. Biological memory is parallel and distributed, and has a remarkable ability to work through loops.

The above is not to deprecate digital devices, but it merely intends to make the point that a live operator with multiple backup capabilities is wise insurance.

5.2.1 Cognitive Computing

On August 18, 2011, IBM announced a series of chips that would allow a computer's processors to mimic the human brain’s cognition, perception, and action abilities. It is described as:

The first cognitive computing core that combines computing in the form of neurons, memory in the form of synapses, and communications in the form of axons all working in silicon, and not PowerPoint. These chips can enable biological ‘senses’ such as sight, sound, smell, and touch, and drive multiple motor modes while consuming less than 20W (of power) and occupying less volume than a 2L bottle of soda, and weighing less than 3 pounds.

IBM hopes:

To weave the building blocks together into a scalable network and progressively scale it to a mammalian scale system with 10,000,000,000 neurons, 100,000,000,000,000 synapses, all while consuming 1kW of power and fitting in a shoebox sometime between now and the year 2018.

Many brain researchers make a distinction between the brain and “Mind,” and the role of each in consciousness. (See Chapter 3 Sound and Our Brain for further clarification of these distinctions).

5.2.2 Digital Recording Techniques

In 1928, Harry Nyquist wrote that sampling a signal at more than twice the desired bandwidth was a necessary limit on digital signaling. It is an error to say that the sampling rate should be equal to twice the necessary bandwidth; it must be greater than twice the desired bandwidth.

Shannon termed the zeros and the ones bits (binary digits) and employed the logarithmic base 2 in his calculations. From this sprang the concepts of sampling rate (samples per second) and quantization of the amplitude in bits. In audio recordings, the sampling rate multiplied by the time in seconds, multiplied by the quantum value in bits, multiplied by the number of channels, and divided by eight bits per byte equals the file size in bytes, Fig. 5-4. Eight bits equal one byte, four bits equal a nibble, and sixteen bits equal a word. File sizes are normally stated in bytes.

An example of decimal numbers as digital code is given in Table 5-2, where it can be seen that the decimal number is the sum of the powers of 2 whose binary digits are 1; conversely, repeated division by 2 yields the digital code from a decimal number.
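Both directions of the conversion can be sketched in a few lines. This is a minimal illustration and not a reproduction of Table 5-2; the function names and the example value 37 are chosen here for demonstration.

```python
def to_binary(number):
    """Convert a non-negative integer to binary digits by repeated division by 2."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 2)  # remainder is the next bit, LSB first
        digits.append(str(remainder))
    return "".join(reversed(digits))


def from_binary(code):
    """Sum the powers of 2 at every position where the binary digit is 1."""
    return sum(2 ** power for power, digit in enumerate(reversed(code)) if digit == "1")


if __name__ == "__main__":
    print(to_binary(37))          # 37 = 32 + 4 + 1
    print(from_binary("100101"))
```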

Figure 5-5. Raw Computing Muscle, as exemplified by a plot of 120 top machines of their time since 1940, is today on par with the brain of a guppy. It may reach human equivalent around 2040. (Axes: year, 1940 to 2040; brain power in instructions per second. Label: Mac G5/Dual 2.0 GHz.)

Logarithms are used with many bases: the Napierian base e, the Briggsian base 10 (used for bans in code breaking during World War II), and base 2 for information. In the physical science world, one “Nat” has the physical dimensions of a square two Planck lengths on a side. The world-renowned physicist John Wheeler declared that reality was “It from bit.”
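The three logarithm bases differ only by a constant factor, so a quantity of information stated in nats or bans converts directly to bits. The sketch below uses the standard self-information measure, −log(p), which is an assumption brought in for illustration and is not defined in the text; the constant and function names are likewise illustrative.

```python
import math

BITS_PER_NAT = 1.0 / math.log(2.0)   # about 1.4427 bits in one nat
BITS_PER_BAN = math.log2(10.0)       # about 3.3219 bits in one ban


def information(probability, base=2.0):
    """Information carried by an event of the given probability, in units set by base."""
    return -math.log(probability) / math.log(base)


if __name__ == "__main__":
    p = 0.5
    print("bits:", information(p, 2.0))
    print("nats:", information(p, math.e))
    print("bans:", information(p, 10.0))
```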

Digital file size can be calculated with the following equation:

(5-4)

where,

Sr is the sampling rate,
ts is the time in seconds,
bits is the quantum value (8 bits equal 1 byte),
chan is the number of channels.

For example, if we have a sampling rate of 44,100Hz, a time of 60s, 16 bits, and 2 channels, we would have:
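A minimal sketch of this calculation follows. It assumes that Eq. 5-4 is the product of the four quantities with bits converted to bytes by dividing by eight, as the definitions above suggest; the function name is illustrative.

```python
def file_size_bytes(sample_rate_hz, seconds, bits, channels):
    """File size in bytes: Sr x ts x (bits / 8) x chan (assumed form of Eq. 5-4)."""
    return sample_rate_hz * seconds * bits * channels / 8


if __name__ == "__main__":
    size = file_size_bytes(44_100, 60, 16, 2)
    print(size, "bytes")
```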

To find the file depth use:

(5-5) or

To find the file depth in dB use:

(5-6) or

(5-7)

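The download time can be found by:

$$\text{download time} = \frac{\text{file size}}{\text{modem speed} \times \text{ts}} \tag{5-8}$$

where,

ts is 1 for seconds, 60 for minutes, and 3600 for hours.

Assuming the modem speed is 0.056 Mbps, the download time would be:

$$\text{download time} = \frac{10.54 \times 10^{6}}{(0.056 \times 10^{6}) \times 60}$$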
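The download-time relationship can be sketched as below. The form used here, file size divided by the product of modem speed and ts, is an assumption taken from the worked numbers in the text (10.54 × 10⁶, 0.056 × 10⁶, and 60). The function expects the file size and the modem speed in the same unit; if the size is in bytes and the speed in bits per second, the size would first need multiplying by 8.

```python
def download_time(file_size, modem_speed_per_second, ts=1):
    """Download time; ts is 1 for seconds, 60 for minutes, 3600 for hours.

    file_size and modem_speed_per_second must use the same unit (both bits
    or both bytes); this function does not convert between them.
    """
    return file_size / (modem_speed_per_second * ts)


if __name__ == "__main__":
    minutes = download_time(10.54e6, 0.056e6, ts=60)
    print(round(minutes, 2), "minutes")
```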

Dr. Thomas Stockham made the very first 16-bit PCM recording in the United States in 1976 for the Santa Fe Opera on his Soundstream recorder. When I first measured the “ringing” associated with the early digital recorders (antialiasing filters), Dr. Stockham was the only one I knew who had both understood and avoided this anomaly.

Studer, upon seeing the measurements, withheld their professional recorder until the problem was corrected; others failed to do so, which led, in our opinion, to some of the artifacts that so disturbed “sensitives” during that early period (1982).

The Moving Picture Experts Group (MPEG), within the International Organization for Standardization (ISO), worked out a series of audio coding standards for storage and transmission of various digital media.

Digital versatile disc (DVD), high-definition television (HDTV), and ongoing competing methods make the description of each system chronologically challenging. Currently, Dolby AC-3 is the prevalent coding standard for the U.S. Philips PASC (Precision Adaptive Subband Coding) is similar to ISO/MPEG-1 Layer I; Sony has ATRAC (Adaptive TRansform Acoustic Coding), which applies psychoacoustic principles to both bit allocation and the time-frequency mapping. Further digital details, mathematics, and circuitry are in Chapter 22, Signal Processing, and Chapter 25, Putting It All Together.

The broad parameters discussed here have not changed in the decades since Shannon’s paper. Encoding and transmission techniques will continue to evolve, but the fundamental parameters continue to be usable guideposts when looking at devices and their claims.

All audio devices contain some latency. In digital devices, both inadvertent delay (as in crossovers) and deliberate delay (as in delay lines) can be acoustically significant in live sound reinforcement. In one sound system I was hired to evaluate, the delay in a digital crossover was 30ms, which made signal alignment a matter of putting one part of the loudspeaker system physically some 30ft behind the other part.
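The link between an electronic delay and an equivalent physical offset is simply the distance sound travels in that time. The sketch below assumes a speed of sound of 1130ft/s, a typical room-temperature value that is not given in the text; with it, 30ms corresponds to roughly 34ft, in line with the rule of thumb of about one foot per millisecond.

```python
SPEED_OF_SOUND_FT_PER_S = 1130.0  # assumed room-temperature value, not from the text


def delay_to_distance_ft(delay_ms, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Distance in feet that sound travels during the given delay in milliseconds."""
    return speed_ft_per_s * delay_ms / 1000.0


if __name__ == "__main__":
    print(delay_to_distance_ft(30.0), "ft for a 30 ms delay")
    print(delay_to_distance_ft(3.0), "ft for a 3 ms delay")
```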

Numerous occasions occurred wherein measuring the time domain behavior of a system left us with a blank screen on the analyzer until we remembered the fundamental rule to always measure globally before measuring specifically.

The sound system engineer must keep in mind that we start with an acoustic signal in an acoustic environment and end up with an acoustic signal in an acoustic environment in a vast majority of cases.

In broadcasting and recording up to 90% of the actual signal material can be removed in the processing of the digital signal at hand. These were some of the early lessons in using digital equipment.

Antialiasing filters, i.e., brick-wall filters with attenuation rates as high as −146dB per octave, were tried by some early CD manufacturers, resulting in a group delay at 20kHz of 1ms and a relative phase shift of some 3000°, which was clearly audible. The problem was finally solved, and then only on replay, by the use of oversampling techniques that moved the aliasing frequency from 22kHz up to, in some cases, as high as 154kHz. This allowed controlled filter design well away from the desired pass band.

Richard C. Heyser liked to point out how much the human listener can detect that can’t be measured. He said:

The end product of audio is the listening experience. The end product is a result of perception and cognition and evaluation processes occurring in the mind. What do we know about such processes—the answer is very little.

Heyser went on to elaborate:

My own research into nonlinear behavior has caused me to introduce three divisions to what is universally spoken of as perception. These are the divisions of:

1. Sensory contact and stimulus.

2. Association of stimulus with memory and past experience and ongoing stimuli of other nature.

3. Evaluation of stimulus in light of ongoing experience.

I call them perception, cognition, and valuation.

We can like something today and not like the same tomorrow even though the program and the stimulus are essentially identical in both cases. The perception was unchanged, but cognition and evaluation were altered.

Pre-emphasis

The signal-to-noise ratio can be improved by using high-frequency pre-emphasis. The choice of the pre-emphasis characteristic must be made with care.

The curve for pre-emphasis is designed making assumptions about the program content spectrum.

Dither

At low levels, only a small number of quantization states are available. This can lead to audible distortion, for example during the decay of a piano note. It has been found that adding a small amount of random noise significantly improves the perceived quality. This noise “dithers” the LSB and can be regarded as the digital equivalent of bias in magnetic recording.
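The effect of dither on a low-level signal can be illustrated with a small simulation. The sketch below assumes triangular (TPDF) dither spanning ±1 LSB, a common choice that the text does not specify, and a constant input of 0.3 LSB. Without dither the quantizer always outputs 0 and the signal is lost; with dither the average of the quantized output tracks the input. All names and values are illustrative.

```python
import math
import random


def quantize(value_in_lsb):
    """Round to the nearest integer quantization level."""
    return math.floor(value_in_lsb + 0.5)


def mean_quantized(level_in_lsb, samples=50_000, dither=True, seed=0):
    """Average quantizer output for a constant input, with or without TPDF dither."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(samples):
        # The difference of two uniform values is triangular on (-1, 1) LSB.
        noise = (rng.random() - rng.random()) if dither else 0.0
        total += quantize(level_in_lsb + noise)
    return total / samples


if __name__ == "__main__":
    print("without dither:", mean_quantized(0.3, dither=False))
    print("with dither:   ", mean_quantized(0.3, dither=True))
```

The price of the dither is a slightly higher noise floor, which is generally preferred to distortion that is correlated with the signal.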

Aliasing

It is important that the sampling process is protected from out-of-band frequencies. It was in the design of antialiasing filters in the early days of digital audio that artifacts became audible to the listener.

Over-sampling

A common technique to reduce the burden on the filters is oversampling. By reading the data two, four, or eight times, the spurious frequencies are raised one, two, or three octaves and therefore need less severe filters to attenuate them to insignificant proportions.
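The arithmetic behind oversampling is short enough to sketch. Each doubling of the rate raises the spurious frequencies by one octave, so the shift in octaves is log₂ of the oversampling factor. The 44.1kHz base rate and the factor of 7 in the example are assumptions chosen to match the CD rate and the 154kHz figure mentioned earlier; the function names are illustrative.

```python
import math


def octaves_raised(oversampling_factor):
    """Octaves by which spurious frequencies move up for a given oversampling factor."""
    return math.log2(oversampling_factor)


def oversampled_nyquist_hz(base_rate_hz, oversampling_factor):
    """Half the oversampled rate, i.e. the new aliasing (Nyquist) frequency."""
    return base_rate_hz * oversampling_factor / 2


if __name__ == "__main__":
    for factor in (2, 4, 8):
        print(factor, "x oversampling:", octaves_raised(factor), "octaves")
    print(oversampled_nyquist_hz(44_100, 7), "Hz at 7x")
```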

Bit rate reduction

Bit reduction allows more audio to be processed in a given time. Halving the audio bit rate on the hard disk system can double the number of audio signals that can be simultaneously output from it. This is done for economic reasons.

5.2.3 Frequency Dependent Case

Where the additive noise is not white, or the signal-to-noise ratio is not constant with frequency over the bandwidth, the following equation can be used by treating the channel as a series of narrow, independent Gaussian channels in parallel.

$$C = \int_{0}^{W} \log_{2}\left(1 + \frac{S(f)}{N(f)}\right) df \tag{5-9}$$

where,

C is channel capacity in bits per second,
W is the bandwidth of the channel in Hz,
S(f) is the signal power spectrum,
N(f) is the noise power spectrum,
f is the frequency in Hz.

These equations can be used to demonstrate how spread spectrum communication systems make it possible to transmit signals which are actually much weaker than the background noise level.
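The parallel-channel idea can be sketched numerically by summing the capacity of many narrow bands. The code below assumes the standard Shannon form, the integral from 0 to W of log₂(1 + S(f)/N(f)) df, and uses a simple midpoint sum; the function name, step count, and example values are illustrative. The second example shows the spread spectrum point: a signal 10dB below the noise still yields a positive capacity when the bandwidth is wide enough.

```python
import math


def channel_capacity(snr_of_f, bandwidth_hz, steps=10_000):
    """Capacity in bits per second, summing narrow bands of width bandwidth_hz / steps.

    snr_of_f is a function returning S(f) / N(f) as a power ratio at frequency f.
    """
    df = bandwidth_hz / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df  # centre frequency of the k-th narrow band
        total += math.log2(1.0 + snr_of_f(f)) * df
    return total


if __name__ == "__main__":
    # Flat 30 dB signal-to-noise ratio over 20 kHz.
    print(round(channel_capacity(lambda f: 1000.0, 20_000.0)), "bit/s")
    # Signal 10 dB below the noise, spread over 1 MHz.
    print(round(channel_capacity(lambda f: 0.1, 1_000_000.0)), "bit/s")
```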

There are three predominant types of noise:

1. White—equal energy per hertz.

2. Pink—equal energy per octave.

3. Black—silence.

Errors in digital transmission are considered noise, as are the naturally occurring thermal noise and the effects of radio frequency interference (RFI). A relation between bandwidth and time similar to the one found by Nyquist was discovered simultaneously by Karl Kupfmuller in Germany. An even more stringent analysis of the relation was carried out in Gabor’s Theory of Communication.

As was pointed out by Tuller (1949), a fundamental deficiency of the theories of Nyquist, Hartley, Kupfmuller, and Gabor was that their formulae did not include noise. The role of noise is that it sets a fundamental limit to the number of levels that can be reliably distinguished by a receiver. Each of these scientist-engineers had worked with the fundamentals of noise and had written fundamental papers that included the earliest measurements and mathematical derivations relative to noise, but chose to treat the analysis of a communications channel as noiseless. Claude Shannon’s genius was to unite all these disparate theories into one.

5.2.4 A Stochastic Process

A system that produces a sequence of symbols (letters of the alphabet or musical notes) according to certain probabilities is called a stochastic process, and the special case of a stochastic process in which the probabilities depend on the previous events is called a Markov process or a Markov chain. Of the Markov processes that might conceivably generate messages, there is a special class that is of primary importance for communication theory: the ergodic processes.

An ergodic process is one which produces a sequence of symbols which would be a poll taker’s dream, because any reasonably large sample tends to be representative of the sequence as a whole. A truly reverberant auditorium is an excellent example of an ergodic process, inasmuch as any one point of measurement would give the same result as any other point of measurement.
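A small simulation can make the two ideas concrete. The sketch below generates symbols from a two-state Markov chain, in which the probabilities for the next symbol depend on the previous one, and then compares symbol frequencies in two separate halves of a long run. The symbols, transition probabilities, and names are hypothetical and chosen only for illustration. For this chain the long-run fraction of "A" is 2/3, and any large sample comes out close to it, which is the ergodic property described above.

```python
import random

# Hypothetical transition probabilities: TRANSITIONS[previous][next].
TRANSITIONS = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.6, "B": 0.4},
}


def generate(length, start="A", seed=0):
    """Generate a symbol sequence whose next symbol depends only on the previous one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    state = start
    sequence = []
    for _ in range(length):
        sequence.append(state)
        symbols = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][s] for s in symbols]
        state = rng.choices(symbols, weights=weights)[0]
    return sequence


def fraction(sequence, symbol):
    """Fraction of the sequence occupied by the given symbol."""
    return sequence.count(symbol) / len(sequence)


if __name__ == "__main__":
    run = generate(50_000)
    print("first half: ", fraction(run[:25_000], "A"))
    print("second half:", fraction(run[25_000:], "A"))
```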

The world is full of non-Markov processes. A Markov process is one in which the future depends only on the conditions in the past. If your awareness of something is changed irrevocably by the introduction of some new piece of knowledge, then the altered awareness is non-Markovian.

In discussing the signal delay between a low-frequency driver and a high-frequency driver with a world-renowned psychoacoustic authority who had stated that 3ms was inaudible, I asked whether he had walked the polar pattern. The lesson was non-Markovian, as the polar response had been altered by the seemingly innocuous signal delay while the amplitude, etc., had not. Audio, acoustic, and digital measurement systems are helpless whenever nonlinear phenomena are present. All our measurement systems are Markovian inasmuch as we expect the input to predict the output. All our measurement systems are dependent upon linear equations. This is not to imply that they are useless, because the skilled and experienced operator can often read ambiguous data.

As a guide to a new listening experience, Dr. John Diamond’s detection of serious flaws in the early CDs did lead to correction of the antialiasing problems, but not, unfortunately, to his being properly acknowledged as having brought it to the recording world’s attention other than as a non-Markovian moment for many overconfident engineers.
