Loudness
With many sounds arriving at the ears at once, a top priority of the "mind's ear" is to discern degrees of loudness. Typically one or two sounds seem loud and close, others are in the general vicinity, and many seem to arrive from a much greater distance. The ear can detect sounds from all angles and from many distances, but the brain attends to only one or two at a time.
The quality of “loudness” plays such an important role in listening that two sources with similar levels tend to create confusion. Creating a sense of depth or “space” in a soundtrack requires careful control of the relative volume, or loudness, of each sound element in the mix. The final soundtrack emerges from only one spot in space, with no coding other than differences in volume to represent space.
Sound engineers use decibels (dB) to quantify the levels of sound energy detectable by the human ear. The scale is logarithmic: each increase of 6 dB doubles the sound pressure and seems, roughly, to double the apparent loudness of a sound. A 6 dB step is a kind of audio equivalent of an f-stop. 0 dB is defined as the threshold of hearing, and 120 dB is regarded as the point at which many people begin to experience pain. This range from 0 to 120 dB can be understood as the effective "latitude" of human hearing: about 20 stops.
Each 6 dB step doubles apparent loudness
Unfortunately, though understandably, sound media can neither capture nor reproduce such a wide range of sound levels. The most expensive recording equipment can capture a range of about 90 dB, and the best theatres can reproduce a range of about 60 dB. The 16mm optical soundtrack is capable of a range of only 30 dB. This range of discernible levels is the dynamic range of a medium or system.
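The decibel figures above can be turned into stops and pressure ratios with a few lines of arithmetic. The following is a minimal sketch, not a measurement tool: the function names `pressure_ratio` and `stops` are illustrative, and the 6 dB-per-stop constant is simply the rule of thumb used in this section.

```python
# Rule of thumb from the text: one 6 dB step is treated as one "stop".
DB_PER_STOP = 6.0


def pressure_ratio(db):
    """Sound-pressure ratio that corresponds to a level difference in dB."""
    return 10 ** (db / 20)


def stops(db_range):
    """Number of 6 dB 'stops' contained in a dynamic range."""
    return db_range / DB_PER_STOP


# Dynamic ranges quoted in the text, in dB.
ranges_db = {
    "human hearing": 120,
    "best recording equipment": 90,
    "best theatres": 60,
    "16mm optical soundtrack": 30,
}

for name, db in ranges_db.items():
    print(f"{name}: {db} dB is about {stops(db):.0f} stops")

# A single 6 dB step is very nearly a doubling of sound pressure.
print(f"6 dB pressure ratio: {pressure_ratio(6):.2f}")
```

Seen this way, the 16mm optical soundtrack offers about a quarter of the stops available to the ear, which is why relative levels in the mix have to be chosen so carefully.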
In a typical urban living room, one can hear a constant rumble that penetrates the window panes and walls. Whether it comes from street traffic, factories and freeways, or even the whir of one’s own heating system, there is always a minimal level or "presence" in a setting. The perceived loudness or quietude of that level depends a bit on expectation and conditioning. A level of 40 decibels in one’s living room seems much quieter than 60 dB of hubbub from local street traffic in the backyard, but even the “quiet” living room carries about 100 times the sound pressure of the faintest sound one can detect.
The high background sound levels of urban environments make it harder for us to discern and record sounds. It’s common to hear students complain about the “bad sound” they got on a location shoot. Was the recordist conditioned to the background levels and not perceiving them as loud? A demonstration of relative volumes shows why bad sound happens so easily.
A person speaking in a conversational tone produces a level of about 70 dB at a distance of 1 foot. If the room has a background level of 40 dB, this is a separation of about 30 dB. Therefore, if the mic is set up one foot from the person, the voice should record 30 dB louder than the background, which is sufficient separation to be used in post-production. But each doubling of distance costs about 6 dB, so if the mic is moved to 4 feet (two doublings), the level of the voice at the mic drops to 58 dB and the voice will be only 18 dB louder than the background sound in the recording. A separation of 30 dB between voice and background allows the sound editor to add background sounds of their choice. With only 18 dB of separation, the editor has no choice but to use the background sounds already in the recording in the final mix.
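The microphone-distance arithmetic in this example can be sketched as follows. This assumes the simple rule that level falls by about 6 dB for each doubling of distance, which holds for a small source in open space; a reflective room will behave differently. The function names are illustrative only.

```python
import math


def level_at_distance(level_db, ref_distance, distance):
    """Level in dB at `distance`, given `level_db` measured at `ref_distance`.

    Assumes the free-field rule: about 6 dB lost per doubling of distance.
    """
    return level_db - 20 * math.log10(distance / ref_distance)


def separation(voice_db, background_db):
    """How far the voice sits above the background, in dB."""
    return voice_db - background_db


VOICE_AT_1_FOOT = 70.0  # conversational voice, from the text
BACKGROUND = 40.0       # room background level, from the text

for feet in (1, 2, 4):
    voice = level_at_distance(VOICE_AT_1_FOOT, 1, feet)
    print(f"mic at {feet} ft: voice {voice:.0f} dB, "
          f"separation {separation(voice, BACKGROUND):.0f} dB")
```

Running the loop for other distances gives a quick sense of how fast the separation erodes as the mic backs away from the speaker.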
[Figure: decibel scale from 0 dB to 120 dB, with marks at 30, 36, 58, 70, 80 and 100 dB]
As a guide in the field, the background sound level should not produce VU needle movement on the Sony TC-D5M when the desired foreground sound level produces peaks of –5 to –3 dB. If the needle is moving just from the background sounds, you need to move the mic closer.
Pitch
Everyone is familiar with the tonal qualities of sounds produced by musical instruments and the pleasure of hearing a variety of tonalities within rhythmic patterns. The vibration rates produced by a bass guitar and a bass drum are relatively low, whereas the pitches of the lead guitar and the saxophone are noticeably higher. If you listen carefully, you might discern that the rhythm guitar and the vocals have pronounced pitch qualities somewhere in between. Tonal variety is one characteristic that gives music great appeal.
Audio engineers measure pitch by counting the vibrations occurring over a given unit of time. The more vibrations produced, the higher the pitch is to the ear. The term Hz is an abbreviation of hertz, the unit named after the physicist Heinrich Hertz. Frequency is the term used for these vibration rates, which the ear discerns as pitch when they are sustained long enough.
As vibrations actually cause the air molecules to constantly move back and forth, the energy of sound is constantly changing direction or polarity. This wave-like or cycling nature of sound is reflected in another term used to describe pitch, cycles per second, or CPS. (Hz is a modernized form of CPS.)
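Counting cycles per second can be tried out on a synthetic tone. The sketch below generates a sine wave and counts how often it swings from negative to positive, which happens once per cycle. The 8,000 samples-per-second rate and the 440 Hz test tone are arbitrary choices for illustration, and real recordings with several pitches at once would need a more robust method.

```python
import math


def sine(freq_hz, sample_rate, seconds):
    """Samples of a pure tone at `freq_hz`."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def count_cycles_per_second(samples, sample_rate):
    """Estimate frequency by counting negative-to-positive polarity changes."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev < 0 <= cur:  # the wave has just swung back to positive
            crossings += 1
    duration = len(samples) / sample_rate
    return crossings / duration


SAMPLE_RATE = 8000  # assumed sampling rate, in samples per second
tone = sine(440, SAMPLE_RATE, 1.0)
estimate = count_cycles_per_second(tone, SAMPLE_RATE)
print(f"estimated pitch: about {estimate:.0f} Hz")  # lands within a cycle of 440
```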
Through testing, it can also be determined that human ears are able to detect sound vibrations with frequencies from 20 cycles per second (or Hz) to 20,000 Hz. Although most sounds have several pitches occurring at once, the lowest frequency of a sound is called its fundamental. As the fundamental frequency is usually the loudest frequency produced, charting the fundamental frequencies of common sound sources is one way to sense where certain sources cluster their frequencies within the range discernible to humans. Such a chart is called an Audio Frequency Spectrum.
If a film soundtrack combines sound elements whose frequencies come from several different regions of the spectrum, it will have aural vitality. Soundtracks lacking clarity often have sound elements whose frequencies overlap and "mask" one another, or they fail to use an entire region of the frequency spectrum.
The ability to discern low, medium and high pitch sounds can also help the recordist determine optimum placement for the microphone. As the cassette tape recording medium is more efficient at reproducing low frequencies than high ones, positioning the microphone physically closer to the source of higher frequencies can give recordings greater clarity and fidelity. This is another reason to monitor with headphones while recording.
Duration
A key to imagining, recording and incorporating intriguing sounds comes from thinking about sounds as adding variety, contrast and unpredictability to the soundtrack. Beyond sounds that “go with” particular objects and settings, the sound editor is free to imagine any sound or quality of sound that one desires to hear at any given moment. Rather than slaving sound to image, the sound editor allows the soundtrack equal expression.
In addition to variation in tone and volume, the durations of the sounds used in a mix have an important role in establishing variety, expectation and composition.
Consider the sound impulses created by thumping a bass drum and plucking the low E string on a bass guitar. The two sounds are close in terms of tone and in terms of volume. What qualities are we hearing that let us tell them apart so easily?
[Figure: waveforms of the bass drum and the bass guitar, showing decay over time]
Waveforms, or graphs of each sound displaying the changes in volume over time, provide a clear picture of the difference we can hear. Both instruments have an abrupt beginning, but the way each sound ends (or “decays”) is quite different. The volume of the plucked bass guitar string is sustained over three seconds, while the sound of the drum decays fairly rapidly, in about 1/5 of a second. Sounds with fast attacks and rapid decays are perceived as “percussive,” whereas sounds with slow attacks and long decays seem “sustained” and create contrast.
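The attack and decay described here can be read off a volume envelope numerically. The sketch below builds synthetic envelopes rather than using real recordings: the exponential decay shape, the 1% "silence" floor and the two threshold constants are all assumptions chosen for illustration, and only the 1/5 second and three second decay times come from the text.

```python
import math

FAST_ATTACK_S = 0.05  # hypothetical cut-off for an "abrupt" beginning
SHORT_DECAY_S = 0.5   # hypothetical cut-off for a "rapid" decay


def envelope(attack_s, decay_s, rate=1000, floor=0.01):
    """Synthetic envelope: linear rise, then exponential fall to `floor`."""
    n_attack = int(round(attack_s * rate))
    n_decay = int(round(decay_s * rate))
    rise = [i / n_attack for i in range(n_attack)]
    k = math.log(1 / floor) / decay_s  # fall reaches `floor` after decay_s
    fall = [math.exp(-k * i / rate) for i in range(n_decay + 1)]
    return rise + fall


def attack_and_decay(env, rate=1000, floor=0.01):
    """Measure (attack, decay) in seconds from an envelope."""
    peak = env.index(max(env))
    end = len(env) - 1
    for i in range(peak, len(env)):
        if env[i] <= floor * env[peak]:  # level has fallen to the floor
            end = i
            break
    return peak / rate, (end - peak) / rate


def describe(attack_s, decay_s):
    """Label a sound using the assumed thresholds above."""
    if decay_s >= SHORT_DECAY_S:
        return "sustained"
    if attack_s <= FAST_ATTACK_S:
        return "percussive"
    return "in between"


for name, env in (("bass drum", envelope(0.0, 0.2)),
                  ("bass guitar", envelope(0.0, 3.0))):
    attack, decay = attack_and_decay(env)
    print(f"{name}: attack {attack:.2f} s, decay {decay:.2f} s, "
          f"{describe(attack, decay)}")
```

Both sounds start abruptly, so it is the decay measurement alone that separates the drum from the guitar, which matches what the waveforms show.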
[Figure: waveform of sustained guitar feedback]
The notion of “Hi Fidelity” as a goal in audio production is tied to the ability of the medium to use and portray a wide range of frequencies and volume levels. A film soundtrack, as a construction or composition in time, portrays variety through passages that are soft, loud, percussive, sustained and combinations of these qualities.
In a segment from Michael Moore’s TV Nation, a fast-paced introductory voice-over stops abruptly and leaves the audience wondering what is next. What follows is foreground music.
Steve Bogner begins his piece, Personal Belongings, with a percussive, dramatic music composition to ease the monotony of the long narration that follows. Another brief music passage occurs about 1.5 minutes later.
Julie Dash intersperses background music, location sounds and brief passages of dialog in Daughters of the Dust. The location sounds occur in the sections with thin lines.
David Daniels' video, Buzz Box, portrays the relentless flow of broadcast television