The audio terms and definitions you need to know
By Roy Johnson, loudspeaker designer, Green Mountain Audio, Inc.
In a loudspeaker world filled with technical misinformation, no professional degree program, and no quality standards, here are the terms and definitions you need to know.
Crossover Circuits. Inside a speaker, a circuit is made of capacitors, inductors, and resistors. It divides the signal into bass, midrange, and high tones. A crossover circuit is necessary because a large woofer distorts voice-tones and cannot produce any highs; a midrange driver does not handle highs or lows very well; and a small tweeter can neither handle bass power nor produce any bass. The circuit can be simple or complicated, depending on how much protection is provided to the tweeter from the bass, and to the woofer from the highs. We use the simplest first-order circuit because it is the only circuit that neither stores energy nor distorts the timing between highs and lows passing through it.
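As a rough illustration of how simple a first-order circuit is, the sketch below computes the textbook part values: one capacitor in series with the tweeter and one inductor in series with the woofer. It assumes each driver behaves like a pure resistance, which real drivers do not. The function name, the 8-Ohm load, and the 3,000Hz crossover point are illustrative assumptions, not values taken from any actual speaker.

```python
import math

def first_order_crossover(crossover_hz, load_ohms):
    """Return (capacitor_farads, inductor_henries) for a textbook
    first-order (6dB per octave) crossover into a purely resistive load."""
    # Series capacitor: passes highs to the tweeter, blocks bass.
    capacitor = 1.0 / (2.0 * math.pi * crossover_hz * load_ohms)
    # Series inductor: passes bass to the woofer, blocks highs.
    inductor = load_ohms / (2.0 * math.pi * crossover_hz)
    return capacitor, inductor

# Illustrative values only: 3,000Hz crossover, 8-Ohm drivers (assumed).
cap, ind = first_order_crossover(3000.0, 8.0)
print(f"capacitor: {cap * 1e6:.2f} uF, inductor: {ind * 1e3:.3f} mH")
```

Real designs adjust these starting values to suit the measured behavior of each driver.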
Dispersion. How widely the sound is scattered across the room. Our speakers’ bass intentionally goes everywhere to allow the room to work with it. The opposite is true in the ultra-highs, which are directed to the front so they do not reflect from the side walls. Our speakers achieve a smooth transition between these two extremes of dispersion.
Distortion. Comes in two forms, harmonic and modulation. Harmonic distortion is the addition of unwanted harmonics to the original tone. These are measured by a specialized meter which filters out the original signal, leaving only the harmonics behind. The perfection standard is zero. Modulation distortion is what happens when one tone distorts another tone, such as the bass distorting the voice. Again, a special meter filters out the two original tones to read the remaining distortions, and a perfect reading is zero.
Dynamics. Assuming the speaker will play loud enough, how soft will it play? This is its dynamic range, a span measured in decibels. In classical music, the range can exceed 60dB, that is, from 40dB to 100dB loud. The very soft 40dB sounds are easily drowned out by the air conditioner's fan or a person talking 30' away. Highly compressed rock music and hip-hop have less than a 20dB dynamic range. A hot jazz group has about 40dB of dynamic range. Beyond the notions of playing loud and soft, a speaker also must produce very small changes in loudness at any instant, whether that is a small change in a loud sound or a small change in a soft sound. These are the nuances that shape the emotions of the music. Without them, you hear elevator music. There is no straightforward way to characterize this performance in a speaker's specifications, but you can listen for it in the presentation of the music and soundtrack details.
Frequency. When a string vibrates, we hear a tone from the air molecules moving back and forth right next to our eardrum. If those could be seen, we would see them behave as if connected with tiny springs all the way back to the air molecules literally stuck to the string. When the vibrations occur at 440 times per second, we hear the tone of A above middle C. The frequency is 440 Hertz (440Hz); the unit is named for Dr. Heinrich Hertz, the discoverer of electromagnetic waves.
Frequency Range / Frequency Response. How low can the speaker go, and how high? What is its frequency range? Unless you are given a +/- dB variation for the loudness across the scale, you cannot know. Perhaps the speaker goes low in the bass, but compared to the voice range, that bass is barely audible. If there is no published +/- dB spec, it can be assumed that the upper and lower frequency limits are about half as loud as the voice range, or -10dB ("10dB down"). The lowest and highest tones are audible, but just barely.
The ideal speaker would move its cones back and forth from 0Hz (the woofer's cone just moves out and stays there) to well above 100,000Hz, far beyond the conventional human high-frequency hearing limit. Its frequency response would then be from 0 to 100kHz. If there were a +/- dB tolerance given, then you might see a frequency response of 0-100kHz, +/- 3dB, but even then, that specification needs to be followed by the caveats "measured at X distance, Y height from the floor, on this type of signal (such as a short 'beep')." Without all of that, it is hard to predict what the frequency range of the speaker will be in your home. If the speaker is not balanced in loudness across the audible frequency range, you will hear the symptoms of its tone balance problem -- highs that are too loud, voices that are too hollow, or bass that is too soft.
Harmonics. A string's main vibration can usually be seen as a back-and-forth motion, with the center of the string moving the most. What cannot be seen are tiny double, triple, quadruple wiggles that overlay that main motion. Those extra wiggles or 'quiverings' are vibrations cycling two, three, four times faster than the main vibration. From them, we hear the second, third, and fourth harmonics, each at different loudnesses, each with different onsets and decays. They also modulate the loudness of each other as time passes. All of their motions, along with the fundamental motion, produce the 'sound' that says 'string' of a certain size and length. When the sounds of the vibrating body of the instrument are added, we then hear 'guitar' or 'violin.'
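To put numbers on those 'two, three, four times faster' vibrations, here is a minimal sketch that lists the harmonics of a tone. It assumes an ideal string whose harmonics fall at exact whole-number multiples of the fundamental; real strings stray slightly from this. The function name is hypothetical.

```python
def harmonic_series(fundamental_hz, count):
    """Fundamental plus its harmonics, as whole-number multiples (ideal string assumed)."""
    return [fundamental_hz * n for n in range(1, count + 1)]

# A above middle C and its second, third, and fourth harmonics.
print(harmonic_series(440.0, 4))  # [440.0, 880.0, 1320.0, 1760.0]
```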
Impedance. A measure of the speaker’s electrical resistance at each frequency. A perfect impedance would be the same Ohms at all frequencies, over the widest possible bandwidth. An extremely low deviation in impedance allows the most power from an amplifier at all frequencies and the use of long speaker wires.
Loudness. When something is 'twice as loud,' this is a 10-decibel increase on a sound meter. Half as loud is 10 decibels less, or -10dB. A 'Bel' is 10dB, named in honor of Alexander Graham Bell. A soft conversation measures about 60dB; a noisy pub, 90dB; and a whisper, 30dB. The human threshold of '0dB' is the tick of a wristwatch at 6' (2m) away. We want a speaker to always produce the proper balance of loudness between lows and highs across a wide range of listener positions, whether the speakers are being played soft or loud.
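The 'every 10dB is twice as loud' rule compounds quickly. The sketch below applies that rule of thumb to turn a decibel difference into a perceived-loudness ratio. It treats the rule as exact, which is an approximation, and the function name is hypothetical.

```python
def loudness_ratio(db_change):
    """Perceived loudness ratio under the '10dB = twice as loud' rule of thumb."""
    return 2.0 ** (db_change / 10.0)

print(loudness_ratio(10))       # 2.0  (twice as loud)
print(loudness_ratio(-10))      # 0.5  (half as loud)
print(loudness_ratio(90 - 30))  # 64.0 (noisy pub versus a whisper)
```

By this rule, the 60dB span between a whisper and a noisy pub is six doublings, or 64 times as loud.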
Max SPL. (Maximum Sound Pressure Level) is the peak loudness a sound meter would show at 10' (3m) away from the speakers in an average room. Anything greater than 100dB is ‘shouting loud.’ Room gain means ‘how much echo from the room’ is adding to that reading.
Pair Matching. Involves matching the speakers to each other for balanced sound from left to right. If not well-matched, instruments will not seem to be in their right positions and voices will be off to one side or the other. There will be more voice in one speaker than the other -- or more bass, or more highs. A perfect pair match means that the speakers sound identical to each other, resulting in the clearest sound and most musicality. The speakers' impedance also must match so the amplifier puts out equal power to each speaker. For the impedance to match, crossover circuit parts must have very tight tolerances. Since we want the speakers to be 100 percent matched, perfection is a zero percent difference between these crossover parts.
Phase Shift. Time delay imposed on any frequency, measured as a portion of any frequency’s 360-degree period. Perfection is zero degrees across the widest possible frequency band.
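Because phase is a fraction of each frequency's own period, the same time delay produces a different phase shift at every frequency. A minimal sketch of that conversion follows; the function name and the 1-millisecond delay are illustrative assumptions.

```python
def phase_shift_degrees(frequency_hz, delay_seconds):
    """Express a time delay as degrees of one 360-degree cycle at this frequency."""
    return 360.0 * frequency_hz * delay_seconds

# The same 1-millisecond delay, viewed at two frequencies:
print(phase_shift_degrees(100.0, 0.001))   # about 36 degrees, a tenth of a cycle
print(phase_shift_degrees(1000.0, 0.001))  # about 360 degrees, one whole cycle
```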
Polarity. An indicator of the speaker’s motion with respect to the amplifier’s signal. Normal polarity, the ideal, means that every driver moves outward into the room for a positive voltage from the amplifier. Inverted polarity means those drivers ‘suck’ inwards for a positive voltage. Imagine a trumpeter ‘sucking in’ on his horn, instead of blowing outwards through his instrument. That is the difference between inverted and normal polarity. Hopefully, each of your stereo components was made with normal polarity. If one component has inverted polarity, you will then need to re-invert it by reversing the wires on each speaker, from plus-to-minus to minus-to-plus. Inverted polarity will not hurt your equipment if left uncorrected, but normal polarity results in better sound.
Power. The recommended size of amplifier to be used with our loudspeakers. Amplifiers are rated into both 4-Ohms and 8-Ohms. We benchmark our recommended power ratings using an amplifier’s 8-Ohm rating.
Power Handling. "So, how much power can I really pump into these? On the back it says 50 Watts." This number is someone's estimate. What it likely means is that if the sound is rock music, where every sound is equally loud, all the time, the speaker will handle an amplifier that can safely put out 50 Watts. On classical music, or most any solo instrument or voice, that same speaker may easily handle more than 100 Watt peaks, which means the piano probably sounds close to 'live' in loudness. A manufacturer can give a range of recommended amplifier powers, usually meaning from 'just enough power for most music, in an apartment' to 'more than loud enough for any sane person, on any music, at 15' (5m).' Such a recommendation might read "10 to 100 Watt amplifiers."
Resonance. Either unwanted or excess vibration of something, at a particular frequency. That resonance can be damped (controlled) so that it does not continue to build up, or it can be damped very little, leading to a car-stereo's one-note bass. A speaker's cabinet can resonate inside, in the air space behind the woofer, which may sound like a 'hoot.' A midrange cone can resonate, ringing out on one piano note. A tweeter can resonate, adding a 'sizzle' to the cymbals. Many speakers' crossover circuits resonate, as their parts are designed to store and release energy. A speaker cabinet's sides can resonate, leading to a 'wooden' sound. Everything resonates; it is only a question of 'how much.' A 'low-Q' system resonates the least. A 'high-Q' system resonates a lot at one frequency, such as feedback ringing from a microphone.
Rise Time. The ‘get up and go.’ It is how long it takes for a speaker to go from ‘no signal’ to ‘full signal.’ Perfection is zero seconds.
Sensitivity. A measure of how much sound is output for a certain signal input. A lower decibel rating means more power is required for the same loudness. We could send a specific amount of wattage and measure the resulting loudness in decibels, but wattage depends upon whether the speaker is 4-Ohms, 6-Ohms, or 8-Ohms. This would make it hard to compare ratings among different speakers. Instead, a particular voltage is sent in, as measured by a volt meter on the speaker wires, and then the sound pressure is measured. It is a better method because that voltage remains the same no matter which speaker is connected. It is equivalent to leaving the amplifier’s volume control at the same level while connecting speakers.
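To see why wattage alone makes comparisons awkward, the sketch below computes the power that one fixed voltage delivers into different impedances. It assumes a purely resistive load. The 2.83-Volt figure is a commonly used test voltage that we are assuming for illustration; it is not stated above, and the function name is hypothetical.

```python
def power_watts(volts, impedance_ohms):
    """Power delivered into a purely resistive load: P = V^2 / R."""
    return volts ** 2 / impedance_ohms

TEST_VOLTS = 2.83  # assumed test voltage (a common convention, not from the text)

for ohms in (4.0, 6.0, 8.0):
    print(f"{ohms:.0f} Ohms: {power_watts(TEST_VOLTS, ohms):.2f} Watts")
```

The same voltage delivers about 1 Watt into 8 Ohms but about 2 Watts into 4 Ohms, which is why a fixed voltage is the fairer yardstick.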
Sound Analyzers. We can mathematically analyze any sound into its set of frequencies. An analysis may show all the harmonics, such as those from the vibrating string. Sometimes it might show many completely unrelated frequencies, such as those coming from the sound of a single hand clap. This is why we hear no particular tone from that clap. In fact, many hands clapping sounds like noise because each clap is actually a short burst of noise. Noise is a collection of random frequencies, happening at random times, with random loudnesses.
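One common way to perform this kind of analysis is the discrete Fourier transform. The sketch below builds a test tone from a 440Hz fundamental plus a quieter second harmonic, then recovers both frequencies from the samples. It is a slow, plain-Python version for illustration only. The sample rate, signal length, and function name are assumptions, and real analyzers use much faster methods.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Amplitude found in each frequency bin by a plain discrete Fourier transform."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total) * 2.0 / n)
    return mags

SAMPLE_RATE = 8000  # samples per second (illustrative)
N = 800             # one-tenth of a second, so each bin is 10Hz wide

# A 440Hz fundamental plus a second harmonic at half the amplitude.
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 880 * t / SAMPLE_RATE)
          for t in range(N)]

mags = dft_magnitudes(signal)
bin_hz = SAMPLE_RATE / N
peaks = [(k * bin_hz, round(m, 3)) for k, m in enumerate(mags) if m > 0.1]
print(peaks)  # the bins that hold significant energy
```

Only the 440Hz and 880Hz bins stand out. A hand clap run through the same analysis would instead spread its energy across many unrelated bins.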
Standing Waves. When you stand in one spot and hear that the bass is too strong, or conversely, is missing, then you are in a 'standing wave' zone, a term that comes from showing the math of the waves on a blackboard. If a sound wave reaches one wall and then comes back to you, perhaps more of that same wave is still coming in from the speaker. If these two waves overlap at your ear with the same polarity, then you hear extra bass. If that reflected wave reaches you one-half-cycle late, then it is 'upside down' compared to the wave still arriving from the speaker. The two cancel out in that one spot and you hear no bass.
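The 'one-half-cycle late' condition can be turned into a frequency. If the reflected wave travels some extra distance compared with the direct wave, the lowest note that cancels is the one whose half-wavelength equals that extra distance. The sketch below assumes a single reflection, no loss of strength at the wall, and a speed of sound of 343 metres per second; the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # metres per second in room-temperature air (assumed)

def first_cancellation_hz(extra_path_m):
    """Lowest frequency at which the reflection arrives one half-cycle late."""
    return SPEED_OF_SOUND / (2.0 * extra_path_m)

def first_reinforcement_hz(extra_path_m):
    """Lowest frequency at which the reflection arrives one full cycle late."""
    return SPEED_OF_SOUND / extra_path_m

# Illustrative case: the reflected path is 3.43m longer than the direct path.
print(first_cancellation_hz(3.43))   # roughly 50Hz is weakened at that spot
print(first_reinforcement_hz(3.43))  # roughly 100Hz is strengthened there
```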
Stereo Image. Is there a person standing there, singing or playing in front of you? Another person to the left, or to the right? The sharpness of a stereo image does not have a rating, but do know that you can hear two distinct sound locations when they are spread apart by only two degrees, which is about 4" at 10' away (two voices singing cheek to cheek). With just that small amount of separation, and your eyes closed, you can point to each voice precisely. Besides left-to-right, there can be 'depth' to an image, which comes from the echoes behind the voice, or simply the sound of a voice far away, mixed with echoes. Echoes are artificially produced in most pop/rock/jazz recordings and soundtracks. Those could be the sound of a 'fantasy' space, perhaps a cave with rough walls and a cathedral's height, but a highly-polished floor. There cannot be 'height' in recordings, because microphones do not know up from down. On occasion, two microphones on one orchestra will pick up the same upper-midrange sounds but at different times. In that tone range, we may hear what sounds like 'height', but always at the expense of clarity from the double arrival at the microphones.
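The 'two degrees is about 4" at 10' away' figure is simple trigonometry, and the sketch below checks it. It assumes the two sources sit side by side at the same distance from the listener; the function name is hypothetical.

```python
import math

def separation_inches(angle_degrees, distance_feet):
    """Side-to-side spacing that spans a given angle at a given distance."""
    return distance_feet * 12.0 * math.tan(math.radians(angle_degrees))

print(f"{separation_inches(2.0, 10.0):.1f} inches")  # 4.2 inches
```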
Sweet Spot. The best seat in the house. There is only one very-best position for listening to a two-channel or a surround-sound system, for three reasons: We have two ears; recordings are mixed for one person having two ears, and we have at least two speakers. Only in the middle do we hear the very best from any recording. A wide sweet-spot means that the sonic clarity does not degrade too rapidly as one moves off center.
Timbre. Another name for the texture, or harmonic structure, of a particular voice or instrument. Timbre (tam-burr) is a principal part of that sound's identity, and is created by the loudness of its harmonics versus the fundamental, and also when each comes and goes. Timbre is a big part of how we tell two of anything apart -- for example, two guitars or two voices -- that are playing or singing the same note. It is also a strong part of how we know that sound is from our child and not someone else, for yet another example. One learns timbre through experience. Comparing the sound of a Stradivarius to other violins playing the same notes is a good example of this. The artist also contributes to the timbre of that violin. A recording engineer's challenge is to get the timbre correct for your living room. You may know a violin's sound from the concert hall, but the microphone is much closer and hears a completely different timbre (and dynamic range). The type of microphone is chosen with those factors in mind, and the studio's gear is used to produce both a pleasing timbre and dynamic range. It can never be completely 'realistic,' and is the most difficult aspect of 'getting a good recording.'
Time-coherent Speakers. These are speakers that preserve the exact timing of the high tones, middle tones and bass, as they were recorded. Almost all speakers delay the bass and voice-range tones compared to the highs, and the high tones are then heard to arrive too soon. This causes all sorts of sonic problems, depending on the amount of time delay and where in the tone range it is at a maximum. These time-domain distortions are introduced by the speaker's crossover circuit, by different distances from each driver (woofer, midrange, and tweeter) to your ear, and by the physical characteristics of those drivers. Our speakers are time coherent by design, having almost no time-domain distortion. Time-domain distortions are mainly why speakers 'sound like speakers' instead of the real thing.
Tone Balance. Are voices bass-heavy or 'thin'? Proper tone balance is best judged on familiar sounds such as voices, guitar, piano, and woodwinds. With those familiar middle-range tones as references, are all the lower notes of a string bass loud enough? Are the 'esses' and 'tees' of the voice too loud? For good tone balance, the speakers must be designed to have a 'flat tone balance' when actually placed in rooms without many echoes and with no early reflections coming off surfaces too close to the speakers. Such a setup seldom requires tone control adjustments, even on old recordings or poor movies.
Voice Coil. When a speaker's cone or dome strokes, the amplifier's signal is creating a positive or negative magnetic field in a long spiral of wire behind the cone, called the voice coil. The coil creates a magnetic field running down the center of it, and this field is alternately attracted to and repelled by the magnet around it. A proper voice coil and magnet assembly allows the voice coil to consistently experience the same magnetic field around it for the lowest distortion, no matter how far it strokes.