Soundstage vs. Imaging
As reviewers, it is our duty to relate our experiences with the various products we review in language that communicates what we hear, so that the reader can share in that experience and make informed decisions about those products. This process involves a lot of Audiophile jargon that we as writers assume our target audience understands: words like Soundstage, Imaging, Channel Phase, Absolute Phase, Phase Coherency, Bandwidth, Linear Bandwidth, Frequency Response, Dynamic Range, Noise Floor, Air, Tonal Balance, Timbre, Resolution, and Musicality. We also use a lot of technical terms like THD (Total Harmonic Distortion), TIMD (Transient Intermodulation Distortion), Impedance, Bit Rate, Bit Depth, Sample Rate, and Signal to Noise Ratio (admittedly some of the terms from the previous list are also technical terms). Unfortunately, unless someone consults a learned Audiophile, they may not have a way to actually learn what these terms mean. This was borne out several years ago at CES when I was having dinner with several of my colleagues and one of my fellow reviewers (I say fellow reviewer, but at the time I was a manufacturer; he is still a reviewer, so I guess I can use the inclusive) declared that there was no such thing as Soundstage, that it was just Audiophile BS for Imaging. After several hours of explanation and argument, I can’t swear that he was ever convinced, though I will grant that he was a Personal Audio person, a field where Soundstage is often limited in the absence of Binaural recordings (recordings made using an artificial head with the microphones placed where the ears are in order to properly recreate the soundstage for headphone listening). So in an attempt to address this issue, I will take a stab at defining my terms.
Soundstage vs. Imaging: since this is where the conversation started, this is where I’ll begin my explanations. Imaging is the ability of the sound system to place individual instruments in a three-dimensional field (usually in front of you) so that it is possible to visualize the musicians on stage before you (or more to the point, visualize yourself being at the performance). Proper Imaging allows you to identify each instrument in a focused position on the stage. Many factors can blur the Image. Images can also drift, giving the impression that the musicians are wandering around while playing different notes. In the case of low-resolution recordings or sound systems, there may be no sense of individual musicians at all, just a wall of sound coming from the left, right, or middle. Soundstage, on the other hand, though related to Imaging, is the sense of space around you and the instruments: the sense of walls surrounding you if it is an indoor performance, or the sense that there are no walls if it is an outdoor performance (this is why room treatment is so important to Soundstage).
Phase is both the simplest and most complex of the technical terms used and therefore can be easily misunderstood. Sound waves are actual waves of energy, and Phase is the orientation of these waves in relation to the listener or, in the case of a recording, the microphone. It is measured in degrees around a full circle (360º), with a half-circle (IE: 180º) being a complete inversion of the wave.
For those who are aware of Phase, Channel Phase is the most obvious. If the right and left channels are 180º out of phase (IE: One speaker is wired backward) and facing the same direction, you will get waveform cancellation of common information (information that is sent to both speakers). That is usually the bass and vocals, which means you get a dropout of bass, making the sound tinny. This also means that if the speakers are facing each other they need to be wired out of phase or the same will occur (a problem that often happens with a dual subwoofer system where one woofer is at the front and the other at the rear of a room). Note: Phase is critical to Imaging and Soundstage, so the right and left speakers should ideally be facing towards the listener.
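As a rough illustration of that cancellation, here is a minimal Python sketch that sums a tone sent equally to both channels, first in phase and then with one channel inverted by 180º. It idealizes the situation (a pure 100Hz sine, perfect summing at the listening position, no room effects), and the function names are hypothetical ones chosen for this example.

```python
import math

def sine(freq_hz, sample_rate, n_samples, phase_deg=0.0):
    """Return n_samples of a unit-amplitude sine wave with a phase offset in degrees."""
    phase = math.radians(phase_deg)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate + phase)
            for n in range(n_samples)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

SAMPLE_RATE = 48000  # assumed sample rate for the illustration
N = 4800             # 0.1 second of audio

left = sine(100, SAMPLE_RATE, N)                 # "common information" in the left channel
right_in_phase = sine(100, SAMPLE_RATE, N, 0.0)  # same signal, correctly wired
right_inverted = sine(100, SAMPLE_RATE, N, 180.0)  # same signal, one speaker wired backward

sum_in_phase = [l + r for l, r in zip(left, right_in_phase)]
sum_inverted = [l + r for l, r in zip(left, right_inverted)]

print("in phase peak:", peak(sum_in_phase))      # the two channels reinforce each other
print("180 degrees out peak:", peak(sum_inverted))  # the two channels cancel almost completely
```

In a real room the cancellation is never this complete, which is why the audible result is thin bass rather than total silence.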
Absolute Phase (also known as the Wood Effect) is an issue that relates most pointedly to digital recordings, as they tend to be either In-Phase or 180º Out-of-Phase. It is also an effect largely ignored by a majority of the audio community, as it is extremely subtle. Essentially, if the sound is 180º Out-of-Phase it hits your ear as if the sound was coming from behind you, which can damage the Soundstage, and the sound tends to be slightly harsher and less dynamic.
Phase Coherency is a matter of timing; specifically, do all frequencies hit your ear in the same temporal arrangement as they hit the microphone? Since Phase is one of the primary factors in locating the source of a sound, Phase Coherency is critical to Soundstage and Imaging.
Bandwidth and Frequency Response are related. Bandwidth is the top and bottom limits of the Frequency Response (how high and how low does the system go) and Frequency Response is the linearity of that Bandwidth (how accurate is the sound). Linear Bandwidth is the measurement of how consistent the Frequency Response is at different power levels and volumes (Poor Linear Bandwidth contributes to a fuzzy Image and Soundstage and is one of the causes of the dancing musician).
Dynamic Range is simply the difference between the quietest sound and the loudest sound the system or recording can produce. Most music has a much more limited Dynamic Range than either the system or the recording format is capable of (IE: the difference between a triangle and a full orchestra is much less than the difference between silence and a full orchestra; hence, unless there are periods of silence in the piece, the Dynamic Range is limited to the difference in the volumes of the individual instruments). Noise Floor is simply how quiet the system is, which has a great effect on Soundstage as it affects Resolution.
Air is a term used to describe Imaging and Soundstage and refers to how much space you hear around each instrument. Do you hear a group of individual violins playing or a single multi-toned violin? It can also be used to describe the size of the Soundstage and the sense of space around the listener.
Tonal Balance is how natural the relationship is between the Bass, Midrange, and High Frequencies and all of the micro-steps between. Oddly enough, a flat Frequency Response does not always equate to a Linear Tonal Balance, as other factors are at play.
Timbre is usually used in reference to individual instruments and all of their resonances, and describes how much they sound like a real-life example of that instrument (this is why it is important for reviewers to hear live instruments as often as possible).
Resolution is an optical term that is a great metaphor for how detailed, how finely sliced up over time, a recording or system sounds. This is why I don’t think of higher Bit Depth as being higher resolution: while it increases dynamic range and accuracy, it is Sample Rate that determines resolution over time (higher Bit Depth is not more defined, just more accurate).
Musicality, well that’s what it is all about really, is a combination of how pleasant it is to listen to and how real it sounds with probably a little bit of emphasis on the pleasant (how shrill or brassy do you want your violins and horns?).
Which brings us to my list of technical terms, which honestly you could look up, but I’ll go through them anyway.
“THD (Total Harmonic Distortion) is defined as the ratio of the sum of the powers of all harmonic components to the power of the fundamental frequency,” to quote Wiki, which in effect means how much the original signal is distorted by added harmonic overtones. It is the most well known and understood distortion and is pretty much universally considered to be inaudible under 1%, and since it is also the most easily controlled, it is what most manufacturers concentrate on.
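To make that definition concrete, here is a small Python sketch that computes THD from a list of harmonic amplitudes. It assumes all components drive the same load, so power is proportional to amplitude squared, and it reports the percentage as the square root of the power ratio, which is the convention spec sheets commonly use. The function names and the sample amplitudes are hypothetical, not measurements of any real product.

```python
import math

def thd_power_ratio(fundamental_amp, harmonic_amps):
    """Sum of harmonic powers divided by the power of the fundamental.

    Assumes a common load, so power is proportional to amplitude squared.
    """
    harmonic_power = sum(a * a for a in harmonic_amps)
    return harmonic_power / (fundamental_amp ** 2)

def thd_percent(fundamental_amp, harmonic_amps):
    """THD as an amplitude-ratio percentage (square root of the power ratio)."""
    return 100.0 * math.sqrt(thd_power_ratio(fundamental_amp, harmonic_amps))

# Hypothetical example: 1 V fundamental, 8 mV second harmonic, 6 mV third harmonic.
example = thd_percent(1.0, [0.008, 0.006])
print(f"THD = {example:.2f}%")  # lands right at the 1% figure mentioned above
```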
TIMD (Transient Intermodulation Distortion) is caused by the use of negative feedback in a solid-state amplifier circuit. Negative feedback is the most common way of dealing with THD and entails sending a little bit of Out-of-Phase signal back through the amplifier circuit to cancel out THD. TIMD arises from the amount of time it takes to make that loop (a delay tied to the amplifier's rise time and slew rate), producing that harsh metallic sound associated with solid-state amplifiers. It is the reason that many high-end amplifiers eschew negative feedback in favor of simply using more expensive components (more expensive to the tune of tossing about 99 components for every 1 usable one, so you can see why they cost more). Ironically, this is also why many tube amp manufacturers eschew negative feedback, despite the fact that the slew rate of tube amps is too slow to produce TIMD.
Impedance is resistance load as it relates to frequency. A resistor is a fixed impedance load, whereas a speaker is a dynamic impedance load, meaning the impedance changes as frequency changes. It is important because it is a factor in how much power the amplifier has to produce. Lower Impedance means you need more current (which means more heat, which burns out components); if the current isn’t there, then neither is the note. Higher Impedance means you need more voltage or you have a significant drop in power. If an amplifier produces 100 Watts @ 8Ω it should produce 200 Watts @ 4Ω and 400 Watts @ 2Ω, but amplifiers often don’t have enough power supply to support that, so you will often see amplifiers rated at 200 Watts @ 4Ω and 201 Watts @ 2Ω. That means that a speaker rated at a nominal 4Ω that drops to 1Ω below 100Hz simply won’t reproduce those signals, and might fry an inexpensive amplifier or, worse, the speaker via clipping. Conversely, that same 200 Watt @ 4Ω amplifier should produce 50 Watts @ 16Ω, but such amplifiers often don’t have the circuitry to support that and might kill a nominal 8Ω speaker that swings up to 16Ω.
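The doubling and halving figures above follow from treating the amplifier as an ideal voltage source, where power is voltage squared divided by impedance. The Python sketch below works through that arithmetic. It assumes a purely resistive load and an unlimited power supply, which is exactly what real amplifiers lack, and the function names are hypothetical.

```python
import math

def power_at_impedance(rated_watts, rated_ohms, load_ohms):
    """Power an ideal constant-voltage amplifier delivers into a different load.

    Uses P = V^2 / R, with V^2 derived from the rated power and impedance.
    """
    voltage_squared = rated_watts * rated_ohms
    return voltage_squared / load_ohms

def current_amps(watts, load_ohms):
    """Current required to deliver the given power into the load (P = I^2 * R)."""
    return math.sqrt(watts / load_ohms)

# The 100 Watts @ 8 ohm amplifier from the text, into progressively lower loads.
for load in (8, 4, 2):
    watts = power_at_impedance(100, 8, load)
    print(f"{load} ohm: {watts:.0f} W, {current_amps(watts, load):.2f} A")

# The 200 Watts @ 4 ohm amplifier into a 16 ohm load.
print(f"16 ohm: {power_at_impedance(200, 4, 16):.0f} W")
```

Note how the current demand climbs as the impedance falls: the ideal 2Ω case needs four times the current of the 8Ω case, which is where under-built power supplies give out.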
Bit Rate is a measurement of digital compression. An uncompressed 16-bit/44.1kHz stereo file has a bit rate of 1411.20kbps; anything less and a portion of the bits have been tossed out. If you are using lossless compression, that should not be a problem as long as your processor is fast enough to decode the compression scheme and reproduce the missing bits. If it is a lossy compression system, you can see that a significant portion of the music is lost (about 91% of the bits for a 128kbps file). Bit Rate is also important when considering data transfer (say via USB to your DAC). A 24-bit/192kHz file has a Bit Rate of 9216kbps, a 24-bit/384kHz file 18432kbps, and a 32-bit/768kHz file 49152.00kbps, with quad DSD ringing in at 22579.20kbps (remember it is an uncompressed signal that goes from your computer to your DAC, unless it is DoP, in which case it is larger).
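For anyone who wants to check figures like these, uncompressed bit rate is just bits per sample times samples per second times the number of channels. The Python sketch below assumes two-channel (stereo) audio and treats quad DSD as a 1-bit stream at 11.2896MHz per channel; the function name is a hypothetical one for this example.

```python
def uncompressed_bit_rate_kbps(bit_depth, sample_rate_hz, channels=2):
    """Bit rate of an uncompressed stream: bits x samples per second x channels."""
    return bit_depth * sample_rate_hz * channels / 1000

# 16-bit/44.1kHz stereo, the CD-quality reference point.
print(uncompressed_bit_rate_kbps(16, 44_100))      # 1411.2

# 24-bit/384kHz stereo.
print(uncompressed_bit_rate_kbps(24, 384_000))     # 18432.0

# Quad DSD, assumed here to be 1 bit at 256 x 44.1kHz = 11.2896MHz per channel.
print(uncompressed_bit_rate_kbps(1, 11_289_600))   # 22579.2
```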
Bit Depth is how many bits are contained in each sample in a PCM signal, which determines dynamic range (6dB per bit) and accuracy, which increases geometrically (doubling) with each bit.
Sample Rate is simply how many samples are taken per second, which determines the Resolution or Bandwidth of your digital signal.
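The 6dB-per-bit rule and the link between Sample Rate and Bandwidth can both be worked out in a few lines. This Python sketch computes the number of quantization levels and the theoretical dynamic range for a given Bit Depth, plus the theoretical upper frequency limit (half the Sample Rate, the Nyquist limit) for a given Sample Rate. These are ideal ceilings, not what any particular DAC achieves, and the function names are hypothetical.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values each sample can take; doubles with every added bit."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range: 20 * log10(levels), roughly 6dB per bit."""
    return 20 * math.log10(quantization_levels(bit_depth))

def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest frequency a sampled signal can represent: half the sample rate."""
    return sample_rate_hz / 2

for bits in (16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels,",
          round(dynamic_range_db(bits), 1), "dB")

for rate in (44_100, 192_000):
    print(rate, "Hz sample rate:", nyquist_bandwidth_hz(rate), "Hz bandwidth")
```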
Signal to Noise Ratio is the relationship of Noise Floor to 0VU, which was the standard set for the recording industry as the optimum recording level. Dynamic Headroom is how far above 0VU a particular component is capable of performing or, in the case of an amplifier, how far above rated power.
That covers all of the terms that come to mind, but feel free to ask about any term I use in any of my articles and I’ll do my best to clarify.
I hope this is of use or at least interest to you and Happy Thanksgiving to all.