TR 18.09.2009

measuring speakers

Some disputes never end, like the one about the best performing loudspeaker. Why do our ears so appreciate loudspeakers like those of Wilson Audio despite their less-than-stellar measurements? Why is there no sense in attaching incompetent measurements to an audio component review? And why can one simply not judge a speaker's sound quality on the basis of such measurements?

FRONTPAGE-Anechoic-room

Measuring - yes or no?

2015 revision note: In 2009 Audiodrom published an article that challenged the significance of the frequency response measurements printed in hi-fi magazines as a guide to deciding whether a speaker is good or not. The very same article is reprinted here, slightly revised. Why did we feel the need for the revision?
Reading between the lines of our readers’ feedback, I perhaps did not single out one important thing: a speaker’s frequency response measurement, whether taken in a room or in a lab, is irrelevant if it is the only measurement available.
If taken in a listening room, the measurement captures the combined response of the speaker and the room, yet it says nothing about the quality of the speaker itself. The result will change, often significantly, depending on the positions of the speakers, the listener and the measuring device, all of which are highly variable.
If taken in an acoustically treated lab, the frequency response is an important parameter that can be used for comparative assessments. Yet unless it is accompanied by a set of other measurements (impedance, resonant behavior, off-axis dispersion, step response, etc.), it remains just a clue about how good the loudspeaker may be.
Expert measurements are laborious and require technical and interpretation skills as well as appropriate equipment. Audiodrom, as a non-commercial magazine, cannot provide its reviewers with the funds and time needed for such tasks. We keep to subjective evaluations through the standardized DRS rating. With that said, let’s get back to the originally published article.

The controversy

Let’s start with a quick flashback to a few speaker reviews that appeared in Stereophile. To me, Stereophile is the sole authority with a proven methodology and experience in measuring speakers and in exploring how those measurements correlate with subjective auditioning. The systematic approach of Stereophile (alias John Atkinson) means that it is clear who took the measurements and how; the same cannot be said of the charts haphazardly thrown into other hi-fi magazines.
Yet even Stereophile does not claim that the measurements are objective proof of the subjective listening tests. Rather, they are complementary to the auditioning and can (but do not have to) show something that the ears did not notice. It was Robert Harley who stated (A Matter of Measurements, Stereophile, September 1989) that “The decision to measure some performance criteria in no way changes our [Stereophile’s] basic philosophy: the way to musical truth is through the ears, not the oscilloscope”, and that (...) “"objective" measurements are, in reality, subjective. One must make a subjective decision as to which "objective" measurements are important”.

It is fair to say that the measurements mostly correlate with the listening tests. Frequency response unevenness, resonances, load problems, nonlinearities, and high values of jitter or THD can be both heard and seen. What is surprising is that quite often the measurements do not correlate with what we hear. That this happens especially with high-end equipment is less surprising: the untrained ear accepts a fabulous sound at a fabulous asking price and forgives the obvious imperfections, and when a lab measurement shows the opposite, the listener feels confronted and refuses to believe it.

I will quote a few interesting lab reports on well-known brands that have appeared in Stereophile over the last few years...

Lumen White Whiteflame, 24,000 USD a pair, Stereophile, October 2002
John Atkinson comments on the measured impedance curve and the accelerometer’s output: “Of more concern are the peaks and discontinuities at 120Hz and 845Hz, which indicate the presence of major resonances at these frequencies.” The nearfield response (see picture) highlighted another problem: “But note the mess in the woofer and port traces between 90Hz and 150Hz. There is a major enclosure resonance in this region that affects the behavior of both the woofers and the port. I certainly found this resonance audible on pink noise from behind the speaker. What was surprising was how inaudible it was with music when I auditioned the speakers in Michael Fremer's system.”

Well, after reading those remarks, would you invest 24,000 USD in the Lumen White Whiteflame? Yet its sound excited the reviewer: “With the right amplification—ie, tubes—the Whiteflame offers effortless, seamless, natural musical performance subjectively free of obvious colorations or dynamic constraints at either end of the scale,” and “The Whiteflame's overall transparency and ability to resolve low-level musical detail was exceptional, and its midband performance was lush and airy”.
Do not blame the reviewer for incompetence: the same speakers also got the highest rating in the German magazine Stereo (97%, the same as the WATT/Puppy 8 system), a true reference.

Vienna Acoustics Beethoven Concert Grand, 4,500 USD a pair, Stereophile, May 2006
John Atkinson: “I was disappointed in the Beethoven Concert Grand's measured performance. I had expected more, both from Michael's auditioning impressions and from my experience of earlier Vienna Acoustics models. (...) Perhaps this is why Michael's friend described the Beethoven as a "music lover's speaker" rather than one aimed at audiophiles.”
Michael Fremer's excitement about the Beethoven Concert Grand was shared by other reviewers around the globe.

Wilson WATT/Puppy 8, 27,900 USD a pair, Stereophile, June 2008
One of the most tested loudspeaker systems on our planet (different planets may have different preferences). Wilson Audio’s design is ultimate and unique, yet not everything was okay in the lab:
John Atkinson: “...the WATT/Puppy 8's impedance characteristic demands the use of an amplifier that can deliver generous amounts of current (...) It is definitely a difficult load for an amplifier to drive, which leaves me puzzled as to why Wes had no problems driving it with the tubed Cayin amplifier.”

The anechoic frequency response (see picture above) was compared to that of the previous model, the WATT/Puppy 7: “...as in the earlier design, the Focal tweeter's dome-resonance peak lies at 20kHz, which is a little close to the audioband for comfort.”
The lumpy frequency curve shows uneven peaks around 3-5kHz and dips of more than 4dB above 10kHz. The in-room frequency response did not improve the picture (see picture below).

Wilson-Watt-Puppy-2

John Atkinson confines his comments to the 700Hz peak: “I also felt the stereo imaging to be stable and well-defined and the sound superbly detailed, though the small presence-region peak visible (...) might have had something to do with the latter aspect.” There are no other remarks except that the bass balance was “warmer and ever-so-slightly smeared”. With +/-10dB unevenness, the in-room response is far from ideal.
The Wilson WATT/Puppy 8s are excellent, great-sounding loudspeakers. Audiodrom’s PW has them at home, and I would not object to owning them too, provided I could raise enough funds. So nobody questions their performance. I assume that Wes Phillips must have heard the in-room frequency imperfections (+/-10dB is easy to spot), yet he wrote: “However, Jonathan Levine's baritone sax solo (...) sounded perhaps a trifle too warm, as did Arnie Kinsella's kickdrum. "Perhaps," "a trifle" - or was I finally hearing the recording accurately for the first time?”
In his conclusion, the reviewer declares the Wilson WATT/Puppy 8 a benchmark product. Oh yes, and he also notes: “The W/P8 appears to be a relatively easy load to drive.”

Sonics Allegra, 7,800 USD a pair, Stereophile, January 2009
This review was doomed from the beginning. The pair of speakers provided for the review proved to be faulty. It “wasn't bad, exactly, just diffuse and lifeless”. A new pair was therefore requested, on which Brian Damkroger commented: “They sounded different from the first pair—better, but still pretty ordinary. Systematically moving the Allegras around resulted in small changes, but nothing dramatic. (...) Placing the speakers midway between the front and rear walls, and my listening chair at the two-thirds point between those walls. This resulted in greater changes in the sound, but they weren't necessarily improvements.”
After a few days of experimenting with the positioning, the reviewer ended up with a satisfactory placement (still a bit of a mystery to me, as it was nearly identical to the initial one). John Atkinson performed his standard set of measurements and concluded: “There is much to admire in the Sonics by Joachim Gerhard Allegra's measured performance, and nothing to indicate why Brian Damkroger had to work so hard to optimize its performance in his listening room.” The review was positive in the end, yet it raises many questions about how the measurements relate to the listening tests (or vice versa).

High-end enthusiasts, irrespective of their country of residence, tend to comment on things they think they understand. However, not many people can truly interpret correctly taken measurements; if you are one of those talented individuals, you have my respect. The most universal “truth” is that a speaker should measure as flat as possible between 25Hz and 20kHz in a listening room. Look at the selection of three speakers below. Which one do you prefer?

NHTThree

UsherBe718

RevelUltimaSalon2

It is not an easy task, is it? The responses of these three speakers differ only in details and in how low they can reach. The anechoic measurements (the sum of nearfield measurements) are excellent, and the response is very smooth, with fluctuations within +/-2dB. What you cannot see is the price difference between the first and the last speaker: ca 20,000 € (scroll down to the end of this article to see the contenders). I am sure you understand why it would be naïve to make your choice based solely on graphs.
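
For readers who like to see the arithmetic, here is a minimal sketch of how a “+/- x dB” flatness figure can be derived from a measured curve. Everything in it is an assumption made for illustration: the function name flatness_db, the 100Hz to 10kHz band, and above all the two data sets, which are invented and do not belong to any of the speakers shown above.

```python
def flatness_db(freqs_hz, levels_db, lo_hz=100.0, hi_hz=10000.0):
    """Return half the peak-to-peak level spread inside a band (the '+/- x dB' figure)."""
    in_band = [lvl for f, lvl in zip(freqs_hz, levels_db) if lo_hz <= f <= hi_hz]
    if not in_band:
        raise ValueError("no measurement points inside the band")
    return (max(in_band) - min(in_band)) / 2.0


# Hypothetical measurement points (Hz) and levels (dB SPL), invented for this example.
freqs = [50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000]
speaker_a = [82.0, 85.0, 86.5, 85.5, 86.0, 87.0, 85.0, 84.0, 80.0]
speaker_b = [78.0, 84.5, 86.0, 86.0, 85.5, 86.5, 86.0, 83.5, 81.0]

print(flatness_db(freqs, speaker_a))  # +/- figure for speaker A
print(flatness_db(freqs, speaker_b))  # +/- figure for speaker B
```

With these invented numbers both speakers come out at the same +/- figure although their curves differ point by point, which is exactly why a single flatness number, or a single chart, is a poor basis for a choice.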

Are controlled anechoic measurements objective? I believe so; in any case, they are the only instrument we have. Theoretically, the same speaker should measure identically anywhere in the world. Let’s again use the Wilson WATT/Puppy 8, which was measured by the National Research Council in Canada for Soundstage! magazine: the measurements of Soundstage! and Stereophile provide the same data, which supports the objectivity claim.

Wilson-Watt-Puppy-SS

Think positive!

Each of us prefers a different sound. Each of us has a different pair of ears, different experiences, different benchmarks. We listen at different volume levels, we listen to different music, we have different emotions. You know this: one evening you spend magic moments in the company of your favorite music selection, and the next day the same music sounds just terrible. Or this: you love what you hear from your new piece of equipment until you read in a renowned magazine that this very piece of equipment is highly colored, underperforming crap. You start hating the magazine (we have already built up a hater mob of our own), yet the probability that you will sell that piece of equipment has risen close to 100%. It works the other way round too: owning a device that was praised by a reviewer gives you peace of mind and increases your listening pleasure. You cannot do anything about the psychological biases we have. But there are other parameters to work with.
Room acoustics has the power to totally ruin or elevate the listening experience. The best sounding component in a poor room will perform poorly. The best sounding component paired with a bad sounding component will perform badly. Can we foretell the result based on the measurements? Partially. Having the complete set of measurements on hand can help us avoid mistakes in the room or in the audio chain. Yet such an assessment will improve our chances only to 60-65%, which is just a bit better than the success rate of a weather forecast. Another question: can a different source component (a turntable, a player, a DAC) have an audible and measurable effect on what we hear in a room?

The bad news is that subjective auditioning can reveal differences even though the measurements do not explain why. In ideal conditions the measurements should tell us everything; unfortunately, there are too many variables and distortions that mask the nuances. Let’s face it again: the in-room frequency response, flat or not, stays the same even if you replace your Blu-ray audio player with a smartphone. It carries no information on THD, S/N ratios, step responses, time alignment, etc. Can you measure transparency, drive, rhythm, bass articulation...? Subjectively perhaps, but there is no scientific proof, I am sorry.

As explained, the acoustic properties of the listening room are by far the most important element of any hi-fi setup if we want to know what the chart looks like from our listening chair. If you compare again the in-room and anechoic measurements of the Wilson Audio WATT/Puppy 8, you will see a big difference. The same can be seen in the pictures below:

Avalon-anechoic

Avalon-listening-room

The first picture shows the fabulous Avalon Acoustics Indra (20,000 USD /pair, Stereophile, October 2008) in an anechoic chamber; the second graph was measured in Wes Phillips’s listening room. Although the latter looks ugly, it is not that bad (the x-axis calibration is quite fine), yet you would agree that it is totally different from the anechoic one. However, both curves are objective in a sense. On top of that, should you happen to be a proud owner of the Avalon Indra, the measurement in your room will be completely different from either of the examples.
Though the same contrast (objective vs subjective) applies to any component, loudspeakers are the most visually ‘understandable’, and that is why I use them in this article.
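
To put a rough number on “totally different”, one could subtract the anechoic curve from the in-room curve at the same frequencies and look at what is left once the overall level offset is removed. The sketch below does only that. The function name room_contribution_db and all of the level values are hypothetical, invented for illustration; they are not the Avalon Indra data, and the result also depends on microphone position and measurement method, not just on the room.

```python
def room_contribution_db(anechoic_db, in_room_db):
    """In-room minus anechoic level at each point, with the average offset removed."""
    if len(anechoic_db) != len(in_room_db) or not anechoic_db:
        raise ValueError("both curves need the same, non-zero number of points")
    diffs = [room - free for free, room in zip(anechoic_db, in_room_db)]
    offset = sum(diffs) / len(diffs)  # overall level difference between the two setups
    return [d - offset for d in diffs]


# Hypothetical levels (dB SPL) at 50, 100, 200, 1000 and 5000 Hz, invented for this example.
anechoic = [84.0, 86.0, 86.0, 86.0, 85.0]
in_room = [92.0, 80.0, 90.0, 86.0, 84.0]

contribution = room_contribution_db(anechoic, in_room)
worst = max(abs(c) for c in contribution)
print(contribution)  # per-frequency deviation attributable to the room and setup
print(worst)         # largest swing in dB
```

In this made-up case the largest swings sit in the bass, which is where room modes typically do the most damage; move the speakers or the chair and the numbers would change again.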

The High End is not High enough anymore

We do not review equipment that sits on supermarket shelves. With high-end producers, one expects the equipment to be properly designed, measured and tested before it reaches our listening rooms. There is fierce competition out there waiting for mistakes, and no one wants to risk their reputation. Yet state-of-the-art technology, technical advancements and know-how have ceased to be the property of a few big brands: a pair of Asian hands can do miracles in a couple of hours, and the result can be astonishing. Call it a globalization tax if you want. Ideas travel at the speed of light and are applied within days, not months. I like it; it widens the options we have.

To conclude, one more nice comment from Stereophile’s John Atkinson. He measured the beautiful Sonus Faber Amati Homage (20,000 USD /pair, Stereophile, June 1999), only to find that:
"While some of the Amati Homage's measurements are excellent, there is nothing to indicate why Michael Fremer was so enamored of the speaker's sound. Indeed, some of the measurements, such as of the speaker's bass performance, raise more questions than they answer. But if a demonstration is required of the fact that once the basic science has been addressed, speaker design still involves art, the Sonus Faber Amati Homage provides it."

(The three frequency responses in the riddle belong to: NHT Classic Three, Usher Be-718, Revel Ultima Salon2).

As we see it - tips & thoughts