44 KHz (CD) not enough !? (Nyquist etc.), plethora of distortion frequencies?
Remarks and conclusions added May 12 2003 - 1:55 PM, and edited May 14 2003 - 08:35 AM :

My dubious claims unfortunately had a very short life span due to the very successful enlightenment efforts of tigre, 2Bdecided, KikeG and mrosscook.

In short: I failed to come up with evidence that CD quality (I mean 44.1 KHz digital sampling) is somehow problematic. It basically was a story of using the wrong tools, jumping to the wrong conclusions, and not having enough of a clue about signal processing.

Nevertheless, I tried again with a less ambitious claim: that the 44.1 KHz digital sampling rate is not enough to correctly represent all signals below 22.05 KHz.

And again my claims had a very short life span. This time due to further enlightenment efforts by DonP, 2Bdecided, KikeG, mrosscook and SikkeK.

The conclusion: Arguing against the technical specification of CD quality (44.1 KHz/16 bit) should not be attempted by someone who severely lacks a clue about signal processing (like me).

If the CD sound quality is perceived as suboptimal, it may have more to do with poor recording, poor mastering, and suboptimal reproduction equipment (i.e. CD player and sound system/headphones).

What one still could try are listening tests:

Such tests would need to be done with one and the same high-end hardware for all signals and all tests (preferably with 192 KHz resolution, with 20-24 bit, and with a DAC that is perfectly shielded, sits outside of any system that is rich in EM signals, like a computer, and has near-perfect analog circuitry). And when testing the 192 KHz signal against the 44.1 KHz signal, the latter would need to be a digitally downsampled version (to 44.1 KHz) that was upsampled to 192 KHz again, using the best available algorithms (Cool Edit may do a reasonable job here).
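A side note on the round trip described above: 192000 and 44100 are not integer multiples of each other, so the conversion is a rational resampling. Here is a minimal Python sketch of the arithmetic only. The function name `resampling_factors` is made up for illustration, and the code computes the ratio without doing any filtering or resampling itself.

```python
from math import gcd


def resampling_factors(src_rate, dst_rate):
    """Return (up, down) so that dst_rate == src_rate * up / down, in lowest terms."""
    common = gcd(src_rate, dst_rate)
    return dst_rate // common, src_rate // common


# 192 kHz -> 44.1 kHz and back again
down_step = resampling_factors(192000, 44100)
up_step = resampling_factors(44100, 192000)
print(down_step, up_step)

# One second of 192 kHz audio, counted in samples, through the round trip
n = 192000
n_441 = n * down_step[0] // down_step[1]
n_back = n_441 * up_step[0] // up_step[1]
print(n_441, n_back)
```

The quality of the comparison then depends entirely on the low-pass filtering that a real resampler applies at each step, which this sketch leaves out.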

And still, asking the test subjects for audible artifacts would most likely not work at all. It might be more rewarding to let them rate how the music "felt" (e.g. more or less "relaxing" for music that should be "relaxing" but is rich in high frequency content nonetheless). This could be done in a way that is scientifically sound and statistically meaningful.
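To make the statistical side a bit more concrete, here is a minimal sketch of a one-sided sign test in Python. It assumes a simplified design that is not spelled out above: each listener rates both versions blind and is then counted as either favouring the high-rate version or not (ties dropped). The function name and the example numbers are invented for illustration.

```python
from math import comb


def sign_test_p_value(preferred, total):
    """One-sided p-value: chance of at least `preferred` out of `total`
    listeners favouring one version if there is really no difference
    (each listener then picks either version with probability 0.5)."""
    tail = sum(comb(total, k) for k in range(preferred, total + 1))
    return tail / 2 ** total


# Hypothetical outcome: 12 of 16 listeners rate the high-rate version as more relaxing
p = sign_test_p_value(12, 16)
print(f"p = {p:.4f}")  # p = 0.0384
```

A small p-value only says that the split is unlikely under pure guessing; it says nothing about why the listeners preferred one version.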

My original post:


I have to admit: This 44.1 KHz topic more or less has been discussed to death already. It also seems likely that the following problem has been discussed on Hydrogenaudio several times as well (but I had no luck with the search function).

The 44.1 KHz sampling rate (CD quality) seems to create an infinite number of "mirrors" at its harmonics. These in turn create a complex set of distortion frequencies for every frequency in the analog source.

The strongest "mirror" is at 22.05 KHz (44.1 KHz/2). But the problem can easily be demonstrated with the one at 11025 Hz (44.1 KHz/4) as well: if one creates a sine signal of 11025-1000 = 10025 Hz in a sound editor (e.g. Audacity, using a 44.1 KHz sampling rate) and plots the spectrum, then two additional frequencies are shown: one at 1000 Hz and one at 22050-1000 = 21050 Hz. More distortion signals can be seen if the FFT resolution is increased above 1024.
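For reference, the samples themselves can be checked without a plotting tool. The following Python sketch is only an illustration and rests on one assumption: the analysis length is chosen as 1764 samples, so that the 10025 Hz sine fits in exactly 401 cycles and the frequencies 1000 Hz, 10025 Hz and 21050 Hz each fall exactly on a DFT bin. The helper name `bin_amplitude` is made up. With that length the DFT shows the full amplitude at 10025 Hz and nothing beyond rounding error at the other two frequencies, which suggests that extra lines in an editor's spectrum plot come from how the analysis is set up rather than from the samples.

```python
import math

FS = 44100      # sampling rate in Hz
F = 10025       # test tone in Hz (11025 - 1000)
N = 1764        # 44100 / 25, so bins are 25 Hz apart and F falls exactly on bin 401

samples = [math.sin(2 * math.pi * F * n / FS) for n in range(N)]


def bin_amplitude(x, freq_hz, fs=FS):
    """Amplitude of the DFT component of x at freq_hz (must lie on a bin)."""
    n_total = len(x)
    k = round(freq_hz * n_total / fs)
    re = im = 0.0
    for n, value in enumerate(x):
        # reduce k*n modulo n_total to keep the angle small and accurate
        angle = 2 * math.pi * ((k * n) % n_total) / n_total
        re += value * math.cos(angle)
        im -= value * math.sin(angle)
    return 2 * math.hypot(re, im) / n_total


for f in (1000, 10025, 21050):
    print(f, "Hz:", round(bin_amplitude(samples, f), 6))
```

Choosing a length that does not hold a whole number of cycles spreads the single tone over neighbouring bins, which is a property of the analysis window and not of the 44100 Hz sampling.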

The general problem seems to be that a sampling frequency of 44.1 KHz does not guarantee that frequencies below 22.05 KHz are represented faithfully (as is mostly believed). Instead it probably more or less only guarantees that in the resulting complex signal the source frequency is significantly stronger than the numerous distortion signals.

Of course, the remaining question is whether these distortions are audible (they pretty much resemble amplitude modulation). I cannot really test this with 44.1 KHz since I don't have a 96 KHz soundcard. But the example with 11024 Hz surely looks rather disturbing (when looking at the waveform) and doesn't sound very clean either.

Has anyone done any (blind) listening tests on this?


The following example is very audible: When using a sampling frequency of only 2000 Hz (instead of 44100 Hz) and creating a sine wave of 750 Hz (well below the Nyquist limit of 1000 Hz), the result sounds pretty ugly (it's some kind of mixed signal of 750 Hz, 250 Hz and 1250 Hz).
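The numbers in this example can be checked directly on the samples. A minimal Python sketch (helper names are made up for illustration): it lists the first image frequencies k*fs - f and k*fs + f of a 750 Hz tone at fs = 2000 Hz, and compares the sample values of the 750 Hz sine with those of a 1250 Hz sine. The two sets of samples turn out to be identical apart from the sign, so 1250 Hz is simply the first image above the Nyquist limit, which the reconstruction filter in the playback chain is meant to remove. No 250 Hz term appears among the images.

```python
import math

FS = 2000   # sampling rate in Hz
F = 750     # test tone in Hz


def image_frequencies(f, fs, count):
    """First image frequencies k*fs - f and k*fs + f for k = 1..count."""
    images = []
    for k in range(1, count + 1):
        images.extend([k * fs - f, k * fs + f])
    return images


def sine_samples(f, fs, n_samples):
    """Sample values of a unit-amplitude sine of frequency f at rate fs."""
    return [math.sin(2 * math.pi * f * n / fs) for n in range(n_samples)]


print(image_frequencies(F, FS, 2))  # [1250, 2750, 3250, 4750]

tone = sine_samples(F, FS, 16)
image = sine_samples(FS - F, FS, 16)
# Largest difference between the 750 Hz samples and the inverted 1250 Hz samples
print(max(abs(a + b) for a, b in zip(tone, image)))
```

The second printed value is zero up to floating point rounding.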

This post has been edited by zephirus: May 19 2003, 15:49
QUOTE (KikeG @ May 19 2003 - 03:39 PM)
I don't trust these kind of comparisons, that are not rigorously controlled. Could you give more details?

I'm sure I've tried to write a definitive report of my experience before, but I can't find it, so here it is...

It's the 109th AES conference in LA. I'm presenting a paper, but while I'm there, my tutor has "volunteered" me to help with an audio demonstration. As it turns out, they don't need much help, but I go along anyway because it's very interesting.

The people involved are David Chesky and Kevin Halverson. Other "well known" people involved were dCS, who provided the converters, and Studer, who were kind enough to align the tape recorder before use.

David Chesky was there for reasons I'll explain in a moment. Kevin Halverson was there because most of it was his equipment, and he was operating it. They both had a very down-to-earth attitude to the audio industry: "How do you finish up with a small fortune after making a Jazz record? Start with a large fortune!" "Of course our company doesn't matter in the industry - heck - Sony's budget for paper clips is bigger than our turnover!" They both made quality recordings and equipment for the sake of it, though I was interested to hear that David Chesky would rather read a book when he gets home than listen to music. David was more of an artist, Kevin more of an engineer. They were kind enough to explain everything that was there, and let me play with all the equipment.

There were two demos, one relevant to this discussion, the other I'll explain too because it's probably more interesting!

Demo 1: 6-channel surround sound.

The main demo was 6-channel surround sound. Most of us have stereo. Some of us have 5.1. David was proposing that 5.1 isn't ideal for music. Since DVD-A and SACD actually have 6 full bandwidth channels, he wanted to use them. He didn't see the point of using one for the centre channel when his recordings already had a solid centre image. He didn't see the point of a dedicated .1 channel when music listeners should have full range speakers anyway, and their amplifiers should handle the bass management when they didn't. Plus, he had a much better use for these two channels.

He started with stereo (+/-30 degree spaced speakers), and used the normal surrounds (+/-110 degree spaced speakers, but this spacing could ideally be something else (I forget what) and in practice could be virtually anything that would fit into your room, so long as they were symmetrical behind you). To these four speakers, he added another two at +/-55 degrees front. The idea is that, in many real music listening situations, you get echoes from this direction. Good concert halls have their first main reflection at around this angle. So do many rooms. Since the human ear judges distance partly on the basis of comparing direct and reflected sound, it's good to make this cue more accurate. These extra speakers were raised off the floor by several feet, the idea being that in any realistic situation the audience would prevent you hearing anything lower from this angle.

You can argue with the reasoning behind this - whatever the justification, I think David chose it because it "just works". He'd decoded his B-format ambisonic masters into a 6-channel configuration and played with the result. For playback, he was using a PC with Cool Edit Pro running in multitrack mode. The 6 channels were output as 3 stereo pairs, to a special 6-channel sound card that Kevin had prepared for the demo. All files were 24-bit 96kHz. The PC could just keep up. You could switch channels on and off using the solo and mute controls in CEP, and I even dropped down into edit view a few times to try some spectral analysis on the various channels.

So, what did it sound like? These recordings were mostly things that David had already released on CD and got excellent reviews for. So, when listening to the stereo version, you were listening to a 24/96 version of what was available on CD. And it sounded pretty good. If you added the 2 rear surround channels, you got a nice bit of echo as well. This is like 5.1, but without the centre and sub. It's quadraphonic, if you like. It sounds nicer than stereo. Nice enough to go out and buy more amps and speakers? Depends. Then you switch on the two extra front speakers - wow! I mean - really - wow! The front sound stage went from maybe 6 feet to 20 feet across - and the depth! Sometimes CD reviewers go on about great depth in CD recordings - I know what they mean, but really, that's nothing! It's a few feet - this was like 20 feet! "You could shoot an arrow into that sound stage" was how one visitor described it.

Some of the recordings sounded "artificial" - larger than life. This was just due to the miking - David was still experimenting. Others sounded so realistic it was breathtaking. You had to be in the sweet spot for the most magic effect, but wow! Imagine sitting close to a cinema screen. Now imagine that it's in perfect 3D - so you know you're in a movie theatre because it's still there at the edge of your vision, but all in front of you is a different place. That's a visual analogy of what it sounded like.

I tried some interesting experiments. Firstly, if you turn off the main stereo pair, there's no real sense of sound stage or image - it just sounds like random echo. Secondly, if you turn off just the back surrounds, leaving the front 4 speakers on, it still sounds amazing. OK, there's no echo from the back anymore, but the front sound stage is still as huge. If you only had four channels, it would be much better to have four at the front rather than two front and two back with these recordings. Strange, but true. Try selling that to Joe Public.

The speakers were $3000 a pair (3 pairs required), and the amps were much more expensive. So I have no idea what this would sound like on equipment that I could afford! But in the demo, it was just fantastic. Stereo was pathetic by comparison. I don't own a surround system myself, partly because 5.1 just can't compare to what I’ve heard from 6.0.

Demo 2: different sampling rates.

Most people visiting the demo room wanted to hear demo 1. But a few were very interested in demo 2. We had a Studer 2-track studio machine, a master tape from a 1960s classical recording (apparently an audiophile-beloved rendition of Scheherazade), dCS pro A>D and D>A converters, and the same amps and speakers from the 6-channel demo, just running the front stereo pair.

Simply, you could change the source selector on the pre-amp to compare the analogue source directly with the source sent via the A>D and D>A converters. On the A>D converter you could choose any rate you wanted, and the D>A would oblige. You couldn't switch sample rates while monitoring because there was a glitch while the D>A caught up - so any digital version was always interspersed with the direct analogue feed by switching the pre-amp over while the A>D rate was changed. It was also possible to change the bit depth, but we left it at 24-bits. DSD was also possible, apparently, but would mean changing the digital interconnection, which we didn't attempt.

So, you could compare analogue with 32, 44.1, 48, 96 and 192kHz digital. You could have 88.2 and 176.4 too, but we didn't bother.

People usually couldn't hear a difference. They'd ask us to switch more quickly than was possible, saying it was impossible to hear a difference, because by the time we'd switched, the mood of the music had changed. They wanted to hear the loud cymbal crashing bit most, convinced they stood most chance of discerning a difference during this. But most failed. And I'll tell you the truth - I sat in the sweet spot, listened both sighted and blind and couldn't hear any difference.

The next day, while the demo was being run for the Nth time, I was at the back of the room talking with someone. Suddenly, I heard a difference as the source switched. I was surprised, having failed to hear a difference the previous day listening in the sweet spot. I listened as it switched again, and heard it switch back – ah ha, it must have just gone analogue / digital / analogue. I kept listening – I couldn’t hear the difference next time it switched.

I went to the middle/back of the room, and listened through the next demo. Without being told, I could pick out 44.1 and 48kHz. The difference was more obvious back from the sweet spot than in the sweet spot itself. More importantly, the difference wasn’t what I (or the other people who failed to hear it) had been listening for. It didn’t make any difference to the frequency response at all, or to the clarity of the high frequencies.

What 44.1kHz and 48kHz did do was to make the sound slightly less realistic, like the difference between a good and bad CD player. If the lower sampling rate had any defined “quality” it was a glassy kind of sound – I’d heard that word associated with CD before and thought it was complete rubbish – but now I actually heard the difference, I understood exactly what people had meant.

The change from 44.1 or 48kHz to analogue to 96kHz slightly increased the depth of the sound stage. I’d been listening to the amazing demo 1 for 2 days, so it was hardly an impressive difference, but it was still there.

If you’re counting, that’s only two blind detections – once when I wasn’t even listening, and again when I went back to the middle of the room to check – I confirmed which had been which with Kevin afterwards – “The next to last one was 40something, wasn’t it?”

You can say many things about this. You could say it was just luck, but I don’t think it was – I wasn’t even listening for the difference because (having listened the previous day) I didn’t think there was one to hear! You can say that I was hearing sonic deficiencies in the equipment. Well, maybe. That may be what the whole 44.1/96k debate is based on. All I can say is that, if there are sonic deficiencies in this equipment (I think the dCS boxes are around 5k each, and are used in many recording studios) then there isn’t much hope for the rest of us!

What you could say, with some justification, is that the “character” of 44.1 was more obvious outside the sweet spot, so maybe it’s not such a big issue. That’s probably true – except that maybe I was just listening for the wrong thing when I was “in the sweet spot”. Maybe I had to stop listening to the Hi-Fi, and start listening to the music and the performance to hear what was happening.

What is significant is that the 44.1kHz version wasn’t just different from the 96k and analogue version, it was worse. As the analogue was the master, any difference would be bad news, but for it to be subjectively worse makes matters even, well, worse!

I was upset to think how much recorded music only exists as a 44.1kHz or 48kHz sampled digital master tape. I discussed the subjective imperfections (the improved depth and realism of the 96k version) with Kevin, and he agreed. He was surprised that I’d noticed it that day, but couldn’t even hear anything wrong with 32kHz the previous day! I asked him what he heard with 16-bit (we’d been using 24-bit all along) and DSD. He said 16-bit was even worse – it made the whole sound “grungy”, and that DSD sounded nice, but added its own signature. “You can tell when you’re playing DSD through this system – the room heats up ;)” he said – I looked at the huge amps, and could believe it.

One thing I should note: I didn’t think the analogue master was particularly good quality. It was a gorgeous recording, but it had obvious flaws – e.g. background noise, and some audible edits. Also, I didn’t hear any difference between analogue, 96k and 192k. I can’t explain why 44.1kHz and 48kHz sounded worse, but they did. No one responsible for the demo had any reason to rig the results, and I played with enough of the equipment to know that everything was above board and fair, even though some of the cables we used might not have met with audiophile approval.

Demo 3

In another room, they had a 1960s recording studio set-up, and they had several master tapes from the 1950s and 1960s. Some actually were the masters (Jackie Wilson live somewhere), others were direct dubs from them, including Elvis, from a 3-track master – not many people have heard an Elvis recording in the original 3-track stereo!

There were plenty of other demos around. The DSD 5.1 demo was terrible, but that was due to equipment and volume. Most other demos sounded very harsh and artificial compared to the three I’ve mentioned. Not because they were using CDs as sources, but because modern studio and mastering practice gives rise to dubious quality recordings, as we often discuss here.

It seemed quite perverse that the nicest sounds at the AES were a 6-channel demonstration of recordings and equipment that you can’t even buy, and a tape from the early 1960s which still hasn’t been released in its original 3-track format.

Hope this clarifies my thoughts and opinions.

