
Sound, the human ear, and the digital world

Hello all,

Side note first: Please bear with me. I know this was discussed before in the 'lossless' thread, and maybe in some other places, but I still want (with your help) to understand some issues that weren't covered in previous threads.


Now to the actual subject in mind

Lately I've become interested in sound: what it is, how it's replicated by digital means, and how much of it can actually be heard. Here is what I've understood so far as facts.

1. The human ear can hear frequencies up to about 20,000 Hz, which makes sense to me after some personal testing in CoolEdit, since I wasn't able to hear anything when I created a sample sine wave at that frequency.

2. According to the 'Nyquist theorem' (which, by what Garf said, can be proved mathematically, and I'll take his word for it, as my math skills on this are not very good), anything above 40,000 samples per second is enough to replicate that frequency digitally. A normal 44.1/16 CD should even replicate frequencies up to 22,050 Hz, which should be more than enough 'headroom' just in case.

3. According to what I've read, the 16-bit integers used on CDs can describe 65,536 amplitude levels, which translates to a theoretical dynamic range of about 96 dB for the playback system. 'Quantization error', which as far as I understand is a nice term for amplitude error, should only become apparent at the very quiet end, i.e. around the -96 dB level.
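As a rough check of the figures in point 3, the number of levels and the theoretical dynamic range follow directly from the bit depth. This is only a sketch: the function name is made up for illustration, and it uses the simple 20*log10(levels) ratio, ignoring dither and the slightly different figure quoted for a full-scale sine.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    levels = 2 ** bits
    return 20 * math.log10(levels)

for bits in (16, 24):
    # Prints the level count and the rounded dynamic range for each depth
    print(bits, "bits:", 2 ** bits, "levels,", round(dynamic_range_db(bits), 1), "dB")
```

Each extra bit doubles the number of levels and adds roughly 6 dB.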

Now to the questions I'm asking:

1. Why do sampling rates higher than 44,100 exist, since nobody can hear beyond 20,000 Hz? For example, DVDs are sampled at 48,000 samples per second, DVD-A and SuperCD are sampled at 96,000 per second, and I think I even heard of 192,000 samples per second.
What did the people designing those sample rates hope to achieve?

edit: The main question is, why even bother to preserve frequencies nobody will ever be able to hear? <Sarcasm> To test aliens' hearing in the future? </Sarcasm>

Possible guesses:
a. Storage became so cheap that people just thought, hey, why not, let's get some *more* headroom
b. Obscure copy protection, so that devices limited to 44.1 can't touch the audio stream
c. More quality (?!?)

2. Why is a bit depth of over 16 bits (aka the infamous 24-bit) really needed? What are those 16,777,216 amplitude levels, with their amazing 144 dB range, actually needed for? I can't imagine people listening to normal music/voice/whatever being sensitive to quantization errors at the -96 dB level.
If I'm wrong, please explain how so

Again, my possible guesses:
a. Storage is cheap, so again, why not.
b. Even more headroom for sound processing, which doesn't explain why some sound cards/software or sound devices are so proud of their 24-bit output.
c. More quality (?!?)

To sum up my post, I really want to know whether those advancements actually contribute (even a little) to the sound quality that can be heard by the end consumer (or, for that matter, the human ear), or whether all of them are, in the end, just placebo, and this is the real reason the DVD-A and SuperCD formats are failing.
You can fool some of the people all of the time, and all of the people some of the time, but you can not fool all of the people all of the time.

- Abraham Lincoln

Reply #1
The one thing that I'll address here, and take this with a lump of salt, I'm no mathematician either: I believe that a great number of people have a fundamental misunderstanding of the Nyquist theorem.

Nyquist was trying to describe the baseline, or the absolute minimum requirements in the most simplistic case, to perform analog-to-digital conversion. His "2-times-the-base-frequency" calculation was based on a repetitive sine wave. As soon as you deviate from that example, e.g. the single impulse of the 5th harmonic of a cymbal crash, the 2-times thing doesn't work. It's not repetitive and it's not a sine wave.

As a consequence of this, the debate over needing higher than 44,100 comes in. Keep in mind also that the rate was chosen because Sony and Philips had so much time and money invested in the development of the CD that they had to get something to market. It's a bit of a compromise, but that, as they say, is a whole new can of worms.

Dex

Reply #2
To give a compact answer: it isn't needed to go higher than 44k/16bit as far as pure audio quality goes, but in some cases the excess makes real world devices easier to design.

Reply #3
Quote
The one thing that I'll address here, and take this with a lump of salt, I'm no mathematician either:
I believe that a great number of people have a fundamental misunderstanding of the Nyquist theorem


A quite ironic thing to say.

Quote
Nyquist was trying to describe the baseline, or the absolute minimum requirements in the most simplistic case, to perform analog-to-digital conversion. His "2-times-the-base-frequency" calculation was based on a repetitive sine wave. As soon as you deviate from that example, e.g. the single impulse of the 5th harmonic of a cymbal crash, the 2-times thing doesn't work. It's not repetitive and it's not a sine wave.


This is wrong. Nyquist _does_ apply to any signal, including complex signals that are not sines.

The problem is that Nyquist assumes a signal of infinite length, and perfect filters. In practice, we are very close to that. But it's easier to build cheap filters if there's some more headroom to work with.


Reply #5
Quote
I suggest both of you read the FAQ, more specifically:

http://www.hydrogenaudio.org/forums/index....t=ST&f=1&t=9311
http://www.hydrogenaudio.org/forums/index....t=ST&f=1&t=6150
http://www.hydrogenaudio.org/forums/index....4949#entry50336
http://www.hydrogenaudio.org/forums/index....t=ST&f=1&t=3390
http://www.musicgearnetwork.com/cgi-bin/ul...ic;f=3;t=000822

Thanks for the links, really interesting reading material.

Reply #6
Here's my best guess.

1) It's a good thing to move away from 44.1 kHz to at least 48 kHz. Resampling becomes tremendously easier (computationally, I think) between rates in the 8, 16, 32, 48, 96, 192 kHz family. Though I can't imagine why a consumer would want that kind of audio.

2) The best reasoning I heard for having increased sample rates is the interpolation. Yes, theoretically, one can interpolate the samples perfectly when approaching the Nyquist limit.  However, the perfect one is not always the one implemented (just a guess here), so one can and will hear the artifacts of interpolation.  An increased sample rate should eliminate all artifacts generated by inadequate interpolation methods.
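To put a number on point 1, the conversion ratio between two sample rates can be reduced to lowest terms. This is just an illustrative sketch (the helper name is hypothetical); the point is that rates in the 48 kHz family are related by small integers, while 44.1 kHz to 48 kHz needs an awkward rational factor.

```python
from fractions import Fraction

def resample_ratio(src_hz, dst_hz):
    """Exact output/input rate ratio as a reduced fraction."""
    return Fraction(dst_hz, src_hz)

print(resample_ratio(48000, 96000))   # 2
print(resample_ratio(44100, 48000))   # 160/147
```

A ratio of 2 means simply inserting one new sample between each pair, whereas 160/147 in a straightforward rational resampler means upsampling by 160 and then decimating by 147 (or an equivalent polyphase filter).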

Reply #7
Quote
Now to the questions I'm asking:

1. Why does sampling rates higher then 44,100 exist? Since, nobody can ever hear beyond the 20,000 Hz frequency? For example, DVDs are sampled at 48,000 samples per second, DVD-A and SuperCD are sampled at 96,000 per second, and I think I even heard of 192,000 samples per second.
What did the people designing those sample-rates hoped to achieve?


1. Better time-domain resolution.
44.1kHz/20us vs. 96kHz/10us, human ear up to 6us resolution.

Easier analog filters: they work outside the hearing range (>30 kHz), so no/less distortion appears in the hearing range (<=20 kHz).


Quote
2. Why does a bit rate of over 16bit is really needed (aka the infamous 24bit)? For what those 16777216 amplitude rates are needed to provide the amazing 144dB range is actually needed?


Low noise floor for recording/mixing/editing, because every step adds noise and every sound device adds noise.
.halverhahn

Reply #8
The 'human ear goes up to 6us' sounds weird to me and I'm not sure how this relates to the need for 44kHz.

Quote
Low noise floor for recording/mixing/editing, because every step adds noise and every sound device adds noise.


Note that this is pretty irrelevant for consumers.

Reply #9
Quote
Main question, is why even bother to preserve frequencies nobody will ever able to hear?

Let us not forget the VERY basic idea (I am just beginning to study this sort of thing, and I already understand this) that when two mechanical waves meet, more waves are created as a result of this meeting. Although some waves may be imperceptible to the human ear, perhaps the result of these waves is.

Basically this:  The sounds we do not hear affect the sounds we do hear.

Reply #10
Quote
Although some waves may be imperceptible to the human ear, perhaps the result of these waves is.


(Better explanation by more knowledgeable people below)

Reply #11
Quote
DVD-A and SuperCD are sampled at 96,000 per second,

Note that SACD is sampled at a much higher rate, about 2.8 MHz (64 x 44.1 kHz) if I remember correctly, not 96 kHz. But it's only 1 bit. This works because noise shaping lets you trade bandwidth for dynamic range.
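The bandwidth-for-dynamic-range trade can be sketched with a toy first-order delta-sigma modulator. This is only an illustration under assumed names and values (the function, the constant 0.3 input and the plain averaging filter are made up for the example); actual SACD encoders are understood to use much higher-order noise shaping.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> stream of +1/-1."""
    integrator = 0.0
    bits = []
    for x in samples:
        y = 1.0 if integrator >= 0 else -1.0   # 1-bit quantizer
        bits.append(y)
        integrator += x - y                    # accumulate the running error
    return bits

level = 0.3                                    # hypothetical constant input
bits = delta_sigma_1bit([level] * 10000)
average = sum(bits) / len(bits)                # crude low-pass filter
print(average)
```

Every output sample is just +1 or -1, yet averaging many of them recovers the input level closely; the more samples per unit of audio bandwidth, the finer the recovered amplitude.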

Reply #12
Quote
1. Better time-domain resolution.
44.1kHz/20us vs. 96kHz/10us, human ear up to 6us resolution.

The human ear has great INTER-EAR temporal resolution. But a 44.1 kHz sample rate can achieve an even greater INTER-CHANNEL resolution, much finer than just 1/44100 sec.
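A rough numeric check of this claim: delay one channel of a 1 kHz tone by 1 microsecond (far less than the roughly 22.7 microsecond sample period), quantize both channels to 16 bits, and recover the delay from the phase of the samples. The tone, the delay value and the function names are assumptions chosen for the sketch.

```python
import math

FS = 44100          # CD sample rate
F = 1000.0          # test tone in Hz (assumed for the example)
N = 441             # exactly 10 cycles of 1 kHz at 44.1 kHz

def sampled_tone(delay_s):
    """16-bit quantized samples of a sine delayed by delay_s seconds."""
    return [round(0.9 * 32767 * math.sin(2 * math.pi * F * (n / FS - delay_s)))
            for n in range(N)]

def estimate_delay(samples):
    """Delay of the tone, read from the phase of a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * F * n / FS) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * F * n / FS) for n, s in enumerate(samples))
    return math.atan2(-re, im) / (2 * math.pi * F)

left = sampled_tone(0.0)
right = sampled_tone(1e-6)                      # 1 microsecond later
measured = estimate_delay(right) - estimate_delay(left)
print("sample period:", 1 / FS, "measured delay:", measured)
```

The delay shows up as small changes in the sample values rather than as a shift by whole samples, which is why the timing resolution is not tied to the sample period.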

Reply #13
You know, questions are nice, if they're new questions.

But we've already been here, and done it to death.

The inter-channel time resolution of CD is infinite, while for human ears it's about 10 microseconds at best (not 6).


Please please please please read through those threads, or search them. They're really good!


EDIT: I agree with Garf's first answer to a large extent. But surely to say that it's enough, you'd need at least lots of negative ABX results to prove it. Granted, you'd need some positive ABX result to prove otherwise, but either way: caution is needed when making definitive statements.

Cheers,
David.

Reply #14
Quote
Quote

The one thing that I'll address here, and take this with a lump of salt, I'm no mathematician either:
I believe that a great number of people have a fundamental misunderstanding of the Nyquist theorem


A quite ironic thing to say.

Quote
Nyquist was trying to describe the baseline, or the absolute minimum requirements in the most simplistic case, to perform analog-to-digital conversion. His "2-times-the-base-frequency" calculation was based on a repetitive sine wave. As soon as you deviate from that example, e.g. the single impulse of the 5th harmonic of a cymbal crash, the 2-times thing doesn't work. It's not repetitive and it's not a sine wave.


This is wrong. Nyquist _does_ apply to any signal, including complex signals that are not sines.

The problem is that Nyquist assumes a signal of infinite length, and perfect filters. In practice, we are very close to that. But it's easier to build cheap filters if there's some more headroom to work with.

I realize that, for the sake of supplying jsheridan with the most accurate and meaningful information, I probably should have butted out of this one. It's just that every explanation or tutorial of how the Nyquist theorem works always presents it with a diagram of a repetitive sine wave. Common sense tells me that any scientist or mathematician developing a theory would start with the baseline... thus my assumptions.

I agree with and understand the whole concept of the "filters", but I also believe, perhaps erroneously, that it is the quality of the filters, and not the theorem itself, that allows for quality audio encoding. (To digital, that is.) If we had, say, 10 MHz sampling rates, the need for quality filters, if any, would greatly diminish. Thus the idea that higher sampling rates could lead to better audio.

Dex

Reply #15
Quote
You know, questions are nice, if they're new questions.

But we've already been here, and done it to death.

The inter-channel time resolution of CD is infinite, while for human ears it's about 10 microseconds at best (not 6).


Please please please please read through those threads, or search them. They're really good!

First, I must agree: after re-reading some of the previous threads, I came to understand a lot of things. Excellent reading, which I would recommend to anyone.

I came to the conclusion (based on the reading) that no one was able to prove any 'quality'-related claim about the superiority of any format above 44/16, which is what I suspected (but couldn't prove for lack of sufficient knowledge) when I wrote the original post.

(The only useful gain, which wasn't related to quality, was the easier re-sampling from 48,000, for example)

However, I did add a 'twist': if all of this information is known to most professionals dealing with audio, why even bother creating DVD-A and SACD? What do they gain from this? And what does the consumer have to gain from those new formats, if anything?

(For example, in the video/image world, there was a lot to gain from more color bits and higher resolutions)

Reply #16
Well, you will be able to sell new hardware devices and 'new' CDs (you've got to have the 'better' version of your favorite albums, obviously), and you'll have rammed better copy protection schemes up the customer's ass.

The only thing the customer will gain is multichannel audio.

Reply #17
Did you see my long post here?

http://www.hydrogenaudio.org/forums/index....opic=9311&st=50


Many people believe they can hear a difference, whatever the reason.

One thing is for certain: it can't sound worse. It can either sound the same, or better.

Almost everyone here will tell you it sounds the same. But they (usually) haven't heard it.

Almost everyone trying to sell you something will tell you that it sounds so much better.

A lot of people working in the recording world have never liked CDs. Not "preferred vinyl to CD" - that's an audiophile obsession. Rather, they don't like CDs compared to their (analogue or hi-res digital) masters.

However, there are many other reasons for CDs sounding bad! And there are good reasons for people in the recording industry to say that SACD sounds great: Sony will help them with the cost of making SACDs.

Cheers,
David.


Reply #19
Quote
One thing is for certain: it can't sound worse. It can either sound the same, or better.

It may sound worse.
Frequencies over 20 kHz can intermodulate with lower ones in the hifi, and create distortion that wouldn't be present in a 44100 Hz recording.
Again the old frequency test with an audible frequency in one speaker and an inaudible one in the other: when I play 6 kHz and 18 kHz in the same speaker, I can hear the 12 kHz distortion, which should not be there (don't try this at home if you don't want your tweeters to be fried).
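The 12 kHz product can be reproduced numerically by passing the two tones through a slightly nonlinear stage. This is a sketch only: the quadratic term with coefficient 0.1 is an assumed stand-in for a misbehaving amplifier or tweeter, and the 96 kHz rate is chosen just so that every tone lands on an exact analysis bin.

```python
import cmath
import math

FS = 96000
N = 9600   # 0.1 s, so 6, 12 and 18 kHz all fall on exact DFT bins

def tone_amplitude(signal, freq_hz):
    """Amplitude of one frequency component, via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

clean = [math.sin(2 * math.pi * 6000 * n / FS) + math.sin(2 * math.pi * 18000 * n / FS)
         for n in range(N)]
distorted = [x + 0.1 * x * x for x in clean]   # assumed second-order nonlinearity

print("12 kHz in clean signal:    ", tone_amplitude(clean, 12000))
print("12 kHz in distorted signal:", tone_amplitude(distorted, 12000))
```

With these particular frequencies, both the difference tone (18 - 6 kHz) and the second harmonic of 6 kHz land on 12 kHz, so the new component is a mix of the two.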

Reply #20
Recording studios use 24-bit audio because they need headroom for sound processing.
16 bits is enough for playback, but mixing is now done digitally, including amplifying, panning, compressing, equalizing, adding effects, etc.
If you start from a 16-bit source and apply a +6 dB gain, the result has only 15 bits of resolution, but if you start from a 24-bit source, the result has 23 bits of resolution, and you can still press a 16-bit CD from it.
I don't know why 96 or 192 kHz sample rates are needed. It seems that manufacturers don't know either. "THEY can't hear a difference, they make the boxes for those who claim they can". (Quoted from http://recpit.prosoundweb.com/viewtopic.php?t=1556 )
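The processing-headroom argument can be illustrated with a round trip through an intermediate word length. The sketch below is hypothetical (a 20 dB cut followed by a 20 dB boost, rounding to the working bit depth after each step, with made-up function names); it only shows that rounding errors introduced early get amplified by later gain.

```python
import math

def quantize(x, bits):
    """Round a value in [-1, 1) to the nearest step of a signed fixed-point grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def round_trip_error(bits, gain=0.1, n=1000):
    """Worst-case error after attenuating, storing, and boosting back."""
    worst = 0.0
    for i in range(n):
        x = 0.9 * math.sin(2 * math.pi * 7.3 * i / n)   # arbitrary test signal
        quiet = quantize(x * gain, bits)                # rounding happens here
        restored = quantize(quiet / gain, bits)         # boost magnifies that error
        worst = max(worst, abs(restored - x))
    return worst

err16 = round_trip_error(16)
err24 = round_trip_error(24)
print("16-bit working depth:", err16)
print("24-bit working depth:", err24)
```

With 24-bit intermediates the accumulated error stays far below one 16-bit step, so the final 16-bit master is limited only by its own quantization.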

Reply #21
Quote
Quote
One thing is for certain: it can't sound worse. It can either sound the same, or better.

It may sound worse.
Frequencies over 20 kHz can intermodulate with lower ones in the hifi, and create distortion that wouldn't be present in a 44100 Hz recording.
Again the old frequency test with an audible frequency in one speaker and an inaudible one in the other: when I play 6 kHz and 18 kHz in the same speaker, I can hear the 12 kHz distortion, which should not be there (don't try this at home if you don't want your tweeters to be fried).

That's true, and a very good point, but...

Very, very loud ultrasonic harmonics of the original signal will give rise to distortion components at the fundamental frequency. This doesn't sound too bad (unless there's _serious_ distortion).

On a CD player, some ultrasonic image frequencies above fs/2 will get through the low-pass filter. These will not be harmonics of the original signal (because the spectrum is mirrored around fs/2, so 21 kHz has an image at 23.1 kHz - not a harmonic). If these intermodulate, the distortion will not be harmonically related to the original signal, and will sound much worse.
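The mirroring arithmetic in the paragraph above is easy to check. A minimal sketch, with hypothetical helper names, covering only the first image (the one reflected around fs/2):

```python
def image_frequency(f_hz, fs_hz):
    """First image of a baseband tone, mirrored around fs/2."""
    return fs_hz - f_hz

def is_harmonic(candidate_hz, fundamental_hz):
    """True if candidate is an integer multiple of the fundamental."""
    ratio = candidate_hz / fundamental_hz
    return abs(ratio - round(ratio)) < 1e-9

img = image_frequency(21000, 44100)
print(img)                        # 23100
print(is_harmonic(img, 21000))    # False
```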


On most music, neither will happen. If there is something significant (i.e. at as high a level as the audible material) in the ultrasonic region, it may be wise to attenuate it.


However, you make a good point, so I'll re-phrase my original statement. Rather than better or worse, I'll say this:

Compared to a direct feed from a microphone, a 96kHz recording may sound closer (i.e. more similar) to that direct feed than a 44.1kHz recording, or it may sound exactly the same. The 96kHz recording cannot sound less like the direct microphone feed than the CD quality recording.

Cheers,
David.

Reply #22
Quote
I agree with and understand the whole concept of the "filters", but I also believe, perhaps erroneously, that it is the quality of the filters, and not the theorem itself, that allows for quality audio encoding.

The theorem says that it's possible to store and reconstruct *any signal* (not just sines) that doesn't go over x Hz (i.e., doesn't have any frequency components higher than that) with a sampling rate of more than 2*x Hz. It assumes we have perfect filters. We don't really have those, but we have real-world filters that are close enough in practice.

The theorem says that we can do it, in theory, and we can indeed do well enough in practice.

It is not limited to sines in any way; forget about that, it's wrong.
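A small numeric sketch of that statement: sample a signal made of three tones (not a single sine), then rebuild its value at an instant between two samples using sinc interpolation. The tone frequencies, the evaluation point and the 2000-sample window on each side are assumptions for the example; truncating the ideal, infinitely long sum is what leaves a small residual error.

```python
import math

FS = 44100.0
TONES_HZ = (1000.0, 5000.0, 15000.0)   # all below FS / 2

def signal(t):
    """The 'analog' signal: a sum of three sines."""
    return sum(math.sin(2 * math.pi * f * t) for f in TONES_HZ)

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(t, half_width=2000):
    """Whittaker-Shannon interpolation, truncated to nearby samples."""
    center = round(t * FS)
    return sum(signal(n / FS) * sinc(t * FS - n)
               for n in range(center - half_width, center + half_width + 1))

t = 0.5 + 0.37 / FS      # a point that falls between two sample instants
print("original:     ", signal(t))
print("reconstructed:", reconstruct(t))
```

The reconstruction only ever sees the sample values, yet it lands very close to the original between samples; widening the window shrinks the remaining error, which is the "infinite length, perfect filter" idealization in practice.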

Reply #23
Quote
Quote
I agree with and understand the whole concept of the "filters", but I also believe, perhaps erroneously, that it is the quality of the filters, and not the theorem itself, that allows for quality audio encoding.

The theorem says that it's possible to store and reconstruct *any signal* (not just sines) that doesn't go over x Hz (i.e., doesn't have any frequency components higher than that) with a sampling rate of more than 2*x Hz. It assumes we have perfect filters. We don't really have those, but we have real-world filters that are close enough in practice.

The theorem says that we can do it, in theory, and we can indeed do well enough in practice.

It is not limited to sines in any way; forget about that, it's wrong.

I concur. Nyquist's theorem applies to any signal that is band-limited to half the sampling frequency. It doesn't have to be a pure sine; it can be a range of harmonics. Sampling causes this band to be replicated and mirrored about multiples of the sampling frequency, so the most bandwidth one can get without aliasing, even with brickwall-like filters, is half the sampling frequency.

Rather than going through the mathematics of Fourier transforms of sampled signals, I suggest someone dig up Dennis Gabor's classic paper, Theory of Communication. He's got quite a simple and intuitive way of deriving Nyquist's result. An amazing paper.
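The replication and mirroring described above can be seen directly in the sample values: a tone above half the sampling frequency produces exactly the same samples as its mirror image below it. The 30 kHz tone and the helper name are assumptions chosen for the sketch.

```python
import math

FS = 44100

def sample_cosine(freq_hz, count=32):
    """Samples of a unit cosine at the given frequency."""
    return [math.cos(2 * math.pi * freq_hz * n / FS) for n in range(count)]

above_nyquist = sample_cosine(30000)      # 30 kHz is above FS / 2 = 22.05 kHz
alias = sample_cosine(FS - 30000)         # its mirror image at 14.1 kHz

max_diff = max(abs(a - b) for a, b in zip(above_nyquist, alias))
print(max_diff)   # tiny: only floating-point rounding separates the two
```

Once sampled, the two tones cannot be told apart, which is why everything above half the sampling frequency has to be filtered out before conversion.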

Reply #24
I suppose where I get hung up on the sine-wave thang (besides the tutorials) is that, in order not to violate the not-greater-than-2-times criterion, the highest frequency that could be reproduced that wasn't a sine wave would be approximately 11.025 kHz: (44,100 / 2) / 2. In other words, the second harmonic of an 11.025 kHz wave would be 22.050 kHz, the highest allowable frequency within the 44,100 sampling rate.

And, of course, even then, there's no room for a third harmonic, so not a very complex signal. Even as I type this, it doesn't quite seem right, but I'm not sure how else to interpret the math.

Dex