Full Version: Why 24bit/48kHz/96kHz/
Hydrogenaudio Forums > Hydrogenaudio Forum > Scientific Discussion
unfortunateson
QUOTE(Crystaljuggler @ Jun 22 2006, 10:19) *

the limitations of your own optical nerves, which can only detect rates up to 25fps or thereabouts.


This sounds quite nonsensical to me
Mike Giacomelli
QUOTE(unfortunateson @ Jun 22 2006, 11:38) *

QUOTE(Crystaljuggler @ Jun 22 2006, 10:19) *

the limitations of your own optical nerves, which can only detect rates up to 25fps or thereabouts.


This sounds quite nonsensical to me


It's not correct, but it's not really relevant to the point he was getting at, so I don't think it's important.
legg
QUOTE(Crystaljuggler @ Jun 22 2006, 11:19) *

I'm a PC modder, and I'll do all sorts of things to get my PC to run faster and cooler and with more lights in it. I have benchmarks and tests that will show just how much faster my PC is than other people's. Numbers! Never mind that it's physically impossible to tell, from the user's point of view, whether you're going at 29.4 fps or 31.2 fps in a game. But if I was to sell you a graphics card I'd gloss over that, give you the numbers, and you would decide for yourself that the one that goes faster is the better one. Apart from cost, what real difference is it going to make to your life? Only one : it will make you feel better. Your experience of playing a game will be better. It's subjective, but you'll have more fun knowing that you're not losing out to the limitations of your own optical nerves, which can only detect rates up to 25fps or thereabouts. So : should progress of graphics technology be stopped, because it's "good enough"? Of course not.


On a PC it does matter how fast it can go or how much info it can process. Getting 200fps in game X might not be noticeable, but in the near future there will be a more demanding game Y that will get you roughly 20fps.

It is not the same with audio. Indeed you can improve the sound card, but there's no point in improving it beyond the point where the difference is imperceptible or negligible if you know that the future will not demand any more.

I don't see a problem with 96kHz/24-bit either, but it is much like killing a flea with a cannon.
Cosmo
The words "sound[s] better" should not be used to describe sensations that have nothing to do with hearing.
HbG
24 FPS was chosen because it was the lowest number to provide smooth-enough moving images. When there is a lot of movement in a scene, it becomes very easy to notice the jerkiness. Especially in a cinema.

In fast-paced computer games, 25fps equals unplayable for me. I seem to need about 40 as a minimum. Let's do some math. A fast turn at high mouse sensitivity can easily be ~500 degrees per second. At 25fps this equals 20 degrees per frame; with a FOV of 100 degrees this means a displacement of 1/5th of your screen per frame. Your eyes will have a lot of trouble tracking a moving object at such a coarse granularity. Framerates much higher than those needed for smooth images can also serve a purpose, as they reduce processing latency, which can make a difference when everyone is playing at their limits, like in cyberathletics.
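
A quick sketch of that arithmetic in Python, for anyone who wants to plug in their own numbers. The function names are made up for this example, and mapping the angle linearly onto the screen width is a simplifying assumption (a real perspective projection is not linear):

```python
def degrees_per_frame(turn_rate_deg_per_s: float, fps: float) -> float:
    """Angle the view rotates between two consecutive frames."""
    return turn_rate_deg_per_s / fps


def screen_fraction_per_frame(turn_rate_deg_per_s: float, fps: float, fov_deg: float) -> float:
    """Approximate share of the screen width an object jumps per frame.

    Assumes the angle maps linearly onto the screen, which is only a
    rough approximation.
    """
    return degrees_per_frame(turn_rate_deg_per_s, fps) / fov_deg


# 500 degrees per second and a 100 degree FOV are the figures from the post.
for fps in (25, 40, 85):
    step = degrees_per_frame(500, fps)
    share = screen_fraction_per_frame(500, fps, 100)
    print(f"{fps} fps: {step:.1f} degrees per frame, {share:.0%} of the screen width")
```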

I can tell the refresh rate of a monitor when it's displaying a light surface, up to about 85Hz. When servicing computers I try to guess the screen's refresh rate before checking it, and I'm nearly always right. Not a double-blind test, but as good as it gets. :) LCDs are exempt from this of course, but at the right angle in direct sunlight I've seen the old-style LCD screens of digital watches flicker too. Curiously enough, I can also notice some plasma screens refresh like TVs. Spotting the 100Hz TVs in a TV store is pretty easy as well.


Sound isn't video, however. :)
crimsontide
While it's a whole different argument, I agree with the above post: 24fps may be the cinema's moving-image benchmark, but it's not the limit of visual perception by any means.

In the same way, 128kbit MP3 is not "CD quality" audio, yet it is good enough for the general populace with mobile MP3 players.

E.g.:
I bought 2 albums on iTunes, 128kbit AACs, and they sound terrible. The album in question is Less than Jake (trumpets sound OK, but the cymbals sound like gravel in my ears)

I can also tell the difference (I used to be a video technician for Digital Video Computing in Sussex, UK) between 50Hz and 75Hz, for example, and between 75 and 85 (gets harder). Between 85 and 100 is difficult, although I think I would get reasonable results in double-blind tests.

But no one is saying that CD audio is DVD-A quality.

They are saying that no one can scientifically prove that they can tell a difference.

But that's like saying a gherkin is a good gherkin:
How many gherkins have you tried? What kind of vinegar are they pickled in? How long have they been pickled for? Did you read the label before tasting and voting on it?

Frankly I don't give a damn - I know a good gherkin when I taste one, and I know a bad one too.

I think 96kHz/24-bit is a good place to stop for audio, because I don't see why we should have the accuracy/reproduction of an analogue source limited in any way. The mathematical proof is that DVD-A is far better when mixed down from an analogue recording session.

infinite wave sampled to CD Quality

is an exponentially poorer reproduction

than sampling @ DVD-a quality.


Enough of the ABX blind tests... they are subjective... YOU are listening through your EARS. Just like I did - I couldn't tell the difference, but I say, when we have a lot of cannons and cannonballs, let's kill fleas with 'em!
stephanV
You are not killing fleas, you are shooting holes in the air. And in your wallet. :P
Mike Giacomelli
QUOTE(crimsontide @ Jun 23 2006, 05:32) *


I think 96khz 24bit is a good place to stop for audio, because I don't see why we should have the accuracy/reproduction of an analogue source limited in any way. The mathematical proof is - that DVD-A is far better when mixed down from an analogue recording session.

infinite wave sampled to CD Quality

is an exponentially poorer reproduction

than sampling @ DVD-a quality.



Analog recordings don't have infinite resolution, if that's what you're trying to say.

Even ignoring that, saying something is a "mathematical proof" is meaningless if you neglect to include the actual math!

crimsontide
QUOTE(Mike Giacomelli @ Jun 23 2006, 07:53) *

Analog recordings don't have infinate resolution, if thats what you're trying to say.

Even ignoring that, saying something is a "mathematical proof" is meaningless if you neglect to include the actual math!


Alright, they are limited by the accuracy of the voltage applied to the magnetic head during recording, and the quality/thickness/width/age of the tape, which in turn is limited by the musical signal you are recording to the tape etc. etc. blah... Jeez, there are literally billions of factors affecting it, but at least they also affect digital recording, excluding the voltage/current accuracies applied to the recording head and the efficiency and response curve of the recording head.

I didn't do any maths to prove that 1 to the power of 16 is exponentially smaller dynamic resolution than 1 to the power of 24. Now are you happy that I've wasted 2 minutes telling you what you should already know if you are posting comments in this thread? Go look up "exponential" and prove it yourself.

Poking holes in my post while not addressing the point is a classic strawman argument - congratulations.
Or are you suggesting that it's not exponentially a poorer representation? No - you didn't do that - you just poked holes to make it SEEM like my point held no water. I don't like that - it's childish and totally unscientific. Go away.

It's Friday - I'm hot, at work, and I was just making a point without POINTLESSLY DELVING into UNNECESSARY DETAILS.
stephanV
QUOTE(crimsontide @ Jun 23 2006, 16:50) *

I didnt do any maths to prove that 1 to the power of 16 is exponentially smaller dynamic resolution than 1 to the power of 24, now are you happy that ive wasted 2 minutes telling you what you already should know to be posting comment in this thread? Go look up "exponential" and prove it yourself.

In my math book 1^16 = 1^24. :P :P :P

I assume you mean 2^16 < 2^24.

But the question is whether the extra resolution will bring you anything you can hear. A higher dynamic range might help a little in very extraordinary cases, but a higher sampling rate won't. But if you have nothing better to spend your money on, go ahead.
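
For the record, the numbers behind the correction can be checked in a few lines of Python. This is only a sketch of the arithmetic; the function name is invented for the example:

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct sample values a linear PCM word of `bits` bits can hold."""
    return 2 ** bits


# 1 raised to any power is still 1, so the base has to be 2.
print(1 ** 16 == 1 ** 24)

levels_16 = quantization_levels(16)
levels_24 = quantization_levels(24)
print(f"16 bit: {levels_16} levels")
print(f"24 bit: {levels_24} levels")
print(f"24 bit has {levels_24 // levels_16} times as many levels as 16 bit")
```

More levels is a fact; whether the difference is audible is a separate question that this arithmetic does not answer.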
crimsontide
QUOTE(stephanV @ Jun 23 2006, 09:25) *

QUOTE(crimsontide @ Jun 23 2006, 16:50) *

I didnt do any maths to prove that 1 to the power of 16 is exponentially smaller dynamic resolution than 1 to the power of 24, now are you happy that ive wasted 2 minutes telling you what you already should know to be posting comment in this thread? Go look up "exponential" and prove it yourself.

In my math book 1^16 = 1^24. tongue.gif tongue.gif tongue.gif

I assume you mean 2^16 < 2^24.

But the question is if the extra resolution will bring you anything you can hear. And a higher dynamic range might help a little in very extra-ordinary cases. But a higher sampling rate won't. But if you have nothing better to spend your money on, go ahead.


pwned.

I told ya it was a Friday!!!! And it really is hot in my office..... hahaha

I agree with you essentially, but my point is - why not, when we can make CD players that read DVD-A as well? I don't see.....my hard drive is now 320+160 gigabytes, almost half a terabyte. My first HD was 40 megabytes!!!

As long as the production takes advantage of it at every stage, then I think it's worth doing as a last step to arguably transparent digital sampling.
WmAx
QUOTE(dobyblue @ Jul 13 2006, 16:35) *
...


This is the scientific discussion forum. You are posting speculation(s), which are useless in this context, and parts of your post where you make audible claims without acceptable perceptual tests to back them up, may actually be a violation of the TOS of this website.

-Chris
Taz PA-C
The reason we need more than 16 bit audio is because the human ear can hear through the noise floor level. You can easily prove this yourself. Go to a crowded party where the noise level is very high. If you couldn't hear below this noise level, a private conversation between two people would be inaudible over the noise. Imagine if suddenly one of them mentions your name in a less than complimentary way. Suddenly, all the noise in the room is now quiet, and you can hear every word they say, even though it was just noise previously. Human hearing is like that. Now imagine that you are listening to a band with several voices, and in the background is a tune that is being played very softly, but that is what you want to hear. If the bit level is deep enough, you will be able to tune it in, otherwise, it is just interference to the main sound level. I wish we had 32 bit/64kHz stereo minimum for all recordings, that would do justice to any music.
This is my first post. The people here seem very educated. I am educated too, but I am not an engineer or sound technician. This is my personal opinion, but I wish that the subject of sound reproduction wasn't so cold and clinical, too many assumptions are made that make cold clinical sense, but not sense if you factor in the human element. The subject of not being able to hear below the noise floor is a perfect example.
Lyx
QUOTE(Taz PA-C @ Jul 30 2006, 20:30) *

The reason we need more than 16 bit audio is because the human ear can hear through the noise floor level. This is easily proven by yourself. Go to a crowded party where the noise level is very high. If you couldn't hear below this noise level, a private conversation between two people would be inaudible over the noise. Imagine if suddenly one of them mentions your name in a less than complimentery way. Suddenly, all the noise in the room is now quiet, and you can hear every word they say, even though they were noise previously.

Your chain of argument is invalid, because you are mixing up *MASKING* and noise floor. The two are by no means the same. The fact that we can change "how we filter information" has nothing to do with the noise floor. This is an argument from ignorance.

- Lyx
TBeck
QUOTE(Taz PA-C @ Jul 30 2006, 20:30) *

The reason we need more than 16 bit audio is because the human ear can hear through the noise floor level. This is easily proven by yourself. Go to a crowded party where the noise level is very high. If you couldn't hear below this noise level, a private conversation between two people would be inaudible over the noise. Imagine if suddenly one of them mentions your name in a less than complimentery way. Suddenly, all the noise in the room is now quiet, and you can hear every word they say, even though they were noise previously. Human hearing is like that. Now imagine that you are listening to a band with several voices, and in the background is a tune that is being played very softly, but that is what you want to hear. If the bit level is deep enough, you will be able to tune it in, otherwise, it is just interference to the main sound level. I wish we had 32 bit/64KHz stereo minimum for all recordings, that would do justice to any music.

I don't understand why this should be an argument for the need for higher bit resolutions. Perception can be selective, that is true. The selection here can be made on special properties of the signal (frequency, patterns, the meaning it has for you...). But why should you need a high bit resolution or dynamic range to perform this selection? This would only be true if the dynamic range were too small to let you discriminate the properties you need for your selection process. I am quite sure that for your examples even far less than 16 bits would be sufficient to perform the selection.
greynol
Let's take a cold clinical approach to the example you have presented here:
QUOTE(Taz PA-C @ Jul 30 2006, 11:30) *
Go to a crowded party where the noise level is very high. If you couldn't hear below this noise level, a private conversation between two people would be inaudible over the noise. Imagine if suddenly one of them mentions your name in a less than complimentery way. Suddenly, all the noise in the room is now quiet, and you can hear every word they say, even though they were noise previously.

The SPL of a crowded party that is loud is probably no more than 95dB, 105dB tops with loud music going.

The SPL of a "whisper" that can be heard between two people at such a party will be no less than 35dB.

The difference between these levels is at most going to be 70dB. The SNR due to the quantization noise of 16 bits is 96dB, more than enough to capture that whisper at a noisy party.
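
The same estimate as a small Python sketch, using the rule of thumb of about 6.02 dB per bit (20*log10(2)). The SPL figures are the ones from this post; the function and variable names are only illustrative, and the result ignores dither and everything else about a real recording chain:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of range per bit


def bits_needed(range_db: float) -> int:
    """Smallest whole number of bits whose range covers `range_db`."""
    return math.ceil(range_db / DB_PER_BIT)


party_spl_db = 105    # loud party with music, upper estimate from the post
whisper_spl_db = 35   # quietest whisper still heard, lower estimate from the post

span_db = party_spl_db - whisper_spl_db
range_16_db = 16 * DB_PER_BIT

print(f"span to capture: {span_db} dB")
print(f"bits needed:     {bits_needed(span_db)}")
print(f"16 bit range:    {range_16_db:.1f} dB, margin {range_16_db - span_db:.1f} dB")
```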
jhbretz
While you are right that the SNR of the quantization noise of a 16 bit signal is 96dB, keep in mind that we're talking about the *quality* of the quietest sound. So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.

The real question should be "what bit resolution is necessary *FOR THE QUIETEST SOUND*." At 16 bits, the quiet sounds are audible, but they don't have the same detail as the louder sounds.

Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use Ogg Vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.

You could still do a lot of other things wrong. I use Shure E5C headphones driven with a custom sigma-delta converter and amp. Things are real clean and real quiet. The difference between 16 bit and 24 bit is just OBVIOUS. Especially quiet sounds with some texture, like the echoes after a background percussion hit.

QUOTE(TBeck @ Jul 30 2006, 13:57) *

QUOTE(Taz PA-C @ Jul 30 2006, 20:30) *

The reason we need more than 16 bit audio is because the human ear can hear through the noise floor level. This is easily proven by yourself. Go to a crowded party where the noise level is very high. If you couldn't hear below this noise level, a private conversation between two people would be inaudible over the noise. Imagine if suddenly one of them mentions your name in a less than complimentery way. Suddenly, all the noise in the room is now quiet, and you can hear every word they say, even though they were noise previously. Human hearing is like that. Now imagine that you are listening to a band with several voices, and in the background is a tune that is being played very softly, but that is what you want to hear. If the bit level is deep enough, you will be able to tune it in, otherwise, it is just interference to the main sound level. I wish we had 32 bit/64KHz stereo minimum for all recordings, that would do justice to any music.

I don't understand, why this should be an argument for the need of higher bit resolutions. Perception can be selective, that is true. The selection here can be made by special properties of the signal (Frequency, patterns, the meaning it has for you...). But why should you need a high bit resoulution or dynamics to perform this selection? This would only be true, if the dynamic range would be too small to let you discriminate the properties you need for your selection process. I am quite sure, that for your examples even far less than 16 bit would be sufficient to perform the selection.

Axon
QUOTE(jhbretz @ Aug 14 2006, 12:33) *
While you are right that the SNR of the quantization noise of a 16 bit signal is 96dB, keep in mind that we're talking about the *quality* of the quietest sound. So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.

The real question should be "what bit resolution is necessary *FOR THE QUIESTEST SOUND*." At 16 bits, the quiet sounds are audible, but they don't have the same detail as the louder sounds.

That is a strawman argument and, in fact, is not the question.

Clearly, if you want to encode a signal using only the least significant bit, you will have large amounts of distortion - after all, it's a 1-bit signal. But those signals simply do not exist in real life. The most important reason is that nobody would want to listen to actual music on it - there's no known listening environment, or set of listeners, that can tolerate more than 60 dB of dynamic range. There is also no popular or classical music with 90 dB of dynamic range.

I don't think you are able to come up with a valid use case where the quantization noise is actually important, unless you want to cause permanent hearing loss in the listener. If you wanted to record the Space Shuttle taking off, then sure, you'd probably need 24 bits of precision. But that has no bearing on the use of CDs to store actual music, or failing that, sounds that people really want to listen to.

QUOTE
cool.gif Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.

You are confusing "accuracy" with "precision". Look it up. You will never get better than 16 bits of accuracy out of CD, even if you decode to 64-bit floating point, no matter which format you use.
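
A minimal sketch of the accuracy/precision distinction, assuming a plain rounding quantizer with no dither. The sample value and the `quantize` helper are made up for illustration. Once a value has been rounded to the 16-bit grid, storing it in a wider word adds digits but does not bring it any closer to the original:

```python
import math


def quantize(x: float, bits: int) -> float:
    """Round x in [-1, 1) to the nearest step of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale


# A hypothetical "analogue" sample value, chosen only for illustration.
original = math.sin(1.0) * 0.5

cd_sample = quantize(original, 16)        # what ends up on the CD
requantized_24 = quantize(cd_sample, 24)  # the CD value stored in a 24-bit word
as_float64 = float(cd_sample)             # or kept as a 64-bit float

error_16 = abs(original - cd_sample)
error_24 = abs(original - requantized_24)
error_f64 = abs(original - as_float64)

print(f"error of the 16-bit sample:       {error_16:.3e}")
print(f"error after moving it to 24 bits: {error_24:.3e}")
print(f"error after moving it to float64: {error_f64:.3e}")
```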

Woodinville
QUOTE(jhbretz @ Aug 14 2006, 10:33) *

So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.


Please go to a good reference library and read Bart Locanthi Sr's paper on dithering.

Thank you.
Mike Giacomelli
QUOTE(Woodinville @ Aug 14 2006, 12:11) *

QUOTE(jhbretz @ Aug 14 2006, 10:33) *

So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.


Please go to a good reference library and read Bart Locanthi Sr's paper on dithering.

Thank you.


Heh, that was my thought. You can do a lot with just one bit. Certainly more than a "square wave".
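
As a rough illustration (a generic dither demo, not something taken from the paper mentioned above), here is a Python sketch. A sine with a peak of 0.4 LSB is quantized once without dither and once with triangular (TPDF) dither. The seed, lengths and helper names are arbitrary choices for the example:

```python
import math
import random


def quantize(x: float) -> int:
    """Round to the nearest whole quantization step (1.0 equals one LSB)."""
    return round(x)


def tpdf_dither(rng: random.Random) -> float:
    """Triangular dither spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


def recovered_amplitude(output, reference):
    """Least-squares estimate of how much of `reference` is present in `output`."""
    return sum(o * r for o, r in zip(output, reference)) / sum(r * r for r in reference)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples, period = 20000, 100
sine = [math.sin(2 * math.pi * n / period) for n in range(n_samples)]
signal = [0.4 * s for s in sine]  # peak level of 0.4 LSB, below one step

plain = [quantize(x) for x in signal]
dithered = [quantize(x + tpdf_dither(rng)) for x in signal]

print("undithered:", recovered_amplitude(plain, sine))
print("dithered:  ", recovered_amplitude(dithered, sine))
```

Without dither every sample rounds to zero and the tone is simply gone. With dither the output is noisy, but the tone is still in there at roughly its original level.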
greynol
QUOTE(Mike Giacomelli @ Aug 14 2006, 13:06) *

heh that was my thought. You can do a lot with just one bit. Certainly more then a "square wave".
I tried to point that out to someone here once but it didn't go over too well.

http://www.hydrogenaudio.org/forums/index....&pid=414672
HotshotGG
QUOTE
Clearly, if you want to encode a signal using only the least significant bit, you will have large amounts of distortion - after all, it's a 1-bit signal. But those signals simply do not exist in real life. The most important reason is that nobody would want to listen to actual music on it - there's no known listening environment, or set of listeners, that can tolerate more than 60db of dynamic range. There is also no popular or classical music with 90db of dynamic range.


This is true under practical conditions. ;)
Pio2001
QUOTE(jhbretz @ Aug 14 2006, 19:33) *
The real question should be "what bit resolution is necessary *FOR THE QUIESTEST SOUND*."


Answer: 1 bit, because the quietest sound audible to the ear has no quality either.

QUOTE(jhbretz @ Aug 14 2006, 19:33) *
cool.gif Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.


No. Quantization introduces noise. When you decode at 24 bits, you have still got the 16-bit noise introduced by the ADC process. It doesn't disappear. It is encoded and decoded.

QUOTE(jhbretz @ Aug 14 2006, 19:33) *
The difference between 16 bit and 24 bit is just OBVIOUS. Especially quiet sounds with some texture, like the echos after a background percussion hit.


Please provide ABX results.
Woodinville
QUOTE(HotshotGG @ Aug 14 2006, 15:21) *

QUOTE
Clearly, if you want to encode a signal using only the least significant bit, you will have large amounts of distortion - after all, it's a 1-bit signal. But those signals simply do not exist in real life. The most important reason is that nobody would want to listen to actual music on it - there's no known listening environment, or set of listeners, that can tolerate more than 60db of dynamic range. There is also no popular or classical music with 90db of dynamic range.


This is true under practical conditions. wink.gif



Really, if I encode a signal with a massively oversampled 1-bit signal, I'll get large amounts of distortion?

***cough***

Really?????

QUOTE(jhbretz @ Aug 14 2006, 10:33) *
I use Shure E5C headphones driven with a custom Sigma delta converter and amp. Things are real clean and real quiet. The difference between 16 bit and 24 bit is just OBVIOUS. Especially quiet sounds with some texture, like the echos after a background percussion hit.

(Emphasis added)

That's a pretty good 1-bit system, now, isn't it?
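
For anyone who has not met the idea, a toy first-order sigma-delta modulator in Python shows how a heavily oversampled 1-bit stream can carry a value with much finer resolution than one bit. This is a textbook-style sketch with made-up names and an arbitrary oversampling figure, not a model of any particular converter:

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: turns values in [-1, 1] into +1/-1 bits."""
    integrator = 0.0
    feedback = 0.0  # previous output bit, fed back and subtracted from the input
    bits = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


def mean(values):
    """Plain average; stands in for the low-pass filter of a real decoder."""
    return sum(values) / len(values)


oversampling = 10000  # hypothetical number of 1-bit samples per held input value
for level in (-0.5, 0.0, 0.3):
    bits = sigma_delta_1bit([level] * oversampling)
    print(f"input {level:+.2f} -> average of {oversampling} one-bit samples: {mean(bits):+.4f}")
```

Each output sample is only ever +1 or -1, yet the average over many of them tracks the input closely.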
Patsoe
QUOTE(Woodinville @ Aug 15 2006, 19:32) *

[...]

Really, if I encode a signal with a massively oversampled 1-bit signal, I'll get large amounts of distortion?

***cough***

Really?????

[...] (Emphasis added)

That's a pretty good 1-bit system, now, isn't it?


I think Axon was talking about the LSB in a multi-bit system, which is not the same as a 'massively oversampled 1-bit signal' - it's not oversampled on the disc... Also, the sigma-delta converter you're boldfacing is a DAC: it's a decoder, so it can do nothing about your encoding...

Maybe I just don't get the joke?
jhbretz
Okay, so you're saying that to hear the 1 bit sound, you would end up blasting your ears with everything else.

I agree.

Now let's ask "what bit resolution is necessary" for this quietest sound? When I say "quietest sound" I mean the quietest sound on the track of which you can still discern its quality.

If this "quietest sound" were 8 bits, would you be able to tell? That means that it has 0.4% steps on it. I would guess that this is probably around the threshold of hearing.

This quietest sound would only have to be 48dB down on the 16 bit recording to render it 8 bit precision.

I don't know the answer to this question, but the limit of perception is definitely bigger than 1 bit, so the 96dB figure is strictly an upper limit.
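
The arithmetic behind those figures, as a Python sketch. It assumes the usual rule of about 6.02 dB per bit and an undithered quantizer; the function names are made up for the example. It only restates the step sizes and says nothing about what is audible:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def bits_left(total_bits: int, db_below_full_scale: float) -> float:
    """Bits still exercised by a sound that peaks this far below full scale."""
    return total_bits - db_below_full_scale / DB_PER_BIT


def step_percent(bits: float) -> float:
    """Size of one quantization step as a percentage of the sound's own peak."""
    return 100 / 2 ** bits


for level_db in (0, 48, 60):
    bits = bits_left(16, level_db)
    print(f"{level_db:2d} dB down: about {bits:.1f} bits, steps of {step_percent(bits):.3f}%")
```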

QUOTE(Axon @ Aug 14 2006, 13:04) *

QUOTE(jhbretz @ Aug 14 2006, 12:33) *
While you are right that the SNR of the quantization noise of a 16 bit signal is 96dB, keep in mind that we're talking about the *quality* of the quietest sound. So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.

The real question should be "what bit resolution is necessary *FOR THE QUIESTEST SOUND*." At 16 bits, the quiet sounds are audible, but they don't have the same detail as the louder sounds.

That is a strawman argument and, in fact, is not the question.

Clearly, if you want to encode a signal using only the least significant bit, you will have large amounts of distortion - after all, it's a 1-bit signal. But those signals simply do not exist in real life. The most important reason is that nobody would want to listen to actual music on it - there's no known listening environment, or set of listeners, that can tolerate more than 60db of dynamic range. There is also no popular or classical music with 90db of dynamic range.

I don't think you are able to come up with a valid use case where the quantization noise is actually important, unless you want to cause permanent hearing loss in the listener. If you wanted to record the Space Shuttle taking off, then sure, you'd probably need 24 bits of precision. But that has no bearing on the use of CDs to store actual music, or failing that, sounds that people really want to listen to.

QUOTE
cool.gif Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.

You are confusing "accuracy" with "precision". Look it up. You will never get better than 16 bits of accuracy out of CD, even if you decode to 64-bit floating point, no matter which format you use.

jhbretz
I will definitely look up the reference. Thank you.

But the question of dithering is a side issue. You can certainly use the LSB of any data stream and modulate it at a higher frequency to emulate a higher precision, provided that you oversample at a sufficient rate.

My main point was that on a 16 bit recording, the softer sounds are not 16 bit precision. This raises the question of what precision is transparent for these softer sounds.

QUOTE(Woodinville @ Aug 14 2006, 14:11) *

QUOTE(jhbretz @ Aug 14 2006, 10:33) *

So if this quietest sound was at -96dB, then it would have no quality at all because it would be a square wave.


Please go to a good reference library and read Bart Locanthi Sr's paper on dithering.

Thank you.
Patsoe
jhbretz, your post above is very cryptic... could you take a shot at rewriting that?

edit: oh - I meant the first of the two above here :)
jhbretz
This quote of mine below intuitively sounds like nonsense, doesn't it? (Maybe this is why I got "warned") But hear me out - I was just posting this to see whether anyone was listening. And someone is! Very exciting!

Let's say for the purposes of discussion, the "softest sound" on a recording is at -60dB. On a 16 bit recording, that means that it is only at 6 bit precision. It has "steps" of 1.5% signal amplitude at the sampling frequency. Now the act of encoding with Vorbis fits this 6 bit signal with the RMS minimum error frequency components. In doing so, an interpolated signal has been created that doesn't have the 1.5% steps in it. (CT Fourier transform coefficients are discrete but the frequencies themselves are continuous)

The fact that sounds in general are very well correlated means that the interpolated signal is a good fit for what the signal should have been.

QUOTE(jhbretz @ Aug 14 2006, 12:33) *

cool.gif Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.
Patsoe
QUOTE(jhbretz @ Aug 15 2006, 20:34) *

This quote of mine below intuitively sounds like nonsense, doesn't it? (Maybe this is why I got "warned") But hear me out - I was just posting this to see whether anyone was listening. And someone is! Very exciting!

Let's say for the purposes of discussion, the "softest sound" on a recording is at -60dB. On a 16 bit recording, that means that it is only at 6 bit precision. It has "steps" of 1.5% signal amplitude at the sampling frequency. Now the act of encoding with vorbis fits this 6 bit signal with the RMS minimum error frequency components. In doing so, an interpolated signal has been created that doesn't have the 1.5% steps in it. (CT Fourier transform coeffiecients are discrete but the frequencies themselves are continuous)

The fact that sounds in general are very well correlated means that the interpolated signal is a good fit for what the signal should have been.


Ah... I think I know what you mean by "steps of 1.5%" now. You mean the quantization error has a relative size of 1.5% on a signal of 2^6 amplitude, right?

Well, the bad news is: no Fourier transform is going to give you back the lost information. The error just takes another form in another domain.
Thus, without the transform, the quantized signal is just as good a fit for what it should have been. And since encoding to vorbis does a bit more than just transforming to the frequency domain, the fact is that the original wav is a better fit than the ogg.
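
A small numerical check of the "error just takes another form" point, using a plain DFT in Python as a stand-in for a transform (Vorbis itself does considerably more than this). The test tone, block length and helper names are assumptions for the sketch. By Parseval's relation the energy of the quantization error is the same in both domains:

```python
import cmath
import math


def dft(x):
    """Direct O(n^2) discrete Fourier transform, fine for a short block."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def quantize(x: float, bits: int) -> float:
    """Round x in [-1, 1) to the nearest step of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale


n = 64
# A quiet test tone about 60 dB below full scale (hypothetical example signal).
original = [0.001 * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
quantized = [quantize(v, 16) for v in original]
error = [q - o for q, o in zip(quantized, original)]

time_energy = sum(e * e for e in error)
freq_energy = sum(abs(c) ** 2 for c in dft(error)) / n

print(f"error energy in the time domain:      {time_energy:.6e}")
print(f"error energy in the frequency domain: {freq_energy:.6e}")
```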

If you got a warning, I think it wasn't for this erroneous thought, but rather for claiming you could obviously hear it...
jhbretz
Please see previous post where I said:

Let's say for the purposes of discussion, the "softest sound" on a recording is at -60dB. On a 16 bit recording, that means that it is only at 6 bit precision. It has "steps" of 1.5% signal amplitude at the sampling frequency. Now the act of encoding with Vorbis fits this 6 bit signal with the RMS minimum error frequency components. In doing so, an interpolated signal has been created that doesn't have the 1.5% steps in it. (CT Fourier transform coefficients are discrete but the frequencies themselves are continuous)

The fact that sounds in general are very well correlated means that the interpolated signal is a good fit for what the signal should have been.

QUOTE(Pio2001 @ Aug 15 2006, 13:11) *


QUOTE(jhbretz @ Aug 14 2006, 19:33) *
cool.gif Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.


No. Quantization introduces noise. When you decode at 24 bits, you still have got the 16 bits noise introduced by the ADC process. It doesn't disappear. It is encoded and decoded.

QUOTE(jhbretz @ Aug 14 2006, 19:33) *
The difference between 16 bit and 24 bit is just OBVIOUS. Especially quiet sounds with some texture, like the echos after a background percussion hit.


Please, provide ABX results.


I just did an informal ABX and was able to discern a 16 bit wav from its vorbis encoded & 24 bit decoded equivalent consistently. I did another test decoding the vorbis at 16 bit and 24 bits and got the same results. How should I go about providing these results?
Mike Giacomelli
QUOTE(jhbretz @ Aug 15 2006, 13:02) *

Please see previous post where I said:

Let's say for the purposes of discussion, the "softest sound" on a recording is at -60dB. On a 16 bit recording, that means that it is only at 6 bit precision. It has "steps" of 1.5% signal amplitude at the sampling frequency. Now the act of encoding with vorbis fits this 6 bit signal with the RMS minimum error frequency components. In doing so, an interpolated signal has been created that doesn't have the 1.5% steps in it. (CT Fourier transform coefficients are discrete but the frequencies themselves are continuous)

The fact that sounds in general are very well correlated means that the interpolated signal is a good fit for what the signal should have been.

QUOTE(Pio2001 @ Aug 15 2006, 13:11) *


QUOTE(jhbretz @ Aug 14 2006, 19:33) *
B) Now check this out. You can get 24 bit audio from 16 bit CDs - how? Use ogg vorbis at a very high quality level. The file now contains *frequency* domain information. That means that if there is quantization, it will disappear with a decoder running at 24 bit accuracy.


No. Quantization introduces noise. When you decode at 24 bits, you still have got the 16 bits noise introduced by the ADC process. It doesn't disappear. It is encoded and decoded.

QUOTE(jhbretz @ Aug 14 2006, 19:33) *
The difference between 16 bit and 24 bit is just OBVIOUS. Especially quiet sounds with some texture, like the echos after a background percussion hit.


Please, provide ABX results.


I just did an informal ABX and was able to discern a 16 bit wav from its vorbis encoded & 24 bit decoded equivalent consistently. I did another test decoding the vorbis at 16 bit and 24 bits and got the same results. How should I go about providing these results?


Describe the process you did to create the test, list the sample(s) you used, and post the score from your ABX test.
Patsoe
QUOTE(jhbretz @ Aug 15 2006, 21:02) *

I just did an informal ABX and was able to discern a 16 bit wav from its vorbis encoded & 24 bit decoded equivalent consistently. I did another test decoding the vorbis at 16 bit and 24 bits and got the same results. How should I go about providing these results?


You'll need to go from informal to formal ;)

* what equipment were you using? (e.g. if you played back the 24bit file on a 16bit device it might sound different due to lack of dithering)

* were the levels the same after the encode/decode cycle? (you can check this in your wav-editor)

* how did you do the blinding (what software)?

* how many rounds did you do (without checking how you were doing!)?
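
For the score itself, a common way to report an ABX result is the chance of doing at least that well by pure guessing. A minimal sketch, assuming every round is an independent 50/50 guess; the name `abx_p_value` is made up for illustration and is not taken from any ABX tool.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of at least `correct` right answers out of `trials` by guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

print(round(abx_p_value(12, 16), 4))  # 0.0384
print(round(abx_p_value(9, 16), 4))   # 0.4018
```

So 12 out of 16 would happen by luck roughly 4% of the time, while 9 out of 16 is entirely compatible with guessing. The number of rounds has to be fixed before starting for this figure to mean anything.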

Woodinville
QUOTE(jhbretz @ Aug 15 2006, 11:51) *
Now let's ask "what bit resolution is necessary" for this quietest sound? When I say "quietest sound" I mean the quietest sound on the track of which you can still discern its quality.


You are, I presume, aware of the absolute threshold of hearing?

I should, I suppose, also point out the fact that the atmosphere, being made of molecules of gasses (N2, O2, CO2, H2O primarily) and atoms of Argon, is discrete in nature, and that "air pressure" is in fact exactly the result of the momentum of the molecules and atoms bouncing off something.

This means that there is a noise level to the atmosphere. For something the size of an eardrum, it's reasonably close to 6dB SPL, white noise, in the 20Hz to 20kHz range. This is a noise in the system that can not be removed, ever, unless you remove the air from both sides of the eardrum, which I believe may create some difficulties with the listener involved.

Standard understanding of noise-masking-noise shows that if you have noise in a critical band, and you wish to use it to mask some smaller noise, that an SNR of 3 to 4dB is sufficient to mask that noise.

That is what we have when we have quantization noise being masked by the atmosphere (which, by the way, is just SLIGHTLY below the threshold of hearing in the most sensitive ERB), with the quantization noise being at, oh, about 3dB SPL. That puts the peak level from a system operating in the quietest room on earth at 98dB SPL for peak levels. For much presentation, this is completely sufficient.

Now, if we consider a quiet, normal room, what happens? We have at least 20dB more headroom. So, in other words, 16 bits is likely to be more than sufficient if we have normal speakers, which may get to 110dB on peaks without frying, in a QUIET normal room.

If we argue for 18 bits, we have enough dynamic range to get from the noise level of the atmosphere to the loudest most speakers can get, and certainly to the loudest that the human ear should ever actually be exposed to.

If we want to reproduce a rimshot, up close, first we have to invent new speakers. We should also, in that case, encourage research into hair-cell regeneration on the Organ of Corti.

If, just for kicks, we argue for 32 bits, we can go from the atmospheric noise level (6dB) to +4dB re: 1 ATMOSPHERE RMS. Of course nobody can realize that level on the negative excursion side, and any one device is only likely to render it on the positive side for one use... :)
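
The bit-depth figures above can be roughly reproduced with the rule of thumb of about 6.02 dB per bit. This is a back-of-the-envelope sketch, not the exact calculation behind the post: it assumes the peak level is simply the noise level plus 6.02 dB times the number of bits, and the names `peak_spl` and `ONE_ATM_RMS_SPL` are made up for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)                    # about 6.02 dB per bit
ONE_ATM_RMS_SPL = 20 * math.log10(101325 / 20e-6)  # 1 atmosphere re 20 uPa, about 194 dB SPL

def peak_spl(bits, noise_floor_spl):
    """Peak level reachable when the quantization noise sits at noise_floor_spl."""
    return noise_floor_spl + bits * DB_PER_BIT

# 16 bits with the quantization noise at 3 dB SPL, as in the quietest-room case
print(round(peak_spl(16, 3.0), 1))
# 18 and 32 bits starting from the 6 dB SPL atmospheric noise level
print(round(peak_spl(18, 6.0), 1))
print(round(peak_spl(32, 6.0) - ONE_ATM_RMS_SPL, 1))  # dB above 1 atmosphere RMS
```

Under these assumptions the results land within a dB or two of the figures quoted in the post (about 98dB for 16 bits, and a few dB above 1 atmosphere RMS for 32 bits).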
jhbretz
I know where you are coming from. You are saying "you can't possibly create new information" in decoding 16 bit data at 24 bits. True - you can't, but the key is that sound is well correlated. If we were talking about random noise, then it wouldn't work. But sounds that come from real instruments have amplitudes that don't jump around a whole lot between samples. So I would argue that doing an interpolation (and in fact much better than a linear interpolation) is valid.

In terms of "creating information," you aren't - you're just guessing that the source of the sound was correlated. And since uncorrelated things sound awful, this is a good guess.

The frequency components of the actual sound are limited (real instruments have limited overtones) and therefore the vorbis encoder can store these limited number of frequencies and get very close to the original sound.

QUOTE(Patsoe @ Aug 15 2006, 15:02) *

QUOTE(jhbretz @ Aug 15 2006, 20:34) *

This quote of mine below intuitively sounds like nonsense, doesn't it? (Maybe this is why I got "warned") But hear me out - I was just posting this to see whether anyone was listening. And someone is! Very exciting!

Let's say for the purposes of discussion, the "softest sound" on a recording is at -60dB. On a 16 bit recording, that means that it is only at 6 bit precision. It has "steps" of 1.5% signal amplitude at the sampling frequency. Now the act of encoding with vorbis fits this 6 bit signal with the RMS minimum error frequency components. In doing so, an interpolated signal has been created that doesn't have the 1.5% steps in it. (CT Fourier transform coefficients are discrete but the frequencies themselves are continuous)

The fact that sounds in general are very well correlated means that the interpolated signal is a good fit for what the signal should have been.


Ah... I think I know what you mean by "steps of 1.5%" now. You mean the quantization error has a relative size of 1.5% on a signal of 2^6 amplitude, right?

Well, the bad news is: no Fourier transform is going to give you back the lost information. The error just takes another form in another domain.
Thus, without the transform, the quantized signal is just as good a fit for what it should have been. And since encoding to vorbis does a bit more than just transforming to the frequency domain, the fact is that the original wav is a better fit than the ogg.

If you got a warning, I think it wasn't for this erroneous thought, but rather for claiming you could obviously hear it...
Patsoe
QUOTE(jhbretz @ Aug 15 2006, 21:54) *

I know where you are coming from. You are saying "you can't possibly create new information" in decoding 16 bit data at 24 bits. True - you can't, but the key is that sound is well correlated. If we were talking about random noise, then it wouldn't work. But sounds that come from real instruments have amplitudes that don't jump around a whole lot between samples. So I would argue that doing an interpolation (and in fact much better than a linear interpolation) is valid.


Of course the values can't jump around a lot between samples... that's what you ensure by low-passing the signal before a/d conversion.
What you want to accomplish - interpolating out the "steps" in the digital signal - is exactly what low-passing after d/a conversion does for you. And indeed that's better than a linear interpolation.

Seriously, your statement that the Fourier coefficients are quantized says it all. There is no improvement - the errors are now in those coefficients and will still sound the same.
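
A small experiment makes the point concrete. It is only a sketch under stated assumptions: a plain lossless DFT stands in for the codec (Vorbis itself is lossy and does more than a transform), the test tone and all names are made up for illustration, and the "true" signal is known here only because it is synthesized.

```python
import cmath
import math

N = 256
FULL_16 = 32767
FULL_24 = 8388607

# "True" signal before any quantization: a sine at -60 dBFS.
amp = 10 ** (-60 / 20)
x = [amp * math.sin(2 * math.pi * 5.3 * n / N) for n in range(N)]

def quantize(signal, full_scale):
    """Round each sample to the nearest step of a PCM word with the given full scale."""
    return [round(s * full_scale) / full_scale for s in signal]

def dft(signal):
    return [sum(s * cmath.exp(-2j * math.pi * k * n / N) for n, s in enumerate(signal))
            for k in range(N)]

def idft(spectrum):
    return [sum(c * cmath.exp(2j * math.pi * k * n / N) for k, c in enumerate(spectrum)).real / N
            for n in range(N)]

def rms_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

cd = quantize(x, FULL_16)                      # what ends up on the 16 bit CD
decoded_24 = quantize(idft(dft(cd)), FULL_24)  # to the frequency domain and back, stored at 24 bits

err_cd = rms_error(cd, x)
err_decoded = rms_error(decoded_24, x)
print(err_cd, err_decoded)  # the two errors are essentially the same size
```

The round trip reproduces the 16 bit samples, error included; storing the result in a 24 bit word does not bring the output any closer to the unquantized signal.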

QUOTE(Woodinville @ Aug 15 2006, 21:23) *

I should, I suppose, also point out the fact that the atmosphere, being made of molecules of gasses (N2, O2, CO2, H2O primarily) and atoms of Argon, is discrete in nature, and that "air pressure" is in fact exactly the result of the momentum of the molecules and atoms bouncing off something.


Maybe you shouldn't have pointed that out ;)
It serves no purpose at all for making your point (everything else in that post did not rely on the atomic structure of air), plus it seems to suggest that the discreteness of gasses would justify the discreteness in digital audio... that discreteness is on a completely different scale, and for all audio reproduction purposes on earth (not in outer space) we can consider air a continuous medium, I think :)
Woodinville
QUOTE(Patsoe @ Aug 15 2006, 14:28) *
Maybe you shouldn't have pointed that out ;)
It serves no purpose at all for making your point (everything else in that post did not rely on the atomic structure of air),


Actually, it sets an absolute lower limit on what detectable noise from an audio system can be, and I did that, ergo it is rather germane to the discussion. Since that sets a peak level of 98dB or so for a 16 bit system that meets that constraint, as I pointed out in that post, that rather does tie in to most audio reproduction.
QUOTE


plus it seems to suggest that the discreteness of gasses would justify the discreteness in digital audio...


Perhaps it seems to do so to you, it certainly relates to discrete levels in audio for me, since it does show, quite indisputably, the lowest useful level, EVER, for actual reproduced quantization noise. It does not assert that any particular form of discrete system is the right one, or that systems even must be discrete.

Since (we do remember our QM, yes?) all systems in the real world ARE discrete, your point seems rather pointless, as it were.
QUOTE


that discreteness is on a completely different scale, and for all audio reproduction purposes on earth (not in outer space) we can consider air a continuous medium, I think :)


Really, I don't think so. Just look at sound propagation through air as a function of moisture content. Right then and there, the issue rears its ugly head again.

As to "on a completely different scale", I have no idea what you mean. In fact, the threshold of hearing is very near the level of atmospheric noise at the eardrum, and very near the potential low-level noise of a 16 bit system set up for a reasonable playback level. That is hardly "a completely different scale" it is very, very much on the same part of the scale, and very very nearly equal. So whatever DO you mean?
Patsoe
QUOTE(Woodinville @ Aug 15 2006, 22:43) *

So whatever DO you mean?


Hey, no need to bring out the all-caps :)

OK, I have reread your post - and think I get it now. The point is that thermal motion would account for 6dB SPL already, right? I read too fast the first time, and missed that connection :) Also, now that I'm getting the message, I'm quite stunned by the fact that the absolute threshold of hearing would be this low.

QUOTE
Since (we do remember our QM, yes?) all systems in the real world ARE discrete, your point seems rather pointless, as it were.


I think I remember some QM :) But what effects that we're discussing would be lost in a classical treatment?

My point is that quantizing the audio signal would be just as legitimate if the real world was not discrete in any way. So I was thinking that bringing that point up blurs the discussion (and we're only making it worse now :P).

Different scale: I meant that the sheer number of atoms that touch your ear drum every "integration time" (of the sensory nerves, that is) would be such that there's no point discretizing that. Thinking some more about that and your point about thermal noise: can't we predict that 6dB thermal noise level within the framework of classical thermodynamics, without thinking of independent particles? edit: hmmm, I guess not... a classical gas pressure could just generate a constant force...
bhoar
If this is off-topic, I apologize...

I assume the thread (currently, at least) is only addressing the "why 24bit/48kHz/96kHz" issue for playback only purposes.

There are several reasons that recording, mixing and processing should be done at higher resolution and/or sampling rates than the targeted playback format, especially for music sourced from large #s of individual tracks...

-brendan
Woodinville
QUOTE(Patsoe @ Aug 15 2006, 15:37) *

QUOTE(Woodinville @ Aug 15 2006, 22:43) *

So whatever DO you mean?


Hey, no need to bring out the all-caps :)

OK, I have reread your post - and think I get it now. The point is that thermal motion would account for 6dB SPL already, right? I read too fast the first time, and missed that connection :) Also, now that I'm getting the message, I'm quite stunned by the fact that the absolute threshold of hearing would be this low.


It's not clear that it is, but it's within 5 dB at the frequency of your ear canal resonance. Needless to say it's hard to run this experiment.
QUOTE


I think I remember some QM :) But what effects that we're discussing would be lost in a classical treatment?


Well, amplifiers, the atmosphere, etc, would be a whole lot quieter :)
QUOTE


My point is that quantizing the audio signal would be just as legitimate if the real world was not discrete in any way. So I was thinking that bringing that point up blurs the discussion (and we're only making it worse now :P).


Well yes, I don't think that's in dispute, however, the quantized nature of the universe does set lower limits to perception and electronics, and those lower limits are NOT very much smaller than our perceptions.
QUOTE


Different scale: I meant that the sheer number of atoms that touch your ear drum every "integration time" (of the sensory nerves, that is) would be such that there's no point discretizing that. Thinking some more about that and your point about thermal noise: can't we predict that 6dB thermal noise level within the framework of classical thermodynamics, without thinking of independent particles? edit: hmmm, I guess not... a classical gas pressure could just generate a constant force...


Exactly. But the force isn't constant.

Just some numbers. 1 Atmosphere RMS is 194dB SPL. We can hear to about -6dB SPL at our ear canal resonance. That's a factor of 10^20 in energy, or 10^10 in amplitude. Yeah, we do hear really small variations in air pressure at audio frequencies. Really small.
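
Those numbers are easy to verify. A short check, assuming the standard 20 micropascal reference for dB SPL and 101325 Pa for one atmosphere; the variable names are made up for illustration.

```python
import math

P_REF = 20e-6        # Pa, reference pressure for dB SPL
ONE_ATM = 101325.0   # Pa

def spl(pressure_rms):
    """Sound pressure level in dB SPL for an RMS pressure in pascals."""
    return 20 * math.log10(pressure_rms / P_REF)

atm_spl = spl(ONE_ATM)
span_db = 194 - (-6)                     # from 1 atmosphere RMS down to -6dB SPL
amplitude_ratio = 10 ** (span_db / 20)   # pressure ratio
energy_ratio = 10 ** (span_db / 10)      # power (energy) ratio

print(round(atm_spl, 1))                 # 194.1
print(amplitude_ratio, energy_ratio)     # 10^10 and 10^20
```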

Fortunately we don't hear low frequencies at all. Consider the implications of a change in barometric pressure caused by a cloud going over. That's nothing but low-frequency, nonlinear acoustics. Consider the SPL, too, strictly speaking. Yeah, milliHz, fortunately.
jlt
QUOTE(William @ Dec 29 2005, 06:45) *

Yes, I have searched the forum.
Yes, maybe I am dumb.

But it seems I cannot find the answer.

Why do we need 24bit/48kHz/96kHz/192kHz if 16bit/44.1kHz is good enough? Are there any situations that 16bit/44.1kHz simply cannot satisfy? In other words, is there any real need for the higher bit depth and sampling rate?

Thanks for answering.



QUOTE(bhoar @ Aug 15 2006, 16:57) *

If this is off-topic, I apologize...

I assume the thread (currently, at least) is only addressing the "why 24bit/48kHz/96kHz" issue for playback only purposes.

There are several reasons that recording, mixing and processing should be done at higher resolution and/or sampling rates than the targeted playback format, especially for music sourced from large #s of individual tracks...

-brendan


brendan,
you answered right (there are several reasons...), but William didn't ask about playback-only purposes.
For editing we need more than 16 bit (dithering is horrible).

Pio2001
QUOTE(jhbretz @ Aug 15 2006, 22:54) *
the key is that sound is well correlated. If we were talking about random noise, then it wouldn't work. But sounds that come from real instruments have amplitudes that don't jump around a whole lot between samples. So I would argue that doing an interpolation (and in fact much better than a linear interpolation) is valid.


This has nothing to do with quantization. It deals with sample rate. When the amplitude doesn't jump around in both directions between samples, it means that there are no high frequencies, and that a low sample rate is enough.
When amplitude doesn't jump around much in a given direction, it means that the sound is quiet.

QUOTE(Woodinville @ Aug 15 2006, 23:43) *
Since (we do remember our QM, yes?) all systems in the real world ARE discrete, your point seems rather pointless, as it were.


In QM, the kinetic energy of a free air molecule is not discrete, but continuous.
Taz PA-C
Fascinating discussion. Excellent question. Check out the book by Malcolm Gladwell called "Blink". There is a report in this book of a double blind taste test conducted by Pepsi which proved that on the single sip, people much preferred Pepsi over Coke. Coke then conducted their own double blind taste tests using the single sip taste test, and confirmed Pepsi's results. As a result of these tests, Coke reformulated their recipe, and the rest is history. In case you missed it, the new Coke failed miserably in the market. Why?!? Taste tests proved in double blind tests that the new formula was a winner. The reason the new Coke failed was the very nature of the sip test, it was too brief a test. It didn't reflect the days and weeks of using a product that on the initial test, won. If buyers took a case of pop home, the results were completely different from a double blind test. Coke eventually went back to "Classic" Coke, then the "New" Coke quietly disappeared from the market.
Now let's apply the lesson from this to audio. 16 bits may be enough for a brief listen, but if you listen day after day, week after week, month, 16 bits aren't enough, I can hear too many limitations on my mid-fi system with 16 bit audio. I can hear it when a 16 bit music CD is truncated as it fades to silence, and I hate it. I want more. To heck with double blind audio "sip" tests. I want more.
Mike Giacomelli
QUOTE(Taz PA-C @ Aug 19 2006, 20:06) *

16 bits may be enough for a brief listen, but if you listen day after day, week after week, month, 16 bits aren't enough, I can hear too many limitations on my mid-fi system with 16 bit audio.


Prove it.

QUOTE(Taz PA-C @ Aug 19 2006, 20:06) *

I can hear it when a 16 bit music CD is truncated as it fades to silence, and I hate it.


There is no truncation. There's a noise floor, and as the signal gets quieter it vanishes under quantization error. It seems to me you've fooled yourself into hearing things.
Radetzky
QUOTE(Taz PA-C @ Aug 19 2006, 19:06) *

Fascinating discussion. Excellent question. Check out the book by Malcolm Gladwell called "Blink". There is a report in this book of a double blind taste test conducted by Pepsi which proved that on the single sip, people much preferred Pepsi over Coke. Coke then conducted their own double blind taste tests using the single sip taste test, and confirmed Pepsi's results. As a result of these tests, Coke reformulated their recipe, and the rest is history. In case you missed it, the new Coke failed miserably in the market. Why?!? Taste tests proved in double blind tests that the new formula was a winner. The reason the new Coke failed was the very nature of the sip test, it was too brief a test. It didn't reflect the days and weeks of using a product that on the initial test, won. If buyers took a case of pop home, the results were completely different from a double blind test. Coke eventually went back to "Classic" Coke, then the "New" Coke quietly disappeared from the market.
Now let's apply the lesson from this to audio. 16 bits may be enough for a brief listen, but if you listen day after day, week after week, month, 16 bits aren't enough, I can hear too many limitations on my mid-fi system with 16 bit audio. I can hear it when a 16 bit music CD is truncated as it fades to silence, and I hate it. I want more. To heck with double blind audio "sip" tests. I want more.


What kind of analogy is that?

I don't even bother to explain anymore. I guess reading posts about how a WAV sounds better than a FLAC just killed my enthusiasm... (head to head-fi.org if you need to laugh a bit...)

No wonder there is a market out there for all kind of "stereo-snake-oil" products... there is a world of suckers out there !!
Patsoe
QUOTE(Radetzky @ Aug 20 2006, 05:11) *

QUOTE(Taz PA-C @ Aug 19 2006, 19:06) *

[...]
Now let's apply the lesson from this to audio. 16 bits may be enough for a brief listen, but if you listen day after day, week after week, month, 16 bits aren't enough,
[...]


What kind of analogy is that?

I don't even bother to explain anymore. I guess reading posts about how a WAV sounds better than a FLAC just killed my enthusiasm... (head to head-fi.org if you need to laugh a bit...)

No wonder there is a market out there for all kind of "stereo-snake-oil" products... there is a world of suckers out there !!


What kind of a reaction is that?
Don't take this wrong, I know the feeling, but if you don't bother to explain, just don't post :)

Taz PA-C: nothing in abx protocols forbids you to listen for days and weeks...
Pio2001
Blind listening for days has even been done recently: http://www.hydrogenaudio.org/forums/index....showtopic=45432
The results? People get much worse results than when listening for seconds. Psychological illusions seem to grow and reinforce themselves day after day.

The failure of the new Coke taste can be explained by the fact that people have liked the old taste for years. It was a traditional product. Change the taste, for better or worse, and it is no longer a traditional product. It is a new attempt from beginners, made by chemists and mathematicians instead of gastronomes, welcomed with suspicion.
That's not how a drink should be made, thus it has to taste bad. The psychological effect is at work, and makes people dislike the new taste, or not even try it.
Kees de Visser
QUOTE(Taz PA-C @ Aug 19 2006, 20:06) *

I can hear it when a 16 bit music CD is truncated as it fades to silence, and I hate it.

If you hear artifacts (grainy sound e.g.) when music fades to silence it probably means that either your monitoring system (DAC) or the audio on the cd (or both) hasn't been properly dithered. With correct dithering the sound will smoothly fade into the (dither-)noisefloor, which should be at such a low level (1LSB is about -90dBFS) that it's almost or completely inaudible under normal listening conditions.
Properly dithered music can still sound decent at very low levels (<-80dBFS).
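
The effect of dither on a very quiet tone can be illustrated with a toy experiment. This is a sketch under stated assumptions, not a model of any particular DAC or mastering chain: the tone has an amplitude of 0.4 of one 16 bit step (roughly -98dBFS), the dither is triangular (TPDF) spanning plus or minus one step, and all names are made up for illustration.

```python
import math
import random

random.seed(0)
RATE = 48000
N = 48000
AMP_LSB = 0.4   # tone amplitude in units of one 16 bit step

def tone(n):
    return math.sin(2 * math.pi * 1000 * n / RATE)

signal = [AMP_LSB * tone(n) for n in range(N)]

def tpdf():
    """Triangular dither spanning +/-1 step: the sum of two uniform values."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

plain = [round(s) for s in signal]              # quantized without dither
dithered = [round(s + tpdf()) for s in signal]  # quantized with dither

def recovered_amplitude(samples):
    """Amplitude of the 1 kHz component, found by correlating with the original tone."""
    return 2 * sum(y * tone(n) for n, y in enumerate(samples)) / N

print(any(plain))                     # False: without dither the tone is rounded away
print(recovered_amplitude(dithered))  # close to 0.4: the tone survives inside the noise
```

Without dither the tone simply disappears (or, at slightly higher levels, turns into a distorted square-ish wave); with dither it is still there at the right amplitude, buried in a steady noise floor.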
jlt
good explanations Kees de Visser B)

the "issue" that I don't like in 16 bit is that after each effect applied (volume, equalize, etc.) to the source when editing, the noise floor is summed one more time. :o
then, after some effects you can clearly hear the "white noise" (hiss)... too bad.
Mike Giacomelli
QUOTE(jlt @ Aug 20 2006, 08:17) *

good explanations Kees de Visser B)

the "issue" that I don't like in 16 bit is that after each effect applied (volume, equalize, etc.) to the source when editing, the noise floor is summed one more time. :o
then, after some effects you can clearly hear the "white noise" (hiss)... too bad.


What software edits in 16 bit precision? Even winamp plugins from the late 90s use floating point.
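
The difference between the two ways of working can be sketched numerically. This is an illustration under stated assumptions, not a description of any real editor: the "effects" are plain gain changes, re-quantization is simple rounding without dither, and the gain values and names are made up.

```python
import math

FULL = 32767
N = 4800
source = [round(0.5 * FULL * math.sin(2 * math.pi * 440 * n / 48000)) for n in range(N)]
gains = [0.5, 1.7, 0.9, 1.3, 0.6, 1.4]   # six hypothetical volume changes, no clipping

int_path = list(source)
float_path = [float(s) for s in source]
for g in gains:
    int_path = [round(s * g) for s in int_path]  # back to 16 bit integers after every effect
    float_path = [s * g for s in float_path]     # full precision kept between effects
final_from_float = [round(s) for s in float_path]  # quantized once, at the very end

total_gain = math.prod(gains)
reference = [s * total_gain for s in source]       # the ideal, unrounded result

def rms_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

err_int = rms_error(int_path, reference)
err_float = rms_error(final_from_float, reference)
print(err_int, err_float)   # errors in units of one 16 bit step
```

Rounding after every step lets the errors pile up, while the floating point chain ends with roughly the error of a single rounding. Even the integer chain's error stays on the order of one step after six effects here, which is a long way below the signal.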