Full Version: Q about digital audio
Hydrogenaudio Forums > Hydrogenaudio Forum > General Audio
ye110man
as i understand it cd's are sampled 44,100 times per second and each sample is quantized to one of 65,536 levels (16 bits). instead of coding each sampling point why not code a sine representation? wouldn't this result in a more accurate description of the original analog sound?
AstralStorm
The playback process restores sines using these points as guidance.

-------------------------------------
---------------------*--------------
------------------------------*-----
--------------*---------------------
------*-----------------------------

becomes

-------------------------------------
---------------------****---------
-----------------***------***----
------------****--------------*---
------****---------------------*--

EDIT: Not exactly that, but you get the picture.

How could it be done in another way (not necessarily better)? Check the SACD links in the FAQ.
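
A minimal sketch of the "restores sines from the points" idea, assuming ideal sinc (Whittaker-Shannon) interpolation. Real players use finite filters instead, and the 1 kHz test tone, block length and function names here are only illustrative assumptions:

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Estimate the signal at fractional sample position t by summing
    one sinc pulse per stored sample (Whittaker-Shannon interpolation)."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

# Hypothetical test signal: a 1 kHz sine sampled at 44.1 kHz.
FS = 44100.0
F = 1000.0
samples = [math.sin(2 * math.pi * F * n / FS) for n in range(2000)]

# Halfway between samples 1000 and 1001, far from the edges of the block.
t = 1000.5
estimate = reconstruct(samples, t)
exact = math.sin(2 * math.pi * F * t / FS)
```

The estimate between two stored points should land very close to the true sine value; the small leftover error comes from using a finite block of samples rather than an infinite one.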
Pio2001
QUOTE(ye110man @ May 3 2003 - 09:27 PM)
why not code a sine representation?

What do you mean ?
Music is not sinusoidal, you know

Example :

[image: user-posted graph, not preserved]
AstralStorm
Music is sinusoidal - it consists of an infinite (or very large) number of sines.
Look at your graph. ;)

And it is already coded as a sine representation... PCM.
Pio2001
A sine is a function from R to R that for any x in R is sin (x). This is obviously not the case of the picture above.
Functions like
A*sin(b*x+c)
are usually called sines.

Any periodic function can be considered as an infinite sum of sines.
Music not being periodic, it can be considered like an infinite and continuous sum of sines.
So to store music as a sum of sines, it requires an infinite amount of sine parameters... It is however possible to sample them with enough accuracy :P
Delirium
QUOTE(Pio2001 @ May 4 2003 - 05:37 AM)
Music not being periodic, it can be considered like an infinite and continuous sum of sines.

Well, as most music you'll be interested in recording has a finite length, you can simply find an infinite sum of sines that reproduces the music as a periodic function with period equal to the length of the music.
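
A rough sketch of that idea for sampled audio, assuming the discrete Fourier transform as a stand-in for the Fourier series: once the clip is sampled, the "infinite" sum collapses to exactly as many sinusoids as there are samples. The eight sample values are made up for illustration:

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform: one complex coefficient per sinusoid."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def inverse_dft(coeffs):
    """Rebuild the time-domain samples by summing the sinusoids again."""
    n = len(coeffs)
    return [sum(coeffs[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)).real / n
            for k in range(n)]

# Hypothetical non-periodic "clip": 8 arbitrary sample values.
clip = [0.0, 0.9, -0.3, 0.5, 0.1, -0.8, 0.4, 0.2]
coeffs = dft(clip)
rebuilt = inverse_dft(coeffs)
max_error = max(abs(a - b) for a, b in zip(clip, rebuilt))
```

The round trip reproduces the clip to within floating-point rounding, treating it as one period of a periodic signal. Note that this direct form costs on the order of N squared operations.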
AstralStorm
Yes, but that is VERY computationally intensive compared to PCM.
Pio2001
Anyway, this would mean dividing the frequency domain into 20000*300=6,000,000 sines for a 300 seconds song and a 20 kHz frequency response. Then one would ask "why not represent the music as a continuous wave instead of dividing it into sines ? It would be more accurate."

Ye110man, the short answer is that 44100 Hz and 16 bits are beyond the human ear's ability, therefore no extra accuracy is needed.
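
The arithmetic above can be laid next to plain PCM. This is only a counting sketch; storing two numbers (amplitude and phase) per sine and counting a single channel are assumptions made for the comparison:

```python
DURATION_S = 300         # song length used in the post
BANDWIDTH_HZ = 20_000    # frequency response used in the post
SAMPLE_RATE_HZ = 44_100  # CD sampling rate

# Harmonics of a 300 s period are spaced 1/300 Hz apart.
num_sines = BANDWIDTH_HZ * DURATION_S
# Assumption: each sine needs two stored numbers (amplitude and phase).
numbers_for_sines = 2 * num_sines
# One stored number per sample for a single PCM channel.
numbers_for_pcm = SAMPLE_RATE_HZ * DURATION_S
```

That gives 6,000,000 sines, so 12,000,000 stored numbers, against 13,230,000 PCM samples. The two are of the same order, so the sine description does not save storage by itself.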
AstralStorm
QUOTE
Ye110man, the short answer is that 44100 Hz and 16 bits are beyond the human ear's ability, therefore no extra accuracy is needed.

16 bits (without dithering) certainly are not; 20 bits really are.
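
For a feel of the numbers, here is the textbook signal-to-quantization-noise figure for a full-scale sine. It assumes uniformly distributed quantization error that is uncorrelated with the signal, and it says nothing by itself about audibility:

```python
import math

def quantization_snr_db(bits):
    """Theoretical SNR of a full-scale sine against uniform quantization noise
    (the usual 6.02 * bits + 1.76 dB rule)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

snr_16 = quantization_snr_db(16)
snr_20 = quantization_snr_db(20)
snr_24 = quantization_snr_db(24)
```

Under those assumptions 16 bits give roughly 98 dB and 20 bits roughly 122 dB, i.e. about 6 dB per extra bit.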
ye110man
i see. however wouldn't it be better to have a variable sampling rate rather than a constant 44.1khz 16 bits?
AstralStorm
It is possible, but run length encoding would be even better :P
There are already more efficient methods - lossless compression.
Delirium
QUOTE(ye110man @ May 5 2003 - 12:01 PM)
i see. however wouldn't it be better to have a variable sampling rate rather than a constant 44.1khz 16 bits?

It's possible, but impractical. Well, it's more practical in software, but originally almost all audio was done in hardware, and still a lot of it is done in hardware. In hardware it's fairly easy to synchronize a fixed sampling rate with an oscillator (or an oscillator times some multiplier), but it's rather hard to accurately track frequent changes in sampling rate.
Pio2001
QUOTE(AstralStorm @ May 5 2003 - 09:07 PM)
16 bits are certainly (without dithering) not, 20 bits really are.

D'oh, I must be tired, these times.
spoon
That is just what mp3 is, a bunch of sines, or mpc, or....well any format that allows the equalizer to be applied to the raw stream before converting it back into what you hear.
ye110man
so then dvd-audio is overkill?
i read that tom holman advocated 60khz 20 bit.
i guess maybe it's technically easier to work with 24 bits (3 bytes) than 20 bits (uneven # of bytes).
but where did they come up with the 96khz for dvd audio?
Pio2001
This was discussed to death. You can find some of these discussions in the FAQ
spoon
QUOTE
but where did they come up with the 96khz for dvd audio?


Music for Dogs? :) It does seem overkill; if you have a 5 GB disc and are creating DVD-Audio discs, then you need to fill it somehow.
ye110man
ok i read up on it. dvd-audio is scalable to a more reasonable 44.1khz 20 bit.

i was wondering... has there been any work on variable sampling rates?
AstralStorm
I think that VOC format can use variable sample rates...
2Bdecided
The answer to the original question is so simple: sound waves exist in the time domain. We only analyse them in the frequency domain.

A microphone outputs a time domain waveform - one value for each instant of time you care to look.
Likewise a loudspeaker requires a time domain waveform as an input, telling it where to move to at each instant of time.

To transform the time-domain waveform into the frequency domain is a waste of time - you'll just have to transform it back again to listen to it!

(Obviously it has its uses ;) However, for the simplest digital capture, storage, and replay, it has no place.)


Variable sample rate recording would be pointless - you have to record it before you know that you could have used a lower sample rate for that moment!

Variable sample rate coding could be useful, BUT with lossless coding you kind of get that anyway - if there's no content above 10kHz, you'll gain an appropriate advantage in compression without worrying about the sample rate. Try it with a lossless codec - it works surprisingly well. Not perfectly (i.e. you don't save ALL the bits), but well enough. This, simplistically, is because a lossless representation of a sine wave can be almost independent of sample rate.
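
A small sketch of why that can work, assuming a simple fixed second-order predictor. It is in the spirit of what lossless coders do, not any particular codec's algorithm, and the function name and test tone are hypothetical:

```python
import math

def mean_abs_residual(freq_hz, sample_rate_hz, seconds=0.1, full_scale=32767):
    """Mean |residual| of a fixed second-order predictor on a 16-bit sine."""
    n = int(sample_rate_hz * seconds)
    x = [round(full_scale * math.sin(2 * math.pi * freq_hz * k / sample_rate_hz))
         for k in range(n)]
    # Predict x[k] as 2*x[k-1] - x[k-2]; a lossless coder stores only the error.
    residual = [x[k] - 2 * x[k - 1] + x[k - 2] for k in range(2, n)]
    return sum(abs(r) for r in residual) / len(residual)

# The same 1 kHz tone at two sample rates.
low_rate = mean_abs_residual(1000, 44100)
high_rate = mean_abs_residual(1000, 88200)
```

Doubling the sample rate doubles the number of samples, but each prediction error becomes roughly four times smaller, so each one needs fewer bits. The saving is real but not total, which matches the "not perfectly, but well enough" above.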

Cheers,
David.
ye110man
naturally variable sample rate recording is pointless.
but with variable sample rate encoding couldn't you pick just those points closest to a quantization level to encode? because the samples aren't equally spaced you may need a higher sampling rate in many cases. but by being able to pick your own sampling points wouldn't you be able to greatly reduce quantization error? for all practical purposes it would eliminate quantization error.
Doctor
And then quantize the offsets for these points? ;-)

Nice try. ;-)
ye110man
i was considering that but i'm not talking about indefinitely variable sampling rates.
sample at 96khz or something high. pick samples for which the quantization error is minimal and pick them at the smallest rate greater than nyquist (leaving some room for error if you'd like). encode only those samples.

another question... (sorry i'm a newbie)
upon decoding are quantization errors smoothed out by some sort of regression curve (don't know if that's the right terminology) or are they reproduced?
Doctor
Looks like you are trying to invent something like subsample antialiasing. ;-)

Imagine the analogue signal as a curve on graph paper. The vertical lines are sampling intervals, the horizontal ones quantization levels. Usually what's done is something like averaging the signal between two lines and outputting the nearest level. You propose instead to locate the level that's intersected by the curve. You can easily convince yourself that this will likely increase, not decrease, noise.

Once the quantization noise is introduced into the signal, it cannot be removed, but it can be pushed around (with more noise!) to render it inaudible. This is called dithering/noise shaping.
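
A minimal sketch of the dithering half of that (no noise shaping), assuming TPDF dither of +/-1 LSB and a made-up tone smaller than half a quantization step. All names and values are illustrative:

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N = 20000
AMPLITUDE_LSB = 0.4  # hypothetical sine smaller than half a quantization step
PERIOD = 100         # samples per cycle

sine = [math.sin(2 * math.pi * n / PERIOD) for n in range(N)]
signal = [AMPLITUDE_LSB * s for s in sine]

# Plain rounding: every sample falls to level 0 and the tone disappears.
undithered = [round(v) for v in signal]

# TPDF dither: add triangular noise spanning +/-1 LSB before rounding.
dithered = [round(v + random.random() - random.random()) for v in signal]

def recovered_amplitude(quantized):
    """Least-squares estimate of how much of the sine survives quantization."""
    return sum(q * s for q, s in zip(quantized, sine)) / sum(s * s for s in sine)

amp_plain = recovered_amplitude(undithered)
amp_dithered = recovered_amplitude(dithered)
```

Without dither the quiet tone is erased completely. With dither the output is noisier sample by sample, yet the tone is still in there at close to its original level, buried in noise instead of being lost.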
ye110man
i see. thanks.

for the 2nd part... wouldn't quantization error result in an irregular curve and not a smooth one? is this curve smoothed out upon output?
Doctor
Going back to graph paper example. Try to approximate the curve drawing only on graph lines. This is what a quantized signal looks like.

If you subtract your original curve from this one, you will see a lot of jerking half-step up and down. It will not be smooth, so in frequency domain there will be new high-frequency components.

When you output the signal, you filter out high-frequency components and by the magic of Fourier and Nyquist your original signal is reconstructed.
marcan
And what about a variable 1 bit encoder/decoder?

You set the frequency range and it makes the rest. The variation should be applied on the sampling rate (very high in 1 bit) in order to respect the signal in the frequency range defined before.
marcan
Can someone enlighten me, please?
Doctor
I did not understand at all.
AstralStorm
This suggestion looks like the delta-sigma modulation used by SACD... it has been done,
but has not been proved better.
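
For reference, a toy first-order delta-sigma modulator. This is only a sketch of the general 1-bit principle, not the modulator SACD actually uses, and the function name and constant test input are assumptions:

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] -> stream of +1/-1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the error so far
        bit = 1 if integrator >= 0 else -1  # 1-bit quantizer
        bits.append(bit)
        feedback = bit                      # fed back on the next tick
    return bits

# Hypothetical input: a constant level of 0.5 held for 1000 oversampled ticks.
bits = delta_sigma_1bit([0.5] * 1000)
average = sum(bits) / len(bits)
```

Each output is only +1 or -1, but the running average of the bit stream follows the input level, which is why a low-pass filter is enough to turn such a stream back into audio.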
marcan
Coupled with a digital amplifier (like tact http://www.tactaudio.com/Millennium/index.html) it should sound wonderful :P
QUOTE
This suggestion looks like SACD Delta modulation... it has been done

Damn, I'm arriving too late...