Sample rate conversion |
Apr 11 2011, 18:50
Post
#1
|
|
|
Group: Members Posts: 8 Joined: 18-July 07 Member No.: 45403 |
Hello everyone.
I have a question regarding sample rate conversion algorithms. I don't know how they work, but I guess you guys are more familiar with this. Let's say I have a 48 kHz wave file that I want to use in a project where I have to mix it with some 96 kHz material at certain places. This requires the 48 kHz file to be resampled to 96 kHz and, in the end, back to 48 kHz. So in short: does resampling 48 --> 96 --> 48 change the original, or can the original be restored 100%? I'm using Wavelab for this. |
|
|
|
Apr 11 2011, 18:58
Post
#2
|
|
|
Group: Members Posts: 4132 Joined: 2-September 02 Member No.: 3264 |
QUOTE So in short: does resampling 48 --> 96 --> 48 change the original or can the original be restored 100% ?
There's going to be some rounding error. It's probably better to just do everything at 48k. That said, with a very good resampler, there will be no meaningful difference. I have no idea if Wavelab is any good, although this page suggests that at least one of the Wavelab 6 resamplers is pretty good: http://src.infinitewave.ca/ |
|
|
|
Apr 12 2011, 14:29
Post
#3
|
|
Group: Members Posts: 3212 Joined: 29-October 08 From: USA, 48236 Member No.: 61311 |
QUOTE Hello everyone. I have a question regarding sample rate conversion algorithms. I don't know how they work, but I guess you guys are more familiar with this. Let's say I have a 48 kHz wave file that I want to use in something I have to mix with some 96 kHz material at certain places. This requires the 48 kHz file be resampled and in the end back to 48. So in short: does resampling 48 --> 96 --> 48 change the original or can the original be restored 100% ? I'm using Wavelab for this.
Can't you just stash a copy of the 48k original some place out of the way? |
|
|
|
Apr 12 2011, 20:09
Post
#4
|
|
Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
If you are just mixing, then you ought to be able to get the same results (to within -144 dB or better) by resampling the 96k content down to 48k and mixing at 48k. The reason is that mixing is a purely linear operation.
48k->96k, being a 2x oversample, is among the most numerically conservative resampling possibilities. If the resampler is correctly implemented, 50% of all the samples should be numerically exact, with zero quantization error. |
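A minimal sketch of the linearity argument, assuming a toy 3-tap low-pass and two made-up test tones (the function and variable names are illustrative, not taken from any resampler mentioned in this thread). It checks that mixing and then downsampling gives the same samples as downsampling each track and then mixing, up to floating-point rounding.

```python
import math

def decimate_by_2(x, taps):
    """Low-pass filter x with symmetric FIR taps, then keep every second sample."""
    half = len(taps) // 2
    out = []
    for n in range(0, len(x), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            i = n + k - half
            if 0 <= i < len(x):
                acc += h * x[i]
        out.append(acc)
    return out

# Toy 3-tap low-pass, chosen only to keep the example short (an assumption;
# far too crude for real audio work).
TAPS = [0.25, 0.5, 0.25]

# Two hypothetical 96 kHz tracks: a 440 Hz tone and a quieter 1 kHz tone.
track_a = [math.sin(2 * math.pi * 440 * n / 96000) for n in range(960)]
track_b = [0.5 * math.sin(2 * math.pi * 1000 * n / 96000) for n in range(960)]

mix_then_down = decimate_by_2([a + b for a, b in zip(track_a, track_b)], TAPS)
down_then_mix = [a + b for a, b in zip(decimate_by_2(track_a, TAPS),
                                       decimate_by_2(track_b, TAPS))]

# Largest disagreement between the two orders of operation.
max_diff = max(abs(p - q) for p, q in zip(mix_then_down, down_then_mix))

# 24 bits at about 6.02 dB per bit, presumably where the -144 dB figure comes from.
floor_24bit_db = 20 * math.log10(2 ** -24)

print(max_diff, floor_24bit_db)
```

The difference comes out at the level of double-precision rounding, far below the 24-bit figure. The same holds for any linear filter, so the crude kernel does not affect the conclusion.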
|
|
|
Apr 13 2011, 15:21
Post
#5
|
|
ReplayGain developer Group: Developer Posts: 4587 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
QUOTE If the resampler is correctly implemented, 50% of all the samples should be numerically exact, with zero quantization error.
That's an interesting use of the word "correctly" - a 2x resampler can easily be (and often is) perfect in terms of frequency and phase response, while giving 100% "new" samples.
You can easily design it to do as you suggest, but that's not necessarily the way all are designed.
Cheers, David. |
|
|
|
Apr 13 2011, 16:05
Post
#6
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE You can easily design it to do as you suggest, but that's not necessarily the way all are designed.
Afaik all resamplers apply another lowpass on upsampling, be it 2x or some odd ratio such as 44.1 to 96 kHz. So once this lowpass is applied, all relation to a pattern in the source is gone. Otherwise we'll have an aliased mirror image above the source's max frequency. Isn't that so? And if not, what software does that insertion of added 0s correctly? |
|
|
|
Apr 13 2011, 20:25
Post
#7
|
|
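Group: Members Posts: 266 Joined: 3-August 08 From: UK Member No.: 56644 |
Given 10 seconds of 440Hz tone and upsampling from 48k to 96k, sox preserves the input samples in the output for all but the first and last 40-or-so samples. Of course, the test becomes more 'difficult' if the bit-depth or the tone frequency is increased.
CODE
# generate 10 seconds of test tone as a 16-bit file
sox -b 16 -n 1.wav synth 10
# upsample to 96k with dithering disabled (-D)
sox -D 1.wav 2.wav rate 96k
# read the 96k data as 2-channel 48k and keep channel 1, i.e. every other sample
sox -c 2 -r 48k 2.wav 3.wav remix 1
# count the bytes that differ between the original and the extracted samples
cmp -l 1.wav 3.wav | wc -l
|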
|
|
|
Apr 13 2011, 20:47
Post
#8
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE Given 10 seconds of 440Hz tone and upsampling from 48k to 96k, sox preserves the input samples in the output for all but the first and last 40-or-so samples. Of course, the test becomes more 'difficult' if the bit-depth or the tone frequency is increased.
CODE
sox -b 16 -n 1.wav synth 10
sox -D 1.wav 2.wav rate 96k
sox -c 2 -r 48k 2.wav 3.wav remix 1
cmp -l 1.wav 3.wav | wc -l
This must be lowpassed already, even if I don't know what -D does. You may want to look at what sox does without a lowpass: http://www.hydrogenaudio.org/forums/index....st&p=675545 The resulting 2.wav looks lowpassed at 22-25kHz, doesn't it?
This post has been edited by Wombat: Apr 13 2011, 21:22 |
|
|
|
Apr 13 2011, 21:19
Post
#9
|
|
Group: Members Posts: 266 Joined: 3-August 08 From: UK Member No.: 56644 |
QUOTE This must be lowpassed already even if i don't know what -D does.
-D is to disable dithering (which would otherwise mess up the test). The only filter involved is that of the resampler (in the second line), applied after zero stuffing: a lowpass at just below the original Nyquist (i.e. the standard method).
QUOTE You may have a look what sox does without lowpass.
In fact, that shows a different resampler; with sox the resampling filter is not optional. |
|
|
|
Apr 13 2011, 22:43
Post
#10
|
|
Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE That's an interesting use of the word "correctly" - a 2x resampler can easily be (and often is) perfect in terms of frequency and phase response, while giving 100% "new" samples. You can easily design it to do as you suggest, but that's not necessarily the way all are designed.
Feh. Yes, you are correct. Thanks for the catch.
I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6 dB point at Fs/2 is categorically not in that domain. While asymmetric filters are 2x more complex to implement, obviously they exist, particularly in software implementations where the symmetric optimization may not get performed. (Offhand, I can't recall any specific instance of such a filter, but I am quite sure they exist.)
... Right?
This post has been edited by Axon: Apr 13 2011, 22:48 |
|
|
|
Apr 14 2011, 00:03
Post
#11
|
|
Group: Super Moderator Posts: 3267 Joined: 26-July 02 From: princegeorge.ca Member No.: 2796 |
QUOTE I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6 dB point at Fs/2 is categorically not in that domain.
I have a question that's marginally off-topic, but this post triggered all my keyword detectors. Windowed sinc is symmetric. What would similar functions be in the asymmetric case? The symmetric, acausal nature of sinc has always bugged me, but I've never found a reference containing asymmetric, causal analogues that I could use in SRC contexts.
-------------------- (atrix|(fb2k->e-mu 0404 usb|audio 8 dj))->hd280|jvc ha-fx35-b
|
|
|
|
Apr 14 2011, 00:44
Post
#12
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
I admit I have no clue about some of the things you are talking about, but to summarize my findings: I still don't see anyone convincing me that there is upsampling without lowpassing, and thus without changing EVERY single musical bit in the process.
On http://en.wikipedia.org/wiki/Upsampling they talk about the 2 steps of implementing it:
1. Add zeros between each sample -> to my understanding all resamplers do this
2. Filter with a low-pass filter which, theoretically, should be the sinc filter -> that no resampler does, because, also from Wikipedia: "The second step calls for the use of a perfect low-pass filter, which is not implementable" |
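The two steps can be sketched in a few lines, assuming a Hann-windowed sinc as the practical stand-in for the ideal low-pass (the 63-tap length, the 1 kHz test tone and all function names are illustrative choices, not taken from Wikipedia or from any resampler discussed here). The sketch measures the mirror image at 47 kHz that zero stuffing creates from a 1 kHz tone, before and after the filter.

```python
import cmath
import math

def zero_stuff(x):
    """Step 1: insert a zero after every input sample (doubles the rate)."""
    out = []
    for s in x:
        out.extend((s, 0.0))
    return out

def lowpass_taps(half_len):
    """Step 2 kernel: Hann-windowed sinc, cutoff at the original Nyquist, gain 2."""
    taps = []
    for k in range(-half_len, half_len + 1):
        sinc = 1.0 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k / 2)
        window = 0.5 * (1 + math.cos(math.pi * k / (half_len + 1)))
        taps.append(window * sinc)
    return taps

def convolve_same(x, taps):
    """FIR filter with the kernel centred on each output sample."""
    half = len(taps) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for j, h in enumerate(taps):
            i = n - (j - half)
            if 0 <= i < len(x):
                acc += h * x[i]
        out.append(acc)
    return out

def tone_amplitude(x, freq, rate, start, count):
    """Amplitude of one sinusoidal component, measured over whole periods."""
    acc = sum(x[n] * cmath.exp(-2j * math.pi * freq * n / rate)
              for n in range(start, start + count))
    return 2 * abs(acc) / count

source = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
stuffed = zero_stuff(source)                      # now at 96 kHz
filtered = convolve_same(stuffed, lowpass_taps(31))

# Skip the filter's edge transients; 768 samples hold whole periods of both tones.
image_before = tone_amplitude(stuffed, 47000, 96000, 96, 768)
image_after = tone_amplitude(filtered, 47000, 96000, 96, 768)
tone_after = tone_amplitude(filtered, 1000, 96000, 96, 768)
print(image_before, image_after, tone_after)
```

Before filtering, the image is as strong as the tone itself. After filtering it is reduced by several orders of magnitude but not to exactly zero, which fits the point that the filter is only an approximation of the ideal one.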
|
|
|
Apr 14 2011, 00:52
Post
#13
|
|
Group: Super Moderator Posts: 3267 Joined: 26-July 02 From: princegeorge.ca Member No.: 2796 |
QUOTE I admit i have no clue about some things you talk but to summarize my findings i still don't see anyone convincing me there is upsampling without lowpassing, so changing EVERY single musical bit in the process.
Sample-and-hold, i.e. "nearest neighbour", i.e. out[2*x]=in[x];out[2*x+1]=in[x], is trivial upsampling without lowpassing, though it's gonna generate a lot of high-frequency aliasing. Solution? Low-pass the aliasing out.
I suspect that's why step 2 exists in your algorithm. Step 1 will resample (even accurately) but the issue becomes high-frequency aliasing, i.e. putting high-frequency content in the signal that was not there originally.
This post has been edited by Canar: Apr 14 2011, 00:54
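A rough illustration of the sample-and-hold case, assuming a 10 kHz test tone at 48 kHz (the tone, its length and the helper names are made up for the example). It repeats every sample once and then measures both the original tone and its mirror image at 48 kHz - 10 kHz = 38 kHz, which is the unwanted high-frequency content described above.

```python
import cmath
import math

def sample_and_hold_2x(x):
    """out[2*i] = out[2*i + 1] = x[i]: repeat every sample once, no filtering."""
    out = []
    for s in x:
        out.extend((s, s))
    return out

def tone_amplitude(x, freq, rate):
    """Amplitude of one sinusoidal component; x must span whole periods."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * n / rate)
              for n, v in enumerate(x))
    return 2 * abs(acc) / len(x)

# Hypothetical test signal: 10 ms of a 10 kHz tone sampled at 48 kHz.
source = [math.sin(2 * math.pi * 10000 * n / 48000) for n in range(480)]
held = sample_and_hold_2x(source)   # 960 samples at 96 kHz

wanted = tone_amplitude(held, 10000, 96000)   # the original tone
image = tone_amplitude(held, 38000, 96000)    # mirror at 48 kHz - 10 kHz
image_db = 20 * math.log10(image / wanted)
print(round(wanted, 4), round(image, 4), round(image_db, 1))
```

Under these assumptions the image sits only about 9 dB below the wanted tone, and the tone itself is slightly attenuated, so the hold on its own is a rather poor low-pass.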
|
|
|
|
Apr 14 2011, 00:58
Post
#14
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE Sample-and-hold, ie. "nearest neighbour", ie. out[2*x]=in[x];out[2*x+1]=in[x] is trivial upsampling without lowpassing, though it's gonna generate a lot of high-frequency aliasing. Solution? Low-pass the aliasing out.
Exactly, that is the aliasing I mentioned in post 6 above. We are running in circles.
This post has been edited by Wombat: Apr 14 2011, 00:59 |
|
|
|
Apr 14 2011, 01:03
Post
#15
|
|
|
Group: Members Posts: 4132 Joined: 2-September 02 Member No.: 3264 |
QUOTE I admit i have no clue about some things you talk but to summarize my findings i still don't see anyone convincing me there is upsampling without lowpassing, so changing EVERY single musical bit in the process. On http://en.wikipedia.org/wiki/Upsampling they talk about the 2 ways of implementing 1. Add zeros between each sample -> to my understanding all resamplers do 2. Filter with a low-pass filter which, theoretically, should be the sinc filter -> that no resampler does because, also from Wikipedia: "The second step calls for the use of a perfect low-pass filter, which is not implementable"
Although you can never make a perfect sinc filter, it's possible to build ones very, very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC. |
|
|
|
Apr 14 2011, 01:03
Post
#16
|
|
Group: Super Moderator Posts: 3267 Joined: 26-July 02 From: princegeorge.ca Member No.: 2796 |
QUOTE Exactly, that aliasing i mentioned in post 6 above. We are running in circles
Then what you're looking for is sinc interpolation.
Edit: saratoga beat me, and was more informative to boot.
This post has been edited by Canar: Apr 14 2011, 01:04
|
|
|
|
Apr 14 2011, 01:11
Post
#17
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE Although you can never make a perfect sinc filter, its possible to build ones very very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC.
I am pretty sure we can get close, but some earlier posts in this thread suggest that 2x upsampling will leave the source material intact, which it does not. One can argue that such ultra-close filters have huge amounts of pre-ringing, btw. It is not a lossless operation. I see this in the context of some audiophile claims that upsampling 2x, from 44.1 to 88.2 for example, sounds better than upsampling to 96 kHz because it only adds zeros at every second sample, but that isn't so.
This post has been edited by Wombat: Apr 14 2011, 01:15 |
|
|
|
Apr 14 2011, 01:17
Post
#18
|
|
|
Group: Members Posts: 4132 Joined: 2-September 02 Member No.: 3264 |
QUOTE Although you can never make a perfect sinc filter, its possible to build ones very very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC.
QUOTE I am pretty sure we can get close but from reading some earlier posts in this thread it is suggested that 2x upsampling will leave the source material intact which it is not.
Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . . |
|
|
|
Apr 14 2011, 03:26
Post
#19
|
|
Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6 dB point at Fs/2 is categorically not in that domain.
QUOTE I have a question that's marginally off-topic, but this post triggered all my keyword detectors. Windowed sinc is symmetric. What would similar functions be in the asymmetric case? The symmetric, acausal nature of sinc has always bugged me, but I've never found a reference containing asymmetric, causal analogues that I could use in SRC contexts.
Yeah, you better split this mofo right now before the OP's head explodes.

I was using the term "windowed sinc" previously with the implicit assumption that the sinc function is critically sampled, i.e., with lowpass frequency Fs/2. If it's not that -- more specifically, if it's both lower than, and relatively prime to, Fs/2 -- then you'll get your asymmetric response, and you'll also get every sample modified. That's the simplest example I can think of.

Lemme do a bit of armchair mathematical derivation to outline what I'm riffing from. I'm going to wave my hands *really* widely here, so I apologize in advance for abuses of notation, convention, or for that matter, logic. For starters, I'll write this assuming a normalized, ordinary-frequency Fourier transform, but using the variable omega (w) instead of xi. I am also assuming Fs=1, and I write Fs/2 largely as a more easily identifiable representation of the frequency "1/2".

The basic principle here is the interpolating filter, i.e., one which can be used when interpolating between sampled data values: for interpolating kernel k(t) convolved with a discrete-time signal with sampling period 1, k(0)=1 and k(N)=0 for nonzero integer N. The sinc function is the "simplest" analytic function satisfying this requirement. I base this statement on the following identity:

$$\operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x} = \prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2}\right)$$

(This is a nice time to ask: Can an admin please add MathJax support to the site?)

The inner term there, (1 - x^2/n^2), can be rewritten as n^-2 (n+x)(n-x). That n^-2 is just a constant which can be ignored, so what we're left with is a function which could be notionally defined as a polynomial with a zero at every nonzero integer. That's precisely what is necessary for an interpolating kernel.

This implies, at least the way I see it currently, that any interpolating function can be written in the form k(t) = h(t) sinc(t), as long as h(0)=1, and h(t) is defined across the support of k(t). Furthermore, the windowed (time-limited) interpolation functions can be written h(t) = g(t/a) rect(t/a), where "a" controls the window width and g(t) is the basic window function evaluated over [-1/2, 1/2]. So a Hann-windowed sinc kernel is going to be something like g(t) = (1+cos(2 pi t))/2, which falls to zero at t = +/-1/2.

The statement k(t) = h(t) sinc(t) in the time domain is equivalent to K(w) = H(w) * rect(w) in the frequency domain, where * denotes convolution. If we assert:
... then, around w=+1/2, we can approximate rect(w) ~~ 1 - u(w-1/2) (and similarly for w=-1/2). From this -- and I apologize again if I'm skipping way too many steps here -- by looking at the bare convolution integrals, it's fairly easy to see that K(1/2)=(1/2) K(0). For a normalized filter response, i.e. K(0)=0 dB, then K(1/2)=-6 dB.

In summary, when performing integral (integer-ratio) upsampling with a lowpass filter, the kernel k(t) is interpolating (preserves the value of the original samples) if and only if: *either*
|
|
|
|
Apr 14 2011, 07:30
Post
#20
|
|
Group: Members Posts: 266 Joined: 3-August 08 From: UK Member No.: 56644 |
QUOTE Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . .
Indeed, with 2× resampling, if the input signal frequency stays below the filter transition, half of the output samples (weird/asymmetric filters notwithstanding) will be the same as the input samples (as observed at the chosen bit depth). With sox, 16-bit, 48kHz -> 96kHz, the odd-numbered output samples are bit-exact up to 22kHz; at 23kHz, the roll-off kicks in and the samples are no longer the same. At 24-bit, it's bit-exact to 16k; at 17k, attenuation of 0.0001dB is evident in the output. |
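Some rough arithmetic on why "bit-exact" depends on the bit depth, assuming the only error is a flat 0.0001 dB attenuation (a simplification; a real filter's deviation varies with frequency, and the helper name is made up). It asks how far a full-scale integer sample moves after that attenuation and re-rounding.

```python
# Assumed error model: a flat 0.0001 dB attenuation applied to every sample.
gain = 10 ** (-0.0001 / 20)

def largest_change(bits):
    """Change, in integer steps, of a full-scale sample after attenuation and rounding."""
    full_scale = 2 ** (bits - 1) - 1
    return abs(round(full_scale * gain) - full_scale)

change_16 = largest_change(16)   # 16-bit full scale is 32767
change_24 = largest_change(24)   # 24-bit full scale is 8388607
print(change_16, change_24)
```

At 16 bits the shift is well under half a step, so even the largest sample rounds back to its original value. At 24 bits the same attenuation moves a full-scale sample by dozens of steps, so the difference becomes visible.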
|
|
|
Apr 14 2011, 19:38
Post
#21
|
|
Group: Members (Donating) Posts: 1983 Joined: 4-January 04 From: Austin, TX Member No.: 10933 |
QUOTE Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . .
QUOTE Indeed, with 2× resampling, if the input signal frequency stays below the filter transition, half of the output samples (weird/asymmetric filters notwithstanding) will be the same as the input samples (as observed at the chosen bit depth). With sox, 16-bit, 48kHz -> 96kHz, the odd-numbered output samples are bit-exact up to 22kHz; at 23kHz, the roll-off kicks in and the samples are no longer the same. At 24-bit, it's bit-exact to 16k; at 17k, attenuation of 0.0001dB is evident in the output.
Strictly speaking, it is not very meaningful to describe the flatness of a frequency response as "bit-exact". |
|
|
|
Apr 14 2011, 20:24
Post
#22
|
|
Group: Members Posts: 266 Joined: 3-August 08 From: UK Member No.: 56644 |
QUOTE Strictly speaking, it is not very meaningful to describe the flatness of a frequency response as "bit-exact".
Just as well that's not what we're describing then! We're describing when, after 2× band-limited interpolation, alternate output samples are exactly the same as the input samples. Practical examples were given of when this is indeed the case and also when it's not. |
|
|
|
Apr 14 2011, 20:46
Post
#23
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE after 2× band-limited interpolation, alternate output samples are exactly the same as the input samples.
I still don't get how any sample can be exactly the same when a lowpass is applied to it in the output!? I hope I only need a small hint.
Edit: Added pic of source 48k and upsampled. How can any sample be intact?
This post has been edited by Wombat: Apr 14 2011, 20:55 |
|
|
|
Apr 14 2011, 21:35
Post
#24
|
|
|
Group: Members Posts: 147 Joined: 31-July 08 Member No.: 56508 |
48 -> 96 kHz conversion may leave signal samples intact, but also may change them.
The condition for leaving samples intact is the use of a half-band low-pass filter, i.e. a filter of the form x1 0 x2 0 x3 0 .... 0 xn. Half-band filters can be designed with a windowed sinc method to easily achieve the required frequency response. However, many SRC algorithms use other designs (not half-band). |
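A minimal sketch of the half-band case, assuming a 63-tap Hann-windowed sinc with its cutoff at a quarter of the output rate (the tap count, the test tone and the function names are illustrative, not the design of sox or any other resampler). It builds the x1 0 x2 0 ... tap pattern, upsamples by 2, and checks that every second output sample is the untouched input sample.

```python
import math

def half_band_taps(half_len):
    """Hann-windowed sinc with cutoff at a quarter of the output rate.

    Every even-offset tap other than the centre is exactly zero.
    """
    taps = []
    for k in range(-half_len, half_len + 1):
        if k == 0:
            taps.append(1.0)
        elif k % 2 == 0:
            taps.append(0.0)   # sinc(k/2) vanishes at nonzero even k
        else:
            window = 0.5 * (1 + math.cos(math.pi * k / (half_len + 1)))
            taps.append(window * math.sin(math.pi * k / 2) / (math.pi * k / 2))
    return taps

def upsample_2x(x, taps):
    """Zero-stuff x, then filter with the centred FIR kernel."""
    stuffed = []
    for s in x:
        stuffed.extend((s, 0.0))
    half = len(taps) // 2
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            i = n - (j - half)
            if 0 <= i < len(stuffed):
                acc += h * stuffed[i]
        out.append(acc)
    return out

taps = half_band_taps(31)
source = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(200)]
upsampled = upsample_2x(source, taps)

# Every second output sample is the corresponding input sample, untouched.
originals_kept = all(upsampled[2 * i] == source[i] for i in range(len(source)))

# Response at the input Nyquist (a quarter of the output rate) relative to DC.
dc_gain = sum(taps)
nyquist_gain = sum(h * math.cos(math.pi / 2 * (j - 31)) for j, h in enumerate(taps))
nyquist_db = 20 * math.log10(abs(nyquist_gain / dc_gain))
print(originals_kept, round(nyquist_db, 2))
```

The low-pass is applied to the whole zero-stuffed signal, yet at the original sample positions only the centre tap meets a nonzero sample, so those values pass through unchanged and only the in-between samples are computed. The response at the input Nyquist comes out at about -6 dB relative to DC, in line with the -6 dB symmetry point discussed earlier in the thread.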
|
|
|
Apr 14 2011, 21:49
Post
#25
|
|
Group: Members Posts: 840 Joined: 7-October 01 Member No.: 235 |
QUOTE 48 -> 96 kHz conversion may leave signal samples intact, but also may change them. The condition for leaving samples intact is the use of a half-band low-pass filter, i.e. the filter of the form x1 0 x2 0 x3 0 .... 0 xn. Half-band filters can be designed with a windowed sinc method to easily achieve the required frequency response. However many SRC algorithms are using other designs (not half-band).
Half-band filters, ok... Could you give me the answer for the sample above and how sox handles it? Is any sample intact? And how can I tell? |
|
|
|