ffmpeg vs. SoX for resampling
db1989
post Feb 6 2013, 15:27
Post #26





Group: Super Moderator
Posts: 5174
Joined: 23-June 06
Member No.: 32180



QUOTE (soulsearchingsun @ Feb 6 2013, 13:34) *
So, where is TOS#8 when everyone agrees on looking at graphs? Not trying to troll, I just smell some bigotry.
There’s no problem with discussing theoretical degradation of signals. IMHO, this (the degradation, not the discussion!) should be avoided wherever possible, whether we can hear it or not. If anyone had said ‘Yeah, it looks awful, and it sounds even worse!’, then TOS #8 might be relevant. I suspect that FFmpeg is encroaching onto audible territory, but I make no claim either way: rather, my point is that there’s no need for it to degrade the signal that much when other algorithms produce conversions that are many times cleaner.

This post has been edited by db1989: Feb 7 2013, 00:35
Reason for edit: adding clarification in brackets
[JAZ]
post Feb 6 2013, 23:36
Post #27





Group: Members
Posts: 1711
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



As always, there is a tradeoff between speed and accuracy, and in this field there are even different techniques in play.


There are two good ways to resample:

- Using SINC ( sin(x)/x ) interpolation.
- Using a decimator/interpolator combination.

In both cases, a filter is needed to reconstruct the signal, and the quality of that filter reflects how fast and how clean it is in removing frequency imaging, without adding other unwanted distortions.


Both are relatively slow. (They might be "fast" for downsampling a single stereo sample from 96 kHz to 44.1 kHz, but now think about a realtime sampler like BASSMIDI, FluidSynth, or any other SoundFont player, where you not only play several streams but also change their speed while playing.)


So there have always existed other "resamplers" (not in the academic sense, but approximations) like:

- ZO (zero order, or sample-and-hold). The fastest and worst of them all (see the graph of Secret Rabbit Code, ZOH).
- Linear interpolation. Quite an improvement over ZO, and still fast. (This was being used in 16- and 32-channel trackers back in 1992-1996 on 386 and 486 PCs! See Secret Rabbit Code, Linear. One can apply a filter with this type of interpolation, as in Wavosaur Linear, but it sort of defeats the purpose.)
- Cubic interpolation. Cubic and other polynomial interpolators are an advancement over linear interpolation, approximating the signal using polynomials. This generally lowers aliasing notably, but they still lack a filter. (See Renoise cubic, or OpenMPT polyphase, which are, again, realtime multichannel trackers with realtime virtual effects.)
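The three approximations listed above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the cubic variant shown is a 4-point Catmull-Rom (Hermite) interpolator, which is just one of several cubic schemes, and none of it is taken from the programs mentioned.

```python
import math

def zero_order_hold(samples, pos):
    """Return the sample at or just before fractional position pos."""
    return samples[int(math.floor(pos))]

def linear(samples, pos):
    """Straight line between the two neighbouring samples."""
    i = int(math.floor(pos))
    frac = pos - i
    nxt = samples[min(i + 1, len(samples) - 1)]  # clamp at the end
    return samples[i] + (nxt - samples[i]) * frac

def cubic_hermite(samples, pos):
    """4-point Catmull-Rom (Hermite) interpolation; edges are clamped."""
    i = int(math.floor(pos))
    frac = pos - i

    def at(k):
        return samples[max(0, min(k, len(samples) - 1))]

    y0, y1, y2, y3 = at(i - 1), at(i), at(i + 1), at(i + 2)
    c1 = 0.5 * (y2 - y0)
    c2 = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3
    c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2)
    return ((c3 * frac + c2) * frac + c1) * frac + y1

def resample(samples, ratio, interpolate):
    """ratio = output_rate / input_rate; no anti-alias filter is applied."""
    n_out = int((len(samples) - 1) * ratio) + 1
    return [interpolate(samples, k / ratio) for k in range(n_out)]
```

For example, `resample(data, 2.0, linear)` doubles the rate of `data`. Because `ratio` can be changed per call (or per sample, by calling the interpolator directly), this style suits the variable-speed playback case described above, at the cost of the imaging and aliasing visible in the graphs.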


Any SINC interpolator should give a clean signal, but it is at least 4 times slower than cubic interpolation. It can do the filtering by itself, but good quality needs more taps, which makes it slower still.


So, while a product whose specific purpose is to change the sample rate of a signal should have a good resampler, one should not think that this is the only reasonable way to resample.

This post has been edited by [JAZ]: Feb 6 2013, 23:41
bandpass
post Feb 7 2013, 08:57
Post #28





Group: Members
Posts: 323
Joined: 3-August 08
From: UK
Member No.: 56644



QUOTE (soulsearchingsun @ Feb 6 2013, 13:34) *
So, where is TOS#8 when everyone agrees on looking at graphs? Not trying to troll, I just smell some bigotry.

I doubt you'd be happy if a straight PCM file conversion (say, from AIFF to WAV) introduced artefacts; why settle for less when resampling is involved? Even if you can't hear the artefacts immediately, you run the risk that you might hear them later (e.g. on a better system, or after further processing).
phofman
post Feb 12 2013, 15:01
Post #29





Group: Members
Posts: 268
Joined: 14-February 12
Member No.: 97162



QUOTE (bandpass @ Feb 5 2013, 18:23) *
The conversion quality is the same (graphs for the newer version are @ infinitewave under Audacity 2.0.3); the performance is higher cos it's faster than before.


Do I understand correctly that Audacity 2.0.3 uses the new ffmpeg, and through it libsoxr, for resampling? I would like to add a new resampling option to the Linux ALSA rate plugin. Until recently I thought the SoX implementation was the best from the quality/speed POV. Using a library compatible with the libsamplerate API would make the addition simple, as libsamplerate is already supported by the plugin.

Thanks for the info.
bandpass
post Feb 12 2013, 16:13
Post #30





Group: Members
Posts: 323
Joined: 3-August 08
From: UK
Member No.: 56644



Audacity is using libsoxr directly (native API). LameDrop, OTOH, is using libsoxr's libsamplerate-like API.

If you're linking to libsamplerate dynamically, you may be able to try out libsoxr before any recompiling, by simply moving/renaming the libsoxr-lsr.so in place of the libsamplerate one (I tried this successfully with pulseaudio and saw a 4-5 times speed up @ SRC_SINC_MEDIUM_QUALITY).

Note, however, that varying the sample rate in real time is only supported in libsoxr through its native API, so this trick wouldn't work in that case.

HTH.
saratoga
post Feb 12 2013, 16:50
Post #31





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE ([JAZ] @ Feb 6 2013, 17:36) *

As always, there is a tradeoff between speed and accuracy, and in this field there are even different techniques in play.


There are two good ways to resample:

- Using SINC ( sin(x)/x ) interpolation.
- Using a decimator/interpolator combination.

In both cases, a filter is needed to reconstruct the signal, and the quality of that filter reflects how fast and how clean it is in removing frequency imaging, without adding other unwanted distortions.


Aren't these two ways of implementing essentially the same process since the decimating/interpolating filter will probably have pretty close to a sinc form?

FWIW, I would add polynomial interpolation as the main alternative to sinc interpolation. Most software uses one or the other.
halb27
post Feb 12 2013, 18:30
Post #32





Group: Members
Posts: 2414
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



QUOTE ([JAZ] @ Feb 6 2013, 23:36) *

... Any SINC interpolator should give a clean signal ...

Can you tell me please which software uses SINC?


--------------------
lame3100m --bCVBR 300
[JAZ]
post Feb 12 2013, 20:25
Post #33





Group: Members
Posts: 1711
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



@ halb27: I haven't checked other software, but a sinc method is implemented in Psycle. (And at least as far back as 2003 or so, there was a tracker named Aodix that had a 64-point sinc interpolator for non-realtime rendering.) Here you can see the code of Psycle:

http://sourceforge.net/p/psycle/code/10455...rs/dsp.hpp#l544
search for: static float band_limited

The sinc table is calculated in this other file, and a Blackman window is applied to make it finite.
http://sourceforge.net/p/psycle/code/10455...helpers/dsp.cpp

This implementation still lacks a filter (I am trying to find a good one that doesn't require too much lookahead, but I might end up calling soxr's variable-rate mode if I can't find a good way to do it).
I have a file as old as 2001 (I had to double-check to be sure of the year) that shows a sinc interpolator and applies a filter by modifying the sinc speed. But this way of filtering decays slowly and, in what I've tested, alters the frequencies too.




@saratoga: I am not an expert in DSP or maths (I did study the Fourier transform at university, but it was applied to signals in general, not specifically to sound).
But with my knowledge of resampling (i.e. what I've tried to learn), sinc interpolation is considered the ideal resampling method (which also means it is not possible in finite time or with finite signals) because the sinc is the impulse response of an ideal brickwall lowpass filter. Real implementations have to window the sinc in order to make it finite and so limit the number of samples needed to calculate the output. (See Psycle's implementation.)

In contrast, decimating and interpolating is a two-step method which first upsamples using the zero-stuffing method, applies a lowpass filter at the lower of the two sample rates, and then downsamples by taking samples directly from the lowpassed signal. The difficult part is getting the rate to upsample to, which is the least common multiple of the two rates.
In some way, this is how a DAC works (except that then, the result is a continuous signal).
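The zero-stuff, lowpass, and pick-samples procedure described above can be sketched as follows. This is a deliberately naive illustration, not any particular library's method: the function names, the Hann window, and the `taps_per_phase` parameter are assumptions chosen for brevity, and real implementations skip the multiplications by the stuffed zeros. For 44100 Hz to 48000 Hz the common rate is 7,056,000 Hz (up by 160, down by 147).

```python
import math
from math import gcd

def lowpass_taps(cutoff, num_taps):
    """Hann-windowed sinc; cutoff in cycles/sample (0..0.5), num_taps odd."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def rational_resample(samples, rate_in, rate_out, taps_per_phase=16):
    g = gcd(rate_in, rate_out)
    up, down = rate_out // g, rate_in // g
    # Step 1: zero-stuff up to the common rate rate_in * up, which is the
    # least common multiple of the two rates.
    stuffed = [0.0] * (len(samples) * up)
    for i, s in enumerate(samples):
        stuffed[i * up] = s * up  # gain of 'up' makes up for the inserted zeros
    # Step 2: lowpass at half of the lower of the two rates.
    cutoff = 0.5 / max(up, down)
    taps = lowpass_taps(cutoff, 2 * taps_per_phase * max(up, down) + 1)
    delay = (len(taps) - 1) // 2  # centre the filter so the output is not shifted
    # Step 3: compute only every 'down'-th sample of the filtered signal.
    out = []
    for m in range(0, len(stuffed), down):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = m + delay - j
            if 0 <= idx < len(stuffed):
                acc += h * stuffed[idx]
        out.append(acc)
    return out
```

Note that the lowpass in step 2 is itself a windowed sinc here, which is the point raised later in the thread about the two methods being closely related.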


I have included the polynomial interpolators in the "other resamplers", since they are, in some way, approximations, or concepts applied to sound when sometimes they originated in graphics ( splines, for example, is more about visuals than samples).

This post has been edited by [JAZ]: Feb 12 2013, 20:29
lvqcl
post Feb 12 2013, 20:39
Post #34





Group: Developer
Posts: 3219
Joined: 2-December 07
Member No.: 49183



QUOTE ([JAZ] @ Feb 12 2013, 23:25) *
The sinc table is calculated in this other file, and a Blackman window is applied to make it finite.

AFAIK LAME also uses a Blackman-windowed sinc.
2012
post Feb 13 2013, 15:49
Post #35





Group: Members
Posts: 63
Joined: 7-February 12
Member No.: 96993



QUOTE (phofman @ Feb 12 2013, 16:01) *
QUOTE (bandpass @ Feb 5 2013, 18:23) *
The conversion quality is the same (graphs for the newer version are @ infinitewave under Audacity 2.0.3); the performance is higher cos it's faster than before.


I would like to add a new resampling option to linux alsa rate plugin.


I already did this. Replicating the libsamplerate code in alsa-plugins with some gluing just works (with some quality modes).

Here is a patch that implements soxr_lsr_{HQ,MQ,LQ} :
http://ompldr.org/vaGc3Ng/Initial-soxr-lsr-support.patch

I shared more info here:
http://www.hydrogenaudio.org/forums/index....st&p=817595
knutinh
post Feb 13 2013, 16:04
Post #36





Group: Members
Posts: 568
Joined: 1-November 06
Member No.: 37047



QUOTE ([JAZ] @ Feb 12 2013, 20:25) *

@saratoga: I am not an expert in DSP or maths (I did study the Fourier transform at university, but it was applied to signals in general, not specifically to sound).
But with my knowledge of resampling (i.e. what I've tried to learn), sinc interpolation is considered the ideal resampling method (which also means it is not possible in finite time or with finite signals) because the sinc is the impulse response of an ideal brickwall lowpass filter. Real implementations have to window the sinc in order to make it finite and so limit the number of samples needed to calculate the output. (See Psycle's implementation.)

In contrast, decimating and interpolating is a two-step method which first upsamples using the zero-stuffing method, applies a lowpass filter at the lower of the two sample rates, and then downsamples by taking samples directly from the lowpassed signal. The difficult part is getting the rate to upsample to, which is the least common multiple of the two rates.
In some way, this is how a DAC works (except that then, the result is a continuous signal).

The "ideal" lowpass filter is a sin(x)/x function of infinite extent. Sampling theory tells us that using such a filter allows for perfect sampling and perfect reconstruction of a continuous signal up to (but not including) fs/2.

If you think about it, resampling can be considered to be a reconstruction/sampling process, the same theory applies.
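Written out, the reconstruction/sampling view looks like this. The notation is an assumption introduced here (it is not from the post): T = 1/fs is the input sample period, T' = 1/fs' the output period, and sinc is the normalised form of sin(x)/x.

```latex
% Reconstruction of a signal sampled at f_s = 1/T and bandlimited below f_s/2
x(t) = \sum_{n=-\infty}^{\infty} x[n]\,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right),
\qquad \operatorname{sinc}(u) = \frac{\sin(\pi u)}{\pi u}

% Resampling to f_s' = 1/T' \ge f_s: evaluate the reconstruction at t = mT'
y[m] = x(mT') = \sum_{n=-\infty}^{\infty} x[n]\,\operatorname{sinc}\!\left(\frac{mT' - nT}{T}\right)

% Resampling to f_s' < f_s: widen the kernel so that it cuts off at f_s'/2
y[m] = \frac{T}{T'} \sum_{n=-\infty}^{\infty} x[n]\,\operatorname{sinc}\!\left(\frac{mT' - nT}{T'}\right)
```

The sums are infinite, which is why practical resamplers truncate or window the kernel.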

There are many ways to think about this, and many ways to optimize the processing (avoiding multiplications that do not affect the output is a significant one). I believe that you come a long way by only considering lowpass filter design. Linear interpolation, cubic interpolation, (windowed) sinc can all be considered lowpass filters.
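As a small illustration of that last point, the sketch below shows that 2x upsampling by linear interpolation gives the same output as zero-stuffing followed by a 3-tap triangular FIR filter, and evaluates that filter's frequency response. The function names are made up for this example.

```python
import math

def upsample2_linear(samples):
    """2x upsampling by inserting the midpoint between neighbouring samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out += [a, (a + b) / 2.0]
    out.append(samples[-1])
    return out

def upsample2_fir(samples, taps=(0.5, 1.0, 0.5)):
    """The same result via zero-stuffing plus a triangular lowpass FIR."""
    stuffed = []
    for s in samples:
        stuffed += [s, 0.0]
    stuffed.pop()  # drop the trailing zero so both outputs have equal length
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = n + 1 - j  # +1 centres the 3-tap filter on sample n
            if 0 <= idx < len(stuffed):
                acc += h * stuffed[idx]
        out.append(acc)
    return out

def magnitude_db(taps, freq):
    """Filter magnitude in dB at freq (cycles/sample at the upsampled rate)."""
    re = sum(h * math.cos(2 * math.pi * freq * n) for n, h in enumerate(taps))
    im = -sum(h * math.sin(2 * math.pi * freq * n) for n, h in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-12))
```

The triangular filter's response is 1 + cos(2*pi*f): it falls only gradually across the region where the images sit (0.25 to 0.5 cycles/sample), which is one way to see why linear interpolation leaves so much imaging compared with a long windowed sinc.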

-k
saratoga
post Feb 13 2013, 18:32
Post #37





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE ([JAZ] @ Feb 12 2013, 14:25) *

@saratoga: I am not an expert in DSP or maths (I did study the Fourier transform at university, but it was applied to signals in general, not specifically to sound).
But with my knowledge of resampling (i.e. what I've tried to learn), sinc interpolation is considered the ideal resampling method (which also means it is not possible in finite time or with finite signals) because the sinc is the impulse response of an ideal brickwall lowpass filter. Real implementations have to window the sinc in order to make it finite and so limit the number of samples needed to calculate the output. (See Psycle's implementation.)

In contrast, decimating and interpolating is a two-step method which first upsamples using the zero-stuffing method, applies a lowpass filter at the lower of the two sample rates, and then downsamples by taking samples directly from the lowpassed signal.


A lowpass filter is a windowed sinc. So you propose two methods, one of which fits a windowed sinc to calculate values, and another which ... fits a windowed sinc to calculate values.

These are just two ways of implementing the same algorithm; which is preferred is just an implementation detail that depends on the exact needs of the resampler. Trying to draw some abstract distinction between them is silly.

QUOTE ([JAZ] @ Feb 12 2013, 14:25) *

I have included the polynomial interpolators in the "other resamplers", since they are, in some way, approximations, or concepts applied to sound when sometimes they originated in graphics


They didn't originate in graphics; they originated in 17th-century boundary value problems. They're general numerical techniques, and as such both polynomial and windowed sinc interpolation are widely used in audio and graphics.

QUOTE ([JAZ] @ Feb 12 2013, 14:25) *

( splines, for example, is more about visuals than samples).


What is it you think digital images are made of if not samples?
phofman
post Feb 13 2013, 20:29
Post #38





Group: Members
Posts: 268
Joined: 14-February 12
Member No.: 97162



QUOTE (2012 @ Feb 13 2013, 16:49) *
I already did this. Replicating the libsamplerate code in alsa-plugins with some gluing just works (with some quality modes).


Fantastic, great work. Please would you mind sending the patch to the alsa-devel mailing list to have it included upstream? It would help a number of people. Thanks a lot in advance.
[JAZ]
post Feb 13 2013, 22:11
Post #39





Group: Members
Posts: 1711
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



@ saratoga:
As I said, my knowledge of this is average, and I might even use the wrong words sometimes.

That said, I'm not sure I agree completely with what you say.
A -> B does not necessarily equal B -> A

A sinc filter (which has a sinc function as its impulse response) is a lowpass filter, but a lowpass filter is not necessarily a sinc filter. I don't have the math knowledge to design filters (or to fully understand poles and zeroes), but I understand that the polynomials generated are not just "sinc approximations".
Just as the most basic lowpass filter is not a sinc filter: o0 + (i1-o0)*FC
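The one-line filter quoted above is a one-pole recursive lowpass. A minimal Python sketch, assuming o0 is the previous output, i1 the current input, and FC a coefficient between 0 and 1 (the function name is made up for this example):

```python
def one_pole_lowpass(samples, fc):
    """y[n] = y[n-1] + (x[n] - y[n-1]) * fc, starting from y[-1] = 0."""
    out = []
    prev = 0.0
    for x in samples:
        prev = prev + (x - prev) * fc
        out.append(prev)
    return out
```

Feeding it a single impulse, `one_pole_lowpass([1.0, 0.0, 0.0, 0.0], 0.5)`, gives a decaying exponential, fc * (1 - fc)^n, rather than anything sinc-shaped, and its roll-off is very gentle (about 6 dB per octave), so on its own it is far from the brickwall response a resampler wants.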


You reduced the two methods I described to using a lowpass filter. In essence, this is true (we want to get a lowpassed signal to avoid aliasing), but I wanted to differentiate the theory from the result. Example:
A linear interpolation is an intuitive way to find the value between two points, but it is not based on theory that reconstructs the path that a continuous bandlimited signal would take.

In that way, I made a distinction between the sinc method and the decimate/interpolate method because they do have a theory related to sound behind them, but it is not the same theory. (Or, let's say, one is the theory applied directly and the other is derived from the theory, in that the second one does not necessarily imply a sinc filter, even though that is the ideal one.)
I can accept that the decimate/interpolate method is akin to doing a fixed-point implementation of a floating-point one, so in essence they do the same thing. But as implementations, they reach the solution differently.


About polynomials, I admit I might have been too quick. I overlooked the math history, but again I was mentioning concrete methods while you mention the concepts on which they are based. Polynomials serve many purposes, and not all of them apply to bandlimited signals.
I mentioned graphics, because the word "spline" does describe that, a line (a visual concept).

Images are indeed made of samples, but.. what is the equivalent of the Shannon theorem for images? I could accept that images are bandlimited (there's a finite spectrum represented by the sampled image colours, but even then, the RGB points are the representation of the image in the time domain?)
Rotareneg
post Feb 14 2013, 05:39
Post #40





Group: Members
Posts: 190
Joined: 18-March 05
From: Non-Euclidean
Member No.: 20701



QUOTE ([JAZ] @ Feb 13 2013, 15:11) *
Images are indeed made of samples, but.. what is the equivalent of the Shannon theorem for images? I could accept that images are bandlimited (there's a finite spectrum represented by the sampled image colours, but even then, the RGB points are the representation of the image in the time domain?)


You might want to look at the Wikipedia article on the Nyquist–Shannon sampling theorem as it has a section specifically about multivariable sampling (images for example.)
saratoga
post Feb 14 2013, 07:01
Post #41





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE ([JAZ] @ Feb 13 2013, 16:11) *

@ saratoga:
As I said, my knowledge of this is average, and I might even use the wrong words sometimes.

That said, I'm not sure I agree completely with what you say.
A -> B does not necessarily equal B -> A

A sinc filter (which has a sinc function as its impulse response) is a lowpass filter, but a lowpass filter is not necessarily a sinc filter. I don't have the math knowledge to design filters (or to fully understand poles and zeroes), but I understand that the polynomials generated are not just "sinc approximations".


I think you misunderstand. My point is that the two methods you suggested (sinc interpolation and decimator/interpolator) are basically the same thing.

I brought up polynomials as an example of a different approach. My point is that there are basically two families of resamplers in widespread use: sinc-based and polynomial-based.

QUOTE ([JAZ] @ Feb 13 2013, 16:11) *

A linear interpolation is an intuitive way to find the value between two points, but it is not based on theory that reconstructs the path that a continuous bandlimited signal would take.

In that way, I made a distinction between the sinc method and the decimate/interpolate method because they do have a theory related to sound behind them, but it is not the same theory.


To be clear, the decimate/interpolate method always uses a sinc filter (or close approximation thereof) in practice. Linear interpolation can be combined with decimation/interpolation (using a sinc filter), but in practice never is because that rather defeats the purpose.

QUOTE ([JAZ] @ Feb 13 2013, 16:11) *

I can accept that the decimate/interpolate method is akin to doing a fixed-point implementation of a floating-point one, so in essence they do the same thing. But as implementations, they reach the solution differently.


I don't accept that. What does decimation/interpolation have to do with machine precision? You can do it integer, fixed point, floating point, decimal, whatever.

QUOTE ([JAZ] @ Feb 13 2013, 16:11) *
I mentioned graphics, because the word "spline" does describe that, a line (a visual concept).


huh?

QUOTE ([JAZ] @ Feb 13 2013, 16:11) *

Images are indeed made of samples, but.. what is the equivalent of the Shannon theorem for images? I could accept that images are bandlimited


Pixels are samples in 2D spaces, just as voxels are samples in 3D spaces. The sampling theorem is universally applicable to all N-dimensional spaces.

QUOTE ([JAZ] @ Feb 13 2013, 16:11) *

(there's a finite spectrum represented by the sampled image colours, but even then, the RGB points are the representation of the image in the time domain?)


An RGB image is simply 3 independent grayscale images recorded using a color filter.
bandpass
post Feb 14 2013, 08:04
Post #42





Group: Members
Posts: 323
Joined: 3-August 08
From: UK
Member No.: 56644



QUOTE (halb27 @ Feb 12 2013, 17:30) *
Can you tell me please which software uses SINC?

First, go to http://src.infinitewave.ca/
For the top graph, select Test Result = Impulse, then press the > button to look at each resampler in turn:
  1. If the impulse looks broadly like Audacity 2.0.3 (perhaps with shorter tails), it's a close approximation to a sinc. Could be obtained by windowing sinc, but could also be other techniques such as Parks-McClellan optimal FIR. Longer tails equate to steeper/deeper filters.
  2. If it looks broadly like AbletonLive 8.2, then it's a sinc approximation that's been phase-adjusted to be causal (i.e. non-linear phase).
  3. If it looks roughly like AbletonLive 7 or Waveburner 1.2, then it's low-order polynomial. Even this is a simple approximation to a sinc, but gives correspondingly poor results.
  4. Others, like FL Studio 10 (6 Point Hermite), use other ways to approximate a sinc, but again with not so good results.
  • Most are type #1.
  • Soundhack looks like it should be type #1 but has implementation errors.
  • SIR2 and a few others are type #1 but inverted (bug).
  • Wavosaur 1.0.3.0 looks like a design error: polynomial followed by a sinc. (A multi-stage approach is perfectly valid; however, a good multi-stage design will still give a sinc impulse response.)
So the answer is that they all use a sinc approximation, but the closer the approximation, the closer the result to the 'ideal'.
halb27
post Feb 14 2013, 09:07
Post #43





Group: Members
Posts: 2414
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



So SSRC high precision is a good approximation to SINC?


bandpass
post Feb 14 2013, 13:48
Post #44





Group: Members
Posts: 323
Joined: 3-August 08
From: UK
Member No.: 56644



SSRC uses kaiser-windowed sinc.
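For readers wondering what a Kaiser-windowed sinc means in practice: the Kaiser window has a shape parameter (beta) that trades filter length against stopband attenuation, so a design can be targeted at a chosen attenuation and transition width. The sketch below uses Kaiser's well-known empirical formulas; it is not SSRC's code, and the function names and parameters are assumptions for this example.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    term, total, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total

def kaiser_beta(atten_db):
    """Kaiser's empirical mapping from stopband attenuation (dB) to beta."""
    if atten_db > 50:
        return 0.1102 * (atten_db - 8.7)
    if atten_db >= 21:
        return 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    return 0.0

def kaiser_lowpass(cutoff, transition, atten_db):
    """Lowpass FIR taps; cutoff and transition width in cycles/sample."""
    # Kaiser's length estimate, with the transition width converted to radians.
    est = (atten_db - 7.95) / (2.285 * 2 * math.pi * transition)
    num_taps = max(3, int(math.ceil(est)) + 1) | 1  # force an odd length
    beta = kaiser_beta(atten_db)
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        r = x / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    return taps

def response_db(taps, freq):
    """Magnitude response in dB at freq (cycles/sample)."""
    re = sum(h * math.cos(2 * math.pi * freq * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(2 * math.pi * freq * n) for n, h in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-15))
```

For instance, `kaiser_lowpass(0.25, 0.05, 100.0)` asks for roughly 100 dB of stopband attenuation with a transition band 0.05 cycles/sample wide; asking for more attenuation or a narrower transition makes the filter longer, which is the quality versus speed tradeoff discussed earlier in the thread.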
halb27
post Feb 14 2013, 19:03
Post #45





Group: Members
Posts: 2414
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



Sorry, I don't know what this means in terms of quality. Is SSRC high precision a good choice in this respect?


saratoga
post Feb 14 2013, 19:12
Post #46





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (halb27 @ Feb 14 2013, 13:03) *
Sorry, I don't know what this means in terms of quality.


You can check the quality using this link:

http://src.infinitewave.ca/

As you can see it is very good.
bandpass
post Feb 14 2013, 22:28
Post #47





Group: Members
Posts: 323
Joined: 3-August 08
From: UK
Member No.: 56644



It has some problems upsampling though:


halb27
post Feb 14 2013, 23:05
Post #48





Group: Members
Posts: 2414
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



No problem for me, I'm only interested in downsampling.


saratoga
post Feb 15 2013, 01:53
Post #49





Group: Members
Posts: 4718
Joined: 2-September 02
Member No.: 3264



QUOTE (bandpass @ Feb 14 2013, 16:28) *
It has some problems upsampling though:


Are you referring to the background signals? If so, I wouldn't really call them a 'problem' given that they're spectrally flat and 110 dB below peak.
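To put a figure like 110 dB below peak in perspective, the short sketch below converts decibels to an amplitude ratio and compares against the theoretical signal-to-quantization-noise ratio of an N-bit signal with a full-scale sine (the standard 6.02*N + 1.76 dB rule of thumb). The function names are made up for this example.

```python
def db_to_amplitude_ratio(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def quantization_snr_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine."""
    return 6.02 * bits + 1.76
```

`db_to_amplitude_ratio(-110.0)` is about 3.2e-6 of peak amplitude, while `quantization_snr_db(16)` is about 98 dB, so artefacts at that level sit below the noise floor of undithered 16-bit audio.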
Wombat
post Feb 15 2013, 03:42
Post #50





Group: Members
Posts: 950
Joined: 7-October 01
Member No.: 235



QUOTE (bandpass @ Feb 14 2013, 23:28) *
It has some problems upsampling though:

Heh! Just think about what the main purpose of SSRC was back in 2001 when Naoki wrote this little gem. The need to upsample anything to 192 kHz was most likely as far away as rabbits taking over planet Mars :)
