SoundExpert explained, Methodology issues
Serge Smirnoff
post Nov 30 2010, 17:38
Post #51





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (2Bdecided @ Nov 30 2010, 19:24) *
How can you say this when SebG and Woodinville both gave you examples to the contrary?

I hit the exact problem Woodinville describes using the method I posted on the first page of this thread - a listener gets stuck in a "false" minima of audibility because double the difference gives you the original signal back (with the part "removed" by the codec being inverted, but that difference is not usually audible). Hardly monotonic - the chance of hearing the artefact becomes zero at a single gain setting (+6dB), and (with the specific audio I used - YMMV!) leaps back to the "expected" function very quickly either side of that.

In many papers devoted to "coding margin" a special filtering is recommended to eliminate those "ghost" frequencies. We also use it.
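
For readers following along, the +6 dB case in the quote above can be reproduced with a toy signal. This is only a sketch under assumptions: the "codec" is a perfect low-pass that drops one tone, and the test signal is formed as `ref + gain * (out - ref)`, which may not be exactly the method posted on the first page. All names are made up for illustration.

```python
import math

def tone(freq_hz, n, rate=8000.0, amp=1.0):
    """Return n samples of a sine tone."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

n = 800
low = tone(200.0, n)              # part the hypothetical codec keeps
high = tone(3000.0, n, amp=0.1)   # part the hypothetical codec removes
reference = [l + h for l, h in zip(low, high)]
coded = low                       # toy "codec": a perfect low-pass

def amplify_difference(ref, out, gain):
    """Return ref + gain * (out - ref); gain 1.0 reproduces the coded signal."""
    return [r + gain * (o - r) for r, o in zip(ref, out)]

# Doubling the difference (+6 dB) gives low - high: the original spectrum
# magnitudes come back, with the removed part inverted in polarity.
doubled = amplify_difference(reference, coded, 2.0)
expected = [l - h for l, h in zip(low, high)]
max_err = max(abs(a - b) for a, b in zip(doubled, expected))
print(max_err)
```

With this construction, a gain of exactly 2 is the single setting where the removed component returns at full level, which matches the "false" minimum described in the quote.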


--------------------
keeping audio clear together - soundexpert.org
Woodinville
post Dec 1 2010, 03:11
Post #52





Group: Members
Posts: 1401
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (Serge Smirnoff @ Nov 30 2010, 08:38) *
QUOTE (2Bdecided @ Nov 30 2010, 19:24) *
How can you say this when SebG and Woodinville both gave you examples to the contrary?

I hit the exact problem Woodinville describes using the method I posted on the first page of this thread - a listener gets stuck in a "false" minima of audibility because double the difference gives you the original signal back (with the part "removed" by the codec being inverted, but that difference is not usually audible). Hardly monotonic - the chance of hearing the artefact becomes zero at a single gain setting (+6dB), and (with the specific audio I used - YMMV!) leaps back to the "expected" function very quickly either side of that.

In many papers devoted to "coding margin" a special filtering is recommended to eliminate those "ghost" frequencies. We also use it.



How do you know what "it" is? You have to work specifically to every bit rate, every bandwidth, every sampling rate, every different encoder?

This is not useful.


--------------------
-----
J. D. (jj) Johnston
Serge Smirnoff
post Dec 1 2010, 09:17
Post #53





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (Woodinville @ Dec 1 2010, 06:11) *
QUOTE (Serge Smirnoff @ Nov 30 2010, 08:38) *

In many papers devoted to "coding margin" a special filtering is recommended to eliminate those "ghost" frequencies. We also use it.



How do you know what "it" is? You have to work specifically to every bit rate, every bandwidth, every sampling rate, every different encoder?

This is not useful.

By subtracting a portion of the reference signal from the output, it's not hard to figure out which frequencies are "ghosted" and remove them with an FIR filter. So, yes, we do it for every test sample with amplified artifacts. This helps to get smoother perception curves. Every item tested at SE has its own unique curve, plotted from the results of SE listening tests. Extrapolating that curve gives the resulting quality rating for each tested item.


--------------------
keeping audio clear together - soundexpert.org
2Bdecided
post Dec 1 2010, 16:26
Post #54


ReplayGain developer


Group: Developer
Posts: 4945
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



I can see how this could work for a simple low pass filter, but not how it could work for SBR.

With SBR, there's nothing you can usefully present to a listener that's "just like the coded version, but with the faults a bit louder" or "just like the coded version, but with the faults a bit quieter".

It's like me singing the same song twice. You can't figure out how close the two different versions are by subtracting them or amplifying differences. Subjectively (if I was a very consistent singer) the two versions could sound basically identical, but mathematically every sample would be very different, and I can't see how what you propose could work. SBR isn't so different from this example!
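
A rough illustration of the point, not a model of SBR: two "takes" of the same 1 kHz tone that differ only by a 60 degree phase offset have identical magnitude spectra, yet their sample-by-sample difference is as strong as the signal itself. The sample rate, frequency and phase offset are arbitrary choices for the sketch.

```python
import math

def level_db(signal):
    """Mean power of a signal in dB (relative to full scale 1.0)."""
    power = sum(s * s for s in signal) / len(signal)
    return 10 * math.log10(power)

rate = 8000
n = 8000          # one second, a whole number of periods
freq = 1000.0
phase = math.pi / 3   # 60 degrees

take_a = [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]
take_b = [math.sin(2 * math.pi * freq * i / rate + phase) for i in range(n)]
diff = [b - a for a, b in zip(take_a, take_b)]

# Level of the difference relative to the first take, in dB.
relative_db = level_db(diff) - level_db(take_a)
print(round(relative_db, 3))
```

The relative level comes out at about 0 dB, so a waveform difference measure would rate the second take as heavily "distorted" even though a steady tone with a different starting phase sounds the same.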

Cheers,
David.
Woodinville
post Dec 1 2010, 22:03
Post #55





Group: Members
Posts: 1401
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (Serge Smirnoff @ Dec 1 2010, 00:17) *
QUOTE (Woodinville @ Dec 1 2010, 06:11) *
QUOTE (Serge Smirnoff @ Nov 30 2010, 08:38) *

In many papers devoted to "coding margin" a special filtering is recommended to eliminate those "ghost" frequencies. We also use it.



How do you know what "it" is? You have to work specifically to every bit rate, every bandwidth, every sampling rate, every different encoder?

This is not useful.

By subtracting a portion of the reference signal from the output, it's not hard to figure out which frequencies are "ghosted" and remove them with an FIR filter. So, yes, we do it for every test sample with amplified artifacts. This helps to get smoother perception curves. Every item tested at SE has its own unique curve, plotted from the results of SE listening tests. Extrapolating that curve gives the resulting quality rating for each tested item.


So, it's "by clip". This still seems useless.


--------------------
-----
J. D. (jj) Johnston
Kees de Visser
post Dec 1 2010, 23:47
Post #56





Group: Members
Posts: 612
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (Woodinville @ Dec 1 2010, 23:03) *
This still seems useless.
So which options are available to reveal sub-threshold differences in a listening test ?
Woodinville
post Dec 1 2010, 23:55
Post #57





Group: Members
Posts: 1401
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (Kees de Visser @ Dec 1 2010, 14:47) *
QUOTE (Woodinville @ Dec 1 2010, 23:03) *
This still seems useless.
So which options are available to reveal sub-threshold differences in a listening test ?


This leads to a very simple question: What does "sub-threshold differences in a listening test" mean?

Therein lies, perhaps, the underlying philosophical problem here.


--------------------
-----
J. D. (jj) Johnston
greynol
post Dec 2 2010, 06:47
Post #58





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (Woodinville @ Dec 1 2010, 14:55) *
sub-threshold differences

:wacko:


--------------------
Your eyes cannot hear.
Serge Smirnoff
post Dec 2 2010, 08:53
Post #59





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (Woodinville @ Dec 2 2010, 02:55) *
This leads to a very simple question: What does "sub-threshold differences in a listening test" mean?

Therein lies, perhaps, the underlying philosophical problem here.

No, the question stands without the "in a listening test" part: what do sub-threshold differences mean?
It is probably something that distinguishes, say, aac@192 from aac@256.

I'm not sure about philosophical, but a problem of definitions certainly exists in this case.

EDIT: ... or maybe a contradiction between objective and subjective plays some role here ...

This post has been edited by Serge Smirnoff: Dec 2 2010, 09:08


--------------------
keeping audio clear together - soundexpert.org
Serge Smirnoff
post Dec 2 2010, 09:01
Post #60





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (Woodinville @ Dec 2 2010, 01:03) *
So, it's "by clip". This still seems useless.

Yes, by clip, like in ordinary listening tests.


--------------------
keeping audio clear together - soundexpert.org
Kees de Visser
post Dec 2 2010, 09:35
Post #61





Group: Members
Posts: 612
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (Woodinville @ Dec 2 2010, 00:55) *
This leads to a very simple question: What does "sub-threshold differences in a listening test" mean?
Differences that can be proven to exist with technical means, but are undetectable with a standard listening test.

Let me try this analogy:
Someone has to leave the next day on a 6-month boat trip. He has to prepare canned food and can choose between two unlabeled lots that look identical. Someone told him that the lots have different "best before" dates: one expires in 1 month, the other in 10 months. He tastes a bit from each, but both taste absolutely identical. He knows that best before dates don't mean that the food will be bad the day after, but his chances to survive the trip are probably bigger when he picks the fresher one.
(btw, the boat is too small to take both)
Serge Smirnoff
post Dec 2 2010, 09:41
Post #62





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (2Bdecided @ Dec 1 2010, 19:26) *
With SBR, there's nothing you can usefully present to a listener that's "just like the coded version, but with the faults a bit louder" or "just like the coded version, but with the faults a bit quieter".

Why not? If there is a difference from the main signal, then there is something to present. The main question is how well such differences represent the drawbacks that really matter to the HAS (human auditory system). There are probably some psychoacoustic tricks that are poorly covered by the metric. Then the usual question is: to what extent will such cases affect the final rating? All metrics have their limits.
QUOTE (2Bdecided @ Dec 1 2010, 19:26) *
It's like me singing the same song twice. You can't figure out how close the two different versions are by subtracting them or amplifying differences. Subjectively (if I was a very consistent singer) the two versions could sound basically identical, but mathematically every sample would be very different, and I can't see how what you propose could work. SBR isn't so different from this example!

Why not as well?


--------------------
keeping audio clear together - soundexpert.org
greynol
post Dec 2 2010, 10:34
Post #63





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (Kees de Visser @ Dec 2 2010, 00:35) *
Differences that can be proven to exist with technical means, but are undetectable with a standard listening test.

...and the intended purpose of transparent perceptual compression is to satisfy the latter. The rest is little more than mental masturbation.


--------------------
Your eyes cannot hear.
2Bdecided
post Dec 2 2010, 11:25
Post #64


ReplayGain developer


Group: Developer
Posts: 4945
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Kees de Visser @ Dec 2 2010, 08:35) *
QUOTE (Woodinville @ Dec 2 2010, 00:55) *
This leads to a very simple question: What does "sub-threshold differences in a listening test" mean?
Differences that can be proven to exist with technical means, but are undetectable with a standard listening test.

Let me try this analogy:
Someone has to leave the next day on a 6-month boat trip. He has to prepare canned food and can choose between two unlabeled lots that look identical. Someone told him that the lots have different "best before" dates: one expires in 1 month, the other in 10 months. He tastes a bit from each, but both taste absolutely identical. He knows that best before dates don't mean that the food will be bad the day after, but his chances to survive the trip are probably bigger when he picks the fresher one.
(btw, the boat is too small to take both)
The best before date is a simple function - an apples to apples comparison - you know that 6 months is better than 5 months. You also know that what you want to do (go out longer in the boat) relates to what you are measuring (how long the food will last).

Comparing codecs isn't like this at all. Comparing codecs is an apples to oranges comparison - you don't know that artefacts 6dB below threshold are better than artefacts 5dB below threshold - 1) because the characteristic of the artefacts could be different, and 2) you haven't said what "better" means. Better for what? Not for just listening (either is fine), so for what?

Cheers,
David.
2Bdecided
post Dec 2 2010, 11:32
Post #65


ReplayGain developer


Group: Developer
Posts: 4945
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Serge Smirnoff @ Dec 2 2010, 08:41) *
QUOTE (2Bdecided @ Dec 1 2010, 19:26) *
It's like me singing the same song twice. You can't figure out how close the two different versions are by subtracting them or amplifying differences. Subjectively (if I was a very consistent singer) the two versions could sound basically identical, but mathematically every sample would be very different, and I can't see how what you propose could work. SBR isn't so different from this example!
Why not as well?
There must be some disconnect here, because this doesn't make sense to me. Either I don't understand what you mean, or you don't understand what I mean.

If I sing the same thing twice, what do you do to these two files to present them on SoundExpert.com?

Cheers,
David.
Serge Smirnoff
post Dec 2 2010, 12:18
Post #66





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (2Bdecided @ Dec 2 2010, 14:32) *
If I sing the same thing twice, what do you do to these two files to present them on SoundExpert.com?

There is nothing to present on SE in this case.
Or we have to define the first recording as the reference and ask how different (bad) the second one is. Then why not amplify the difference to some extent?


--------------------
keeping audio clear together - soundexpert.org
Kees de Visser
post Dec 2 2010, 13:09
Post #67





Group: Members
Posts: 612
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (2Bdecided @ Dec 2 2010, 12:25) *
Comparing codecs isn't like this at all. Comparing codecs is an apples to oranges comparison - you don't know that artefacts 6dB below threshold are better than artefacts 5dB below threshold - 1) because the characteristic of the artefacts could be different, and 2) you haven't said what "better" means. Better for what? Not for just listening (either is fine), so for what?
Do we agree that there are 3 types of quality levels, from better to worse:
1- artefacts are non-existent (-inf), like in lossless coding
2- artefacts are below the hearing threshold
3- artefacts are audible, by at least one listener for at least one (killer)sample

In my view the better codec is the one that will remain in category 2 in any situation (e.g. inserting an Orban in the monitoring chain).

Example: original master is 24/96. Two lossy copies are made, one 16/44.1 and one mp3 at 320 kbps. Both sound identical to the master.
I would say the 16/44.1 is better than the mp3, but if you can give arguments to the contrary, I'm all ears.

QUOTE (2Bdecided @ Dec 2 2010, 12:32) *
If I sing the same thing twice, what do you do to these two files to present them on SoundExpert.com?
SoundExpert won't work for this, nor will ABX, since there's a huge risk of false positives. A lot depends on where you switch from A to B. Small tempo and pitch differences will remain unnoticed when heard in isolation, but as soon as you jump from one to the other they can become apparent. This is the daily job of an audio editor: to find the best spot to inaudibly switch from one take to another. (hint: it's not always easy and I'm glad to be paid per hour) :)
Serge Smirnoff
post Dec 2 2010, 13:10
Post #68





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (2Bdecided @ Dec 2 2010, 14:25) *
Comparing codecs isn't like this at all. Comparing codecs is an apples to oranges comparison - you don't know that artefacts 6dB below threshold are better than artefacts 5dB below threshold - 1) because the characteristic of the artefacts could be different, and 2) you haven't said what "better" means. Better for what? Not for just listening (either is fine), so for what?

The SE metric is trying to find a way to determine this. And "better" means "as if it were judged by golden ears in a perfect listening environment".
What for? I don't know, but all audio pro guys want huge quality margins for their equipment, and most listeners want FLAC even though aac@192 is transparent. Maybe they are just not clever enough.


--------------------
keeping audio clear together - soundexpert.org
2Bdecided
post Dec 2 2010, 16:04
Post #69


ReplayGain developer


Group: Developer
Posts: 4945
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Kees de Visser @ Dec 2 2010, 12:09) *
QUOTE (2Bdecided @ Dec 2 2010, 12:25) *
Comparing codecs isn't like this at all. Comparing codecs is an apples to oranges comparison - you don't know that artefacts 6dB below threshold are better than artefacts 5dB below threshold - 1) because the characteristic of the artefacts could be different, and 2) you haven't said what "better" means. Better for what? Not for just listening (either is fine), so for what?
Do we agree that there are 3 types of quality levels, from better to worse:
1- artefacts are non-existent (-inf), like in lossless coding
2- artefacts are below the hearing threshold
3- artefacts are audible, by at least one listener for at least one (killer)sample
You can certainly define 3 such categories. It also sounds like a thing that's theoretically true (whatever that means). I suspect your categories are completely useless though...

In practice, it's hard to find a codec in category 2 that gives a significant bitrate saving over those in category 1.

It's rather difficult to prove that the codec is in category 2 rather than 3. You've got to get everyone in the world to listen carefully to every possible audio signal.

QUOTE
In my view the better codec is the one that will remain in category 2 in any situation (e.g. inserting an Orban in the monitoring chain).
Ah, good, so now we have everyone in the world listening to every possible audio signal via every possible piece of audio processing. Excellent.

Now, seriously, even if we put the "every person" and "every audio signal" parts to one side, you must realise that for any codec which changes the signal (let's assume the change is inaudible), there must be some audio processing we can do to make that change audible. So no codec can remain in category 2 "in any situation".


QUOTE
Example: original master is 24/96. Two lossy copies are made, one 16/44.1 and one mp3 at 320 kbps. Both sound identical to the master.
I would say the 16/44.1 is better than the mp3, but if you can give arguments to the contrary, I'm all ears.
If the mp3 is made from a 16/44.1 file (as is normal) then this is silly - of course it can't be better, since it's a copy of a copy.

However, if the mp3 is made from the 24/96 by resampling to 44.1 but maintaining 24-bits, then it's kind of trivial to find a situation where the mp3 is "better":
The original master contains a signal at -110dB
The mp3 is decoded to 24-bits
The "processing" applied to the 16/44.1 wav and the decoded 320kbps mp3 is... increasing the level by 80dB.

Oh look - both sounded identical to the master before processing, but with my highly advanced processing in place (well, OK, it was a volume control!) the mp3 is revealed to be far closer to the master than the 16/44.1 version.
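
The arithmetic behind this can be sketched in a few lines. Assumptions, stated loudly: the 16-bit conversion is plain rounding with no dither (with dither the tone would survive, buried in noise), and plain 24-bit rounding stands in for "the mp3 decoded to 24 bits", since a real encoder may or may not keep a component that quiet. The 1 kHz test frequency and the helper name are arbitrary.

```python
import math

def quantize(x, bits):
    """Round to the nearest step of a signed fixed-point grid, no dither."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

amp = 10 ** (-110 / 20)    # tone at -110 dB re full scale
gain = 10 ** (80 / 20)     # the "processing": +80 dB of volume
signal = [amp * math.sin(2 * math.pi * 1000 * i / 44100) for i in range(4410)]

as_16 = [gain * quantize(s, 16) for s in signal]   # undithered 16-bit copy
as_24 = [gain * quantize(s, 24) for s in signal]   # stand-in for the 24-bit decode

peak_16 = max(abs(v) for v in as_16)
peak_24 = max(abs(v) for v in as_24)
print(peak_16, peak_24)
```

The tone is smaller than half a 16-bit step, so the undithered 16-bit copy rounds to silence, while the 24-bit version comes back at roughly -30 dB after the gain.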


These are all silly examples, but I think they prove the point - there's far too much assumption in the SoundExpert methods, or the "this thing sounds the same but must be better" statements.


QUOTE
QUOTE (2Bdecided @ Dec 2 2010, 12:32) *
If I sing the same thing twice, what do you do to these two files to present them on SoundExpert.com?
SoundExpert won't work for this, nor will ABX, since there's a huge risk of false positives. A lot depends on where you switch from A to B. Small tempo and pitch differences will remain unnoticed when heard in isolation, but as soon as you jump from one to the other they can become apparent. This is the daily job of an audio editor: to find the best spot to inaudibly switch from one take to another. (hint: it's not always easy and I'm glad to be paid per hour) :)
But SBR is "singing along" with the music without tempo and pitch differences, yet re-creating it from scratch (the original waveform is discarded). ABX works fine. Amplifying the sample-by-sample differences is meaningless.


I don't see any explanation of why the SoundExpert approach works for SBR, or accurately quantifies the subjective quality of SBR wrt "traditional" coding.


It's funny - we've seen a second revolution in audio coding. The first was when basic psychoacoustics came in, and suddenly having a waveform that was "closest" to the original was no longer the way to judge quality. With two codecs, the one which had a greater error signal could sound better.

Now with SBR and PS we have another revolution where the waveform isn't an (inaudibly) distorted version of the original, but actually bears no resemblance to the original. So any measurements that include psychoacoustics while assuming that the waveform should be at least vaguely similar are also broken.

I'm not convinced that the SoundExpert method actually survived the first revolution, but it's difficult to see how it survived the second.

ABX will survive whatever happens.


I'll eat my words if someone can provide a detailed explanation of how SoundExpert works, and prove a correlation - but if it relies on sticking plasters to undo or account for each new coding trick, it's no good generally.

Cheers,
David.
Kees de Visser
post Dec 2 2010, 17:52
Post #70





Group: Members
Posts: 612
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (2Bdecided @ Dec 2 2010, 17:04) *
QUOTE (Kees de Visser @ Dec 2 2010, 12:09) *
In my view the better codec is the one that will remain in category 2 in any situation (e.g. inserting an Orban in the monitoring chain).
Ah, good, so now we have everyone in the world listening to every possible audio signal via every possible piece of audio processing. Excellent.
Exactly, that's not very practical.
And that's the very reason why so many audio professionals prefer to offer lossless formats and let the customer decide how to process it for his/her personal use.
I remember numerous complaints from HA members about online music being only available in lossy formats. Deutsche Grammophon offers both flac and 320 kbps mp3, which makes a lot of sense IMO, even if they sound identical ;)
greynol
post Dec 2 2010, 19:15
Post #71





Group: Super Moderator
Posts: 10000
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (Kees de Visser @ Dec 2 2010, 04:09) *
2- artefacts are below the hearing threshold

I'm sure you won't be surprised to hear from me that this is an oxymoron.


--------------------
Your eyes cannot hear.
Serge Smirnoff
post Dec 2 2010, 19:24
Post #72





Group: Members
Posts: 370
Joined: 14-December 01
Member No.: 641



QUOTE (2Bdecided @ Dec 2 2010, 19:04) *
Now with SBR and PS we have another revolution where the waveform isn't an (inaudibly) distorted version of the original, but actually bears no resemblance to the original. So any measurements that include psychoacoustics while assuming that the waveform should be at least vaguely similar are also broken.

Below are the Diff. Levels of 9 SE samples processed by the HE and LC profiles of the CT encoder (@128 kbit/s):

CODE
           aac+ CBR@128.9 (Winamp 5.21)       aac CBR@129.6 (Winamp 5.21)
----------------------------------------------------------------------------------
BAH:       -34.4139 dB                           -33.6044 dB          
BAS:       -35.8823 dB                           -36.6633 dB
CST:       -9.9989 dB                            -21.8093 dB
FMS:       -30.0811 dB                           -36.1838 dB      
GLK:       -19.6055 dB                           -36.1699 dB      
HRP:       -14.6460 dB                           -21.5798 dB
LOB:       -16.6801 dB                           -22.8038 dB
MOF:       -21.3063 dB                           -31.3638 dB
QRT:       -33.6662 dB                           -33.6656 dB


The same for encoders @192 kbit/s:

CODE
           aac+ CBR@192.7 (Winamp 5.24)       AAC VBR@190.9 (NeroRef 1530)
----------------------------------------------------------------------------------
BAH:       -37.1624 dB                           -33.0144 dB          
BAS:       -39.2936 dB                           -32.5532 dB
CST:       -23.2628 dB                           -28.0893 dB
FMS:       -39.2991 dB                           -33.3562 dB      
GLK:       -33.8942 dB                           -37.5733 dB      
HRP:       -20.2250 dB                           -26.7197 dB
LOB:       -29.0020 dB                           -29.4662 dB
MOF:       -34.7739 dB                           -36.6439 dB
QRT:       -37.0264 dB                           -32.7181 dB


As you can see, the waveforms of both profiles differ from the reference waveforms to approximately the same degree. So it is an illusion that with SBR the waveform "bears no resemblance to the original". The illusion is inspired by knowledge of how SBR works. In reality both waveforms are changed to the same level (the @192 HE versions are even closer to the references than the LC ones, though the encoders and modes are different). The main question is how they are changed.
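
For anyone who wants to compute a comparable figure on their own files, here is a minimal sketch. It assumes that a difference level is the power of (output minus reference) relative to the power of the reference, in dB, with the two signals already time-aligned and gain-matched. The exact SE definition is not given in this thread, so treat the formula and the function name as assumptions.

```python
import math

def diff_level_db(reference, output):
    """Power of (output - reference) relative to the reference power, in dB."""
    diff_power = sum((o - r) ** 2 for r, o in zip(reference, output))
    ref_power = sum(r * r for r in reference)
    return 10 * math.log10(diff_power / ref_power)

# Toy check: an output that is simply 10% quieter than the reference
# leaves a difference of 0.1 * reference, i.e. a level of -20 dB.
reference = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(44100)]
output = [0.9 * r for r in reference]
level = diff_level_db(reference, output)
print(round(level, 3))
```

Note that a single number like this says how much the waveform changed, not how it changed, which is the open question above.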


--------------------
keeping audio clear together - soundexpert.org
Woodinville
post Dec 3 2010, 00:32
Post #73





Group: Members
Posts: 1401
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (Kees de Visser @ Dec 2 2010, 00:35) *
QUOTE (Woodinville @ Dec 2 2010, 00:55) *
This leads to a very simple question: What does "sub-threshold differences in a listening test" mean?
Differences that can be proven to exist with technical means, but are undetectable with a standard listening test.


Thanks to Quantum Mechanics, a good enough measurement will always be different if this is an analog signal. If it's a digital signal, well, you have a tiny leg to stand on, but still, let's take a 120 second log sweep from 20 Hz to 15 kHz. Under that by 40 dB I put a 4 kHz tone.

Now, the difference is going to be a constant 4 kHz tone. The "noise" is stationary and exactly predictable. Its audibility is going to vary enormously over the time of the sweep.

We have to use how many different exemplars or whatever to decide the audibility of this noise? The scale is continuous, so ...??????
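
A small sketch of this thought experiment, with assumptions flagged: the sweep is shortened from 120 s to 2 s to keep it quick, the sample rate is 48 kHz, and the block size is chosen to hold a whole number of 4 kHz periods. It shows only the measurable half of the argument (the difference level is flat over time); the audibility half cannot be computed this way.

```python
import math

rate = 48000
duration = 2.0                 # shortened from the 120 s in the example
f0, f1 = 20.0, 15000.0
n = int(rate * duration)
k = math.log(f1 / f0) / duration   # exponential (log) sweep rate

def sweep_sample(i):
    """Sample i of a log sweep from f0 to f1."""
    t = i / rate
    return math.sin(2 * math.pi * f0 * (math.exp(k * t) - 1) / k)

tone_amp = 10 ** (-40 / 20)    # 4 kHz tone, 40 dB under the sweep
sweep = [sweep_sample(i) for i in range(n)]
mixed = [s + tone_amp * math.sin(2 * math.pi * 4000 * i / rate)
         for i, s in enumerate(sweep)]
diff = [m - s for m, s in zip(mixed, sweep)]

def rms_db(block):
    """RMS level of a block in dB re full scale."""
    return 10 * math.log10(sum(x * x for x in block) / len(block))

block = 4800                   # 0.1 s, i.e. 400 periods of 4 kHz
levels = [rms_db(diff[j:j + block]) for j in range(0, n, block)]
spread = max(levels) - min(levels)
print(round(levels[0], 2), spread)
```

Every block reports the same difference level (about -43 dB RMS), while the sweep passes from far below 4 kHz, through it, to far above it, so the masking of that tone changes continuously even though the number does not.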


--------------------
-----
J. D. (jj) Johnston