Full Version: Another Joint Stereo Discussion
Hydrogenaudio Forums > Lossy Audio Compression > MP3 > MP3 - Tech
cAPSLOCK
Split from here.

QUOTE(Blanka @ Feb 26 2005, 01:56 AM)
sad.gif Has anyone ever had a hard job trying to dispel certain mp3 myths? Like:

1) Joint Stereo messes up the sound.
2) Joint Stereo ruins high frequencies.
3) You can always tell the difference.

etc etc.

I feel like I'm bashing my head against a brick wall trying to talk to these people...  unsure.gif
*



Although I agree with you that most people are not going to notice the difference between a stereo and a js encoded mp3, implying that the difference is mythical, or even inaudible, is plain wrong.

It seems to me that the difference is more profound when using VBR.

If you encode a file twice with the same VBR settings except for the s/js setting, then turn them back to wavs (so there is no size difference) and email me the same 10 seconds of each, I am willing to bet you I can tell them apart.

And I am wearing flame proof armor! wink.gif

cAPS
kjoonlee
cAPSLOCK, unless you can prove it yourself, by providing an uncompressed sample that sounds worse to you when encoded with mid-side joint stereo rather than "pure" stereo, and prove that you can tell the difference through double-blind testing, no one here is going to take you seriously.
cAPSLOCK
QUOTE(schonenberg @ Feb 28 2005, 07:29 PM)
Is there any reason we would ever really need simple stereo since Lame JS doesn't destroy stereo?

If no one can think of a good reason, maybe it should be taken out of Lame, or at least taken out of the --longhelp and removed from the docs. What do you think?
*



I can think of a reason.

There is an audible difference between the two. ohmy.gif I would like the choice of how I encode.

For what it's worth, I normally encode in joint stereo.

Here is one way to prove the difference.

The following two files are the S channel of the same exact file (recorded in my studio), one encoded in joint stereo and one encoded in stereo.

http://noisevault.com/temp/hyd-js.wav
http://noisevault.com/temp/hyd-s.wav

If you don't have decent speakers... listen with headphones. You will hear the difference.

Here is what I did:

Took a song and encoded it VBR 112-320 both in stereo and joint stereo. Then I did a Mid/Side encoding on it and isolated the Side channel. That's what you have downloaded.

If you don't know about mid-side encoding it's pretty simple. It is just another way to represent a stereo signal over two channels. Instead of left and right, you have the audio that is in the middle, and the audio that is on the sides. Google up M/S encoding if you want to know more.

For the sake of reference here is the original wav that was encoded. Well more or less... I cut out the pieces after I encoded the whole 4 minute files.

http://noisevault.com/temp/hyd-orig.wav

regards,
cAPS
cAPSLOCK
QUOTE(kjoonlee @ Feb 28 2005, 11:42 PM)
cAPSLOCK, unless you can prove it yourself, by providing an uncompressed sample that sounds worse to you when encoded with mid-side joint stereo rather than "pure" stereo, and prove that you can tell the difference through double-blind testing, no one here is going to take you seriously.
*



Can you tell a difference between the S channel files I posted? (about three posts up)

cAPS
kjoonlee
Why should I care? I'm only interested in testing uncompressed vs. compressed samples.
cAPSLOCK
Do you mean you don't think I could tell a j/s encoded mp3 from an uncompressed wav file? If so, then you are completely insane. wink.gif

Or am I misunderstanding you?

cAPS
kjoonlee
I was thinking you meant joint stereo was inferior. Oh well.

But nevertheless, when you registered here, you agreed to the Terms of Service.

QUOTE
8. All members that put forth a statement concerning subjective sound quality, must -- to the best of their ability -- provide objective support for their claims. Acceptable means of support are double blind listening tests (ABX or ABC/HR) demonstrating that the member can discern a difference perceptually, together with a test sample to allow others to reproduce their findings. Graphs, non-blind listening tests, waveform difference comparisons, and so on, are not acceptable means of providing support.


You have just made some statements regarding sound quality. If you can back up your claims, then please do so in a manner which is acceptable to this community, without resorting to comparisons between "artificial" samples.

I bet there are some cases where you wouldn't be able to tell a LAME 3.9x encoded 320kbps CBR mid-side j/s MP3 from an uncompressed original.
cAPSLOCK
Oh, I get it...

You are telling me that j/s vs stereo encoding is analogous to M/S encoding.

This would be true but for the mp3 encoding happening on top of it.

The bits saved by j/s encoding are GOING somewhere. wink.gif Where do you think?

I am an audio professional. It's what I do for a living. I listen to sound up close all day long over a fairly nice signal chain. I modified the A/D D/A converters I use myself. I know there is a difference between stereo and joint stereo encoding. I even provided proof that anyone should be able to hear the difference in a post above this one.

I understand I have walked into a holy topic throwing bombs... I will shut up now.

But it was this thread that made me join hydrogen audio finally after a long time of moderate trolling... I do not promise to not throw other bombs elsewhere.

love,
cAPS
kjoonlee
I was implying that mid-side, not intensity-stereo, is the proper way to do high quality joint-stereo.

It doesn't matter if you're an audio professional. If you can't back up your claims here, no one will take you seriously, as I have pointed out before.
cAPSLOCK
Woof... You are right... "Number 8" is a pretty serious part of the agreement.

I retract all my statements, as I do not have enough time to enter a double blind test over the internet (actually I am auditioning mic preamps via some blind (not double blind haha, I won't cheat my wallet) testing this week before I purchase a new one).

But wouldn't it be unfair to say things like
QUOTE
I bet there are some cases where you wouldn't be able to tell a LAME 3.9x encoded 320kbps CBR mid-side j/s MP3 from an uncompressed original.
without providing the same level of proof?

Or does the rule only apply to me?

tongue.gif

cAPS
AtaqueEG
QUOTE(cAPSLOCK @ Mar 1 2005, 12:16 AM)
Do you mean you don't think I could tell a j/s encoded mp3 from an uncompressed wav file?  If so, then you are completely insane. wink.gif

Or am I misunderstanding you?

cAPS
*



Try it (with a good encoder and settings, don't cheat wink.gif ) and then tell us.

I think you will be surprised.

And if you can, PLEASE post the sample (uncompressed) and results
odious malefactor
QUOTE(cAPSLOCK @ Feb 28 2005, 10:25 PM)
The bits saved by j/s encoding are GOING somewhere.

Uh.... where exactly are they going?
AtaqueEG
QUOTE(cAPSLOCK @ Mar 1 2005, 12:32 AM)
But wouldn't it be unfair to say things like
QUOTE
I bet there are some cases where you wouldn't be able to tell a LAME 3.9x encoded 320kbps CBR mid-side j/s MP3 from an uncompressed original.
without providing the same level of proof?

*




This has been already proven.

I'm not saying that you couldn't do it, but it is very hard.

MP3s produced by LAME 3.9x at that bitrate are pretty much transparent.

But, then again, try.
kjoonlee
It does not apply to me, because it was merely a counterclaim. The ball is in your court, but you seem to have retracted all your statements.
cAPSLOCK
QUOTE(AtaqueEG @ Mar 1 2005, 12:33 AM)

Try it (with a good encoder and settings, don't cheat  wink.gif ) and then tell us.

I think you will be surprised.

And if you can, PLEASE post the sample (uncompressed) and results
*




OK... I admit it... I am ensnared by your challenge. wink.gif

What exactly do you want me to do that will prove I can hear a difference?

If it is within reasonable limits I will do anything to either prove I am right... or you are. wink.gif

cAPS
cAPSLOCK
QUOTE(kjoonlee @ Mar 1 2005, 12:36 AM)
It does not apply to me, because it was merely a counterclaim. The ball is in your court, but you seem to have retracted all your statements.
*



Oh... I was just being dramatic. I still know I can hear a difference. wink.gif And if you want I will attempt a test, but what do I need to do to convince you?

But doesn't the general notion that "audible JS encoding differences are a myth" put forth by this very thread deserve the scrutiny of rule #8? Or is this "the doctrine of the church" and doesn't have to be supported? Is there proof already?

cAPS
kjoonlee
The myth in question is "Joint stereo, even mid-side joint stereo, is worse than pure stereo."

It has been proven wrong, under strict conditions, I am sure.
odious malefactor
QUOTE(cAPSLOCK @ Feb 28 2005, 10:39 PM)
But doesn't the general notion that "audible JS encoding differences are a myth" put forth by this very thread deserve the scrutiny of rule #8?  Or is this "the doctrine of the church" and doesn't have to be supported?  Is there proof already?

Read the FAQ
cAPSLOCK
QUOTE(odious malefactor @ Mar 1 2005, 12:34 AM)
QUOTE(cAPSLOCK @ Feb 28 2005, 10:25 PM)
The bits saved by j/s encoding are GOING somewhere.

Uh.... where exactly are they going?
*



They are going *POOF* and disappearing.

cAPS
AtaqueEG
QUOTE(cAPSLOCK @ Mar 1 2005, 12:36 AM)
What exactly do you want me to do that will prove I can hear a difference?

If it is within reasonable limits I will do anything to either prove I am right... or you are. wink.gif

cAPS
*



Find a music sample in which you can hear a difference caused by incorrect stereo information produced by joint stereo encoding.

Perform a blind test vs uncompressed wav file (using whatever bitrate is transparent for you in MP3 encoding, but I suggest --alt-preset standard or equivalent in later LAMEs) or versus MP3 encoded using full stereo.

Provide results.

That sounds reasonable, doesn't it?

And it is not a challenge. This community and the codec developers who post here are very open to criticism which is constructive, scientific and will encourage the improvement of compressed audio. We really love when somebody finds anything that will work towards the perfecting of codecs.

It would not be the first time that either result has happened.
odious malefactor
QUOTE(cAPSLOCK @ Feb 28 2005, 10:47 PM)
QUOTE(odious malefactor @ Mar 1 2005, 12:34 AM)
QUOTE(cAPSLOCK @ Feb 28 2005, 10:25 PM)
The bits saved by j/s encoding are GOING somewhere.

Uh.... where exactly are they going?
*



They are going *POOF* and disappearing.

cAPS
*


Thus making a smaller file--one of the main reasons to use mp3, no?
dev0
cAPS, if you want your claims to be taken seriously do the following:

1. Encode a sample of your choice using LAME 3.90.3 with these two commandlines:
--alt-preset standard
--alt-preset standard -m s

2. ABX both samples from the original and each other (That makes 3 ABX tests overall). I assume you - as a professional - are familiar with ABX methodology, if not read this. I recommend using foobar2000's foo_abx component.

3. Post your results and a losslessly compressed version of the sample you used for others to verify your findings.

4. Decide if you stick to your claim that "Joint stereo, even mid-side joint stereo, is worse than pure stereo."
cAPSLOCK
QUOTE(odious malefactor @ Mar 1 2005, 12:50 AM)
QUOTE(cAPSLOCK @ Feb 28 2005, 10:47 PM)
QUOTE(odious malefactor @ Mar 1 2005, 12:34 AM)
QUOTE(cAPSLOCK @ Feb 28 2005, 10:25 PM)
The bits saved by j/s encoding are GOING somewhere.

Uh.... where exactly are they going?
*



They are going *POOF* and disappearing.

cAPS
*


Thus making a smaller file--one of the main reasons to use mp3, no?
*



Certainly! This is why I use JS encoding for the most part! But by that logic, why encode at 320 or even 192? Why not just drop the file down to the lowest bitrate you possibly can for 44.1 audio? Every bit you lose changes something from the original file when using lossy encoding.

cAPS

PS Man, I hate those forum bombthrowers who promise to shut up and continue to hijack a thread... Seems I hate myself! See, I am open to being wrong. wink.gif

cAPSLOCK
QUOTE(dev0 @ Mar 1 2005, 12:56 AM)
cAPS, if you want your claims to be taken seriously do the following:

1. Encode a sample of your choice using LAME 3.90.3 with these two commandlines:
--alt-preset standard
--alt-preset standard -m s

2. ABX both samples from the original and each other (That makes 3 ABX tests overall).  I assume you - as a professional - are familiar with ABX methodology, if not read this. I recommend using foobar2000's foo_abx component.

3. Post your results and a losslessly compressed version of the sample you used for others to verify your findings.

4. Decide if you stick to your claim that "Joint stereo, even mid-side joint stereo, is worse than pure stereo."
*



Alright. I will do this and post the results.

cAPS
odious malefactor
QUOTE(cAPSLOCK @ Feb 28 2005, 10:56 PM)
But by that logic, why encode at 320 or even 192?  Why not just drop the file down to the lowest bitrate you possibly can for 44.1 audio?  Every bit you lose changes something from the original file when using lossy encoding.


However....

QUOTE(Gabriel @ Jan 20 2005, 12:13 AM)
The M/S stereo transform itself is lossless.
However, in the context of a lossy encoder, anything leading to some variations in the bits might affect the final loss.
It means that in the context of mp3 encoding, M/S stereo is as lossy as the Huffman coding.


cAPSLOCK
GOOD LORD!!! Doing that test nearly melted my brain. I kinda wish I had made a mistake so you would be more likely to believe I can hear the difference rather than work a text editor. wink.gif But I swear I didn't cheat. Before I realized the ABX plugin would output a log, I took a screenshot.

http://noisevault.com/temp/mp3test.png

To be utterly frank, I have to admit that though I have little doubt I could tell you which file was mp3 and which was wav in both tests that involved a wav file, I don't think I could do the same with the two mp3s back to back. I was never sure which was which, only that they were different.

So I learned something!

Also, I learned that the current LAME does a much better job than the last time I really did an intense listening test. MUCH better.

foo_abx v1.2 report
foobar2000 v0.8.3
2005/03/01 02:18:28

File A: file://C:\downloads\lame-3.90.3\z.mp3
File B: file://C:\downloads\lame-3.90.3\z.wav

02:18:28 : Test started.
02:18:41 : 01/01 50.0%
02:19:14 : 02/02 25.0%
02:19:48 : 03/03 12.5%
02:20:04 : 04/04 6.3%
02:20:37 : 05/05 3.1%
02:21:21 : 06/06 1.6%
02:21:41 : Test finished.

----------
Total: 6/6 (1.6%)

foo_abx v1.2 report
foobar2000 v0.8.3
2005/03/01 02:22:23

File A: file://C:\downloads\lame-3.90.3\x.mp3
File B: file://C:\downloads\lame-3.90.3\z.wav

02:22:23 : Test started.
02:23:49 : 01/01 50.0%
02:26:45 : 02/02 25.0%
02:27:12 : 03/03 12.5%
02:27:36 : 04/04 6.3%
02:28:04 : 05/05 3.1%
02:28:40 : 06/06 1.6%
02:28:47 : Test finished.

----------
Total: 6/6 (1.6%)

foo_abx v1.2 report
foobar2000 v0.8.3
2005/03/01 02:29:20

File A: file://C:\downloads\lame-3.90.3\x.mp3
File B: file://C:\downloads\lame-3.90.3\z.mp3

02:29:20 : Test started.
02:29:48 : 01/01 50.0%
02:30:25 : 02/02 25.0%
02:33:12 : 03/03 12.5%
02:33:52 : 04/04 6.3%
02:34:21 : 05/05 3.1%
02:34:47 : 06/06 1.6%
02:34:51 : Test finished.

----------
Total: 6/6 (1.6%)
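
For reference, the percentages in these logs are consistent with the chance of reaching that score by pure guessing, i.e. by flipping a coin on every trial. A minimal Python sketch of that calculation follows. The function name abx_p_value is made up for illustration and is not part of foo_abx; that foo_abx computes its figure exactly this way is an assumption.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by guessing (p = 0.5 per trial)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Running score after each of six correct answers in a row.
for n in range(1, 7):
    print(f"{n}/{n}: {abx_p_value(n, n) * 100:.2f}%")
```

Six out of six works out to 1/64, which rounds to the 1.6% shown at the end of each report.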

Here are the files I used to compare. It is from a song by King Crimson called "Matte Kudasai"

http://noisevault.com/temp/mp3test.rar

Anyway, thanks for indulging me, and for turning me on to the ABX plugin for foobar2000.

I will concede that Joint Stereo may not be worse than Stereo, but I hold firm that it is different. smile.gif

cAPS
cAPSLOCK
I also want to add two other things.

Ironically, the two mp3 files were almost identical in size in my test. Usually the JS file is a little smaller. I don't know why this was the case... I selected some audio that I am familiar with and that had the kind of frequency/dynamic content that I felt would make the test as easy to pass as possible. If I had time I would try the test on a bit of audio that has a big difference... but this whole thing took me a while in the first place (it took me a few minutes to figure out how to work the ABX component).

Secondly, I have long been a LAME cheerleader. The other folks in my industry that I meet that insist on using other mp3 encoders get a swift opinion. I am glad LAME has continued to improve. Of course I like the other lossy encoders like ogg, but mp3 is so universal.

cAPS
cabbagerat
QUOTE(cAPSLOCK @ Feb 28 2005, 10:47 PM)
QUOTE(odious malefactor @ Mar 1 2005, 12:34 AM)
QUOTE(cAPSLOCK @ Feb 28 2005, 10:25 PM)
The bits saved by j/s encoding are GOING somewhere.

Uh.... where exactly are they going?
*


They are going *POOF* and disappearing.
*


No - they aren't. Any more than the bits saved by Zip are disappearing. All the l/r to m/s transform does is change the representation of the audio data to one that can, for a given level of quality, be encoded with fewer bits. LAME joint stereo picks which representation (l/r or m/s) is likely to be most efficient for each frame and uses that representation for the frame.
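
A rough Python sketch of a per-frame choice between the two representations. The energy-ratio rule and the 0.5 threshold are assumptions invented for illustration; LAME's real switching decision comes from its psychoacoustic model and is not reproduced here.

```python
def choose_representation(left, right, threshold=0.5):
    """Toy per-frame decision: 'ms' if the side signal is small next to the mid signal."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    mid_energy = sum(m * m for m in mid)
    side_energy = sum(s * s for s in side)
    # Hypothetical rule: M/S only pays off when the channels are strongly correlated.
    return "ms" if side_energy < threshold * mid_energy else "lr"

near_mono_frame = ([1.0, 2.0, 3.0], [1.0, 2.0, 3.1])
wide_frame = ([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])
print(choose_representation(*near_mono_frame))
print(choose_representation(*wide_frame))
```

The near-mono frame has almost nothing in its side signal, so the toy rule picks M/S for it; the frame whose channels alternate picks L/R.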
schonenberg
QUOTE(cAPSLOCK @ Mar 1 2005, 12:04 AM)
QUOTE(schonenberg @ Feb 28 2005, 07:29 PM)
Is there any reason we would ever really need simple stereo since Lame JS doesn't destroy stereo?

If no one can think of a good reason, maybe it should be taken out of Lame, or at least taken out of the --longhelp and removed from the docs. What do you think?
*



I can think of a reason.

There is an audible difference between the two. ohmy.gif I would like the choice of how I encode.

For what it's worth, I normally encode in joint stereo.

Here is one way to prove the difference.

The following two files are the S channel of the same exact file (recorded in my studio) one encoded in joint stereo and one encoded in stereo.

http://noisevault.com/temp/hyd-js.wav
http://noisevault.com/temp/hyd-s.wav

If you don't have decent speakers... listen with headphones. You will hear the difference.

Here is what I did:

Took a song and encoded it VBR 112-320 both in stereo and joint stereo. Then I did a Mid/Side encoding on it and isolated the Side channel. That's what you have downloaded.

If you don't know about mid-side encoding it's pretty simple. It is just another way to represent a stereo signal over two channels. Instead of left and right, you have the audio that is in the middle, and the audio that is on the sides. Google up M/S encoding if you want to know more.

For the sake of reference here is the original wav that was encoded. Well more or less... I cut out the pieces after I encoded the whole 4 minute files.

http://noisevault.com/temp/hyd-orig.wav

regards,
cAPS
*




I believe it. I was saying get rid of non-joint-stereo.
Simple stereo is 'stereo right'? Where can I find out about all the different stereo modes?
schonenberg
QUOTE(cAPSLOCK @ Mar 1 2005, 12:08 AM)
QUOTE(kjoonlee @ Feb 28 2005, 11:42 PM)
cAPSLOCK, unless you can prove it yourself, by providing an uncompressed sample that sounds worse to you when encoded with mid-side joint stereo rather than "pure" stereo, and prove that you can tell the difference through double-blind testing, no one here is going to take you seriously.
*



Can you tell a difference between the S channel files I posted? (about three posts up)

cAPS
*


yes
JeanLuc
QUOTE(cAPSLOCK @ Mar 1 2005, 08:57 AM)
I also want to add two other things.

Ironically the two mp3 files were almost identical sizes in my test.  Usually the JS file is a little smaller.  I don't know why this was the case...


Can you please link to the source material you used in your test ?
cAPSLOCK
QUOTE(JeanLuc @ Mar 1 2005, 02:51 PM)

Can you please link to the source material you used in your test ?
*



It's at the bottom of the results post.

cAPS
SirGrey
One thing to mention:
Does this material use any surround matrixing?
If yes, then your results are nothing unusual: mp3 encoding damages an artificial surround image by its nature (search the forum), and different stereo modes can damage it in different ways.
Check it please, if you can.

Just my 2 cents...
P.S. L/R + M/S joint stereo should always produce better quality than L/R stereo, unless there are bugs in the switching algorithm. So, check the source...
cAPSLOCK
It was a stereo recording made around 1984.

Sure got quiet in here after I posted my results.

cAPS
Blanka
QUOTE(cAPSLOCK @ Mar 1 2005, 10:55 PM)
It was a stereo recording made around 1984.

Sure got quiet in here after I posted my results.

cAPS
*



You wouldn't be suffering from the placebo effect would you? tongue.gif You can't really tell the difference in an ABX test but still believe you can hear a difference...
music_man_mpc
QUOTE(cAPSLOCK @ Mar 1 2005, 10:55 PM)
It was a stereo recording made around 1984.

Sure got quiet in here after I posted my results.
*


I think that you ABXed by hearing artifacts introduced by the encoder, not by hearing stereo collapse. Try ABXing Musepack at --quality 5 or higher, it also uses JS by default.
phong
You did prove that joint stereo does sound different than simple stereo on this sample. This is not surprising* - the whole point of joint stereo is that it lets lame use M/S instead of L/R in cases where it's more efficient. Lame will switch to M/S when it can be as accurate with fewer bits (or, from a different point of view, more accurate when allocated the same number of bits). So, in theory, if one is noticeably better (better being defined as a more faithful reproduction of the original), it would be the joint stereo version. The only time that would not be the case is if it frequently chooses between L/R and M/S incorrectly (highly unlikely in widely used versions of Lame.) If neither is noticeably better, then the joint stereo version should be smaller.

The next step would be to perform an abc/hr test (which will let you judge relative quality in a double-blind fashion).

* Sometimes people don't completely explain joint stereo and give the impression that it should be identical, when in fact, it might be different (but better and/or smaller when the encoder chooses correctly, which it usually does). They usually explain the process simplified as:
a) When encoding, compute M and S channels from Lo and Ro (Left channel and Right channel in the original, respectively):
M = (Lo + Ro) / 2
S = (Lo - Ro) / 2
b) When decoding (Ld and Rd are the Left and Right channels in the decoded version):
Ld = M + S
Rd = M - S
c) With some basic algebra:
Ld = (Lo+Ro)/2 + (Lo-Ro)/2
Ld = (Lo+Ro+Lo-Ro)/2
Ld = Lo
Rd = (Lo+Ro)/2 - (Lo-Ro)/2
Rd = (Lo+Ro-Lo+Ro)/2
Rd = Ro

Which shows that M/S encoding preserves the original L and R channels exactly.
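
The same round trip can be checked numerically. This is a minimal Python sketch of the formulas in a) and b); the function names and sample values are made up for illustration.

```python
def to_mid_side(left, right):
    """Step a): M = (Lo + Ro) / 2, S = (Lo - Ro) / 2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def to_left_right(mid, side):
    """Step b): Ld = M + S, Rd = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

original_left = [3, -7, 12, 0]
original_right = [2, -7, -5, 1]
mid, side = to_mid_side(original_left, original_right)
decoded_left, decoded_right = to_left_right(mid, side)
print(decoded_left == original_left, decoded_right == original_right)
```

Both comparisons come out True: with no quantization in between, the decoded channels match the originals.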

That actually skips past the part where M/S actually buys you something. The quantization phase of mp3 encoding occurs after it converts things to M/S. Because most of the time, the Left and Right channels are similar, the M channel will be "large" and the S channel will be "small". Because of that, the encoder can steal some bits that would be used to encode the S channel and put them in the M channel and get a more accurate result. The S channel is more quantized than it would be otherwise, and the M channel is less quantized. In simplified terms, you might reduce the amount of error in the M channel by half, and double the amount of error in the S channel, but because the M channel is, e.g. 10 times more significant, the overall error when you convert them back to L/R is less. In cases where the two channels have less correlation, this won't work (you won't be able to afford to chop down the S channel as much if it's "big"), so the encoder will use L/R encoding instead.
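
A toy Python sketch of that bit-shifting idea. Everything here is an assumption chosen for illustration: the quantizer is a plain uniform one applied to time-domain samples, each channel is scaled to its own peak, and the 8/8 versus 9/7 bit split is arbitrary. A real mp3 encoder quantizes frequency-domain coefficients under a psychoacoustic model, so this only shows the arithmetic, not what LAME does.

```python
import math

def quantize(samples, bits):
    """Toy uniform quantizer: about 2**bits levels spread over the signal's own peak."""
    peak = max(abs(x) for x in samples) or 1.0  # avoid a zero step for pure silence
    step = 2 * peak / 2 ** bits
    return [round(x / step) * step for x in samples], step

def max_error(original, decoded):
    return max(abs(o - d) for o, d in zip(original, decoded))

# Two strongly correlated channels: right is left plus a small extra component.
left = [math.sin(0.1 * n) for n in range(200)]
right = [l + 0.1 * math.sin(0.37 * n) for n, l in enumerate(left)]

# Plain stereo: 8 bits per sample for each channel.
left_lr, step_l = quantize(left, 8)
right_lr, step_r = quantize(right, 8)

# M/S: same 16-bit total, but 9 bits go to M and only 7 to S.
mid = [(l + r) / 2 for l, r in zip(left, right)]
side = [(l - r) / 2 for l, r in zip(left, right)]
mid_q, step_m = quantize(mid, 9)
side_q, step_s = quantize(side, 7)
left_ms = [m + s for m, s in zip(mid_q, side_q)]

print("L/R worst-case error bound on left:", step_l / 2)
print("M/S worst-case error bound on left:", step_m / 2 + step_s / 2)
print("measured L/R error on left:", max_error(left, left_lr))
print("measured M/S error on left:", max_error(left, left_ms))
```

Because S is small, even a 7-bit quantizer gives it a fine absolute step, so the worst-case error after converting back to L is smaller than with 8 bits on L directly. If the channels were uncorrelated, S would be as large as M and the advantage would vanish.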

This still way oversimplifies things, because mp3 quantization is more complicated than just "rounding off a few numbers", but that should hopefully paint the right picture for somebody who stumbles along here and doesn't know what the heck is going on.

Of course, there could be additional confusion because at very low bitrates (or if you have a really stupid encoder), there's intensity stereo to deal with, which can fudge up the stereo image, and this is sometimes also called joint stereo even though it's completely unrelated to M/S stereo.
sTisTi
Great explanation, phong, actually the best I ever read concerning Joint Stereo. This should be put in the FAQ section IMO smile.gif
cAPSLOCK
QUOTE(Blanka @ Mar 3 2005, 11:58 AM)
You wouldn't be suffering from the placebo effect would you? tongue.gif You can't really tell the difference in an ABX test but still believe you can hear a difference...
*



I think my test results speak for themselves. At that quality level it was not evident to me which encoding method was used, but I could tell them apart.

When I posted my results I conceded that, at that quality level, the difference in quality between the two samples was imperceptible to me.

As phong pointed out, it could be that my past perception of damage to the stereo field might be related to intensity stereo as opposed to j/s encoding, but I am not sure of this yet. Can anyone tell me at what bitrates lame will use intensity stereo? I would like to perform tests at the lowest bitrates that use only m/s frame encoding.

Also, my perception of damage to the image when encoding at lower bitrates than "--alt-preset standard" could not be described as "hearing the stereo collapse". I don't remember using these words, but if I did they were a poor choice. I would describe what I hear as the stereo image being smeared, or made perceptibly less accurate. Not simply 'less stereo'.

It is admittedly hard, if not entirely impossible to make judgements about quality from any standpoint that is not entirely SUBJECTIVE. Even in a blind test the differences I hear are simply that.... differences I hear.

As I have said, I am willing to admit that I have learned that at even fairly conservative bitrates j/s does not seem to damage the stereo field any more than true stereo encoding.

But the fact that I could tell the difference between joint and true stereo 6 times out of 6 seems to imply that there is at least a difference. That is clear. How this difference is perceived in terms of stereo image is, at least to me, quite unclear.

cAPS
ChiGung
Nice post phong, and great inquiry cAPS!
The way I explain these samples is that, obviously, we can hear artifacting in both, and stronger artifacting in the js sourced sample. Well, there is unavoidable artifacting in all mp3s, but it is kept to a minimum which is hopefully below hearing.

The reason why the js side channel contains stronger artifacts is that that signal is in fact quieter than the main signal; the psychoacoustic model recognises this and allows more 'relative inaccuracy' in order to save definition for the main signal, where artifacts will be more audible. The pure stereo sample may be doubly helped out by the fact that between the two separately encoded channels, distortion will not be synchronised and will not reinforce between each channel into the side channel.

So with pure stereo, the stereo separation can have a relative quality which neither of the separate channels can ever attain, but this top quality in what is often the least audible abstraction of the audio data, comes at the cost of having less definition for the most audible abstract -the main channel.

The less significant the stereo separation is on a track, the more extreme this experiment will sound, and v.v. the more significant the separation, the less you'll be able to hear a difference.

Turning off js simply turns off an opportunity to economise by ignoring a synchronous aspect of the audio data. When used in lossless codecs, recognising this synchronicity reduces filesize without damaging the reproduction. It does so in lame too, but there it's the psychoacoustic decision to put the distortion in the least audible places, which will be the side channel if it is weakest.
SirGrey
>>Can anyone tell me at what bitrates lame will use intensity stereo?
Intensity stereo is not implemented in Lame, that's why it is recommended at least to try FhG encoders when encoding at very low bitrates...
cAPSLOCK
QUOTE(phong @ Mar 3 2005, 01:43 PM)
That actually skips past the part where M/S actually buys you something.  The quantization phase of mp3 encoding occurs after it converts things to M/S.  Because most of the time, the Left and Right channels are similar, the M channel will be "large" and the S channel will be "small".  Because of that, the encoder can steal some bits that would be used to encode the S channel and put them in the M channel and get a more accurate result.  The S channel is more quantized than it would be otherwise, and the M channel is less quantized.  In simplified terms, you might reduce the amount of error in the M channel by half, and double the amount of error in the S channel, but because the M channel is, e.g. 10 times more significant, the overall error when you convert them back to L/R is less.  In cases where the two channels have less correlation, this won't work (you won't be able to afford to chop down the S channel as much if it's "big"), so the encoder will use L/R encoding instead.

*



Thanks for such a good explanation of how j/s encoding works.

I have no argument that lossless M/S encoding does no damage to the sound at all (though when working at low bit depths like 16 bit, any math on the data (in this case the /2 part) will introduce significant rounding errors). I work with M/S techniques frequently in the course of recording production and mastering.
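
The integer rounding issue can be shown with a small Python sketch. Both functions are hypothetical illustrations: the first halves with integer division and loses a bit, the second keeps the full difference so the dropped bit can be recovered (a trick similar in spirit to what lossless codecs use). Neither is claimed to describe LAME's internals, which do not work on integer samples in this way.

```python
def naive_round_trip(left: int, right: int):
    """Halve with integer division, as a careless fixed-point M/S would."""
    mid = (left + right) // 2
    side = (left - right) // 2
    return mid + side, mid - side

def lossless_round_trip(left: int, right: int):
    """Keep the full difference so the bit dropped from the sum can be recovered."""
    mid = (left + right) >> 1
    side = left - right
    # The sum and the difference always have the same parity.
    full_sum = (mid << 1) | (side & 1)
    return (full_sum + side) >> 1, (full_sum - side) >> 1

print(naive_round_trip(3, 2))
print(lossless_round_trip(3, 2))
```

For the pair (3, 2) the naive version comes back as (2, 2), off by one least-significant bit on the left channel, while the parity-preserving version returns (3, 2) unchanged.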

The idea that you compress the S channel a little more to save on compression for the M channel is a great idea, especially in light of the fact (well explained) that the S channel will tend to be much less 'significant' in terms of amplitude.

However, although the S information is indeed more subtle in most music than the mono part, it is this very subtlety that creates the stereo image. It stands to reason that compressing the S channel more than the M channel will produce a solid, good sounding file at the expense of some of the subtle information that is responsible for the stereo image.

The idea that M/S encoding does not affect the sound is true. The idea that compressing the S channel more will not damage the image is not clearly true, at least in my mind, yet.

Seems to me, anytime you take more data away from the original file you do a little more damage. How perceptible this damage is is the important part. I agree that image is one of the more subtle aspects in a recording. But focusing compression on the image will obviously have an effect on the quality of the stereo field.

cAPS
ChiGung
QUOTE(cAPSLOCK @ Mar 3 2005, 08:40 PM)
The idea that M/S encoding does not affect the sound is true.  The idea that compressing the S channel more will not damage the image is not clearly true, at least in my mind, yet.

*


I'm curious about this too. It seems that mj stereo has complications for lossy application. One of these command-line options might be changed to tighten it up?

--nssafejoint M/S switching criterion
--nsmsfix <arg> M/S switching tuning [effective 0-3.5]
--interch x adjust inter-channel masking ratio

For paranoid joint stereo at expense of bitrate~
But then default settings are probably tuned to the perceptual optimum. MP3s aren't suitable for this type of studio post-processing at all, which I found out after trying to remaster some live recording made at 320.

QUOTE
Seems to me, anytime you take more data away from the original file you do a little more damage.

editadd - that's not what is happening though: data is not being taken away. Sameness can be noted, or not; correlation is recognised or data is redundantly repeated. That's the essence of cross-channel referencing, only in this case the distribution of inaccuracy is leaving room for debate tongue.gif
Gabriel
Compressing the S channel more (sometimes called side channel starving) is a nice idea for high compression ratios. The question is just how to starve it carefully. Right now Lame doesn't starve the side channel by default (although it has the ability to do it).

The LR->MS transform will enhance the compressibility of the side channel without any starving if the signal has some identical or similar parts in both channels. In a near-mono frequency band, the side channel will have a lot of 0 and 1 coefficients, which are easier to compress in the Huffman stage. The saved bits can then be used for the Mid channel.
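A toy illustration of the near-mono case described above, in Python. The coefficient values are made up, and the unnormalised transform M = L + R, S = L - R is an assumption chosen to keep everything in integers (the MP3 format scales both by 1/sqrt(2)). This is only a sketch of the idea, not Lame's code.

```python
def lr_to_ms(left, right):
    """Unnormalised mid/side transform: M = L + R, S = L - R."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def count_small(values):
    """Count coefficients whose magnitude is 0 or 1."""
    return sum(1 for v in values if abs(v) <= 1)


# Hypothetical quantised coefficients for one near-mono frequency band.
left = [40, -32, 25, -18, 12, -7, 30, -22]
right = [41, -32, 24, -18, 13, -6, 30, -23]

mid, side = lr_to_ms(left, right)
print("side:", side)
print("small coeffs in L and R:", count_small(left), count_small(right))
print("small coeffs in S:", count_small(side))
```

In this made-up band neither L nor R has a single coefficient of magnitude 0 or 1, while every S coefficient does, which is the kind of data an entropy coder handles cheaply.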
ChiGung
QUOTE(Gabriel @ Mar 4 2005, 08:41 AM)
Compressing the S channel more (sometimes called side channel starving) is a nice idea for high compression ratios. The question is just how to starve it carefully. Right now Lame doesn't starve the side channel by default (although it has the ability to do it).
*


It sounds as though the side channel is treated as normal. Are the same psychoacoustics used for the S channel that would be used on L&R or Mid, even temporal masking?
The wavs cAPS posted seem to indicate that the S channel can fare worse when using joint stereo. It seems that the side channel he extracted was quite distorted; I'm guessing the psychoacoustics allowed it to be because it was so 'quiet'.
I wondered if there would be a way to keep the inherent efficiency of M/S while maintaining the fidelity of separation that pure L/R implies, by feasting the side channel maybe? : )

I do a lot of wondering though.
-Thanks in retreat*
Gabriel
QUOTE
It sounds as though the side channel is treated as normal. Are the same psychoacoustics used for the S channel that would be used on L&R, Mid -even temporal masking ?

In Lame V3, yes.

QUOTE
The wavs cAPS posted seem to indicate that the S channel can fare worse when using joint stereo

Lame is assuming (like a lot of other encoders) that the file will not be postprocessed after decoding.
Extracting the side channel is a little like extracting a specific frequency band from an encoded file. Of course your frequency band, when extracted, will sound different. But in the encoded file as a whole it sounds the same.
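To put rough numbers on the extraction point, here is a small Python sketch. Everything in it is assumed for illustration: the signals are synthetic sines, the "coding error" is a fixed +/-5 pattern added to S, and no psychoacoustic masking is modelled. It only compares how large the same error is relative to the full left channel versus relative to the isolated side channel.

```python
import math


def energy(xs):
    """Sum of squared sample values."""
    return sum(x * x for x in xs)


def snr_db(signal, error):
    """Signal-to-error ratio in decibels."""
    return 10 * math.log10(energy(signal) / energy(error))


n = 64
# Assumed content: a loud mid component and a much quieter side component,
# with M = (L + R) / 2 and S = (L - R) / 2, so that L = M + S.
mid = [1000 * math.sin(2 * math.pi * k / 16) for k in range(n)]
side = [20 * math.sin(2 * math.pi * k / 5) for k in range(n)]
# Stand-in for the error an encoder might leave in the S channel.
err = [5 if k % 2 == 0 else -5 for k in range(n)]

left = [m + s for m, s in zip(mid, side)]

# The decoded left channel is M + S + err, so the error in L is err itself.
snr_left = snr_db(left, err)
# If S is extracted on its own, the same error is measured against S only.
snr_side = snr_db(side, err)

print(f"error relative to full left channel: {snr_left:.1f} dB down")
print(f"error relative to extracted S:       {snr_side:.1f} dB down")
```

The error is identical in both measurements; only the amount of signal surrounding it changes, which is why an extracted S channel can sound rough while the complete file does not.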
ChiGung
QUOTE(Gabriel @ Mar 4 2005, 09:49 AM)
QUOTE
It sounds as though the side channel is treated as normal. Are the same psychoacoustics used for the S channel that would be used on L&R, Mid -even temporal masking ?

In Lame V3, yes.

Hmmm, it sounds like you aren't overly concerned with the purity of the S channel, whether starving it or merely subjecting it to filtration designed for real channels. This could add fuel to the fire for the pure stereo enthusiasts~

Me, I'm easy. I appreciate your clarification.
phong
OK, so S is not being starved, so my understanding wasn't exactly right. Considering this fact, and the fact that Lame switches to L/R when necessary (and it seems to be quite good at that), I find it wildly improbable that joint-stereo is going to be at all inferior.

Even in a lossless situation, joint-stereo saves bits. Here's an example, which is really just me thinking out loud:

Let's say for a segment of audio (e.g. a block), the side channel is 1/4 the full amplitude of the original (it might even be quite a bit less than that in a typical segment of audio, since a large portion of the signal is mono). The M channel (L+R) can be, at worst, twice the original amplitude. A lossless representation of the S channel, at 1/4 the amplitude, would require 2 fewer bits, and the M channel would require only 1 more. Not only that, but Huffman coding would work better on the S channel (and not much worse on the M channel, because we're at a bit depth that Huffman coding would be doing poorly at anyway).
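The bit counting above can be checked with a few lines of Python. The 16-bit source depth is an assumption added here (the post doesn't name one), and the sketch only counts raw two's-complement bits per sample, ignoring any entropy coding.

```python
def signed_bits(peak):
    """Bits for a two's-complement integer whose magnitude is at most `peak`."""
    return peak.bit_length() + 1


peak_lr = 32767            # assumed 16-bit source channels
peak_side = peak_lr // 4   # side channel at 1/4 amplitude
peak_mid = 2 * peak_lr     # M = L + R can reach twice the amplitude

lr_bits = 2 * signed_bits(peak_lr)
ms_bits = signed_bits(peak_mid) + signed_bits(peak_side)

print("L/R bits per sample pair:", lr_bits)   # 16 + 16 = 32
print("M/S bits per sample pair:", ms_bits)   # 17 + 14 = 31
```

Under these assumptions M/S comes out one bit per sample pair ahead before any Huffman coding is applied, and the gap grows as the side channel gets quieter.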

As for intensity stereo, as others have said, it's not even in Lame. FhG only uses it at 96kbps or below (if at all). At least some versions of Xing use it at higher bitrates, which is one of the reasons why it's a poor performer.

I dispute the notion that the stereo image is the aspect of the audio that would be damaged even if joint-stereo failed because it hypothetically starved the S channel and did so excessively, or if it chose incorrectly between M/S and L/R stereo. The result would be added noise (artifacts). The noise might be perceived in a different place in the stereo field than if a similar failure occurred in L/R coding, but noise would have to be added in a much more specific way to modify the original stereo image. Stereo image perception is affected by things like relative attenuation of different frequencies and phase shift, which are things that aren't easily modified by adding noise in this way. If the artifacts were really bad, perhaps they would decrease your ability to discern the original stereo image, but that's not the same thing. It's like not being able to tell the direction of a chirping bird because a jet is flying overhead. The jet doesn't really modify the bird's sound, or move it around - it just makes it harder to hear.

Someone please correct me if I'm barking up the wrong tree on this...

QUOTE
Hmmm, it sounds like you aren't overly concerned with the purity of the S channel, whether starving it or merely subjecting it to filtration designed for real channels. This could add fuel to the fire for the pure stereo enthusiasts~

You make it sound as if the S channel is somehow much more important than the M channel when it comes to contributing to the stereo image (or overall audio quality.) I don't think that's a reasonable conclusion. Take for example filters in audio players that give you "wider stereo". They do so by (effectively) making the M channel more quiet. Likewise, since the L/R representation contains the same information as the M/S representation, damaging the L/R channels damages the overall audio just as much as damaging the S channel does.
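The claim that L/R and M/S carry the same information can be shown with a round trip. This Python sketch uses the unnormalised integer transform M = L + R, S = L - R, which is an assumption made so the arithmetic stays exact; the sample values are invented.

```python
def lr_to_ms(left, right):
    """Forward transform: M = L + R, S = L - R."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_to_lr(mid, side):
    """Inverse transform. M + S = 2L and M - S = 2R, so the halving is exact."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


left = [120, -64, 33, 0, -7]
right = [118, -60, -33, 5, -7]

mid, side = lr_to_ms(left, right)
restored = ms_to_lr(mid, side)
print(restored == (left, right))
```

Nothing is lost or gained by the change of representation itself; any damage comes from how the channels are quantised afterwards, whichever pair is used.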

Also, please don't refer to "stereo" mode in Lame (meaning the "-m s" switch) as "true-stereo". Joint stereo is not in any way less "true". It just means that it lets the encoder choose between two different (but equivalent) representations - L/R and M/S based on which works better on a given block. "Stereo" mode prevents it from having that choice. There's also "forced joint" mode, which forces it to have M/S on all frames. To be fair, the mp3 standard is at fault for the confusion because of the names chosen. Better names might be LR stereo, adaptive stereo and MS stereo.
cAPSLOCK
QUOTE(phong @ Mar 4 2005, 11:41 AM)
I dispute the notion that the stereo image is the aspect of the audio that would be damaged even if joint-stereo failed because it hypothetically starved the S channel and did so excessively, or if it chose incorrectly between M/S and L/R stereo.  The result would be added noise (artifacts).  The noise might be perceived in a different place in the stereo field than if a similar failure occurred in L/R coding, but noise would have to be added in a much more specific way to modify the original stereo image.


This could be right. It is certainly a matter of perception. But one thing is for sure. The artifacts will be only in the stereo image portion of the sound. In other words they are out of phase with themselves across the l/r stereo field. I might speculate that this could actually make them LESS apparent. Especially in the case that they are in lower frequencies, though this would be a rarer case.

QUOTE
QUOTE
Hmmm, it sounds like you aren't overly concerned with the purity of the S channel, whether starving it or merely subjecting it to filtration designed for real channels. This could add fuel to the fire for the pure stereo enthusiasts~

You make it sound as if the S channel is somehow much more important than the M channel when it comes to contributing to the stereo image (or overall audio quality.) I don't think that's a reasonable conclusion. Take for example filters in audio players that give you "wider stereo". They do so by (effectively) making the M channel more quiet. Likewise, since the L/R representation contains the same information as the M/S representation, damaging the L/R channels damages the overall audio just as much as damaging the S channel does.


You are right that one way to widen the stereo field is to reduce the M channel amplitude. But a better way of putting this would be to say you are changing the relationship of amplitude between the M/S channels. It is also possible to widen the stereo field by making the S channel LOUDER. I maintain that extra compression of the S channel will lead to more artifacts in the stereo field. The most important question in your above statement is raised in the last sentence. In the end it might indeed be a moot choice between L/R and M/S.
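The two ways of widening mentioned here can be compared directly. A minimal Python sketch, with a hypothetical `widen` helper and made-up sample values, using M = (L + R) / 2 and S = (L - R) / 2:

```python
def widen(left, right, mid_gain, side_gain):
    """Scale the mid and side components, then convert back to L/R."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        m = (l + r) / 2 * mid_gain
        s = (l - r) / 2 * side_gain
        out_left.append(m + s)
        out_right.append(m - s)
    return out_left, out_right


left = [1.0, 0.5, -0.25, 0.75]
right = [0.5, 0.5, 0.25, -0.25]

quiet_mid = widen(left, right, mid_gain=0.5, side_gain=1.0)
loud_side = widen(left, right, mid_gain=1.0, side_gain=2.0)

# Both have the same M-to-S ratio, so they differ only by overall volume.
same_shape = all(
    abs(b - 2 * a) < 1e-12
    for chan_a, chan_b in zip(quiet_mid, loud_side)
    for a, b in zip(chan_a, chan_b)
)
print(same_shape)
```

Halving M and doubling S give the same signal apart from a factor of two in level, which supports describing widening as a change in the M/S amplitude relationship rather than as a change to either channel alone.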

One more question this raises for me...

Does LAME encode the L/R channels in a frame separately? I assume it does.

If this is the case... then in the end, the two ways of encoding are really apples and oranges when it comes to image perception. One way will always be better than the other. How LAME chooses is the key to the result. Pretty intense speculation. wink.gif

I am amazed at (and grateful for) the technology and knowledge required to do this magic.

QUOTE
Also, please don't refer to "stereo" mode in Lame (meaning the "-m s" switch) as "true-stereo". 
*



That is a very reasonable request. They are both truly stereo.

cAPS
phong
QUOTE
Does LAME encode the L/R channels in a frame separately? I assume it does.

Not sure what you mean... How else would it encode them? It does allocate bits independently for them (if one channel is more demanding than the other, it will get more bits), if that's what you're asking.

QUOTE
If this is the case... then in the end, the two ways of encoding are really apples and oranges when it comes to image perception. One way will always be better than the other. How LAME chooses is the key to the result. Pretty intense speculation.

Which is better depends on the content of a particular frame. If there is a great deal of correlation between the two channels, then the S channel is small and M/S can represent the audio more accurately with fewer bits. If there is very little correlation, then M and S could both be large and therefore require more bits to represent accurately. I suppose M/S would work well when M is small too, but that would indicate the two channels were mostly the same but 180 degrees out of phase, which I think would be unusual.
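As a rough illustration of "depends on the content of a particular frame", here is a toy per-frame chooser in Python. The criterion is an assumption made for this sketch and is not Lame's: it prefers whichever pair of channels has the smaller product of energies, a crude proxy for needing fewer bits at equal accuracy. The frames are invented.

```python
def energy(xs):
    """Sum of squared sample values."""
    return sum(x * x for x in xs)


def pick_representation(left, right):
    """Toy criterion (not Lame's): choose the pair with the smaller energy product."""
    # Energies of M = (L + R) / sqrt(2) and S = (L - R) / sqrt(2),
    # computed without the square root to keep the arithmetic exact.
    e_mid = sum((l + r) ** 2 for l, r in zip(left, right)) / 2
    e_side = sum((l - r) ** 2 for l, r in zip(left, right)) / 2
    if e_mid * e_side < energy(left) * energy(right):
        return "MS"
    return "LR"


# Highly correlated channels: S is tiny.
correlated = ([10, -8, 6, -4], [10, -8, 6, -5])
# One loud channel and one quiet, unrelated channel.
one_sided = ([10, -10, 10, -10], [1, 1, -1, -1])
# Same signal, 180 degrees out of phase: M is zero.
out_of_phase = ([10, -8, 6, -4], [-10, 8, -6, 4])

for name, (l, r) in [("correlated", correlated),
                     ("one-sided", one_sided),
                     ("out of phase", out_of_phase)]:
    print(name, "->", pick_representation(l, r))
```

With this proxy the correlated frame and the out-of-phase frame both go to M/S (small S in one case, small M in the other), while the one-sided frame stays in L/R.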

I don't know enough technical information about Lame to say what criteria it uses to choose, but other people here write Lame, and they do know, and I trust them when they say Lame is Indiana Jones and not that sadistic German chick.