Full Version: MPC v1.15f (alpha version) ready for download
Hydrogenaudio Forums > Lossy Audio Compression > MPC
SK1
And share your deep thoughts and experiences with it. Tell us how it affected your life, your mood, and your dog.
Get it.
Q!
Yay! B)
I checked a few samples and the bitrates seem to be a little higher (useless observation, I know) :)
I'll do some tests tomorrow.

My dog loves it.
cmokruhl
Is there a changelog anywhere?
JohnV
QUOTE
Tonality estimation for 1 kHz...11 kHz has been made more robust.

Buschmann:
 Method 1 for 0 kHz...22 kHz

[encoder version] 0.91...1.10
 Method 1 for 0 kHz...22 kHz
 Method 2 for 1 kHz...11 kHz

1.11...1.14
 Method 1 for 0 kHz...22 kHz
 Method 2 for 1 kHz...15 kHz
 
1.15
 Method 1 for 0 kHz...22 kHz
 Method 2 for 1 kHz...11 kHz
 Method 3 for 1 kHz...11 kHz

Method (1,2,3) are formulas with a result of around 1 for atonal audio
and << 1 for tonal audio. All have their strong and weak sides.
Mask is computed by

   Mask = Signal * Method1 * Method2 * Method3 / NMR

--
Frank Klemm


So 1.15 adds 3rd tonality estimation algorithm.
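
A minimal sketch of the mask formula quoted above may make the effect of stacking estimators easier to see. It assumes all quantities are linear (power-like) values rather than dB, and the function name, the numbers and the NMR value are hypothetical illustrations, not taken from mppenc:

```python
def mask_threshold(signal_energy, tonality_factors, nmr):
    """Mask = Signal * Method1 * Method2 * Method3 / NMR (linear units assumed)."""
    mask = signal_energy
    for factor in tonality_factors:
        # Each factor is ~1 for atonal audio and << 1 for tonal audio.
        mask *= factor
    return mask / nmr


# Hypothetical numbers, chosen only to illustrate the behaviour.
atonal_mask = mask_threshold(1.0, [1.0, 1.0, 1.0], nmr=2.0)
tonal_mask = mask_threshold(1.0, [0.1, 0.5, 0.2], nmr=2.0)
```

Because the factors multiply, a single estimator that flags the signal as tonal is enough to pull the mask down, which would fit the remark that each method has its own strong and weak sides.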
floyd
Any results for this new 1.15 yet? I just tried encoding an album and the bitrate change was very slight: the 1.15 encode is 700 KB larger than the 1.14 one on a 55 MB album.

Theoretically what can we expect from this new tonality estimation?
hans-jürgen
QUOTE(floyd @ Dec 23 2002 - 05:16 AM)
Theoretically what can we expect from this new tonality estimation?

Maybe Musepack will no longer freak out on the 2000-year-old string instruments that guruboolez must have dug out of the Chinese mud... ;)
Gecko
I tested it with the angelic_short sample I posted a while ago. Bitrate is about 10 kbps higher, but the artifact remains. I didn't notice a change in sound compared to 1.14b when quickly switching around in Winamp.
AgentMil
RoFL! Shouldn't they be in museums? :D
hans-jürgen
QUOTE(AgentMil @ Dec 23 2002 - 11:44 AM)
RoFL! Shouldn't they be in museums? :D

"Err... who?" [Beavis & Butthead, suspected name donators]
AgentMil
Whoops, Gecko's post beat me to the punch... I was referring to those string instruments that you were mentioning, hans-jürgen. ;)
JohnV
QUOTE(hans-jürgen @ Dec 23 2002 - 12:03 PM)
Maybe Musepack will no longer freak out on the 2000-year-old string instruments that guruboolez must have dug out of the Chinese mud... ;)

Erhu was a problem with Lame APS/APE, not MPC.
hans-jürgen
QUOTE(JohnV @ Dec 23 2002 - 11:25 PM)
Erhu was a problem with Lame APS/APE, not MPC.

Well, that depends on the bitrate in question... ;)
guruboolez
QUOTE
Well, that depends on the bitrate in question... ;)


I found two more impressive seconds for mpc. Very funny at low bitrate... And note the 'bitrate score' at --standard. Instrument is an ugly glass harmonica. File is glass_short.zip

http://membres.lycos.fr/guruboolez/


Erhu is not a critical sample for mpc. It's just representative for the general behavior of mpc with many 'pure' instruments.
xterm
QUOTE(guruboolez @ Dec 23 2002 - 03:51 PM)
QUOTE
Well, that depends on the bitrate in question... ;)


I found two more impressive seconds for mpc. Very funny at low bitrate... And note the 'bitrate score' at --standard. Instrument is an ugly glass harmonica. File is glass_short.zip

http://membres.lycos.fr/guruboolez/


Erhu is not a critical sample for mpc. It's just representative for the general behavior of mpc with many 'pure' instruments.

The Glass Harmonica sample sounds like dogsh*t in mpc.

Ogg was pretty much the best of the 3, followed by mp3.
JohnV
QUOTE(xterm @ Dec 24 2002 - 11:53 PM)
The Glass Harmonica sample sounds like dogsh*t in mpc.

Ogg was pretty much the best of the 3, followed by mp3.

No sh*t? :D

I don't see the point of these tests. MPC --standard is -q 5. Here it seems -q 0.71 is used for MPC?!?

MPC has not been tweaked for low bitrates, and it may never be as good as transform codecs at low bitrates, especially if you take a sample which is hard to encode and force the bitrate down, as in this case.

I see no point in this test.
floyd
-q 0.71?? Does anyone seriously use such a quality level with musepack? I don't even think many people use any q levels under 5, except for testing.
guruboolez
QUOTE(JohnV @ Dec 24 2002 - 11:38 PM)
MPC has not been tweaked for low bitrates, and it may never be as good as transform codecs at low bitrates, especially if you take a sample which is hard to encode and force the bitrate down, as in this case.

I see no point in this test.

I used -q 0.71 to reach 64 kbps with this sample. And as you can hear, the sample is not difficult to encode: no short attacks, no special stereo effects... Musepack always needs much more bitrate than other codecs for some 'easy to encode' instruments: violin, organ... and why not glass harmonica or erhu :)

I uploaded this little pack as an implicit reference to a recent thread about the low-bitrate ability of mpc. Case and Hans-Jürgen said that mpc -thumb offers good quality and, with a bit more tweaking, can compete with other formats such as mp4 or vorbis. I suggest that this assertion really only holds for some music genres. I have encoded approximately 500 classical discs, and many times I saw a big inflation of bitrate with harpsichord, violin, baroque concertos...

Of course, -q 0.71 is ridiculous. I'm not totally stupid. But just imagine a new general blind test at 64 kbps (with a forced 64 kbps), or why not at a more popular 128 kbps: you would have to encode with the « telephone » or « catastrophic » profile. It's really easy to shoot mpc down with similar samples: against all expectations they are hard to encode, while mpc's competitors consistently offer really good quality at low and medium bitrates. When mpc is forced to encode them at a given bitrate, the result is really shocking.
I uploaded a more representative sample: a simple violin. Try to encode it at 128 kbps with mppenc 1.15f. You must set the encoder to -q 2.25, and of course the sound is not really good. A simple mp3/gogo encode at -b 128 sounds much better, near transparency. And I repeat: it's not a critical sample... Radio is near transparency, but at... 200 kbps (while --alt-preset extreme with 3.93.1 measures 202 kbps!)

Mpc seems to have a really impressive VBR mode, very adaptive, offering constant quality for a given profile. But the bitrate variations are really different from those of LAME VBR, Ahead MP4 and Vorbis: different, and of wider amplitude with non-electronic samples. I suppose that for an mpc codec tweaked for low bitrates, this general behaviour is a real handicap for the format.
hans-jürgen
QUOTE(guruboolez @ Dec 29 2002 - 08:03 AM)
QUOTE(JohnV @ Dec 24 2002 - 11:38 PM)
MPC has not been tweaked for low bitrates, and it may never be as good as transform codecs at low bitrates, especially if you take a sample which is hard to encode and force the bitrate down, as in this case.

I see no point in this test.

I used -q 0.71 to reach 64 kbps with this sample. And as you can hear, the sample is not difficult to encode: no short attacks, no special stereo effects... Musepack always needs much more bitrate than other codecs for some 'easy to encode' instruments: violin, organ... and why not glass harmonica or erhu :)

Because these are very "tonal" instruments as opposed to e.g. percussion instruments maybe? ;) I sometimes wonder if a poster has even read the thread and the questions therein before starting his personal rant about his favorite format being criticized in one way or another.

Obviously Musepack has its very own difficulties with samples that contain many tonal components like your er-hu sample and so the bitrate will be doubled compared to other codecs or other samples with the same MPC profile (er-hu with --thumb comes out as ~130 kbps and sounding fine, as you probably know, but --thumb normally is supposed to encode everything at ~64 kbps). That's one part of what I meant with "freaking out", but also the resulting sound if you force it to use just 64 kbps (shortwave radio background noise).

So when reading something about a new "tonality estimation" method in v1.15f, it is not too far away in my opinion to assume that this might have something to do with the behaviour described above (as we can only assume things here, because I don't know any other forums or sources where Frank would comment on the changes he made).

It's also interesting that a subband codec seems to lose out on these samples where a transform codec is more appropriate to the music in question and can "survive" with much fewer bits. So I think it's a good idea to address these inherent problems with a better tonality estimation - if this was the purpose of this update anyhow.
NumLOCK
QUOTE
So when reading something about a new "tonality estimation" method in v1.15f, it is not too far away in my opinion to assume that this might have something to do with the behaviour described above (as we can only assume things here, because I don't know any other forums or sources where Frank would comment on the changes he made).

It's also interesting that a subband codec seems to lose out on these samples where a transform codec is more appropriate to the music in question and can "survive" with much fewer bits. So I think it's a good idea to address these inherent problems with a better tonality estimation - if this was the purpose of this update anyhow.

I think you're mistaken about the tonality estimation changes in v1.15f.
The real issue that subband codecs encounter when dealing with highly tonal signal components, is in the prediction of samples.

Where mp3, ogg, aac (tonal-domain encoders) have virtually no serious issues, mpc fails because, due to its subband nature, it is NOT allowed to make "guesses" directly in the frequency domain.

I think, only relatively simple "guesses" can be done by MPC. For this reason, much of the inter-frame frequency components' redundancy is not optimally exploited.

Just my 2 cents...
JohnV
QUOTE(guruboolez @ Dec 29 2002 - 09:03 AM)
Musepack always needs much more bitrate than other codecs for some 'easy to encode' instruments: violin, organ... and why not glass harmonica or erhu :)

Isolated tonal instruments are not easy. The mp3 and vorbis vbr modes in particular often fail at lower quality. Of course, transform codecs have an advantage here, because by nature they have much higher frequency resolution than subband codecs at similar bitrates. And when you take an isolated instrument, there's not much masking to help mpc; PNS can't help either if the sound is very tonal.
So we have a signal that needs high frequency resolution, but there's nothing that helps mpc achieve a lower bitrate. So in order to preserve even some kind of frequency resolution (when there's no masker signal of any kind and the signal is quite tonal), it needs a pretty high bitrate even at --thumb or --radio with these signals.

With some different signals, where mpc can use its excellent masking, PNS (although untweaked) and time resolution, it can reach very good results even at low bitrates.

So, I'm saying that isolated tonal instruments are not easy for any lossy codec, especially when using vbr. Tonality estimation often fails and estimates the signal to be noisier than it is, which means that the codec thinks it can take advantage of masking more than it should. But because of the higher frequency resolution of transform codecs, those sound quite a lot better than mpc at a similar very low bitrate.
guruboolez
I thought that pure tonal instruments were easy for transform codecs to encode/transform; I suspected these uniform signals to be less complex than others. AFAIK, it's hard to successfully ABX such pure instruments at medium bitrates (80-120) with transform codecs (cbr & vbr).

Thanks for the explanation :)
JohnMK
I'm curious -- who compiles mppenc? Which compiler with which optimizations is used?
mpcfiend
QUOTE
So, I'm saying that isolated tonal instruments are not easy for any lossy codec, especially when using vbr. Tonality estimation often fails and estimates the signal to be noisier than it is, which means that the codec thinks it can take advantage of masking more than it should. But because of the higher frequency resolution of transform codecs, those sound quite a lot better than mpc at a similar very low bitrate.


So basically, the reason that mp3 wins in these scenarios is the same reason that it loses out to mpc in more normal situations? How ironic. :)
Seed
It should be noted that when I tested 1.15f, it didn't improve on any of the classic 30 or so samples I tested it with (at --insane, for castanets and such). It did improve on Gecko's sample at --braindead, but very, very slightly. This is one exceptional sample with no isolated tonal instruments that is still hard to encode. But yes, it is ironic that the 'simpler' samples can prove hard to encode as well.
NumLOCK
JohnV, I agree with your explanation but in my opinion, the issue could be explained in a simpler way.

Take Vorbis (a transform codec) and MPC (a subband codec).

Force both of them to a 64kbps-ish bitrate.

What does Vorbis do ? It encodes less tonal components, with less precision. Result: More tonal signal => less noise.
Consequence: better efficiency on tonal signals at that bitrate, softer sound and smeared transients.

What does MPC do ? It encodes most subband samples with less precision. Result: More noise => less tonal signal.
Consequence: better efficiency on noisy/aggressive signals at that bitrate, harsher sound.

The reason why mpc often sounds worse on highly tonal instruments at this TOO LOW bitrate, is that the codec IS FORCED to add noise almost everywhere to gain bits... and that noise is greater than the allowed masking threshold, almost everywhere in the music.

That's also the reason why, if you lower the bitrate really too much, then MPC - most of the time - will artifact almost everywhere.
Frank Klemm
QUOTE(NumLOCK @ Dec 30 2002 - 05:51 PM)
JohnV, I agree with your explanation but in my opinion, the issue could be explained in a simpler way.

Take Vorbis (a transform codec) and MPC (a subband codec).

Force both of them to a 64kbps-ish bitrate.

What does Vorbis do ? It encodes less tonal components, with less precision. Result: More tonal signal => less noise.
Consequence: better efficiency on tonal signals at that bitrate, softer sound and smeared transients.

What does MPC do ? It encodes most subband samples with less precision. Result: More noise => less tonal signal.
Consequence: better efficiency on noisy/aggressive signals at that bitrate, harsher sound.

The reason why mpc often sounds worse on highly tonal instruments at this TOO LOW bitrate, is that the codec IS FORCED to add noise almost everywhere to gain bits...  and that noise is greater than the allowed masking threshold, almost everywhere in the music.

That's also the reason why, if you lower the bitrate really too much, then MPC - most of the time - will artifact almost everywhere.

Nearly completely wrong.

The problem is situated in the psycho model and has nothing to do
with the problem of a 2048 tap single overlap MDCT vs. a 512 tap
15-time overlap MDCT.

Under some situations a noise level of -30 dB is audible and it is difficult
to find out this situation with numerical means. This problem is
independent of the encoder! MPC can encode with noise levels
down to -86 dB, MP3 down to -106 dB. But this doesn't play any role, because
the problem is the psycho model says -23 dB, but -33 dB is needed.
You can increase SMR by 10 dB, but this increases bitrate for all music by
about 120 kbps, so this is possible for MPC and also Ogg Vorbis (partially
also for MP3), but it is no real solution.

It is true that MP3, AAC, Ogg Vorbis can compress tonal signals a little bit better
than MP1, MP2 and Musepack without LPC.

Another remark:
Be careful with any posting here in the forum. Most postings are a dangerous
mixture of wrong information and partially wrong information. Correct information
is very rare and highly correlated with some names I don't want to mention.

Most is pure speculation and waiting for an account from an opposing point of view.
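
As a rough plausibility check on the figure of about 120 kbps for 10 dB more SMR, here is a back-of-the-envelope sketch. It assumes the textbook rule of roughly 6.02 dB of signal-to-noise ratio per quantizer bit, 44.1 kHz stereo input, and that every coded sample receives the extra precision; none of these assumptions is taken from the encoder itself, and the function name is hypothetical:

```python
import math

# Rule of thumb for uniform quantization: each extra bit buys ~6.02 dB of SNR.
DB_PER_BIT = 20 * math.log10(2)


def extra_bitrate_kbps(extra_smr_db, sample_rate_hz=44100, channels=2):
    """Upper-bound bitrate cost of demanding extra_smr_db more SMR everywhere."""
    extra_bits_per_sample = extra_smr_db / DB_PER_BIT
    return extra_bits_per_sample * sample_rate_hz * channels / 1000


cost_10_db = extra_bitrate_kbps(10.0)  # roughly 146 kbps under these assumptions
```

Under these assumptions the cost comes out in the same range as the quoted figure. A somewhat lower real-world number would be plausible if not every band is coded all the time, but that is only a guess.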
mithrandir
QUOTE
Nearly completely wrong.

Give 'em hell Frank! :D

QUOTE
Most is pure speculation and waiting for an account from an opposing point of view.

It's called antagonism. The temptation to make comments just to invoke a response happens to the best of us. ;)
KikeG
Frank, It's good to see you back here, I really appreciate your knowledge and very informative posts.

About your comments, I think that the situation here is not as bad as you say, although sometimes that happens. Also, note that everybody can be and has been wrong sometimes, including me, you, the admins, etc. Please don't take this as an offense, I just wanted to say that everybody has the 'right' to make an occasional mistake from time to time. As long as it is corrected...
Daybreak
QUOTE(hans-jürgen @ Dec 23 2002 - 06:03 PM)
QUOTE(floyd @ Dec 23 2002 - 05:16 AM)
Theoretically what can we expect from this new tonality estimation?

Maybe Musepack will no longer freak out on the 2000-year-old string instruments that guruboolez must have dug out of the Chinese mud... ;)

Hmm.. I take offence with this post.. perhaps is it just me, or is this view representative of a highly intolerant and uneducated view?

The erhu isn't unknown, on the contrary, it's a staple of most Chinese orchestral music. Just because you don't listen to Chinese orchestral music (not that I really like it myself...) or the fact that you don't know much about it doesn't give you the right to diss on it. In fact, the lack of knowledge all the more means that you shouldn't really degrade something you don't know. To say that it was dug out of the Chinese mud... >:(
JohnMK
Don't be hurt. You really just misinterpreted him.
NumLOCK
QUOTE(Frank Klemm @ Jan 1 2003 - 08:45 PM)
QUOTE(NumLOCK @ Dec 30 2002 - 05:51 PM)
JohnV, I agree with your explanation but in my opinion, the issue could be explained in a simpler way.

Take Vorbis (a transform codec) and MPC (a subband codec).

Force both of them to a 64kbps-ish bitrate.

What does Vorbis do ? It encodes less tonal components, with less precision. Result: More tonal signal => less noise.
Consequence: better efficiency on tonal signals at that bitrate, softer sound and smeared transients.

What does MPC do ? It encodes most subband samples with less precision. Result: More noise => less tonal signal.
Consequence: better efficiency on noisy/aggressive signals at that bitrate, harsher sound.

The reason why mpc often sounds worse on highly tonal instruments at this TOO LOW bitrate, is that the codec IS FORCED to add noise almost everywhere to gain bits...  and that noise is greater than the allowed masking threshold, almost everywhere in the music.

That's also the reason why, if you lower the bitrate really too much, then MPC - most of the time - will artifact almost everywhere.

Nearly completely wrong.

The problem is situated in the psycho model and has nothing to do
with the problem of a 2048 tap single overlap MDCT vs. a 512 tap
15-time overlap MDCT.

Under some situations a noise level of -30 dB is audible and it is difficult
to find out this situation with numerical means. This problem is
independent of the encoder! MPC can encode with noise levels
down to -86 dB, MP3 down to -106 dB. But this doesn't play any role, because
the problem is the psycho model says -23 dB, but -33 dB is needed.
You can increase SMR by 10 dB, but this increases bitrate for all music by
about 120 kbps, so this is possible for MPC and also Ogg Vorbis (partially
also for MP3), but it is no real solution.

It is true that MP3, AAC, Ogg Vorbis can compress tonal signals a little bit better
than MP1, MP2 and Musepack without LPC.

Another remark:
Be careful with any posting here in the forum. Most postings are a dangerous
mixture of wrong information and partially wrong information. Correct information
is very rare and highly correlated with some names I don't want to mention.

Most is pure speculation and waiting for an account from an opposing point of view.

Thank you for your answer, Frank.
Hmm. It wasn't my goal to create antagonism when I posted my message. I didn't expect such opposition, but your post was a pleasant surprise :-)
There will always be speculation going on. Of course I'm part of it. Nobody can see the whole picture with all details - and as you certainly know, it takes time and effort to dig into another person's source code, especially for a codec. For the rest however, you can safely assume I know what I'm talking about.

Okay, to the point now.

You're saying that basically, 32xPQF and MDCT are equally useful for ~64 kbps audio compression, and only the psymodel makes the difference ? I can't agree with that.

However, I really appreciated your insight about the difficulty of accurately choosing the SMR in the psymodel.

1) In theory yes, with optimal prediction in time domain, subbanding would get all the advantages of MDCT (then MDCT would get mostly useless for both audio and video). Only the psymodel would then matter. Unfortunately, from my experience, at very low bitrates, subband coding (+LPC) is - at best - not really adequate to beat MDCT.

2-a ) Even though Musepack's current psymodel is suboptimal for low bitrates, most of the SMR under-estimation is certainly caused by the aggressive command-line settings that are used to reach ~64kbps in the first place. So, the problem *does happen* in the psycho model, but it is *not* the fault of the psycho model itself. After all, how can you do proper encoding if you can't reach the SMR you need :-(

2-b ) In other words: In the --quality 5 preset, Musepack will use some headroom to compensate for possible SMR inaccuracies and reach (near-)transparency in most cases. At 64kbps (0.7 bit per sample !), there's simply no room for this "luxury" !
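
The "0.7 bit per sample" figure can be reproduced with a one-line calculation, assuming CD-style input (44.1 kHz, 2 channels), which the post does not state explicitly; the function name is hypothetical:

```python
def bits_per_sample(bitrate_bps, sample_rate_hz=44100, channels=2):
    """Average number of coded bits available per PCM sample."""
    return bitrate_bps / (sample_rate_hz * channels)


budget_64k = bits_per_sample(64_000)    # about 0.73 bit per sample
budget_128k = bits_per_sample(128_000)  # about 1.45 bits per sample
```

For comparison, uncompressed 16-bit PCM spends 16 bits on every one of those samples.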

I actually did (and still do) experiment with audio compression prototypes, and I'm still convinced that the reason why this kind of subband codec (Musepack) will usually need more bits (compared with Vorbis) to encode a tonal signal satisfactorily (NOT transparently), results from the combination of two facts:
- the encoding process is being done in (near-) time domain, and
- the audio signal cannot always be predicted accurately (even if no transient happens). Compare this with MP3/Vorbis where similarity between successions of the same MDCT spectral coefficient can be exploited more easily.
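
To illustrate the prediction argument in the ideal case, the sketch below uses the trigonometric identity sin(wk) = 2·cos(w)·sin(w(k-1)) - sin(w(k-2)): a single stationary sine is predicted exactly from its two previous samples. This is only a toy linear predictor with hypothetical names and test frequencies; it is not how Musepack or any other codec actually predicts subband samples:

```python
import math


def two_tap_residual(samples, omega):
    """Largest prediction error of x[k] ~ 2*cos(omega)*x[k-1] - x[k-2]."""
    coeff = 2 * math.cos(omega)
    return max(
        abs(samples[k] - (coeff * samples[k - 1] - samples[k - 2]))
        for k in range(2, len(samples))
    )


SAMPLE_RATE_HZ = 44100  # assumed sample rate
w1 = 2 * math.pi * 440 / SAMPLE_RATE_HZ   # hypothetical test tone
w2 = 2 * math.pi * 1000 / SAMPLE_RATE_HZ  # second, unrelated partial

pure = [math.sin(w1 * k) for k in range(64)]
mixed = [math.sin(w1 * k) + math.sin(w2 * k) for k in range(64)]

pure_error = two_tap_residual(pure, w1)    # essentially zero (rounding only)
mixed_error = two_tap_residual(mixed, w1)  # clearly non-zero
```

The pure tone leaves no residual, while adding a second partial already defeats this simple predictor, which hints at why real instruments with several partials and slow amplitude changes are harder to predict in the time domain.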

In other words, at low bitrate a pure transform-domain codec can't encode a tonal signal perfectly, but still, at the decoding we'll generate a sine bank that will mimic the tonal signal reasonably well - except for the usual stereo, flanger, pre/post-shoot/echo or transient problems of course.

How possibly could a subband codec lower the bitrate to the same point ? I really don't see where the bits could be taken. Obviously, @ 0.7 bit per sample on average, and few masking effects to exploit, noise HAS TO be added pretty much everywhere, and at a too high level. This can be avoided if, and only if, the prediction algorithm is exceptional, and the entropy coding efficient enough (arith. coding or sample grouping - as in SV8) ! In that case the codec would make most MDCT-based codecs obsolete, even in the worst low-bitrate situations.

Now an interesting (I think) remark: we all know that a transform codec does the same thing as Musepack (put noise bursts into narrow bands of the signal). Now what's interesting, is that at low bitrates (~64 kbps) the "killed" and most aggressively quantized MDCT coefficients will smoothen up the time-domain audio wave in a subtle way, adding a kind of noise that sounds quite different (less "rough/icy" but more "watery/soft sound") than for a subband codec (which creates band-limited bursts of white noise). The "texture" of sound is an important point in low-bitrate encoding IMHO, and sticks to each user's personal preference.

My point is that if, at low bitrates, Musepack sounds quite good, it's only because of its accurate psy model and good entropy coding, which compensate (to a certain extent) for the bigger burden of time-domain encoding at forced low bitrate. Of course, at higher average "bits-per-sample", the time-domain encoding will become a strong point.

If we stay in low bitrates, then the efficiency of time-domain coding is closely tied (in very low-bitrate applications) to the type of signal involved.

About the SMR again - we have seen that Musepack, to reach ~64kbps, MUST take the risk of estimating a too low SMR sometimes. And to reach a ~64kbps bitrate with Musepack, one has to force the codec to use too aggressive SMR reductions in many, many places. This can be heard - and again, I don't think this can be called a psymodel "problem".

Now my claim is that with (for example) Vorbis, the encoding is easier at these bitrates, because we can cut into time resolution also, to gain bits - and for these reasons, the encoder can afford to use safer, (and statistically higher) SMR values on highly tonal sounds.

Of course, as soon as masking or violent transients are involved, mpc might take the lead again.

To sum it up, I do NOT think that taking even more risks (in the psymodel) in estimating the suitable SMR would solve the problem of bit shortage. I think most improvements would come from predicting the signal better - but again, for me compression has always been "prediction + arithmetic coding", so I admit my opinion might be biased ;-)

Keep up the excellent work, Frank. Your codec is King B)

* * *
To complete my answer and clear up any confusion (for non-mpc users) concerning my post:
- don't get me wrong, Musepack, to me does a beautiful job at audio compression. I personally think that subbanding is a more elegant solution than MDCT.
- it's just an experiment, with low bitrates reached using non-transparent mpc settings.

[I've been editing this post to make it clearer to read]