AAC beaten at low bitrates, why?
Phobos
post Aug 7 2002, 00:13
Post #26





Group: Members
Posts: 290
Joined: 5-April 02
From: Guadalajara, Jalisco
Member No.: 1693



i personally can tolerate the artifacts of AAC at ~96 kbps but not lower. i think AAC hides the crappiness of the sound a little, while Vorbis accentuates it. well, that's just my point of view...
Frank Klemm
post Aug 7 2002, 00:41
Post #27


MPC Developer


Group: Developer
Posts: 543
Joined: 15-December 01
From: Germany
Member No.: 659



QUOTE
Originally posted by guruboolez
Frank, I have a question that has haunted me for months... I have been encoding my classical music in mpc for months, and I noticed immediately after leaving mp3 that Musepack behaves strangely with some instruments. Piano doesn't need much bitrate, with mp3, mpc or Vorbis. But a violin (not a critical instrument) seems overrated by mppenc: +20% (200 on --standard; 230-240 on extreme, etc.). Harpsichord, organ... the same thing (a bit less for organ, but harpsichord is more problematic). With --alt-preset standard, I obtain 180 kb/s and never reach 200 kb/s: mp3 is very cool for classical listeners who don't like Metallica. But with Musepack, Mozart needs as much bitrate as AC/DC with mp3 encoding :mad: :P

I recently found a strange and forgotten instrument called the glass harmonica: a horrible and distorted sound!!! Brrr... With --alt-preset standard, an adagio (quiet but awful music) needs only 150 kb/s! With mpc --standard: 250 kb/s!!!!

Why does distorted music (harpsichord, baroque instruments) need so much bitrate, and why doesn't heavy metal? Can you, or someone else, help me to understand these big differences? Thanks a lot

[sorry for my poor expression, and thanks again for your work]


Harpsichord is one of the most difficult instruments to encode.


--------------------
-- Frank Klemm
guruboolez
post Aug 7 2002, 01:11
Post #28





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



QUOTE
Originally posted by Frank Klemm


Harpsichord is one of the most difficult instruments to encode.


One of the nicest, too...

But I don't understand why mp3 finds it easy to encode (normal bitrate 190 kb/s with --aps) and explodes with an electric guitar (I haven't tried it myself, but I have read it many times on many forums). And why does mpc choose to increase the bitrate on a simple violin (one note held for 10 seconds: there are no attacks, but the bitrate jumps over 200 kb/s with --quality 5!), and not on Slayer's cacophony? Is a violin grainy, with a lot of hidden details? If my supposition is true, then it's a good thing if mpc encodes it perfectly, and big progress over mp3: I want details, even if I can't hear them.
But is my supposition true? I often see a big bitrate jump on distorted instruments: do sub-band encoders have problems with these sounds? Do mpc or mp2 need more bitrate for a good encoding of distorted signals?

I'm sure you know the answers. Thanks for your reply...
Dibrom
post Aug 7 2002, 02:41
Post #29


Founder


Group: Admin
Posts: 2958
Joined: 26-August 02
From: Nottingham, UK
Member No.: 1



As said before, MP3 is not a good format to draw quality/bitrate correlations from, because encoding frequencies over 16 kHz will cause the bitrate to increase tremendously, even if the content should otherwise be easy to encode. Metal is full of this type of thing.

If you compare a metal encode with aps to a similar quality encode with mpc, aac, or vorbis, none of them will be as high in bitrate... and that doesn't mean they are lower quality either.

As for why MPC increases the bitrate a lot on those other signals, I suspect it is due to its psymodel, which is significantly more advanced than LAME's, simply deciding that more accuracy is needed. I'm inclined to believe that this is a Good Thing, especially since encoding artifacts are quite rare with MPC compared to other formats.

Again, LAME is really not a good candidate for comparison.

Oh, and theoretically subband encoders are not as efficient in encoding some signals as transform coders (time based vs. frequency based), but I don't believe that is actually what is causing the bitrate increase here, because certainly the jump would not be that high... not in the 200kbps+ range at any rate. Of course, I could be wrong ;)
guruboolez
post Aug 7 2002, 03:02
Post #30





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



Thank you very much, Dibrom, for your answer.

LAME is maybe not a good candidate for comparison, but I'm a bit perplexed when I see the different behaviour between audio codecs. As I said in another topic, critical samples make codecs go crazy (+100% for mpc standard, -66% on Vorbis, and CBR that is not CBR at all...). In that case, it is funny. But with real musical tracks (I apologise to the people who listen to short_block.wav every morning), it is interesting. Wouldn't it be more clever not to choose only ONE codec for encoding CDs, but to choose the best one for each instrument? MPC for electronic music, Vorbis or AAC for violin, and mp3 for glass harmonica? Or even change for each movement (I noticed that quiet movements, largo, andante..., need more bitrate than vivace, allegro, presto...)? Not very practical. But strange for a guy who was familiar with mp3 behaviour.
Dibrom
post Aug 7 2002, 03:42
Post #31


Founder


Group: Admin
Posts: 2958
Joined: 26-August 02
From: Nottingham, UK
Member No.: 1



A good psymodel should be able to cope with almost any situation thrown at it. I don't think a different encoder is needed for every type of music... that approach makes no sense IMO. Furthermore, unless you've actually heard a problem with a codec like MPC for example (which you can provide samples and abx scores for), then why bother worrying about the bitrate it chooses on a particular sample? (given that the average bitrate overall is acceptable, which I believe it is)

Let the psymodel decide what it thinks is right, and let your hearing do the rest. Trying to analyze quality or interpret results based on bitrate and situation is pretty much just as meaningless as looking at spectrograms and trying to decide which codec is best that way.
spase
post Aug 7 2002, 03:44
Post #32





Group: Members
Posts: 773
Joined: 23-October 01
From: USA
Member No.: 340



i will try to answer a few questions here... mind you, i'm not an expert on the subject.

first off, slower more quiet movements need more bitrate, because there is less volume in general, and thus less volume of noise and sound to cover up other sounds. as such, the encoder keeps more information about the sound, because more of it must be played back, as less of it is "covered up" by louder noise and sounds. (i hope you can understand this... and i hope i am correct at least partially on this)

as for mpc having higher bitrates, i believe you are correct in the idea that harpsichord, violin, and glass harmonica, while seemingly simple, are actually quite complex. i personally have some tracks by Blues Traveler, in which John Popper plays a normal harmonica, and when it is solo and at lower volume, the mpc at standard --ltq fil jumps to over 230 kbps. later when it is "covered up" by drums and guitars and some audience noise, the bitrate drops down to about 190 kbps or so. i would believe this to be another example.

one last thing. this same sort of "phenomenon" can be seen in the encoding of live music. when the audience applauds, the encoder of course does not keep the sound of every single hand clapping, but (in the case of a large audience) there are so many hands clapping that there is a tough job to be done when deciding what to keep and what to get rid of. i have noticed that mp3 does a very bad job, and the applause sounds like the ocean (very swishy) while mpc handles it fairly well (albeit allotting a lot of bitrate to a seemingly simple sample). this is perhaps less obvious, as when you hear applause in a track, you tend to ignore it, or your own mental "image" of what applause sounds like takes its place, and you associate that sound with the sound coming from your speakers.

i hope i am correct in what i am saying, and i hope i have helped you with your answers, guru


--------------------
http://www.last.fm/user/spase

-spase-
guruboolez
post Aug 7 2002, 03:59
Post #33





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



QUOTE
Originally posted by Dibrom
Furthermore, unless you've actually heard a problem with a codec like MPC for example (which you can provide samples and abx scores for), then why bother worrying about the bitrate it chooses on a particular sample? (given that the average bitrate overall is acceptable, which I believe it is)


I totally agree with you. Harpsichord CDs are compensated by piano CDs. For me it's not a problem, and I don't really care about it. But as I often read on this forum, space ALWAYS matters, no?




QUOTE
Originally posted by spase
first off, slower more quiet movements need more bitrate, because there is less volume in general, and thus less volume of noise and sound to cover up other sounds.  as such, the encoder keeps more information about the sound, because more of it must be played back, as less of it is "covered up" by louder noise and sounds. (i hope you can understand this... and i hope i am correct at least partially on this)


I understand perfectly. It makes sense... it's just that mp3 seems to have the opposite behavior, and I was educated with another model.

QUOTE
as for mpc having higher bitrates, i believe you are correct in the idea that harpsichord, violin, and glass harmonica, while seemingly simple, are actually quite complex.

Yes, of course, I can feel this complexity. And I hear bad distortion on harpsichord encoded with mp3 (--r3mix and --aps): bitrate is low, quality too...

QUOTE
i personally have some tracks by Blues Traveler, in which John Popper plays a normal harmonica, and when it is solo and at lower volume, the mpc at standard --ltq fil jumps to over 230 kbps. later when it is "covered up" by drums and guitars and some audience noise, the bitrate drops down to about 190 kbps or so. i would believe this to be another example.


Same thing for a violin concerto (or an orchestral work with a solo violin in the middle of the track): +20-30% for a solo instrument, less for the 120 others playing at the same time.

QUOTE
i hope i am correct in what i am saying, and i hope i have helped you with your answers, guru


Very well, spase. Thanks a lot for this answer.
But don't blame me if I ask again for the opinion of the mpc developer: just out of curiosity.

Thanks a lot for all the answers :)
spase
post Aug 7 2002, 04:09
Post #34





Group: Members
Posts: 773
Joined: 23-October 01
From: USA
Member No.: 340



QUOTE
Originally posted by guruboolez

But as I often read it on this forum, space ALWAYS matters, no ?


lol

spase always matters? :diabolic:

QUOTE
Same thing for a violin concerto (or an orchestral work with a solo violin at the middle of the track) : +20-30% for a solo instrument, less for the 120 others playing at the same time.


anyhow, yes, as in the beginning of the first movement of Rimsky-Korsakov's Scheherazade ("the sea and sinbad's ship" in english)

or perhaps part way through the Russian Easter Overture, by the same composer (one of my favorite composers indeed)

anyhow, i'm glad to help...

by reading your other posts, it's obvious your ears surpass mine... i have an interesting sample... i used it once when i was trying to decide between mp3, vqf, wma, and the three new (at the time) formats of ogg, aac, and mpc... at the time i weighed in factors of encoding time, decoding speed/cpu usage, average size, and of course audio quality, and i found mpc to be the most favorable when considering all options.

if you are interested i could send the file to you via email to see if there are any artifacts left in it... i'm sure it would be best to test on quite low bitrates... i know this is getting a wee bit off topic :D but hey whatever...

email or send me a private message here on the forum if you are more interested in the file...

edit: lol by now you'd think i would know how to format these posts!


--------------------
http://www.last.fm/user/spase

-spase-
spase
post Aug 7 2002, 04:12
Post #35





Group: Members
Posts: 773
Joined: 23-October 01
From: USA
Member No.: 340



QUOTE
Very well, spase. Thanks a lot for this answer.
But don't blame me if I request again the opinion of the mpc developer : just for curiosity.


he doesn't visit this forum THAT much... better to drop an email his way...

(sorry for double post) :(

edit: once again the formatting....


--------------------
http://www.last.fm/user/spase

-spase-
unplugged
post Aug 7 2002, 04:31
Post #36





Group: Members
Posts: 86
Joined: 9-March 02
From: Sicily
Member No.: 1469



QUOTE
Originally posted by spase
first off, slower more quiet movements need more bitrate, because there is less volume in general, and thus less volume of noise and sound to cover up other sounds.


So... can we guess that these are complicated situations for time domain based codecs (like MPC)?

Because of the format's structure, it's well suited to encoding and confining transients/space/bitrate but ... for slowly evolving signals it has trouble concisely synthesizing the real "body" of the sound: the frequencies? :o


:D Can any volunteer briefly explain (I don't require simple words; no obligation, guys) what the time-based coding of MPC consists of?

Time... in which sense? Shouldn't it encode frequencies too, like the other DCT-based codecs?
spase
post Aug 7 2002, 04:42
Post #37





Group: Members
Posts: 773
Joined: 23-October 01
From: USA
Member No.: 340



again i'm not an expert, but assuming the temporal compression (i guess that's what you are talking about) is similar to that used in movie compression, i would guess that when adjacent frames are not even close to being similar it would pose a problem.

i don't know all that much about the insides of the codecs... but i try :)


--------------------
http://www.last.fm/user/spase

-spase-
niktheblak
post Aug 7 2002, 08:21
Post #38





Group: Members (Donating)
Posts: 302
Joined: 3-October 01
From: Finland
Member No.: 188



QUOTE
Originally posted by unplugged

So... Can we guess these are complicated situations for time domain based codecs? (like MPC)


IIRC the only time domain based codecs are the lossless ones. Since Musepack performs spectral analysis and subband encoding, a Fourier transform is somewhat of a necessity. You can't speak of, e.g., a 0-200 Hz frequency band when in the time domain, can you? Time domain usually involves things like Huffman codes, RLLs, convolution and such.

Now that I'm at it, why does everyone keep saying that codecs using DCT are the only "transform" codecs whereas codecs like MPC (using FFT) are "subband" codecs with nothing to do with transforms at all?

"Subband" encoding does use the discrete Fourier transform. "Transform" encoding uses the discrete cosine transform. Mathematically speaking, these transforms are nearly identical, with the cosine transform being nothing but a cosine-termed (is this the correct English expression?) Fourier series (a Fourier transform without the sine, or Im X, part). The cosine transform just makes energy representation a little easier than the Fourier transform. The differences between subband and transform codecs are much more profound than a simple variation in a transform equation.

Well, so much for pathos :D
Gecko
post Aug 7 2002, 11:10
Post #39





Group: Members
Posts: 934
Joined: 15-December 01
From: Germany
Member No.: 662



About the glass organ which was encoded at only 150kbps by mpc. The recent versions of mpc have included code that takes better advantage of stereo correlation between channels and thus can compress tracks with minor stereo separation a lot better. (Although they don't sound "mono-ish" to me). I can't tell beforehand by listening to a track if the stereo correlation can be exploited well or not. I'm often surprised by what the encoder does but I can't hear any flaws. There has been a long discussion about this on the phorum, but beware: the method Kevin T. uses to analyze codecs is flawed! Do not be misled by his posts!

Now I'm not sure if this applies to your glass organ sample, but it could be a possible explanation why such a tonal instrument gets encoded at such a low bitrate. But hey: if it sounds good, then don't worry about the bitrate. :)
guruboolez
post Aug 7 2002, 11:20
Post #40





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



Let me stop you: mp3 gives 150 kb/s and mpc 250 kb/s.
And I discovered the problem before mppenc 1.00 (in the 0.90 era). Frank Klemm has nothing to do with it :D
Last thing: it's a glass harmonica, not an organ, a very strange and rare instrument.
Gecko
post Aug 7 2002, 12:01
Post #41





Group: Members
Posts: 934
Joined: 15-December 01
From: Germany
Member No.: 662



Possibly this is all nonsense but here goes anyway:

A subband codec only splits up the signal into, well, subbands. Because each frequency band can be encoded at a different resolution/sampling rate (see Nyquist), this already reduces the number of bits needed to represent the signal. This process is not lossless but nearly. I don't know how this is done but I see no way around a Fourier transformation or something similar. Then the signals of the subbands are quantized. Here you are operating in the time domain again. The quantization process is where all the psymodel magic happens.
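
A rough sketch of the Nyquist point, not Musepack's actual code: if a signal sampled at fs is split into M equal-width bands, each band is only fs/(2M) wide and can be kept at fs/M samples per second. The 32-band split is the MPEG-1 Layer I/II layout and is only assumed here, and the per-band bit allocation is made up for the example.

```python
def subband_layout(sample_rate_hz, num_bands):
    """Return (band width in Hz, per-band sample rate in Hz) for equal-width bands."""
    band_width_hz = sample_rate_hz / (2 * num_bands)
    band_rate_hz = sample_rate_hz / num_bands  # critically sampled: 2 x band width
    return band_width_hz, band_rate_hz


def bitrate_kbps(sample_rate_hz, bits_per_band):
    """Bitrate of one channel when band i is stored with bits_per_band[i] bits per sample."""
    band_rate_hz = sample_rate_hz / len(bits_per_band)
    return sum(bits * band_rate_hz for bits in bits_per_band) / 1000.0


SAMPLE_RATE_HZ = 44100  # CD audio, assumed for the example
NUM_BANDS = 32          # MPEG-1 Layer I/II style split, assumed for the example

band_width_hz, band_rate_hz = subband_layout(SAMPLE_RATE_HZ, NUM_BANDS)

# 16 bits in every band is just ordinary 16-bit PCM spread over the bands.
pcm_kbps = bitrate_kbps(SAMPLE_RATE_HZ, [16] * NUM_BANDS)

# Hypothetical allocation: more bits for low bands, none for the top ones.
allocation = [8] * 4 + [5] * 8 + [3] * 12 + [0] * 8
coded_kbps = bitrate_kbps(SAMPLE_RATE_HZ, allocation)

print(f"each band: {band_width_hz:.1f} Hz wide, {band_rate_hz:.1f} samples/s")
print(f"PCM: {pcm_kbps:.1f} kbps, example allocation: {coded_kbps:.1f} kbps")
```

In this toy model the split itself keeps the total number of samples per second unchanged; the reduction comes from giving each band only as many bits as it needs, which is the psymodel's decision.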

A transform codec first moves everything into the frequency domain, usually by MDCT. Then the MDCT coefficients are quantized (not the signal itself). So here you are working in the frequency domain. This works somewhat more efficiently, so you can generally achieve lower bitrates while maintaining the same quality. The MDCT is where pre-echo can happen, because when converting to the frequency domain you have to work on a relatively large number of samples (or better: over a period of time, which cannot be infinitesimally small).

When you do the conversion from the time domain into the frequency domain you always have to make a compromise between time resolution and frequency resolution. There's a nice physical example which illustrates this:

You can build a device to measure the frequency of a sound by having several metal prongs of different lengths aligned next to one another like a comb. Each of these prongs is resonant at a different frequency. Now imagine you play a sound. Sound waves are basically swinging impulses of different air pressure levels. The speaker which is playing the sound moves the air at a certain frequency. But the very first impulse of air hits all prongs the same way. They will all start moving at their own resonant frequency. When the second impulse comes along some prongs will be affected more than others because their own movement correlates with the movement of the air. This goes on and on and in the end, in theory, only one prong will be left swinging, the one which matches the frequency of the speaker.

The point is: it takes some time until you can determine what frequency is playing by looking at the prongs. If you were to decrease the number of prongs this would make the process faster because the resonant frequencies of the individual prongs are further apart. So you have increased the time resolution but had to sacrifice frequency precision.

If you increase the number of prongs (higher frequency resolution) it will take longer to determine which frequency is playing because some prongs whose resonant frequency is close to the one being played will take longer to stop swinging. Thus you lose time resolution.

This is not all bad because our ear works in a similar way. I hope you can understand all this as I would be better at explaining this in German.
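
The prong picture has a direct counterpart in a block transform: the more samples per block, the closer together the frequency bins, but the longer the stretch of time each block covers. A minimal sketch, assuming a plain N-point DFT at 44.1 kHz (real codecs use windowed, overlapped transforms, so the numbers are only indicative):

```python
def transform_resolution(sample_rate_hz, block_size):
    """Return (block duration in ms, DFT bin spacing in Hz) for one transform block."""
    duration_ms = 1000.0 * block_size / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / block_size
    return duration_ms, bin_spacing_hz


SAMPLE_RATE_HZ = 44100  # CD audio, an assumption for this example

for block_size in (64, 256, 1024, 4096):
    duration_ms, bin_spacing_hz = transform_resolution(SAMPLE_RATE_HZ, block_size)
    print(f"{block_size:5d} samples: {duration_ms:7.2f} ms per block, "
          f"{bin_spacing_hz:7.2f} Hz between bins")
```

Whatever block size is picked, block duration times bin spacing stays constant, which is the compromise described above: finer frequency steps always cost a longer look at the signal.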
Gecko
post Aug 7 2002, 12:22
Post #42





Group: Members
Posts: 934
Joined: 15-December 01
From: Germany
Member No.: 662



QUOTE
I stop you : mp3 gives 150 kb/s and mpc 250 kb/s
And I discover the problem before mppenc 1.00 (the 0.90 era). Frank Klemm has nothing to do with it biggrin.gif
My apologies for my mistake! Mpc's high bitrate is fairly easy to explain then. When you quantize the signal and lose some of the info, you introduce noise. This noise has to be hidden somewhere among the other sounds so you can't hear it. If your sample is very tonal there is not much space to hide the noise because the human ear hears the difference easily. Ergo you have to quantize with higher resolution to introduce less noise, and this in turn raises the bitrate.

Mp3 on the other hand only has to store a few of the mdct coefficients, because most likely the signal will be made up of just some simple waveforms + overtones. (See my other post above)
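
To put a number on "higher resolution means less noise", here is a small sketch with a plain uniform quantizer applied to a full-scale sine. This is only a stand-in for illustration: a real encoder quantizes subband samples or transform coefficients with scale factors, and the test tone and sample count below are arbitrary choices.

```python
import math


def quantize(x, bits):
    """Uniform quantizer for x in [-1, 1] with a step of 2 ** -(bits - 1)."""
    levels = 2 ** (bits - 1)
    return round(x * levels) / levels


def snr_db(bits, num_samples=4096, cycles=37):
    """Signal-to-noise ratio in dB of a full-scale sine after quantization."""
    signal = [math.sin(2 * math.pi * cycles * n / num_samples)
              for n in range(num_samples)]
    noise = [s - quantize(s, bits) for s in signal]
    signal_power = sum(s * s for s in signal) / num_samples
    noise_power = sum(e * e for e in noise) / num_samples
    return 10 * math.log10(signal_power / noise_power)


for bits in (4, 6, 8, 10):
    print(f"{bits:2d} bits per sample: SNR about {snr_db(bits):.1f} dB")
```

Each extra bit per sample pushes the noise down by roughly 6 dB, so when a tonal signal leaves little room to hide the noise, the only way to get it low enough is to spend more bits.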

Just for the record: 0.90 is already the Klemm era, but in the beginning most optimizations were related to speed.
QUOTE
Last thing : it's glass harmonica, not organ,a very strange and rare instrument
Well, in German it's called "Wasserorgel" where Orgel = organ, just me mixing things up. :rolleyes:

Cheers :)
unplugged
post Aug 7 2002, 18:04
Post #43





Group: Members
Posts: 86
Joined: 9-March 02
From: Sicily
Member No.: 1469



Thanks guys,
Gecko explained the matter very well ;)

So a not insignificant weakness of transform codecs is the response time!! What a weakness!
That's not great news to learn today... :'(

Hmm... for example, with transform codecs we cannot record a frequency that rises or falls continuously (a slide); we can "only" record frequency X at time Y, subject to the time resolution/granularity Z (a compromise).


Thanks again for the interest; I must say this mostly happens only at HA.
jalonsom
post Aug 7 2002, 19:23
Post #44





Group: Members
Posts: 17
Joined: 2-June 02
From: Spain
Member No.: 2195



Now that you're talking about subband vs. transform, I am wondering why nobody is taking advantage of each method. How about a hybrid codec that would decide to encode each frame either as subband or transform?
Maybe frames with strong transients could be encoded as well as mpc does, and more tonally simple ones could be encoded as efficiently as with aac or vorbis...

Of course it would require twice the effort in coding and tuning and it would encode slower. Do you think that bitrate reduction would justify the drawbacks? Maybe vorbis could implement subband in 2.0 ?

Maybe this is all nonsense?

Jaime.
rjamorim
post Aug 7 2002, 19:58
Post #45


Rarewares admin


Group: Members
Posts: 7515
Joined: 30-September 01
From: Brazil
Member No.: 81



QUOTE
Originally posted by jalonsom
How about a hybrid codec that would decide to encode each frame either as subband or transform?


That codec is called MPEG Audio Layer 3. :D

Ever heard about MP3's hybrid filterbank?


--------------------
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org
Garf
post Aug 7 2002, 20:00
Post #46


Server Admin


Group: Admin
Posts: 4853
Joined: 24-September 01
Member No.: 13



QUOTE
Originally posted by rjamorim


That codec is called MPEG Audio Layer 3. :D


No, MP3 _always_ does _both_.

Switching between the two is problematic because transform codecs use overlapped windows, i.e. you can't really encode frames independently.
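
To see why the frames of a transform codec are not independent, here is a minimal sketch assuming the sine window commonly used with the MDCT and 50% overlap (the block size is arbitrary). Every output sample is rebuilt from two neighbouring windowed blocks, and the squared window values that meet in the overlap have to add up to one for the overlap-add to give the signal back.

```python
import math


def sine_window(block_size):
    """Sine window of the given length, as commonly paired with the MDCT."""
    return [math.sin(math.pi * (n + 0.5) / block_size) for n in range(block_size)]


def overlap_power_sum(window):
    """Sum of squared window values where two 50%-overlapped blocks meet."""
    half = len(window) // 2
    return [window[n] ** 2 + window[n + half] ** 2 for n in range(half)]


BLOCK_SIZE = 16  # arbitrary small size for the example
sums = overlap_power_sum(sine_window(BLOCK_SIZE))
print("largest deviation from 1:", max(abs(s - 1.0) for s in sums))
```

Because each block only carries its own share of every sample, swapping in a differently coded frame for one block would break the sum for both of its neighbours.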

--
GCP
Phobos
post Aug 7 2002, 20:40
Post #47





Group: Members
Posts: 290
Joined: 5-April 02
From: Guadalajara, Jalisco
Member No.: 1693



blah, AAC beats MP3, and it is a transform codec
Gecko
post Aug 7 2002, 20:43
Post #48





Group: Members
Posts: 934
Joined: 15-December 01
From: Germany
Member No.: 662



QUOTE
Maybe vorbis could implement subband in 2.0
Afaik "Subband Technology" is unfortunately patented by Philips. See here, number 8.
Frank Klemm
post Aug 7 2002, 22:53
Post #49


MPC Developer


Group: Developer
Posts: 543
Joined: 15-December 01
From: Germany
Member No.: 659



QUOTE
Originally posted by Gecko
Afaik "Subband Technology" is unfortunately patented by Philips. See here, number 8.


Subband coding is NOT patented.
Read patents very very carefully or don't read them.
Perceptual noise substitution is also NOT patented.

When reading patents it is necessary to find out what EXACTLY is patented.


--------------------
-- Frank Klemm
HotshotGG
post Aug 7 2002, 23:15
Post #50





Group: Members
Posts: 1593
Joined: 24-March 02
From: Revere, MA
Member No.: 1607



QUOTE
Maybe vorbis could implement subband in 2.0


That was not the main bone of contention in the first place. Monty decided NOT to use subbands for reasons of computational complexity and whatever other reasons he may have had. It also paved the way for more experimentation, for example, theoretically speaking, with the hybrid discrete wavelet filterbanks that I so fondly try to outline whenever people ask.

:gah:


--------------------
College student/IT Assistant