Personal Listening Test of MP3 encoders at 224kbps

Abstract:
Blind comparison of LAME 3.100i -V2+, LAME 3.99 -V1, LAME 3.98 CBR 224kbps -q 0, Helix -V146, and BladeEnc CBR 224kbps (low anchor).

Encoders:
LAME 3.100i
http://www.hydrogenaudio.org/forums/index....showtopic=99483
LAME 3.99.5 VBR V1
http://www.rarewares.org/mp3-lame-bundle.php
LAME 3.98.4 CBR 224kbps -q 0 (slowest)
Helix mp3enc v5.1 Open Source encoder 2005-12-20 -V146
http://www.rarewares.org/mp3-others.php
BladeEnc 0.94.2 CBR 224kbps (low anchor)

Settings:
lame3100i -S -V2+ input.wav  output.mp3
lame3.99.5 -S -V1 input.wav output.mp3
lame3.98.4 -S -q 0 -b 224 input.wav output.mp3
hmp3 input.wav output.mp3 -X2 -U2 -V146
bladeenc -quit -nocfg input.wav output.mp3 -224
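
To repeat these encodes over a whole set of samples, the five command lines above could be generated by a short script. This is only a sketch: the binary names and switches are copied from the settings list, while the `build_commands` helper and the output file naming are assumptions made for illustration.

```python
from pathlib import Path

# Binaries and switches as listed in the settings above.
# "{in}" and "{out}" are placeholders filled in per sample (illustrative only).
ENCODERS = {
    "3.100iV2+": ["lame3100i", "-S", "-V2+", "{in}", "{out}"],
    "3.99V1": ["lame3.99.5", "-S", "-V1", "{in}", "{out}"],
    "3.98CBR": ["lame3.98.4", "-S", "-q", "0", "-b", "224", "{in}", "{out}"],
    "Helix-V146": ["hmp3", "{in}", "{out}", "-X2", "-U2", "-V146"],
    "BladeEncCBR": ["bladeenc", "-quit", "-nocfg", "{in}", "{out}", "-224"],
}


def build_commands(wav):
    """Return one argument list per encoder for the given WAV file."""
    stem = Path(wav).stem
    commands = {}
    for label, template in ENCODERS.items():
        out = f"{stem}_{label}.mp3"  # hypothetical naming scheme
        commands[label] = [
            arg.replace("{in}", wav).replace("{out}", out) for arg in template
        ]
    return commands


if __name__ == "__main__":
    for label, cmd in build_commands("input.wav").items():
        # Each list could be passed to subprocess.run(cmd, check=True) to encode.
        print(label, "->", " ".join(cmd))
```

Keeping the arguments as lists avoids shell quoting problems with sample names that contain spaces.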

Samples:
25 samples of various genres.

Hardware:
Sony PSP-3000 + RP-HT560.

Results:



Conclusions & Observations:
I could not find a significant difference between the encoders, except for the low anchor. There are no big differences in the average quality of the other four encoders.

ANOVA analysis:
Code:
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 25
Critical significance:  0.05
Significance of data: 0.00E+000 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of          Degrees of   Sum of    Mean
variation          freedom      squares   square   F       p

Total              124          23.56
Testers (blocks)    24           7.75
Codecs eval'd        4           9.24     2.31     33.80   0.00E+000
Error               96           6.56     0.07
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:  0.147

Means:

Helix-V1 3.100iV2 3.98CBR  3.99V1  BladeEnc
  4.60    4.58    4.57    4.54    3.90

---------------------------- p-value Matrix ---------------------------

        3.100iV2 3.98CBR  3.99V1  BladeEnc
Helix-V1 0.829    0.706    0.419    0.000*
3.100iV2          0.871    0.553    0.000*
3.98CBR                    0.666    0.000*
3.99V1                              0.000*
-----------------------------------------------------------------------

Helix-V146 is better than BladeEncCBR
3.100iV2+ is better than BladeEncCBR
3.98CBR is better than BladeEncCBR
3.99V1 is better than BladeEncCBR
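
For readers who want to check the table without the FRIEDMAN tool, here is a rough sketch of a blocked (randomized block) ANOVA computed from the 25 x 5 scores of this test. It is not ff123's code. The critical t value for 96 degrees of freedom is hard-coded as an approximation, and the function and variable names are made up for the example.

```python
# Scores per sample (rows), columns in raw-data order:
# 3.100iV2+, 3.99V1, 3.98CBR, Helix-V146, BladeEncCBR
SCORES = [
    [4.7, 4.6, 4.0, 4.3, 3.1], [4.3, 4.2, 4.6, 4.8, 3.8],
    [4.5, 4.5, 4.4, 5.0, 4.7], [4.8, 5.0, 4.6, 5.0, 4.3],
    [4.7, 4.5, 4.2, 4.4, 3.5], [4.7, 4.3, 5.0, 4.6, 4.5],
    [4.4, 5.0, 3.8, 4.7, 3.9], [4.2, 4.5, 4.4, 4.5, 3.8],
    [4.3, 4.2, 4.0, 4.5, 3.2], [4.4, 4.3, 5.0, 4.6, 3.4],
    [4.0, 4.3, 4.5, 4.6, 3.5], [4.5, 4.2, 4.4, 4.6, 3.6],
    [4.2, 4.5, 5.0, 4.7, 4.0], [4.3, 4.1, 4.3, 4.1, 3.5],
    [4.2, 4.2, 4.4, 4.6, 3.9], [5.0, 4.5, 5.0, 5.0, 4.1],
    [5.0, 4.3, 4.7, 4.4, 4.0], [4.5, 4.4, 4.2, 4.0, 3.2],
    [5.0, 5.0, 5.0, 4.5, 4.4], [5.0, 5.0, 4.7, 5.0, 4.4],
    [5.0, 4.8, 4.8, 4.6, 4.2], [4.7, 5.0, 5.0, 4.5, 3.9],
    [4.8, 5.0, 5.0, 4.6, 4.2], [5.0, 5.0, 4.7, 5.0, 4.2],
    [4.4, 4.1, 4.6, 4.4, 4.1],
]


def blocked_anova(rows):
    """Two-way ANOVA without replication: rows are blocks, columns are codecs."""
    n_blocks, n_codecs = len(rows), len(rows[0])
    grand = sum(map(sum, rows)) / (n_blocks * n_codecs)
    row_means = [sum(r) / n_codecs for r in rows]
    col_means = [sum(r[j] for r in rows) / n_blocks for j in range(n_codecs)]
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_blocks = n_codecs * sum((m - grand) ** 2 for m in row_means)
    ss_codecs = n_blocks * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_blocks - ss_codecs
    ms_codecs = ss_codecs / (n_codecs - 1)
    ms_error = ss_error / ((n_blocks - 1) * (n_codecs - 1))
    return {
        "ss_total": ss_total, "ss_blocks": ss_blocks,
        "ss_codecs": ss_codecs, "ss_error": ss_error,
        "ms_error": ms_error, "F": ms_codecs / ms_error,
        "means": col_means,
    }


# Assumption: two-sided 5% critical value of Student's t at 96 df, about 1.985.
T_CRIT_96DF = 1.985

result = blocked_anova(SCORES)
# Fisher's LSD for comparing two codec means, each averaged over all blocks.
lsd = T_CRIT_96DF * (2 * result["ms_error"] / len(SCORES)) ** 0.5

if __name__ == "__main__":
    print("F = %.2f, LSD = %.3f" % (result["F"], lsd))
    print("means:", ["%.2f" % m for m in result["means"]])
```

Two codec means that differ by more than the LSD are treated as significantly different, which is why only the BladeEnc pairings are flagged in the p-value matrix.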
Raw data:
Code:
% MP3 224kbps ABC/HR Score
% This format is compatible with my graphmaker, as well as ff123's FRIEDMAN.
3.100iV2+ 3.99V1 3.98CBR Helix-V146 BladeEncCBR
%feature 7 LAME LAME LAME Other Other
4.700 4.600 4.000 4.300 3.100
4.300 4.200 4.600 4.800 3.800
4.500 4.500 4.400 5.000 4.700
4.800 5.000 4.600 5.000 4.300
4.700 4.500 4.200 4.400 3.500
4.700 4.300 5.000 4.600 4.500
4.400 5.000 3.800 4.700 3.900
4.200 4.500 4.400 4.500 3.800
4.300 4.200 4.000 4.500 3.200
4.400 4.300 5.000 4.600 3.400
4.000 4.300 4.500 4.600 3.500
4.500 4.200 4.400 4.600 3.600
4.200 4.500 5.000 4.700 4.000
4.300 4.100 4.300 4.100 3.500
4.200 4.200 4.400 4.600 3.900
5.000 4.500 5.000 5.000 4.100
5.000 4.300 4.700 4.400 4.000
4.500 4.400 4.200 4.000 3.200
5.000 5.000 5.000 4.500 4.400
5.000 5.000 4.700 5.000 4.400
5.000 4.800 4.800 4.600 4.200
4.700 5.000 5.000 4.500 3.900
4.800 5.000 5.000 4.600 4.200
5.000 5.000 4.700 5.000 4.200
4.400 4.100 4.600 4.400 4.100
%samples 41_30sec hihats
%samples finalfantasy cemb
%samples ATrain Jazz
%samples BigYellow Pops
%samples FloorEssence Techno
%samples macabre orch
%samples mybloodrusts guitar
%samples Quizas Latin
%samples VelvetRealm Techno
%samples Amefuribana Pops
%samples Trust Gospel
%samples Waiting Rock
%samples Experiencia Latin
%samples Heart to Heart Pops
%samples Tom's Diner Vocal
%samples Reunion Blues Jazz
%samples French Speech
%samples undelete Pops
%samples Dimmu Borgir Metal
%samples Run up Pops
%samples German Speech
%samples ItCouldBeSweet Pops
%samples OnTheRoofWith Pops
%samples easy game Pops
%samples Tears Infection Pops
Bitrates:
Code:
259222 250962 224500 270110 224109
212206 190894 224404 200702 224012
210626 223963 224651 248472 224040
256744 246869 224559 243848 224081
272211 268745 224645 225813 224060
212126 222561 224771 234717 224101
227229 274100 224802 226008 224228
252353 237478 224475 243034 224060
264467 270433 225449 245293 224317
230309 214030 224517 219325 224051
243315 240742 224427 240749 224024
232944 226612 224709 251829 224129
256994 236299 224619 229343 224034
245990 237097 224533 243966 224121
220848 204723 224825 225298 224235
226596 224930 224500 248784 224110
274433 235266 224408 176326 224012
240666 234456 224458 230376 224032
218750 222924 224796 232772 224208
234844 237687 224774 229799 224189
210745 167946 224583 104856 224026
219320 180893 224796 161194 224211
211124 209905 224500 200196 224110
214539 204183 224648 182829 224167
226791 224161 224673 215760 224121
average:
235016 227514 224641 221256 224112
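
The per-encoder averages can be recomputed from the table with a few lines of Python. The sketch below only demonstrates the method on the first three rows. The helper names are invented for the example, and the columns are assumed to follow the same order as the raw data (3.100iV2+, 3.99V1, 3.98CBR, Helix-V146, BladeEncCBR).

```python
NOMINAL_BPS = 224_000  # nominal target of the test, 224 kbps


def column_averages(table_text):
    """Average each whitespace-separated column (tabs or spaces both work)."""
    rows = [
        [int(value) for value in line.split()]
        for line in table_text.strip().splitlines()
        if line.strip()
    ]
    return [sum(column) / len(rows) for column in zip(*rows)]


def percent_over_nominal(bitrate_bps):
    """Deviation from the 224 kbps target, in percent."""
    return (bitrate_bps - NOMINAL_BPS) / NOMINAL_BPS * 100


# First three rows of the table above, as a short demonstration.
SAMPLE = """
259222 250962 224500 270110 224109
212206 190894 224404 200702 224012
210626 223963 224651 248472 224040
"""

averages = column_averages(SAMPLE)

if __name__ == "__main__":
    for avg in averages:
        print("%.0f bps (%+.2f%% vs. 224 kbps)" % (avg, percent_over_nominal(avg)))
```

Pasting all 25 rows into `SAMPLE` gives the averages for the full sample set.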

Reply #1
Thanks for the work, nice to have such a detailed comparison. Did you use the -HF switch with the Helix encoder, btw?

Reply #2
It must have been a hard test, thank you very much.
Very interesting result. Judging from this, using 3.100i -V2+ isn't much of an advantage over using -V1.
lame3995o -Q1.7 --lowpass 17

Reply #3
Thanks for the work, nice to have such a detailed comparison. Did you use the -HF switch with the Helix encoder, btw?

hmp3 input.wav output.mp3 -X2 -U2 -V146
I didn't use the -HF switch, as the default option is typically the most recommended option by the developer(s).
But it may be interesting to test -HF, along with 3.100 alpha2 and 3.99.5f. I won't start testing them now, because I'll be very busy until June, and the rainy season starts in June.

Reply #4
I would very much welcome it if you could test LAME 3.100 alpha2 as well. In your current test, lame3.100i is compared against 3.99.5, which is not the same basis (though I don't think things will change essentially; I even expect 3.100 alpha2 to come out a little better than 3.99.5 does).

Reply #5
It seems the basis here is ~224 kbits.  If there is a desire to determine if there are advantages between Lame versions using the same V level, run a new test and present the results in a new discussion rather than attempt to co-opt this one.

Reply #6
???
I think it's interesting to see how 3.100a2 -V1 compares against 3.99.5 -V1 in its own right.
Sure I'm interested to see how 3.100i -V2+ compares against 3.100a2 -V1 (because the underlying basis is the same - except for the -V level of course).


Reply #8
Astounding listening test, and quite interesting. I am a long-time fan of HMP3 (a great time saver on a hum-drum machine).
I didn't use the -HF switch, as the default option is typically the most recommended option by the developer(s).
But it may be interesting to test -HF, along with 3.100 alpha2, and 3.99.5f.
I wanted to add that simply adding -HF1 / -HF2 will shift the bitrate slightly higher. Here are my quick test results (optional remarks follow):
Code:
-X2 -U2 -B146 ....... ~234kbps
-X2 -U2 -B146 -HF1 .. ~235kbps
-X2 -U2 -B145 -HF1 .. ~233kbps
-X2 -U2 -B143 -HF2 .. ~234kbps
-X2 -U2 -B142 -HF2 .. ~233kbps

Note: all these bitrates are above 224 kbps,
I am just going with the OP's switches.

Note2: Spectrogram observations of HMP3 regarding bitrate increase
(NOT a quality metric)
- Without -HF the material is clearly cut-off above 16kHz;
- Using -HF1 looks similar to LAME 3.99 -V2/-V3 (some material encoded >16kHz);
- Using -HF2 encodes like LAME 3.99 -V0/-V1 (gradual roll-off between 16-20kHz).
With regard to the 16kHz cutoff and the -HF switches, you can see from the OP that the
results are not as dramatic as some may believe.
"Something bothering you, Mister Spock?"

Reply #9
Kamedo2,
Thank you very much for sharing this test here. Great one!
I have followed your tests for some time, and it's clear to me why you used CBR for LAME 3.98.4: it performs better. http://d.hatena.ne.jp/kamedo2/20111214

I was reading your test and trying to process the information. Here are my thoughts.
Well, except for the low anchor, all encoders are on par. But some additional analysis would be useful to draw some extra conclusions.
  • The lowest score per encoder.
    All individual scores are >= 4.0 per sample, except for 3.98.4 CBR. That's where CBR fails, in my opinion. I would prefer a slightly lower average score if the score for each individual sample stayed at 4.0 or above. Admittedly, 3.98.4 CBR scored below 4.0 on only one sample, but it did the same in your previous test http://d.hatena.ne.jp/kamedo2/20111214
  • It's hardly a coincidence that the Helix MP3 encoder ends up with a slightly higher score each time, as here http://listening-tests.hydrogenaudio.org/s...8-1/results.htm and http://www.hydrogenaudio.org/forums/index....st&p=808142 . The Helix encoder is 7 years old and it still shines.
  • All average scores are >4.5 (except the low anchor), and you are an experienced listener. This suggests these encoders will be transparent for an average listener.
  • halb27's 3.100i -V2+ looks good. It has a slightly higher average score than the other LAME encoders (though the statistical analysis shows no significant difference), and all of its individual scores are 4.0 or higher.
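
The "lowest score per encoder" point above is easy to check against the raw data from the first post. The following is a small sketch. The column labels come from the raw data header, and the variable names are arbitrary.

```python
ENCODERS = ["3.100iV2+", "3.99V1", "3.98CBR", "Helix-V146", "BladeEncCBR"]

# Raw ABC/HR scores from the first post, one row per sample.
RAW = """
4.700 4.600 4.000 4.300 3.100
4.300 4.200 4.600 4.800 3.800
4.500 4.500 4.400 5.000 4.700
4.800 5.000 4.600 5.000 4.300
4.700 4.500 4.200 4.400 3.500
4.700 4.300 5.000 4.600 4.500
4.400 5.000 3.800 4.700 3.900
4.200 4.500 4.400 4.500 3.800
4.300 4.200 4.000 4.500 3.200
4.400 4.300 5.000 4.600 3.400
4.000 4.300 4.500 4.600 3.500
4.500 4.200 4.400 4.600 3.600
4.200 4.500 5.000 4.700 4.000
4.300 4.100 4.300 4.100 3.500
4.200 4.200 4.400 4.600 3.900
5.000 4.500 5.000 5.000 4.100
5.000 4.300 4.700 4.400 4.000
4.500 4.400 4.200 4.000 3.200
5.000 5.000 5.000 4.500 4.400
5.000 5.000 4.700 5.000 4.400
5.000 4.800 4.800 4.600 4.200
4.700 5.000 5.000 4.500 3.900
4.800 5.000 5.000 4.600 4.200
5.000 5.000 4.700 5.000 4.200
4.400 4.100 4.600 4.400 4.100
"""

rows = [[float(v) for v in line.split()] for line in RAW.strip().splitlines()]
columns = list(zip(*rows))  # one tuple of 25 scores per encoder

# Worst single-sample score and number of samples rated below 4.0, per encoder.
lowest = {name: min(col) for name, col in zip(ENCODERS, columns)}
below_4 = {name: sum(s < 4.0 for s in col) for name, col in zip(ENCODERS, columns)}

if __name__ == "__main__":
    for name in ENCODERS:
        print("%-12s lowest %.1f, samples below 4.0: %d"
              % (name, lowest[name], below_4[name]))
```

On this data, 3.98 CBR is the only encoder apart from the low anchor with a sample below 4.0 (a single 3.8).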

Reply #10
Yes, my thought is the same. Even though Helix's lead over LAME is slight, I like the way Helix behaves. The number of badly encoded samples is low.
I collected three different test results and combined them into one image. People use encoders at many bitrates and settings, and this collection is a fair approximation of the overall average quality they will experience. Average score: LAME 3.98 = 4.27, Helix = 4.33. Number of samples: 25 + 20 + 14 = 59.

Code:
%Kamedo2's Personal Listening Test of MP3 224kbps
LAME3.98 Helix
4 4.3
4.6 4.8
4.4 5
4.6 5
4.2 4.4
5 4.6
3.8 4.7
4.4 4.5
4 4.5
5 4.6
4.5 4.6
4.4 4.6
5 4.7
4.3 4.1
4.4 4.6
5 5
4.7 4.4
4.2 4
5 4.5
4.7 5
4.8 4.6
5 4.5
5 4.6
4.7 5
4.6 4.4

%IgorC's Personal Listening Test of MP3 encoders (part II) LAME vs Helix MP3 encoders at 130 kbps.
4.1 4
3.9 3.8
3.1 3
3.4 3.8
4 3.7
3.2 3.9
4.1 4.2
3.3 3
2 4.5
4.4 3.8
3.2 3.1
4.3 4.1
4.4 3.6
3.1 4
4 4.5
4.5 3.3
3.9 4.3
4 4.3
3.3 3
4.2 4.2


%Results of the public MP3 listening test @ 128 kbps (October 2008)
3.68 4.74
4.34 4.67
4.64 4.6
4.12 4.39
4.58 4.75
4.65 4.77
4.55 4.8
4.57 4.41
4.82 4.22
4.79 4.59
4.75 4.08
4.44 4.74
4.62 4.7
4.54 4.75
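
As a cross-check of the pooled averages quoted above, the three score lists can simply be concatenated and averaged. This is a sketch with made-up variable names. It weights every sample equally, which is an assumption about how the 4.27 and 4.33 figures were obtained.

```python
# (LAME 3.98, Helix) score pairs, one list per source test.
KAMEDO2_224 = [
    (4, 4.3), (4.6, 4.8), (4.4, 5), (4.6, 5), (4.2, 4.4), (5, 4.6), (3.8, 4.7),
    (4.4, 4.5), (4, 4.5), (5, 4.6), (4.5, 4.6), (4.4, 4.6), (5, 4.7), (4.3, 4.1),
    (4.4, 4.6), (5, 5), (4.7, 4.4), (4.2, 4), (5, 4.5), (4.7, 5), (4.8, 4.6),
    (5, 4.5), (5, 4.6), (4.7, 5), (4.6, 4.4),
]
IGORC_130 = [
    (4.1, 4), (3.9, 3.8), (3.1, 3), (3.4, 3.8), (4, 3.7), (3.2, 3.9), (4.1, 4.2),
    (3.3, 3), (2, 4.5), (4.4, 3.8), (3.2, 3.1), (4.3, 4.1), (4.4, 3.6), (3.1, 4),
    (4, 4.5), (4.5, 3.3), (3.9, 4.3), (4, 4.3), (3.3, 3), (4.2, 4.2),
]
PUBLIC_128 = [
    (3.68, 4.74), (4.34, 4.67), (4.64, 4.6), (4.12, 4.39), (4.58, 4.75),
    (4.65, 4.77), (4.55, 4.8), (4.57, 4.41), (4.82, 4.22), (4.79, 4.59),
    (4.75, 4.08), (4.44, 4.74), (4.62, 4.7), (4.54, 4.75),
]

pooled = KAMEDO2_224 + IGORC_130 + PUBLIC_128
n_samples = len(pooled)

# Unweighted mean over all samples, regardless of which test they came from.
lame_mean = sum(lame for lame, _ in pooled) / n_samples
helix_mean = sum(helix for _, helix in pooled) / n_samples

if __name__ == "__main__":
    print("samples: %d, LAME3.98 = %.2f, Helix = %.2f"
          % (n_samples, lame_mean, helix_mean))
```

Pooling like this mixes different listeners and bitrates, so the result is a rough overall impression rather than a statistically clean comparison.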

Reply #11
Thanks for the effort and ability you put into this substantial test, Kamedo2.

I wonder if this thread should be added to the Wikipedia list of Codec Listening Tests (which includes those high bitrate individual tests by Guruboolez some years ago).

I also tend to look at the lower bound and/or the tightness of the distribution to attempt to reduce the likelihood of really nasty artifacts, though my artifact detection training is fairly poor. The problem, as always, is there might be extreme cases that one psymodel just doesn't deal with adequately that are missed in the test corpus, but a general idea of the spread and lower bounds of quality is still helpful.

HELIX VBR seems to do very well at 128 kbps and 224 kbps, and I'd feel confident using it anywhere from 128kbps upwards.

Pros and Cons
  • Encoding speed: Helix MP3 wins
  • Quality (~128 to ~224kbps): Helix MP3 and LAME tie
  • Gapless support: LAME wins


I do use Helix at ~131kbps for loudness-levelled background music compilations on hardware where gapless support is impossible. Otherwise, easy gapless support in sufficiently good players keeps me using LAME, and I'm happy that the likes of Amazon use LAME at around -V0 for that reason.


Halb27's special LAME -Vn+ does also have specific uses for certain types of track (e.g. solo harpsichord or other music having heavy transients with strong steady tones). I haven't completely kept up with how the main LAME3.100 copes with these (I think there's some improvement over 3.99), but the strategy of providing maximum bitrate for short blocks seemed to work for halb27's version where LAME 3.99 and Helix MP3 both fall down unless the bitrate gets very high generally. I might well adopt that version for specific types of content or to fix a problem sample.
Dynamic – the artist formerly known as DickD

Reply #12
I wonder if this thread should be added to the Wikipedia list of Codec Listening Tests (which includes those high bitrate individual tests by Guruboolez some years ago).

Yes, it should be. And Hydrogenaudio knowledgebase too.

Reply #13
I wonder if this thread should be added to the Wikipedia list of Codec Listening Tests (which includes those high bitrate individual tests by Guruboolez some years ago).

Yes, it should be. And Hydrogenaudio knowledgebase too.
With the conclusion that at these high bitrates all modern encoders perform equally well?
It's only audiophile if it's inconvenient.

Reply #14
I wonder if this thread should be added to the Wikipedia list of Codec Listening Tests (which includes those high bitrate individual tests by Guruboolez some years ago).

Yes, it should be. And Hydrogenaudio knowledgebase too.
With the conclusion that at these high bitrates all modern encoders perform equally well?

Yes, the conclusion is a 4-way tie (all except BladeEnc). The word 'tie' is preferred over 'equal', for obvious reasons.


Reply #16
BladeEnc != modern encoder

That's why I said 4-way tie. (I assume the 3 LAME encoders and Helix are the modern encoders. These 4 encoders are the winners and BladeEnc is the obvious loser.)


Reply #18
Very good and interesting test. It shows that for a modern mp3 encoder above a certain threshold (192k), the bitrate is a strong indicator of quality, no matter whether VBR / CVBR or CBR is used. It also shows, as I've said before, that CBR will not be 'starved' of bits given a sufficient bitrate, and that the popular 320 CBR encodings on the internet are a huge waste, since 224 yields excellent quality.

Reply #19
Thank you very much for the test!

I realize you have extensive artifact training, but I am still surprised that so few samples are 100% transparent.

Reply #20
It seems that if you're particularly sensitive to pre-echo then just about anything with hard transients won't be transparent with mp3, even at 320kbits.

Reply #21
The ABX criterion was at least 15 correct out of 20 (p=0.02). Every sample and every encoder was ABXed 20 times, so there were 25 (samples) x 5 (encoders) = 125 tests. I failed 25 of them (14 or fewer correct answers) and therefore scored those 5.0.
The 15+/20 criterion allows me to miss up to 25% of the trials in a test, which explains why only 20% of the tests were transparent.
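
The p=0.02 figure for 15+/20 follows from the one-sided binomial tail under pure guessing (50% chance per trial). A minimal sketch, where the function name is just for illustration:

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print("15/20: p = %.4f" % abx_p_value(15, 20))  # passes the criterion
    print("14/20: p = %.4f" % abx_p_value(14, 20))  # counted as a fail
```

With 15 correct the tail probability is about 0.021, while 14 correct gives about 0.058, which is above the usual 0.05 threshold. That is why 14 or fewer counts as a failed ABX here.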

The graph and table in this thread were plotted with the following web application. Feel free to use it.
http://zak.s206.xrea.com/bitratetest/graphmaker3.htm
Help page:
http://zak.s206.xrea.com/bitratetest/faq.htm

I wonder if this thread should be added to the Wikipedia list of Codec Listening Tests (which includes those high bitrate individual tests by Guruboolez some years ago).

I will refrain from adding the results to Wikipedia myself, because writing articles about oneself or one's own results should be avoided.

Reply #22
It would help since Guruboolez's data is classical-centric, and, quite frankly, I'm tired of seeing his results get raised during discussions where they aren't a good fit.

Reply #23
It seems that if you're particularly sensitive to pre-echo then just about anything with hard transients won't be transparent with mp3, even at 320kbits.

I'm only familiar with some of the samples, but would you say that most of them contain hard transients? The "speech" and "vocal" samples don't but still are not transparent.

Kamedo, could you maybe elaborate a little on the problems you heard?