lame 3.97 alpha 5 testing thread, tests & results |
Jan 11 2005, 08:03
Post
#1
|
|
![]() Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
The 12 samples are ff123's sample suite, selected for the 64 kbps listening test.
www.ff123.net

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 128
• lame 3.96.1 | John33 compile | --preset 128
• lame 3.97.a5 | John33 compile | --preset 128 -X 10,10

Hardware and software configuration:
• Audigy2 soundcard
• Onkyo R-A5 FM/AM Tuner Amplifier
• Beyerdynamic DT-531 headphone
• ff123's ABC/HR 1.1 beta 2

Personal mood:
• tired... I didn't spend too much time on this test, and I only ABXed the important things (the two best encoders, and only if needed).

RESULTS
CODE
                      3.90.3   3.96.1   3.97.a5
    ATrain             3.0      2.0      3.5
    BachS1007          4.5      3.5      3.5
    BeautySlept        1.5      2.5      3.5
    Blackwater         4.5      3.5      4.5
    FloorEssence       1.8      1.5      2.5
    Layla              3.3      1.0      3.0
    LifeShatters       3.0      1.5      3.0
    LisztBMinor        3.5      1.5      2.0
    MidnightVoyage     2.0      1.0      1.8
    thear1             4.5      3.0      4.0
    TheSource          4.0      3.0      5.0
    Waiting            1.0      1.5      3.0
    ___________
    MEANS              3.05     2.13     3.27

COMMENTS
• most often, lame 3.96.1 appeared to be clearly behind lame 3.90.3 and lame 3.97.a5.
• I had some trouble differentiating lame 3.90.3 and lame 3.97.a5, and I often changed my ratings. Both are close (in my experience), and it wasn't easy for me to tell which sounded worse or better, even when a difference was audible and ABXable.
• the newest alpha apparently had serious trouble with the LisztBMinor.wav sample: the background noise was severely damaged, which removed precious musical information. The difference can be checked in a frequency editor: it's really eloquent. IIRC [proxima] had also reported problems with this sample and recent lame builds.

ANALYSIS

• ANOVA ANALYSIS
CODE
    ANOVA

    FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
    Blocked ANOVA analysis

    Number of listeners: 12
    Critical significance: 0.05
    Significance of data: 4.54E-004 (highly significant)
    ---------------------------------------------------------------
    ANOVA Table for Randomized Block Designs Using Ratings

    Source of          Degrees      Sum of    Mean
    variation          of Freedom   squares   Square    F       p

    Total                35         45.01
    Testers (blocks)     11         27.30
    Codecs eval'd         2          8.91     4.46     11.15   4.54E-004
    Error                22          8.80     0.40
    ---------------------------------------------------------------
    Fisher's protected LSD for ANOVA: 0.535

    Means:

    3.97.a5   3.90.3   3.96.1
     3.27      3.05     2.13

    ---------------------------- p-value Matrix ---------------------------

               3.90.3   3.96.1
    3.97.a5    0.393    0.000*
    3.90.3              0.002*
    -----------------------------------------------------------------------

    3.97.a5 is better than 3.96.1
    3.90.3 is better than 3.96.1

• TUKEY PARAMETRIC ANALYSIS
CODE
    TUKEY PARAMETRIC

    FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
    Tukey HSD analysis

    Number of listeners: 12
    Critical significance: 0.05
    Tukey's HSD: 0.649

    Means:

    3.97.a5   3.90.3   3.96.1
     3.27      3.05     2.13

    -------------------------- Difference Matrix --------------------------

               3.90.3   3.96.1
    3.97.a5    0.225    1.150*
    3.90.3              0.925*
    -----------------------------------------------------------------------

    3.97.a5 is better than 3.96.1
    3.90.3 is better than 3.96.1

=> 3.96.1 is the worst; 3.90.3 and 3.97.a5 are tied.

EDIT: log files are here

This post has been edited by guruboolez: Dec 29 2005, 22:01
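For readers who want to check the blocked ANOVA figures by hand, here is a minimal sketch in plain Python. It is not the friedman.exe code; it only applies the textbook randomized-block formulas to the 128 kbps ratings table from this post, treating each sample as a block. The function and variable names are made up for the illustration.

```python
# Ratings from the 128 kbps results table: one row per sample (block),
# columns are 3.90.3, 3.96.1, 3.97.a5.
ratings = [
    [3.0, 2.0, 3.5],  # ATrain
    [4.5, 3.5, 3.5],  # BachS1007
    [1.5, 2.5, 3.5],  # BeautySlept
    [4.5, 3.5, 4.5],  # Blackwater
    [1.8, 1.5, 2.5],  # FloorEssence
    [3.3, 1.0, 3.0],  # Layla
    [3.0, 1.5, 3.0],  # LifeShatters
    [3.5, 1.5, 2.0],  # LisztBMinor
    [2.0, 1.0, 1.8],  # MidnightVoyage
    [4.5, 3.0, 4.0],  # thear1
    [4.0, 3.0, 5.0],  # TheSource
    [1.0, 1.5, 3.0],  # Waiting
]


def blocked_anova(rows):
    """Return (codec_means, f_statistic, df_codecs, df_error) for a randomized block design."""
    n_blocks = len(rows)
    n_codecs = len(rows[0])
    grand_mean = sum(sum(r) for r in rows) / (n_blocks * n_codecs)
    codec_means = [sum(r[j] for r in rows) / n_blocks for j in range(n_codecs)]
    block_means = [sum(r) / n_codecs for r in rows]

    # Partition the total sum of squares into codecs, blocks and error.
    ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
    ss_codecs = n_blocks * sum((m - grand_mean) ** 2 for m in codec_means)
    ss_blocks = n_codecs * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_codecs - ss_blocks

    df_codecs = n_codecs - 1
    df_error = (n_codecs - 1) * (n_blocks - 1)
    f_stat = (ss_codecs / df_codecs) / (ss_error / df_error)
    return codec_means, f_stat, df_codecs, df_error


means, f_stat, df_codecs, df_error = blocked_anova(ratings)
print("means:", [round(m, 3) for m in means])
print("F(%d, %d) = %.2f" % (df_codecs, df_error, f_stat))
```

The sketch stops at the F statistic; turning it into the p-value shown in the tool output needs the F distribution, which is left out here to keep the example dependency-free.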
|
|
|
Jan 11 2005, 15:00
Post
#2
|
|
![]() Group: Members Posts: 669 Joined: 15-January 02 From: SE Pennsylvania Member No.: 1032 |
This is what X mode 10 does, in quantize.c:

CODE
    case 10: {
        if (best->over_count > 0) {
            /* there are distorted sfb */
            better = calc->over_SSD < best->over_SSD;
        } else {
            /* no distorted sfb */
            better = calc->max_noise <= best->max_noise;
        }
        break;
    }

This post has been edited by mithrandir: Jan 12 2005, 06:36
|
|
|
Jan 11 2005, 20:56
Post
#3
|
|
![]() Group: Members Posts: 197 Joined: 12-October 02 From: Italy Member No.: 3537 |
QUOTE (guruboolez @ Jan 11 2005, 08:03 AM)
• newest alpha had apparently serious troubles with LisztBMinor.wav sample: background noise was severly wounded, removing precious musical information. Difference could be checked through a frequency editor: it's really eloquent. IIRC [proxima] had also reported problems for this sample and recent lame builds.

Yes, this problem of altered background noise (along with musical information) is noticeable on even more samples with recent lame builds. Sometimes the removed background noise is replaced by HF ringing. Both artifacts are really annoying to me; this is the only reason I still prefer the old 3.90.3 in these cases.

At this time, with 3.97a5, the following samples are affected: obscured, Atom_heart_mother, FloorEssence, LisztBMinor, spahm, rebel, Queen+-+your+my+best, gbtinc, applaud, amnesia, Bayle - Etching. Often the artifacts are even visible as dropouts in a spectral analysis. Of course, I can upload the remaining samples on request.

This post has been edited by [proxima]: Jan 11 2005, 21:08
--------------------
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1 |
|
|
|
Jan 12 2005, 03:24
Post
#4
|
|
![]() Group: Members Posts: 669 Joined: 15-January 02 From: SE Pennsylvania Member No.: 1032 |
I looked at the output of rebel on a spectrogram. There's a definite dropout in the 10-11 kHz region. Maybe the encoder is not devoting bits to this sfb because it (falsely) thinks the tones are masked by the much stronger tones in the lower midrange (guitar). Just a guess.
|
|
|
|
Jan 12 2005, 03:28
Post
#5
|
|
|
Group: Members Posts: 133 Joined: 2-January 04 Member No.: 10896 |
Tested Dev0's sample from here.

3.90.3 --alt-preset cbr 128
3.97a5 -b 128 -X 10,10

I liked 3.97a5's CBR128 better than 3.90.3's on this particular sample, even though both were noticeably distorted.

ABC/HR Version 0.9b, 30 August 2002
Testname: 3.97a5 vs 3.90.3

1R = C:\My Test Samples\giveup\3903cbr128.mp3.wav
2L = C:\My Test Samples\giveup\397a5cbr128.mp3.wav

---------------------------------------
General Comments:
The stereo separation (not sure if this is the correct term) on both files sounds broken for the guitar. On the original, the guitar is at mid-left (no echo from the right channel), but on the mp3s, the guitar is echoed from the right.
---------------------------------------
1R File: C:\My Test Samples\giveup\3903cbr128.mp3.wav
1R Rating: 2.5
1R Comment: This sounds rougher and dirtier. The incorrect stereo separation/added echo from the right channel is more noticeable than File 2.
---------------------------------------
2L File: C:\My Test Samples\giveup\397a5cbr128.mp3.wav
2L Rating: 3.0
2L Comment: The distortion is less annoying than File 1. Sounds slightly cleaner.
---------------------------------------
ABX Results:
Original vs C:\My Test Samples\giveup\3903cbr128.mp3.wav
    10 out of 10, pval < 0.001
Original vs C:\My Test Samples\giveup\397a5cbr128.mp3.wav
    10 out of 10, pval < 0.001
C:\My Test Samples\giveup\3903cbr128.mp3.wav vs C:\My Test Samples\giveup\397a5cbr128.mp3.wav
    10 out of 10, pval < 0.001
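The "pval" lines in an ABX log can be understood as a one-sided binomial test: the probability of getting at least that many trials right by pure guessing. The sketch below shows that calculation; it is an illustration of the standard formula, not ABC/HR's own code, and the function name is invented for the example.

```python
from math import comb


def abx_pvalue(correct, trials):
    """Probability of at least `correct` right answers out of `trials` when guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# 10 correct out of 10, as in the log above.
print(abx_pvalue(10, 10))
# A shorter run for comparison: 8 correct out of 8.
print(abx_pvalue(8, 8))
```

A perfect run of 10 trials has one chance in 1024 of happening by luck, which is why the log reports it as below 0.001.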
|
|
|
Jan 12 2005, 04:33
Post
#6
|
|
|
Group: Members Posts: 14 Joined: 9-January 05 Member No.: 18950 |
Thank you
If I may be so bold, was there a reason why you chose --preset 128 for 3.96.1 as opposed to -V 5?

Regards,
Randy
|
|
|
Jan 12 2005, 06:18
Post
#7
|
|
![]() Group: Developer Posts: 1679 Joined: 23-December 01 From: Germany Member No.: 731 |
Because 3.96.1 had ABR/CBR issues, which Gabriel has worked on and which now need testing.
-------------------- "To understand me, you'll have to swallow a world." Or maybe your words.
|
|
|
|
Jan 12 2005, 20:07
Post
#8
|
|
![]() LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
QUOTE
I looked at the output of rebel on a spectrogram. There's a definite dropout in the 10-11KHz region. Maybe the encoder is not devoting bits to this sfb because it (falsely) thinks the tones are masked by the much stronger tones in the lower midrange (guitar). Just a guess.

Unfortunately no. The psymodel thinks that this area needs some bits. However, it appears that sfb10 is empty in those parts, which contradicts the psymodel requirements, and even the computed distortion, which indicates no distortion there.

edit: 3.90.3 seems to exhibit the same dropouts on the Rebel sample. Does 3.90.3 sound better on this sample?

This post has been edited by Gabriel: Jan 12 2005, 20:11
|
|
|
Jan 12 2005, 20:33
Post
#9
|
|
![]() LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
It seems that lowering the ATH reduces those dropouts (at least in the Atom case).
Would you mind testing "-X 10,10 --athlower 10"? If this solves the problem, it would be very helpful to determine which athlower value leads to satisfactory results.
|
|
|
Jan 12 2005, 20:57
Post
#10
|
|
![]() Group: Members Posts: 197 Joined: 12-October 02 From: Italy Member No.: 3537 |
The rebel sample is only slightly better with 3.90.3; I don't think rebel is a problem case for the new alphas. Even 3.90.3 has distortions.
I will test the setting with the lowered ATH when possible, but I strongly believe this is the way to go because I've already noticed improvements in the past. Now, for VBR, I believe that "--athaa-sensitivity 1" significantly reduces ringing because VBR can choose to lower the ATH even more than the default. When I proposed "--athaa-sensitivity 1", you seemed to dislike this way of tweaking because the ATH adjustment range was too high for a 128 preset. Maybe I have misunderstood something, but it seems that you're now going for the "lower ATH" solution even with CBR/ABR. Maybe a lower ATH (or a higher ATH sensitivity) is the key for all presets (CBR, ABR, VBR) that target below ~150 kbps (medium included).
--------------------
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1 |
|
|
|
Jan 12 2005, 22:27
Post
#11
|
|
![]() LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
I'd prefer to use a lower base level for the ATH, without increasing the adjustment range too much.
I think that I should also enable ath-aa for cbr/abr.
|
|
|
Jan 13 2005, 08:12
Post
#12
|
|
![]() Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
96 kbps ABR & VBR TEST

Samples:
The first 12 samples are ff123's sample suite, selected for the 64 kbps listening test. I've also added eight more samples:
- the 4 samples used by ff123 for the 128 kbps collective test (Dogies.wav, Fossiles.wav, Rawhide.wav, Wayitis.wav)
- macabre.wav (full orchestra sample, uploaded by ff123)
- SinceAlways.wav (recently uploaded by Dev0)
- castanets2.wav for testing sharpness & pre-echo
- Orion II.wav (brass instrument) for testing micro-attacks with a real instrument.

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 96
• lame 3.96.1 | John33 compile | --preset 96
• lame 3.97.a5 | John33 compile | --preset 96 -X 10,10
• lame 3.97.a5 | John33 compile | -V 7
• lame 3.97.a5 | John33 compile | -V 8
[see NOTE ABOUT ENCODINGS for details about these choices]

Hardware and software configuration:
• Audigy2 soundcard
• Onkyo R-A5 FM/AM Tuner Amplifier
• Beyerdynamic DT-531 headphone
• ff123's ABC/HR 1.1 beta 2

NOTE ABOUT VBR ENCODINGS:
There's no VBR preset corresponding to 96 kbps for 'general' music. -V7 seems to produce the closest bitrate on most samples, but -V8 is needed with some kinds of music (I've noticed it in the past with metal/hard rock, and other members have also reported it here). On the other hand, even -V7 can fall below 96 kbps. It happens here with four samples, with a terrible deviation for the two low-volume "classical music" samples: 67 & 72 kbps for BachBWV1007 and LisztBMinor. Of course, -V7 logically produces bloated bitrates with the first category of music: the seven biggest files of this test were encoded with this VBR setting.
I therefore had two serious possibilities. The first one was to discard VBR from this test. This solution is the most convenient, but IMHO not the most useful one: I'd like to see how lame VBR performs at such a bitrate, and whether it can outperform ABR/CBR, at comparable bitrate of course.
The second valid possibility was to introduce BOTH -V7 and -V8 encodings in this test. Then everybody can choose their own comparison strategy (comparing ABR/CBR to a fixed VBR preset, or comparing with the preset that matches 96 kbps in each situation). I finally decided on the second option, despite the difficulty (it means one unnecessary file to test for each sample...).

NOTE ABOUT SAMPLING RATE:
All settings lead to a resampling process (-> 32000 Hz), except one: lame 3.96.1.

RESULTS
CODE
                          3.90.3   3.96.1   3.97a5   3.97a5   3.97a5
                          ABR 96   ABR 96   ABR 96   VBR V7   VBR V8
    ATrain                 2.3      1.0      2.8      1.8      1.0
    BachS1007              4.7      3.8      4.3      1.3      1.0
    BeautySlept            3.0      1.5      2.5      2.5      1.0
    Blackwater             3.5      1.0      3.5      2.0      1.5
    FloorEssence           1.7      1.3      3.0      2.0      1.0
    Layla                  2.5      1.0      3.0      3.8      1.5
    LifeShatters           3.4      1.0      2.3      4.0      3.0
    LisztBMinor            4.5      3.5      4.0      1.5      1.3
    MidnightVoyage         2.5      1.0      2.0      3.5      1.5
    thear1                 3.0      1.0      3.0      4.0      2.6
    TheSource              2.0      1.5      2.7      1.8      1.0
    Waiting                2.0      1.0      2.0      4.0      3.0
    __________________________________________________________
    Dogies [ff123]         2.5      1.4      2.5      2.0      1.2
    Fossiles [ff123]       3.5      1.5      3.5      2.5      1.0
    SinceAlways [Dev0]     2.7      1.0      2.5      4.2      3.5
    Macabre [ff123]        2.5      1.0      2.8      2.0      1.4
    Rawhide [ff123]        3.0      1.0      3.0      1.5      1.0
    Wayitis [ff123]        2.7      1.7      2.7      2.2      1.2
    __________________________________________________________
    Casta.2 [preecho]      1.4      1.4      2.3      2.7      1.0
    OrionII [micro-att]    3.5      1.0      2.5      1.4      1.2
    -----------------------------------------------------------
    MEANS                  2.85     1.43     2.84     2.54     1.55
    -----------------------------------------------------------

click for log files

COMMENTS:
• VBR encoding at low bitrate apparently can't be recommended. Bitrate fluctuates too much (that's normal), but so does quality (that's not normal). VBR should provide constant quality at a fluctuating bitrate, whereas ABR/CBR should lead to a constant bitrate and erratic quality. At low bitrate, ABR is clearly more robust, despite its limited bitrate allocation. VBR suffers too much from ringing and tons of other artefacts. -V 8 is most often awful; -V7 can't avoid some difficulties (on LisztBMinor, for example).
• ABR at 96 kbps with lame 3.96.1 is a complete tragedy. The lack of resampling could maybe explain part of this disaster.
• There's no winner at the end of the 3.90.3 / 3.97a5 competition. Overall results are identical (2.85 vs 2.84!), but this final result hides many differences. Each encoder reacts in its own way. With micro-attacks [Orion II.wav], 3.90.3 is apparently clearly better. This is consistent with other tests already performed by other people on comparable samples [fatboy.wav and Bayle - Etching.wav]. With the pure pre-echo sample, the new alpha is apparently better. 3.90.3 also has fewer ringing problems (though the problem is not totally absent). Nevertheless, 3.97a5 is now close to 3.90.3 in my opinion, but further progress is of course welcome.

STATISTICS:
If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
    390ABR   396ABR   397ABR   397VBR7   397VBR8
    2.3      1.0      2.8      1.8       1.0
    4.7      3.8      4.3      1.3       1.0
    3.0      1.5      2.5      2.5       1.0
    3.5      1.0      3.5      2.0       1.5
    1.7      1.3      3.0      2.0       1.0
    2.5      1.0      3.0      3.8       1.5
    3.4      1.0      2.3      4.0       3.0
    4.5      3.5      4.0      1.5       1.3
    2.5      1.0      2.0      3.5       1.5
    3.0      1.0      3.0      4.0       2.6
    2.0      1.5      2.7      1.8       1.0
    2.0      1.0      2.0      4.0       3.0
    2.5      1.4      2.5      2.0       1.2
    3.5      1.5      3.5      2.5       1.0
    2.7      1.0      2.5      4.2       3.5
    2.5      1.0      2.8      2.0       1.4
    3.0      1.0      3.0      1.5       1.0
    2.7      1.7      2.7      2.2       1.2
    1.4      1.4      2.3      2.7       1.0
    3.5      1.0      2.5      1.4       1.2

• ANOVA Analysis:
CODE
    FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
    Blocked ANOVA analysis

    Number of listeners: 20
    Critical significance: 0.05
    Significance of data: 1.95E-009 (highly significant)
    ---------------------------------------------------------------
    ANOVA Table for Randomized Block Designs Using Ratings

    Source of          Degrees      Sum of    Mean
    variation          of Freedom   squares   Square    F       p

    Total                99         102.70
    Testers (blocks)     19          16.33
    Codecs eval'd         4          39.16    9.79     15.76   1.95E-009
    Error                76          47.20    0.62
    ---------------------------------------------------------------
    Fisher's protected LSD for ANOVA: 0.496

    Means:

    390ABR   397ABR   397VBR7  397VBR8  396ABR
     2.85     2.84     2.54     1.55     1.43

    ---------------------------- p-value Matrix ---------------------------

              397ABR   397VBR7  397VBR8  396ABR
    390ABR    1.000    0.217    0.000*   0.000*
    397ABR             0.217    0.000*   0.000*
    397VBR7                     0.000*   0.000*
    397VBR8                              0.646
    -----------------------------------------------------------------------

    390ABR is better than 397VBR8, 396ABR
    397ABR is better than 397VBR8, 396ABR
    397VBR7 is better than 397VBR8, 396ABR

• TUKEY PARAMETRIC Analysis:
CODE
    FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
    Tukey HSD analysis

    Number of listeners: 20
    Critical significance: 0.05
    Tukey's HSD: 0.699

    Means:

    390ABR   397ABR   397VBR7  397VBR8  396ABR
     2.85     2.84     2.54     1.55     1.43

    -------------------------- Difference Matrix --------------------------

              397ABR   397VBR7  397VBR8  396ABR
    390ABR    0.000    0.310    1.300*   1.415*
    397ABR             0.310    1.300*   1.415*
    397VBR7                     0.990*   1.105*
    397VBR8                              0.115
    -----------------------------------------------------------------------

    390ABR is better than 397VBR8, 396ABR
    397ABR is better than 397VBR8, 396ABR
    397VBR7 is better than 397VBR8, 396ABR

• Bitrate table:
CODE
                      3.90.3   3.96.1   3.97     3.97     3.97
                      ABR96    ABR96    ABR96    -V 7     -V 8
    ATrain             91       90       91      112       94
    BachS1007          97       98       97       67       56
    BeautySlept        93       92       93       98       84
    Blackwater         93       94       93       95       78
    castanets2         91       92       94       99       86
    doggies            94       93       95      107       90
    FloorEssence       95      101       99      120      101
    Fossils            92       92       94      104       84
    SinceAlways        95       95       96      124      105
    Layla              93       96       97      121      103
    LifeShatters       95       96       95      107       89
    LisztBMinor        91       90       91       72       59
    Macabre            91       90       91      108       92
    MidnightVoyage     92       92       94      121       99
    Orion II           90       94      100      102       83
    Rawhide            92       92       93      100       82
    thear1             92       92       93      110       91
    TheSource          98       98       97       93       79
    Waiting            90       90       92      117       98
    Wayitis            91       91       93       92       75

    Mean               92.8     93.4     94.4    103.4     86.4

This post has been edited by guruboolez: Dec 29 2005, 22:05
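The difference matrix in the Tukey output can be reproduced from the copy-and-paste table alone. The sketch below recomputes the column means and flags every pair whose gap exceeds the HSD. The HSD value of 0.699 is simply taken from the friedman.exe output in this post, not derived here, and the variable names are chosen for the illustration.

```python
# Copy-and-paste table from the STATISTICS section, one row per sample.
names = ["390ABR", "396ABR", "397ABR", "397VBR7", "397VBR8"]
ratings = [
    [2.3, 1.0, 2.8, 1.8, 1.0], [4.7, 3.8, 4.3, 1.3, 1.0],
    [3.0, 1.5, 2.5, 2.5, 1.0], [3.5, 1.0, 3.5, 2.0, 1.5],
    [1.7, 1.3, 3.0, 2.0, 1.0], [2.5, 1.0, 3.0, 3.8, 1.5],
    [3.4, 1.0, 2.3, 4.0, 3.0], [4.5, 3.5, 4.0, 1.5, 1.3],
    [2.5, 1.0, 2.0, 3.5, 1.5], [3.0, 1.0, 3.0, 4.0, 2.6],
    [2.0, 1.5, 2.7, 1.8, 1.0], [2.0, 1.0, 2.0, 4.0, 3.0],
    [2.5, 1.4, 2.5, 2.0, 1.2], [3.5, 1.5, 3.5, 2.5, 1.0],
    [2.7, 1.0, 2.5, 4.2, 3.5], [2.5, 1.0, 2.8, 2.0, 1.4],
    [3.0, 1.0, 3.0, 1.5, 1.0], [2.7, 1.7, 2.7, 2.2, 1.2],
    [1.4, 1.4, 2.3, 2.7, 1.0], [3.5, 1.0, 2.5, 1.4, 1.2],
]

HSD = 0.699  # Tukey's HSD as reported by friedman.exe; assumed, not recomputed

# Mean rating per encoder setting.
means = {
    name: sum(row[i] for row in ratings) / len(ratings)
    for i, name in enumerate(names)
}

# Compare every pair, best mean first, and keep the pairs that differ by more than the HSD.
ranked = sorted(names, key=lambda n: means[n], reverse=True)
significant = []
for i, better in enumerate(ranked):
    for worse in ranked[i + 1:]:
        diff = means[better] - means[worse]
        flag = "*" if diff > HSD else ""
        print("%-8s vs %-8s %.3f%s" % (better, worse, diff, flag))
        if diff > HSD:
            significant.append((better, worse))
```

Every flagged pair has 397VBR8 or 396ABR on the losing side, which matches the "is better than" lines printed by the tool.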
|
|
|
Jan 13 2005, 08:13
Post
#13
|
|
![]() Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
160 kbps ABR & VBR TEST [including --vbr-new vs default VBR mode]

Samples: 20 samples, the same as in the 96 kbps listening test above.

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 160
• lame 3.97.a5 | John33 compile | --preset 160 -X 10,10
• lame 3.97.a5 | John33 compile | -V 4
• lame 3.97.a5 | John33 compile | -V 4 --vbr-new
[see NOTE ABOUT ENCODINGS for details about these choices]

Hardware and software configuration:
...they didn't change.

NOTE ABOUT ENCODINGS:
• Modern encoders at 160 kbps should all reach near-transparency to my ears. Tests are therefore more difficult. To avoid unnecessary exhaustion, I've tried to limit the number of challengers. 3.96.1 ABR, which clearly appeared buggy in the previous tests, was consequently removed.
• As in the previous test, the main epistemological issue was to justify the choice of a VBR setting. -V 4 is the setting closest to 160 kbps... but often higher. The average bitrate for the 20 samples reaches 171 kbps, for example, with 197 and 134 kbps for the extreme samples. This time I can't introduce a lower setting: the full sample suite drops to 135 kbps with -V 5, and testing it would be completely absurd. I had another idea at that point: using the alternative VBR engine, aka --vbr-new, commonly used with the --alt-preset fast routines. I've noticed in the past that --vbr-new encodings are slightly smaller than those of the default VBR mode. Other people have reported the same thing. I therefore encoded the 20 samples with -V 4 --vbr-new, and the average bitrate reached 160.4 kbps. Nice, isn't it ;-) I didn't choose between -V 4 and -V 4 fast, and simply put both in the arena. It will be a good occasion to compare the performance of these two VBR engines.

P.S. I took a long rest during this test, and resumed it after 8 hours of sleep.
RESULTS
CODE
                          3.90.3   3.97a5   3.97a5   3.97a5
                          ABR160   ABR160   VBR4     VBR4NEW
    ATrain                 4.0      3.0      4.3      4.0
    BachS1007              4.5      5.0      4.0      5.0
    BeautySlept            3.5      4.0      4.0      2.5
    Blackwater             4.5      4.5      4.0      3.7
    FloorEssence           2.9      3.2      3.4      4.0
        SLEEPING - SLEEPING - SLEEPING - SLEEPING
    Layla                  3.7      4.0      4.7      4.3
    LifeShatters           4.0      4.5      3.5      4.0
    LisztBMinor            5.0      4.5      3.5      5.0
    MidnightVoyage         3.0      4.0      4.9      4.3
    thear1                 4.8      4.8      5.0      5.0
    TheSource              4.2      4.2      3.5      3.2
    Waiting                3.0      3.5      4.0      4.5
    _________________________________________________
    Dogies [ff123]         3.7      3.0      2.0      4.5
    Fossiles [ff123]       4.0      3.5      1.8      3.0
    SinceAlways [Dev0]     2.0      3.0      2.3      3.5
    Macabre [ff123]        3.8      3.5      4.5      5.0
    Rawhide [ff123]        4.0      4.5      4.5      4.5
    Wayitis [ff123]        4.0      3.5      2.0      4.7
    _________________________________________________
    Casta.2 [preecho]      2.8      3.2      2.5      1.8
    OrionII [micro-att]    3.0      3.5      2.5      4.3
    ---------------------------------------------------
    MEANS                  3.72     3.84     3.54     4.04
    ---------------------------------------------------

click for log files

COMMENTS:
• On average, 3.90.3 ABR was slightly inferior to 3.97.a5 ABR. Both are very close, and similar, except in speed: 3.97a5 is obviously faster.
• The VBR comparison is more interesting. First, the --vbr-new engine produces better results than the default VBR mode on most samples. Statistically (see below), this only shows up in the friedman.exe tool with the Tukey parametric analysis and the -s 0.1 option (10% significance level instead of the default 5%). But it's probably more interesting to look at individual results. VBR NEW was much better with LisztBMinor, Dogies, Fossiles, SinceAlways, Wayitis and Orion II. For strange reasons, the default VBR suffers a lot from ringing: the background noise becomes irregular, and the distortions also affect some precious musical information (especially at low volume). I'd like to illustrate this problem with a frequency analysis, which confirms visually how serious it is:
http://audiotests.free.fr/tests/200...r_vs_vbrnew.gif
On the other hand, the --vbr-new routine produced a clearly worse result with the harpsichord sample (BeautySlept.wav). I must say it's not a surprise: I noticed it two years ago [it must have been with lame 3.92 or 3.93]. But apart from this issue (very specific, but unfortunately very annoying for me), I didn't find any other situation in which --vbr-new really suffered compared to the default mode.
• -V 4 --vbr-new is not obviously better than lame 3.97 alpha 5 ABR 160 (final bitrates were the same). It's a bit problematic: shouldn't we expect a real difference between ABR and VBR? Is VBR clearly better than ABR/CBR? And in which situations? Are these VBR/ABR similarities something structural (e.g. we can't expect a real quality margin between the two modes from well-tuned MP3 encoders) or something purely accidental (e.g. a lack of tuning of the current VBR mode compared to well-tuned ABR settings)? We have some elements of an answer: we saw first that ABR outperformed VBR at 96 kbps, and then that the existing differences at 160 kbps are really minor, at least with common music (the situation could differ with killer samples). On the other hand, I found -V 5 --athaa-sensitivity with lame 3.96.1 really better than lame 3.90.3 in the recent past. The elements I've gathered don't agree with each other. Therefore, I think we should try to find a real answer to this fundamental question in the future (and temporarily forget our current beliefs).
STATISTICS:
If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
    90ABR   97ABR   97VB4   97VB4n
    4.0     3.0     4.3     4.0
    4.5     5.0     4.0     5.0
    3.5     4.0     4.0     2.5
    4.5     4.5     4.0     3.7
    2.9     3.2     3.4     4.0
    3.7     4.0     4.7     4.3
    4.0     4.5     3.5     4.0
    5.0     4.5     3.5     5.0
    3.0     4.0     4.9     4.3
    4.8     4.8     5.0     5.0
    4.2     4.2     3.5     3.2
    3.0     3.5     4.0     4.5
    3.7     3.0     2.0     4.5
    4.0     3.5     1.8     3.0
    2.0     3.0     2.3     3.5
    3.8     3.5     4.5     5.0
    4.0     4.5     4.5     4.5
    4.0     3.5     2.0     4.7
    2.8     3.2     2.5     1.8
    3.0     3.5     2.5     4.3

• ANOVA Analysis: my results can't lead to any differentiation.

• TUKEY PARAMETRIC Analysis [-s 0.1]:
CODE
    FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
    Tukey HSD analysis

    Number of listeners: 20
    Critical significance: 0.10
    Tukey's HSD: 0.480

    Means:

    97VB4n   97ABR    90ABR    97VB4
     4.04     3.84     3.72     3.54

    -------------------------- Difference Matrix --------------------------

              97ABR    90ABR    97VB4
    97VB4n    0.195    0.320    0.495*
    97ABR              0.125    0.300
    90ABR                       0.175
    -----------------------------------------------------------------------

    97VB4n is better than 97VB4R

• Bitrate table:
CODE
                      3.90.3   3.97     3.97     3.97
                      ABR160   ABR160   -V4      -V4-new
    ATrain             155      154      166      171
    BachS1007          163      160      134      124
    BeautySlept        157      156      142      148
    Blackwater         158      157      168      155
    castanets2         150      152      143      154
    dogies             158      158      178      173
    FloorEssence       164      171      198      188
    fossiles           156      156      188      171
    SinceAlways        161      161      190      173
    Layla              159      163      193      177
    LifeShatters       161      159      177      147
    LisztBMinor        155      153      145      155
    macabre            156      154      187      166
    MidnightVoyage     156      157      189      171
    Orion II (2.1)     155      160      176      159
    rawhide            157      156      159      155
    thear1             157      156      182      157
    TheSource          165      162      136      139
    Waiting            155      154      191      167
    wayitis            155      155      190      174

    Mean               157.65   157.7    171.6    161.2

This post has been edited by guruboolez: Dec 29 2005, 22:07
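To see how far each VBR setting drifts from the 160 kbps target, the two VBR columns of the bitrate table can be summarised in a few lines. This is only a sketch over the figures posted above; the helper name and the dictionary layout are invented for the example.

```python
from statistics import mean

TARGET_KBPS = 160

# kbps per sample, copied from the bitrate table (same row order, ATrain to wayitis).
v4 = [166, 134, 142, 168, 143, 178, 198, 188, 190, 193,
      177, 145, 187, 189, 176, 159, 182, 136, 191, 190]
v4_new = [171, 124, 148, 155, 154, 173, 188, 171, 173, 177,
          147, 155, 166, 171, 159, 155, 157, 139, 167, 174]


def summarize(bitrates, target):
    """Return mean, extremes and mean offset from the target bitrate."""
    avg = mean(bitrates)
    return {
        "mean": avg,
        "min": min(bitrates),
        "max": max(bitrates),
        "offset": avg - target,
    }


for label, column in (("-V 4", v4), ("-V 4 --vbr-new", v4_new)):
    s = summarize(column, TARGET_KBPS)
    print("%-16s mean %.1f kbps (%+.1f vs target), range %d-%d"
          % (label, s["mean"], s["offset"], s["min"], s["max"]))
```

The same helper works for the ABR columns, where the spread around the target is much narrower.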
|
|
|
Jan 13 2005, 08:13
Post
#14
|
|
![]() Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
DVD RIP test: AC3 transcoding at ~96 and ~128 kbps

Samples:
I wanted to see how a comparison would turn out with DVD Video transcoding. DVD Video is very different from CD: on one hand we have variable-quality AC3 (which introduces a lot of quantization noise, and sometimes a strong lowpass), a 48000 Hz sampling rate, and always high-dynamic soundtracks including spoken and ambient parts; on the other hand, CD is 44100 Hz, original PCM quality (with infinitesimal quantization noise) and most often dynamically limited (thanks to the loudness race).
For this test, I had to build all the samples myself. There are only six samples: the conclusions can't be anything more than leads for further investigation. I always used AC3 as the source (no DTS or PCM). I selected native stereo AC3 encodings when possible; for one sample, I had to downmix to stereo myself. Decoding, downmixing and transcoding were performed directly with foobar2000.

Samples are:
• Jean-Pierre Jeunet, Alien 4 Resurrection: Jean-Pierre Jeunet presents the DVD edition in English with a pronounced French accent. Native stereo AC3 encoding at 192 kbps.
• Rowan Atkinson, Blackadder IV ("Captain Cook"): English speech with audience laughter. Native stereo AC3 encoding at 384 kbps.
• King Hu, Come Drink With Me (L'Hirondelle d'Or): quiet music with water. Mono (two channels) AC3 encoding at 192 kbps.
• Gérard Corbiau, Farinelli: a morning with horses and birds... then a woman's voice at dinner, in French. Native stereo AC3 encoding at 224 kbps.
• Quentin Tarantino, Pulp Fiction: Ezekiel and gunshots... 448 kbps multichannel AC3 encoding, downmixed to stereo.
• Akira Kurosawa, Ran: a hunt, and dramatic music (flute & percussion). Native stereo AC3 encoding at 448 kbps (!).
PART I: 96 kbps encodings

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 96
• lame 3.96.1 | John33 compile | --alt-preset 96
• lame 3.97.a5 | John33 compile | --alt-preset 96 -X 10,10
• lame 3.97.a5 | John33 compile | -V 7 --vbr-new
• lame 3.97.a5 | John33 compile | -V 8 --vbr-new

Hardware and software configuration:
...same as before

NOTE ABOUT ENCODINGS:
I explained before (see the 96 kbps encoding test) the reason for keeping two VBR settings in the test. This time, I used the --vbr-new engine, which apparently performs better than the default mode, especially on low-volume signals (and soundtracks are mainly made of low-volume parts).

RESULTS
CODE
                          3.90.3   3.96.1   3.97a5   3.97a5   3.97a5
                          ABR 96   ABR 96   ABR 96   VBnew7   VBnew8
    Alien4                 4.7      3.0      4.2      1.5      1.0
    Blackadder             3.0      2.3      3.5      2.0      1.5
    Come Drink With Me     5.0      4.0      4.3      4.3      2.5
    Farinelli              2.5      1.3      2.3      2.0      1.3
    Pulp Fiction           3.0      1.7      2.7      1.7      1.0
    Ran                    4.0      3.0      4.0      2.0      1.0

    MEANS                  3.70     2.55     3.50     2.25     1.38

click for log files

COMMENTS:
• lame 3.90.3 is slightly better on average than 3.97.a5 (use this statement with caution: it can't be confirmed by the friedman.exe analysis). The latest alpha had slight problems with ringing (this also appeared in the previous test with the same setting on CD sources). The difference is not dramatic, but I'd use 3.90.3 to maximise quality at this setting (or better: redo the test with more samples).
• lame 3.96.1 is bad, but MUCH BETTER here than in the previous 96 kbps test.
• VBR encodings are once again not reliable at this low bitrate. Using the alternative VBR engine does not solve all the audible problems: ringing first, and many other artefacts. -V8 is pathetic (despite the high bitrate!); -V7 is better, but still inferior to ABR at a higher bitrate.
STATISTICS:
If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
    3.90.3   3.96.1   3.97ABR   3.97Vn7   3.97Vn8
    4.7      3.0      4.2       1.5       1.0
    3.0      2.3      3.5       2.0       1.5
    5.0      4.0      4.3       4.3       2.5
    2.5      1.3      2.3       2.0       1.3
    3.0      1.7      2.7       1.7       1.0
    4.0      3.0      4.0       2.0       1.0

• ANOVA Analysis:
CODE
    3.90.3 is better than 3.96.1, 3.97Vn7, 3.97Vn8
    3.97ABR is better than 3.96.1, 3.97Vn7, 3.97Vn8
    3.96.1 is better than 3.97Vn8
    3.97Vn7 is better than 3.97Vn8

• TUKEY PARAMETRIC Analysis [-s 0.1]:
CODE
    3.90.3 is better than 3.96.1, 3.97Vn7, 3.97Vn8
    3.97ABR is better than 3.96.1, 3.97Vn7, 3.97Vn8
    3.96.1 is better than 3.97Vn8

• Bitrate table:
CODE
                    3.90.3   3.96.1   3.97     3.97     3.97
                    ABR 96   ABR 96   ABR 96   -Vn 7    -Vn 8
    Alien4           95       97       92      133      128
    Blackadder       95       97       97      100       92
    Come Drink...    98      102       98      110      106
    Farinelli        99      101       99      105      104
    Pulp Fiction     96      100       98      107      109
    Ran              91       92       93       82       68

    Mean             95.7     98.2     96.2    106.2    101.2

PART II: 128 kbps encodings

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 128
• lame 3.96.1 | John33 compile | --alt-preset 128
• lame 3.97.a5 | John33 compile | --alt-preset 128 -X 10,10
• lame 3.97.a5 | John33 compile | -V 5
• lame 3.97.a5 | John33 compile | -V 5 --vbr-new

Hardware and software configuration:
...still the same

NOTE ABOUT ENCODINGS:
This time, I compared -V 5 and -V 5 --vbr-new: the bitrates are totally different, and it's a good occasion to see whether the --vbr-new engine is also better than the default VBR mode at ~128 kbps. Important note: this time, --vbr-new doesn't lead to a lower bitrate, but to a much higher one (102 vs 138 kbps). The differences can be amazing. Best example: with the 100% spoken sample (Alien 4), the -V 5 encoding = 90 kbps and -V 5 --vbr-new = 171 kbps. [The very end of the sample was encoded at 320 kbps, which is probably excessive for near-silence...].
RESULTS
CODE
                          3.90.3   3.96.1   3.97a5   3.97a5   3.97a5
                          ABR128   ABR128   ABR128   VBR 5    VBRnew5
    Alien4                 4.7      4.3      4.7      3.5      4.0
    Blackadder             3.0      2.5      3.3      3.5      4.0
    Come Drink With Me     5.0      4.5      4.5      3.5      4.0
    Farinelli              3.7      3.0      3.5      4.0      4.3
    Pulp Fiction           2.7      2.5      3.5      1.5      3.0
    Ran                    4.0      2.0      3.0      2.5      3.5

    MEANS                  3.95     3.13     3.75     3.08     3.80

click for log files

COMMENTS:
• lame 3.90.3 is slightly better on average than 3.97.a5 (again, it can't be confirmed by the friedman.exe analysis). It's an important change, because with CD encoding at the same setting, lame 3.90.3 sounded slightly worse. But 6 samples are probably not enough to be sure about it. There are still ringing issues (slight but present) with 3.97 alpha 5 (I repeat that 3.90.3 is not entirely free of ringing).
• lame 3.96.1 is not as terrible with AC3@48000 as with PCM@44100. But it can't be recommended.
• VBR -V5 is again inferior to -V5 --vbr-new, on all samples! Extensive tests should be done to confirm it.
• VBR -V5 --vbr-new and ABR 128 are tied. The difference is really marginal (but the bitrate is 10 kbps higher with VBR). Again, we should question the theoretical superiority of VBR over ABR, and its usefulness, especially when we keep in mind the bloated bitrate that occurs with (apparently) innocent samples. It could be problematic with some movies.
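The observation that -V5 --vbr-new was rated above -V5 on all six samples can be given a rough number with a sign test. This is not an analysis the post performs; it is a different, simpler technique swapped in for illustration, and it only counts which setting was rated higher on each sample. The function name is made up for the example.

```python
from math import comb

# Ratings from the 128 kbps DVD table, in sample order (Alien4 ... Ran).
vbr5 = [3.5, 3.5, 3.5, 4.0, 1.5, 2.5]       # -V 5
vbr_new5 = [4.0, 4.0, 4.0, 4.3, 3.0, 3.5]   # -V 5 --vbr-new


def sign_test(baseline, candidate):
    """One-sided sign test: chance of at least this many wins for `candidate` if both were equal."""
    wins = sum(c > b for b, c in zip(baseline, candidate))
    losses = sum(c < b for b, c in zip(baseline, candidate))
    n = wins + losses  # ties carry no information and are dropped
    p = sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n
    return wins, losses, p


wins, losses, p = sign_test(vbr5, vbr_new5)
print("wins=%d losses=%d one-sided p=%.4f" % (wins, losses, p))
```

With only six samples from a single listener, this supports the call for more extensive tests rather than replacing it.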
STATISTICS:
If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
    3.90.3   3.96.1   3.97ABR   3.97VB5   3.97Vn5
    4.7      4.3      4.7       3.5       4.0
    3.0      2.5      3.3       3.5       4.0
    5.0      4.5      4.5       3.5       4.0
    3.7      3.0      3.5       4.0       4.3
    2.7      2.5      3.5       1.5       3.0
    4.0      2.0      3.0       2.5       3.5

• ANOVA Analysis:
CODE
    3.90.3 is better than 3.96.1, 3.97VB5
    3.97Vn5 is better than 3.96.1, 3.97VB5
    3.97ABR is better than 3.97VB5

• TUKEY PARAMETRIC Analysis [-s 0.1]: no reliable conclusion

• Bitrate table:
CODE
                    3.90.3    3.96.1    3.97      3.97a5   3.97a5
                    ABR 128   ABR 128   ABR 128   -V 5     -V 5 --vbr-new
    Alien4           124       123       121       90      171
    Blackadder       126       127       126      126      133
    Come Drink...    128       133       130       73      147
    Farinelli        132       134       132      103      129
    Pulp Fiction     129       132       130      103      133
    Ran              122       123       122      121      119

    Mean             126.8     128.7     126.8    102.7    138.7

PART III: 160 kbps encodings

I'm K.O. Use the original AC3 instead.

EDIT: all 6 samples are available HERE (limited availability).

This post has been edited by guruboolez: Dec 29 2005, 22:08
|
|
|
Jan 13 2005, 09:36
Post
#15
|
|
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
Thank you very much for those extensive results.
|
|
|
|
Jan 13 2005, 10:54
Post
#16
|
|
Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
I forgot this one, from Dev0, at 128 kbps:
QUOTE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:
1R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav
2R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - CBR - 128].wav
3L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
4L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - ABR - 128].wav
5R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
6L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - CBR - 128 - XX10].wav
---------------------------------------
General Comments: notation is linked to the performance of solo guitar (introduction, from 1.0 to ~5.0)
---------------------------------------
1R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav
1R Rating: 1.0
1R Comment:
---------------------------------------
2R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - CBR - 128].wav
2R Rating: 1.0
2R Comment:
---------------------------------------
3L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
3L Rating: 2.0
3L Comment:
---------------------------------------
4L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - ABR - 128].wav
4L Rating: 2.0
4L Comment:
---------------------------------------
5R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
5R Rating: 3.5
5R Comment:
---------------------------------------
6L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - CBR - 128 - XX10].wav
6L Rating: 3.5
6L Comment:
---------------------------------------
ABX Results:
D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav vs D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
    8 out of 8, pval = 0.004
D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav vs D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
    8 out of 8, pval = 0.004

No real difference between ABR/CBR for the same encoder.
3.96.1 < 3.90.3 < 3.97a5
This post has been edited by guruboolez: Jan 13 2005, 10:54 |
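As a side note on the "8 out of 8, pval = 0.004" lines in the log: the figure is consistent with a one-sided binomial test against pure guessing. The sketch below assumes that this is how the p-value is derived (the log itself does not say so), and `abx_p_value` is a name invented for the example.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials`
    when every trial is a coin flip (p = 0.5), i.e. a one-sided binomial tail."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


print(round(abx_p_value(8, 8), 3))   # 8 of 8 correct, as in the log
print(round(abx_p_value(7, 8), 3))   # one miss, for comparison
```

With 8 correct answers out of 8, the chance of doing that well by guessing is 1/256, which rounds to the 0.004 shown in the log.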
|
|
|
Jan 13 2005, 17:58
Post
#17
|
|
Group: Members Posts: 483 Joined: 1-December 02 Member No.: 3949 |
Wow. guruboolez, I have to thank you for your professional efforts.
|
|
|
|
Jan 13 2005, 19:07
Post
#18
|
|
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
Right now I do not need additional feedback in the 96-165 kbps range.
All those results are very informative to me, and I will adjust parameters according to them. |
|
|
|
Jan 13 2005, 20:18
Post
#19
|
|
LAME developer Group: Developer Posts: 2950 Joined: 1-October 01 From: Nanterre, France Member No.: 138 |
A new alpha will be available soon:
* vbr is unchanged
* cbr/abr are now using the X 10 mode (remapped to X9); ATH levels changed.
I think that this version should reduce the dropouts introduced by 3.96 in cbr/abr. |
|
|
|
Jan 14 2005, 14:31
Post
#20
|
|
Group: Members Posts: 524 Joined: 7-November 02 From: Gloucester, UK Member No.: 3716 |
I think I speak on behalf of everyone (pompous thing that I am) when I say I am *really* happy reading this thread. Huge thanks to Guru for testing, and a huge thanks to Gabriel for continuing development. Thanks
-------------------- http://www.megalev.co.uk
|
|
|
|
Jan 15 2005, 11:10
Post
#21
|
|
Group: Members (Donating) Posts: 3474 Joined: 7-November 01 From: Strasbourg (France) Member No.: 420 |
To finish with this alpha:
Yesterday I tested the --athlower setting with different values (from -5 to -15). With LisztBMinor.wav, I noticed the biggest improvement using --athlower 10 (compared to --athlower 9 and lower values). --athlower 15 led to a slight additional improvement (one artifact was removed). I also tested with some other samples. --athlower 10 is apparently a good way to reduce ringing (but not to remove it totally). But I should point out that I had to increase the listening volume to hear it (mp3GAIN could also reveal some problems that are inaudible under 'normal' conditions; I have often experienced that with my portable player). If some people are interested in testing, I have three other samples that might be useful: ftp://ftp2.foobar2000.net/foobar/ATH_LAME.ZIP |
|
|
|
Jan 15 2005, 16:57
Post
#22
|
|
Group: Members Posts: 669 Joined: 15-January 02 From: SE Pennsylvania Member No.: 1032 |
--vbr-new has another problem with controlling bitrate bloat. Guruboolez identified that vbr-new often uses 320 kbps frames at the end of an encoded track (during near silence). However, I have noticed that it also does this on tracks with those notorious "hidden songs", where an extra song is stuck at the end of the final track with several minutes of silence between the two (in the same WAV).
I encoded a track by Duncan Sheik called "Nichiren" and during the silence vbr-new was using 192 and 224kbps frames. Of course this "silence" is not digital silence but "analog silence". At -85dB, it's probably tape hiss but quiet enough to use 32kbps frames, I'd say. |
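To put the -85 dB figure in perspective, here is a small Python sketch that converts a level in dB to a peak sample amplitude. It assumes the level is measured relative to digital full scale (dBFS) and that the source is 16-bit PCM; neither is stated in the post, and `dbfs_to_amplitude` is just an illustrative name.

```python
def dbfs_to_amplitude(db, bits=16):
    """Convert a level in dB relative to full scale into a peak amplitude
    expressed in integer sample steps (LSBs) of signed PCM with `bits` bits."""
    full_scale = 2 ** (bits - 1)          # 32768 steps for 16-bit PCM
    return full_scale * 10 ** (db / 20)   # amplitude ratio = 10^(dB/20)


print(f"{dbfs_to_amplitude(-85):.2f} LSB")
```

Under those assumptions, -85 dB corresponds to a signal swinging by only about two quantization steps, which supports the point that this "analog silence" should not need 192 or 224 kbps frames.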
|
|
|