LAME 3.96 vs. 3.90.3 Test

The purpose of this thread is to (finally) test LAME 3.96 thoroughly enough to make it the new recommended LAME version and to improve communication between LAME developers and the HA.org community. The recent confusion about the different LAME versions and compiles (-Qrcd trouble etc.) has led to the conclusion that keeping up with LAME development has been postponed for long enough, and that LAME 3.90.X can hopefully be declared dead very soon.
Please test as many samples as possible according to the following guidelines and post any results here.


1. Use the following LAME compiles (updated!):
lame3.90.3
lame3.96b2

2. The focus of the test should be --alt-preset/--preset standard, since it will allow us to draw conclusions about the overall performance of the 'code level tweaked' VBR presets. Other VBR/ABR/CBR presets are interesting too, but not as important. If problems with --alt-preset standard are detected, feel free to compare extreme and insane too. Please test the following combinations:

(alt)presets + VBR/ABR
(320kbps) 3.96 --preset insane vs. 3.90.3 --alt-preset insane
(~256kbps) 3.96 --preset extreme vs. 3.90.3 --alt-preset extreme
(~210kbps) 3.96 --preset standard vs. 3.90.3 --alt-preset standard
(~160kbps) 3.96 -V 4 vs. 3.96 --preset 160 vs. 3.90.3 --alt-preset 160
(~128kbps) 3.96 -V 5 vs. 3.96 --preset 128 vs. 3.90.3 --alt-preset 128

CBR
3.96 --preset cbr <bitrate> vs. 3.90.3 --alt-preset cbr <bitrate>

If you want, you can additionally test VBR/ABR vs. CBR at comparable bitrates
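
For anyone scripting the encodes, here is a minimal sketch that builds the command lines for the combinations listed above. The binary names `lame3903.exe` and `lame396b2.exe` and the output file names are assumptions for illustration only; point them at whatever the two compiles from step 1 are called on your system. The script only builds argument lists and does not run the encoder.

```python
# Hypothetical binary names: adjust to the two compiles linked in step 1.
ENCODERS = {
    "3.90.3": ("lame3903.exe", "--alt-preset"),
    "3.96b2": ("lame396b2.exe", "--preset"),
}


def preset_command(version, preset, wav, mp3):
    """Return the argv list for encoding `wav` to `mp3` with a named preset."""
    exe, flag = ENCODERS[version]
    # `preset` may hold several words, e.g. "cbr 128".
    return [exe, flag, *preset.split(), wav, mp3]


def vbr_command(quality, wav, mp3, version="3.96b2"):
    """Return the argv list for a plain -V encode (tested with 3.96 only)."""
    exe, _ = ENCODERS[version]
    return [exe, "-V", str(quality), wav, mp3]


if __name__ == "__main__":
    for preset in ("insane", "extreme", "standard", "160", "128", "cbr 128"):
        for version in ENCODERS:
            tag = preset.replace(" ", "")
            out = f"sample_{version}_{tag}.mp3"
            print(" ".join(preset_command(version, preset, "sample.wav", out)))
    for quality in (4, 5):
        out = f"sample_3.96b2_V{quality}.mp3"
        print(" ".join(vbr_command(quality, "sample.wav", out)))
```

Each list can be passed to `subprocess.run` once the binary names are correct.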

3. You may use any sample you want, as long as you upload a losslessly compressed version of that (or provide a working link), so others can verify your results.

4. Your test results have to include the following:
  • ABX results for
    3.90.3 vs. Original
    3.96b2 vs. Original
    3.96b2 vs. 3.90.3
  • ABC/HR results are appreciated especially at lower bitrates, but shouldn't be considered a requirement.
  • (Short) descriptions of the artifacts/differences
Notes:
This thread is for results only; please discuss the test here.
The earlier 'regression examples' thread has been closed and we'll try to add all results from it to the collected test results below. The original thread is still available here.
Please provide direct links to the sample you tested with or upload it here, even if it's a well-known one - it'll make things much easier for others.
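
A note on reading ABX scores: the p-values that ABX tools print are consistent with a one-sided binomial test, i.e. the chance of getting at least that many trials right by pure guessing. Below is a minimal sketch under that assumption; the function name `abx_p_value` is made up for illustration and is not part of any ABX tool.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by coin-flip guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, then divide by all 2**trials outcomes.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    for correct, trials in [(8, 8), (7, 8), (5, 8), (12, 16)]:
        print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.3f}")
```

For example, 8 out of 8 gives 1/256 (about 0.004), while 5 out of 8 gives 93/256 (about 0.36), which is why a short run with a few misses says very little.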


Edit by tigre, Mar 31 2004: Updated to 3.96b2 and modified/simplified medium bitrate ABR/VBR test recommendations according to 3.96 -V x mass encode average bitrates.
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #1
Test results so far:


320kbps

3.96b1 --p insane < 3.90.3 --ap insane :: Lazy_Jones :: owowo :: 0x verified so far

~ 256kbps
3.96b1 --p extreme < 3.90.3 --ap extreme :: Lazy_Jones :: owowo :: 0x verified so far

~ 210kbps
3.96b1 --p standard < 3.90.3 --ap standard :: myf_4sec :: LoFiYo :: 2x verified by Wombat, harashin
3.96b2 : 1x verified by LoFiYo
3.96b1 --p standard < 3.90.3 --ap standard :: Rebel :: Proxima :: 1x verified by tigre
3.96b1 --p standard < 3.90.3 --ap standard :: (-) Ions :: Westgroveg :: 1x verified so far by harashin
3.96b1 --p standard < 3.90.3 --ap standard :: Hustlejet :: harashin :: 0x verified so far
3.96b2 : 1x verified by harashin
3.96b1 --p standard < 3.90.3 --ap standard :: trumpets1 :: SometimesWarrior :: 0x verified so far
3.96b1 --p standard < 3.90.3 --ap standard :: Doesnair :: Moitah :: 1x verified by harashin
3.96b1 --p standard < 3.90.3 --ap standard :: cantwait :: tigre :: 2x verified by harashin, Moitah
3.96b1 --p standard < 3.90.3 --ap standard :: death2 :: SometimesWarrior :: 1x verified by harashin
3.96b1 --p standard < 3.90.3 --ap standard :: Destitute :: harashin :: 0x verified so far
3.96b1 --p standard < 3.90.3 --ap standard :: Rosemary :: harashin :: 0x verified so far
3.96b1 --p standard < 3.90.3 --ap standard :: drone_short :: freakngoat :: 0x verified so far
3.96b1 --p standard < 3.90.3 --ap standard :: Chanchan1 :: tigre :: 1x verified by harashin
3.96b2 : 1x verified by tigre
3.96b1 --p standard < 3.90.3 --ap standard :: Hosokawa___Atem_lied :: harashin :: 0x verified so far
3.96b2 : 1x verified by harashin
3.96b2 --p standard = 3.90.3 --ap standard :: 41_30sec :: ViPER1313
3.96b1 --p standard > 3.90.3 --ap standard :: spahm :: Pio2001 :: 1x verified by Wombat
3.96b1 --p standard > 3.90.3 --ap standard :: Birds :: Wombat :: 0x verified so far
3.96b1 --p standard > 3.90.3 --ap standard :: hokuscaredpiano :: Moitah :: 0x verified so far
3.96b1 --p standard > 3.90.3 --ap standard :: Lazy_Jones :: owowo :: 1x verified by freakngoat

~ 192kbps
no results so far

~ 160kbps
3.96b2 --p 160 > 3.90.3 --ap 160 > 3.96b2 -V 4 :: myf_4sec :: LoFiYo :: 0x verified so far + missing ABX results*

~ 144kbps
no results so far

~ 128kbps
VBR/ABR

3.96b1 --p 128 < 3.90.3 --ap 128 :: Quizas :: tigre :: 0x verified so far
3.96b1 --p 128 < 3.90.3 --ap 128 :: Doesnair :: Moitah :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: (-) Ions :: PVNC :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Applaud :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Bassdrum :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Blackwater :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Campestre :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Fall :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Iron :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Thewayitis :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 < 3.90.3 --ap 128 :: Tosca :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 --p 128 > 3.90.3 --ap 128 :: fatboy :: sony666 :: 2x verified so far by Wombat, FatBoyFin
3.96b1 --p 128 -q 0 < 3.90.3 --ap 128 :: Velvet :: [proxima] :: 0x verified so far + missing ABX results*
3.96b1 -V 5 > 3.90.3 --ap 128 :: Quizas :: tigre :: 0x verified so far
3.96b2 -V 5 > 3.96b1 --p 128 > 3.90.3 --ap 128 :: fatboy :: FatBoyFin :: 0x verified so far
3.96b2 -V 5 > 3.90.3 --ap 128 > 3.96b1 --p 128 :: Its_me :: tigre :: 0x verified so far
3.96b2 -V 5 > 3.90.3 --ap 128 > 3.96b1 --p 128 :: entierren con rumba :: tigre :: 0x verified so far

CBR
3.96b1 --p cbr 128 < 3.90.3 --ap cbr 128 :: Quizas :: tigre :: 0x verified so far

* If a result doesn't meet the minimum requirements (-> 4. in 1st post), the result is 'greyed out' until the missing data is provided or someone else confirms the results. If your result isn't included in this list at all, we're either too slow, or there's too much missing (ABX results, link to the sample, clear statement of which version is better, information about the LAME version/setting used etc.). If you want to provide missing information, do it in a new post to ensure that we notice it.
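
The summary lines above follow a loose pattern (comparison :: sample :: tester :: verification note). Here is a minimal sketch for tallying them, assuming `<` is read as "left-hand encoder rated worse" and that three-way chains and the short "3.96b2 : ..." follow-up lines are simply skipped. The function names are made up for illustration.

```python
import re
from collections import Counter


def parse_result(line):
    """Split a summary line into (left, op, right, sample, tester).

    Returns None for lines that are not simple two-way comparisons.
    """
    fields = [field.strip() for field in line.split("::")]
    if len(fields) < 3:
        return None
    # Split the comparison on a lone <, > or = surrounded by whitespace.
    parts = re.split(r"\s([<>=])\s", fields[0])
    if len(parts) != 3:  # e.g. three-way chains such as "a > b > c"
        return None
    left, op, right = parts
    return left, op, right, fields[1], fields[2]


def tally(lines):
    """Count how often each comparison operator occurs."""
    counts = Counter()
    for line in lines:
        parsed = parse_result(line)
        if parsed:
            counts[parsed[1]] += 1
    return counts


EXAMPLE = [
    "3.96b1 --p standard < 3.90.3 --ap standard :: Rebel :: Proxima :: 1x verified by tigre",
    "3.96b2 : 1x verified by LoFiYo",
    "3.96b2 --p standard = 3.90.3 --ap standard :: 41_30sec :: ViPER1313",
    "3.96b1 --p standard > 3.90.3 --ap standard :: Birds :: Wombat :: 0x verified so far",
    "3.96b2 --p 160 > 3.90.3 --ap 160 > 3.96b2 -V 4 :: myf_4sec :: LoFiYo :: 0x verified so far",
]

if __name__ == "__main__":
    print(tally(EXAMPLE))
```

A raw count like this ignores bitrate, verification status and how large the difference was, so treat it as a rough overview only.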
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #2
3.96b1 shows great improvement with fatboy and --alt-preset 128 (ABR)

-wav vs 3.90.3: 8/8, horribly distorted "vocals", "knocking" artifact 1.5-3.0s
-wav vs 3.96b1: 8/8 badly distorted vocals
-3.90.3 vs 3.96b1: 8/8, clearly less distortion in 3.96b1

I tried aps, but failed miserably on both encoders this time.

Reply #3
I used this to test APS 3.90.3 vs APS 3.96b1.

3.90.3 APS was 138kbps; 3.96b1 APS was 116kbps. If you don't have a golden set of ears, I recommend you try this sample. It's very easy to ABX.

ABC/HR Version 0.9b, 30 August 2002
Testname: My Funny Valentine - 3.90.3 vs 3.96b1

1R = C:\My Music\lab\MYF-4SEC\aps396b1.wav
2R = C:\My Music\lab\MYF-4SEC\aps3903.wav

---------------------------------------
General Comments:
#2 is the obvious winner.
---------------------------------------
1R File: C:\My Music\lab\MYF-4SEC\aps396b1.wav
1R Rating: 1.0
1R Comment: Distortion is very noticeable throughout the sample.
---------------------------------------
2R File: C:\My Music\lab\MYF-4SEC\aps3903.wav
2R Rating: 3.9
2R Comment: Much better than #1. Perceptible and almost not annoying.
---------------------------------------
ABX Results:
Original vs C:\My Music\lab\MYF-4SEC\aps396b1.wav
    8 out of 8, pval = 0.004
Original vs C:\My Music\lab\MYF-4SEC\aps3903.wav
    8 out of 8, pval = 0.004
C:\My Music\lab\MYF-4SEC\aps396b1.wav vs C:\My Music\lab\MYF-4SEC\aps3903.wav
    8 out of 8, pval = 0.004

Reply #4
Just verified myf_4sec with aps

I could ABX 396b -> original 8/8
and            3903 -> original 5/8

As this sample is new to me, I am surprised how obviously 396b suffers here on the third blow of the trumpet!

Wombat
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Reply #5
I begin with the easiest samples for me. I included MPC out of curiosity, and WMA9 for reasons that are not necessary to explain here.

Lame 3.90.3 --alt-preset standard
Lame 3.96b1 --preset standard
Mppenc 1.14 --xlevel
WMA9 VBR 135-215 kbps

Sample : Badvilbel
Wav vs 3.90.3 ABX 11/11 Lots of pre echo, loss of transients, drop outs
Wav vs 3.96b1 ABX 8/8 Same as above
Wav vs MPC ABX 7/8 Gurgling on transients, ringing, drop outs, but better
Wav vs WMA9 ABX 8/8 Best, only transient problems, no drop outs at all

3.96b1 vs WMA9 : ABX 7/8
3.96b1 vs MPC : ABX 8/8

Conclusion of 3.90.3 vs 3.96b1 :
Same quality


Sample : Drone Short
3.90.3 : ABX 16/16 toooo easy ! Artifacts during the first half second, and time smearing during the second half.
3.96b1 : ABX 16/16 Somewhat louder artifacts, but not enough to talk about regression, just chance.
90 vs 96 : 11/14 (3% chance of guessing). The artifacts are not at the same place.
MPC : perfect. Maybe a tiny noise ? ABX 3/8, no.
WMA9 : 16/16, unacceptable quality

Conclusion of 3.90.3 vs 3.96b1 :
Same quality


Spahm
3.90.3 : Noise. 16/16
3.96b1 : perfect. Maybe a tiny noise ? 8/8 It seems so...
3.90.3 vs 3.96b1 : 16/16
MPC : Same effect as transients in badvilbel : grungy instead of hissy (mp3) 16/16
WMA : worse. 16/17 (I went too fast and made one mistake).

Conclusion of 3.90.3 vs 3.96b1 :
Big improvement. Sample nearly completely solved.


For these three samples, compared to 3.90.3 :
3.96b1 is better one time, equal two times
MPC equal one time, slightly better one time, better one time.
WMA9 is unlistenable one time, worse one time, better one time.

Overall conclusion : none for the time being. Results are sample-dependent.
More tests to come, but with more difficult samples. It will take longer.

These samples are available on FF123's page : http://www.ff123.net/samples.html

Reply #6
I wrote before in another thread about the birds sample and aps

3903 -> original 16/16 clear artifact when she sings the first "e" of become
396b  only guessing
396b -> 3.903 8/8

396b is the first aps without a problem to me and birds.

The sample itself is now located here:
http://www.hydrogenaudio.org/forums/index....=0&#entry195385

I did 16 trials this time to see how long I can clearly discern the difference. I am not used to doing ABX tests much.
I have to admit it is not that easy, even when you know the problem in the test file.

Edit: location of the sample file

Wombat
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Reply #7
Sample: rebel.wav
Artifact: Preecho.
ABX results (range 0:01-0:03 sec.):

Original vs. 3.90.2 (--aps -Z) 13/14
Original vs. 3.96b1 (--aps) 10/10
3.90.2 vs 3.96b1 10/10

Both are non-transparent, but 3.90.2 is slightly better.
As already reported in my previous test, the same sample also reveals a ringing problem with 3.96b1 (--ap 128).
The sample is available in the relevant thread.
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1

Reply #8
Sample: Quizas
Tested lame settings:
3.90.3 --alt-preset 128 (132kbps)
3.90.3 --alt-preset cbr 128 (128kbps)
3.96b1 --preset 128 (135kbps)
3.96b1 --preset cbr 128 (128kbps)
3.96b1 -V 5 (138kbps)
Decoded with fb2k with trackgain enabled
Ratings:
3.96b1 -V 5 >> 3.90.3 --alt-preset 128 > 3.90.3 --alt-preset cbr 128 >> 3.96b1 --preset 128 > 3.96b1 --preset cbr 128

ABC/HR results:
Quote
ABC/HR Version 0.9b, 30 August 2002
Testname: Quizas 128kbps

1L = quizas.wav_3903_ap128.wav
2R = quizas.wav_396b1_pcbr128.wav
3R = quizas.wav_396b1_V5.wav
4L = quizas.wav_396b1_p128.wav
5L = quizas.wav_3903_apcbr128.wav

---------------------------------------
General Comments:
Ratings bases on 1st half (0.0-14.0), biggest problems there.
---------------------------------------
1L File: quizas.wav_3903_ap128.wav
1L Rating: 3.0
1L Comment: pre-echo/smearing, easy to ABX, e.g. 8.6-10.9
---------------------------------------
2R File: quizas.wav_396b1_pcbr128.wav
2R Rating: 1.0
2R Comment: pre-echo + ringing/chirping, easy to ABX, e.g. 8.6-10.9
---------------------------------------
3R File: quizas.wav_396b1_V5.wav
3R Rating: 4.0
3R Comment: only very small smearing of percussions, ABXed 8.6-10.9
---------------------------------------
4L File: quizas.wav_396b1_p128.wav
4L Rating: 1.5
4L Comment: preecho and hf chirping/warbling, ABXed 5.1-8.6
---------------------------------------
5L File: quizas.wav_3903_apcbr128.wav
5L Rating: 2.5
5L Comment: preecho, easy to ABX, e.g. 8.6-10.9
---------------------------------------
ABX Results:
Original vs quizas.wav_3903_ap128.wav
    10 out of 11, pval = 0.006
Original vs quizas.wav_396b1_pcbr128.wav
    7 out of 7, pval = 0.008
Original vs quizas.wav_396b1_V5.wav
    11 out of 12, pval = 0.003
Original vs quizas.wav_396b1_p128.wav
    7 out of 7, pval = 0.008
Original vs quizas.wav_3903_apcbr128.wav
    7 out of 7, pval = 0.008
quizas.wav_3903_ap128.wav vs quizas.wav_396b1_pcbr128.wav
    11 out of 12, pval = 0.003
quizas.wav_3903_ap128.wav vs quizas.wav_396b1_V5.wav
    7 out of 7, pval = 0.008
quizas.wav_3903_ap128.wav vs quizas.wav_396b1_p128.wav
    7 out of 7, pval = 0.008
quizas.wav_3903_ap128.wav vs quizas.wav_3903_apcbr128.wav
    11 out of 12, pval = 0.003
quizas.wav_396b1_pcbr128.wav vs quizas.wav_396b1_V5.wav
    7 out of 7, pval = 0.008
quizas.wav_396b1_pcbr128.wav vs quizas.wav_396b1_p128.wav
    8 out of 8, pval = 0.004
quizas.wav_396b1_pcbr128.wav vs quizas.wav_3903_apcbr128.wav
    7 out of 7, pval = 0.008
quizas.wav_396b1_V5.wav vs quizas.wav_396b1_p128.wav
    7 out of 7, pval = 0.008
quizas.wav_396b1_V5.wav vs quizas.wav_3903_apcbr128.wav
    7 out of 7, pval = 0.008
quizas.wav_396b1_p128.wav vs quizas.wav_3903_apcbr128.wav
    11 out of 12, pval = 0.003
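
The ranking at the top of this post can be read straight off the "Rating:" lines of the report. Here is a minimal sketch that pulls them out of an ABC/HR text report and sorts them, assuming the report keeps the "NL File:" / "NL Rating:" layout shown above. The function names and the shortened `SAMPLE` excerpt are for illustration only.

```python
import re

# Matches lines such as "3R File: name.wav" or "3R Rating: 4.0".
LINE = re.compile(r"^(\d+[LR]) (File|Rating): (.+)$")


def parse_ratings(report):
    """Map each rated file name to its ABC/HR score."""
    files, ratings = {}, {}
    for raw in report.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        slot, key, value = match.groups()
        if key == "File":
            files[slot] = value
        else:
            ratings[slot] = float(value)
    return {files[slot]: score for slot, score in ratings.items() if slot in files}


def ranked(report):
    """Return (file, rating) pairs, best rating first."""
    return sorted(parse_ratings(report).items(), key=lambda item: item[1], reverse=True)


SAMPLE = """\
1L File: quizas.wav_3903_ap128.wav
1L Rating: 3.0
2R File: quizas.wav_396b1_pcbr128.wav
2R Rating: 1.0
3R File: quizas.wav_396b1_V5.wav
3R Rating: 4.0
"""

if __name__ == "__main__":
    for name, score in ranked(SAMPLE):
        print(f"{score:.1f}  {name}")
```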
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

Reply #9
Quote
3.96 --p standard < 3.90.3 --ap standard :: isitloveintro :: Moitah :: 0x verified so far + missing ABX results*

Could you remove this one?  It isn't worse in 3.96, I was able to ABX for 3.90.3 as well (see further down in the original thread).

Quote
3.96 --p standard < 3.90.3 --ap standard :: Doesnair :: Moitah :: 0x verified so far + missing ABX results*

This sample is almost certainly a regression... I will do more thorough testing to make sure.  Can someone else verify this (listen where the snare is hit, and it's a good idea to start with --p 128 because the artifact is a lot more obvious)?

Reply #10
Sample: doesnair
3.90.3 --ap standard vs 3.96b1 --p standard

Code: [Select]
Original vs H:\lame\doesnair\1579_doesnair-390-s.wav
   14 out of 16, pval = 0.002
Original vs H:\lame\doesnair\1583_doesnair-396-s.wav
   16 out of 16, pval < 0.001
H:\lame\doesnair\1579_doesnair-390-s.wav vs H:\lame\doesnair\1583_doesnair-396-s.wav
   13 out of 16, pval = 0.011


Both aren't transparent vs original, but 3.96b1 is worse.

Reply #11
spahm
This sample somehow makes my ears vibrate after too many tries!

3903 8/8 pretty easy i only listen the first beats and the added noise is pretty obvious
396b 7/8 the same noise but to a much lesser degree


hustlejet

just gave me helpless clicky, clicky and when i thought i heard the problem - Nada!
Not my cup of sample

btw. ff123 did a nice job with abchr!!

Wombat
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Reply #12
Sample: hokuscaredpiano (Hoku - I'm Scared)
3.96b1 --p standard vs 3.90.3 --ap standard

Code: [Select]
Original vs H:\lame\hokuscaredpiano\1586_hokuscaredpiano-396-s.wav
   15 out of 16, pval < 0.001
Original vs H:\lame\hokuscaredpiano\1585_hokuscaredpiano-390-s.wav
   14 out of 16, pval = 0.002
H:\lame\hokuscaredpiano\1586_hokuscaredpiano-396-s.wav vs H:\lame\hokuscaredpiano\1585_hokuscaredpiano-390-s.wav
   13 out of 16, pval = 0.011


I'm hearing a problem on the piano (first chord change, 1.87 secs in); 3.96b1 sounds a bit better.

Reply #13
cantwait

I wanted to use this CD for testing lower bitrates but at 128kbps the artifacts were so obvious that I decided to start with (alt)preset standard.

Both 3.90.3 --alt-preset standard and 3.96b1 --preset standard make the singer's voice (or its echo) sound metallic in many places. One easily noticeable example is the "s" of the word "understand" at ~ 8 seconds. Other "s" sounds (and some other places) have similar problems. 3.96b1 is clearly worse.

ABXed with fb2k, original vs. mp3 9/9 each time, 3.90.3 vs. 3.96b1 11/12

EDIT:
Edited in Wombat's + Moitah's (and my) results.

P.S.: Can someone verify my findings, please - is it just me or can this be considered as lame problem/killer sample?
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

Reply #14
Quote
hustlejet

just gave me helpless clicky, clicky and when i thought i heard the problem - Nada!
Not my cup of sample

That's bad; it seems even ReplayGain didn't fix its clipping issue on your system (I don't notice it here, though).

Anyway, I was able to confirm tigre's cantwait sample. Somehow I can't reach my hosting space at CyberQuébec today, so I am posting the results here.
Code: [Select]
A file: I:\test\cantwait\cantwait.wav B file: I:\test\cantwait\3.96b1aps.wav
10/10  p< 0.1%
A file: I:\test\cantwait\cantwait.wav B file: I:\test\cantwait\3.90.3aps.wav
4/15  p=98.2%
A file: I:\test\cantwait\3.90.3aps.wav B file: I:\test\cantwait\3.96b1aps.wav
15/20  p=2.1%

EDIT: The server is back now. Results

Reply #15
Quote
P.S.: Can someone verify my findings, please - is it just me or can this be considered as lame problem/killer sample?

Yeah, both are pretty easy to ABX against the original (--(alt-)preset standard):

Code: [Select]
Original vs H:\lame\1583_Max - Cant Wait Until Tonight (Dry Wurlitzer Mix)-396-s.wav
   18 out of 20, pval < 0.001
Original vs H:\lame\1585_Max - Cant Wait Until Tonight (Dry Wurlitzer Mix)-390-s.wav
   17 out of 20, pval = 0.001

Didn't have any luck ABXing them against each other, but I wasn't concentrating as hard as I normally do... I'll try again later.

EDIT: Also, I only tried one part of the sample (the "s" in "understand").

EDIT 2: Took a break and switched to Grado SR225s (was using PortaPros before):

Code: [Select]
H:\lame\1585_Max - Cant Wait Until Tonight (Dry Wurlitzer Mix)-390-s.wav vs H:\lame\1583_Max - Cant Wait Until Tonight (Dry Wurlitzer Mix)-396-s.wav
   20 out of 25, pval = 0.002

3.96b1 is worse.

Reply #16
Sample used: Lazy_Jones.flac
3.96b1 --preset standard <268kbps> vs. 3.90.3 --alt-preset standard <242kbps>
EDIT: Average bitrates according to Nero6
Encoded with lame.exe/LAMEDrop1.3 - Decoded with Nero6
Code: [Select]
-------------------------------------
WinABX v0.42 test report
03/21/2004 00:10:57

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap s).wav

00:11:59    1/1  p=50.0%
00:12:10    2/2  p=25.0%
00:12:19    3/3  p=12.5%
00:12:25    4/4  p=6.2%
00:12:29    5/5  p=3.1%
00:12:36    6/6  p=1.6%
00:12:40    7/7  p=0.8%
00:12:44    8/8  p=0.4%
00:12:55    9/9  p=0.2%
00:13:50  reset

00:14:18  test finished

-------------------------------------
WinABX v0.42 test report
03/21/2004 00:14:18

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p s).wav

00:14:48    1/1  p=50.0%
00:14:58    1/2  p=75.0%
00:15:07    2/3  p=50.0%
00:15:16    3/4  p=31.2%
00:15:23    4/5  p=18.8%
00:15:31    5/6  p=10.9%
00:15:39    6/7  p=6.2%
00:15:49    7/8  p=3.5%
00:15:58    7/9  p=9.0%
00:16:28  reset

00:16:32  test finished

-------------------------------------
WinABX v0.42 test report
03/21/2004 00:17:47

A file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap s).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p s).wav

00:18:03    1/1  p=50.0%
00:18:12    2/2  p=25.0%
00:18:21    3/3  p=12.5%
00:18:30    4/4  p=6.2%
00:18:38    5/5  p=3.1%
00:18:47    6/6  p=1.6%
00:18:58    7/7  p=0.8%
00:19:09    8/8  p=0.4%
00:19:19    9/9  p=0.2%
00:19:22  reset

00:19:33  test finished

Comments:
3.90.3 --alt-preset standard; completely chokes on this one
3.96b1 --preset standard; near perfect accuracy
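
The running p-values in the WinABX logs above can be reproduced with a cumulative binomial tail. This is a minimal sketch assuming WinABX reports the chance of doing at least this well by guessing (the log does not state its formula, but the printed figures agree with it). The function name is made up for illustration.

```python
from math import comb


def running_p_values(outcomes):
    """Yield (correct, trials, p) after each trial.

    `outcomes` is a sequence of truthy (right) / falsy (wrong) answers and
    p is the chance of at least `correct` hits in `trials` coin-flip guesses.
    """
    correct = 0
    for trials, hit in enumerate(outcomes, start=1):
        correct += bool(hit)
        tail = sum(comb(trials, k) for k in range(correct, trials + 1))
        yield correct, trials, tail / 2 ** trials


if __name__ == "__main__":
    # Right/wrong sequence of the 3.96b1 run above: misses on trials 2 and 9.
    run = [1, 0, 1, 1, 1, 1, 1, 1, 0]
    for correct, trials, p in running_p_values(run):
        print(f"{correct}/{trials}  p={p * 100:.1f}%")
```

The 3.96b1 run ends at 7/9 with p of about 9%, so on its own it does not show an audible difference from the original at the usual 5% threshold.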

Reply #17
As requested by tigre, I am providing ABX results for my listening test even though the samples are very easy, because few people test any specific sample. I'm sorry that I was too lazy to ABX the two LAME versions against each other for all the samples, but you can always see the ratings in the posted table and comments. For the great majority of the samples 3.90.2 is better. Most samples should be well known to testers; nevertheless, I uploaded some of them.

Quote
Original vs applaud396q0.wav
    8 out of 8, pval = 0.004
Original vs applaud_3902.wav
    11 out of 12, pval = 0.003
Original vs applaud_396b1.wav
    10 out of 10, pval < 0.001

available here: http://www.mp3dev.org/mp3/gpsycho/quality.html
Quote
Original vs Bassdrum_3902.wav
    10 out of 10, pval < 0.001
Original vs Bassdrum396q0.wav
    8 out of 8, pval = 0.004
Original vs Bassdrum_396b1.wav
    8 out of 8, pval = 0.004

sample uploaded
Quote
Original vs Blackwater_396b1.wav
    8 out of 8, pval = 0.004
Original vs Blackwater_3902.wav
    8 out of 8, pval = 0.004
Original vs Blackwater396q0.wav
    8 out of 8, pval = 0.004

sample available here: http://www.ff123.net/samples.html
Quote
Original vs campestre396q0.wav
    8 out of 8, pval = 0.004
Original vs campestre_396b1.wav
    8 out of 8, pval = 0.004
Original vs campestre_3902.wav
    8 out of 8, pval = 0.004

uploaded http://xoomer.virgilio.it/fofobella/campestre.flac
Quote
Original vs emmtop_3902.wav
    8 out of 8, pval = 0.004
Original vs emmtop_396b1.wav
    8 out of 8, pval = 0.004
Original vs emmtop396q0.wav
    14 out of 15, pval < 0.001

sorry, sample exceeds 30 sec.
Quote
Original vs fall_3902.wav
    16 out of 28, pval = 0.286
Original vs fall_396b1.wav
    8 out of 8, pval = 0.004
Original vs fall396q0.wav
    18 out of 21, pval < 0.001

sample uploaded
Quote
Original vs iron396q0.wav
    8 out of 8, pval = 0.004
Original vs iron_396b1.wav
    8 out of 8, pval = 0.004
Original vs iron_3902.wav
    8 out of 8, pval = 0.004

available here: http://www.mp3dev.org/mp3/gpsycho/quality.html
Quote
Original vs Rebel396q0.wav
    8 out of 8, pval = 0.004
Original vs Rebel_3902.wav
    8 out of 8, pval = 0.004
Original vs Rebel_396b1.wav
    13 out of 14, pval < 0.001

sample uploaded
Quote
Original vs thewayitis_3902.wav
    8 out of 8, pval = 0.004
Original vs thewayitis396q0.wav
    8 out of 8, pval = 0.004
Original vs thewayitis_396b1.wav
    8 out of 8, pval = 0.004

available here: http://www.ff123.net/samples.html
Quote
Original vs Tosca_396b1.wav
    8 out of 8, pval = 0.004
Original vs Tosca_3902.wav
    8 out of 8, pval = 0.004
Original vs Tosca396q0.wav
    8 out of 8, pval = 0.004

sample uploaded
Quote
Original vs velvet_3902.wav
    8 out of 8, pval = 0.004
Original vs velvet_396b1.wav
    8 out of 8, pval = 0.004
Original vs velvet396q0.wav
    8 out of 8, pval = 0.004

available here: http://www.mp3dev.org/mp3/gpsycho/quality.html
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1

Reply #18
Ticket to Ride by The Beatles

I had to go down to 96kbps, because this one is easy to encode.

lame-3.90.3 --alt-preset 96
vs
lame-3.96b1 --preset 96

Quote
ABC/HR Version 0.9b, 30 August 2002
Testname: TICKET TO RIDE - ABR96 - 3.90.3 VS 3.96B1

1L = C:\My Music\lab\tkt\abr96-396b1.mp3.wav
2L = C:\My Music\lab\tkt\abr96-3903.mp3.wav

---------------------------------------
General Comments:
#2 was much harder to ABX than #1. For this sample, #2 is the clear winner.
---------------------------------------
1L File: C:\My Music\lab\tkt\abr96-396b1.mp3.wav
1L Rating: 2.5
1L Comment: The tambourine sounds distorted. The drum roll is also distorted.
---------------------------------------
2L File: C:\My Music\lab\tkt\abr96-3903.mp3.wav
2L Rating: 4.0
2L Comment: The tambourine sounds fine. The drum roll is distorted.
---------------------------------------
ABX Results:
Original vs C:\My Music\lab\tkt\abr96-396b1.mp3.wav
    10 out of 11, pval = 0.006
Original vs C:\My Music\lab\tkt\abr96-3903.mp3.wav
    10 out of 13, pval = 0.046
C:\My Music\lab\tkt\abr96-396b1.mp3.wav vs C:\My Music\lab\tkt\abr96-3903.mp3.wav
    5 out of 5, pval = 0.031

Reply #19
Sound Blaster Live! 5.1
foobar2000 ABX comparator
DSP: volume control, Resampler (SSRC) 48000Hz
all files replaygained on track basis
Harman/Kardon HK-395 speakers


(-) Ions --alt-preset 128

wav vs 3.90.3 modified (494 KB)
8/8
0.4%
Comment: mp3 version sounds less sharp when volume of buzzing reaches peak level; mp3 sounds kind of smeared.

wav vs 3.96b1 (520 KB)
8/8
0.4%
Comment: distortions between volume peaks - this affects the lower-volume buzzing; (peak volume buzzing seems to sound like original wav).

3.90.3 modified (494 KB) vs 3.96b1 (520 KB)
8/8
0.4%
Comment: harder to ABX the two encoded versions.  I found 3.96's distortions a more reliable cue for telling the two apart than the peak-volume buzzing.  3.90.3 is harder to pick out, at least with my ears.


Rock & Roll --alt-preset 128

wav vs 3.90.3 modified (442 KB)
8/8
0.4%
Comment: cymbals and drum solo sound very "smeared."  Also I noticed what seemed like an instant of stereo collapse.

wav vs 3.96b1 (449 KB)
8/8
0.4%
Comment: similar artifacts as 3.90.3, with the initial part of the drum solo seeming to suffer from a HF boost, but I did not notice the stereo problem.

3.90.3 modified (442 KB) vs 3.96b1 (449 KB)
17/21
(ABX comparator gives me N/A as a probability)
Comment: This was the toughest.  The smearing of cymbals was of no use to differentiate the two, as it occurs in both, and I stopped getting positive results with the supposed HF boost in the drum solo (hence the 4 failed trials).  Then I focused on the end of the last chorus and the final vocal, which sounded more distorted in 3.90.3 than in 3.96.  That turned out to be the main difference.
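
The comparator showed N/A for the 17/21 run. Assuming the usual ABX convention of a one-sided binomial test with a guessing probability of 1/2 (an assumption, since the comparator's own formula is not shown here), the figure can be worked out by hand:

```latex
p = \frac{1}{2^{21}} \sum_{k=17}^{21} \binom{21}{k}
  = \frac{5985 + 1330 + 210 + 21 + 1}{2097152}
  = \frac{7547}{2097152}
  \approx 0.0036
```

That is roughly a 0.4% chance of scoring 17 or more out of 21 by guessing.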


Reply #21
Sample used: Lazy_Jones.flac
TEST: 3.96b1 (--preset extreme) <287.5 Kbps> vs. 3.90.3 (--alt-preset extreme) <273.9 Kbps>
Average bitrates according to Nero6
Encoded with lame.exe/LAMEDrop1.3 (l3maniac) - Decoded with Nero6
Code: [Select]
-------------------------------------
WinABX v0.42 test report
03/22/2004 20:50:33

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap e).wav

20:50:55    1/1  p=50.0%
20:51:02    2/2  p=25.0%
20:51:11    3/3  p=12.5%
20:51:19    4/4  p=6.2%
20:51:34    5/5  p=3.1%
20:51:42    6/6  p=1.6%
20:51:50    7/7  p=0.8%
20:51:59    8/8  p=0.4%
20:52:09    9/9  p=0.2%
20:52:10  reset

20:52:19  test finished

-------------------------------------
WinABX v0.42 test report
03/22/2004 20:52:19

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p e).wav

20:52:32    1/1  p=50.0%
20:52:45    2/2  p=25.0%
20:52:58    3/3  p=12.5%
20:53:09    4/4  p=6.2%
20:53:17    5/5  p=3.1%
20:53:26    6/6  p=1.6%
20:53:34    7/7  p=0.8%
20:53:44    8/8  p=0.4%
20:53:53    9/9  p=0.2%
20:53:54  reset

20:54:01  test finished

-------------------------------------
WinABX v0.42 test report
03/22/2004 20:54:01

A file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap e).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p e).wav

20:54:12    1/1  p=50.0%
20:54:20    2/2  p=25.0%
20:54:28    3/3  p=12.5%
20:54:36    4/4  p=6.2%
20:54:44    5/5  p=3.1%
20:54:52    6/6  p=1.6%
20:55:01    7/7  p=0.8%
20:55:09    8/8  p=0.4%
20:55:19    9/9  p=0.2%
20:55:21  reset

20:55:23  test finished

Comments:
3.90.3 (--alt-preset extreme); artifacts, sound like 'hi-hat close'
3.96b1 (--preset extreme); fatal flaw during initial attack during the first few milliseconds, sounds similar to a 'missing frame header' error, dead give-away not present in 3.90.3. Much clearer than 3.90.3, 'hi-hat close' artifacts not nearly as audible, but still present.
_____________________
Sample used: Lazy_Jones.flac
TEST: 3.96b1 (--preset insane) vs. 3.90.3 (--alt-preset insane)
Encoded with lame.exe/LAMEDrop1.3 (l3maniac) - Decoded with Nero6
Code: [Select]
-------------------------------------
WinABX v0.42 test report
03/22/2004 20:43:52

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap i).wav

20:44:34    1/1  p=50.0%
20:44:43    2/2  p=25.0%
20:44:54    3/3  p=12.5%
20:45:03    4/4  p=6.2%
20:45:12    5/5  p=3.1%
20:45:21    6/6  p=1.6%
20:45:29    7/7  p=0.8%
20:45:38    8/8  p=0.4%
20:45:46    9/9  p=0.2%
20:45:50  reset

20:45:59  test finished

-------------------------------------
WinABX v0.42 test report
03/22/2004 20:45:59

A file: C:\WINDOWS\Desktop\Lazy_Jones (wav).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p i).wav

Start position 00:00.0, end position 00:06.3
20:46:30    1/1  p=50.0%
20:46:41    2/2  p=25.0%
20:46:51    3/3  p=12.5%
20:46:59    4/4  p=6.2%
20:47:08    5/5  p=3.1%
20:47:17    6/6  p=1.6%
20:47:26    7/7  p=0.8%
20:47:35    8/8  p=0.4%
20:47:45    9/9  p=0.2%
20:47:48  reset

20:48:00  test finished

-------------------------------------
WinABX v0.42 test report
03/22/2004 20:48:00

A file: C:\WINDOWS\Desktop\Lazy_Jones 3.90.3 (--ap i).wav
B file: C:\WINDOWS\Desktop\Lazy_Jones 3.96b1 (--p i).wav

20:48:17    1/1  p=50.0%
20:48:25    2/2  p=25.0%
20:48:34    3/3  p=12.5%
20:48:42    4/4  p=6.2%
20:48:51    5/5  p=3.1%
20:48:59    6/6  p=1.6%
20:49:09    7/7  p=0.8%
20:49:17    8/8  p=0.4%
20:49:27    9/9  p=0.2%
20:49:29  reset

20:49:31  test finished

Comments:
3.90.3 (--alt-preset insane); faint artifacts sound like "hi-hat close"
3.96b1 (--preset insane); fatal flaw during initial attack during the first few milliseconds, sounds similar to a 'missing frame header' error, dead give-away not present in 3.90.3. Much clearer than 3.90.3

Reply #22
I am verifying Lazy Jones 3.90.3 aps vs. 3.96b1 ps. I focused on the last 1.5s of the sample:

3.90.3 aps vs. original
8/8 - popping artifacts in the last second of sample, very obvious

3.96b1 aps vs original
unable to ABX - sounds perfect

3.90.3 aps vs. 3.96b1 aps
8/8

edit: unable to ABX 3.96b1 or 3.90.3 at ape


Reply #24
I tested drone_short, 0-1 second:

3.90.3 aps vs. original 0-1s
8/8 - sounds like air being blown

3.96b1 aps vs. original 0-1s
8/8 - much easier to ABX, air sound louder and more noticeable

3.96b1 aps vs. 3.90.3 aps
7/8 - missed last test, ears getting fatigued

3.90.3 sounds better