Public Listening Test [2010], Item Selection, Material to use for the upcoming AAC test |
Mar 13 2010, 15:12
Post
#26
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
OK, guys, I need help here. The number of critical items I can dig up is overwhelming. So, in order of appearance:
halb27, post #3: Your harpsichord 40 item is accepted for testing. The other items you proposed are from "mp3 times" and might not be AAC-critical. They are:
- trumpet_My Prince
- lead-voice
- trumpet
- Là Ou Je Suis Née
- keys_1644ds
- herding_calls

/mnt, post #5: Over the last few months you proposed a nice long list of AAC-critical items. Linchpin is already accepted for testing. Your remaining proposals are:
- Kraftwerk remasters (the Zip file with excerpts you uploaded around Christmas?)
- Show Me Your Spine (any 15-second passage you favor)
- Human Disease
- Hexonxonx
- smothered_hope

IgorC, post #11: Many of the items you proposed to me are actually already in Fraunhofer's internal test set, so they are well known to me. The ones which are not are:
- Creuza
- Spill the blood
- Aquatisme from 48 kbps AAC test
- Descending Darkness
- Girl
- Erase_replace

To all above members: Could you please ABX-HR your respective item lists using the newest nero 1.5.4 -q 0.41 and QT True VBR Q60 and then report to me via personal message
- which item is easiest to ABX vs. the lossless reference, regardless of coder (or in other words, averaged over the two coders), and
- how large the differences between the two coders are for each item, without mentioning the coders (i.e. "huge differences", "both sound bad", etc.)?

That would be a great help!
Thanks,
Chris
-------------------- If I don't reply to your reply, it means I agree with you.
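For anyone preparing the ABX reports requested above, here is a minimal sketch of how the significance of an ABX score can be checked. It is not part of the test procedure described in this thread; it only assumes the usual reading of ABX results, where each trial is a 50/50 guess if no difference is audible, and the function name `abx_p_value` is made up for this illustration.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    Each ABX trial is modeled as a fair coin flip (p = 0.5), so this is the
    upper tail of a binomial distribution.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical scores, only to show how the number behaves.
    for correct, trials in [(8, 8), (12, 16), (9, 16)]:
        print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

A small p-value means the score is unlikely to come from guessing. An item that yields low p-values for both coders would be a candidate for "easiest to ABX".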
|
|
|
|
Mar 13 2010, 17:28
Post
#27
|
|
|
Group: Members Posts: 273 Joined: 18-June 03 Member No.: 7254 |
It seems to me this is more a test of how different encoders react to various killer samples - which is interesting but far less useful than a general audio quality test. You should test a wide selection of musical genres, not statistical anomalies (which all codecs will have) that have no bearing on overall sound quality whatsoever. It seems like an enormous waste of effort. Have I missed something crucial here?
|
|
|
|
Mar 13 2010, 17:35
Post
#28
|
|
Group: Developer Posts: 2983 Joined: 2-December 07 Member No.: 49183 |
|
|
|
|
Mar 13 2010, 18:02
Post
#29
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
For me the idea that codecs have an "overall general quality" is a joke: either you can tell that there is a difference & describe how it sounds different to you, or you can't.
Before I did some ABXing for myself & spent several days learning how lossy music sounds different from lossless music, I used to use terms like "overall general quality" ... nowadays I think that this way of speaking only comes from a lack of understanding of what you are talking about. For me, in the mouth of a newbie, "overall general quality" = placebo.
The only "overall general quality" of a codec I know is a mix of:
1: how many killer samples affect the codec.
2: how bad the distortion is that you can hear within the killer samples that affect the codec.
What is a killer sample? It is a sample that you can ABX, simply. There is no "overall general quality" outside of killer samples because you cannot evaluate the quality of sound that you cannot even ABX. This is especially true at medium/high bitrates because then generic (not specially selected) music will be transparent 99% of the time ... evaluating the quality of transparent samples is nonsense.
I agree that you can speak of the "overall general quality" of a codec at low bitrates (strictly below 128Kbps for modern encoders) because at, say, 96Kbps even random (non-killer) music is likely not to be transparent. Then yes, "overall general quality" can have some meaning, but IMHO that is a special case.
I know that the idea that such a thing as an "overall general quality" exists comes from people speaking of codecs in a very generic way, like "this codec sounds metallic"; with modern codecs this is definitely an abuse of language & a generalization. Don't be fooled by the language: if a codec passes killer samples you can rest easy, generic music will be a walk in the park for the codec.
Edit: I disagree with guruboolez's above opinion, but that's also why I value personal tests that I can reproduce, like those of /mnt, more than public listening tests. /mnt killer samples are often gold to me.
This post has been edited by sauvage78: Mar 13 2010, 18:26
-------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 13 2010, 18:39
Post
#30
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
QUOTE (sauvage78): Edit: I disagree with guruboolez's above opinion, but that's also why I value personal tests that I can reproduce, like those of /mnt, more than public listening tests. /mnt killer samples are often gold to me.
Which is why I like to include some of /mnt's samples in this public test. And: which of guruboolez's posts are you referring to?
Chris
This post has been edited by C.R.Helmrich: Mar 13 2010, 18:41
-------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Mar 13 2010, 19:06
Post
#31
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
1: No, I don't have personal samples for AAC, the simple reason being that I actually don't use lossy at all. I like Nero AAC LC's quality (especially 0.55) but I have a personal problem with AAC not being gapless natively (from MPEG specifications). Stealing other ABXers' samples is the reason why I read this topic.
2: Those linked within lvqcl's post.
This post has been edited by sauvage78: Mar 13 2010, 19:10
-------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 13 2010, 19:20
Post
#32
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
QUOTE (sauvage78): 1: I like Nero AAC LC's quality (especially 0.55) but I have a personal problem with AAC not being gapless natively (from MPEG specifications).
True, the lack of gapless playback standardization is shocking even for me. But there might be some progress on this subject soon.
Chris
-------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Mar 13 2010, 20:00
Post
#33
|
|
|
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779 |
Since all conceivable gapless implementations need the same metadata (encoder delay, actual number of samples), I wouldn't worry too much about it. Apple's implementation is the de facto standard right now. And if there is ever a future formal specification, it will be trivial to copy the existing iTunes metadata into a tag compatible with the new scheme.
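To make the two pieces of metadata named above concrete, here is a minimal sketch of how a player could use an encoder delay and a valid sample count to trim decoded audio. The field layout assumed in `parse_itunsmpb` (space-separated hex values: a reserved word, the delay, the end padding, then the original sample count) is my understanding of the iTunes-style tag and is not stated anywhere in this thread, so treat it as an assumption; both function names are made up for the illustration.

```python
from typing import List, Tuple


def parse_itunsmpb(value: str) -> Tuple[int, int, int]:
    """Return (encoder_delay, end_padding, valid_samples).

    Assumes an iTunes-style gapless string of space-separated hex fields:
    reserved, delay, padding, sample count, (further fields ignored).
    """
    fields = value.split()
    if len(fields) < 4:
        raise ValueError("expected at least four hex fields")
    delay = int(fields[1], 16)
    padding = int(fields[2], 16)
    valid = int(fields[3], 16)
    return delay, padding, valid


def trim_gapless(decoded: List[int], delay: int, valid: int) -> List[int]:
    """Drop the priming samples at the start and the padding at the end."""
    return decoded[delay:delay + valid]


if __name__ == "__main__":
    # Toy example: 3 priming samples, 4 real samples, 2 padding samples.
    decoded = [0, 0, 0, 11, 12, 13, 14, 0, 0]
    print(trim_gapless(decoded, delay=3, valid=4))
```

Whatever container or tag carries them, these two numbers are what a decoder needs in order to join consecutive tracks sample-accurately; without them it can only play the padded stream as decoded.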
|
|
|
|
Mar 14 2010, 01:36
Post
#34
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
Well, the following is about gaps. I know it is off-topic here but I don't know where else to answer:
I did a very short & very simple test after googlebot's post: I encoded my lossless Pink Floyd - 1973 - The Dark Side Of The Moon rip to Nero AAC V1.5.4.0/quality 0.55 with F2K V1.01 & I listened to the transition between tracks 5 & 6 to see if I could hear anything bad, either added silence or a glitch.
CODE
TRACK 05 AUDIO
  TITLE "Money"
  PERFORMER "Pink Floyd"
  INDEX 01 19:24:35
TRACK 06 AUDIO
  TITLE "Us And Them"
  PERFORMER "Pink Floyd"
  INDEX 01 25:56:35
With the tags created during the encoding I couldn't hear any glitch. Then I deleted the tags with Mp3tag v0.45a & re-listened to the transition between track 5 & track 6, & guess what ... now, without tags, there is an audible glitch ... I don't know if this is an issue with Mp3tag, but all I know is that this very simple test (it takes less than 5 min) shows that losing the gapless playback metadata by mistake is very easy with the current Nero trick ... so as long as a simple tag edit can end in losing gapless playback, personally I will not use AAC for music. (I may use it with video, as gaps are not an issue there.)
Even if there is a standard for this one day, & even if it is a trivial task to convert the current metadata trick to this future standard ... it is currently so easy to lose this information that you may have lost it before a more robust standard exists. My dream lossy codec is an MPEG ISO standard codec that achieves the quality of Nero AAC, with native gapless support as good as Vorbis/Musepack (gapless directly in the specification). The current tag trick is not satisfying for me. In the future this issue may lead me to use Vorbis again, even though I know from my listening tests that Nero AAC beats Vorbis quality-wise. At present I only use lossless in order to avoid choosing between plague & cholera ...
This post has been edited by sauvage78: Mar 14 2010, 01:56
-------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 14 2010, 03:26
Post
#35
|
|
|
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779 |
I don't understand the issue. Proper gapless tags were saved. You removed them with a tagging tool that didn't honor them, and the gapless info was lost. The situation would be no different if the tags were written in a yet-to-be-proposed ISO format, as long as the tool you are using doesn't honor them.
PS I just realized: sorry for the ongoing off-topic debate. Feel free to remove it from the thread. This post has been edited by googlebot: Mar 14 2010, 03:28 |
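The point about a tool "honoring" gapless tags can be sketched as follows. This is only a toy model of a tag-removal feature, not how Mp3tag or any real tagger works; the tag names and both function names are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical name for the field that carries delay/length information.
PLAYBACK_FIELDS = ("gapless_info",)


def strip_all_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """A tool that does not honor playback metadata: every tag is removed."""
    return {}


def strip_descriptive_tags(
    tags: Dict[str, str], protected: Iterable[str] = PLAYBACK_FIELDS
) -> Dict[str, str]:
    """A tool that honors playback metadata: only descriptive tags are removed."""
    keep = set(protected)
    return {name: value for name, value in tags.items() if name in keep}


if __name__ == "__main__":
    tags = {
        "title": "Money",
        "artist": "Pink Floyd",
        "gapless_info": "delay and sample count as written by the encoder",
    }
    print(strip_all_tags(tags))
    print(strip_descriptive_tags(tags))
```

In this model the storage format makes no difference: the information survives only if the tool treats it as something other than an ordinary, deletable tag.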
|
|
|
Mar 14 2010, 03:43
Post
#36
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
Well, indeed, I obviously deleted the metadata tags manually, but the issue is that with other codecs that have native gapless support you cannot delete this information as easily with a simple tag editor. So on the one hand many people (incl. newbies) edit their tags; on the other hand very few people (advanced users) are aware of the gap problem until they suddenly hear a glitch. This disproportion leads to the conclusion that many unaware AAC users might sooner or later lose their gapless metadata, which is "volatile". Indeed it is not really an issue for people like us who know which mistakes to avoid in order to keep their metadata, but it is not natural for beginners to think that editing tags can hurt the playback of their files. This tricky tag solution is just not friendly to newbies & not fully satisfying IMHO. Feel free to disagree, that's just my opinion.
Edit: Even if from a technical point of view the info wouldn't be very different if it were in the specification, the metadata would be buried deeper in the files: embedded inside, instead of wrapped around; undeletable & supported by default by all decoders. For me gapless playback is not something "optional"; it should be at the heart of any modern codec. Despite its great audio quality, with regard to gaps AAC is not really a modern codec. It seems to me that nothing evolved for gaps between mp3 & AAC, which means that so far all audio codecs designed by MPEG have been designed with video users in mind & not audiophiles.
This post has been edited by sauvage78: Mar 14 2010, 04:35
-------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 14 2010, 05:16
Post
#37
|
|
|
Group: Members Posts: 195 Joined: 29-May 07 Member No.: 43837 |
Did you report this bug to Mp3tag? (I assume this doesn't occur when editing tags with iTunes.) I also have some issues with the iTunes AAC tag format, including the lack of support for multiple items in a tag (such as multiple artists), but it is the standard now, and I doubt another (lossy) format will succeed it. (Had it arisen now, the tagging format chosen would probably have been XML.)
|
|
|
|
Mar 14 2010, 19:34
Post
#38
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
QUOTE (sauvage78): 1: No, I don't have personal samples for AAC, the simple reason being that I actually don't use lossy at all.
But here (very helpful test, by the way!), you mentioned a Ginnungagap item. Is that AAC-critical? If yes, can you point us to that one, or upload it here?
Thanks,
Chris
-------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Mar 14 2010, 20:12
Post
#39
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
Ginnungagap (a sample from a Therion song) is a very noisy critical item for lossywav. I haven't seriously tested it with AAC, but I doubt it would be as critical, as the encoding techniques used are very different. I honestly don't recall whether I tested it with a classic lossy encoder; it is likely that I quickly did but found nothing, & that this is why I gave up the idea of cross-testing lossywav killer samples with classic lossy encoders ... but at the same time, when testing plenty of DCT killer samples on lossywav, I found that the Abfahrt Hinwil & Fool's Garden samples (which were originally found with classic lossy encoders) were also critical for lossywav ... so there is definitely the possibility that lossywav killer samples affect DCT codecs. But for Ginnungagap I honestly don't know; it was found by Martel on lossywav & has remained a lossywav-specific test sample. It could be tested, but as with all listening tests it takes time. I am not sure it is really worth it because, even if I think that this sample might be hard to encode at 96Kbps, it has very little chance of being as interesting as Harlem & Autechre at 128Kbps.
This post has been edited by sauvage78: Mar 14 2010, 20:13 -------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 15 2010, 01:14
Post
#40
|
|
Group: Members Posts: 677 Joined: 4-May 08 Member No.: 53282 |
TechVsLife:
I just tested the broken gapless playback problem with Mp3tag V2.46 in case it was already fixed; sadly the problem is still there, so I sent a PM with a link to post #34 to Florian.
Edit: For fun I tested with aoTuVb5.7 ... with or without tags, no glitch indeed. It shows that audio quality/compression is not everything; features are very important too.
This post has been edited by sauvage78: Mar 15 2010, 02:07
-------------------- CDImage+CUE
Secure [Low/C2/AR(2)] Flac -4 |
|
|
|
Mar 15 2010, 02:15
Post
#41
|
|
|
Group: Members Posts: 195 Joined: 29-May 07 Member No.: 43837 |
@sauvage78: thanks. unlike the ms/apple bureaucracy, florian does correct bugs quickly (otoh, ms/apple have infinite wealth and life).
|
|
|
|
Mar 19 2010, 23:37
Post
#42
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
For public (and personal) reference, some more potential items from LAME 3.96 tuning days (some have already been mentioned here):
http://www.hydrogenaudio.org/forums/index....showtopic=19882 I also re-uploaded the abovementioned Mandylion item in our accompanying upload thread. Chris -------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Mar 20 2010, 11:00
Post
#43
|
|
|
Group: Members Posts: 288 Joined: 14-August 06 Member No.: 34027 |
QUOTE: It seems to me this is more a test of how different encoders react to various killer samples - which is interesting but far less useful than a general audio quality test. You should test a wide selection of musical genres, not statistical anomalies (which all codecs will have) that have no bearing on overall sound quality whatsoever. It seems like an enormous waste of effort. Have I missed something crucial here?
If the "killer samples" are from readily available music, they are valid. If the "killer samples" are very rare one-offs, like some guy playing three notes on a violin in his bathroom, they are invalid. Unless I misunderstood and this test is not about music? |
|
|
|
Mar 20 2010, 11:37
Post
#44
|
|
|
Group: Developer Posts: 618 Joined: 6-December 08 From: Erlangen Germany Member No.: 64012 |
I don't want to repeat myself so much, so please start with this (posts #195-200).
Who says that killer samples must stem from readily available music? They must be readily available (CD or download), but not necessarily what we consider music. Plus, some musical pieces (e.g. Jazz) have isolated instruments in them, so it's not that far off. And guess why we have these items in our list - because solo instruments can reveal artifacts that spectrally complex music can't reveal. Plus plus, only half of the samples used in MPEG tests are musical pieces. The rest are instruments from the SQAM CD, speech recordings, etc. Chris -------------------- If I don't reply to your reply, it means I agree with you.
|
|
|
|
Mar 21 2010, 23:34
Post
#45
|
|
|
Group: Members Posts: 288 Joined: 14-August 06 Member No.: 34027 |
QUOTE: 2. Samples. Different styles of music, different levels of difficulty, pointing issues etc....? To be discussed here or in separate topic.
Chris, I was working from the above, so I apologize for missing that the sample selection had migrated away from a broad range. While I agree that 128kbps in modern codecs is substantially better than it was 5 years ago, I don't agree that 128kbps is too high for a valid test based on a broad range of samples including mainstream music. I'll have to go along with guruboolez: in a test based on rare extreme samples, my interest is nil. |
|
|
|
Mar 22 2010, 00:13
Post
#46
|
|
|
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779 |
As far as I understand, the result of a "broad range" 128kbit/s test can be known in advance: "mostly transparent with few exceptions". That's niler than nil.
|
|
|
|
Mar 22 2010, 02:06
Post
#47
|
|
|
Group: Members Posts: 195 Joined: 29-May 07 Member No.: 43837 |
QUOTE (C.R.Helmrich): Who says that killer samples must stem from readily available music? They must be readily available (CD or download), but not necessarily what we consider music.
I'm a layman, but fwiw, I'm somewhere in the middle of the two camps: I think there is value in going for killer or extreme samples, because some people use lossy formats as their only formats, i.e. for both casual portable and more demanding at-home listening, so there's a need to test for the possibility of glaring artifacts (since they have no pure archive format as a backup). (That might also affect individual decisions about one's need for lossless formats.)
However, I don't see a purpose in picking non-music or super-artificial samples EXCEPT and INSOFAR as they help developers fine-tune their encoders for non-artificial music samples. Lossy compression is always lossy with a purpose; otherwise, you couldn't decide which bits you can afford to lose. I don't think lossy music compression should be construed as SOUND compression, i.e. faithful reproduction of any sound or noise, because I assume that would defeat some of the techniques used, which depend on harmonics, and because the reason people use mp3/m4a compression is the appreciation of music and songs, not the copying of arbitrary sounds. Even speech compression is better served by speech-specific codecs.
To me, extreme metal and synthetic music start to enter a non-music world; that may be taste only, and others might want a codec geared precisely to reproducing that genre very faithfully. But if there has to be a tradeoff between faithfully reproducing "natural" music and, say, the most extreme amusical/synthetic samples, I'd say there's great justification for gearing it to the faithful reproduction of music (or, if you will, traditional music).
--Of course, if there's no tradeoff, there's no problem here, but I assume that there's some correlation between the character of the killer samples we see and the difficulty of encoding them. Also a caveat: there's some traditional music that is difficult to encode (harpsichords), so it is somehow "unnatural" from the point of view of the encoder (--if the encoder had a musical taste). I assume the lossy encoders can to some extent "know" what genre they are facing, speech, heavy metal, etc., and switch techniques accordingly, though apparently none of them are good enough to use all the bits they need for the killer samples, perhaps because of the constraints imposed by some common lossy techniques that work so well for 99% of music. |
|
|
|
Mar 22 2010, 04:06
Post
#48
|
|
|
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779 |
QUOTE (TechVsLife): I assume the lossy encoders can to some extent "know" what genre they are facing, speech, heavy metal, etc., and switch techniques accordingly, though apparently none of them are good enough to use all the bits they need for the killer samples, perhaps because of the constraints imposed by some common lossy techniques that work so well for 99% of music.
That's usually not happening. Every sample faces the same psy model. It will be in different states as a function of the sample's context, but that's all. There is usually no higher-level mode switching or genre detection going on. A local context is all that's needed for the encoder to make optimal decisions. 2-pass encoding is an exception, but that doesn't do anything that could be called "genre detection" either.
And it's not a bug, it's a feature. The code would be a mess to maintain, and tuning a generalized solution makes much more sense. It would be different if, for example, one encoder for both guitar and avant-garde electronic music were a contradictory design goal, but it's not. The design goal is exploiting your ear and auditory system, for whatever one throws at it.
I would agree not to include any synthetic samples that were created specifically to fuck a specific implementation. With enough knowledge about a piece of code, there will probably always be some way to exploit it. But that's not true for any of the submitted samples, as far as I can see. Some of them may sound strange to your ears, but it is an encoder's job to fool them anyway and be transparent.
And LAME, Fraunhofer, Nero, and Apple have become so good that I really don't see much sense in yet another listening test that turns out as transparent for item 1-14, almost transparent for item 14-15. We know that already, and 1-14 will be a major pain to test, which would frustrate many potential listeners. So why not go for hard nuts only, and see which encoder can crack the most?
This post has been edited by googlebot: Mar 22 2010, 04:25 |
|
|
|
Mar 22 2010, 06:09
Post
#49
|
|
|
Group: Members Posts: 195 Joined: 29-May 07 Member No.: 43837 |
QUOTE (googlebot): A local context is all that's needed for the encoder to make optimal decisions. 2-pass encoding is an exception, but that doesn't do anything that could be called "genre detection" either.
Thank you, I don't know how the encoders do their magic. At least with the local context, it looks like they still "know" to throw more bits at certain things and not at others, which ends up corresponding to higher bitrates on e.g. heavy metal music. OK, not genre detection at all, but a detection of local difficulty/hardness that, if sustained over the piece, can perhaps correspond in some way to genre. I also didn't know the same rules were used throughout, with the encoder switching to more bits under the same rules depending on the local context.
QUOTE (googlebot): It would be different if, for example, one encoder for both guitar and avant-garde electronic music were a contradictory design goal, but it's not. The design goal is exploiting your ear and auditory system, for whatever one throws at it.
Here I'm not sure I agree. If it turns out that encoding efficiently for avant-garde electronic music (lowest bitrate, highest quality) produces worse quality than encoding efficiently for guitar music, then it seems there wouldn't in fact be one design goal (i.e. as a practical matter), regardless of intention. Now that, I take it, is not the case. But it could be the case, e.g. some sounds are not easily reduced to a score (sheet music): "sheet music" encoding hasn't been designed for efficiently reproducing all sounds (likewise MIDI). Some compressors work better on text formats, some on binaries and jpegs. Even though we would prefer there to be one compressor that worked best for all, we are forced to make a choice (or the compression engine decides on the fly). It could be that certain kinds of sounds are more efficiently compressed by one kind of encoder than another.
This post has been edited by TechVsLife: Mar 22 2010, 06:12 |
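The lossless-compressor analogy above can be tried out directly. The following is a minimal sketch using Python's standard `zlib` module; the two inputs are made up for the illustration, and it says nothing about perceptual audio coding, only that one general-purpose lossless compressor behaves very differently depending on how much regularity the data contains.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly regular, text-like input: one sentence repeated many times.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 2000

# Noise-like input of the same length, from a seeded pseudo-random generator.
rng = random.Random(0)
noise_like = bytes(rng.getrandbits(8) for _ in range(len(text_like)))

if __name__ == "__main__":
    print(f"text-like : {compression_ratio(text_like):.4f}")
    print(f"noise-like: {compression_ratio(noise_like):.4f}")
```

The repeated text shrinks to a small fraction of its size, while the noise-like bytes barely shrink at all, which is the kind of data dependence the analogy refers to.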
|
|
|
Mar 22 2010, 13:04
Post
#50
|
|
|
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779 |
QUOTE (TechVsLife): Some compressors work better on text formats, some on binaries and jpegs. Even though we would prefer there to be one compressor that worked best for all, we are forced to make a choice (or the compression engine decides on the fly).
That's a completely different case from (perceptual) compression of audio. Binary (non-audio) data can have regularity (and also complexity) several orders of magnitude larger than usual audio data. Binary (non-audio) data can have repetitive patterns spanning larger ranges than what could be found by brute-force search. So it can make sense to tune a lossless data compressor for specifically structured patterns out of the infinite problem space. With the exception of looped electronic music, the same is not usually true for audio.
There are also perceptual encoders tuned for special use cases, like speech codecs. They have some modifications beyond what a general-purpose encoder would do on the fly. But their goal isn't necessarily transparency, but sounding good enough at very low rates*. For general-purpose encoders, which target transparency, there is no "on the fly" mode switching. All such an encoder does is look at how many bits it still has available at the current position and what its estimates for the audibility of each component in the current window are.
* Especially lower sample rates, for which usual block sizes are insufficient.
This post has been edited by googlebot: Mar 22 2010, 13:45 |
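The last step described above (remaining bits plus an audibility estimate per component) can be illustrated with a toy greedy allocator. This is a deliberately simplified sketch and not how LAME, Nero, Apple or Fraunhofer encoders actually work: the per-band "noise above mask" figures are invented inputs, the function name is hypothetical, and the only outside fact used is the textbook rule of thumb that one extra quantizer bit lowers quantization noise by roughly 6 dB.

```python
import heapq
from typing import List

DB_PER_BIT = 6.02  # rule of thumb: one more bit lowers quantization noise by ~6 dB


def allocate_bits(noise_above_mask_db: List[float], budget: int) -> List[int]:
    """Greedily give bits to whichever band currently has the most audible noise.

    noise_above_mask_db: assumed estimate, per band, of how far the quantization
    noise sits above the masking threshold (values <= 0 mean inaudible).
    budget: number of bits still available for this window.
    """
    bits = [0] * len(noise_above_mask_db)
    # Max-heap via negated values: the band with the most audible noise is on top.
    heap = [(-level, band) for band, level in enumerate(noise_above_mask_db)]
    heapq.heapify(heap)
    while budget > 0 and heap:
        negated_level, band = heapq.heappop(heap)
        if -negated_level <= 0:
            break  # even the worst band is estimated inaudible: stop spending bits
        bits[band] += 1
        budget -= 1
        heapq.heappush(heap, (negated_level + DB_PER_BIT, band))
    return bits


if __name__ == "__main__":
    # Hypothetical window with three bands and ten spare bits.
    print(allocate_bits([12.0, 0.0, 3.0], budget=10))
```

The same rule runs for every window regardless of genre. A "hard" passage simply presents more bands with audible noise and therefore consumes more of the budget, which matches the point that there is no mode switching involved.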
|
|
|