Most 'true' way to de-emphasize CD image, Remove pre-emphasis - possible and best ways? |
Oct 26 2010, 18:32
Post #51
Group: Members Posts: 514 Joined: 1-November 06 Member No.: 37047
QUOTE
2. FLAC has a very imperfect method of expressing the filtering efficiently. In fact, it makes no special attempt to, since in the general case there is no utility in doing so, and even in this specific case, the prediction (to give a perfect result, recovering 8 zeroed LSBs) would have to be far more accurate than anything else it ever achieves.

It seems that you agree with the explanation I proposed earlier:

QUOTE (knutinh)
If the 8 LSBs can be "created" as a function of the last N input samples, and this function is held constant over a file, then a predictor looking for autocorrelation can in principle find that function (or the convolution of the original source spectrum and the filter). Once you have that function you can do predictions and transmit only the prediction residue. There are surely practical constraints that can prevent this (buffer sizes, processing power, truly random dither etc.), but no one has elaborated on those.

So perhaps the linear prediction (somehow the words Yule-Walker flash in the back of my head?) is imperfect. Perhaps it handles only the 8 MSBs. Perhaps it has too short a buffer. Perhaps it does only a partial search through candidate coefficients.

QUOTE
If you take the 24 bits with 8 zero LSBs, and add a DC offset of 000000000000000010101011 to it, thus breaking the "perfect" trick FLAC can use on those 8 zero LSBs, you'll probably find the compression differential between that and the filtered version narrows dramatically (unless FLAC is smart enough to remove it - I haven't tried). This is despite the DC offset carrying exactly 8 extra bits of information for the entire file over and above the 24 bits with 8 zero LSBs. FLAC isn't a form of artificial intelligence trying to find the absolute smallest lossless representation of a given set of data, including reverse engineering any algorithms that may have "bloated" that data; it's just trying to do a reasonable job in a reasonable time.

I was thinking about that. If other trivial datasets inserted into the 8 LSBs also cause a large increase in file size, then it seems to suggest that FLAC simply isn't very clever with regard to the 8 LSBs of a 24-bit stream. There may be very good reasons why it isn't.

Throughout this discussion I have had a crude mental model of lossless codecs in which they try to predict future samples from a historical buffer, transmitting only the (slowly varying) model plus the prediction residue (further compressed using entropy coding). If this is indeed a useful reference for this kind of high-level discussion, it really boils down to the statistics of the input signal and the capabilities of the predictor, doesn't it?

-k
Oct 27 2010, 10:14
Post #52
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
This does not make sense. A filter does not create information (look up Shannon).

That's true under the condition that we have machines operating on real numbers, which we don't. Instead we have samples of finite width, which, when processed, can lead to new samples of theoretically infinite width, which we represent by rounding.

I fail to see how this should lead to dramatically worse performance. If rounding appears uniformly random, it is a noise component which cannot be compressed, fair enough -- but are you altering the 24-16=8 bits? No. (Are you ever rounding off more than 1 LSB? That's not a rhetorical question, I don't know the answer.) And the finite-wordlength constraint merely means that there are more functions fitting, modulo the rounding. So in principle, it should be possible to compress to something fairly near the original size (17/16 if round-off and dither are in the LSB only, right?).

Now my original inquiry was due to the observation that files became > 50% larger -- up to 70%, actually. That corresponds to (more than!) a full 24-bit recording (27 bits in the 70% case). Now assume you have
(A) 24 bits of music (resp. 27), FLACed, vs
(B) the file in (A), which you crop down to 16 bits (discarding 1/3 of the information), adding a little dither, applying a filter (adding no information although representing it in a 24-bit file) and rounding off. Then you FLAC it.
Is it at all reasonable that file (B) should be as large as file (A)?

This post has been edited by Porcus: Oct 27 2010, 10:18
--------------------
geocities.com/hydrogenaudio: http://goo.gl/tqYZj
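The back-of-envelope figures in this post can be checked directly. The sketch assumes, as the post implicitly does, that compressed size scales roughly in proportion to the effective number of bits per sample; `effective_bits` is a name introduced only for this illustration.

```python
def effective_bits(base_bits, size_ratio):
    """Bits per sample implied by a file-size ratio, assuming size ~ bit depth."""
    return base_bits * size_ratio


print(effective_bits(16, 1.5))      # files 50% larger than the 16-bit FLAC
print(effective_bits(16, 1.7))      # files 70% larger
print(effective_bits(16, 17 / 16))  # one extra bit for round-off and dither
```

Under that assumption a 50% increase already corresponds to a full 24 bits per sample, and a 70% increase to about 27, which is the comparison the post is drawing.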
Oct 27 2010, 10:19
Post #53
ReplayGain developer Group: Developer Posts: 4589 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409
Yes, but the design is such that it works "best" with the vast majority of content out there, which means it might not do very well on a given special case. The predictor looks at the sample values as a whole (i.e. all 24 bits); it's never looking at the 8 LSBs specifically (or the 16 MSBs!) and seeing if they can be predicted from anything.

Plus I've read (don't know if it's true) that the design of FLAC isn't especially focussed on 24/96 audio. Certainly it's a practical implementation of lossless coding; it's not meant to reach for perfection in terms of compression ratio.

FWIW, IIRC, the WavPack author said that audio which had been created from another source (e.g. by simple normalisation, or I suppose by filtering; I think the specific case he mentioned was normalisation without dither, which meant certain sample values would be completely unused), where that other source would have required fewer bits to store, wasn't unheard of, but he didn't think it was worth building something to reverse engineer the transformation. Sometimes it would noticeably reduce the bitrate, but the encoding effort would slow things down too much. (With apologies to David if I've mis-remembered this! I searched but couldn't find the quote.)

Cheers, David.
Oct 27 2010, 11:04
Post #54
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779
QUOTE
(Are you ever rounding off more than 1 LSB? That's not a rhetorical question, I don't know the answer.)

Rounding, in the strict sense of the word*, only affects the 1 LSB. But when you process 16-bit values at 32- or 64-bit precision, you get true 32- and 64-bit results. Storing them in 24 bits will give 8 unique bits over the 16-bit source, not some trivially correlated pattern.

The thing is, you guys asked for very special-case handling, and why the predictor isn't able to detect the pattern automagically. It is just too special to scan for specifically. We also can't use a universal predictor covering all possible cases. A complete implementation would probably itself be unpredictable (halting problem) or at least have impractical exponential costs. As 2Bdecided put it very nicely: "FLAC isn't a form of artificial intelligence".

That FLAC has considerably worse performance at 24 bits than at 16 bits is true, but not necessarily related to its effectiveness in the "reconstruct n LSBs as a function of the l-n MSBs in the last x samples" special case.

*Instead of saying to 'round' a 32-bit value to 24 bits, as I may have done, the term 'word length reduction' would have been more appropriate.

This post has been edited by googlebot: Oct 27 2010, 11:28
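A small sketch of the point about the extra 8 bits: pad 16-bit samples to 24 bits, run them through any filter with fractional coefficients, and the low byte stops being zero. The two-tap smoothing filter, the sample values and the helper names are assumptions chosen to keep the example short; the thread itself used SoX's de-emphasis filter.

```python
def low_bytes(samples):
    """Set of values seen in the 8 least significant bits."""
    return {s & 0xFF for s in samples}


def smooth(samples):
    """y[n] = 0.7*x[n] + 0.3*x[n-1], rounded to the nearest integer."""
    out = []
    prev = 0
    for x in samples:
        # Integer arithmetic: (7x + 3*prev)/10, with +5 for round-half-up.
        out.append((7 * x + 3 * prev + 5) // 10)
        prev = x
    return out


pcm16 = [1000, -2001, 3003, 77, -1234, 555]   # made-up 16-bit samples
padded24 = [s * 256 for s in pcm16]           # 16 bits padded to 24
filtered24 = smooth(padded24)

print(low_bytes(padded24))     # only zero: the LSBs carry nothing yet
print(low_bytes(filtered24))   # after filtering the LSBs are populated
```

The low byte of each output is fully determined by the 16-bit input and the filter, yet to a predictor working on whole 24-bit sample values it looks much like noise.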
Oct 27 2010, 12:38
Post #55
Group: Members Posts: 514 Joined: 1-November 06 Member No.: 37047
I guess that whatever per-channel correlation there is in a music file is generally found mainly in the 8 MSBs (no matter what the bit depth happens to be)?
-k
Oct 27 2010, 12:42
Post #56
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779
Why exactly 8?
Oct 27 2010, 15:35
Post #57
ReplayGain developer Group: Developer Posts: 4589 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409
QUOTE
Rounding, in the strict sense of the word*, only affects the 1 LSB.

Go on, think that one through again! (Or round 8999.9 to the nearest integer and see how many of the digits change.) You could just say "truncate". Apart from this special case, it's often just an academic difference with audio, since rounding = add 0.5 then truncate. A 0.5 LSB DC offset is usually irrelevant - except here, I suppose.

Back to the original topic: if you want a lossless but efficient method, I'm sure it's possible to calculate how many bits you need to store optimal 16-bit + de-emphasis without creating a detectable difference. I bet 20 is enough, in which case, dump the last four bits. I'm not even going to say dither, though I suppose you could. Either way, it'll bring the FLAC bitrate down.

You could say that 20 isn't as good as 24. True. 24 isn't as good as 32. 32 isn't as good as 64. But you have to stop somewhere. 16 is already enough IMO, but if you're using lossless you might want some extra headroom, no matter how irrational or at least unimportant.

Cheers, David.
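Both remarks in this post, that rounding is "add 0.5 then truncate" and that a carry can change far more than the last digit, can be shown on integer samples, together with the suggested reduction from 24 to 20 bits. The helper names are made up for the example, and clipping at the top of the 24-bit range is deliberately left out.

```python
def truncate_lsbs(sample, drop_bits):
    """Zero the lowest drop_bits bits (rounds toward minus infinity)."""
    return (sample >> drop_bits) << drop_bits


def round_lsbs(sample, drop_bits):
    """Round to a multiple of 2**drop_bits: add half a step, then truncate."""
    half = 1 << (drop_bits - 1)
    return truncate_lsbs(sample + half, drop_bits)


print(round(8999.9))                     # every decimal digit changes
print(hex(truncate_lsbs(0x12345F, 4)))   # 24 -> 20 significant bits
print(hex(round_lsbs(0x12345F, 4)))      # rounded instead of truncated
print(hex(round_lsbs(0x0FFFFF, 4)))      # the carry ripples through every bit
```

After either operation every sample has four zero LSBs, so an encoder that detects shared trailing zeros can treat the stream as 20-bit.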
Oct 27 2010, 15:42
Post #58
Group: Members Posts: 514 Joined: 1-November 06 Member No.: 37047
Oct 27 2010, 17:33
Post #59
Group: Members Posts: 698 Joined: 6-March 10 Member No.: 78779
Oct 27 2010, 20:33
Post #60
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
QUOTE
Why exactly 8?
Could just as well say 1 or 15. The further into the noise/dither floor you are trying to predict, the harder it is, right?

That's absolutely a point (with a reservation below), if it is true. It is an easy test (which takes some time): gather a test corpus of 16-bit recordings (or, if you can find one, a representative test corpus of 24-bit recordings). Truncate down by 1 bit, 2 bits, ... down to some practicality bound (8 is a round figure?). (One could do the same with dithering, but that should simply shift the effective number of bits, right? 16 bits dithered down to 15 would be roughly equivalent -- in information content -- to somewhere between 15 and 16 of the original 16? So one would expect a one-bit difference between 16-to-15-with-dithering and 16-to-14-with-dithering?)

But then the reservation: if the file is originally 16 bits, and becomes 24 bits just by padding with zeroes and applying a filter, then the last 8 bits are not in the noise floor, are they?

This post has been edited by Porcus: Oct 27 2010, 20:38
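The proposed test could be prototyped roughly as follows. This is a sketch under loud assumptions: instead of running a real encoder, it uses the empirical entropy of a first-difference residue as a crude stand-in for compressed size, and the synthetic test signal replaces a real corpus. All function names are hypothetical; a serious version would truncate real WAV files and run FLAC on each.

```python
import math
from collections import Counter


def truncate(samples, drop_bits):
    """Discard the lowest drop_bits bits of every sample (no dither)."""
    return [s >> drop_bits for s in samples]


def entropy_bits(values):
    """Empirical entropy, in bits per value, of a list of symbols."""
    counts = Counter(values)
    total = len(values)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def first_difference(samples):
    """Residue of the simplest predictor: 'next sample equals this one'."""
    return [b - a for a, b in zip(samples, samples[1:])]


def sweep(samples, max_drop=8):
    """Residue entropy after dropping 0..max_drop bits from each sample."""
    return [(k, entropy_bits(first_difference(truncate(samples, k))))
            for k in range(max_drop + 1)]


# Synthetic stand-in for a 16-bit recording: a tone plus a small wobble.
signal = [int(20000 * math.sin(n / 7.0)) + (n * 7919) % 13 for n in range(2000)]
for drop, bits in sweep(signal):
    print(drop, round(bits, 2))
```

If the bits removed are mostly noise, each dropped bit should lower the estimate by close to one bit per sample; a smaller drop would suggest those bits were partly predictable.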
Oct 28 2010, 01:09
Post #61
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
FWIW, I checked this with multiple lossless encoders. No surprises in terms of relative performance -- the relative file sizes are about what you would expect in a 16-bit test. So all compress about equally badly (relative to their assumed quality) -- they all returned file sizes greater than the original 16-bit WAV.

I.e.: given the de-emphasis algorithm and a single flag indicating that it should be applied, even WAV -- without any compression whatsoever -- would outperform today's state-of-the-art lossless algorithms (on this fairly hardrock-ish test corpus, that is). That's the state of today's art.

Procedure: The 16-bit files were converted to 24 bits and de-emphasized by SoX. Each album was converted to one single WAV file by foobar2000. The test corpus with 24-bit file sizes is listed below; total file size 11 286 552 576 bytes (11 286 517 144 according to foobar2000, strange since it is not divisible by 3 ...). Then each WAV file was compressed with a range of encoders (Monkey's Audio and TAK via their GUI applications, ofr.exe via CLI, while foobar2000 handled FLAC (1.2.1) and WavPack -- the latter unbearably slow, taking hours).

In order of file size:
7 761 846 272 bytes - flac -8
7 672 954 880 bytes - WavPack high, x5
7 629 369 344 bytes - Monkey's extrahigh
7 582 380 032 bytes - TAK -p4
7 533 199 360 bytes - ofr extranew

For comparison, the original 16-bit signal, approximated by removing 1/3:
11 286 552 576 bytes * 2/3 = 7 524 368 384 bytes (file size)
11 286 517 144 bytes * 2/3 = 7 524 344 763 bytes (audio size)

Test corpus with 24-bit WAV file sizes:
  779 289 380  Backstreet Girls - Boogie Till You Puke
  607 080 644  Black Sabbath - Black Sabbath [Castle orig
  679 535 180  Black Sabbath - Black Sabbath, Vol. 4 [Castle orig
  635 569 244  Black Sabbath - Master of Reality [Castle orig
  704 629 844  Carnivore - Retaliation
1 099 159 028  Ebba Grön - Ebba Grön, 1978-1982
  762 788 924  In Slaughter Natives - Enter Now the World
  814 826 924  Leonard Bernstein - West Side Story
  938 924 324  Lifelover - Konkurs
  992 003 084  MZ.412 - Burning the Temple of God
  933 198 380  MZ.412 - In Nomine Dei Nostri Satanas Luciferi Excelsi
  803 773 700  Ordo Equilibrio - Reaping the Fallen...The First Harvest
  864 977 444  Raison d'Etre - Prospectus I
  670 761 044  Roger Waters - The Pros and Cons of Hitch Hiking
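The comparison against the uncompressed 16-bit size can be reproduced from the figures in this post. The sketch below uses only the byte counts quoted above; the variable names are illustrative, and the 16-bit size is the same two-thirds approximation the post uses.

```python
wav24_bytes = 11_286_552_576          # total size of the 24-bit WAV files
wav16_bytes = wav24_bytes * 2 // 3    # 16-bit size approximated as two thirds

compressed = {
    "flac -8": 7_761_846_272,
    "WavPack high, x5": 7_672_954_880,
    "Monkey's extrahigh": 7_629_369_344,
    "TAK -p4": 7_582_380_032,
    "ofr extranew": 7_533_199_360,
}

for name, size in compressed.items():
    overhead = size / wav16_bytes - 1
    print(f"{name:20s} {overhead:+.2%} vs 16-bit WAV, "
          f"{size / wav24_bytes:.1%} of 24-bit WAV")
```

Every encoder lands above the uncompressed 16-bit size, by roughly 3% for FLAC down to about 0.1% for OptimFROG, which is the point of the post.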
Oct 28 2010, 01:17
Post #62
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
For reference:
7 761 846 272 bytes: flac -8 (24 bits)
4 043 931 648 bytes: same, but set to 16-bit output (no dithering)
The first 16 bits take up 52.1% of the file size; the last 8 bits (probably with dithering?) take up 47.9%.
Oct 28 2010, 03:45
Post #63
Group: Members Posts: 42 Joined: 6-October 10 Member No.: 84390
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much.
Close?
Oct 28 2010, 10:05
Post #64
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
QUOTE
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much. Close?

No, it dictates application of a certain EQ curve which attenuates the treble. Your description fits ReplayGain, which exists for file formats like FLAC and WavPack.
Nov 1 2010, 04:09
Post #65
Group: Members Posts: 42 Joined: 6-October 10 Member No.: 84390
QUOTE
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much. Close?

QUOTE
No, it dictates application of a certain EQ curve which attenuates the treble.

OK, still not understanding it. Any recommended reading?
Nov 1 2010, 05:10
Post #66
Group: Super Moderator Posts: 9268 Joined: 1-April 04 Member No.: 13167
Have you tried our wiki? It's generally the first place to look...
http://wiki.hydrogenaudio.org/index.php?title=Pre-emphasis -------------------- Everything sounds the same until it is proven otherwise.
Nov 1 2010, 05:18
Post #67
Group: Members Posts: 307 Joined: 19-April 08 From: LA Member No.: 52914
QUOTE
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much. Close?

QUOTE
No, it dictates application of a certain EQ curve which attenuates the treble.

QUOTE
Ok, still not understanding it. Any recommended reading?

Pre-emphasis and de-emphasis have been around a long time and are used in FM radio, analog TV audio, analog tape and LPs. The theory is that since the high-frequency components are typically lower in amplitude than the lows and mids, we can use a little of the unused real estate by boosting the highs on the transmit/record side and lowering them at receive/playback. Any noise introduced after pre-emphasis will be attenuated along with the excessive highs. This restores the response and reduces the noise.

When CDs were introduced, some thought pre-emphasis would be a good idea to reduce quantizing noise, and while a few discs were made with pre-emphasis, most were not, and for many years, none. On the CD, the pre-emphasis is an analog high boost ahead of the A-D converter. This also sets a flag bit on the CD to activate the filter during playback if needed. During playback, an analog filter after the DAC restores the response and reduces the quantizing errors.

If you extract the digital audio from the disc during a rip session, you now have boosted highs but no analog filter to correct them. In theory the analog filter after the DAC is the best, but in practice digitally processing the stream is certainly acceptable and possibly more accurate, as the digital filter is not subject to 5% or even 1% component tolerances. They tell me SoX works well, and I'm happy with the CoolEdit/Audition filter settings I used - all of 2 times.

G²

This post has been edited by Glenn Gundlach: Nov 1 2010, 05:20
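For readers who want to see what digital de-emphasis amounts to, here is a minimal sketch. It assumes the standard CD emphasis time constants of 50 µs and 15 µs (not stated in the post) and a 44.1 kHz sample rate, and it applies a plain bilinear transform to the first-order analog filter. It is not SoX's or Audition's implementation, and an uncorrected bilinear design deviates somewhat from the analog curve close to the Nyquist frequency. The function names are made up for the example.

```python
def deemphasis_coefficients(sample_rate=44100, t1=50e-6, t2=15e-6):
    """Bilinear transform of the analog filter H(s) = (1 + s*t2) / (1 + s*t1)."""
    k = 2.0 * sample_rate
    a0 = 1.0 + k * t1
    b0 = (1.0 + k * t2) / a0
    b1 = (1.0 - k * t2) / a0
    a1 = (1.0 - k * t1) / a0
    return b0, b1, a1


def deemphasize(samples, sample_rate=44100):
    """Apply y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1] to a list of floats."""
    b0, b1, a1 = deemphasis_coefficients(sample_rate)
    out = []
    x_prev = 0.0
    y_prev = 0.0
    for x in samples:
        y = b0 * x + b1 * x_prev - a1 * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out


dc = deemphasize([1.0] * 2000)                              # 0 Hz input
nyquist = deemphasize([(-1.0) ** n for n in range(2000)])   # fs/2 input
print(round(dc[-1], 3))             # the bass passes unchanged
print(round(abs(nyquist[-1]), 3))   # the top of the band is cut to t2/t1
```

The output samples are fractional, which is why the thread stores the de-emphasized result at 24 bits instead of rounding straight back to 16.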
Nov 1 2010, 19:22
Post #68
Group: Members Posts: 42 Joined: 6-October 10 Member No.: 84390
QUOTE
Have you tried our wiki? It's generally the first place to look...
http://wiki.hydrogenaudio.org/index.php?title=Pre-emphasis

I looked through the main topics under Technical but didn't find anything. Thanks.
Nov 1 2010, 19:36
Post #69
Group: Developer Posts: 2986 Joined: 2-December 07 Member No.: 49183
[misread]
This post has been edited by lvqcl: Nov 1 2010, 20:21
Nov 1 2010, 19:43
Post #70
Group: Super Moderator Posts: 9268 Joined: 1-April 04 Member No.: 13167
QUOTE
I looked through the main topics under technical but didn't find anything.

It can be found under Signal Processing, which is under Technical, though I found it simply by typing "preemphasis" in the search field (or "pre-emphasis", it doesn't matter either way).

This post has been edited by greynol: Nov 1 2010, 19:45
Apr 29 2012, 16:28
Post #71
Group: Members Posts: 1474 Joined: 30-November 06 Member No.: 38207
QUOTE
Are they still making CDs with pre-emphasis [...]?
Yes, unfortunately. My most-recent CD purchase: http://www.discogs.com/Lifelover-Konkurs/release/1513652 from 2008.

Argh. Cthulhu has risen from R'lyeh again, as reported here: http://forum.dbpoweramp.com/showthread.php...ll=1#post121273 . The current 'newest pre-emphasis CD' to my knowledge is http://www.discogs.com/Marc-Almond-Michael...release/3066984 , released June 2011.