True FLAC vs. Fake FLAC |
Oct 12 2011, 05:22
Post
#51
|
|
Group: Super Moderator Posts: 9361 Joined: 1-April 04 Member No.: 13167 |
I've seen examples of both false positives and false negatives.
-------------------- Everything sounds the same until it is proven otherwise.
|
|
|
|
Oct 12 2011, 05:44
Post
#52
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE 3 things: 1. hi I am new 2. I recently stumbled across a paper which details an algorithm that, with a very high success rate, guesses the bit rate of an audio file using only data from the file's high-frequency spectrum. If developed further it could remove the need to visually inspect a spectrogram etc. and would be much faster. http://www.fileden.com/files/2009/2/14/232...20Frequency.pdf

QUOTE In order to obtain the feature data, the source MP3 files were each decompressed into a 1411 kbps WAV file using the Fraunhofer IIS MP3 Surround Commandline Decoder V1.4 [2]. This was done because audio files in this format can easily be read into MATLAB, and as we have demonstrated, transcoding to a higher bit rate does not affect the frequency characteristics of the audio which we are observing.

It is, in my opinion, not a good sign when the author of a paper does not understand that WAV is a lossless format and so resorts to arguing that "transcoding" to PCM probably doesn't change the audio. Regardless, all that paper demonstrates is that if you know that LAME 3.97 was used with default lowpass for each bitrate, you can figure out the source bitrate by looking at the lowpass setting. |
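To make "looking at the lowpass" concrete, here is a minimal sketch of how a cutoff frequency might be read off a decoded signal. It is not the paper's algorithm and not how any particular tool works; the names `fft`, `estimate_cutoff_hz` and `tones`, the -60 dB floor, and the synthetic test tones are all assumptions made for illustration. Real encoders choose their lowpass differently depending on version and settings, so a low cutoff on its own proves nothing about where a file came from.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT. len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def estimate_cutoff_hz(samples, sample_rate, floor_db=-60.0):
    """Highest frequency whose level is within floor_db of the spectral peak.

    floor_db is an arbitrary illustrative threshold, not a calibrated value.
    """
    n = len(samples)
    if n < 2 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    # Hann window to limit leakage from the block edges.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    magnitudes = [abs(c) for c in fft(windowed)[: n // 2]]
    threshold = max(magnitudes) * 10 ** (floor_db / 20)
    # Walk down from Nyquist until something rises above the floor.
    for k in range(len(magnitudes) - 1, -1, -1):
        if magnitudes[k] >= threshold:
            return k * sample_rate / n
    return 0.0


def tones(freqs_hz, sample_rate=44100, n=4096):
    """Synthetic test signal: equal-level sine tones (hypothetical input)."""
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(n)]


lowpassed = tones([1000, 5000, 10000, 16000])         # nothing above 16 kHz
full_band = tones([1000, 5000, 10000, 16000, 21000])  # content up to 21 kHz
print(estimate_cutoff_hz(lowpassed, 44100))
print(estimate_cutoff_hz(full_band, 44100))
```

On these synthetic signals the first estimate lands close to 16 kHz and the second close to 21 kHz. Real music is much messier: quiet passages, natural roll-off and mastering filters can all produce a low reading without any lossy encoder being involved.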
|
|
|
Oct 12 2011, 06:19
Post
#53
|
|
|
Group: Members Posts: 4 Joined: 12-October 11 Member No.: 94299 |
QUOTE It is, in my opinion, not a good sign when the author of a paper does not understand that WAV is a lossless format and so resorts to arguing that "transcoding" to PCM probably doesn't change the audio. Regardless, all that paper demonstrates is that if you know that LAME 3.97 was used with default lowpass for each bitrate, you can figure out the source bitrate by looking at the lowpass setting.

So does the paper actually not do what it claims it does? I don't see any reliance on prior knowledge concerning the "history" of the file in question. |
|
|
|
Oct 12 2011, 06:49
Post
#54
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE So does the paper actually not do what it claims it does?

It does what they claim: take a known encoder and version, and then determine what bitrate was used. It doesn't do what people in this thread are interested in though.

QUOTE I don't see any reliance on prior knowledge concerning the "history" of the file in question.

Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless. |
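A toy sketch may help show why the training setup matters. This is not the paper's model: it is a bare nearest-centroid lookup, and every cutoff value below is invented for a hypothetical "encoder A" (none of them are real LAME defaults). The names `train_centroids` and `guess_bitrate` are likewise made up for this illustration.

```python
def train_centroids(labelled_cutoffs):
    """Average the cutoff per bitrate.

    labelled_cutoffs: (bitrate_kbps, cutoff_hz) pairs, all from ONE encoder setup.
    """
    sums = {}
    for bitrate, cutoff in labelled_cutoffs:
        total, count = sums.get(bitrate, (0.0, 0))
        sums[bitrate] = (total + cutoff, count + 1)
    return {bitrate: total / count for bitrate, (total, count) in sums.items()}


def guess_bitrate(centroids, cutoff_hz):
    """Return the trained bitrate whose mean cutoff is nearest to cutoff_hz."""
    return min(centroids, key=lambda bitrate: abs(centroids[bitrate] - cutoff_hz))


# Invented training data for a hypothetical "encoder A".
encoder_a = [
    (128, 16000.0), (128, 16100.0),
    (192, 18500.0), (192, 18700.0),
    (320, 20000.0), (320, 20100.0),
]
centroids = train_centroids(encoder_a)

print(guess_bitrate(centroids, 16050.0))  # 128: a file that matches the training setup
print(guess_bitrate(centroids, 19000.0))  # 192: wrong if this was a hypothetical "encoder B" at 128 kbps
print(guess_bitrate(centroids, 22050.0))  # 320: a full-band lossless file still gets a bitrate label
```

The lookup always answers with one of the bitrates it was trained on. It has no way to say "different encoder" or "never lossy", which is the limitation being argued about here.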
|
|
|
Oct 12 2011, 07:01
Post
#55
|
|
|
Group: Members Posts: 4 Joined: 12-October 11 Member No.: 94299 |
QUOTE It doesn't do what people in this thread are interested in though.

Once you have an algorithm which can estimate original bit rates of a transcoded lossless format, getting a yes/no answer to the question "are my flacs 'real'?" seems trivial.

QUOTE Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless.

Correct. I still don't see any reliance on prior knowledge concerning the "history" of the file in question. Of course any algorithm of this nature needs information about what it is looking for! |
|
|
|
Oct 12 2011, 07:09
Post
#56
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE It doesn't do what people in this thread are interested in though. Once you have an algorithm which can estimate original bit rates of a transcoded lossless format, getting a yes/no answer to the question "are my flacs 'real'?" seems trivial.

If you know that the file was encoded with a given LAME version, then you already know that the answer to the question "are my flacs that I've created from my LAME mp3s 'real'" is "No". That said, you are correct that determining whether the output of a given LAME version will be lossy is quite trivial.

QUOTE Suggest reading section 2, "procedure". They train their model using the same encoder and settings they will then attempt to detect. Without this the system is useless. Correct. I still don't see any reliance on prior knowledge concerning the "history" of the file in question. Of course any algorithm of this nature needs information about what it is looking for!

As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question. |
|
|
|
Oct 12 2011, 07:28
Post
#57
|
|
|
Group: Members Posts: 4 Joined: 12-October 11 Member No.: 94299 |
QUOTE If you know that the file was encoded with a given lame version, then you already know the answer to the question "are my flacs that I've created from my LAME mp3s 'real'" is "No".

Yes but surely the OP was referring to a situation where the file history is unknown. (Suggest reading OP.) Even in this case the algorithm should be able to take arbitrary WAVs and, if they are indeed transcodes, guess their original bitrate with a great deal of accuracy.

QUOTE As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question.

No. The prior knowledge is the information about the frequency characteristics of lame-encoded mp3s. What I am saying is that once you have trained the algorithm, you can then take arbitrary WAVs (or flacs, or whatever) and use the algorithm on them. This is pretty standard. Train an algorithm with a set of inputs, then give it arbitrary inputs and see how it does. 100% accuracy cannot be expected. |
|
|
|
Oct 12 2011, 07:41
Post
#58
|
|
|
Group: Members Posts: 4163 Joined: 2-September 02 Member No.: 3264 |
QUOTE Yes but surely the OP was referring to a situation where the file history is unknown.

In this case you cannot use this software. The authors have demonstrated identification of files that are known a priori to be transcoded by incorporating that knowledge into their algorithm. Hence my point above that it's not useful for what people in this thread want to do.

QUOTE (Suggest reading OP.)

No need to get angry at me. I'm not attacking you, I'm just trying to lead you towards an understanding of why what you are proposing does not work.

QUOTE Even in this case the algorithm should be able to take arbitrary WAVs

It should? The authors certainly haven't demonstrated that. In fact they are quite clear that they have not chosen arbitrary WAV files.

QUOTE As I said above, the prior knowledge is the encoder and settings (aside from bitrate) used to create the file in question. No. The prior knowledge is the information about the frequency characteristics of lame-encoded mp3s.

I'm not being condescending, but read more carefully: there is a LOT more prior information being used here. The procedure explains that the training set and the unknown set were encoded with identical settings and encoder. This is not by chance. The authors have not accidentally made their problem extremely easy compared to the one you want to solve.

QUOTE What I am saying is that once you have trained the algorithm, you can then take arbitrary WAVs (or flacs, or whatever) and use the algorithm on them.

Ignoring for a moment what is actually going on, if this actually worked, why do you think the authors decided not to show that this was possible? Perhaps they were concerned about making their paper too exciting? |
|
|
|
Oct 12 2011, 12:37
Post
#59
|
|
|
Group: Members Posts: 74 Joined: 8-September 11 Member No.: 93574 |
QUOTE I've seen examples of both false positives and false negatives.

I guess when a full CD is identified as CDDA it's safe to consider it an original and not an MP3 reconstruct. This might be the reason why Tau Analyzer only works for a full CD and not an individual file.

OK, I'll go a bit off topic for the last time (I hope). |
|
|
|
Oct 12 2011, 22:03
Post
#60
|
|
|
Group: Members Posts: 591 Joined: 12-May 06 From: Colorado, USA Member No.: 30694 |
QUOTE I have received the answer from Qobuz. They have checked the file and think it has been through some MPA compression. They will ask the producer for a true original and offered me a free album to compensate. They were pretty quick to react too. Good point for them.

Many people, even many musicians, simply can't hear the difference between a lossy version and the original, which shouldn't be surprising, given the robustness of the lossy formats and all the listening tests we're familiar with. Some/many are also just not very technologically savvy. It really would not surprise me to find out, then, that artists or their representatives wouldn't necessarily even know that MP3/AAC/whatever is lossy at all, or recognize that once something is lossily encoded, there's no going back, even if they have a converter that turns their MP3s back into the WAVs or AIFFs needed by their labels and distributors.

That's one of the reasons transcodes happen in general; people think "if higher bitrates mean higher quality, then I'll just convert this 128 kbps MP3 to a 320 kbps one! or maybe I'll just convert it to WAV and it'll be perfect!" |
|
|
|
Oct 13 2011, 09:52
Post
#61
|
|
|
Group: Members Posts: 24 Joined: 24-September 10 Member No.: 84113 |
QUOTE Some/many are also just not very technologically savvy.

Unfortunately, this is very true. I have contacted several artists that I follow on Soundcloud about this. Knowledge of lossy/lossless encoding is not a prerequisite to creating music, and is sometimes not learned. |
|
|
|
Oct 13 2011, 15:31
Post
#62
|
|
|
Group: Members Posts: 74 Joined: 8-September 11 Member No.: 93574 |
And then, there are people who are not even musicians or technicians involved at some point.
Hopefully, it will change now that companies selling FLAC are starting to appear throughout the web. Unfortunately there are still too few people who are well enough aware of the issue to be careful about what they buy. One year ago I was still ripping to MP3 and burning those back on CDs when I wanted to copy a disc! It's up to us to spread the word now. |
|
|
|
Oct 16 2011, 00:20
Post
#63
|
|
|
Group: Members Posts: 515 Joined: 1-November 06 Member No.: 37047 |
I have noticed that MP3 encoding commonly adds a large error to the input signal, at least when the error is judged visually by comparing waveforms rather than by listening to the decoded file (which is what the format is really made for).
In my simplified understanding, this can be interpreted as signal-dependent narrow-band noise insertion (psy-model guided quantization of subbands), and perhaps phase error? Are there no known mechanisms to guesstimate that such an error was inserted at some stage? I believe that natural music commonly contains spectrally sparse content (pure harmonic waveforms) or temporally coherent impulses (at least when rising in level). Can one not search for such things in a file, and find traces commonly attributed to MP3 encoding and with little chance of being generated by other means? -k |
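For reference, the "error" visible when comparing waveforms can be put into a number as the residual between the original and the decoded signal. The sketch below assumes you have both signals, sample-aligned and at the same length; the function name `residual_snr_db` and the test data are made up for illustration. It also shows the catch for this thread: the measurement needs the original, which is exactly what is missing when a FLAC of unknown history is being checked.

```python
import math


def residual_snr_db(original, decoded):
    """Signal-to-residual ratio in dB for two sample-aligned signals."""
    if len(original) != len(decoded):
        raise ValueError("signals must be the same length and time-aligned")
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise_power == 0:
        return math.inf   # bit-identical: no residual at all
    if signal_power == 0:
        return -math.inf  # silence in, something out
    return 10 * math.log10(signal_power / noise_power)


# Synthetic stand-ins: a sine wave and a copy with a 10% amplitude error.
original = [math.sin(2 * math.pi * i / 64) for i in range(640)]
decoded = [0.9 * x for x in original]

print(residual_snr_db(original, decoded))   # about 20 dB
print(residual_snr_db(original, original))  # inf
```

A low figure here does not mean the difference is audible. A lossy codec deliberately puts its error where masking hides it, so the waveform difference can look large while listening tests find nothing.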
|
|
|
Oct 17 2011, 10:42
Post
#64
|
|
ReplayGain developer Group: Developer Posts: 4609 Joined: 5-November 01 From: Yorkshire, UK Member No.: 409 |
On exceptionally pure tone-like signals you can see the shape of the codec's noise skirting it.
I can't think of many recordings that contain spectrally sparse content. I went looking for some once when trying to assess the audibility of distortion. Even a solo flute or violin or piano is too rich to spot masking noise at high bitrates, and with most pop, rock, jazz etc you can forget it completely. Pure impulses are a great test signal for ID-ing codecs, but only synthetic signals are known to be clean. With anything else, the pre-echo is usually lost in the other instruments. You could find it in isolation in some recordings, and maybe make a judgement that nothing else had caused it, but it doesn't sound like something you could automate. If the recording you want to check contains no ultra-pure tone and no isolated impulse-like sounds, then this task is impossible IMO. You can't see the coding noise in the coded version when looking in the waveform view or the spectral view. Apart from the low pass, some of the common lossy distortions can be easier to hear than see. Cheers, David. |
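As a rough illustration of the impulse test on a synthetic signal, the sketch below measures how loud the samples just before the main peak are, relative to the peak. Everything in it is an assumption for illustration: the function name `pre_echo_db`, the 576-sample lookback (picked here because it is one MP3 granule), and the hand-made "smeared" signal, which stands in for codec pre-echo and is not the output of any real encoder.

```python
import math


def pre_echo_db(samples, lookback=576):
    """Loudest sample in the `lookback` samples before the peak, in dB re the peak."""
    peak_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    peak = abs(samples[peak_index])
    before = samples[max(0, peak_index - lookback):peak_index]
    loudest_before = max((abs(s) for s in before), default=0.0)
    if peak == 0.0 or loudest_before == 0.0:
        return -math.inf  # nothing at all ahead of the impulse
    return 20 * math.log10(loudest_before / peak)


# A clean synthetic impulse, and a copy with low-level energy ahead of it.
clean = [0.0] * 1000 + [1.0] + [0.0] * 1000
smeared = list(clean)
for i in range(900, 1000):
    smeared[i] = 0.01  # hand-made stand-in for pre-echo

print(pre_echo_db(clean))    # -inf
print(pre_echo_db(smeared))  # about -40 dB
```

This only works because the clean signal is known to be silent before the impulse. In a real recording anything ahead of a transient could be other instruments or room sound, which is why this does not extend to an automated check.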
|
|
|
Oct 21 2011, 23:00
Post
#65
|
|
Group: Members Posts: 34 Joined: 10-January 11 Member No.: 87208 |
QUOTE There used to be a freeware DOS command line/console program that could examine a WAV and tell if it came from an MP3. It's called AuCDtect and does a spectrum analysis looking for patterns introduced by lossy compression. It works very well, you'll hardly ever get a false negative. There's a Windows frontend called Tau Analyzer and even a foobar plugin, all available here: http://en.true-audio.com

OK thanks very much for that. I haven't been able to find that program for years!

-------------------- opinion is not fact
|
|
|
|
Jan 6 2012, 03:50
Post
#66
|
|
|
Group: Members Posts: 40 Joined: 22-December 09 Member No.: 76233 |
|
|
|
|
Jan 6 2012, 16:26
Post
#67
|
|
|
Group: Super Moderator Posts: 4483 Joined: 23-June 06 Member No.: 32180 |
Second result on Google (not sure which version): http://www.softpedia.com/get/Multimedia/Au...fooCDtect.shtml
Fourth result on Google (the same download is linked in a topic of our own, which is the first result): http://idle.netau.net/5099/foobar2000-fooc...i-for-aucdtect/ |
|
|
|