Topic: MP3 Encoder Identification (Read 80031 times)

MP3 Encoder Identification

I'm working on a program similar to EncSpot. As EncSpot is probably dead (or isn't it? Where's the author?), I've dug into MP3 internals and have already figured out some curiosities for:
- XiNG MP3 Encoder v1.0
- BladeEnc v0.94.2
- Canna MP3 Maker v1.1 (rare, I think)

From the values I have collected so far, I can already identify the three encoders above EXACTLY. I don't see any point in making such a proggy closed-source, shareware or anything else people have to pay for, so I hope we can discuss identification schemes in public and that some of you may be willing to contribute.

For me it's not that important to integrate things like drag'n'drop and shell support in general. The main aim is to identify as many MP3 encoders as possible, and to do that as accurately as possible.

And here are my first questions:

1) Is it enough to decode the header and side info from MP3 frames in order to identify the codec? At least for the three encoders above it is, but I still can't find detectable differences between all those FhG encoders...

2) May I use information about the manner in which frame padding is used? Some encoders start with a non-padded frame followed by a run of padded frames, and others do it vice versa. That gives a good hint at the possible MP3 encoder.

3) May I use the header flags (copyright, private, original) to identify an encoder? Many (old) encoders didn't allow you to set or clear these bits, so that's quite a good "mark", too.

4) Where can I find information about LAME tags, XiNG VBR Info and FhG VBRI info?

The problem with questions 2 and 3 is that there are probably utilities which can change this information. For example, if you cut MP3s with an appropriate utility, I think the padding chain is broken, too. At least this is detectable, so the tool could tell whether an MP3 was cut or not, which is a feature EncSpot didn't have (as probably every encoder starts either with a non-padded frame or with a complete chain of padded frames, a cut would be identified easily).
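
As an illustration of questions 1 to 3, here is a minimal Python sketch of reading the header fields mentioned above (padding bit, private/copyright/original flags, channel mode) from an MPEG-1 Layer III frame. The names `FrameHeader`, `parse_header` and `frame_length` are made up for this example and are not taken from any tool in this thread; MPEG-2/2.5 and other layers are deliberately rejected.

```python
from typing import NamedTuple, Optional

# MPEG-1 Layer III bitrate table in kbit/s; index 0 = free format, 15 = invalid.
BITRATES_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATES_HZ = [44100, 48000, 32000]  # index 3 is reserved


class FrameHeader(NamedTuple):
    bitrate_kbps: int      # 0 means free format
    sample_rate_hz: int
    crc_protected: bool
    padding: bool
    private: bool
    mode: int              # 0=stereo, 1=joint stereo, 2=dual channel, 3=mono
    mode_extension: int
    copyright: bool
    original: bool
    emphasis: int


def parse_header(b: bytes) -> Optional[FrameHeader]:
    """Parse a 4-byte MPEG-1 Layer III header; return None if it is not one."""
    if len(b) < 4:
        return None
    h = int.from_bytes(b[:4], "big")
    if (h >> 21) != 0x7FF:          # 11-bit sync word
        return None
    if ((h >> 19) & 3) != 3:        # version ID: 3 = MPEG-1
        return None
    if ((h >> 17) & 3) != 1:        # layer: 1 = Layer III
        return None
    bitrate_index = (h >> 12) & 0xF
    rate_index = (h >> 10) & 3
    if bitrate_index == 15 or rate_index == 3:
        return None
    return FrameHeader(
        bitrate_kbps=BITRATES_KBPS[bitrate_index],
        sample_rate_hz=SAMPLE_RATES_HZ[rate_index],
        crc_protected=((h >> 16) & 1) == 0,   # protection bit 0 = CRC present
        padding=bool((h >> 9) & 1),
        private=bool((h >> 8) & 1),
        mode=(h >> 6) & 3,
        mode_extension=(h >> 4) & 3,
        copyright=bool((h >> 3) & 1),
        original=bool((h >> 2) & 1),
        emphasis=h & 3,
    )


def frame_length(hdr: FrameHeader) -> int:
    """Frame length in bytes for a non-free-format MPEG-1 Layer III frame."""
    return 144 * hdr.bitrate_kbps * 1000 // hdr.sample_rate_hz + int(hdr.padding)
```

Walking a file then amounts to calling `parse_header` at the current offset and skipping `frame_length` bytes; collecting the `padding` bit of every frame gives the raw material for the padding analysis discussed further down.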

MP3 Encoder Identification

Reply #1
Have you tried browsing through the source of Naoki Shibata's mp3guessenc?

Portions of that code were used in EncSpot, AFAIK.
criZZb

MP3 Encoder Identification

Reply #2
There was once a command-line version of EncSpot on the Guerillasoft homepage with all the sources, but I cannot find it anymore since that page has gone. Does anyone know where to find both (tool + sources)?

MP3 Encoder Identification

Reply #3
You should definitely use padding info and the original/copyright flags. That is how FhG encoders are differentiated.


MP3 Encoder Identification

Reply #5
Thanks for your info! And thanks for the link to the EncSpot sources (I never knew that they were public)...

Of course I'll have to use information on whether padding is used or not, but what I wanted to ask is whether the padding -manner- can be used too.

Here is something I found out for 128kbit MP3s:
(number = count of consecutive padded frames)
(x = single non-padded frame)
(> ... < = looped till the end)

1) BladeEnc
> 23 x 24 x <

2) Xing (old)
> x 23 x 24 <

3) Cool Edit Mp3 Me! (Fast) + MusicMatch Jukebox 5 (Fast) (FhG)
12 > x 23 x 24 <

4) mp3enc v3.1 (FhG) + l3enc v2.61 (FhG)
> x 24 x 23 <

Well, something special about (3): both MP3s made from the same wav look very similar. In fact the block types, bit reservoir usage and some other values were identical over the whole MP3, so I guess almost the same algorithm is used here. Note that MMJB5 is quite old compared to Cool Edit's MP3(Pro) Export feature. It seems that FhG didn't change their -fast- algorithm, or only slightly. EncSpot identifies them as "FhG (fastenc or mp3enc)". But using the padding manner we can separate fastenc from mp3enc. Of course I'll have to try the same for all other bitrates.
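
A short Python sketch of why runs of 23 and 24 padded frames show up at 128 kbit/s and 44.1 kHz: the exact frame size is 144 * 128000 / 44100 = 417 + 47/49 bytes, so 47 out of every 49 frames need the extra slot (23 + 24 padded, 2 unpadded). The helper names `padded_fraction` and `padding_runs` are invented for this example, and the `blade_like` sequence is simply the pattern listed under (1) above, not data read from a real file.

```python
from fractions import Fraction
from itertools import groupby
from typing import Iterable, List, Tuple


def padded_fraction(bitrate_bps: int, sample_rate_hz: int) -> Fraction:
    """Fraction of frames that need a padding slot to hit the exact bitrate."""
    exact = Fraction(144 * bitrate_bps, sample_rate_hz)   # ideal frame size in bytes
    return exact - exact.numerator // exact.denominator   # keep the fractional part


def padding_runs(flags: Iterable[bool]) -> List[Tuple[bool, int]]:
    """Run-length encode per-frame padding bits as (padded?, run length)."""
    return [(flag, sum(1 for _ in run)) for flag, run in groupby(flags)]


print(padded_fraction(128_000, 44_100))   # 47/49

# The BladeEnc-style loop "> 23 x 24 x <" written out as padding bits.
blade_like = ([True] * 23 + [False] + [True] * 24 + [False]) * 2
print(padding_runs(blade_like)[:4])
```

What differs between encoders is only where in the 49-frame cycle they start and how they order the two runs, which is exactly what the run-length view makes visible.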

MP3 Encoder Identification

Reply #6
I was not thinking about the 1-slot padding, but about ancillary data. When there is little information to code, encoders pad the ancillary data with filler bytes. This padding differs between encoders.

As an example, LAME (recent versions) uses its own name and version info as padding. Some encoders use 0x00, some use 0xFF.
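
A very rough Python sketch of the classification step only. It assumes the stuffing bytes have already been extracted; finding where the main data ends (and the ancillary data begins) needs the part2_3_length values and bit reservoir bookkeeping, which is not shown. The function name `classify_filler` and the returned labels are made up for this example.

```python
def classify_filler(stuffing: bytes) -> str:
    """Label a run of stuffing bytes found after a frame's audio data."""
    if not stuffing:
        return "none"
    if b"LAME" in stuffing:
        return "LAME-style name/version string"
    if set(stuffing) == {0x00}:
        return "0x00 fill"
    if set(stuffing) == {0xFF}:
        return "0xFF fill"
    return "other"


print(classify_filler(bytes(12)))            # 0x00 fill
print(classify_filler(b"LAME3.93 LAME3."))   # LAME-style name/version string
```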

MP3 Encoder Identification

Reply #7
This is what I have figured out so far (it still has to be proven by testing on enough examples):

Explanations:
scfs => scfsi
scal => scalefactor_scale usage ($0000=never, $ffff=always)
mode => channel mode (modes 0-3)
mext => channel mode extension (extension 0-3)
emph => emphasis (values 0-3)
pman => padding manner (#non-padded, #padded, #non-padded ......)
flag => P=private, C=copyright, O=original, -=never used, +=always, !=optional
mbeg° => main_data_begin (i.e. number of slots of bit reservoir used)
bigv° => big_values
glgn° => global_gain
r0ct° => region0_count
r1ct° => region1_count
°: 0=average, 1=min, 2=max, 3=usage ($0000=never, $ffff=always)
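
For reference, the side-info fields named in the legend above could be pulled out of an MPEG-1 Layer III frame roughly like this. It is a sketch under the usual field widths (17 bytes of side info for mono, 32 for stereo); `BitReader` and `parse_side_info` are names invented for this example and the dictionary layout is an arbitrary choice.

```python
from typing import Any, Dict, List


class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, nbits: int) -> int:
        if nbits > self.remaining:
            raise ValueError("side info truncated")
        self.remaining -= nbits
        return (self.value >> self.remaining) & ((1 << nbits) - 1)


def parse_side_info(data: bytes, channels: int) -> Dict[str, Any]:
    """Parse MPEG-1 Layer III side info (17 bytes mono, 32 bytes stereo)."""
    r = BitReader(data[: 17 if channels == 1 else 32])
    info: Dict[str, Any] = {
        "main_data_begin": r.read(9),
        "private_bits": r.read(5 if channels == 1 else 3),
        "scfsi": [r.read(4) for _ in range(channels)],
    }
    granules: List[List[Dict[str, Any]]] = []
    for _ in range(2):                       # two granules per frame
        per_channel = []
        for _ in range(channels):
            g: Dict[str, Any] = {
                "part2_3_length": r.read(12),
                "big_values": r.read(9),
                "global_gain": r.read(8),
                "scalefac_compress": r.read(4),
                "window_switching": r.read(1),
            }
            if g["window_switching"]:
                g["block_type"] = r.read(2)
                g["mixed_block"] = r.read(1)
                g["table_select"] = [r.read(5) for _ in range(2)]
                g["subblock_gain"] = [r.read(3) for _ in range(3)]
            else:
                g["table_select"] = [r.read(5) for _ in range(3)]
                g["region0_count"] = r.read(4)
                g["region1_count"] = r.read(3)
            g["preflag"] = r.read(1)
            g["scalefac_scale"] = r.read(1)
            g["count1table_select"] = r.read(1)
            per_channel.append(g)
        granules.append(per_channel)
    info["granules"] = granules
    return info
```

The side info starts right after the 4-byte header, or after the 2 CRC bytes if the frame is CRC-protected. Note that region0_count and region1_count only exist in granules without window switching, so statistics on them have to skip the other granules.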

BladeEnc v0.94.2
----------------
=> pman_128=$00,$17,$01,$18,$01,$17,$01,$18
=> mode[0]=$FFFF (stereo only)
=> emph[0]=$FFFF (no emphasis possible)
=> flag=P-,C-,O+,CRC-
=> mbeg[0]<$040
=> bigv[2]=$120
=> glgn[2]=$0D2
=> glgn[3]=$FFFF
=> r0ct[2]<$F
=> scal=$0000
=> no scfs

Canna MP3 Maker v1.1
--------------------
=> pman_128=$00,$17,$01,$18,$01,$17,$01,$18 (forced)
=> mode[0]=$FFFF (stereo only)
=> flag=P-,C!,O!,CRC-
=> mbeg[0]<$040
=> bigv[2]=$120
=> glgn[3]=$FFFF
=> r0ct[2]<$F
=> scal=$FFFF
=> no scfs
=> 128 bytes missing in last frame

FhG MusicMatch Jukebox v5.0 Level High (mp3enc?)
------------------------------------------------
=> pman_128=$01,$18,$01,$17,$01,$18,$01,$17 (forced)
=> mext[1]=$0000, mext[3]=$0000 (on joint stereo no intensity stereo frames)
=> emph[0]=$FFFF
=> flag=P-,C-,O-,CRC-
=> bigv[2]=$0CB
=> glgn[3]=$FFFF
=> r0ct[2]=$F
=> r1ct[2]=$7
=> scal used properly
=> no scfs

FhG MusicMatch Jukebox v5.0 Level Low (fastenc?)
------------------------------------------------
=> pman_128=$00,$0c,$01,$17,$01,$18,$01,$17 (forced)
=> mext[1]=$0000, mext[3]=$0000 (on joint stereo no intensity stereo frames)
=> emph[0]=$FFFF
=> flag=P-,C-,O-,CRC-
=> glgn[3]=$FFFF
=> r0ct[2]<$F
=> scal used properly
=> no scfs

Comments
------------
1) it seems that encoders relying on ISO never use big values > $120
2) ISO-encoders (canna + bladeenc) don't use bit reservoir properly
3) ISO-encoders don't use joint-stereo
4) it seems that BladeEnc uses a maximum global_gain value of $d2
5) region0_count: seems that some encs always use highest possible value ($0f) and others never use it
6) region1_count: mp3enc seems to always use the maximum of $07
7) global_gain: xing (not included here) is the only encoder which doesn't set values>0 on empty (=silent) frames
8) MusicMatch 5.0 with processing level "low" seems to be the same as fastenc
9) MM50 with proc. level "high" seems to be the same as mp3enc
10) FastEnc uses a funny 1-slot padding method (like no other encoder) and thus could be easily identified (except on 160kbps, you'll see later)
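
A small Python sketch of how the four per-field figures from the legend (0=average, 1=min, 2=max, 3=usage) could be computed once a field has been collected for every granule and channel. The function name `field_stats` is invented here, and the exact scaling of the usage value to the range $0000..$FFFF is an assumption; the post only states the two end points.

```python
from typing import Iterable, Tuple


def field_stats(values: Iterable[int]) -> Tuple[float, int, int, int]:
    """Return (average, min, max, usage) for one side-info field.

    usage is scaled so that 0x0000 means the field was never non-zero and
    0xFFFF means it was non-zero in every granule/channel (assumed scaling).
    """
    vals = list(values)
    if not vals:
        raise ValueError("no values collected")
    nonzero = sum(1 for v in vals if v != 0)
    usage = nonzero * 0xFFFF // len(vals)
    return sum(vals) / len(vals), min(vals), max(vals), usage


# global_gain never zero and peaking at $D2, as listed for BladeEnc above
avg, lo, hi, usage = field_stats([0xD2, 0xB4, 0x96, 0xC8])
print(hex(hi), hex(usage))   # 0xd2 0xffff
```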

So this is what I have basically figured out, and I'm asking for your comments. I first want to get the pure frame analyzer working (as exactly as possible); then, once I'm satisfied with it, I'll look at VBR tags, ancillary data and things like that.

EDIT: If anybody wonders... I'm not doing this in order to "copy" encspot! The aim is to get more precise and detect more "marks" that encoders put.

MP3 Encoder Identification

Reply #8
FastEnc incorrectly reports joint-stereo in the header for bitrates > 160 kbit/s, while it actually uses all SS frames.
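
A possible check for this mark, as a Python sketch: every header says joint stereo, yet no frame switches on intensity stereo (mode extension bit 0) or mid/side stereo (mode extension bit 1). The function name is invented for this example, and the input is assumed to be the (mode, mode_extension) pair already read from each frame header.

```python
from typing import Iterable, Tuple

JOINT_STEREO = 1  # channel mode value in the frame header


def joint_stereo_in_name_only(frames: Iterable[Tuple[int, int]]) -> bool:
    """True if all frames are flagged joint stereo but none uses MS or IS."""
    seen = False
    for mode, mode_extension in frames:
        if mode != JOINT_STEREO or mode_extension != 0:
            return False
        seen = True
    return seen   # an empty stream proves nothing


print(joint_stereo_in_name_only([(1, 0)] * 1000))          # True
print(joint_stereo_in_name_only([(1, 0), (1, 2), (1, 0)]))  # False
```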

Is there some way to identify Soundjam/iTunes MP3s?

ff123

MP3 Encoder Identification

Reply #9
Are these encoders available for M$ Windows or DOS?

Either you can send me some example MP3s (that's illegal, isn't it, but I swear I won't listen to them) or links to the encoders (if they are DOS/win32).


EncSpot never used this information, but I will, and possibly it can be used to detect the encoders you mentioned:

1) often the big_values maximum values are in certain ranges for certain encoders
2) region0_count and region1_count are sometimes even fixed to certain values


EDIT: My program is now able to detect (and separate from each other) these encoders and variants...

Canna MP3 Maker v1.1
BladeEnc v0.94.2
Fraunhofer MP3Enc v3.1
Fraunhofer MP3 Producer Pro v2.1
Fraunhofer FastEnc (MMJB50)
Fraunhofer FastEnc v1.02 (the difference is the original flag)
XING MP3 Encoder v1.0

BTW: If you feed it something like Lame, Gogo or FhG L3Enc it simply says that the encoder wasn't identified, so the detection routines currently in use are fairly exact (i.e. it hopefully won't tell you encoder XYZ was used if it wasn't).

MP3 Encoder Identification

Reply #10
This would make for a really cool foobar plugin, plz consider doing that when you get it going.

MP3 Encoder Identification

Reply #11
Hey, I'll release all info in plain text form when I'm ready. As I'm using Delphi the thing won't be portable. But isn't Foobar a win32 tool? If so, of course I'll do it!

EDIT: Another Xing issue...

I used test files (5 wave files of completely different music styles), put them through all the encoders I could get, and then let EncSpot run over the results to see what it is capable of.

Two encoders were used:
(1) XING MP3 Encoder v1.0 (1998, windows GUI program)
(2) XING's TOMPG.EXE v3.0 (1997, console program)

Results on (1):
- EncSpot identifies it as Xing (old)
- it never uses other channel mode than plain stereo

Results on (2):
- EncSpot identifies it as Xing (old)
- you have the option to use stereo, joint-stereo or dual channel, so...
- EncSpot identifies it as Xing (very old) on joint-stereo mode
- EncSpot identifies it as unknown on dual channel mode

In fact, these results aren't satisfying. I then had a further look at the values my tool printed out, and it seems that (1) is just a restricted (with regard to channel mode settings) and slightly changed (somewhat more exact in detecting silent sound blocks) variant of (2). The results are basically the same, and except for channel modes the very same marks are present (padding, big_values maximum etc.).

EDIT2: Okay, here's the truth! In fact there are two different XING engines: one completed in 1997, which is used in TOMPG.EXE v3.0 (anybody got an earlier version?) and XING MP3 Encoder v1.0, and the other made in 1999, which is used in products like XING MP3 Encoder v1.5, AudioCatalyst and Audiograbber AND, AFAIK, MusicMatch v4.xx.

EDIT3: EncSpot issue! Canna MP3 Maker is wrongly identified as FhG (fastenc or mp3enc)! It seems this is due to the fact that scalefactor_scale is used (one of two differences to BladeEnc), but ALWAYS, i.e. for every granule, for every channel. This is the best "mark" an encoder can set, as I've never seen this on other encoders. My tool reports SCAL=$FFFF (=> scalefactor_scale used in 100% of all occurrences).

EDIT4: Look at http://www.mp3pro-soft.narod.ru/download/E.../EncInfo0_1.rar and grab this !attempt! of mp3 encoder identification util. It uses ATL library for Delphi which is available as open-source.

1. The identification is lamely done through frame header flags only
2. The number of frames is calculated by filesize/framesize on CBR, or taken from VBR info tag
3. It's lame (for the above reasons)
4. Excerpt: "From the similar programs differs by an exactitude, simple interface and fast reading of an information from the file."

MP3 Encoder Identification

Reply #12
foobar = win32 app, written in C++

MP3 Encoder Identification

Reply #13
I can flood you with old & rare MP3 encoders. I have a pretty big collection here.

If you are interested, just PM me.

MP3 Encoder Identification

Reply #14
List of encoders detected so far (31.07.03)

Canna MP3 Maker v1.1 [ISO Engine]
BladeEnc v0.94.2 [ISO Engine]
SoloH mpeg Encoder v0.07a [ISO Engine] *added
Fraunhofer MP3Enc v3.1
Fraunhofer MP3 Producer Pro v2.1
Fraunhofer FastEnc (MMJB50)
Fraunhofer FastEnc v1.02
XING Engine v1 (1997)
EmP3-N-Coder (1999) [XINGv1 hack] *added
XING Engine v2 (1999)

BTW: Please note that tests have only been done with 128kbit 44.1kHz stereo MP3s, so it's not certain that detection will work for other MP3s atm.


Better detection than EncSpot

- separation of three ISO engines (EncSpot detects two as Blade, one as fastenc/mp3enc)
- separation of FhG MP3Enc and FhG FastEnc
- separation of EmP3-N-Coder (was detected as XING)
- xing (old) and xing (very old) are actually the same engine

MP3 Encoder Identification

Reply #15
Arsed!

It's driving me mad. It looks like you really can't decide between FhG FastEnc and FhG Producer Pro (ACM). Since the MP3 Me! Export Plugin(?) for Cool Edit and Nero offers an option to disable padding, and all other values which might serve as encoder marks are very similar between FastEnc and ACM, the only possible deciding factor is blown away. Wanna see the results of this shit?

1. As I knew that this plugin(?) offers options to disable padding and to set each header flag (copyright, original, private) independently, I tried to make detection not rely on them.

2. EncSpot relies on 1-slot padding to decide between FastEnc and ACM. And here we go...

=> EncSpot reports non-padded FastEnc MP3s as FhG (ACM or Producer Pro)
=> MyTool does NOT, but it detects ACM files as FastEnc, too

A little bit of tweaking is possible, but it won't work for all MP3 files made with these two encoders. Maybe it's time to figure out new relationships between some values...

EDIT: Simple work-around => if something was detected as ACM first, it won't be detected as FastEnc. That's pretty much the logic I didn't want to use, but anyway. Note that this does not work cleanly. There may be a few MP3s made by FastEnc which aren't padded (due to that f**** MP3 Me! Plugin(?)) and have a maximum region1_count value of $7 ($6 is the maximum for most FastEnc MP3s, but I guess there are a few exceptions). So there's a chance of approx. 1% that some FastEnc files will be wrongly identified as ACM MP3s: I think only few people will override the plugin's default and switch off ISO padding, and I think there are just few MP3s which use region1_count values of $7.
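
One way to read this work-around, as a Python sketch: the ACM rule is tested first and FastEnc is only the fallback. The two inputs and the exact ACM condition (no padded frames, region1_count reaching $7) are inferred from the description above and are assumptions, not the tool's real rules. The function assumes the file has already matched the marks FastEnc and ACM share.

```python
def classify_fhg(any_padded_frame: bool, region1_count_max: int) -> str:
    """Ordered decision between ACM/Producer Pro and FastEnc (hypothetical rules)."""
    # Rule 1, checked first: unpadded stream whose region1_count reaches $7.
    if not any_padded_frame and region1_count_max == 7:
        return "FhG ACM / Producer Pro"
    # Rule 2, fallback: everything else is taken to be FastEnc.
    return "FhG FastEnc"


print(classify_fhg(any_padded_frame=True, region1_count_max=6))   # FhG FastEnc
print(classify_fhg(any_padded_frame=False, region1_count_max=7))  # FhG ACM / Producer Pro
```

The second call is also the misdetection case from the post: an unpadded FastEnc file that happens to reach $7 lands in the ACM branch.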

MP3 Encoder Identification

Reply #16
When you test other bitrates, I think you will also need to know which combinations cannot occur for each encoder.
Example: joint stereo is not possible with ISO encoders.

I think that perhaps you should set up a webpage/textfile somewhere with a big table indicating parameters for each encoder. It would be easier for us to see what you already took into consideration and what you did not.


Misc things: the ISO reference code included a wrong CRC method (the CRC was wrong for every frame if used). It was corrected in a specific Blade version.

Bit reservoir variance for ISO-based should be quite low.

I think that you could gather some interesting info by extracting which huffman tables are used.
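
To turn the "crc on/off/wrong" hint into a check, something like the following Python sketch could be used. It assumes the usual MPEG audio CRC-16 (polynomial 0x8005, initial value 0xFFFF, MSB first) computed over the last two header bytes plus the side info, with the 2 CRC bytes stored directly after the header. `mpeg_crc16` and `crc_matches` are names made up for this example.

```python
def mpeg_crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x8005, initial value 0xFFFF, MSB first."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc_matches(frame: bytes, side_info_len: int) -> bool:
    """Check the stored CRC of a protected Layer III frame.

    Assumed layout: 4 header bytes, 2 CRC bytes, then side_info_len bytes of
    side info (17 for mono, 32 for stereo in MPEG-1).
    """
    if len(frame) < 6 + side_info_len:
        return False
    stored = int.from_bytes(frame[4:6], "big")
    covered = frame[2:4] + frame[6:6 + side_info_len]
    return mpeg_crc16(covered) == stored
```

A file whose protection bit says "CRC present" but where `crc_matches` fails on every frame would then fit the ISO-reference-code behaviour described above, while occasional failures rather point to damage.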

MP3 Encoder Identification

Reply #17
Here is a list of potential points to be used as identification hints: (to be edited if needed)

*mpeg1/2/2.5
*mono/stereo/dual/ms/is/mixed/forced is/forced ms
*crc on/off/wrong
*long/short/mixed blocks
*use of bit reservoir
*variance of bit reservoir
*sfscale
*scfsi
*subblock gain
*region0
*region1
*use of different block size for both channels
*private bits
*use of huffman tables
*padding method
*ancillary data padding
*original/copyright
*free format by using free format frame size
*free format by regularly alternating normal frame sizes
*max big_values value (8191 or 8206)
*global gain

MP3 Encoder Identification

Reply #18
I would like to test your program... is there any way you can upload it or something? Thanks

MP3 Encoder Identification

Reply #19
Here is a list of known mp3 encoders (to be edited if needed):

Dist10
Blade
Soloh
Canna
Plugger
SCMPX

L3enc 2.0
L3enc post 2.0
Mp3enc
SWA export plug-in
Mp3Producer/Audioactive/ACM
Fastenc
FastEnc+mp3Pro

TomPG
XingV1
XingV2

QDesign
Uzura3
Shine

Lame
Gogo

ARM
Intel IPP

Mpegger

MP3 Encoder Identification

Reply #20
Quote
Here is a list of known mp3 encoders (to be edited if needed):
(...)

Where is my favorite one : Plugger™ ?


EDIT : added (message can be deleted)

MP3 Encoder Identification

Reply #21
Thanks, Gabriel!

1. The Huffman table selection (scalefac_compress, table_select) does not give any hint as far as I can see. Nearly all encoders use the same tables and leave the same ones unused. That's what I've seen so far, and the same goes for subblock gain.

2. I primarily want to make a tool for MPEG1 Layer III files. Once the detection for MP3 files is complete, I may think about adding support for MPEG2 or MPEG2.5 files as well as other layers (need info about the frame format).

3. Use of region0_count, region1_count, main_data_begin, scalefac_scale, scfsi, big_values and global_gain is already analyzed and used for identification, as well as the padding method (1-slot padding) and the use of SS/IS/MS/Mixed frames. Private bits (those from the side info) were never set as far as I could see.

4. I have just added the SCMPX encoder (thanks rjamorim) to the detection list, and it can easily be identified as it is based on ISO BUT uses the bit reservoir in a normal way(!). Seems that this one is tweaked a lot.

5. What I will have to check (and is not shown by the tool yet, so thanks for these suggestions):
- variance of bit reservoir
- statistics of "empty" frames (especially the "padding" in those)
- freeformat MP3s (can somebody explain the format?)

6. There are some encoders listed which aren't available for Win32 or don't work with Win2k. Any chance that some of you are willing to help (i.e. getting five example waves (losslessly compressed), encoding them at 128k and giving info about the encoder's capabilities)?

7. I try not to use frame header flags (private, copyright, original) as these can be altered too easily. As I said before, the MP3pro export thing in plain MP3 mode offers options to set/clear these flags individually, and 1-slot padding is optional too. So I don't want the detection algorithms to rely on them. The same goes for VBR headers. They will be used if e.g. LAME or FhG fastenc/mp3enc is detected (in order to obtain version numbers and other info), but I won't give them priority as these can be altered, exchanged or generated too easily, too. I am even thinking about adding a header repair and VBRI=>XING conversion function, as most players don't support VBRI headers, so I don't want the tool to get confused by its own output.

8. What's "different block size for both channels"? Will I have to decode main data? If so, it's probably a difficult thing, as I'm using Delphi and will have to "convert" C sources to Delphi ones...

But big thanks for all the info! I guess I will upload a pre-alpha version (user interface will be poor) either at the weekend or next week. Please note that I've tested mostly 128kbit MP3s and -some- 192kbit and VBR MP3s. It worked fine for them. I'll release a table of what I've tested so far. I still haven't tried it on mono and <44.1kHz files; the results for these will be a surprise, I guess!

EDIT: Plugger is already detected!

MP3 Encoder Identification

Reply #22
Quote
1. The Huffman table selection (scalefac_compress, table_select) does not give any hint as far as I can see. Nearly all encoders use the same tables and leave the same ones unused.

It could. I've read several papers about speed-optimized mp3 encoders (mainly for ARM cores). A frequent optimization is not to try every Huffman table, but only a few which should, overall, be the most efficient ones.
I think that perhaps you could discover that some tables are never used by some encoders.
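
A Python sketch of that idea: collect every table_select value seen in the side info and list the tables an encoder never picked. Table numbers 4 and 14 are left out because Layer III does not define them; the helper names are invented for this example, and the input is assumed to be the table_select values already read from the side info.

```python
from collections import Counter
from typing import Iterable, List

# Layer III big_values tables are numbered 0..31; 4 and 14 are not defined.
VALID_TABLES = [t for t in range(32) if t not in (4, 14)]


def table_usage(selected: Iterable[int]) -> Counter:
    """Count how often each table_select value occurs across all regions."""
    return Counter(selected)


def never_used(selected: Iterable[int]) -> List[int]:
    """Tables the standard allows but this stream never selected."""
    used = set(selected)
    return [t for t in VALID_TABLES if t not in used]


observed = [0, 1, 2, 15, 24, 24, 15, 1]
print(table_usage(observed).most_common(2))
print(never_used(observed))
```

Comparing the `never_used` lists of many files per encoder, at the same bitrate, would show whether the gaps are an encoder mark or just depend on the material.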

Quote
freeformat MP3s (can somebody explain the format?)

Described in the mp3 standard (I am assuming you have a copy).
Explanation: 2 ways of achieving freeformat

*using the free format in the bitrate index
*alternating between standard bitrates. Example: try encoding 20kbps with mp3enc.

MP3 Encoder Identification

Reply #23
If the bitrate index indicates free format, I've read that in fact any frame size may appear and the application has to work out that frame size N (or N+1 on frames with padding) on its own, right? That shouldn't be a problem.

For the second kind of free format: it is achieved by alternating between standard bitrates, so each frame should show a valid bitrate index, right? If so, is the "average" bitrate fixed in any case or not? Does it just alternate between two bitrates (i.e. if encoded at 144kbps, alternating between 128 and 160 only)?

EDIT: Nice info on the Huffman table selection! But as far as I have seen, no single encoder I've tried yet uses all Huffman tables (perhaps on low bitrates only?). Anyway, I'll take a further look at it.

EDIT2: Feel free to download this pre-pre-pre-alpha at http://fkr.nm.ru/encscanp1.zip and don't expect a user interface, hehe. You can just open one file at a time and scan it. If you encounter any problems (yes, it may crash on non-mp3 files) or wrongly identified or non-identified (especially these) MP3s, feel free to post or PM the log and tell me which encoder was used. VBR headers are detected but not used, just skipped (in order to get real frame information only). Source code included.

MP3 Encoder Identification

Reply #24
Quote
If the bitrate index indicates free format, I've read that in fact any frame size may appear and the application has to work out that frame size N (or N+1 on frames with padding) on its own, right? That shouldn't be a problem.

In this case frame size is constant
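
Since the size is constant, a reader can measure it once. A hedged Python sketch: take the first header with bitrate index 0 and look for the next position whose header agrees in all stream-level bits. `free_format_frame_size` and `SAME_STREAM_MASK` are names made up for this example, and a real tool would confirm the result on several consecutive frames, because audio data can imitate a header by chance.

```python
from typing import Optional

# Header bits expected to be identical in every frame of one stream:
# sync, version, layer, protection bit, bitrate index and sample rate.
SAME_STREAM_MASK = 0xFFFFFC00


def free_format_frame_size(buf: bytes, start: int = 0,
                           max_size: int = 4096) -> Optional[int]:
    """Size N of an unpadded free-format frame, measured from the data itself."""
    if len(buf) < start + 4:
        return None
    first = int.from_bytes(buf[start:start + 4], "big")
    if (first >> 21) != 0x7FF or ((first >> 12) & 0xF) != 0:
        return None   # no sync word, or bitrate index is not "free format"
    end = min(len(buf) - 4, start + max_size)
    for pos in range(start + 1, end + 1):
        candidate = int.from_bytes(buf[pos:pos + 4], "big")
        if ((candidate ^ first) & SAME_STREAM_MASK) == 0:
            padded = (first >> 9) & 1      # first frame carried a padding slot
            return pos - start - padded
    return None
```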

Quote
It is achieved by alternating between standard bitrates, so each frame should show a valid bitrate index, right? If so, is the "average" bitrate fixed in any case or not? Does it just alternate between two bitrates (i.e. if encoded at 144kbps, alternating between 128 and 160 only)?

The case I know is mp3enc alternating between 16 and 24kbps frames, in order to provide a "constant" bitrate of 20kbps
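
The arithmetic behind that, as a small Python sketch: with frames of equal duration, an even mix of 16 and 24 kbps frames averages out to 20 kbps. The helper names are invented for this example, and "exactly two distinct bitrates" is only a crude test for such simulated CBR; real VBR files normally use more than two.

```python
from typing import Sequence


def average_bitrate_kbps(frame_bitrates: Sequence[int]) -> float:
    """Mean bitrate over frames of equal duration (same sample rate)."""
    return sum(frame_bitrates) / len(frame_bitrates)


def is_two_rate_alternation(frame_bitrates: Sequence[int]) -> bool:
    """True if exactly two different bitrates occur in the stream."""
    return len(set(frame_bitrates)) == 2


frames = [16, 24] * 50
print(average_bitrate_kbps(frames))      # 20.0
print(is_two_rate_alternation(frames))   # True
```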