Some confusion about size of header-sideinfo mp3

Counting the number of bits used by the side info, I get 150 bits (about 18.8 bytes).
Note that the side info is defined to contain the following parts:
  • main_data_begin (9bits)
  • private_bits (5bits)
  • scalefactor_selection_info (4bits)
  • side_info_granule0(66bits)
  • side_info_granule1(66bits)

Here I assumed a single channel (mono) with the window_switching_flag set, so region2 is empty and only two regions are coded.
So my question is: why doesn't it add up to 136 bits (17 bytes), as the standard specifies it to be?
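For reference, the 18.8 bytes in the question comes from summing the field widths exactly as listed there. A minimal Python sketch of that arithmetic (the dictionary name is only illustrative):

```python
# Field widths in bits, exactly as listed in the question (mono).
question_fields = {
    "main_data_begin": 9,
    "private_bits": 5,
    "scalefactor_selection_info": 4,
    "side_info_granule0": 66,
    "side_info_granule1": 66,
}

total_bits = sum(question_fields.values())
total_bytes = total_bits / 8
print(total_bits, total_bytes)  # 150 18.75
```

That is 14 bits more than the expected 136, i.e. 7 bits too many per granule.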

Reply #1
[Resolved] The size of the side_info_granule I calculated was incorrect. To restate, these are the fields of side_info_granule (0 or 1) whose presence or size depends on the window_switching_flag:
  • block_type (2 bits, 4 bits)
  • mixed_block_flag (1 bit, 2 bits)
  • table_select (10 bits, 20 bits), or when window_switching_flag = 0, (15 bits, 30 bits)
  • subblock_gain (9 bits, 18 bits)
  • region0_count (4 bits, 8 bits)
  • region1_count (3 bits, 6 bits)

The first bit count in parentheses is for mono, the second for the other modes.
In the case that window_switching_flag = 1, region2 is empty, so only two regions are coded. Hence, in mono mode table_select takes 5 * 2 * 1 = 10 bits: 5 bits are needed to specify a Huffman table, 2 regions are coded, and there is 1 channel (mono mode). A similar argument goes for the dual channel case, and for window_switching_flag = 0, where all three regions are coded.
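The table_select count above (5 bits per Huffman table, times coded regions, times channels) can be written as a small helper. This is only a sketch of that arithmetic; the function and constant names are made up for illustration.

```python
HUFFMAN_TABLE_INDEX_BITS = 5  # bits needed to select one Huffman table


def table_select_bits(window_switching_flag: int, channels: int) -> int:
    """Bits used by table_select in one granule, over all channels."""
    # With the flag set, region2 is empty, so only two regions are coded.
    regions = 2 if window_switching_flag == 1 else 3
    return HUFFMAN_TABLE_INDEX_BITS * regions * channels


for flag in (1, 0):
    for channels in (1, 2):
        print(flag, channels, table_select_bits(flag, channels))
# 1 1 10
# 1 2 20
# 0 1 15
# 0 2 30
```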

Secondly, when window_switching_flag = 0, the fields block_type, mixed_block_flag and subblock_gain are not used.

When considering these issues, the number of bits adds up to 136 (mono) or 256 (dual channel). Namely, for mono mode and window_switching_flag = 0 we have for the window_switching_flag-dependent fields:
  • table_select (3 * 5 bits): 5 bits are needed to specify each Huffman table and 3 regions are coded (because window_switching_flag = 0)
  • block_type (2 bits), not used because window_switching_flag = 0
  • mixed_block_flag (1 bit), not used for the same reason
  • subblock_gain (9 bits), not used for the same reason
  • total = 15 bits;

for the rest of the side_info_granule fields:
  • part2_3_length (12bits)
  • big_values (9bits)
  • global_gain (8bits)
  • scalefac_compress (4bits)
  • preflag (1bit)
  • scalefac_scale (1bit)
  • count1table_select (1bit)
  • window_switching_flag (1bit)
  • region0_count(4bits)
  • region1_count (3bits)
  • total = 44  bits;

for the general fields that aren't part of the side_info_granule field:
  • main_data_begin (9bits)
  • private_bits (5bits)
  • scalefactor_selection_info (4bits)
  • total = 18 bits

So total bits per granule: 15 + 44 = 59 bits. There are two granules per mp3 frame so we have for the total size of the sideinfo: 2 * 59 + 18 = 136 bits = 17 bytes.
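The whole tally can be checked with a short Python sketch. Two things here go beyond the mono, window_switching_flag = 0 case worked out above and should be read as my assumptions about the layout: for two-channel modes private_bits is taken as 3 bits and scalefactor_selection_info as 4 bits per channel, and with window_switching_flag = 1 the fields region0_count and region1_count are taken as not transmitted. The function names are illustrative, and the fields are grouped by whether they depend on the flag, so region0_count and region1_count sit in the flag-dependent group here.

```python
def granule_bits(window_switching_flag: int) -> int:
    """Side info bits for one granule of one channel."""
    common = {
        "part2_3_length": 12,
        "big_values": 9,
        "global_gain": 8,
        "scalefac_compress": 4,
        "window_switching_flag": 1,
        "preflag": 1,
        "scalefac_scale": 1,
        "count1table_select": 1,
    }
    if window_switching_flag == 1:
        # Assumption: region counts are implicit here, not transmitted.
        dependent = {
            "block_type": 2,
            "mixed_block_flag": 1,
            "table_select": 2 * 5,   # two regions, 5 bits each
            "subblock_gain": 3 * 3,  # three windows, 3 bits each
        }
    else:
        dependent = {
            "table_select": 3 * 5,   # three regions, 5 bits each
            "region0_count": 4,
            "region1_count": 3,
        }
    return sum(common.values()) + sum(dependent.values())


def side_info_bits(channels: int, window_switching_flag: int) -> int:
    """Total side info bits per frame (two granules per frame)."""
    # Assumption: 5 private bits for mono, 3 for two-channel modes.
    private_bits = 5 if channels == 1 else 3
    general = 9 + private_bits + 4 * channels  # main_data_begin + private + scfsi
    return general + 2 * channels * granule_bits(window_switching_flag)


for channels in (1, 2):
    for flag in (0, 1):
        bits = side_info_bits(channels, flag)
        print(channels, flag, bits, bits // 8)
# 1 0 136 17
# 1 1 136 17
# 2 0 256 32
# 2 1 256 32
```

Under these assumptions the granule is 59 bits for either flag value, which is why the side info size does not depend on the flag.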

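Indeed, this is consistent with the standard. A similar argument goes for the other modes and for window_switching_flag = 1. In that case block_type, mixed_block_flag, table_select (two regions) and subblock_gain take 2 + 1 + 10 + 9 = 22 bits, but region0_count and region1_count (4 + 3 = 7 bits) are not transmitted, so a granule is again 44 - 7 + 22 = 59 bits. Counting those 7 bits anyway is what gave the 66 bits per granule in my original post.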

Sources:
  • D. Salomon, Data Compression: The Complete Reference, textbook, pp. 818-819
  • P. Sripada, MP3 Decoder in Theory and Practice, master's thesis, pp. 24-25
  • R. Raissi, The Theory Behind Mp3, paper, pp. 13-17