How to determine gapless information in runtime
Piligrim
post Apr 2 2011, 17:30
Post #1





Group: Members
Posts: 26
Joined: 28-October 07
From: Los Angeles
Member No.: 48279



Hi, I could not find exact information anywhere on how to determine the gapless information for AAC in an MP4 container. So far I've figured out from various sources and experiments that gapless playback with MP4 requires a different approach depending on the encoder:

itunes: no gapless, maybe drop first frame (like in faad)
faac: skip first frame and make last frame shorter based on stts
nero: skip first two frames + 576 samples of encoder delay and make last frame shorter based on stts

First of all, are these assumptions correct? I'm using foobar2000 and the Nero decoder as a reference. Is there any other information in the container that I can use instead of checking which tool the file was encoded with?

Thank you.
Alex B
post Apr 2 2011, 20:50
Post #2





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE (Piligrim @ Apr 2 2011, 19:30) *
itunes: no gapless, maybe drop first frame (like in faad)

iTunes uses its own system. It places the gapless decoding data in the ITUNSMPB tag.

Here is my recent reply from another forum. I gathered the info from old HA threads:

QUOTE
iTunes and the recent versions of the Nero encoder store the gapless playback info in the ITUNSMPB tag. The values are stored in hex format as follows (in this example: Nero LC, HE and HE+PS):

Field 2 = encoder delay, field 3 = padding, field 4 = original sample count
LC
00000000 00000A40 000003E4 000000000004B5DC 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
HE
00000000 00000920 000003F2 0000000000025AEE 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
HE+PS
00000000 00000AF8 0000021A 0000000000025AEE 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

Here are the values after conversion to decimal (encoder delay, padding, original sample count):
LC       2624    996   308700
HE       2336   1010   154350
HE+PS    2808    538   154350

Apparently the original sample count in the HE versions is given for the LC core, which runs at half of the reconstructed (output) sample rate. That is why the HE values are half of the LC value.

In addition, you must take into account the decoder delay.
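The hex-to-decimal conversion above can be sketched in a few lines of Python. This is only an illustration: it assumes the tag value is the space-separated hex string shown in the examples, with the second, third and fourth fields holding the encoder delay, padding and original sample count. The names `GaplessInfo` and `parse_itunsmpb` are made up for this sketch, and the decoder delay mentioned above is not handled here.

```python
from typing import NamedTuple


class GaplessInfo(NamedTuple):
    delay: int    # encoder delay (samples to skip at the start)
    padding: int  # filler samples at the end of the last frame
    samples: int  # original sample count


def parse_itunsmpb(value: str) -> GaplessInfo:
    """Parse an ITUNSMPB value (field layout assumed from the examples above)."""
    fields = value.split()
    if len(fields) < 4:
        raise ValueError("expected at least four hex fields")
    # fields[0] is zero in every example in this thread; it is ignored here.
    delay, padding, samples = (int(f, 16) for f in fields[1:4])
    return GaplessInfo(delay, padding, samples)


lc = "00000000 00000A40 000003E4 000000000004B5DC 00000000 00000000"
print(parse_itunsmpb(lc))
# prints: GaplessInfo(delay=2624, padding=996, samples=308700)
```

For the HE and HE+PS values, the sample count would presumably have to be doubled to get the length at the output sample rate, as noted above.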


I also noticed a bug in an early iTunes v.10.x build:

QUOTE
FYI, I just noticed that the ITUNSMPB tag values in my "iTunes HE-AAC (SBR)" sample files are slightly off. This was caused by a buggy iTunes version (an early v.10 build).

I just installed the latest iTunes version (10.2.1.1) and the bug seems to be fixed. It produces values that are identical to the Nero and qaac/QuickTime values.

qaac is a frontend for the QuickTime encoder.

This post has been edited by Alex B: Apr 2 2011, 21:19


--------------------
http://listening-tests.freetzi.com
Piligrim
post Apr 3 2011, 09:50
Post #3





Group: Members
Posts: 26
Joined: 28-October 07
From: Los Angeles
Member No.: 48279



Hi,

Thanks a lot! So if the ITUNSMPB tag is present I use it, and if it's not I fall back to my previous approach?
Alex B
post Apr 3 2011, 22:39
Post #4





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



I don't know what system FAAC uses, but apparently the Nero encoder uses the chapter list*:

QUOTE (menno @ Nov 27 2008, 18:06) *
Yeah ctts and stts should not be used, although current encoder still creates this I think. We use the chapter list.

It would be nice if someone could explain how the Nero encoder uses the chapter list for storing the gapless decoding data.

Some related links (in no particular order):

http://www.hydrogenaudio.org/forums/index....showtopic=67518
http://www.hydrogenaudio.org/forums/index....showtopic=16846
http://www.hydrogenaudio.org/forums/index....showtopic=34989
http://www.hydrogenaudio.org/forums/index....showtopic=35482


* EDIT ...and also (as I said above):
QUOTE
From the NeroAacEnc change log - 2009-12-17 - Version 1.5.1.0

- Write iTunes compatible gapless data


This post has been edited by Alex B: Apr 3 2011, 23:00


--------------------
http://listening-tests.freetzi.com
kode54
post Aug 13 2013, 00:02
Post #5





Group: Admin
Posts: 4505
Joined: 15-December 02
Member No.: 4082



Bumping because it's useful knowledge, and also because I have something to add.

It seems that iTunes needs SBR and PS files to include a padding field that brings the duration up to a multiple of 2048 instead of 1024, or at least needs an extra packet of padding to cover the decoder delay. This is based on a bit of experimenting with the XLD fdk-aac PlugIn, against which I have a pull request. I'm not sure if I did it exactly right, though.

It was fun testing the files against foobar2000, too, because foobar appears to completely disregard the padding field and only uses the delay and total length fields. iTunes, on the other hand, will disregard the gapless information if the padding field is too small, resulting in gapped files. The previous version of the encoder plug-in would hack in a shorter delay field, which iTunes presumably dodged around, as the files would play gaplessly there, but not in foobar2000.
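To make the difference concrete, here is a rough Python sketch of the two views. `trim_gapless` uses only the delay and the total length, the way foobar2000 appears to behave according to the above. `expected_padding` computes the padding that rounds delay + samples up to a whole number of frames, which is the kind of consistency iTunes seems to care about. Both function names are hypothetical, the `frame` default of 1024 is an assumption, and passing 2048 for SBR/PS only reflects the tentative observation above, not a confirmed rule.

```python
def trim_gapless(pcm, delay, samples):
    """Drop `delay` leading samples, then keep `samples`; padding is never consulted."""
    return pcm[delay:delay + samples]


def expected_padding(delay, samples, frame=1024):
    """Padding that makes delay + samples + padding a multiple of `frame`."""
    return -(delay + samples) % frame


# Values from the ITUNSMPB examples earlier in the thread (Nero LC, HE, HE+PS).
for delay, samples in [(2624, 308700), (2336, 154350), (2808, 154350)]:
    print(delay, samples, expected_padding(delay, samples))
```

With the default frame size this reproduces the padding values 996, 1010 and 538 from the earlier examples.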
lvqcl
post Aug 13 2013, 16:06
Post #6





Group: Developer
Posts: 3221
Joined: 2-December 07
Member No.: 49183



More information about HE-AAC and gapless playback: http://www.hydrogenaudio.org/forums/index....showtopic=98450
nu774
post Aug 14 2013, 11:44
Post #7





Group: Developer
Posts: 477
Joined: 22-November 10
From: Japan
Member No.: 85902



QUOTE (Alex B @ Apr 4 2011, 06:39) *
It would be nice if someone could explain how the Nero encoder uses the chapter list for storing the gapless decoding data.

As far as I know, older versions of the Nero encoder used a non-standard Nero-style chapter list (udta.chpl) to declare the delay, and stts (Decoding Time to Sample Box) to denote the valid length of the final frame.

If you don't know the MP4 stts box, you can roughly think of it as a table holding the length of each frame. For an audio codec with a constant frame size such as AAC, it usually has only one entry, whose length is 1024 or similar.
In Nero's case, it has an extra entry for the final frame that holds the valid length of that frame (shorter than 1024; a player has to trim the decoded audio down to that length after decoding).
This is the same method that ALAC in m4a uses.
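As a rough illustration of the stts idea, here is a Python sketch that reads the table and sums it up. It assumes `payload` is the stts box contents without the 8-byte box header (version and flags, entry count, then pairs of 32-bit big-endian sample_count and sample_delta), and that the track timescale equals the sample rate. The function names and the example numbers are made up, not taken from a real Nero file.

```python
import struct


def parse_stts(payload: bytes):
    """Return [(sample_count, sample_delta), ...] from an stts payload."""
    _version_flags, entry_count = struct.unpack_from(">II", payload, 0)
    return [struct.unpack_from(">II", payload, 8 + 8 * i) for i in range(entry_count)]


def total_duration(entries):
    """Sum of count * delta over all entries, in timescale units."""
    return sum(count * delta for count, delta in entries)


# Hypothetical table: 304 full frames of 1024, plus a final frame with 28 valid samples.
payload = struct.pack(">II", 0, 2) + struct.pack(">IIII", 304, 1024, 1, 28)
entries = parse_stts(payload)
print(entries, total_duration(entries))
```

A player following this scheme would decode the last frame in full and keep only the number of samples given by the final entry.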

Finally, since Nero-style chapters can also be used for their ordinary purpose, I think the treatment of the first entry in the chapter list becomes ambiguous.
It seems that fb2k always expects the first entry to denote the delay when a Nero-style chapter list is present.

This post has been edited by nu774: Aug 14 2013, 11:52
