Since no one seems to have taken on this task, I read the MP4 file format specification, installed the latest iTunes on my machine, and downloaded the latest mpeg4ip tools to dump MP4 streams.
I googled and found a thread on implementing the gapless solution (but I later found the original thread, which contained an important piece of information, and realized that I shouldn't have taken this approach) and implemented a tool to set the "ctts" and "stts" MP4 boxes in iTunes' MP4 files (again, don't take this approach).
I measured iTunes' encoder delay and found it to be probably 1088 samples (= 1024 + 64?). Then I modified the MP4 stream to store gapless-playback information. The following is an example dump of an MP4 stream that my experimental program generated:
CODE
type stts
version = 0 (0x00)
flags = 0 (0x000000)
entryCount = 2 (0x00000002)
sampleCount = 431 (0x000001af)
sampleDelta = 1024 (0x00000400)
sampleCount[1] = 1 (0x00000001)
sampleDelta[1] = 744 (0x000002e8)
type ctts
version = 0 (0x00)
flags = 0 (0x000000)
entryCount = 2 (0x00000002)
sampleCount = 1 (0x00000001)
sampleOffset = 1088 (0x00000440)
sampleCount[1] = 431 (0x000001af)
sampleOffset[1] = 0 (0x00000000)
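
For anyone who wants to reproduce the experiment, here is a minimal sketch of how the two boxes dumped above could be serialized. It is not the tool I actually used; the helper name `full_box` is made up for illustration, and it only covers version 0 boxes whose table rows are pairs of 32-bit values, which is what the dump shows.

```python
import struct

def full_box(box_type, entries, version=0, flags=0):
    """Serialize a version-0 'stts' or 'ctts' box (hypothetical helper).

    entries is a list of (sampleCount, sampleDelta) pairs for 'stts'
    or (sampleCount, sampleOffset) pairs for 'ctts'.
    """
    # 1 byte of version followed by 3 bytes of flags
    payload = struct.pack(">I", (version << 24) | flags)
    # entryCount, then one pair of big-endian 32-bit values per entry
    payload += struct.pack(">I", len(entries))
    for count, value in entries:
        payload += struct.pack(">II", count, value)
    # box header: 32-bit size (header included) and 4-character type
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Values taken from the dump above
stts = full_box(b"stts", [(431, 1024), (1, 744)])
ctts = full_box(b"ctts", [(1, 1088), (431, 0)])
```

Writing the boxes back into a file also requires fixing the sizes of the enclosing boxes, which this sketch does not attempt.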
RESULT:
In short, I could not get gapless playback/decoding with foobar2000/faad. foobar2000 displays the song lengths I expected:
01-itunes-1088.m4a: 441000 (= 431 * 1024 + 744 - 1088)
01-itunes.m4a: 442368 (= 432 * 1024)
but foobar2000 and faad won't remove the samples at the beginning that come from the encoder delay of iTunes' AAC encoder.
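
The length arithmetic above can be checked with a few lines. This is only a sketch: `track_length` is a hypothetical helper, and subtracting the first "ctts" offset as the encoder delay is my reading of how the displayed length comes out, not something I confirmed in the foobar2000 source.

```python
def track_length(stts_entries, encoder_delay=0):
    """Sum sampleCount * sampleDelta over all 'stts' entries,
    then subtract the assumed encoder delay (in samples)."""
    total = sum(count * delta for count, delta in stts_entries)
    return total - encoder_delay

# 01-itunes-1088.m4a: modified 'stts' plus the 1088-sample delay
print(track_length([(431, 1024), (1, 744)], encoder_delay=1088))  # 441000

# 01-itunes.m4a: untouched file, 432 full frames
print(track_length([(432, 1024)]))  # 442368
```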
REASON FOR FAILURE:
I found a post saying "don't use 'ctts' and 'stts' boxes for gapless playback" in the original HA thread, which does not exist in Google's cache. Now I realize why foobar and faad did not implement "ctts" for removing samples at the beginning.
HOW GAPLESS PLAYBACK IS ACHIEVED IN FAAC:
I have no idea how faac implements gapless playback. To remove the padded samples, we can use the duration field in the 'mdhd' MP4 box instead of 'stts'. But I was wondering how the decoder removes the samples that come from the FAAC encoder's delay. So I compared the waveforms of the following, in this order: the original wave; iTunes (delay = 1024; for debugging purposes only); iTunes (delay = 1088); iTunes (no delay information); faac (MP4 stream); and faac (AAC stream):
http://nyaochi.sakura.ne.jp/temp/mp4-delay.png
All streams made from iTunes have the same delay even though I added delay information. Another interesting thing is that the AAC stream does not have any delay. AFAIK, an AAC stream does not contain gapless playback information, right? If so, the encoder delay of FAAC turns out to be zero...
In conclusion, I found a way to remove the padded samples, but no way to remove the samples at the beginning of a track that come from the encoder's delay. Does anyone know how to store the encoder's delay in an MP4 stream? I'm disappointed to have wasted my weekend...

I've gotta sleep.