Hi Sebastian
I recognise your name - from xtremevbtalk, of course!
My guess is this: for each frame, the encoder determines the minimum size in bytes, MINS, required to encode the next set (1152) of samples (given the requested quality options, etc).
Then it adds 4 bytes for the header, or 6 if a CRC is requested (4-byte header plus 2-byte CRC).
It then selects the minimum bitrate that corresponds to a frame of at least this size, and encodes it at that rate.
The encoding algorithms never seem to waste space, so the final encoding always fills the frame. My further guess is that when there is room to spare, extra sample data that is "locally redundant" (that is, not essential for the requested settings) is added in. In other words, the "quality" of the frame is improved when there is room.
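A rough Python sketch of the guess above, just to make it concrete. It is only the theory in code form, not LAME's actual logic. The names `frame_size` and `pick_frame` are made up for illustration, and it assumes MPEG-1 Layer III at 44.1 kHz with no padding byte, which gives the 522/626/731/835/1044 sizes that appear in the listing further down.

```python
# Sketch of the guessed selection rule; NOT LAME's real algorithm.
# Assumption: MPEG-1 Layer III, 44.1 kHz, no padding byte.
BITRATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATE = 44100


def frame_size(bitrate_kbps, padding=0):
    """Frame length in bytes for MPEG-1 Layer III (1152 samples per frame)."""
    return 144 * bitrate_kbps * 1000 // SAMPLE_RATE + padding


def pick_frame(mins, crc):
    """Smallest standard frame that holds `mins` bytes plus header (and CRC)."""
    needed = mins + (6 if crc else 4)  # 4-byte header, plus 2-byte CRC if asked
    for kbps in BITRATES_KBPS:
        size = frame_size(kbps)
        if size >= needed:
            return kbps, size
    raise ValueError("payload does not fit in any single frame")


# A payload that only just fits at 192 kbps gets bumped one step by the CRC.
print(pick_frame(621, crc=False))
print(pick_frame(621, crc=True))
```

If this were the whole story, a CRC frame could only ever be the same size as its non-CRC twin or one step bigger, never smaller.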
This lovely little theory doesn't quite sit with reality, however. I encoded a wav file to mp3 VBR with LAME 3.97, with and without CRC, with highest quality options (-V0 -q0), and then stripped off any LAME-tagged frames.
Both files have 8435 frames, of which 2471 differ in size between the two files. If the theory held, I'd expect all of these 2471 frames to be one "size" (one bitrate step) bigger in the CRC version, these being the frames where the local redundancy didn't leave room for the 2-byte CRC.
Not so! Of those 2471 frames, only 1315 are bigger in the CRC version, while the other 1156 are bigger in the non-CRC version.
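For anyone who wants to repeat the count, here is a minimal Python sketch. It assumes you already have the per-frame sizes of both files as two equal-length lists; `compare_frame_sizes` is a hypothetical helper name, and extracting the sizes from the MP3 files is not shown.

```python
def compare_frame_sizes(no_crc, crc):
    """Return (frames bigger in CRC file, frames bigger in non-CRC file)."""
    if len(no_crc) != len(crc):
        raise ValueError("both files must have the same number of frames")
    crc_bigger = sum(1 for a, b in zip(no_crc, crc) if b > a)
    no_crc_bigger = sum(1 for a, b in zip(no_crc, crc) if a > b)
    return crc_bigger, no_crc_bigger


# Tiny made-up input: four frames, the last one identical in both files.
print(compare_frame_sizes([522, 626, 522, 626], [626, 522, 626, 626]))
```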
Here's an extract (front and back) from a listing of the frame sizes wherever they differ. The "net diff" is the running difference between the total sizes of all frames up to that point (noCRC minus CRC); a negative value means the CRC version is bigger.
Note that there are several points where the CRC version, cut off at that frame, would be smaller than the non-CRC version, the last such occurrence being frame 282 (the CRC file ends up about 15k bytes longer overall).
So it is definitely true that an extract from a non-CRC file can be longer than the same extract from the CRC version.
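A listing like this can be produced with a few lines of Python. This is only a sketch under assumptions: the frame sizes are already available as two lists, `diff_listing` is a made-up name, and whether frame numbering starts at 0 or 1 is left as a parameter.

```python
def diff_listing(no_crc, crc, first_frame=1):
    """Yield (frame_no, no_crc_size, crc_size, net) for frames that differ.

    `net` is the running total of (no_crc - crc) over ALL frames so far,
    so a negative value means the CRC file is bigger up to that frame.
    """
    net = 0
    for frame_no, (a, b) in enumerate(zip(no_crc, crc), start=first_frame):
        net += a - b
        if a != b:
            yield frame_no, a, b, net


# Illustrative input only: frames 21 and 24 are equal and so are skipped.
for row in diff_listing([522, 522, 626, 522], [522, 626, 522, 522], first_frame=21):
    print(*row)
```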
The obvious question now is: given that frames are meant to be more or less independent (i.e. can be copied and pasted together arbitrarily), why do so many non-CRC frames get shortened in the CRC version?
Frame no.  noCRC size  CRC size  Net diff
--------------------------
22 -522 +626 -104
23 +626 -522 0
24 -522 +626 -104
25 +626 -522 0
26 -522 +626 -104
28 +626 -522 0
31 -522 +626 -104
32 +626 -522 0
33 -522 +626 -104
34 +626 -522 0
36 -522 +626 -104
37 +626 -522 0
38 -522 +626 -104
42 +626 -522 0
43 -522 +626 -104
47 +626 -522 0
48 -522 +626 -104
49 +626 -522 0
50 -522 +626 -104
70 -626 +731 -209
71 +731 -626 -104
72 -626 +731 -209
73 +731 -626 -104
76 +731 -626 +1
77 -626 +731 -104
78 +731 -626 +1
82 -626 +731 -104
83 +731 -626 +1
84 -626 +731 -104
85 +731 -626 +1
87 -626 +731 -104
88 +731 -626 +1
89 -731 +835 -103
90 +731 -626 +2
93 -835 +1044 -207
94 +731 -626 -102
95 +731 -626 +3
104 +835 -731 +107
106 -626 +731 +2
107 +731 -626 +107
108 -626 +731 +2
112 -731 +835 -102
113 +731 -626 +3
114 -731 +835 -101
116 +731 -626 +4
117 -626 +731 -101
118 +731 -626 +4
119 -626 +731 -101
136 -626 +731 -206
137 +731 -626 -101
138 -626 +731 -206
139 +731 -626 -101
141 -626 +731 -206
142 +731 -626 -101
155 -835 +1044 -310
156 +731 -626 -205
157 +731 -626 -100
171 -731 +835 -204
172 +1044 -835 +5
173 -626 +731 -100
175 -626 +731 -205
176 +731 -626 -100
181 -626 +731 -205
182 +731 -626 -100
205 +731 -626 +5
206 -626 +731 -100
207 +731 -626 +5
208 -626 +731 -100
212 +835 -731 +4
221 -626 +731 -101
222 +731 -626 +4
225 +731 -626 +109
226 -626 +731 +4
241 -626 +731 -101
242 +835 -731 +3
246 -835 +1044 -206
247 +1044 -835 +3
249 -626 +731 -102
251 +731 -626 +3
252 -626 +731 -102
281 -626 +731 -207
282 +1044 -835 +2 (Last time noCRC > CRC)
283 -835 +1044 -207
284 +731 -626 -102
...
...
8386 +626 -522 -15338
8387 -417 +522 -15443
8394 +522 -417 -15338
8395 -417 +522 -15443
8399 -417 +522 -15548
8400 +522 -417 -15443
8404 -417 +522 -15548
8405 +522 -417 -15443
8408 -417 +522 -15548
8409 +522 -417 -15443
8410 -417 +522 -15548
8411 +522 -365 -15391
8412 -365 +417 -15443
8413 -365 +417 -15495
8414 +522 -417 -15390
8416 -365 +522 -15547
8417 +522 -365 -15390
8418 -365 +417 -15442
8419 -365 +417 -15494
8431 -313 +365 -15546
8432 +365 -313 -15494
8433 -313 +365 -15546
BTW, the "+" and "-" signs for the frame sizes are just there to indicate which one is bigger.