Concatenating MP3 streams |
Post #1 | May 17 2009, 22:00
Group: Members | Posts: 2 | Joined: 17-May 09 | Member No.: 69906
Hi all,

I want to implement a simple streaming server (similar to SHOUTcast, but much simpler) with one additional feature: inserting targeted elements (e.g. ads) into each user's stream, based on the user's IP, location, etc. Let's assume I will be using an MP3 stream. For efficiency reasons I cannot re-encode the stream on the server, so I have my "MAIN" MP3 stream (obtained from some external source) and a bunch of prepared MP3 files (the targeted elements).

Since an MP3 stream is just a sequence of frames, it looks as if I can simply concatenate those MP3 files with the main stream at some points (on frame boundaries) and everything will be OK. BUT I have read a bit about MP3 (e.g. the LAME tech FAQ, http://lame.sourceforge.net/tech-FAQ.txt) and I know there is this "bit reservoir" stuff, overlapping MDCT frames, etc.

My questions are:

If I concatenate two MP3 files that were prepared with LAME using the '--nores' option (which turns off the bit reservoir), will everything work seamlessly? I can understand that there could be a bit of silence (because some of the first samples in an MP3 file are useless, etc.), but is there a possibility that artifacts will show up? In other words, does turning off the bit reservoir really make MP3 frames independent and ready for simple concatenation with other MP3 files?

What about the overlapping MDCTs? Do they overlap only within one physical frame? Or, even if they overlap across frames, is it no big deal for a decoder to decode concatenated streams without artifacts?

I hope my doubts are clear. Thanks for your help.

P.S. Maybe there are lossy formats better suited to this kind of simple concatenation (Ogg Vorbis, AAC)?

Cheers, Kamil
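The "concatenate on frame boundaries" idea from the question can be sketched roughly as follows. This is only an illustration, assuming MPEG-1 Layer III data with no tags and no leading junk; the function names (`frame_length`, `split_frames`, `splice`) are hypothetical, and the sketch ignores the bit reservoir and MDCT overlap discussed in the rest of the thread.

```python
# Bitrate (kbps) and sample-rate tables for MPEG-1 Layer III headers.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def frame_length(header):
    """Return the frame size in bytes given a 4-byte MPEG-1 Layer III header."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync at this position")
    version = (header[1] >> 3) & 0x3  # 3 means MPEG-1
    layer = (header[1] >> 1) & 0x3    # 1 means Layer III
    if version != 3 or layer != 1:
        raise ValueError("sketch only handles MPEG-1 Layer III")
    bitrate = BITRATES_KBPS[header[2] >> 4]
    rate = SAMPLE_RATES_HZ[(header[2] >> 2) & 0x3]
    if bitrate is None or rate is None:
        raise ValueError("free-format or reserved header value")
    padding = (header[2] >> 1) & 0x1
    return 144 * bitrate * 1000 // rate + padding


def split_frames(data):
    """Walk the data frame by frame and return a list of whole frames."""
    frames = []
    pos = 0
    while pos + 4 <= len(data):
        size = frame_length(data[pos:pos + 4])
        if pos + size > len(data):
            break  # incomplete trailing frame, drop it
        frames.append(data[pos:pos + size])
        pos += size
    return frames


def splice(main, insert, at_frame):
    """Insert the bytes of `insert` before frame number `at_frame` of `main`."""
    frames = split_frames(main)
    return b"".join(frames[:at_frame]) + insert + b"".join(frames[at_frame:])
```

A real server would also have to resynchronise after corrupt data and strip any tags from the inserted files before splicing.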

Post #2 | May 18 2009, 16:51
Group: Members | Posts: 366 | Joined: 17-September 06 | Member No.: 35307
Reservoir means that frame n's audio data may include information stored in frame n-1, n-2, etc. None of it is stored in frame n+1.

If you're not breaking up an existing MP3 file into two halves, but instead are waiting until the end of music1.mp3 and inserting an advertisement file insert.mp3 before the next file music2.mp3 is concatenated, there's no problem with the bit reservoir. It will have been used up in the last frame of music1.mp3. That last frame's payload may be padded with null data, which gets ignored, if the frame size is larger than the space required to encode its audio.

insert.mp3 likewise will have been encoded from its first frame, so it cannot use the bit reservoir unless a null silent frame has been inserted. Null silent frames are unusual in files that have been encoded and left alone, and will probably only exist in files created using pcutmp3. That tool splits MP3 files in two, preserving the bit reservoir in the second file in this way and setting the gapless information in the LAME header (itself a silent frame) to point to the audio at the true start point. This isn't very widespread, because few playback devices and programs support the gapless (accurate length) feature of LAME MP3; foobar2000, Rockbox and LAME are the only ones I know of. Similarly, music2.mp3 will not begin with any reservoir, for the same reason.

If you ARE actually interrupting the music by splitting one MP3 file, then you'll either need to do what pcutmp3 does to transfer the bit reservoir to the part after your insert.mp3, or simply ignore the reservoir. If you're smart and wait for silence before and after the point where you include insert.mp3 (which is common on track boundaries, if you know where they are), there's likely to be no audible bit reservoir usage at such a point.

Also beware of very large tags. I guess your files/streams would be untagged, but a large embedded album art tag could cause other problems with streaming.
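Two of the checks described above can be sketched in a few lines: reading how far back a frame reaches into the bit reservoir, and measuring a leading ID3v2 tag so it can be skipped before streaming. This is a rough sketch, assuming MPEG-1 Layer III frames (where the side information starts with a 9-bit `main_data_begin` back-pointer, in bytes) and the standard 10-byte ID3v2 header with a synchsafe size; the function names are hypothetical.

```python
def id3v2_tag_length(data):
    """Return the total size of a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size = 0
    for byte in data[6:10]:
        size = (size << 7) | (byte & 0x7F)  # synchsafe: 7 useful bits per byte
    footer = 10 if data[5] & 0x10 else 0    # optional ID3v2.4 footer
    return 10 + size + footer


def main_data_begin(frame):
    """Return how many bytes of earlier frames this frame's audio reaches into.

    0 means the frame does not depend on the bit reservoir at all.
    """
    if len(frame) < 4 or frame[0] != 0xFF or (frame[1] & 0xFE) != 0xFA:
        raise ValueError("sketch only handles MPEG-1 Layer III frames")
    # Protection bit 0 means a 16-bit CRC sits between header and side info.
    offset = 4 if frame[1] & 0x01 else 6
    if len(frame) < offset + 2:
        raise ValueError("frame too short to contain side info")
    return (frame[offset] << 1) | (frame[offset + 1] >> 7)
```

Under these assumptions, a frame whose `main_data_begin` is 0 is a cut point that loses no reservoir data, which matches the advice to cut where little or no reservoir is in use.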

Post #3 | May 18 2009, 17:30
Group: Members | Posts: 366 | Joined: 17-September 06 | Member No.: 35307
On --nores: turning off the reservoir will either cause increased bitrate at the same quality (for VBR) or worse quality at the same bitrate (for CBR, except for 320 kbps CBR, which refuses to use the reservoir in recent LAME to avoid offending Windows Media Player's decoder). If you use CBR, you're probably better off ignoring the reservoir when splitting the files (a brief, subtle degradation at a point where it's probably silent and the audio is getting interrupted anyway) than causing degraded quality throughout the file, especially for transients.

On other formats: MP3 is so universally supported that, unless you're writing a client player (like Spotify, who use Vorbis), it's the easy route. aoTuV Vorbis and AAC (iTunesEncode or Nero) are usually considered superior to LAME MP3 at lower bitrates (below roughly 120 kbps for stereo material). Vorbis scales well to low bitrates (32-64 kbps), while AAC-LC isn't great below 64 kbps, where the more complex HE-AAC is better. If low bitrate is important enough, they can be used.
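To put rough numbers on the CBR trade-off, the per-frame byte budget can be worked out as below. This is a back-of-the-envelope sketch, assuming MPEG-1 Layer III stereo (1152 samples per frame, 4-byte header, 32 bytes of side information, no CRC); the 511-byte figure is only the upper bound implied by the 9-bit reservoir pointer, real encoders are constrained further, and `frame_budget` is a hypothetical name.

```python
SAMPLES_PER_FRAME = 1152          # MPEG-1 Layer III
HEADER_BYTES = 4
STEREO_SIDE_INFO_BYTES = 32       # MPEG-1 stereo, no CRC assumed
MAX_RESERVOIR_REACH = 2 ** 9 - 1  # 9-bit back-pointer, counted in bytes


def frame_budget(bitrate_kbps, sample_rate_hz):
    """Return (frame_bytes, own_payload_bytes, duration_ms) for an unpadded CBR frame."""
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz
    own_payload = frame_bytes - HEADER_BYTES - STEREO_SIDE_INFO_BYTES
    duration_ms = 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz
    return frame_bytes, own_payload, duration_ms


if __name__ == "__main__":
    size, payload, ms = frame_budget(128, 44100)
    print("frame bytes:", size)
    print("payload without reservoir:", payload)
    print("payload bound with reservoir:", payload + MAX_RESERVOIR_REACH)
    print("frame duration (ms):", round(ms, 2))
```

With --nores, every frame has to fit its audio into its own payload, which is why demanding passages such as transients suffer at a fixed bitrate.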

Post #4 | May 18 2009, 22:30
Group: Members | Posts: 2 | Joined: 17-May 09 | Member No.: 69906
Yes, I'll probably interrupt the music by splitting one MP3 stream (on frame boundaries, of course).

For now let's assume that I will use the --nores option, so the bit reservoir will be no problem. What about the overlapping MDCT frames in the actual audio data (see point 3.1 in the LAME tech FAQ, http://lame.sourceforge.net/tech-FAQ.txt)? Could they introduce any artifacts? Or does it depend on whether I cut at a silent point? Of course, maybe I am not understanding this MDCT stuff correctly.

Thanks for your response.

P.S. Is turning off the bit reservoir (--nores in LAME) really a bad option?

P.S.2. About other codecs: I know that AAC (especially HE-AAC v2) is really good at low bitrates (Ogg Vorbis possibly too), which is important for e.g. Internet radio stations. But my question was actually about the complexity of concatenating encoded streams in these formats. Do you know anything about this?

Cheers, Kamil

Post #5 | May 19 2009, 22:26
Group: Members | Posts: 366 | Joined: 17-September 06 | Member No.: 35307
The MDCT lapping window function will help smooth the transition. Inserting an alien chunk of audio into the middle of a song is the bigger artifact !! ;)
The impact of --nores depends a lot on your other settings (VBR/CBR and quality level/bitrate) and the limitations you're under. Given that people were splitting and joining MP3s long before pcutmp3 with no qualms (and I tried a few with perfectly acceptable quality, joining MP3s that SHOULD run together with continuous sound), I'd be tempted to say that you needn't worry and would be better off leaving the encoder alone, especially as you're deliberately disrupting the sound to insert something new.

I don't know about concatenating AAC or Vorbis, though I think there are splitter applications for each, probably open source, that could reveal their workings.
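The remark that the MDCT lapping window smooths the transition can be illustrated with a toy experiment. The sketch below is not MP3's hybrid filterbank; it assumes a plain sine-windowed MDCT with 50% overlap, and all names are hypothetical. It splices the coefficient blocks of two signals and shows that only the one overlap region at the join differs from the originals.

```python
import math


def sine_window(n2):
    """Sine window of length n2; satisfies w[t]^2 + w[t + n2/2]^2 = 1."""
    return [math.sin(math.pi * (t + 0.5) / n2) for t in range(n2)]


def mdct(block, window):
    """Windowed MDCT: 2n input samples to n coefficients."""
    n = len(block) // 2
    xs = [s * w for s, w in zip(block, window)]
    return [sum(xs[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n)) for k in range(n)]


def imdct(coeffs, window):
    """Windowed inverse MDCT: n coefficients to 2n aliased samples."""
    n = len(coeffs)
    return [window[t] * (2.0 / n) *
            sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for k in range(n)) for t in range(2 * n)]


def encode(signal, n):
    """Split into 50%-overlapping blocks (len(signal) must be a multiple of n)."""
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    return [mdct(padded[i:i + 2 * n], window)
            for i in range(0, len(padded) - n, n)]


def decode(blocks, n):
    """Overlap-add the inverse transforms; aliasing cancels between neighbours."""
    window = sine_window(2 * n)
    out = [0.0] * ((len(blocks) + 1) * n)
    for i, coeffs in enumerate(blocks):
        for t, sample in enumerate(imdct(coeffs, window)):
            out[i * n + t] += sample
    return out[n:-n]  # drop the zero padding added by encode()


def max_error(x, y):
    return max(abs(p - q) for p, q in zip(x, y))


if __name__ == "__main__":
    n, cut = 8, 3
    a = [math.sin(0.3 * t) for t in range(48)]
    b = [math.sin(1.1 * t + 1.0) for t in range(48)]
    out = decode(encode(a, n)[:cut] + encode(b, n)[cut:], n)
    lo, hi = (cut - 1) * n, cut * n
    print("before join, error vs a:", max_error(out[:lo], a[:lo]))
    print("after join, error vs b:", max_error(out[hi:], b[hi:]))
    print("at join, error vs a:", max_error(out[lo:hi], a[lo:hi]))
```

In this toy model the join region is a windowed crossfade of the two signals plus some uncancelled aliasing, so a cut at a silent point leaves very little to hear.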