Full Version: Silence Is Golden
Shrubsole
Hello everyone! Newbie alert!

My problem is this:

In a hex editor I can see and identify the header info in the MP3 file, but how can you tell which is the first audio frame?

The reason I ask is that I have a karaoke MP3, and as usual it has some silence at the start (often about 10 seconds, but it varies!).

Now, I don't wish or need to alter that at all. In the hex editor you can see the difference between a silent part and where the audio (the song) actually starts, but how can I tell the difference in Visual Basic?

If the "silent" frames were all one value or all zero it would be all too easy, but the trouble is that there is always some rubbish within the "silent" frames. That makes it hard for a programme to tell the start of the real audio from the rubbish (encoder stuff etc.) within the silent frames!

I don't wish or need to decode the actual MP3 to see where the sound starts; I just need to write a programme that can detect the difference between a silent frame and the first audio frame.

The object, once I can do that, is a cueing device for the track.

HELP! Is there some magic part of the frame to look at and check, or something simple like that?

I would be more than grateful for any help or advice you can give.

Thanks!
getID3()
You have two possible situations, and I'm not sure which one applies to your file(s).

a) You have a bunch of garbage, non-playable data at the beginning of the file.
b) Your file has no garbage, but has ~10 seconds of MP3-encoded silence at the beginning.

If (a), then it's relatively simple to scan through and find a valid MP3 frame header (ideally find a sequence of at least 10 valid frames in a row to make sure you didn't pick up a false sync) and go from there.

But if (b), then you'll have to decode the audio data and determine if it's "silent" or not, a much more complex task.
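
For case (a), the scan described above might look roughly like the Python sketch below. It is only an illustration, not an existing tool: it assumes MPEG-1 Layer III only (no MPEG-2/2.5, no free-format streams), and the names `frame_length` and `find_first_frame` are made up for this example. A sync position is accepted only when `min_frames` valid headers follow one another at the computed frame lengths.

```python
# MPEG-1 Layer III bitrates in kbit/s, indexed by the 4-bit bitrate field.
BITRATES = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0]
# MPEG-1 sample rates in Hz; index 3 is reserved.
SAMPLE_RATES = [44100, 48000, 32000]


def frame_length(data, pos):
    """Return the frame length in bytes if a valid MPEG-1 Layer III
    header starts at pos, otherwise None. (Hypothetical helper.)"""
    if pos + 4 > len(data):
        return None
    b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:    # 11-bit sync word
        return None
    if (b1 >> 3) & 3 != 3:                   # version field: 3 = MPEG-1
        return None
    if (b1 >> 1) & 3 != 1:                   # layer field: 1 = Layer III
        return None
    bitrate_idx = b2 >> 4
    rate_idx = (b2 >> 2) & 3
    if bitrate_idx in (0, 15) or rate_idx == 3:  # free-format or reserved
        return None
    padding = (b2 >> 1) & 1
    return 144 * BITRATES[bitrate_idx] * 1000 // SAMPLE_RATES[rate_idx] + padding


def find_first_frame(data, min_frames=10):
    """Return the offset of the first header that is followed by a chain
    of min_frames valid headers in total, or -1 if there is none."""
    for start in range(len(data) - 3):
        pos = start
        count = 0
        while count < min_frames:
            length = frame_length(data, pos)
            if length is None:
                break
            pos += length
            count += 1
        if count == min_frames:
            return start
    return -1
```

With the file contents read into a `bytes` object, `find_first_frame(data)` gives the offset to start from. Requiring a chain of frames is what rejects stray `FF Fx` byte pairs inside the garbage.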
cabbagerat
QUOTE
then you'll have to decode the audio data and determine if it's "silent" or not, a much more complex task

True, it is a much more complex task - but if you use the right libraries/programs then it won't be very difficult at all.

Step One: Either use a library (I don't know of any with VB bindings) or an external program to decode the frame to PCM audio data. This could be as simple as interfacing with mpg123 or similar.

Step Two: The resulting PCM data is likely to be pretty quiet, but certainly non-zero. The easiest way to see if the PCM data is silent is to compare each sample to a threshold; if all of them are below it, the frame data is silent. If you decide that everything below -60dB should be considered silent, then your threshold value would be about 0x21 (33) for 16-bit samples, since 32768 x 10^(-60/20) is roughly 33.

If I was doing this in C, I would use libmad to decode the MP3 data. I have no idea if there are VB bindings for it.
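
Step Two can be sketched in a few lines of Python once the decoder has produced PCM. This assumes signed 16-bit little-endian samples and treats 32768 as full scale; `db_to_threshold` and `is_silent` are names invented for the example, and the decoding itself (Step One) is not shown.

```python
import struct


def db_to_threshold(db, full_scale=32768):
    """Convert a level in dB relative to full scale into a sample amplitude."""
    return round(full_scale * 10 ** (db / 20))


def is_silent(pcm_bytes, threshold):
    """True if every 16-bit little-endian sample is at or below threshold
    in absolute value. A trailing odd byte, if any, is ignored."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    return all(abs(s) <= threshold for s in samples)
```

For example, `is_silent(decoded_frame, db_to_threshold(-60))` checks one decoded frame against a -60dB threshold, which works out to an amplitude of 33.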
384kbps
Hi!

If you have an MP3 that is encoded with variable bitrate and
uses the lowest possible frame size (i.e. the lowest bitrate) for silence,
try scanning through the MP3 for as long as the bitrate stays at 32 kbit/s.

But this only works with this particular, newer type of VBR MP3...


CU,
384kbps


N.B.: Post a short reply if you need a DOS tool to create a frame log.
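
The bitrate scan suggested in this post could be sketched in Python as follows. It assumes MPEG-1 Layer III, that `start` already points at a real frame header, and that the encoder really does drop to 32 kbit/s for silence; `frame_info` and `count_leading_frames` are hypothetical names for this example.

```python
# MPEG-1 Layer III bitrates in kbit/s, indexed by the 4-bit bitrate field.
BITRATES = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0]
SAMPLE_RATES = [44100, 48000, 32000]  # index 3 is reserved


def frame_info(data, pos):
    """Return (bitrate_kbps, frame_length_bytes) for a valid MPEG-1
    Layer III header at pos, or None."""
    if pos + 4 > len(data):
        return None
    b1, b2 = data[pos + 1], data[pos + 2]
    # 0xFF then 1111101x: sync, MPEG-1, Layer III, either protection bit.
    if data[pos] != 0xFF or (b1 & 0xFE) != 0xFA:
        return None
    bitrate_idx, rate_idx = b2 >> 4, (b2 >> 2) & 3
    if bitrate_idx in (0, 15) or rate_idx == 3:
        return None
    kbps = BITRATES[bitrate_idx]
    length = 144 * kbps * 1000 // SAMPLE_RATES[rate_idx] + ((b2 >> 1) & 1)
    return kbps, length


def count_leading_frames(data, start=0, kbps=32):
    """Walk frames from start while their bitrate equals kbps.
    Returns (number_of_frames, offset_of_first_other_frame)."""
    pos, count = start, 0
    while True:
        info = frame_info(data, pos)
        if info is None or info[0] != kbps:
            return count, pos
        count += 1
        pos += info[1]
```

The returned offset is where the first frame with a different bitrate begins, which on a suitable VBR file would be the candidate cue point.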
Gabriel
QUOTE
Now, I don't wish or need to alter that at all. In the hex editor you can see the difference between a silent part and where the audio (the song) actually starts, but how can I tell the difference in Visual Basic?


Take some time to think about the following:
In your hex editor, how are you able to see the difference between a silent part and a non-silent part? You are probably noticing some kind of change between the two parts. Take a piece of paper, and write down how you are able to distinguish them.


Now, you just have to program in VB what is on your piece of paper.
Shrubsole
Many thanks for your replies so far!

I like the sound of Gabriel's reply, and indeed that is the way my brain was thinking. I'm trying to find what silent frames have in common, but with MP3s from different encoders, it's hard!

I have found with LAME that all the frames up to 10 seconds (when the song starts) are "mostly" 55H with "lame...." in the middle of the frame. Yet with another encoder, the common value is FFH "most" of the way through the silent frame (yet in the first song frame the audio data sits in the middle of the frame with FFH either side!).

I will try to find some common ground on which to write a programme, or I'll just have to resort to decoding it as many of you suggested, something I was trying to avoid!

Thanks!
Sebastian Mares
QUOTE(Shrubsole @ Nov 12 2003, 11:54 AM)
I have found with LAME that all the frames up to 10 seconds (when the song starts) are "mostly" 55H with "lame...." in the middle of the frame. Yet with another encoder, the common value is FFH "most" of the way through the silent frame (yet in the first song frame the audio data sits in the middle of the frame with FFH either side!).

I think that is the padding.
smack
IMHO the only reliable way to tell if an MP3 frame contains silence is to decode the frame.

The suggested "pattern matching" method will not be correct in all cases, two reasons come to mind:
  • the unused ("ancillary") data may be filled by the encoder with *anything* (JPG images, anybody?)
  • bit reservoir usage may result in non-silence data inside a silent frame
So, how to decode?
  1. Use a decoder library, such as libmad (has already been suggested).
  2. Analyze the frame data yourself. IIRC you only have to decode the "side info" at the beginning of a frame. It says how many bits there are for each "Huffman region". If you find that the end of region_3 (which is the last region and contains the 4-samples-per-code data) is zero, then you know that there are actually no samples at all to decode in this frame, so it must be completely silent! (I really hope this is correct. Can somebody verify that using the MPEG specs or some decoder sources?)
Option 1 is probably easier to implement and more flexible because it allows you to detect near-silent frames, too (as already suggested).
Option 2 looks more interesting from a hacker's point of view, though.
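
A rough Python sketch of option 2 follows, with the same caveat as above that it has not been checked against the spec text. Rather than looking for the end of the last Huffman region, it reads two side-info fields per granule and channel: `part2_3_length` (the number of bits of scalefactor plus Huffman data) and `big_values`. That substitution is an assumption about what the idea boils down to. It also assumes MPEG-1 Layer III and a complete frame; `granule_fields` and `looks_digitally_silent` are hypothetical names.

```python
def granule_fields(frame):
    """Return [(part2_3_length, big_values), ...] for each granule/channel
    of one complete MPEG-1 Layer III frame (header included)."""
    header_len = 4
    if not frame[1] & 1:           # protection bit 0: 16-bit CRC follows header
        header_len += 2
    mono = (frame[3] >> 6) == 3    # channel mode 3 = single channel
    channels = 1 if mono else 2
    side_len = 17 if mono else 32  # MPEG-1 side info size in bytes
    side = frame[header_len:header_len + side_len]
    if len(side) < side_len:
        raise ValueError("frame too short to hold the side info")
    bits = int.from_bytes(side, "big")
    total = side_len * 8

    def read(offset, width):
        # Extract `width` bits starting `offset` bits from the start.
        return (bits >> (total - offset - width)) & ((1 << width) - 1)

    # Skip main_data_begin (9), private bits (5 mono / 3 stereo), scfsi (4/ch).
    offset = 9 + (5 if mono else 3) + 4 * channels
    fields = []
    for _ in range(2 * channels):  # 2 granules, each with `channels` entries
        fields.append((read(offset, 12), read(offset + 12, 9)))
        offset += 59               # each granule/channel entry is 59 bits
    return fields


def looks_digitally_silent(frame):
    """True if no granule of this frame carries any scalefactor/Huffman bits."""
    return all(p == 0 and b == 0 for p, b in granule_fields(frame))
```

This only catches digital silence. A near-silent frame (dither, low-level noise) still has coded samples and would need option 1.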
wkwai
QUOTE(Shrubsole @ Nov 11 2003, 07:04 PM)
I don't wish or need to decode the actual MP3 to see where the sound starts; I just need to write a programme that can detect the difference between a silent frame and the first audio frame.


I think this is a very complicated task. First, the nature of the MP3 stream means that even if a particular frame is silent (it does not contain any spectral data of its own), it can carry spectral data belonging to the following non-silent frames. You cannot just cut out and throw away a silent frame either, because that will affect the entire bitstream.

You have to scan through the frame headers to the point where the audio starts.
I think you can do that because the headers provide a way to track where the audio data for a particular frame starts relative to the frame header. For a silent frame, this pointer would most likely be zero. Otherwise, the decoder would decode the wrong Huffman codes for a silent frame.

But then, a non-silent frame could have a zero pointer too.

I think you should scan through a series of headers and try to derive a logical conclusion from them.
Shrubsole
Many, many thanks for your replies!

Some of them I almost understood!

Huffman coding?! Ummm, errr, OK!

I think that I will have to decode the MP3 (or at least the first 15 seconds) to get to where I want to get.
The problem I now have with that is that my program is controlling Winamp anyway, so could I not use its DLL to decode the MP3?

The main problem is that I can't find any help on how to "address" In_mp3.dll or In_Mad.dll from within Visual Basic.

Does anyone know the function for sending messages to it, and what messages to send, so that I can decode the MP3 in VB and then look at the resulting output?

Please HELP! And thanks!
wkwai
Your program is controlling Winamp? If you are trying to do some low-level manipulation through a DLL module, you will need to know the contents and entry points of that module. For that you will have to ask the developers of Winamp.

I thought you were writing an entire program from scratch. In that case you have a lot of flexibility. You don't even have to decode the first 15 seconds completely. All you need is to scan the ISO frame headers, without carrying out the dequantization and IMDCT stages. I think you can write a small function to do that.

You'll have to approach the problem from a low-level angle. If you approach it from a high-level angle, you are likely to run into many bottlenecks, which will constrain your creativity.
Shrubsole
Sounds interesting, wkwai!

I take it you mean that I should actually decode the MP3 myself without using a decoder DLL?!
The trouble there is that my knowledge (so far) only goes as far as finding the frame headers (the ones that contain the bitrate etc.). In all the listings that describe these headers there doesn't seem to be much that is applicable to me, and nothing on how to start learning to decode far enough to get the minimum info I need (i.e. is it silent or not).

Is there an Idiot's Guide to decoding anywhere?

Getting at the inner functions of Winamp's In_mp3.dll, or anything similar, has been a no-go so far. So as you say, doing it myself (if I'm capable!) would be rewarding and seems the only way to go at the moment!

Cheers for putting up with a newbie!
Shrubsole
PS...

AH!
I've found the "side info"! (32 bytes following the 4-byte header!)
(Don't laugh! I know you lot already know all this, but I have to start somewhere!)

I'm now researching what all this "new" side info means and which bits of it are useful to me.

I might be getting somewhere! Not fast, but at least grinding forward!

Any ideas on which bits of the side info I should concentrate on to find out if it's a "silent frame"?

Thanks!
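
The 32-byte figure holds for an MPEG-1 stereo frame without CRC. The small Python sketch below shows how the side info position and size vary otherwise, going by the commonly documented Layer III layout (a 2-byte CRC after the header when the protection bit is 0; 17 bytes for MPEG-1 mono; 17 and 9 bytes for MPEG-2/2.5 stereo and mono). `side_info_span` is a made-up name, and the function assumes it is handed a valid Layer III header.

```python
def side_info_span(header):
    """Given the 4 header bytes of a Layer III frame, return
    (offset_from_frame_start, length_in_bytes) of the side info."""
    b1, b3 = header[1], header[3]
    mpeg1 = (b1 >> 3) & 3 == 3       # version field: 3 = MPEG-1
    mono = (b3 >> 6) == 3            # channel mode 3 = single channel
    offset = 4 if b1 & 1 else 6      # protection bit 0 means a CRC is present
    if mpeg1:
        length = 17 if mono else 32
    else:                            # MPEG-2 / MPEG-2.5
        length = 9 if mono else 17
    return offset, length
```

For the usual `FF FB` stereo header this gives `(4, 32)`, matching what was found in the hex editor.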
wkwai
There should be a pointer that tells you where the start of the spectral Huffman data is in the previous granule. I think the first frame that has a non-zero pointer is the start of the non-silent audio. This is just a simplification, of course. In some very rare cases the pointer could be zero and yet the frame is the start of the non-silent audio. You just have to find a way to compensate for this.
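
Assuming the pointer meant here is the side info's `main_data_begin` field (in MPEG-1, the first 9 bits of the side info), the heuristic could be sketched in Python like this. It takes a list of complete MPEG-1 Layer III frames that have already been split out; `main_data_begin` and `first_frame_with_pointer` are names chosen for the example, and the rare zero-pointer case mentioned above is not handled.

```python
def main_data_begin(frame):
    """Read the 9-bit main_data_begin value from an MPEG-1 Layer III frame."""
    offset = 4 if frame[1] & 1 else 6   # skip the 2-byte CRC if present
    return (frame[offset] << 1) | (frame[offset + 1] >> 7)


def first_frame_with_pointer(frames):
    """Index of the first frame whose main_data_begin is non-zero, or -1."""
    for index, frame in enumerate(frames):
        if main_data_begin(frame) != 0:
            return index
    return -1
```

Treat the result as a hint to cross-check against another method, not as a definite cue point.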