Full Version: Matroska file format definition
Hydrogenaudio Forums > Digital Audio/Video > General A/V
getID3()
I've tried a couple of times to make sense of the Matroska file format, as defined on www.matroska.org (which is down as I type this), but I honestly can't make sense of the scheme used to describe the file structure. I read through the documentation and tried comparing it to sample files I'd created, but I can't relate the two. I need to find out basic information such as bitrate, playtime, resolution, sample rate, codecs, etc. Can someone walk me through how to find that information, please?
plonk420
maybe try these sites and spider from there...?

http://ld-anime.faireal.net/guide/matroska-en
http://ld-anime.faireal.net/guide/mkv-eng

this stuff is so fascinating... that and the release of VP6 (which prompted me to use MKV, seeing as how I was having problems muxing AAC-HE with AVIs). can't wait for H.264
getID3()
Those are very basic "What is Matroska? How do I play it?" kind of guides. I couldn't find anything useful linked from there either.

Looking at the Matroska specs, I can see that all my sample files begin with [1A][45][DF][A3] as expected, but beyond that I'm lost. I can't relate the subsequent bytes to what I'm seeing in the specs. After those first 4 bytes, I see [93][42][82][88], then 'matroska', etc.

I guess what I'm looking for is either a guide for how to parse a Matroska file, or at least how to understand the specs as they're presented.
Latexxx
This looks almost usable: http://cvs.corecodec.org/cgi-bin/viewcvs.c...gram/index.html It's some kind of index for the spec file.
rpop
What's to figure out? It's communist!

j/k
robUx4
getID3(), you need to understand EBML first, before understanding Matroska.

EBML is like this: [ID][Length of Data][Data]

The Data can contain other EBML elements, or just an integer, a float, a string, etc.

The ID and the Length are coded in a way similar to UTF-8. That means the first bits of the ID/Length tell you how many bytes (octets) it spans: 1, 2, ..., 8.
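To make the ID/Length coding concrete, here is a minimal Python sketch that decodes the bytes getID3() quoted above ([1A][45][DF][A3][93][42][82][88] followed by 'matroska'). The function name read_vint and its keep_marker parameter are made up for this example and are not from the Matroska docs or any library. The sketch assumes the usual EBML convention that the position of the first 1 bit in the first byte gives the width, that IDs keep that marker bit, and that Lengths drop it.

```python
def read_vint(buf, pos, keep_marker):
    """Decode one UTF-8-like variable-length integer starting at buf[pos].

    Returns (value, position of the byte after it). Hypothetical helper.
    """
    first = buf[pos]
    if first == 0:
        raise ValueError("widths above 8 bytes are not handled in this sketch")
    # Count how far the first 1 bit is from the top: 1xxxxxxx means 1 byte,
    # 01xxxxxx means 2 bytes, 001xxxxx means 3 bytes, and so on.
    width = 1
    mask = 0x80
    while not first & mask:
        width += 1
        mask >>= 1
    if pos + width > len(buf):
        raise ValueError("truncated input")
    # IDs are conventionally written with the marker bit left in place;
    # Lengths have it stripped off.
    value = first if keep_marker else first & (mask - 1)
    for byte in buf[pos + 1 : pos + width]:
        value = (value << 8) | byte
    return value, pos + width


# The first 16 bytes of the sample files described earlier in the thread.
header = bytes([0x1A, 0x45, 0xDF, 0xA3, 0x93, 0x42, 0x82, 0x88]) + b"matroska"

ebml_id, pos = read_vint(header, 0, keep_marker=True)
ebml_size, pos = read_vint(header, pos, keep_marker=False)
doctype_id, pos = read_vint(header, pos, keep_marker=True)
doctype_size, pos = read_vint(header, pos, keep_marker=False)
doctype = header[pos : pos + doctype_size]

print(hex(ebml_id), ebml_size, hex(doctype_id), doctype_size, doctype)
```

Read this way, [1A][45][DF][A3] is a 4-byte ID, [93] is a 1-byte Length of 19, [42][82] is a 2-byte ID, [88] is a 1-byte Length of 8, and the 8 bytes of Data are the string 'matroska'.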

Once you understand that, it's easy to see how Matroska works (I guess). The first element is a Level 0 element, and its Data contains a list of other EBML elements (this is called a Master element). Master elements contain other elements, which either contain more elements in turn or just hold valuable data (int, float, string, date, etc). That makes a hierarchy tree of the data. That's why it's close to the XML principles.
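The tree idea can be sketched as a small recursive walk. This is only an illustration under stated assumptions: the ELEMENTS table covers just four IDs from the EBML header (a real parser needs the full list from the spec), the names read_vint and parse are invented for the example, and the 8 bytes after 'matroska' in the sample are made up to fill out the 19-byte header rather than copied from a real file. Elements of unknown size are not handled.

```python
# Tiny, incomplete ID table (assumption: only these header elements appear).
ELEMENTS = {
    0x1A45DFA3: ("EBML", "master"),
    0x4282: ("DocType", "string"),
    0x4287: ("DocTypeVersion", "uint"),
    0x4285: ("DocTypeReadVersion", "uint"),
}


def read_vint(buf, pos, keep_marker):
    """Decode one variable-length ID or Length; returns (value, next position)."""
    first = buf[pos]
    if first == 0:
        raise ValueError("widths above 8 bytes are not handled in this sketch")
    width, mask = 1, 0x80
    while not first & mask:
        width += 1
        mask >>= 1
    value = first if keep_marker else first & (mask - 1)
    for byte in buf[pos + 1 : pos + width]:
        value = (value << 8) | byte
    return value, pos + width


def parse(buf, start, end):
    """Parse buf[start:end] as a run of [ID][Length][Data] elements.

    Returns a list of (name, value) pairs; a Master element's value is the
    list produced by parsing its Data the same way.
    """
    out = []
    pos = start
    while pos < end:
        elem_id, pos = read_vint(buf, pos, keep_marker=True)
        size, pos = read_vint(buf, pos, keep_marker=False)
        data_end = pos + size
        if data_end > end:
            raise ValueError("element runs past the end of its parent")
        # Unknown IDs are kept as raw bytes under their hex ID.
        name, kind = ELEMENTS.get(elem_id, (hex(elem_id), "binary"))
        if kind == "master":
            value = parse(buf, pos, data_end)  # recurse into the children
        elif kind == "uint":
            value = int.from_bytes(buf[pos:data_end], "big")
        elif kind == "string":
            value = buf[pos:data_end].decode("ascii")
        else:
            value = buf[pos:data_end]
        out.append((name, value))
        pos = data_end
    return out


# EBML header: the ID, a Length of 19, then three child elements.
# Everything after b"matroska" is invented for the example.
sample = (
    bytes.fromhex("1A45DFA3" "93" "4282" "88")
    + b"matroska"
    + bytes.fromhex("4287" "81" "01" "4285" "81" "01")
)
tree = parse(sample, 0, len(sample))
print(tree)
```

The same loop would in principle continue into the Level 0 element that follows the header in a real file, as long as the table says which IDs are Master elements and what type the others hold.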

I hope it's a bit clearer now.
ChristianHJW
Have a look here: http://cvs.corecodec.org/cgi-bin/viewcvs.c...gram/index.html

Pamel's diagram and robUx4's explanation should be a good starting point for understanding Matroska's structure.