estimate overhead

  1. Overhead of AVI files
  2. Overhead of MKV files
  3. Overhead of OGG files
  4. Overhead of OGM files

(1) Overhead of AVI files

It is hardly possible to calculate the exact overhead of an output file with a reasonable amount of effort. That's why AVI-Mux GUI tries to estimate the overhead.

This is pretty easy for standard AVI files as well as for pure Open-DML files and should work with a deviation of not more than a few percent.

However, it is much more complicated for Open-DML AVIs with a legacy index. The problem is that the legacy index covers only a certain region of an AVI file (the first gigabyte).
If the output file is split, each output file will contain a legacy index, calculated over the first gigabyte. If you specify manual split points, it is almost impossible to estimate the overhead easily.
That's why the overhead is only roughly estimated for Open-DML + legacy output, and the real value can differ from the estimate by more than a few percent.

Some rules on overhead calculation:

Let's first define one unit of overhead. The following items cause one unit of overhead in AVI-Mux GUI:
(1): the standard setting is 2, but it can be adjusted beginning with 1.15 alpha 14
(2): almost all muxers force this value to 1, but it is adjustable in AVI-Mux GUI 1.16.11
(3): adjustable in AVI-Mux GUI 1.16.11

Furthermore, rec lists also cause overhead:

Example:

A movie with 25 fps, 3 hours long, 2 AC3 streams (2 frames per chunk), 1 MP3-VBR stream: These are so far about 1,339,000 units of overhead.
In a standard AVI file, this means 32 MB of overhead, while it causes only 21 MB in an Open-DML file.
Let's assume a destination size of 2 GB altogether. With an audio interleave and rec list size of 250 kB, there will be 8000 rec lists, causing another 224 kB of overhead in a standard AVI file, or 96 kB in an Open-DML file.

As you can also see, one MP3-VBR stream causes more overhead than the video stream and the AC3 stream together!
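The arithmetic of the example can be checked with a small sketch. The per-unit costs (24 bytes in a standard AVI file, 16 bytes in an Open-DML file) and the per-rec-list costs (28 and 12 bytes) are assumptions, back-calculated from the totals quoted in the example. The function name is made up for illustration, and "2 GB" and "250 kB" are taken as decimal values.

```python
# Assumed costs in bytes, back-calculated from the example figures
# (32 MB for 1,339,000 units, 224 kB for 8000 rec lists, and so on).
BYTES_PER_UNIT = {"avi": 24, "opendml": 16}
BYTES_PER_REC_LIST = {"avi": 28, "opendml": 12}


def estimate_avi_overhead(units, file_size, rec_list_size, kind):
    """Return (chunk overhead, rec list overhead) in bytes."""
    rec_lists = file_size // rec_list_size
    return units * BYTES_PER_UNIT[kind], rec_lists * BYTES_PER_REC_LIST[kind]


for kind in ("avi", "opendml"):
    chunks, recs = estimate_avi_overhead(1_339_000, 2_000_000_000, 250_000, kind)
    print(kind, round(chunks / 1e6, 1), "MB +", recs // 1000, "kB")
```

With these assumed costs the sketch reproduces the 32 MB / 21 MB and 224 kB / 96 kB figures of the example.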

(2) Overhead of MKV files

If you seriously thought that AVI overhead estimation was difficult, then I have to disappoint you. The obfuscated way Matroska stores element sizes can really cause some headaches.

Variable size integer values
Matroska uses variable size integers to indicate the length of an element:
0x81 - 0xFE -> 1 - 126
0x407F - 0x7FFE -> 127 - 16382

That means, the overhead an element, such as a video or audio frame, causes depends on its size!
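How the length of a size field grows with the value can be sketched as follows. This is a minimal illustration, not code from AVI-Mux GUI, and the function names are made up. It assumes the usual EBML rule that each byte of the field carries 7 value bits and that the pattern with all value bits set is reserved, so one byte covers values up to 126 and two bytes cover values up to 16382.

```python
def ebml_size_length(value):
    """Number of bytes needed to store `value` as an EBML size field."""
    for length in range(1, 9):
        # `length` bytes carry 7 * length value bits; all bits set is reserved.
        if value <= (1 << (7 * length)) - 2:
            return length
    raise ValueError("value too large for an 8-byte size field")


def encode_ebml_size(value):
    """Encode `value` using the shortest possible size field."""
    length = ebml_size_length(value)
    marker = 1 << (7 * length)  # leading 1 bit that announces the length
    return (marker | value).to_bytes(length, "big")


print(encode_ebml_size(1).hex())      # 81
print(encode_ebml_size(128).hex())    # 4080
print(encode_ebml_size(16382).hex())  # 7ffe
```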

This system is also used for the element headers, meaning that the ID code of an element can be between 1 and 4 bytes long (longer values, up to 8 bytes, are allowed for element size indicators).

Audio and Video data is stored in 'blocks', where each block contains a stream number (variable size integer!), a 16 bit timestamp (relative to the timestamp of the cluster it is in), and one byte for flags.

Video data
A block is contained in one 'blockgroup', which again has a header element (one byte) and a size indicator (variable size integer). If the block contains a non-keyframe, there will be one or more reference blocks attached to the block, each of which takes 3 bytes (one reference block for P-frames, two for B-frames).
As a result, the normal overhead per video frame is:
If you have frames that are larger than 16382 bytes, you have to add 2 more bytes to that.
One video frame is contained in one block.

When using Matroska v2, you need 7 bytes per frame, independent of its type. When using MPEG4 ASP, you can save 4 bytes per frame by using header stripping.
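The per-frame cost can be added up from the pieces described above. The following is a sketch under stated assumptions, not AVI-Mux GUI's own calculation: it assumes one-byte IDs for the block, the blockgroup and the Matroska v2 block, and a one-byte stream number. The function names are hypothetical, and the totals it prints are what the sketch yields, not figures quoted from the text.

```python
def size_len(n):
    """Bytes needed for a variable size integer holding n."""
    length = 1
    while n > (1 << (7 * length)) - 2:
        length += 1
    return length


def blockgroup_overhead(frame_size, references=0):
    """Overhead in bytes of one video frame stored as blockgroup + block."""
    # Block payload: stream number (1 byte assumed) + 16 bit timestamp + flags.
    block_payload = 1 + 2 + 1 + frame_size
    block = 1 + size_len(block_payload) + block_payload  # 1-byte block ID
    group_payload = block + 3 * references               # 3 bytes per reference
    group = 1 + size_len(group_payload) + group_payload  # 1-byte blockgroup ID
    return group - frame_size


def v2_block_overhead(frame_size):
    """Overhead in bytes of one frame stored in a Matroska v2 block."""
    payload = 1 + 2 + 1 + frame_size
    return 1 + size_len(payload) + payload - frame_size


# Keyframe, P-frame, B-frame of 3400 bytes, then a 20000 byte keyframe.
print([blockgroup_overhead(3400, r) for r in (0, 1, 2)], blockgroup_overhead(20000))
print(v2_block_overhead(3400))
```

Under these assumptions a large frame costs 2 bytes more than a small one, and the Matroska v2 case comes out at the 7 bytes mentioned above.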

Audio data
One or more audio frames are contained in one block.
If you store several audio frames in one block, each audio frame, except for the last one in that block (!), causes 1+(frame size / 255) bytes of extra overhead (integer division) when using Xiph lacing, but saves the space for another block header and blockgroup.
When using EBML lacing, each frame within a lace, except for the last one, causes either one or 2 bytes (more would require ridiculously large audio frame sizes). Fixed lacing uses a few bytes per lace, independent of the number of frames in the lace.
When using fixed lacing (possible for e.g. AC3 and DTS), there is no overhead 'per frame', but only the usual overhead per block.
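The Xiph lacing rule can be written down directly. This is a minimal sketch of the formula given above only; the function name and the MP3 frame sizes in the example are illustrative assumptions, and the usual per-block overhead is not included.

```python
def xiph_lacing_overhead(frame_sizes):
    """Extra bytes caused by Xiph lacing for the frames of one block."""
    # Every frame except the last one costs 1 + (frame size // 255) bytes.
    return sum(1 + size // 255 for size in frame_sizes[:-1])


# Three MP3 frames of about 417 bytes each in one block.
print(xiph_lacing_overhead([417, 418, 417]))  # 4
```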

When using header stripping (beginning with AVI-Mux GUI 1.17.4), you can save

Subtitles cause 8 - 10 bytes of overhead for each subtitle element, depending on the size of each element.

A few seconds of video and audio data are put in one cluster, where usually each cluster is indexed in the metaseek. Assuming that this is the case, each cluster causes 29 bytes of overhead on average. There are 4 more bytes each for the PrevClusterSize and Position elements, if they are stored (3 if a value is < 2^16 or 5 if it is >= 2^24, meaning that Position usually takes 5 bytes and PrevClusterSize usually takes 4).
However, if the cluster size is 2 MB or more, you need one more byte to store the cluster size.

Cue points
There can be one cue point for each frame, but normally, there is one cue point only for each keyframe, and no cue points for any non-video-tracks. Each cue point takes about 18 or 19 bytes (there are, again, some variable size integers involved), if those conditions are met.

As you can see, an estimation of how much overhead an MKV file will have is based on a lot of hopes and guesses, and will rarely ever be accurate if you don't know details about your source files (such as format, frame size) as well as the muxing settings that will be used.

Beginning with AVI-Mux GUI 1.17.2, the space used for Cues can be configured manually, and the default is 20 kBytes per stream and hour. This, of course, makes the estimation very easy, but obviously only applies to AVI-Mux GUI. If you use other muxers, estimating the space the cues will need is rather hopeless.

(3) Overhead of OGG files

OGG works completely differently from AVI/MKV.

An OGG file is divided into pages. Each page has a header, taking 28+n bytes. n depends on the number of segments in that page, where 1 or several segments form one packet. If a segment is the last segment of a packet, its size is smaller than 255 bytes, otherwise it is 255 bytes. The size of each segment is stored in the header.

Example: A packet of 800 bytes is stored as 255 255 255 35
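The splitting of a packet into segments can be sketched in a few lines. The function name is made up for illustration; the rule is the one described above, where the last segment of a packet is always smaller than 255 bytes.

```python
def segment_sizes(packet_size):
    """Segment sizes stored in the page header for one packet."""
    full, rest = divmod(packet_size, 255)
    # The last segment is always shorter than 255 bytes (possibly 0 bytes).
    return [255] * full + [rest]


print(segment_sizes(800))  # [255, 255, 255, 35]
```

Each entry takes one byte in the page header, so the packet above costs 4 bytes.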

This has one consequence: There is at least one byte of overhead per 255 bytes of user data, meaning that any OGG file always has an overhead of at least 0.39%!

The page size used by libogg is around 4 kB, so that you will have another 28 bytes per 4 kB of overhead for the page headers, which is about 0.69%.

That means, the overhead of an OGG file will be around 1.1%, *independent* of the video or audio format! For audio formats with very small packets, the 0.39% part will be higher (if every packet is e.g. only 100 bytes, then you need one byte per 100 bytes of course), but the effect on an OGM file will be negligible, since the audio stream usually consumes little space compared to the video.

OGG is the only container, out of those 3, where the overhead is O(size of input)!

Estimation of OGG overhead before muxing

First, this is only possible for video muxing, like OGM. Second, this is based on statistics and not as easy as for AVI. So the best is probably to use an example. Let's assume you aim for a file size of 600 MB for a movie of 2 hours. This means a bitrate of 670 kbps, or 83 kB/s.
With 25 fps, that is 3.3 kB/frame, or 3400 bytes/frame on average. You need 14 bytes to code the segment sizes of such a frame, which is 0.41% overhead. Since smaller frames take less space for the segment sizes, and larger frames take more space, this can be taken as a good estimation of the final overhead for the segment sizes. Then, there are again 28 bytes per page.

Any non-final segment of a packet has 255 bytes, meaning that 4 kB will most likely not be the exact final page size. It depends on the behaviour of the library whether the real page size will be about 4 kB on average, or whether it will be between 4096 bytes and (4096+256) bytes, meaning about (4096+128) = 4224 bytes per page on average.

With 28 bytes of page header per 4224 bytes, the resulting overhead is 0.66% (instead of 0.69%). Depending on the average page size, the total overhead will thus be 1.1% or 1.07%. Larger pages could get this down to 0.5%, but tell that to libogg or its author...
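The estimation above can be put into a short sketch. It is a rough model under the assumptions of the example (an average frame of 3400 bytes, 28 bytes of page header per page, and an average page size of either 4096 or 4224 bytes), and the function name is hypothetical.

```python
def estimate_ogg_overhead(avg_packet_size, avg_page_size, page_header=28):
    """Estimated OGG overhead as a fraction of the payload size."""
    # One byte per segment: full 255 byte segments plus the final short one.
    segments = avg_packet_size // 255 + 1
    return segments / avg_packet_size + page_header / avg_page_size


for page_size in (4096, 4224):
    print(page_size, f"{estimate_ogg_overhead(3400, page_size):.2%}")
```

For the two page sizes this gives about 1.10% and 1.07%, in line with the figures above.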

(4) Overhead of OGM files

OGM is a hack of OGG with even more overhead:

For each packet, there is at least one more byte for a header byte, and up to 7 more for improved timecode granularity (typical should however be 0 - 2, whereas 3 or more seem unlikely).

A rough, but in most cases sufficient estimation would be to say OGM Overhead = OGG Overhead