- Overhead of AVI files
- Overhead of MKV files
- Overhead of OGG files
- Overhead of OGM files
(1) Overhead of AVI files
It is hardly possible to calculate the exact overhead of an output file with a reasonable amount of effort.
That's why AVI-Mux GUI tries to estimate the overhead.
This is pretty easy for standard AVI files as well as for pure Open-DML files, and should
work with a deviation of not more than a few percent.
However, it is much more complicated for Open-DML AVIs with a legacy index. The problem is that the legacy index covers only a certain region of an AVI file (the first gigabyte).
If the output file is split, each output file will contain a legacy index, calculated over its first gigabyte. If you indicate manual split points, it is almost impossible to estimate the overhead easily.
That's why the overhead is only roughly estimated for Open-DML + legacy
output, and the actual value can differ from the estimate by more than a few percent.
Some rules on overhead calculation:
Let's first define one unit of overhead:
- standard AVI file: 24 bytes
- Open-DML without legacy index: 16 bytes
- Open-DML with legacy index: 32 bytes within the first RIFF list, then 16 bytes. The size
of the first RIFF list was 1 GB till AVI-Mux GUI 1.16.2, and is adjustable in later versions.
The following items cause one unit of overhead in AVI-Mux GUI:
- one video frame
- 32 * number of frames per chunk (1) milliseconds of AC3 audio
- 24 * number of frames per chunk (2) milliseconds of MP3-VBR audio
- 1 second of MP3-CBR audio
- 21 milliseconds of DTS audio (3)
(1): the standard setting is 2, but it can be adjusted beginning with 1.15 alpha 14
(2): almost all muxers force this value to 1, but it is adjustable in AVI-Mux GUI 1.16.11
(3): adjustable in AVI-Mux GUI 1.16.11
Furthermore, rec lists also cause overhead:
- 28 bytes per rec list in a standard AVI file
- 12 bytes per rec list in an Open-DML file
A movie with 25 fps, 3 hours long, 1 stream AC3 (2 frames per chunk), 2 streams MP3-VBR:
- 3 hours are 10,800 seconds
- 270,000 frames
- 450,000 units of overhead per MP3 stream
- 168,750 units of overhead per AC3 stream
These are so far about 1,339,000 units of overhead.
In a standard AVI file, this means 32 MB of overhead, while it
causes only 21 MB in an Open-DML file.
Let's assume a destination size of 2 GB altogether.
With an audio interleave and rec list size of 250 kB, there will be 8000 rec lists, causing another 224 kB of overhead in a standard AVI file, or 96 kB in an Open-DML file.
As you can also see, one MP3-VBR stream causes more overhead than
the video stream and the AC3 stream together!
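The arithmetic of this example can be replayed with a short Python sketch. It is only an illustration of the rules listed above, not code taken from AVI-Mux GUI. The function and variable names are made up for this sketch, and it assumes one AC3 stream (2 frames per chunk), two MP3-VBR streams (1 frame per chunk) and decimal units (2 GB = 2,000,000,000 bytes, 250 kB = 250,000 bytes), which is the combination that gives the roughly 1,339,000 units quoted above.

```python
# Bytes per unit of overhead and per rec list, as listed in the rules above.
UNIT_BYTES = {"standard AVI": 24, "Open-DML": 16}
REC_LIST_BYTES = {"standard AVI": 28, "Open-DML": 12}


def overhead_units(seconds, fps, ac3_streams, ac3_frames_per_chunk,
                   mp3_vbr_streams, mp3_frames_per_chunk=1):
    """Count units of overhead for one video stream plus AC3 and MP3-VBR audio."""
    duration_ms = seconds * 1000
    video_units = seconds * fps                              # one unit per video frame
    ac3_units = duration_ms / (32 * ac3_frames_per_chunk)    # one unit per 32 ms * frames per chunk
    mp3_units = duration_ms / (24 * mp3_frames_per_chunk)    # one unit per 24 ms * frames per chunk
    return video_units + ac3_streams * ac3_units + mp3_vbr_streams * mp3_units


# Assumed example: 3 hours, 25 fps, 1 x AC3 (2 frames per chunk), 2 x MP3-VBR.
units = overhead_units(3 * 3600, 25, ac3_streams=1,
                       ac3_frames_per_chunk=2, mp3_vbr_streams=2)

# Assumed decimal units: 2 GB target size, one rec list per 250 kB.
rec_lists = 2_000_000_000 // 250_000

for container, unit_bytes in UNIT_BYTES.items():
    total = units * unit_bytes + rec_lists * REC_LIST_BYTES[container]
    print(f"{container}: {units:.0f} units, about {total / 1e6:.1f} MB of overhead")
```

The sketch ignores the legacy index case, where the first RIFF list costs 32 bytes per unit and the rest 16.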
(2) Overhead of MKV files
If you seriously thought that AVI overhead estimation was difficult, then I have to disappoint you.
The obfuscated way Matroska stores element sizes can really cause some headaches.
Variable size integer values
Matroska uses variable size integers to indicate the length of an element:
0x81 - 0xFE -> 1 - 126
0x407F - 0x7FFE -> 127 - 16382
That means the overhead caused by an element, such as a video or audio frame, depends on its size!
This system is also used for the element headers, meaning that the ID code of an element can be
between 1 and 4 bytes long (longer values, up to 8 bytes, are allowed for element size indicators).
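A minimal Python sketch of such a size field follows. It assumes the usual EBML rule that n bytes carry 7*n data bits behind a length marker bit, and that the all-ones pattern is reserved (for "unknown size"), so a one-byte field ends at 0xFE. The function names are invented for this illustration.

```python
def vint_size_length(value):
    """Return how many bytes an element size indicator needs for `value`.

    n bytes offer 7*n data bits; the all-ones pattern is reserved,
    so n bytes can hold values up to 2**(7*n) - 2.
    """
    if value < 0:
        raise ValueError("size must be non-negative")
    for n in range(1, 9):
        if value <= 2 ** (7 * n) - 2:
            return n
    raise ValueError("value too large for an 8-byte size field")


def encode_vint(value):
    """Encode `value` as a variable size integer using the shortest form."""
    n = vint_size_length(value)
    marker = 1 << (7 * n)          # length marker bit in front of the data bits
    return (marker | value).to_bytes(n, "big")


for size in (1, 126, 127, 16382, 16383):
    print(size, encode_vint(size).hex())
```

A frame just above one of these thresholds therefore costs one byte more per size field than a frame just below it.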
Audio and video data is stored in 'blocks', where each block contains a stream number (variable
size integer!), a 16 bit timestamp (relative to the timestamp of the cluster it is in), and one byte
of flags.
A block is contained in one 'blockgroup', which again has a header element (one byte) and a size
indicator (variable size integer). If the block contains a non-keyframe, there will be one or more
reference blocks attached to the block (1 for P-frames, 2 for B-frames), each of which takes 3 bytes.
As a result, the normal overhead per video frame is:
- 10 bytes for a keyframe
- 13 bytes for a P-frame
- 16 bytes for a true B-frame
If you have frames that are larger than 16382 bytes, you have to add 2 more bytes to that.
One video frame is contained in one block.
When using Matroska v2, you need 7 bytes per frame, independent of its type. When using MPEG4 ASP, you can save 4 bytes per frame by using header striping.
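The per-frame figures can be reproduced with a small Python sketch. It is an illustration under stated assumptions, not the calculation AVI-Mux GUI performs: element IDs of the blockgroup, the block and the Matroska v2 block are taken to be one byte each, the stream number fits in one byte, each reference block costs 3 bytes, and header striping is ignored. All function names are made up here.

```python
def vint_len(value):
    """Bytes needed for a variable size integer (all-ones patterns reserved)."""
    for n in range(1, 9):
        if value <= 2 ** (7 * n) - 2:
            return n
    raise ValueError("value too large")


def block_header_len(track_number=1):
    # stream number (variable size integer) + 16 bit timestamp + 1 byte of flags
    return vint_len(track_number) + 2 + 1


def blockgroup_overhead(frame_size, references=0, track_number=1):
    """Overhead of one frame in a block inside a blockgroup.

    references: 0 for a keyframe, 1 for a P-frame, 2 for a true B-frame.
    """
    header = block_header_len(track_number)
    block_payload = header + frame_size
    block_id_and_size = 1 + vint_len(block_payload)
    reference_bytes = 3 * references
    group_payload = block_id_and_size + block_payload + reference_bytes
    group_id_and_size = 1 + vint_len(group_payload)
    return group_id_and_size + block_id_and_size + header + reference_bytes


def v2_block_overhead(frame_size, track_number=1):
    """Overhead of one frame stored directly in a Matroska v2 block."""
    header = block_header_len(track_number)
    return 1 + vint_len(header + frame_size) + header


for label, refs in (("keyframe", 0), ("P-frame", 1), ("B-frame", 2)):
    print(label, blockgroup_overhead(3400, refs), "bytes")
print("Matroska v2:", v2_block_overhead(3400), "bytes")
```

With a 20000 byte keyframe, both size indicators grow by one byte, which gives the 2 extra bytes mentioned above.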
One or more audio frames are contained in one block.
If you store several audio frames in one block, each audio frame, except for the last one in
that block (!), causes 1+(frame size / 255) bytes of
extra overhead when using Xiph lacing, but saves the space for another block header and blockgroup.
When using EBML lacing, each frame within a lace, except for the last one, causes either 1
or 2 bytes (more would require ridiculously high audio frame sizes). Fixed lacing uses a few bytes
per lace, independent of the number of frames in the lace.
When using fixed lacing (possible for e.g. AC3 and DTS), there is no overhead 'per frame',
but only the usual overhead per block.
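As a small illustration of the Xiph lacing rule, the following Python sketch adds up the size fields for one laced block. It implements only the 1+(frame size / 255) formula given above, with integer division. The function name and the 417 byte frame size are assumptions made for the example, and the lace's own frame counter is not modelled.

```python
def xiph_lace_size_bytes(frame_sizes):
    """Bytes spent on Xiph-laced size fields for the frames of one block.

    Every frame except the last one costs 1 + (frame size // 255) bytes;
    the size of the last frame is implied by the block size.
    """
    return sum(1 + size // 255 for size in frame_sizes[:-1])


# Hypothetical example: four audio frames of 417 bytes each in one block.
frames = [417] * 4
print(xiph_lace_size_bytes(frames), "bytes of lacing overhead")
```

Three of the four frames need a size field of 2 bytes each, so the lace costs 6 bytes while three block headers and blockgroups are saved.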
When using header striping (beginning with AVI-Mux GUI 1.17.4), you can save
- 1 byte per MP3 frame
- 2 bytes per AC3 frame
- 4 bytes per DTS frame
Subtitles cause 8 - 10 bytes of overhead for each subtitle element, depending on the size of each element.
A few seconds of video and audio data are put in one cluster, where usually each cluster is indexed
in the metaseek. Assuming that this is the case, each cluster causes 29 bytes of overhead on average.
There are 4 more bytes each for the PrevClusterSize and Position elements, if they are stored
(3 if a value is < 2^16 or 5 if it is >= 2^24, meaning that Position usually takes 5 bytes while PrevClusterSize usually takes 4).
However, if the cluster size is 2 MB or more, you need one more byte to store the cluster size.
There can be one cue point for each frame, but normally, there is one cue point only for each
keyframe, and no cue points for any non-video-tracks. Each cue point takes about 18 or 19 bytes (there
are, again, some variable size integers involved), if those conditions are met.
As you can see, an estimate of how much overhead an MKV file will have is based on a lot of hopes
and guesses, and will rarely be accurate if you don't know details about your source files (such
as format, frame size) as well as the muxing settings that will be used.
Beginning with AVI-Mux GUI 1.17.2, the space used for Cues can be configured manually;
the default is 20 kBytes per stream and hour. This, of course, makes the estimation very easy, but
obviously only applies to AVI-Mux GUI. If you use other muxers, estimating the space the cues will need
is rather hopeless.
(3) Overhead of OGG files
OGG works completely differently from AVI/MKV.
An OGG file is divided into pages. Each page has a header, taking 28+n bytes.
n depends on the number of segments in that page, where 1 or several segments form
one packet. If a segment is the last segment of a packet, its size is smaller than 255 bytes,
otherwise it is 255 bytes. The size of each segment is stored in the header.
Example: A packet of 800 bytes is stored as 255 255 255 35
This has one consequence: There is at least one byte of overhead per 255 bytes of user data,
meaning that any OGG file always has an overhead of at least 0.39%!
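The segment sizes stored for one packet can be sketched in Python as follows. The function name is made up for this illustration, and the code only models the size list in the page header, not a complete page.

```python
def segment_table(packet_size):
    """Return the segment sizes stored in the page header for one packet.

    All segments but the last are 255 bytes; the last one is smaller
    than 255 (and is 0 if the packet size is a multiple of 255).
    """
    full_segments, rest = divmod(packet_size, 255)
    return [255] * full_segments + [rest]


table = segment_table(800)
print(table)                          # [255, 255, 255, 35]
print(len(table) / 800 * 100, "%")    # one header byte per segment
```

Each entry costs one byte in the header, so a packet of s bytes causes s // 255 + 1 bytes of overhead, which is where the lower bound of about 0.39% comes from.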
The page size used by libogg is around 4 kB, so that you will have another 28 bytes per 4 kB
of overhead for the page headers, which is about 0.69%.
That means, the overhead of an OGG file will be around 1.1%, *independent* of the video or
audio format! For audio formats with very small packets, the 0.39% part will be higher (if every
packet is e.g. only 100 bytes, then you need one byte per 100 bytes of course), but the effect
on an OGM file will be negligible, since the audio stream usually consumes little space compared
to the video.
OGG is the only container, out of those 3, where the overhead is O(size of input)!
Estimation of OGG overhead before muxing
First, this is only possible for video muxing, like OGM. Second, this is based on statistics
and not as easy as for AVI. So the best is probably to use an example. Let's assume you aim for
a file size of 600 MB for a movie of 2 hours. This means a bitrate of 670 kbps, or 83 kB/s.
With 25 fps, that is 3.3 kB/frame, or 3400 bytes/frame on average. You need 14 bytes to
code the segment sizes of such a frame, which is 0.41% overhead. Since smaller frames take less
space for the packets, and larger frames take more space, this can be taken as a good estimate
of the final overhead for the segment sizes. Then, there are again 28 bytes per page.
Any non-final segment of a packet has 255 bytes, meaning that 4 kB will most likely not be
the exact final page size. It depends on the behaviour of the library whether the real page size will
be about 4 kB on average, or whether it will be between 4096 bytes and (4096+256) bytes, meaning about
(4096+128) = 4224 bytes per page on average.
With 28 bytes page header per 4224 bytes, the caused overhead is 0.66% (instead of 0.69%).
Depending on the average page size, the total overhead will thus be 1.1% or 1.07%. Larger pages
could get this down to 0.5%, but tell that libogg or its author...
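The estimate can be written down as a short Python sketch. It is a rough illustration using the figures from this section (28 bytes of page header, one byte per segment), and the function name and parameters are assumptions made for the example. Because of rounding, the page part for 4096 byte pages comes out slightly below the 0.69% quoted above.

```python
def ogg_overhead_estimate(avg_packet_size, avg_page_size, page_header_bytes=28):
    """Estimate OGG overhead as fractions of the payload.

    Returns (segment_part, page_part): the share spent on segment sizes
    and the share spent on fixed page headers.
    """
    segments_per_packet = avg_packet_size // 255 + 1
    segment_part = segments_per_packet / avg_packet_size
    page_part = page_header_bytes / avg_page_size
    return segment_part, page_part


# 600 MB for 2 hours at 25 fps gives about 3400 bytes per frame.
for page_size in (4096, 4224):
    segment_part, page_part = ogg_overhead_estimate(3400, page_size)
    total = segment_part + page_part
    print(f"page size {page_size}: {segment_part:.2%} + {page_part:.2%} = {total:.2%}")
```

Calling it with a larger average page size shows how the page header part shrinks while the segment part stays the same.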
(4) Overhead of OGM files
OGM is a hack of OGG with even more overhead:
For each packet, there is at least one more byte for a header byte, and up to
7 more for improved timecode granularity (typical should however be 0 - 2, whereas
3 or more seem unlikely).
A rough, but in most cases sufficient estimation would be to say
OGM Overhead = OGG Overhead