MIME types in GStreamer

What is a MIME type?
====================

A MIME type is a combination of two short strings: the content type and the
content subtype. Content types are broad categories used for describing
almost all types of files: video, audio, text, and application are common
content types. The subtype further breaks the content type down into a more
specific type description, for example 'application/ogg', 'audio/raw',
'video/mpeg', or 'text/plain'.
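
As a minimal sketch of the type/subtype split described above, the following C
function separates a MIME type string at the '/'. The name mime_split and its
buffer-based interface are assumptions made for this illustration; they are not
part of GStreamer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "type/subtype" into two caller-provided buffers.
 * Returns 1 on success, 0 if the string has no '/', if either half is
 * empty, or if a half does not fit into its buffer.
 * (Hypothetical helper, not a GStreamer function.) */
int
mime_split (const char *mime, char *type, size_t type_size,
            char *subtype, size_t subtype_size)
{
  const char *slash;
  size_t type_len, subtype_len;

  assert (mime != NULL && type != NULL && subtype != NULL);

  slash = strchr (mime, '/');
  if (slash == NULL)
    return 0;

  type_len = (size_t) (slash - mime);
  subtype_len = strlen (slash + 1);
  if (type_len == 0 || subtype_len == 0)
    return 0;
  if (type_len >= type_size || subtype_len >= subtype_size)
    return 0;

  memcpy (type, mime, type_len);
  type[type_len] = '\0';
  memcpy (subtype, slash + 1, subtype_len + 1);   /* copies the '\0' too */
  return 1;
}
```

Called with "video/mpeg", this fills the two buffers with "video" and "mpeg";
a string without a '/' is rejected.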

So the content type and subtype make up a pair that describes the type of
information contained in a file. In multimedia processing, MIME types are used
to describe the type of information carried by a media stream. In GStreamer, we
use MIME types in the same way, to identify the types of information that are
allowed to pass between GStreamer elements. The MIME type is part of a GstCaps
object that describes a media stream. Besides a MIME type, a GstCaps object also
contains a name and some stream properties (GstProps, which hold combinations of
key/value pairs).

An example of a MIME type is 'video/mpeg'. A corresponding GstCaps could be
created with the following code:

GstCaps *caps = gst_caps_new_simple ("video/mpeg",
                                     "width",  G_TYPE_INT, 384,
                                     "height", G_TYPE_INT, 288,
                                     NULL);

MIME types and their corresponding properties are of major importance in
GStreamer for uniquely identifying media streams. Therefore, we define them
per media type. All GStreamer plugins should keep to this definition.

Official MIME media types are assigned by the IANA. Current assignments are at
http://www.iana.org/assignments/media-types/.

The problems
============

Some streams may have MIME types or GstCaps that do not fully describe the
stream. In most cases, this is not a problem. For example, if a stream
contains Ogg/Vorbis data (which is of type 'application/ogg'), we don't need to
know the samplerate of the raw audio stream, since we can't play the encoded
audio anyway. The samplerate is, however, important for raw audio, so a decoder
would need to retrieve the samplerate from the Ogg/Vorbis stream headers (the
headers are part of the bytestream) in order to pass it on in the GstCaps that
belongs to the decoded audio (which becomes a type like 'audio/raw'). However,
other plugins might want to know such properties, even for compressed streams.
One such example is an AVI muxer, which does want to know the samplerate of an
audio stream, even when it is compressed.

Another problem is that many media types can be defined in multiple ways. For
example, MJPEG video can be defined as 'video/jpeg', 'video/mjpeg',
'image/jpeg', 'video/x-msvideo' with a compression of (fourcc) MJPG, etc.
None of these is really official, since there isn't an official MIME type
for encoded MJPEG video.
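
One way to cope with such competing names is a small alias table that maps
each variant to a single agreed name. The sketch below is only an illustration:
the function mime_canonicalize and the table are hypothetical, and the choice
of 'video/x-jpeg' as the target follows the Motion-JPEG entry in the video
codec list of this document. Whether 'image/jpeg' should really be folded into
the video type is an assumption that would need to be decided per application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One alias and the name it is mapped to. */
struct mime_alias
{
  const char *alias;
  const char *canonical;
};

/* MJPEG variants named in the text; the target name is an assumption
 * taken from the Motion-JPEG entry of this document. */
static const struct mime_alias mjpeg_aliases[] = {
  { "video/jpeg",  "video/x-jpeg" },
  { "video/mjpeg", "video/x-jpeg" },
  { "image/jpeg",  "video/x-jpeg" },
};

/* Return the agreed name for a known alias, or the input string itself
 * when no alias matches. */
const char *
mime_canonicalize (const char *mime)
{
  size_t i;

  assert (mime != NULL);

  for (i = 0; i < sizeof mjpeg_aliases / sizeof mjpeg_aliases[0]; i++) {
    if (strcmp (mime, mjpeg_aliases[i].alias) == 0)
      return mjpeg_aliases[i].canonical;
  }
  return mime;
}
```

The 'video/x-msvideo' plus fourcc MJPG variant is left out on purpose, because
it needs the fourcc in addition to the MIME type to be recognized.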

The main focus of this document is to propose a standardized set of MIME types
and properties that will be used by the GStreamer plugins.

Different types of streams
==========================

There are several types of media streams. The most important distinction is
between container formats, audio codecs and video codecs. Container formats are
bytestreams that contain one or more substreams and don't provide any direct
media data themselves. Examples are Quicktime, AVI or MPEG System Stream. They
mostly consist of a set of headers that define the media streams that are
packed inside the container, along with the media data itself.

Video codecs and audio codecs describe encoded audio or video data. Examples are
MPEG-1 video, DivX video, MPEG-1 layer 3 (MP3) audio or Ogg/Vorbis audio.
Actually, Ogg is a container format too (for Vorbis audio), but these are
usually used in conjunction with each other.

Finally, there are the somewhat obvious (but not commonly encountered as files)
raw data formats.

Container formats
-----------------

1 - AVI (Microsoft RIFF/AVI)
    MIME type: video/x-msvideo
    Properties:
    Parser: avidemux, ffdemux_avi
    Formatter: avimux

2 - Quicktime (Apple)
    MIME type: video/quicktime
    Properties:
    Parser: qtdemux
    Formatter:

3 - MPEG (MPEG LA)
    MIME type: video/mpeg
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: mpegdemux, ffdemux_mpeg (PS), ffdemux_mpegts (TS), dvddemux
    Formatter: mplex

4 - ASF (Microsoft)
    MIME type: video/x-ms-asf
    Properties:
    Parser: asfdemux, ffdemux_asf
    Formatter: asfmux

5 - WAV (Microsoft RIFF/WAV)
    MIME type: audio/x-wav
    Properties:
    Parser: wavparse, ffdemux_wav
    Formatter: wavenc

6 - RealMedia (Real)
    MIME type: application/vnd.rn-realmedia
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: rmdemux, ffdemux_rm
    Formatter:

7 - DV (Digital Video)
    MIME type: video/x-dv
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: gst1394, ffdemux_dv
    Formatter:

8 - Ogg (Xiph)
    MIME type: application/ogg
    Properties:
    Parser: oggdemux
    Formatter: oggmux

9 - Matroska
    MIME type: video/x-mkv
    Properties:
    Parser: matroskademux, ffdemux_matroska
    Formatter: matroskamux

10 - Shockwave (Macromedia)
     MIME type: application/x-shockwave-flash
     Properties:
     Parser: swfdec, ffdemux_swf
     Formatter:

11 - AU audio (Sun)
     MIME type: audio/x-au
     Properties:
     Parser: auparse, ffdemux_au
     Formatter:

12 - Mod audio
     MIME type: audio/x-mod
     Properties:
     Parser: modplug, mikmod
     Formatter:

13 - FLX video
     MIME type: video/x-fli
     Properties:
     Parser: flxdec
     Formatter:

14 - Monkeyaudio
     MIME type: application/x-ape
     Properties:
     Parser:
     Formatter:

15 - AIFF audio
     MIME type: audio/x-aiff
     Properties:
     Parser:
     Formatter:

16 - SID audio
     MIME type: audio/x-sid
     Properties:
     Parser: siddec
     Formatter:

Please note that we try to keep these MIME types as similar as possible to the
MIME types used as standards in Gnome (Gnome-VFS/Nautilus) and KDE
(Konqueror). Both will (in the future) stick to a shared-mime-info database
that is hosted on freedesktop.org and is based on the IANA assignments.

Also, there is a very thin line between audio codecs and audio containers
(take mp3 vs. sid, etc.). This is just a per-case thing right now and needs to
be documented further.

Video codecs
------------

For convenience, the fourcc codes used in the AVI container format will be
listed along with the MIME type and optional properties.

Optional properties for all video formats are the following:

width = 1 - MAXINT (INT)
height = 1 - MAXINT (INT)
pixel_width = 1 - MAXINT (INT, with pixel_height forms aspect ratio)
pixel_height = 1 - MAXINT (INT, with pixel_width forms aspect ratio)
framerate = 0 - MAXFLOAT (FLOAT)

1 - MPEG-1, -2 and -4 video (ISO/LA MPEG)
    MIME type: video/mpeg
    Properties: systemstream = FALSE (BOOLEAN)
                mpegversion = 1/2/4 (INT)
    Known fourccs: MPEG, MPGI
    Encoder: mpeg1enc, mpeg2enc
    Decoder: mpeg1dec, mpeg2dec, mpeg2subt

2 - DivX 3.x, 4.x and 5.x video (divx.com)
    MIME type: video/x-divx
    Properties:
    Optional properties: divxversion = 3/4/5 (INT)
    Known fourccs: DIV3, DIV4, DIV5, DIVX, DX50, divx
    Encoder: divxenc
    Decoder: divxdec, ffdec_mpeg4

3 - Microsoft MPEG 4.1, 4.2 and 4.3
    MIME type: video/x-msmpeg
    Properties:
    Optional properties: msmpegversion = 41/42/43 (INT)
    Known fourccs: MPG4, MP42, MP43
    Encoder: ffenc_msmpeg4, ffenc_msmpeg4v1, ffenc_msmpeg4v2
    Decoder: ffdec_msmpeg4, ffdec_msmpeg4v1, ffdec_msmpeg4v2

4 - Motion-JPEG (official and extended)
    MIME type: video/x-jpeg
    Properties:
    Known fourccs: MJPG (YUY2 MJPEG), JPEG (any), PIXL (Pinnacle/Miro), VIXL
    Encoder: jpegenc
    Decoder: jpegdec, ffdec_mjpeg

5 - Sorensen (Quicktime - SVQ1/SVQ3)
    MIME type: video/x-svq
    Properties: svqversion = 1/3 (INT)
    Encoder:
    Decoder: ffdec_svq1, ffdec_svq3

6 - H263 and related codecs
    MIME type: video/x-h263
    Properties:
    Known fourccs: H263/h263, i263, L263, M263/m263, s263, x263, VDOW, VIVO
    Encoder: ffenc_h263, ffenc_h263p
    Decoder: ffdec_h263, ffdec_h263i

7 - RealVideo (Real)
    MIME type: video/x-pn-realvideo
    Properties: rmversion = 1/2/3/4 (INT)
    Known fourccs: RV10, RV20, RV30, RV40
    Encoder: ffenc_rv10
    Decoder: ffdec_rv10, ffdec_rv20

8 - Digital Video (DV)
    MIME type: video/x-dv
    Properties: systemstream = FALSE (BOOLEAN)
    Known fourccs: DVSD/dvsd (SDTV), dvhd (HDTV), dvsl (SDTV LongPlay)
    Encoder: ffenc_dvvideo
    Decoder: dvdec, ffdec_dvvideo

9 - Windows Media Video 1, 2 and 3 (WMV)
    MIME type: video/x-wmv
    Properties: wmvversion = 1/2/3 (INT)
    Encoder: ffenc_wmv1, ffenc_wmv2, none
    Decoder: ffdec_wmv1, ffdec_wmv2, none

10 - XviD (xvid.org)
     MIME type: video/x-xvid
     Properties:
     Known fourccs: xvid, XVID
     Encoder: xvidenc
     Decoder: xviddec, ffdec_mpeg4

11 - 3IVX (3ivx.org)
     MIME type: video/x-3ivx
     Properties:
     Known fourccs: 3IV0, 3IV1, 3IV2
     Encoder:
     Decoder:

12 - Ogg/Tarkin (Xiph)
     MIME type: video/x-tarkin
     Properties:
     Encoder:
     Decoder:

13 - VP3
     MIME type: video/x-vp3
     Properties:
     Encoder:
     Decoder: ffdec_vp3

14 - Ogg/Theora (Xiph, VP3-like)
     MIME type: video/x-theora
     Properties:
     Encoder: theoraenc
     Decoder: theoradec, ffdec_theora
     This is the raw stream that comes out of an Ogg file.

15 - Huffyuv
     MIME type: video/x-huffyuv
     Properties:
     Known fourccs: HFYU
     Encoder:
     Decoder: ffdec_hfyu

16 - FF Video 1 (FFMPEG)
     MIME type: video/x-ffv
     Properties: ffvversion = 1 (INT)
     Encoder:
     Decoder: ffdec_ffv1

17 - H264
     MIME type: video/x-h264
     Properties:
     Known fourccs: VSSH
     Encoder:
     Decoder: ffdec_h264

18 - Indeo 3 (Intel)
     MIME type: video/x-indeo
     Properties: indeoversion = 3 (INT)
     Encoder:
     Decoder: ffdec_indeo3

19 - Portable Network Graphics (PNG)
     MIME type: video/x-png
     Properties:
     Encoder: pngenc
     Decoder: pngdec, gdkpixbufdec

20 - Cinepak
     MIME type: video/x-cinepak
     Properties:
     Encoder:
     Decoder: ffdec_cinepak
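
The fourcc listings above lend themselves to a simple lookup table, for
example in a demuxer that has to turn an AVI fourcc into a MIME type. The
following is a minimal sketch under stated assumptions: the function
fourcc_to_mime and the table are hypothetical, only a subset of the fourccs
listed above is included, and the comparison is exact (case-sensitive) because
the list treats e.g. 'xvid' and 'XVID' as separate codes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One fourcc and the MIME type this document assigns to it. */
struct fourcc_entry
{
  const char *fourcc;
  const char *mime;
};

/* Subset of the "Known fourccs" lines in the video codec list. */
static const struct fourcc_entry fourcc_table[] = {
  { "MPEG", "video/mpeg" },
  { "DIV3", "video/x-divx" },
  { "DX50", "video/x-divx" },
  { "MP43", "video/x-msmpeg" },
  { "MJPG", "video/x-jpeg" },
  { "H263", "video/x-h263" },
  { "RV10", "video/x-pn-realvideo" },
  { "DVSD", "video/x-dv" },
  { "dvsd", "video/x-dv" },
  { "XVID", "video/x-xvid" },
  { "xvid", "video/x-xvid" },
  { "3IV1", "video/x-3ivx" },
  { "HFYU", "video/x-huffyuv" },
  { "VSSH", "video/x-h264" },
};

/* Return the MIME type for a four-character code, or NULL when the code
 * is not exactly four characters long or is not in the table. */
const char *
fourcc_to_mime (const char *fourcc)
{
  size_t i;

  assert (fourcc != NULL);

  if (strlen (fourcc) != 4)
    return NULL;

  for (i = 0; i < sizeof fourcc_table / sizeof fourcc_table[0]; i++) {
    if (strcmp (fourcc, fourcc_table[i].fourcc) == 0)
      return fourcc_table[i].mime;
  }
  return NULL;
}
```

A lookup only yields the MIME type; properties such as divxversion or
rmversion would still have to be set separately.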

TODO: subsampling information for YUV?

TODO: colorspace identifications for MJPEG? How?

TODO: how to distinguish MJPEG-A/B (Quicktime) and lossless JPEG?

TODO: divx4/divx5/xvid/3ivx/mpeg-4 - how to make them overlap? (all
      ISO MPEG-4 compatible)

Audio codecs
------------

For convenience, the two-byte hexcodes (as used for identification in AVI files)
are also given.

Properties for all audio formats include the following:

rate = 1 - MAXINT (INT, sampling rate)
channels = 1 - MAXINT (INT, number of audio channels)

1 - Alaw Raw Audio
    MIME type: audio/x-alaw
    Properties:
    Encoder: alawenc
    Decoder: alawdec

2 - Mulaw Raw Audio
    MIME type: audio/x-mulaw
    Properties:
    Encoder: mulawenc
    Decoder: mulawdec

3 - MPEG-1 layer 1/2/3 audio
    MIME type: audio/mpeg
    Properties: mpegversion = 1 (INT)
                layer = 1/2/3 (INT)
    Encoder: lame
    Decoder: mad, ffdec_mp3

4 - Ogg/Vorbis
    MIME type: audio/x-vorbis
    Encoder: rawvorbisenc (vorbisenc does rawvorbisenc+oggmux)
    Decoder: vorbisdec

5 - Windows Media Audio 1, 2 and 3 (WMA)
    MIME type: audio/x-wma
    Properties: wmaversion = 1/2/3 (INT)
    Encoder:
    Decoder: ffdec_wmav1, ffdec_wmav2, none

6 - AC3
    MIME type: audio/x-ac3
    Properties:
    Encoder: ffenc_ac3
    Decoder: a52dec, ac3parse

7 - FLAC (Free Lossless Audio Codec)
    MIME type: audio/x-flac
    Properties:
    Encoder: flacenc
    Decoder: flacdec, ffdec_flac

8 - MACE 3/6 (Quicktime audio)
    MIME type: audio/x-mace
    Properties: maceversion = 3/6 (INT)
    Encoder:
    Decoder: ffdec_mace3, ffdec_mace6

9 - MPEG-4 AAC
    MIME type: audio/mpeg
    Properties: mpegversion = 4 (INT)
    Encoder: faac
    Decoder: faad

10 - (IMA) ADPCM (Quicktime/WAV/Microsoft/4XM)
     MIME type: audio/x-adpcm
     Properties: layout = "quicktime"/"wav"/"microsoft"/"4xm"/"g721"/"g722"/"g723_3"/"g723_5" (STRING)
     Encoder: ffenc_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffenc_adpcm_[ms/4xm/xa/adx/ea]
     Decoder: ffdec_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffdec_adpcm_[ms/4xm/xa/adx/ea]

     Note: The difference between each of these four ADPCM formats is the
           number of samples packed together per channel. For WAV, for example,
           each sample is 4 bits, and 8 samples are packed together per channel
           in the bytestream. For the others, refer to technical documentation.
           We probably want to distinguish these differently, but I don't know
           how yet.

11 - RealAudio (Real)
     MIME type: audio/x-pn-realaudio
     Properties: raversion = 1/2 (INT)
     Known fourccs: 14_4, 28_8
     Encoder:
     Decoder: ffdec_real_144, ffdec_real_288

12 - DV Audio
     MIME type: audio/x-dv
     Properties:
     Encoder:
     Decoder:

13 - GSM Audio
     MIME type: audio/x-gsm
     Properties:
     Encoder: gsmenc, rtpgsmenc
     Decoder: gsmdec, rtpgsmparse

14 - Speex audio
     MIME type: audio/x-speex
     Properties:
     Encoder: speexenc
     Decoder: speexdec

15 - QDM2
     MIME type: audio/x-qdm2
     Properties:

16 - Sony ATRAC3 (detected inside RealMedia and WAV/AVI streams, nothing to decode it yet)
     MIME type: audio/x-vnd.sony.atrac3
     Properties:
     Encoder:
     Decoder:

17 - Ensoniq PARIS audio
     MIME type: audio/x-paris
     Properties:
     Encoder:
     Decoder:

18 - Amiga IFF / SVX8 / SV16 audio
     MIME type: audio/x-svx
     Properties:
     Encoder:
     Decoder:

19 - Sphere NIST audio
     MIME type: audio/x-nist
     Properties:
     Encoder:
     Decoder:

20 - Sound Blaster VOC audio
     MIME type: audio/x-voc
     Properties:
     Encoder:
     Decoder:

21 - Berkeley/IRCAM/CARL audio
     MIME type: audio/x-ircam
     Properties:
     Encoder:
     Decoder:

22 - Sonic Foundry's 64 bit RIFF/WAV
     MIME type: audio/x-w64
     Properties:
     Encoder:
     Decoder:

TODO: adpcm/dv needs confirmation from someone with knowledge...

Raw formats
-----------

Raw formats contain unencoded, raw media information. These are rather rare from
an end user point of view since raw media files have historically been
prohibitively large, hence the multitude of encoding formats.

Raw video formats require the following common properties, in addition to
format-specific properties:

width = 1 - MAXINT (INT)
height = 1 - MAXINT (INT)

1 - Raw Video (YUV/YCbCr)
    MIME type: video/x-raw-yuv
    Properties: 'format' = 'XXXX' (fourcc)
    Known fourccs: YUY2, I420, Y41P, YVYU, UYVY, etc.

    Some raw video formats have implicit alignment rules. We should discuss this
    more. Also, some formats have multiple fourccs (e.g. IYUV/I420 or
    YUY2/YUYV). For each of these, we only use one (e.g. I420 and YUY2).

    Currently recognized formats:

    YUY2: packed, Y-U-Y-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YVYU: packed, Y-V-Y-U order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    UYVY: packed, U-Y-V-Y order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    Y41P: packed, UYVYUYVYYYYY order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
    IUY2: packed, U-Y-V order, not subsampled (YUV-1:1:1, 24 bpp)

    Y42B: planar, Y-U-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YV12: planar, Y-V-U order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    I420: planar, Y-U-V order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    Y41B: planar, Y-U-V order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
    YUV9: planar, Y-U-V order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)
    YVU9: planar, Y-V-U order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)

    Y800: one-plane (Y-only, YUV-4:0:0, 8 bpp)

    See http://www.fourcc.org/ for more information.

    Note: YUV-4:4:4 (both planar and packed, in multiple orders) are missing.
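
The bits-per-pixel figures in the list above allow a rough estimate of the
size of one frame. The sketch below is an illustration only: yuv_frame_size
and its table are hypothetical names, the bpp values are copied from the list,
and the result is the nominal size (width * height * bpp / 8). It deliberately
ignores the implicit alignment rules mentioned above, so real buffers may be
larger, and it rounds down for dimensions that are not a multiple of the
subsampling factor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A YUV fourcc and its average number of bits per pixel. */
struct yuv_format
{
  const char *fourcc;
  unsigned bpp;
};

/* Values taken from the "Currently recognized formats" list. */
static const struct yuv_format yuv_formats[] = {
  { "YUY2", 16 }, { "YVYU", 16 }, { "UYVY", 16 }, { "Y41P", 12 },
  { "IUY2", 24 },
  { "Y42B", 16 }, { "YV12", 12 }, { "I420", 12 }, { "Y41B", 12 },
  { "YUV9", 9 },  { "YVU9", 9 },
  { "Y800", 8 },
};

/* Nominal size in bytes of one frame, or 0 for an unknown fourcc.
 * Row padding and alignment are NOT taken into account. */
unsigned long
yuv_frame_size (const char *fourcc, unsigned long width, unsigned long height)
{
  size_t i;

  assert (fourcc != NULL);

  for (i = 0; i < sizeof yuv_formats / sizeof yuv_formats[0]; i++) {
    if (strcmp (fourcc, yuv_formats[i].fourcc) == 0)
      return width * height * yuv_formats[i].bpp / 8;
  }
  return 0;
}
```

For a 320x240 frame this gives 115200 bytes for I420 (12 bpp) and 153600
bytes for YUY2 (16 bpp).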

2 - Raw video (RGB)
    MIME type: video/x-raw-rgb
    Properties: endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN
                depth = 15/16/24 (INT, color depth)
                bpp = 16/24/32 (INT, bits used to store each pixel)
                red_mask = bitmask (0x..) (INT)
                green_mask = bitmask (0x..) (INT)
                blue_mask = bitmask (0x..) (INT)

    24 and 32 bit RGB should always be specified as big endian, since any little
    endian format can be transformed into big endian by rearranging the color
    masks. 15 and 16 bit formats should generally have the same byte order as
    the CPU.

    Color masks are interpreted by loading 'bpp' number of bits using the given
    'endianness', and masking and shifting by each color mask. Loading a 24-bit
    value cannot be done directly, but one can perform an equivalent operation.

    Examples:
               msb .. lsb
      - memory: RRRRRRRR GGGGGGGG BBBBBBBB RRRRRRRR GGGGGGGG ...
                bpp        = 24
                depth      = 24
                endianness = 4321 (G_BIG_ENDIAN)
                red_mask   = 0xff0000
                green_mask = 0x00ff00
                blue_mask  = 0x0000ff

      - memory: xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG ...
                bpp        = 16
                depth      = 15
                endianness = 4321 (G_BIG_ENDIAN)
                red_mask   = 0x7c00
                green_mask = 0x03e0
                blue_mask  = 0x001f

      - memory: GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB ...
                bpp        = 16
                depth      = 15
                endianness = 1234 (G_LITTLE_ENDIAN)
                red_mask   = 0x7c00
                green_mask = 0x03e0
                blue_mask  = 0x001f
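
The mask interpretation described above (load 'bpp' bits using the given
'endianness', then mask and shift) can be sketched in plain C as follows. The
function names load_pixel and extract_component and the ENDIAN_* constants are
assumptions made for this example; the constants merely mirror the values 1234
and 4321 used by the 'endianness' property. Loading byte by byte is one way to
perform the "equivalent operation" for 24-bit pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the endianness property values (hypothetical names). */
#define ENDIAN_LITTLE 1234
#define ENDIAN_BIG    4321

/* Load one pixel of 'bpp' bits (16, 24 or 32) from memory, assembling
 * the bytes in the given byte order. */
uint32_t
load_pixel (const uint8_t *p, int bpp, int endianness)
{
  uint32_t value = 0;
  int nbytes = bpp / 8;
  int i;

  assert (p != 0);
  assert (bpp == 16 || bpp == 24 || bpp == 32);
  assert (endianness == ENDIAN_LITTLE || endianness == ENDIAN_BIG);

  for (i = 0; i < nbytes; i++) {
    if (endianness == ENDIAN_BIG)
      value = (value << 8) | p[i];            /* first byte is the msb */
    else
      value |= (uint32_t) p[i] << (8 * i);    /* first byte is the lsb */
  }
  return value;
}

/* Apply a color mask and shift the component down to bit 0. */
uint32_t
extract_component (uint32_t pixel, uint32_t mask)
{
  assert (mask != 0);

  pixel &= mask;
  while ((mask & 1) == 0) {
    mask >>= 1;
    pixel >>= 1;
  }
  return pixel;
}
```

With the 24-bit example, the bytes 0x12 0x34 0x56 load as 0x123456, and the
masks 0xff0000, 0x00ff00 and 0x0000ff yield 0x12, 0x34 and 0x56. For the two
16-bit examples, the same pixel value results whether the bytes are stored
big endian or little endian, provided the matching 'endianness' is used.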

The raw audio formats require the following common properties, in addition to
format-specific properties:

rate = 1 - MAXINT (INT, sampling rate)
channels = 1 - MAXINT (INT, number of audio channels)
endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN/G_BYTE_ORDER

3 - Raw audio (integer format)
    MIME type: audio/x-raw-int
    Properties: width = 8/16/24/32 (INT, bits used to store each sample)
                depth = 8 - 32 (INT, bits actually used per sample)
                signed = TRUE/FALSE (BOOLEAN)
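
To illustrate how 'width' and 'depth' relate, here is a small sketch that
checks a set of integer audio properties against the ranges listed above and
derives the data rate from them. The struct and function names are
hypothetical, the 'endianness' property is left out for brevity, and the rule
that 'depth' may not exceed 'width' is an assumption that follows from the
property descriptions rather than something stated explicitly.

```c
#include <assert.h>

/* Properties of an audio/x-raw-int stream (endianness omitted). */
struct raw_int_audio
{
  int rate;        /* sampling rate */
  int channels;    /* number of audio channels */
  int width;       /* bits used to store each sample */
  int depth;       /* bits actually used per sample */
  int is_signed;   /* TRUE/FALSE */
};

/* Return 1 if the values fall inside the listed ranges, 0 otherwise. */
int
raw_int_audio_is_valid (const struct raw_int_audio *a)
{
  assert (a != 0);

  if (a->rate < 1 || a->channels < 1)
    return 0;
  if (a->width != 8 && a->width != 16 && a->width != 24 && a->width != 32)
    return 0;
  if (a->depth < 8 || a->depth > 32)
    return 0;
  if (a->depth > a->width)     /* assumption: used bits fit in storage */
    return 0;
  return 1;
}

/* Bytes of audio data per second: rate * channels * (width / 8).
 * Storage is determined by 'width', not by 'depth'. */
long
raw_int_audio_bytes_per_second (const struct raw_int_audio *a)
{
  assert (raw_int_audio_is_valid (a));

  return (long) a->rate * a->channels * (a->width / 8);
}
```

For 16-bit stereo audio at 44100 Hz this gives 176400 bytes per second. A
stream with 24 significant bits stored in 32-bit samples (depth = 24,
width = 32) is valid and occupies 4 bytes per sample.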

4 - Raw audio (floating point format)
    MIME type: audio/x-raw-float
    Properties: width = 32/64 (INT)
                buffer-frames = number of audio frames per buffer, 0 = undefined

Plugin Guidelines
=================

So, a short bit on what plugins should do. Above, I've stated that audio
properties like 'channels' and 'rate' or video properties like 'width' and
'height' are all optional. This doesn't mean you can just simply omit them and
everything will still work!

An example is the best way to explain all this. AVI needs the width, height,
rate and channels for the AVI header. So if these properties are missing, the
avimux element cannot properly create the AVI header. On the other hand, MPEG
doesn't have such properties in its header, so the mpegdemux element would need
to parse the separate streams in order to find them out. We don't want that
either, because a plugin only does one job. So normally, mpegdemux and avimux
wouldn't allow transcoding. To solve this problem, there are stream parser
elements (such as mpegaudioparse, ac3parse and mpeg1videoparse).
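
The situation can be modelled in a few lines of C. This is a sketch under
stated assumptions, not how avimux is implemented: the stream_info struct, the
PROP_UNSET marker and the function avi_header_info_complete are all
hypothetical and only mirror the rule that an AVI header needs width and
height for video and rate and channels for audio.

```c
#include <assert.h>
#include <string.h>

/* Marker for an optional property that has not been provided. */
#define PROP_UNSET (-1)

/* A much simplified stand-in for the caps of one stream. */
struct stream_info
{
  const char *mime;
  int width;
  int height;
  int rate;
  int channels;
};

/* Return 1 if the stream carries the properties an AVI header needs,
 * 0 otherwise (including for types that are neither audio nor video). */
int
avi_header_info_complete (const struct stream_info *s)
{
  assert (s != 0 && s->mime != 0);

  if (strncmp (s->mime, "video/", 6) == 0)
    return s->width != PROP_UNSET && s->height != PROP_UNSET;
  if (strncmp (s->mime, "audio/", 6) == 0)
    return s->rate != PROP_UNSET && s->channels != PROP_UNSET;
  return 0;
}
```

An 'audio/mpeg' stream as it leaves a demuxer, with rate and channels unset,
fails this check; the same stream after a parser has filled in both
properties passes it.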

Conclusions to draw from here: a plugin gives info it can provide as seen from
its own task/job. If it can't, other elements might still need it and a stream
parser needs to be written if it doesn't already exist.

On properties that can be described by one of these (properties such as 'width',
'height', 'fps', etc.): they're forbidden and should be handled using filtered
caps.

Status of this document
=======================

Not all plugins strictly follow these guidelines yet, but these are the official
types. Plugins not following these specs either use extensions that should be
documented, or are buggy (and should be fixed).

Blame Ronald Bultje <rbultje@ronald.bitfreak.net> aka BBB for any mistakes in
this document.