MIME types in GStreamer

What is a MIME type?
====================

A MIME type is a combination of two short strings: the content type and the
content subtype. Content types are broad categories used for describing
almost all types of files: video, audio, text, and application are common
content types. The subtype further breaks the content type down into a more
specific type description, for example 'application/ogg', 'audio/raw',
'video/mpeg', or 'text/plain'.

So the content type and subtype make up a pair that describes the type of
information contained in a file. In multimedia processing, MIME types are used
to describe the type of information carried by a media stream. In GStreamer, we
use MIME types in the same way, to identify the types of information that are
allowed to pass between GStreamer elements. The MIME type is part of a GstCaps
object that describes a media stream. Besides a MIME type, a GstCaps object also
contains a name and some stream properties (GstProps, which hold combinations of
key/value pairs).

An example of a MIME type is 'video/mpeg'. A corresponding GstCaps could be
created with the following code:

GstCaps *caps = gst_caps_new_simple ("video/mpeg",
                                     "width", G_TYPE_INT, 384,
                                     "height", G_TYPE_INT, 288,
                                     NULL);

MIME types and their corresponding properties are of major importance in
GStreamer for uniquely identifying media streams. Therefore, we define them
per media type. All GStreamer plugins should keep to this definition.

Official MIME media types are assigned by the IANA. Current assignments are at
http://www.iana.org/assignments/media-types/.

The problems
============

Some streams may have MIME types or GstCaps that do not fully describe the
stream. In most cases, this is not a problem, though.
For example, if a stream
contains Ogg/Vorbis data (which is of type 'application/ogg'), we don't need to
know the sample rate of the raw audio stream, since we can't play the encoded
audio anyway. The sample rate is, however, important for raw audio, so a decoder
would need to retrieve the sample rate from the Ogg/Vorbis stream headers (the
headers are part of the bytestream) in order to pass it on in the GstCaps that
belongs to the decoded audio (which becomes a type like 'audio/raw'). However,
other plugins might want to know such properties, even for compressed streams.
One such example is an AVI muxer, which does want to know the sample rate of an
audio stream, even when it is compressed.

Another problem is that many media types can be defined in multiple ways. For
example, MJPEG video can be defined as 'video/jpeg', 'video/mjpeg',
'image/jpeg', 'video/x-msvideo' with a compression of (fourcc) MJPG, etc.
None of these is really official, since there isn't an official MIME type
for encoded MJPEG video.

The main focus of this document is to propose a standardized set of MIME types
and properties that will be used by the GStreamer plugins.

Different types of streams
==========================

There are several types of media streams. The most important distinction is
between container formats, audio codecs and video codecs. Container formats are
bytestreams that contain one or more substreams, and don't provide any
direct media data themselves. Examples are Quicktime, AVI or MPEG System Stream.
They mostly consist of a set of headers that define the media streams that are
packed inside the container, along with the media data itself.

Video codecs and audio codecs describe encoded audio or video data. Examples are
MPEG-1 video, DivX video, MPEG-1 layer 3 (MP3) audio or Ogg/Vorbis audio.
Actually, Ogg is a container format too (for Vorbis audio), but these are
usually used in conjunction with each other.

Finally, there are the somewhat obvious (but not commonly encountered as files)
raw data formats.

Container formats
-----------------

1 - AVI (Microsoft RIFF/AVI)
    MIME type: video/x-msvideo
    Properties:
    Parser: avidemux, ffdemux_avi
    Formatter: avimux

2 - Quicktime (Apple)
    MIME type: video/quicktime
    Properties:
    Parser: qtdemux
    Formatter:

3 - MPEG (MPEG LA)
    MIME type: video/mpeg
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: mpegdemux, ffdemux_mpeg (PS), ffdemux_mpegts (TS), dvddemux
    Formatter: mplex

4 - ASF (Microsoft)
    MIME type: video/x-ms-asf
    Properties:
    Parser: asfdemux, ffdemux_asf
    Formatter: asfmux

5 - WAV (Microsoft RIFF/WAV)
    MIME type: audio/x-wav
    Properties:
    Parser: wavparse, ffdemux_wav
    Formatter: wavenc

6 - RealMedia (Real)
    MIME type: application/vnd.rn-realmedia
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: rmdemux, ffdemux_rm
    Formatter:

7 - DV (Digital Video)
    MIME type: video/x-dv
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: gst1394, ffdemux_dv
    Formatter:

8 - Ogg (Xiph)
    MIME type: application/ogg
    Properties:
    Parser: oggdemux
    Formatter: oggmux

9 - Matroska
    MIME type: video/x-mkv
    Properties:
    Parser: matroskademux, ffdemux_matroska
    Formatter: matroskamux

10 - Shockwave (Macromedia)
    MIME type: application/x-shockwave-flash
    Properties:
    Parser: swfdec, ffdemux_swf
    Formatter:

11 - AU audio (Sun)
    MIME type: audio/x-au
    Properties:
    Parser: auparse, ffdemux_au
    Formatter:

12 - Mod audio
    MIME type: audio/x-mod
    Properties:
    Parser: modplug, mikmod
    Formatter:

13 - FLX video
    MIME type: video/x-fli
    Properties:
    Parser: flxdec
    Formatter:

14 - Monkeyaudio
    MIME type: application/x-ape
    Properties:
    Parser:
    Formatter:

15 - AIFF audio
    MIME type: audio/x-aiff
    Properties:
    Parser:
    Formatter:

16 - SID audio
    MIME type: audio/x-sid
    Properties:
    Parser: siddec
    Formatter:

Please note that we try to keep these MIME types as similar as possible to the
MIME types used as standards in Gnome (Gnome-VFS/Nautilus) and KDE
(Konqueror). Both will (in future) stick to a shared-mime-info database that
is hosted on freedesktop.org and is based on the IANA assignments.

Also, there is a very thin line between audio codecs and audio containers
(take mp3 vs. sid, etc.). For now this is decided case by case and needs to
be documented further.

Video codecs
------------

For convenience, the fourcc codes used in the AVI container format will be
listed along with the MIME type and optional properties.

Optional properties for all video formats are the following:

width        = 1 - MAXINT (INT)
height       = 1 - MAXINT (INT)
pixel_width  = 1 - MAXINT (INT, with pixel_height forms aspect ratio)
pixel_height = 1 - MAXINT (INT, with pixel_width forms aspect ratio)
framerate    = 0 - MAXFLOAT (FLOAT)

1 - MPEG-1, -2 and -4 video (ISO/LA MPEG)
    MIME type: video/mpeg
    Properties: systemstream = FALSE (BOOLEAN)
                mpegversion = 1/2/4 (INT)
    Known fourccs: MPEG, MPGI
    Encoder: mpeg1enc, mpeg2enc
    Decoder: mpeg1dec, mpeg2dec, mpeg2subt

2 - DivX 3.x, 4.x and 5.x video (divx.com)
    MIME type: video/x-divx
    Properties:
    Optional properties: divxversion = 3/4/5 (INT)
    Known fourccs: DIV3, DIV4, DIV5, DIVX, DX50, divx
    Encoder: divxenc
    Decoder: divxdec, ffdec_mpeg4

3 - Microsoft MPEG 4.1, 4.2 and 4.3
    MIME type: video/x-msmpeg
    Properties:
    Optional properties: msmpegversion = 41/42/43 (INT)
    Known fourccs: MPG4, MP42, MP43
    Encoder: ffenc_msmpeg4, ffenc_msmpeg4v1,
             ffenc_msmpeg4v2
    Decoder: ffdec_msmpeg4, ffdec_msmpeg4v1, ffdec_msmpeg4v2

4 - Motion-JPEG (official and extended)
    MIME type: video/x-jpeg
    Properties:
    Known fourccs: MJPG (YUY2 MJPEG), JPEG (any), PIXL (Pinnacle/Miro), VIXL
    Encoder: jpegenc
    Decoder: jpegdec, ffdec_mjpeg

5 - Sorensen (Quicktime - SVQ1/SVQ3)
    MIME type: video/x-svq
    Properties: svqversion = 1/3 (INT)
    Encoder:
    Decoder: ffdec_svq1, ffdec_svq3

6 - H263 and related codecs
    MIME type: video/x-h263
    Properties:
    Known fourccs: H263/h263, i263, L263, M263/m263, s263, x263, VDOW, VIVO
    Encoder: ffenc_h263, ffenc_h263p
    Decoder: ffdec_h263, ffdec_h263i

7 - RealVideo (Real)
    MIME type: video/x-pn-realvideo
    Properties: rmversion = 1/2/3/4 (INT)
    Known fourccs: RV10, RV20, RV30, RV40
    Encoder: ffenc_rv10
    Decoder: ffdec_rv10, ffdec_rv20

8 - Digital Video (DV)
    MIME type: video/x-dv
    Properties: systemstream = FALSE (BOOLEAN)
    Known fourccs: DVSD/dvsd (SDTV), dvhd (HDTV), dvsl (SDTV LongPlay)
    Encoder: ffenc_dvvideo
    Decoder: dvdec, ffdec_dvvideo

9 - Windows Media Video 1, 2 and 3 (WMV)
    MIME type: video/x-wmv
    Properties: wmvversion = 1/2/3 (INT)
    Encoder: ffenc_wmv1, ffenc_wmv2, none
    Decoder: ffdec_wmv1, ffdec_wmv2, none

10 - XviD (xvid.org)
    MIME type: video/x-xvid
    Properties:
    Known fourccs: xvid, XVID
    Encoder: xvidenc
    Decoder: xviddec, ffdec_mpeg4

11 - 3IVX (3ivx.org)
    MIME type: video/x-3ivx
    Properties:
    Known fourccs: 3IV0, 3IV1, 3IV2
    Encoder:
    Decoder:

12 - Ogg/Tarkin (Xiph)
    MIME type: video/x-tarkin
    Properties:
    Encoder:
    Decoder:

13 - VP3
    MIME type: video/x-vp3
    Properties:
    Encoder:
    Decoder: ffdec_vp3

14 - Ogg/Theora (Xiph, VP3-like)
    MIME type: video/x-theora
    Properties:
    Encoder: theoraenc
    Decoder: theoradec, ffdec_theora
    This is the raw
    stream that comes out of an Ogg file.

15 - Huffyuv
    MIME type: video/x-huffyuv
    Properties:
    Known fourccs: HFYU
    Encoder:
    Decoder: ffdec_hfyu

16 - FF Video 1 (FFMPEG)
    MIME type: video/x-ffv
    Properties: ffvversion = 1 (INT)
    Encoder:
    Decoder: ffdec_ffv1

17 - H264
    MIME type: video/x-h264
    Properties:
    Known fourccs: VSSH
    Encoder:
    Decoder: ffdec_h264

18 - Indeo 3 (Intel)
    MIME type: video/x-indeo
    Properties: indeoversion = 3 (INT)
    Encoder:
    Decoder: ffdec_indeo3

19 - Portable Network Graphics (PNG)
    MIME type: video/x-png
    Properties:
    Encoder: pngenc
    Decoder: pngdec, gdkpixbufdec

20 - Cinepak
    MIME type: video/x-cinepak
    Properties:
    Encoder:
    Decoder: ffdec_cinepak

TODO: subsampling information for YUV?

TODO: colorspace identifications for MJPEG? How?

TODO: how to distinguish MJPEG-A/B (Quicktime) and lossless JPEG?

TODO: divx4/divx5/xvid/3ivx/mpeg-4 - how to make them overlap? (all
      ISO MPEG-4 compatible)

Audio codecs
------------

For convenience, the two-byte hexcodes (as used for identification in AVI files)
are also given.

Properties for all audio formats include the following:

rate     = 1 - MAXINT (INT, sampling rate)
channels = 1 - MAXINT (INT, number of audio channels)

1 - Alaw Raw Audio
    MIME type: audio/x-alaw
    Properties:
    Encoder: alawenc
    Decoder: alawdec

2 - Mulaw Raw Audio
    MIME type: audio/x-mulaw
    Properties:
    Encoder: mulawenc
    Decoder: mulawdec

3 - MPEG-1 layer 1/2/3 audio
    MIME type: audio/mpeg
    Properties: mpegversion = 1 (INT)
                layer = 1/2/3 (INT)
    Encoder: lame
    Decoder: mad, ffdec_mp3

4 - Ogg/Vorbis
    MIME type: audio/x-vorbis
    Encoder: rawvorbisenc (vorbisenc does rawvorbisenc+oggmux)
    Decoder: vorbisdec

5 - Windows Media Audio 1, 2 and 3 (WMA)
    MIME type: audio/x-wma
    Properties: wmaversion = 1/2/3 (INT)
    Encoder:
    Decoder: ffdec_wmav1, ffdec_wmav2, none

6 - AC3
    MIME type: audio/x-ac3
    Properties:
    Encoder: ffenc_ac3
    Decoder: a52dec, ac3parse

7 - FLAC (Free Lossless Audio Codec)
    MIME type: audio/x-flac
    Properties:
    Encoder: flacenc
    Decoder: flacdec, ffdec_flac

8 - MACE 3/6 (Quicktime audio)
    MIME type: audio/x-mace
    Properties: maceversion = 3/6 (INT)
    Encoder:
    Decoder: ffdec_mace3, ffdec_mace6

9 - MPEG-4 AAC
    MIME type: audio/mpeg
    Properties: mpegversion = 4 (INT)
    Encoder: faac
    Decoder: faad

10 - (IMA) ADPCM (Quicktime/WAV/Microsoft/4XM)
    MIME type: audio/x-adpcm
    Properties: layout = "quicktime"/"wav"/"microsoft"/"4xm"/"g721"/"g722"/"g723_3"/"g723_5" (STRING)
    Encoder: ffenc_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffenc_adpcm_[ms/4xm/xa/adx/ea]
    Decoder: ffdec_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffdec_adpcm_[ms/4xm/xa/adx/ea]

    Note: The difference between these four ADPCM formats (Quicktime, WAV,
    Microsoft, 4XM) is the number of samples packed together per channel.
    For WAV, for example, each sample is 4 bits, and 8 samples are packed
    together per channel in the bytestream. For the others, refer to
    technical documentation. We probably want to distinguish these
    differently, but I don't know how, yet.

11 - RealAudio (Real)
    MIME type: audio/x-pn-realaudio
    Properties: raversion = 1/2 (INT)
    Known fourccs: 14_4, 28_8
    Encoder:
    Decoder: ffdec_real_144, ffdec_real_288

12 - DV Audio
    MIME type: audio/x-dv
    Properties:
    Encoder:
    Decoder:

13 - GSM Audio
    MIME type: audio/x-gsm
    Properties:
    Encoder: gsmenc, rtpgsmenc
    Decoder: gsmdec, rtpgsmparse

14 - Speex audio
    MIME type: audio/x-speex
    Properties:
    Encoder: speexenc
    Decoder: speexdec

15 - QDM2
    MIME type: audio/x-qdm2
    Properties:

16 - Sony ATRAC3 (detected inside realmedia and wave/avi streams, nothing to decode it yet)
    MIME type: audio/x-vnd.sony.atrac3
    Properties:
    Encoder:
    Decoder:

17 - Ensoniq PARIS audio
    MIME type: audio/x-paris
    Properties:
    Encoder:
    Decoder:

18 - Amiga IFF / SVX8 / SV16 audio
    MIME type: audio/x-svx
    Properties:
    Encoder:
    Decoder:

19 - Sphere NIST audio
    MIME type: audio/x-nist
    Properties:
    Encoder:
    Decoder:

20 - Sound Blaster VOC audio
    MIME type: audio/x-voc
    Properties:
    Encoder:
    Decoder:

21 - Berkeley/IRCAM/CARL audio
    MIME type: audio/x-ircam
    Properties:
    Encoder:
    Decoder:

22 - Sonic Foundry's 64 bit RIFF/WAV
    MIME type: audio/x-w64
    Properties:
    Encoder:
    Decoder:

TODO: adpcm/dv needs confirmation from someone with knowledge...

Raw formats
-----------

Raw formats contain unencoded, raw media information. These are rather rare from
an end user point of view since raw media files have historically been
prohibitively large, hence the multitude of encoding formats.

Raw video formats require the following common properties, in addition to
format-specific properties:

width  = 1 - MAXINT (INT)
height = 1 - MAXINT (INT)

1 - Raw Video (YUV/YCbCr)
    MIME type: video/x-raw-yuv
    Properties: 'format' = 'XXXX' (fourcc)
    Known fourccs: YUY2, I420, Y41P, YVYU, UYVY, etc.

    Some raw video formats have implicit alignment rules. We should discuss this
    more. Also, some formats have multiple fourccs (e.g. IYUV/I420 or
    YUY2/YUYV). For each of these, we only use one (e.g. I420 and YUY2).

    Currently recognized formats:

    YUY2: packed, Y-U-Y-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YVYU: packed, Y-V-Y-U order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    UYVY: packed, U-Y-V-Y order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    Y41P: packed, UYVYUYVYYYYY order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
    IYU2: packed, U-Y-V order, not subsampled (YUV-4:4:4, 24 bpp)

    Y42B: planar, Y-U-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
    YV12: planar, Y-V-U order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    I420: planar, Y-U-V order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
    Y41B: planar, Y-U-V order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
    YUV9: planar, Y-U-V order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)
    YVU9: planar, Y-V-U order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)

    Y800: one-plane (Y-only, YUV-4:0:0, 8 bpp)

    See http://www.fourcc.org/ for more information.

    Note: other YUV-4:4:4 formats (planar, and packed in other orders) are
    missing.

2 - Raw video (RGB)
    MIME type: video/x-raw-rgb
    Properties: endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN
                depth      = 15/16/24 (INT, color depth)
                bpp        = 16/24/32 (INT, bits used to store each pixel)
                red_mask   = bitmask (0x..) (INT)
                green_mask = bitmask (0x..) (INT)
                blue_mask  = bitmask (0x..) (INT)

    24 and 32 bit RGB should always be specified as big endian, since any little
    endian format can be transformed into big endian by rearranging the color
    masks. 15 and 16 bit formats should generally have the same byte order as
    the CPU.

    Color masks are interpreted by loading 'bpp' number of bits using the given
    'endianness', and masking and shifting by each color mask. Loading a 24-bit
    value cannot be done directly, but one can perform an equivalent operation.

    Examples:
                 msb .. lsb
    - memory: RRRRRRRR GGGGGGGG BBBBBBBB RRRRRRRR GGGGGGGG ...
              bpp        = 24
              depth      = 24
              endianness = 4321 (G_BIG_ENDIAN)
              red_mask   = 0xff0000
              green_mask = 0x00ff00
              blue_mask  = 0x0000ff

    - memory: xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG ...
              bpp        = 16
              depth      = 15
              endianness = 4321 (G_BIG_ENDIAN)
              red_mask   = 0x7c00
              green_mask = 0x03e0
              blue_mask  = 0x001f

    - memory: GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB ...
              bpp        = 16
              depth      = 15
              endianness = 1234 (G_LITTLE_ENDIAN)
              red_mask   = 0x7c00
              green_mask = 0x03e0
              blue_mask  = 0x001f

The raw audio formats require the following common properties, in addition to
format-specific properties:

rate       = 1 - MAXINT (INT, sampling rate)
channels   = 1 - MAXINT (INT, number of audio channels)
endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN/G_BYTE_ORDER

3 - Raw audio (integer format)
    MIME type: audio/x-raw-int
    Properties: width  = 8/16/24/32 (INT, bits used to store each sample)
                depth  = 8 - 32 (INT, bits actually used per sample)
                signed = TRUE/FALSE (BOOLEAN)

4 - Raw audio (floating point format)
    MIME type: audio/x-raw-float
    Properties: width = 32/64 (INT)
                buffer-frames = number of audio frames per buffer (0 = undefined)

Plugin Guidelines
=================

So, a short bit on what plugins should do.
Above, I've stated that audio
properties like 'channels' and 'rate' or video properties like 'width' and
'height' are all optional. This doesn't mean you can simply omit them and
everything will still work!

An example is the best way to explain all this. AVI needs the width, height,
rate and channels for the AVI header. So if these properties are missing, the
avimux element cannot properly create the AVI header. On the other hand, MPEG
doesn't have such properties in its header, so the mpegdemux element would need
to parse the separate streams in order to find them out. We don't want that
either, because a plugin only does one job. So on their own, mpegdemux and
avimux wouldn't allow transcoding. To solve this problem, there are stream
parser elements (such as mpegaudioparse, ac3parse and mpeg1videoparse).

Conclusions to draw from here: a plugin provides the information it can obtain
as part of its own task/job. If it can't provide a property that other elements
still need, a stream parser needs to be written if it doesn't already exist.

Element properties that duplicate information described by the caps properties
above (such as 'width', 'height', 'fps', etc.) are forbidden; such constraints
should be handled using filtered caps instead.

Status of this document
=======================

Not all plugins strictly follow these guidelines yet, but these are the official
types. Plugins not following these specs either use extensions that should be
documented, or are buggy (and should be fixed).

Blame Ronald Bultje <rbultje@ronald.bitfreak.net> aka BBB for any mistakes in
this document.