 DVD subtitles
---------------


  0. Introduction
  1. Basics
  2. The data structure
  3. Reading the control header
  4. Decoding the graphics
  5. What I do not know yet / What I need
  6. Thanks
  7. Changes




The latest version of this document can be found here:
http://www.via.ecp.fr/~sam/doc/dvd/




0. Introduction

  One of the last things missing from DVD decoding on my system was the
decoding of subtitles. I found no information about them on the web or on
Usenet, apart from a few words in the DVD FAQ saying that they are run-length
encoded.

  So we decided to reverse-engineer their format (this is completely legal in
France, since we did it for interoperability purposes), and managed to work
out almost all of it.




1. Basics

  DVD subtitles are hidden in private PS packets (0x000001ba), just like AC3
streams are.

  Within the PS packet there are PES packets and, as for AC3, the ones
containing subtitles have a 0x000001bd header.
  Where an AC3 stream has an ID like (0x80 + x), a subtitle stream has an ID
equal to (0x20 + x), where x is the subtitle number. Thus there seem to be only
16 possible different subtitles on a DVD (my Taxi Driver copy has 16).

  I'll suppose you know how to extract AC3 from a DVD, and jump to the
interesting part of this documentation. Anyway, you're unlikely to have
understood what I said without already being familiar with MPEG2.




2. The data structure

A subtitle packet, after its parts have been collected and appended, looks
like this :

  +----------------------------------------------------------+
  |                                                          |
  |   0    2                                          size   |
  |   +----+------------------------+-----------------+      |
  |   |size|       data packet      |     control     |      |
  |   +----+------------------------+-----------------+      |
  |                                                          |
  |                    a subtitle packet                     |
  |                                                          |
  +----------------------------------------------------------+

size is a 2-byte word, and data packet and control may have any size.


Here is the structure of the data packet :

  +----------------------------------------------------------+
  |                                                          |
  |   2    4                                          S0+2   |
  |   +----+------------------------------------------+      |
  |   | S0 |                   data                   |      |
  |   +----+------------------------------------------+      |
  |                                                          |
  |                     the data packet                      |
  |                                                          |
  +----------------------------------------------------------+

S0, the data packet size, is a 2-byte word.


Finally, here's the structure of the control packet :

  +----------------------------------------------------------+
  |                                                          |
  |   S0+2 S0+4                                S1        size|
  |   +----+---------+---------+--+---------+--+---------+   |
  |   | S1 |ctrl seq |ctrl seq |..|ctrl seq |ff| end seq |   |
  |   +----+---------+---------+--+---------+--+---------+   |
  |                                                          |
  |                    the control packet                    |
  |                                                          |
  +----------------------------------------------------------+

To summarize :

 - S1, at offset S0+2, the position of the end sequence
 - several control sequences
 - the 'ff' byte
 - the end sequence




3. Reading the control header

The first thing to read is the control sequences. There are several
types of them, and each type is determined by its first byte. As far
as I know, each type has a fixed length.

 * type 0x01 : '01' - 1 byte
  it seems to be an empty control sequence.

 * type 0x03 : '03wxyz' - 3 bytes
  this one has the palette information ; it basically says 'encoded color 0
  is the wth color of the palette, encoded color 1 is the xth color', and
  so on.

 * type 0x04 : '04wxyz' - 3 bytes
  I *think* this is the alpha channel information ; I only saw values of 0 or f
  for those nibbles, so I can't really be sure, but it seems plausible.

 * type 0x05 : '05xxxXXXyyyYYY' - 7 bytes
  the coordinates of the subtitle on the screen :
   xxx is the first column of the subtitle
   XXX is the last column of the subtitle
   yyy is the first line of the subtitle
   YYY is the last line of the subtitle
  thus the subtitle's size is (XXX-xxx+1) x (YYY-yyy+1)

 * type 0x06 : '06xxxxyyyy' - 5 bytes
  xxxx is the position of the first graphic line, and yyyy is the position of
  the second one (the graphics are interlaced, so it helps a lot :p)

The end sequence has this structure:

 xxxx yyyy 02 ff (ff)

 it ends with 'ff' or 'ffff', to make the whole packet have an even length.

FIXME: I absolutely don't know what xxxx is. I suppose it may be some date
information since I found it nowhere else, but I can't be sure.

 yyyy is equal to S1 (see picture).


Example of a control header :
----
0A 0C 01 03 02 31 04 0F F0 05 00 02 CF 00 22 3E 06 00 06 04 E9 FF 00 93 0A 0C 02 FF
----
Let's decode it. First of all, S1 = 0x0a0c.

The control sequences are :
 01
   Nothing to say about this one
 03 02 31
   Color 0 is 0, color 1 is 2, color 2 is 3, and color 3 is 1.
 04 0F F0
   Colors 0 and 3 are transparent, and colors 1 and 2 are opaque (not sure of this one)
 05 00 02 CF 00 22 3E
   The first column is 0x000, the last one is 0x2cf, the first line is 0x002, and
   the last line is 0x23e. Thus the subtitle's size is 0x2d0 x 0x23d.
 06 00 06 04 E9
   The first encoded image starts at offset 0x0006, and the second one starts at 0x04e9.
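
The walk through the control sequences above could be sketched in Python
like this. It is only an illustration, not the code of the enclosed programs :
the names SEQ_LENGTHS, nibbles and parse_control are made up for this sketch,
the lengths are the ones listed in this section, and any other type byte is
simply rejected, since I don't know whether other types exist.

```python
# Fixed lengths (type byte included) of the control sequence types listed
# in this section. Assumption: no other type appears before the 'ff' byte.
SEQ_LENGTHS = {0x01: 1, 0x03: 3, 0x04: 3, 0x05: 7, 0x06: 5}


def nibbles(data):
    """Split bytes into a list of 4-bit values, high nibble first."""
    out = []
    for byte in data:
        out.append(byte >> 4)
        out.append(byte & 0x0F)
    return out


def parse_control(ctrl):
    """Parse a control packet, from the S1 word up to the 'ff' byte.

    Returns (s1, fields, ff_pos), where ff_pos is the index of the 'ff'
    byte inside ctrl.
    """
    s1 = (ctrl[0] << 8) | ctrl[1]
    fields = {}
    pos = 2
    while ctrl[pos] != 0xFF:
        kind = ctrl[pos]
        if kind not in SEQ_LENGTHS:
            raise ValueError("unknown control sequence type 0x%02x" % kind)
        length = SEQ_LENGTHS[kind]
        args = ctrl[pos + 1:pos + length]
        if kind == 0x03:
            fields["palette"] = nibbles(args)      # w, x, y, z
        elif kind == 0x04:
            fields["alpha"] = nibbles(args)        # w, x, y, z
        elif kind == 0x05:
            n = nibbles(args)                      # xxx XXX yyy YYY
            fields["area"] = tuple(
                (n[i] << 8) | (n[i + 1] << 4) | n[i + 2]
                for i in range(0, 12, 3)
            )
        elif kind == 0x06:
            fields["offsets"] = (
                (args[0] << 8) | args[1],
                (args[2] << 8) | args[3],
            )
        pos += length
    return s1, fields, pos


# The example control header from this section.
EXAMPLE = bytes.fromhex(
    "0A 0C 01 03 02 31 04 0F F0 05 00 02 CF 00 22 3E "
    "06 00 06 04 E9 FF 00 93 0A 0C 02 FF"
)
s1, fields, ff_pos = parse_control(EXAMPLE)
print(hex(s1), fields, ff_pos)
```

Everything after the 'ff' byte (EXAMPLE[ff_pos + 1:]) is left untouched by
this sketch.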

And the end sequence is :
 00 93 0A 0C 02 FF
   Which means... well, not much for now. We can at least verify that S1 (0x0a0c) is
   there.




4. Decoding the graphics

  The graphics are rather easy to decode (at least, when you know how to do it - it
  took us one whole week to figure out what the encoding was :p).

  The picture is interlaced, for instance for a 40-line picture :

  line 0  ---------------#----------
  line 2  ------#-------------------
  ...
  line 38 ------------#-------------
  line 1  ------------------#-------
  line 3  --------#-----------------
  ...
  line 39 -------------#------------

  When decoding you should get:

  line 0  ---------------#----------
  line 1  ------------------#-------
  line 2  ------#-------------------
  line 3  --------#-----------------
  ...
  line 38 ------------#-------------
  line 39 -------------#------------

  Computers with weak processors could choose to decode only the even lines
  to save some time, for instance.


  The graphics are run-length encoded, with the following alphabet:

  0xf
  0xe
  0xd
  0xc
  0xb
  0xa
  0x9
  0x8
  0x7
  0x6
  0x5
  0x4
  0x3-
  0x2-
  0x1-
  0x0f-
  0x0e-
  0x0d-
  0x0c-
  0x0b-
  0x0a-
  0x09-
  0x08-
  0x07-
  0x06-
  0x05-
  0x04-
  0x03--
  0x02--
  0x01--
  0x0000

  '-' stands for any other nibble. Once a sequence X of this alphabet has
  been read, the pixels can be displayed : (X >> 2) is the number of pixels
  to display, and (X & 0x3) is the color of the pixel.

  For instance, 0x23 means "8 pixels of color 3".

  "0000" has a special meaning : it's a carriage return. The decoder should
  do a carriage return when reaching the end of the line, or when encountering
  this "0000" sequence.
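
  Reading the codes of this alphabet could be sketched in Python as follows.
  This is only a rough illustration under my own assumptions : read_code and
  decode_codes are made-up names, the input is a plain list of nibbles, and
  the sketch handles neither the line width nor the interlacing. Four-nibble
  values from 0x0001 to 0x00ff are not in the alphabet above and get no
  special treatment here.

```python
def read_code(nib, pos):
    """Read one code of the alphabet from the nibble list nib, at pos.

    Returns (code, new_pos). A first nibble of 4 or more is a complete
    one-nibble code; otherwise nibbles are appended until the value falls
    in the range of a 2-, 3- or 4-nibble code.
    """
    code = nib[pos]
    pos += 1
    # Smallest value of a complete code made of 1, 2 and 3 nibbles.
    for smallest in (0x4, 0x10, 0x40):
        if code >= smallest:
            return code, pos
        code = (code << 4) | nib[pos]
        pos += 1
    return code, pos  # four nibbles: 0x01-- to 0x03--, or 0x0000


def decode_codes(nib):
    """Turn a nibble list into (pixel count, color) runs.

    The "0000" sequence is reported as the string "carriage return".
    """
    runs = []
    pos = 0
    while pos < len(nib):
        code, pos = read_code(nib, pos)
        if code == 0x0000:
            runs.append("carriage return")
        else:
            runs.append((code >> 2, code & 0x3))
    return runs


# Codes 0x23, 0xf, 0x05a and 0x0000, written nibble by nibble.
print(decode_codes([0x2, 0x3, 0xF, 0x0, 0x5, 0xA, 0x0, 0x0, 0x0, 0x0]))
```

  The first run printed is (8, 3), the "8 pixels of color 3" of the 0x23
  example.
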
  When doing a carriage return, the parser should be reset to the next even
  nibble position, that is, the next byte boundary (a line never starts in
  the middle of a byte).

  After a carriage return, the parser should read a line from the other
  interlaced picture, and swap like this after each carriage return.

  Perhaps I don't explain this very well, so you'd better have a look at
  the enclosed source.




5. What I do not know yet / What I need

I don't know what's in the end sequence yet.

Also, I don't know exactly when to display subtitles, and when to remove them.

I don't know if there are other types of control sequences (in my programs I consider
0xff as a control sequence type, as well as 0x02. I don't know if it's correct or not,
so please comment on this).

I don't know what the "official" color palette is.

I don't know how to handle transparency information.

I don't know if this document is generic enough.

So what I need is you :

 - if you can, patch this document or my programs to fix strange behaviour with your subtitles.

 - send me your subtitles (there's a program to extract them enclosed) ; the first 10 KB
 of subtitles in a VOB should be enough, but it would be cool if you sent me one subtitle
 file per language.




6. Thanks

  Thanks to Michel Lespinasse <walken@via.ecp.fr> for his great help in understanding
the RLE stuff, and for all the ideas he had.

  Thanks to mass (David Waite) and taaz (David I. Lehn) from irc at
openprojects.net for sending me their subtitles.




7. Changes

 20000116: added the 'changes' section.
 20000116: added David Waite's and David I. Lehn's names.
 20000116: changed "x0" and "x1" to "S0" and "S1" to make it less confusing.




--
Paris, January 16th 2000
Samuel Hocevar <sam@via.ecp.fr>