<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
               "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
  <!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
  <!ENTITY version SYSTEM "version.xml">
]>
<chapter id="shaping-and-shape-plans">
  <title>Shaping and shape plans</title>
  <para>
    Once you have your face and font objects configured as desired and
    your input buffer is filled with the characters you need to shape,
    all you need to do is call <function>hb_shape()</function>.
  </para>
  <para>
    HarfBuzz will return the shaped version of the text in the same
    buffer that you provided, but it will be in output mode. At that
    point, you can iterate through the glyphs in the buffer, drawing
    each one at the specified position or handing them off to the
    appropriate graphics library.
  </para>
  <para>
    For the most part, HarfBuzz's shaping step is straightforward from
    the outside. But that doesn't mean there will never be cases where
    you want to look under the hood and see what is happening on the
    inside. HarfBuzz provides facilities for doing that, too.
  </para>

  <section id="shaping-buffer-output">
    <title>Shaping and buffer output</title>
    <para>
      The <function>hb_shape()</function> function call takes four arguments: the font
      object to use, the buffer of characters to shape, an array of
      user-specified features to apply, and the length of that feature
      array. The feature array can be NULL, so for the sake of
      simplicity we will start with that case.
    </para>
    <para>
      Internally, HarfBuzz looks at the tables of the font file to
      determine where glyph classes, substitutions, and positioning
      are defined, using that information to decide which
      <emphasis>shaper</emphasis> to use (<literal>ot</literal> for
      OpenType fonts, <literal>aat</literal> for Apple Advanced
      Typography fonts, and so on).
      It also looks at the direction,
      script, and language properties of the segment to figure out
      which script-specific shaping model is needed (at least, in
      shapers that support multiple options).
    </para>
    <para>
      If a font has a GDEF table, then that is used for
      glyph classes; if not, HarfBuzz will fall back to Unicode
      categorization by code point. If a font has an AAT <literal>morx</literal> table,
      then it is used for substitutions; if not, but there is a GSUB
      table, then the GSUB table is used. If the font has an AAT
      <literal>kerx</literal> table, then it is used for positioning; if not, but
      there is a GPOS table, then the GPOS table is used. If neither
      table is found, but there is a <literal>kern</literal> table, then HarfBuzz will
      use the <literal>kern</literal> table. If there is no <literal>kerx</literal>, no GPOS, and no
      <literal>kern</literal>, HarfBuzz will fall back to positioning marks itself.
    </para>
    <para>
      With a well-behaved OpenType font, you expect GDEF, GSUB, and
      GPOS tables to all be applied. HarfBuzz implements the
      script-specific shaping models in internal functions, rather
      than in the public API.
    </para>
    <para>
      The algorithms
      used for complex scripts can be quite involved; HarfBuzz tries
      to be compatible with the OpenType Layout specification
      and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
      output of Microsoft's Uniscribe engine. See the <ulink
      url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft
      Typography pages</ulink> for more detail.
    </para>
    <para>
      In general, though, all that you need to know is that
      <function>hb_shape()</function> returns the results of shaping
      in the same buffer that you provided. The buffer's content type
      will now be set to
      <literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating
      that it contains shaped output, rather than input text.
      You can
      now extract the glyph information and positioning arrays:
    </para>
    <programlisting language="C">
      unsigned int glyph_count;
      hb_glyph_info_t *glyph_info    = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
      hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &amp;glyph_count);
    </programlisting>
    <para>
      The glyph information array holds a <type>hb_glyph_info_t</type>
      for each output glyph; the two fields you will use most are
      <parameter>codepoint</parameter> and
      <parameter>cluster</parameter>. Whereas, in the input buffer,
      the <parameter>codepoint</parameter> field contained the Unicode
      code point, it now contains the glyph ID of the corresponding
      glyph in the font. The <parameter>cluster</parameter> field is
      an integer that you can use to help identify when shaping has
      reordered, split, or combined code points; we will say more
      about that in the next chapter.
    </para>
    <para>
      The glyph positions array holds a corresponding
      <type>hb_glyph_position_t</type> for each output glyph,
      containing four fields: <parameter>x_advance</parameter>,
      <parameter>y_advance</parameter>,
      <parameter>x_offset</parameter>, and
      <parameter>y_offset</parameter>. The advances tell you how far
      you need to move the drawing point after drawing this glyph,
      depending on whether you are setting horizontal text (in which
      case you will have x advances) or vertical text (for which you
      will have y advances). The x and y offsets tell you where to
      move to start drawing the glyph; usually you will have both an
      x and a y offset, regardless of the text direction.
    </para>
    <para>
      Most of the time, you will rely on a font-rendering library or
      other graphics library to do the actual drawing of glyphs, so
      you will need to iterate through the glyphs in the buffer and
      pass the corresponding values off.
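      As a minimal sketch of such a loop, following the description
      of advances and offsets above: the listing uses a hypothetical
      <type>glyph_pos_t</type> struct as a stand-in for
      <type>hb_glyph_position_t</type> so that it compiles without
      HarfBuzz, and <function>place_glyphs()</function> and
      <type>point_t</type> are illustrative names, not part of the
      HarfBuzz API. Each glyph is placed at the current pen position
      plus its offsets, and the pen then moves by the glyph's advances:
      <programlisting language="C"><![CDATA[
/* Hypothetical stand-in for hb_glyph_position_t: the same four fields. */
typedef struct {
  int x_advance, y_advance, x_offset, y_offset;
} glyph_pos_t;

/* Illustrative 2-D point type (not a HarfBuzz type). */
typedef struct {
  int x, y;
} point_t;

/* Store the drawing origin of each glyph in origins[] and return the
 * pen position after the last glyph. */
static point_t
place_glyphs (const glyph_pos_t *pos, unsigned int count, point_t *origins)
{
  point_t pen = {0, 0};
  for (unsigned int i = 0; i < count; i++) {
    /* Offsets shift this glyph only; they do not move the pen. */
    origins[i].x = pen.x + pos[i].x_offset;
    origins[i].y = pen.y + pos[i].y_offset;
    /* Advances move the pen to where the next glyph starts. */
    pen.x += pos[i].x_advance;
    pen.y += pos[i].y_advance;
  }
  return pen;
}
]]></programlisting>
      With real HarfBuzz output, the same arithmetic would be applied
      to the array returned by
      <function>hb_buffer_get_glyph_positions()</function>, with the
      drawing call of your graphics library taking the place of the
      <varname>origins</varname> array.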
    </para>
  </section>

  <section id="shaping-opentype-features">
    <title>OpenType features</title>
    <para>
      OpenType features enable fonts to include smart behavior,
      implemented as "lookup" rules stored in the GSUB and GPOS
      tables. The OpenType specification defines a long list of
      standard features that fonts can use for these behaviors; each
      feature has a four-character reserved name and a well-defined
      semantic meaning.
    </para>
    <para>
      Some OpenType features are defined for the purpose of supporting
      complex-script shaping, and are automatically activated, but
      only when a buffer's script property is set to a script that the
      feature supports.
    </para>
    <para>
      Other features are more generic and can apply to several (or
      all) scripts, and shaping engines are expected to implement
      them. By default, HarfBuzz activates several of these features
      on every text run. They include <literal>abvm</literal>,
      <literal>blwm</literal>, <literal>ccmp</literal>,
      <literal>locl</literal>, <literal>mark</literal>,
      <literal>mkmk</literal>, and <literal>rlig</literal>.
    </para>
    <para>
      In addition, if the text direction is horizontal, HarfBuzz
      also applies the <literal>calt</literal>,
      <literal>clig</literal>, <literal>curs</literal>,
      <literal>dist</literal>, <literal>kern</literal>,
      <literal>liga</literal>, and <literal>rclt</literal> features.
    </para>
    <para>
      Additionally, when HarfBuzz encounters a fraction slash
      (<literal>U+2044</literal>), it looks backward and forward for decimal
      digits (Unicode General Category = Nd), and enables features
      <literal>numr</literal> on the sequence before the fraction slash,
      <literal>dnom</literal> on the sequence after the fraction slash,
      and <literal>frac</literal> on the whole sequence including the fraction
      slash.
    </para>
    <para>
      Some script-specific shaping models
      (see <xref linkend="opentype-shaping-models" />) disable some of the
      features listed above:
    </para>
    <itemizedlist>
      <listitem>
        <para>
          Hangul: <literal>calt</literal>
        </para>
      </listitem>
      <listitem>
        <para>
          Indic: <literal>liga</literal>
        </para>
      </listitem>
      <listitem>
        <para>
          Khmer: <literal>liga</literal>
        </para>
      </listitem>
    </itemizedlist>
    <para>
      If the text direction is vertical, HarfBuzz applies
      the <literal>vert</literal> feature by default.
    </para>
    <para>
      Still other features are designed to be purely optional and left
      up to the application or the end user to enable or disable as desired.
    </para>
    <para>
      You can adjust the set of features that HarfBuzz applies to a
      buffer by supplying an array of <type>hb_feature_t</type>
      features as the third argument to
      <function>hb_shape()</function>. For a simple case, let's just
      enable the <literal>dlig</literal> feature, which turns on any
      "discretionary" ligatures in the font:
    </para>
    <programlisting language="C">
      hb_feature_t userfeatures[1];
      userfeatures[0].tag = HB_TAG('d','l','i','g');
      userfeatures[0].value = 1;
      userfeatures[0].start = HB_FEATURE_GLOBAL_START;
      userfeatures[0].end = HB_FEATURE_GLOBAL_END;
    </programlisting>
    <para>
      <literal>HB_FEATURE_GLOBAL_START</literal> and
      <literal>HB_FEATURE_GLOBAL_END</literal> are macros we can use
      to indicate that the feature will be applied to the entire
      buffer. We could also have used a literal <literal>0</literal>
      for the start and a <literal>-1</literal> to indicate the end of
      the buffer (or have selected other start and end positions, if needed).
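      To illustrate how such a range behaves, here is a minimal
      sketch that compiles without HarfBuzz. The
      <type>feature_t</type> struct, the <function>make_tag()</function>
      and <function>feature_covers()</function> helpers, and the
      <literal>GLOBAL_START</literal> and <literal>GLOBAL_END</literal>
      macros are hypothetical stand-ins for <type>hb_feature_t</type>,
      <literal>HB_TAG</literal>, and the two global-range macros. The
      sketch assumes the start position is inclusive and the end
      position is exclusive:
      <programlisting language="C"><![CDATA[
#include <stdint.h>

/* Stand-ins for HB_FEATURE_GLOBAL_START and HB_FEATURE_GLOBAL_END. */
#define GLOBAL_START 0u
#define GLOBAL_END   ((unsigned int) -1)

/* Hypothetical mirror of the four hb_feature_t fields set above. */
typedef struct {
  uint32_t     tag;
  uint32_t     value;
  unsigned int start;
  unsigned int end;
} feature_t;

/* Pack four characters into a 32-bit tag, first character in the
 * most significant byte (the layout assumed for HB_TAG). */
static uint32_t
make_tag (char a, char b, char c, char d)
{
  return ((uint32_t) (uint8_t) a << 24) | ((uint32_t) (uint8_t) b << 16) |
         ((uint32_t) (uint8_t) c << 8)  |  (uint32_t) (uint8_t) d;
}

/* Nonzero if the feature's range covers the given position:
 * start is inclusive, end is exclusive. */
static int
feature_covers (const feature_t *f, unsigned int position)
{
  return f->start <= position && position < f->end;
}
]]></programlisting>
      Under these assumptions, a feature with a start of
      <literal>3</literal> and an end of <literal>5</literal> covers
      positions 3 and 4 only, while a feature using the global start
      and end covers every position in the buffer.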
    </para>
    <para>
      When we pass the <varname>userfeatures</varname> array to
      <function>hb_shape()</function>, any discretionary ligature
      substitutions from our font that match the text in our buffer
      will get performed:
    </para>
    <programlisting language="C">
      hb_shape(font, buf, userfeatures, num_features);
    </programlisting>
    <para>
      Just like we enabled the <literal>dlig</literal> feature by
      setting its <parameter>value</parameter> to
      <literal>1</literal>, you would disable a feature by setting its
      <parameter>value</parameter> to <literal>0</literal>. Some
      features can take other <parameter>value</parameter> settings;
      be sure you read the full specification of each feature tag to
      understand what it does and how to control it.
    </para>
  </section>

  <section id="shaping-shaper-selection">
    <title>Shaper selection</title>
    <para>
      The basic version of <function>hb_shape()</function> determines
      its shaping strategy based on examining the capabilities of the
      font file. OpenType font tables cause HarfBuzz to try the
      <literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the
      <literal>aat</literal> shaper.
    </para>
    <para>
      In the real world, however, a font might include some unusual
      mix of tables, or one of the tables might simply be broken for
      the script you need to shape. So, sometimes, you might not
      want to rely on HarfBuzz's process for deciding what to do, and
      just tell <function>hb_shape()</function> what you want it to try.
    </para>
    <para>
      <function>hb_shape_full()</function> is an alternate shaping
      function that lets you supply a list of shapers for HarfBuzz to
      try, in order, when shaping your buffer.
      For example, if you
      have determined that HarfBuzz's attempts to work around broken
      tables give you better results than the AAT shaper itself does,
      you might move the AAT shaper to the end of your list of
      preferences and call <function>hb_shape_full()</function>
      to get results you are happier with:
    </para>
    <programlisting language="C">
      /* The shaper list must be NULL-terminated. */
      const char *shaperprefs[] = {"ot", "default", "aat", NULL};
      ...
      hb_shape_full(font, buf, userfeatures, num_features, shaperprefs);
    </programlisting>
    <para>
      You may also want to call
      <function>hb_shape_list_shapers()</function> to get a list of
      the shapers that were built at compile time in your copy of HarfBuzz.
    </para>
  </section>

  <section id="shaping-plans-and-caching">
    <title>Plans and caching</title>
    <para>
      Internally, HarfBuzz uses a structure called a shape plan to
      track its decisions about how to shape the contents of a
      buffer. The <function>hb_shape()</function> function builds up the shape plan by
      examining segment properties and by inspecting the contents of
      the font.
    </para>
    <para>
      This process can involve some decision-making and
      trade-offs. For example, HarfBuzz inspects the GSUB and GPOS
      lookups for the script and language tags set on the segment
      properties, but it falls back on the lookups under the
      <literal>DFLT</literal> tag (and sometimes other common tags)
      if there are actually no lookups for the tag requested.
    </para>
    <para>
      HarfBuzz also includes some work-arounds for
      handling well-known older font conventions that do not follow
      OpenType or Unicode specifications, for buggy system fonts, and for
      peculiarities of Microsoft Uniscribe. All of that means that a
      shape plan, while not something that you should edit directly in
      client code, still might be an object that you want to
      inspect.
      Furthermore, if resources are tight, you might want to
      cache the shape plan that HarfBuzz builds for your buffer and
      font, so that you do not have to rebuild it for every shaping call.
    </para>
    <para>
      You can create a cacheable shape plan with
      <function>hb_shape_plan_create_cached(face, props,
      user_features, num_user_features, shaper_list)</function>, where
      <parameter>face</parameter> is a face object (not a font object,
      notably), <parameter>props</parameter> is an
      <type>hb_segment_properties_t</type>,
      <parameter>user_features</parameter> is an array of
      <type>hb_feature_t</type>s (with length
      <parameter>num_user_features</parameter>), and
      <parameter>shaper_list</parameter> is a list of shapers to try.
    </para>
    <para>
      Shape plans are objects in HarfBuzz, so there are
      reference-counting functions and user-data attachment functions
      you can
      use. <function>hb_shape_plan_reference(shape_plan)</function>
      increases the reference count on a shape plan, while
      <function>hb_shape_plan_destroy(shape_plan)</function> decreases
      the reference count, destroying the shape plan when the last
      reference is dropped.
    </para>
    <para>
      You can attach user data to a shape plan (with a key) using the
      <function>hb_shape_plan_set_user_data(shape_plan, key, data, destroy, replace)</function>
      function, optionally supplying a <function>destroy</function>
      callback to use. You can then fetch the user data attached to a
      shape plan with
      <function>hb_shape_plan_get_user_data(shape_plan, key)</function>.
    </para>
  </section>

</chapter>