1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2"https://www.w3.org/TR/html4/loose.dtd">
3<html>
4<head>
5  <meta name="generator" content=
6  "HTML Tidy for HTML5 for Apple macOS version 5.6.0">
7  <meta http-equiv="Content-Type" content=
8  "text/html; charset=utf-8">
9  <meta http-equiv="Content-Language" content="en-us">
10  <link rel="stylesheet" href=
11  "../reports.css" type="text/css">
12  <title>UTS #35: Unicode LDML: Keyboards</title>
13  <style type="text/css">
14  <!--
15  .dtd {
16        font-family: monospace;
17        font-size: 90%;
18        background-color: #CCCCFF;
19        border-style: dotted;
20        border-width: 1px;
21  }
22
23  .xmlExample {
24        font-family: monospace;
25        font-size: 80%
26  }
27
28  .blockedInherited {
29        font-style: italic;
30        font-weight: bold;
31        border-style: dashed;
32        border-width: 1px;
33        background-color: #FF0000
34  }
35
36  .inherited {
37        font-weight: bold;
38        border-style: dashed;
39        border-width: 1px;
40        background-color: #00FF00
41  }
42
43  .element {
44        font-weight: bold;
45        color: red;
46  }
47
48  .attribute {
49        font-weight: bold;
50        color: maroon;
51  }
52
53  .attributeValue {
54        font-weight: bold;
55        color: blue;
56  }
57
58  li, p {
59        margin-top: 0.5em;
60        margin-bottom: 0.5em
61  }
62
63  h2, h3, h4, table {
64        margin-top: 1.5em;
65        margin-bottom: 0.5em;
66  }
67  -->
68  </style>
69</head>
70<body>
71  <table class="header" width="100%">
72    <tr>
73      <td class="icon"><a href="https://unicode.org"><img alt=
74      "[Unicode]" src="../logo60s2.gif"
75      width="34" height="33" style=
76      "vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>&nbsp;
77      <a class="bar" href=
78      "https://www.unicode.org/reports/">Technical Reports</a></td>
79    </tr>
80    <tr>
81      <td class="gray">&nbsp;</td>
82    </tr>
83  </table>
84  <div class="body">
85    <h2 style="text-align: center">Unicode Technical Standard #35</h2>
86    <h1>Unicode Locale Data Markup Language (LDML)<br>
87    Part 7: Keyboards</h1>
88    <!-- At least the first row of this header table should be identical across the parts of this UTS. -->
89    <table border="1" cellpadding="2" cellspacing="0" class="wide">
90      <tr>
91        <td>Version</td>
92        <td>38</td>
93      </tr>
94      <tr>
95        <td>Editors</td>
96        <td>Steven Loomis (<a href=
97        "mailto:srl@icu-project.org">srl@icu-project.org</a>) and
98        <a href="tr35.html#Acknowledgments">other CLDR committee
99        members</a></td>
100      </tr>
101    </table>
102    <p>For the full header, summary, and status, see <a href=
103    "tr35.html">Part 1: Core</a></p>
104    <h3><i>Summary</i></h3>
105    <p>This document describes parts of an XML format
106    (<i>vocabulary</i>) for the exchange of structured locale data.
107    This format is used in the <a href=
108    "https://unicode.org/cldr/">Unicode Common Locale Data
109    Repository</a>.</p>
110    <p>This is a partial document, describing keyboard mappings.
111    For the other parts of the LDML see the <a href=
112    "tr35.html">main LDML document</a> and the links above.</p>
113    <h3><i>Status</i></h3>
114
115    <!-- NOT YET APPROVED
116                <p>
117                                <i class="changed">This is a<b><font color="#ff3333">
118                                draft </font></b>document which may be updated, replaced, or superseded by
119                                other documents at any time. Publication does not imply endorsement
120                                by the Unicode Consortium. This is not a stable document; it is
121                                inappropriate to cite this document as other than a work in
122                                progress.
123                        </i>
124                </p>
125     END NOT YET APPROVED -->
126    <!-- APPROVED -->
127    <p><i>This document has been reviewed by Unicode members and
128    other interested parties, and has been approved for publication
129    by the Unicode Consortium. This is a stable document and may be
130    used as reference material or cited as a normative reference by
131    other specifications.</i></p>
132    <!-- END APPROVED -->
133
134    <blockquote>
135      <p><i><b>A Unicode Technical Standard (UTS)</b> is an
136      independent specification. Conformance to the Unicode
137      Standard does not imply conformance to any UTS.</i></p>
138    </blockquote>
139    <p><i>Please submit corrigenda and other comments with the CLDR
140    bug reporting form [<a href="tr35.html#Bugs">Bugs</a>]. Related
141    information that is useful in understanding this document is
142    found in the <a href="tr35.html#References">References</a>. For
143    the latest version of the Unicode Standard see [<a href=
144    "tr35.html#Unicode">Unicode</a>]. For a list of current Unicode
145    Technical Reports see [<a href=
146    "tr35.html#Reports">Reports</a>]. For more information about
147    versions of the Unicode Standard, see [<a href=
148    "tr35.html#Versions">Versions</a>].</i></p>
149    <h2><a name="Parts" href="#Parts" id="Parts">Parts</a></h2>
150    <!-- This section of Parts should be identical in all of the parts of this UTS. -->
151    <p>The LDML specification is divided into the following
152    parts:</p>
153    <ul class="toc">
154      <li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
155      locales, basic structure)</li>
156      <li>Part 2: <a href="tr35-general.html#Contents">General</a>
157      (display names &amp; transforms, etc.)</li>
158      <li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
159      (number &amp; currency formatting)</li>
160      <li>Part 4: <a href="tr35-dates.html#Contents">Dates</a>
161      (date, time, time zone formatting)</li>
162      <li>Part 5: <a href=
163      "tr35-collation.html#Contents">Collation</a> (sorting,
164      searching, grouping)</li>
165      <li>Part 6: <a href=
166      "tr35-info.html#Contents">Supplemental</a> (supplemental
167      data)</li>
168      <li>Part 7: <a href=
169      "tr35-keyboards.html#Contents">Keyboards</a> (keyboard
170      mappings)</li>
171    </ul>
172    <h2><a name="Contents" href="#Contents" id="Contents">Contents
173    of Part 7, Keyboards</a></h2>
174    <!-- START Generated TOC: CheckHtmlFiles -->
175    <ul class="toc">
176      <li>1 <a href="#Introduction">Keyboards</a></li>
177      <li>2 <a href="#Goals_and_Nongoals">Goals and
178      Nongoals</a></li>
179      <li>3 <a href="#Definitions">Definitions</a></li>
180      <li>4 <a href="#File_and_Dir_Structure">File and Directory
181      Structure</a></li>
182      <li>5 <a href="#Element_Heirarchy_Layout_File">Element
183      Hierarchy - Layout File</a>
184        <ul class="toc">
185          <li>5.1 <a href="#Element_Keyboard">Element:
186          keyboard</a></li>
187          <li>5.2 <a href="#Element_version">Element:
188          version</a></li>
189          <li>5.3 <a href="#Element_generation">Element:
190          generation</a></li>
191          <li>5.4 <a href="#Element_names">Element: names</a></li>
192          <li>5.5 <a href="#Element_name">Element: name</a></li>
193          <li>5.6 <a href="#Element_settings">Element:
194          settings</a></li>
195          <li>5.7 <a href="#Element_keyMap">Element: keyMap</a>
196            <ul class="toc">
197              <li>Table: <a href="#Possible_Modifier_Keys">Possible
198              Modifier Keys</a></li>
199            </ul>
200          </li>
201          <li>5.8 <a href="#Element_map">Element: map</a></li>
202          <li>5.9 <a href="#Element_import">Element:
203          import</a></li>
204          <li>5.10 <a href="#Element_displayMap">Element:
205          displayMap</a></li>
206          <li>5.11 <a href="#Element_display">Element:
207          display</a></li>
208          <li>5.12 <a href="#Element_layer">Element: layer</a></li>
209          <li>5.13 <a href="#Element_row">Element: row</a></li>
210          <li>5.14 <a href="#Element_switch">Element:
211          switch</a></li>
212          <li>5.15 <a href="#Element_vkeys">Element: vkeys</a></li>
213          <li>5.16 <a href="#Element_vkey">Element: vkey</a></li>
214          <li>5.17 <a href="#Element_transforms">Element:
215          transforms</a></li>
216          <li>5.18 <a href="#Element_transform">Element:
217          transform</a></li>
218          <li>5.19 <a href="#Element_reorder">Element:
219          reorder</a></li>
220          <li>5.20 <a href="#Element_final">Element: final</a></li>
221          <li>5.21 <a href="#Element_backspaces">Element:
222          backspaces</a></li>
223          <li>5.22 <a href="#Element_backspace">Element:
224          backspace</a></li>
225        </ul>
226      </li>
227      <li>6 <a href="#Element_Heirarchy_Platform_File">Element
228      Hierarchy - Platform File</a>
229        <ul class="toc">
230          <li>6.1 <a href="#Element_platform">Element:
231          platform</a></li>
232          <li>6.2 <a href="#Element_hardwareMap">Element:
233          hardwareMap</a></li>
234          <li>6.3 <a href="#Element_hardwareMap_map">Element:
235          map</a></li>
236        </ul>
237      </li>
238      <li>7 <a href="#Invariants">Invariants</a></li>
239      <li>8 <a href="#Data_Sources">Data Sources</a>
240        <ul class="toc">
241          <li>Table: <a href="#Key_Map_Data_Sources">Key Map Data
242          Sources</a></li>
243        </ul>
244      </li>
245      <li>9 <a href="#Keyboard_IDs">Keyboard IDs</a>
246        <ul class="toc">
247          <li>9.1 <a href="#Principles_for_Keyboard_Ids">Principles
248          for Keyboard Ids</a></li>
249        </ul>
250      </li>
251      <li>10 <a href="#Platform_Behaviors_in_Edge_Cases">Platform
252      Behaviors in Edge Cases</a></li>
253    </ul><!-- END Generated TOC: CheckHtmlFiles -->
254    <h2>1 <a name="Introduction" href="#Introduction" id=
255    "Introduction">Keyboards</a><a name="Keyboards" href=
256    "#Keyboards" id="Keyboards"></a></h2>
257    <p>The CLDR keyboard format provides for the communication of
258    keyboard mapping data between different modules, and the
259    comparison of data across different vendors and platforms. The
260    standardized identifier for keyboards can be used to
261    communicate, internally or externally, a request for a
262    particular keyboard mapping that is to be used to transform
263    either text or keystrokes. The corresponding data can then be
264    used to perform the requested actions.</p>
265    <p>For example, a web-based virtual keyboard may transform text
266    in the following way. Suppose the user types a key that
267    produces a "W" on a qwerty keyboard. A web-based tool using an
268    azerty virtual keyboard can map that text ("W") to the text
269    that would have resulted from typing a key on an azerty
270    keyboard, by transforming "W" to "Z". Such transforms are in
271    fact performed in existing web applications.</p>
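    <p>As an illustrative sketch of that mapping, the two abridged
    base maps below show how the same ISO positions carry different
    characters on a qwerty and an azerty layout. The locale
    identifiers and the choice of keys are assumptions made for this
    example, not a copy of the CLDR data files. A tool that receives
    "w" can look up its ISO position (D02) in the first map and
    emit the character found at D02 in the second map, "z";
    uppercase letters would be handled the same way through the
    corresponding shifted key maps.</p>
    <pre>
&lt;!-- qwerty source layout (abridged, illustrative) --&gt;
&lt;keyboard locale="en-t-k0-windows"&gt;
    &lt;keyMap&gt;
        &lt;map iso="D01" to="q" /&gt;
        &lt;map iso="D02" to="w" /&gt;
    &lt;/keyMap&gt;
&lt;/keyboard&gt;

&lt;!-- azerty target layout (abridged, illustrative) --&gt;
&lt;keyboard locale="fr-t-k0-windows"&gt;
    &lt;keyMap&gt;
        &lt;map iso="D01" to="a" /&gt;
        &lt;map iso="D02" to="z" /&gt;
    &lt;/keyMap&gt;
&lt;/keyboard&gt;</pre>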
272    <p>The data can also be used in analysis of the capabilities of
273    different keyboards. It also allows better interoperability by
274    making it easier for keyboard designers to see which characters
275    are generally supported on keyboards for given languages.</p>
276    <p>To illustrate this specification, here is an abridged layout
277    representing the English US 101 keyboard on the Mac OSX
278    operating system (with an inserted long-press example). For
279    more complete examples, and information collected about
280    keyboards, see keyboard data in XML.</p>
281    <pre>
&lt;keyboard locale="en-t-k0-osx"&gt;
    &lt;version platform="10.4" number="$Revision: 8294 $" /&gt;
    &lt;names&gt;
        &lt;name value="U.S." /&gt;
    &lt;/names&gt;
    &lt;keyMap&gt;
        &lt;map iso="E00" to="`" /&gt;
        &lt;map iso="E01" to="1" /&gt;
        &lt;map iso="D01" to="q" /&gt;
        &lt;map iso="D02" to="w" /&gt;
        &lt;map iso="D03" to="e" longPress="é è ê ë" /&gt;
        …
    &lt;/keyMap&gt;
    &lt;keyMap modifiers="caps"&gt;
        &lt;map iso="E00" to="`" /&gt;
        &lt;map iso="E01" to="1" /&gt;
        &lt;map iso="D01" to="Q" /&gt;
        &lt;map iso="D02" to="W" /&gt;
        …
    &lt;/keyMap&gt;
    &lt;keyMap modifiers="opt"&gt;
        &lt;map iso="E00" to="`" /&gt;
        &lt;map iso="E01" to="¡" /&gt; &lt;!-- key=1 --&gt;
        &lt;map iso="D01" to="œ" /&gt; &lt;!-- key=Q --&gt;
        &lt;map iso="D02" to="∑" /&gt; &lt;!-- key=W --&gt;
        …
    &lt;/keyMap&gt;
    &lt;transforms type="simple"&gt;
        &lt;transform from="` " to="`" /&gt;
        &lt;transform from="`a" to="à" /&gt;
        &lt;transform from="`A" to="À" /&gt;
        &lt;transform from="´ " to="´" /&gt;
        &lt;transform from="´a" to="á" /&gt;
        &lt;transform from="´A" to="Á" /&gt;
        &lt;transform from="˜ " to="˜" /&gt;
        &lt;transform from="˜a" to="ã" /&gt;
        &lt;transform from="˜A" to="Ã" /&gt;
        …
    &lt;/transforms&gt;
&lt;/keyboard&gt;</pre>
283    <p>And its associated platform file (which includes the
284    hardware mapping):</p>
285    <pre>
&lt;platform id="osx"&gt;
    &lt;hardwareMap&gt;
        &lt;map keycode="0" iso="C01" /&gt;
        &lt;map keycode="1" iso="C02" /&gt;
        &lt;map keycode="6" iso="B01" /&gt;
        &lt;map keycode="7" iso="B02" /&gt;
        &lt;map keycode="12" iso="D01" /&gt;
        &lt;map keycode="13" iso="D02" /&gt;
        &lt;map keycode="18" iso="E01" /&gt;
        &lt;map keycode="50" iso="E00" /&gt;
    &lt;/hardwareMap&gt;
&lt;/platform&gt;</pre>
287    <h2>2 <a name="Goals_and_Nongoals" href="#Goals_and_Nongoals"
288    id="Goals_and_Nongoals">Goals and Nongoals</a></h2>
289    <p>Some goals of this format are:</p>
290    <ol>
291      <li>Make the XML as readable as possible.</li>
292      <li>Represent faithfully keyboard data from major platforms:
293      it should be possible to create a functionally-equivalent
294      data file (such that given any input, it can produce the same
295      output).</li>
      <li>Maximize the commonality of the data across platforms
      to make comparison easy.</li>
298    </ol>
299    <p>Some non-goals (outside the scope of the format) currently
300    are:</p>
301    <ol>
      <li>Display names or symbols for keycaps (e.g., the German name
303      for "Return"). If that were added to LDML, it would be in a
304      different structure, outside the scope of this section.</li>
305      <li>Advanced IME features, handwriting recognition, etc.</li>
      <li>Roundtrip mappings: the ability to recover precisely the
      same format as an original platform's representation. In
      particular, the internal structure may have no relation to
      the internal structure of external keyboard source data; the
      only goal is functional equivalence.</li>
311      <li>More sophisticated transforms, such as for Indic
312      character rearrangement. It is anticipated that these would
313      be added to a future version, after working out a reasonable
314      representation.</li>
315    </ol>
316    <p>Note: During development of this section, it was considered
317    whether the modifier RAlt (=AltGr) should be merged with
318    Option. In the end, they were kept separate, but for comparison
319    across platforms implementers may choose to unify them.</p>
320    <p>Note that in parts of this document, the format
321    <strong>@x</strong> is used to indicate the <em>attribute</em>
322    <strong>x</strong>.</p>
323    <h2>3 <a name="Definitions" href="#Definitions" id=
324    "Definitions">Definitions</a></h2>
325    <p><b>Arrangement</b> is the term used to describe the relative
326    position of the rectangles that represent keys, either
327    physically or virtually. A physical keyboard has a static
328    arrangement while a virtual keyboard may have a dynamic
329    arrangement that changes per language and/or layer. While the
330    arrangement of keys on a keyboard may be fixed, the mapping of
331    those keys may vary.</p>
332    <p><b>Base character:</b> The character emitted by a particular
333    key when no modifiers are active. In ISO terms, this is group
334    1, level 1.</p>
335    <p><b>Base map:</b> A mapping from the ISO positions to the
336    base characters. There is only one base map per layout. The
    characters on this map can be output without using any modifier
    keys.</p>
    <p><b>Core keyboard layout:</b> Also known as the “alpha” block.
340    The primary set of key values on a keyboard that are used for
341    typing the target language of the keyboard. For example, the
342    three rows of letters on a standard US QWERTY keyboard
343    (QWERTYUIOP, ASDFGHJKL, ZXCVBNM) together with the most
344    significant punctuation keys. Usually this equates to the
345    minimal keyset for a language as seen on mobile phone
346    keyboards.</p>
347    <p><b>Hardware map:</b> A mapping between key codes and ISO
348    layout positions.</p>
349    <p><b>Input Method Editor (IME):</b> a component or program
350    that supports input of large character sets. Typically, IMEs
351    employ contextual logic and candidate UI to identify the
352    Unicode characters intended by the user.</p>
353    <p><b>ISO position:</b> The corresponding position of a key
354    using the ISO layout convention where rows are identified by
355    letters and columns are identified by numbers. For example,
356    "D01" corresponds to the "Q" key on a US keyboard. For the
357    purposes of this document, an ISO layout position is depicted
    by a one-letter row identifier followed by a two-digit column
    number (like "B03", "E12" or "C00"). The following diagram
    depicts a typical US keyboard layout superimposed with the ISO
    layout indicators (note that the number of keys and their
    physical placement relative to each other in this diagram are
    irrelevant; what matters is their logical placement using the
    ISO convention):<img src=
365    "images/keyPositions.png" alt=
366    "keyboard layout example showing ISO key numbering"></p>
367    <p>One may also extend the notion of the ISO layout to support
368    keys that don't map directly to the diagram above (such as the
369    Android device - see diagram). Per the ISO standard, the space
370    bar is mapped to "A03", so the period and comma keys are mapped
371    to "A02" and "A04" respectively based on their relative
372    position to the space bar. Also note that the "E" row does not
373    exist on the Android keyboard.</p>
374    <p><img src="images/androidKeyboard.png" alt=
375    "keyboard layout example showing extension of ISO key numbering"></p>
376    <p>If it becomes necessary in the future, the format could
377    extend the ISO layout to support keys that are located to the
378    left of the "00" column by using negative column numbers "-01",
379    "-02" and so on, or 100's complement "99", "98",...</p>
380    <p><b>Key:</b> A key on a physical keyboard.</p>
381    <p><b>Key code:</b> The integer code sent to the application on
382    pressing a key.</p>
383    <p><b>Key map:</b> The basic mapping between ISO positions and
384    the output characters for each set of modifier combinations
385    associated with a particular layout. There may be multiple key
386    maps for each layout.</p>
387    <p><b>Keyboard:</b> The physical keyboard.</p>
388    <p><b>Keyboard layout:</b> A layout is the overall keyboard
389    configuration for a particular locale. Within a keyboard
390    layout, there is a single base map, one or more key maps and
391    zero or more transforms.</p>
    <p><b>Layer</b> is an arrangement of keys on a virtual
    keyboard. A virtual keyboard is often not intended to be used
    with two hands, so modifier keys cannot simply be held down
    while another key is pressed. Instead, modifier keys are made
    sticky: pressing one changes the visual representation, and
    possibly the arrangement, of the keys, and the user then
    presses the desired key. This visual representation is a
    layer. Thus a virtual keyboard is made up of a set of
    layers.</p>
400    <p><b>Long-press key:</b> also known as a “child key”. A
401    secondary key that is invoked from a top level key on a
402    software keyboard. Secondary keys typically provide access to
403    variants of the top level key, such as accented variants (a
    =&gt; á, à, ä, ã).</p>
405    <p><b>Modifier:</b> A key that is held to change the behavior
406    of a keyboard. For example, the "Shift" key allows access to
407    upper-case characters on a US keyboard. Other modifier keys
    include, but are not limited to, Ctrl, Alt, Option, Command and
409    Caps Lock.</p>
410    <p><b>Physical keyboard</b> is a keyboard that has individual
411    keys that are pressed. Each key has a unique identifier and the
412    arrangement doesn't change, even if the mapping of those keys
413    does.</p>
    <p><b>Transform:</b> A transform is an element that specifies a
    set of conversions from sequences of code points into one (or
    more) other code points. For example, in most Latin keyboards
417    hitting the "^" dead-key followed by the "e" key produces
418    "ê".</p>
    <p><b>Virtual keyboard</b> is a keyboard that is rendered on a
    surface, typically a touch surface. It has a dynamic arrangement and
421    contrasts with a physical keyboard. This term has many
422    synonyms: touch keyboard, software keyboard, SIP (Software
423    Input Panel). This contrasts with other uses of the term
424    virtual keyboard as an on-screen keyboard for reference or
425    accessibility data entry.</p>
426    <h2>4 <a name="File_and_Dir_Structure" href=
427    "#File_and_Dir_Structure" id="File_and_Dir_Structure">File and
428    Directory Structure</a></h2>
429    <p>Each platform has its own directory, where a "platform" is a
430    designation for a set of keyboards available from a particular
    source, such as Windows or ChromeOS. This directory name is the
432    platform name (see Table 2 located further in the document).
433    Within this directory there are two types of files:</p>
434    <ol>
      <li>A single platform file (see XML structure for Platform
      file). This file includes a mapping of hardware key codes to
      the ISO layout positions. It is also open to expansion
      for any configuration elements that are valid across the
      whole platform and that are not layout specific. This file is
      simply called _platform.xml.</li>
      <li>Multiple layout files named by their locale identifiers
      (e.g., lt-t-k0-chromeos.xml or ne-t-k0-windows.xml).</li>
443    </ol>
444    <p>Keyboard data that is not supported on a given platform, but
445    intended for use with that platform, may be added to the
446    directory /und/. For example, there could be a file
447    /und/lt-t-k0-chromeos.xml, where the data is intended for use
448    with ChromeOS, but does not reflect data that is distributed as
449    part of a standard ChromeOS release.</p>
450    <h2>5 <a name="Element_Heirarchy_Layout_File" href=
451    "#Element_Heirarchy_Layout_File" id=
452    "Element_Heirarchy_Layout_File">Element Hierarchy - Layout
453    File</a></h2>
454    <h3>5.1 <a name="Element_Keyboard" href="#Element_Keyboard" id=
455    "Element_Keyboard">Element: keyboard</a></h3>
456    <p>This is the top level element. All other elements defined
457    below are under this element.</p>
458    <p>Syntax</p>
459    <p>&lt;keyboard locale="{locale ID}"&gt;</p>
460    <p>{definition of the layout as described by the elements
461    defined below}</p>
462    <p>&lt;/keyboard&gt;</p>
463    <dl>
464      <dt>Attribute: locale (required)</dt>
465      <dd>This mandatory attribute represents the locale of the
466      keyboard using Unicode locale identifiers (see <a href=
467      "tr35.html">LDML</a>) - for example 'el' for Greek.
468      Sometimes, the locale may not specify the base language. For
469      example, a Devanagari keyboard for many languages could be
470      specified by BCP-47 code: 'und-Deva'. For details, see
      <a href="#Keyboard_IDs">Keyboard IDs</a>.</dd>
472    </dl>
473    <p>Examples (for illustrative purposes only, not indicative of
474    the real data)</p>
    <pre>&lt;keyboard locale="ka-t-k0-qwerty-windows"&gt;
…
&lt;/keyboard&gt;
&lt;keyboard locale="fr-CH-t-k0-android"&gt;
…
&lt;/keyboard&gt;</pre>
481    <hr>
482    <h3>5.2 <a name="Element_version" href="#Element_version" id=
483    "Element_version">Element: version</a></h3>
484    <p>Element used to keep track of the source data version.<br>
485    <br>
486    Syntax</p>
    <p>&lt;version platform=".." number=".."&gt;<br></p>
488    <dl>
489      <dt>Attribute: platform (required)</dt>
490      <dd>The platform source version. Specifies what version of
491      the platform the data is from. For example, data from Mac OSX
492      10.4 would be specified as platform="10.4". For platforms
493      that have unstable version numbers which change frequently
494      (like Linux), this field is set to an integer representing
495      the iteration of the data starting with "1". This number
496      would only increase if there were any significant changes in
497      the keyboard data.</dd>
498    </dl>
499    <dl>
500      <dt>Attribute: number (required)</dt>
501      <dd>The data revision version.</dd>
502    </dl>
503    <dl>
504      <dt>Attribute: cldrVersion (fixed by DTD)</dt>
505      <dd>The CLDR specification version that is associated with
506      this data file. This value is fixed and is inherited from the
507      DTD file and therefore does not show up directly in the XML
508      file.</dd>
509    </dl>
510    <p>Example</p>
511    <p>&lt;keyboard locale="..-osx"&gt;</p>
512    <p>…</p>
513    <p>&lt;version platform="10.4" number="1"/&gt;</p>
514    <p>…</p>
515    <p>&lt;/keyboard&gt;</p>
516    <hr>
517    <h3>5.3 <a name="Element_generation" href="#Element_generation"
518    id="Element_generation">Element: generation</a></h3>
519    <p>The generation element is now deprecated. It was used to
520    keep track of the generation date of the data.</p>
521    <hr>
522    <h3>5.4 <a name="Element_names" href="#Element_names" id=
523    "Element_names">Element: names</a></h3>
524    <p>Element used to store any names given to the layout by the
525    platform.<br>
526    <br>
527    Syntax</p>
528    <p>&lt;names&gt;</p>
529    <p>{set of name elements}</p>
530    <p>&lt;/names&gt;<br></p>
531    <h3>5.5 <a name="Element_name" href="#Element_name" id=
532    "Element_name">Element: name</a></h3>
533    <p>A single name given to the layout by the platform.<br>
534    <br>
535    Syntax</p>
536    <p>&lt;name value=".."&gt;<br></p>
537    <dl>
538      <dt>Attribute: value (required)</dt>
539      <dd>The name of the layout.</dd>
540    </dl>
541    <p>Example</p>
542    <p>&lt;keyboard locale="bg-t-k0-windows-phonetic-trad"&gt;</p>
543    <p>…</p>
544    <p>&lt;names&gt;</p>
545    <p>&lt;name value="Bulgarian (Phonetic Traditional)"/&gt;</p>
546    <p>&lt;/names&gt;</p>
547    <p>…</p>
548    <p>&lt;/keyboard&gt;</p>
549    <hr>
550    <h3>5.6 <a name="Element_settings" href="#Element_settings" id=
551    "Element_settings">Element: settings</a></h3>
552    <p>An element used to keep track of layout specific settings.
553    This element may or may not show up on a layout. These settings
554    reflect the normal practice on the platform. However, an
555    implementation using the data may customize the behavior. For
    example, for transformFailure the implementation could ignore
557    the setting, or modify the text buffer in some other way (such
558    as by emitting backspaces).<br>
559    <br>
560    Syntax</p>
561    <p>&lt;settings [fallback="omit"] [transformFailure="omit"]
562    [transformPartial="hide"]&gt;<br></p>
563    <dl>
564      <dt>Attribute: fallback="omit" (optional)</dt>
565      <dd>The presence of this attribute means that when a modifier
566      key combination goes unmatched, no output is produced. The
567      default behavior (when this attribute is not present) is to
      fall back to the base map when the modifier key combination
569      goes unmatched.</dd>
570    </dl>
571    <p>If this attribute is present, it must have a value of
572    omit.</p>
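    <p>A minimal sketch of the effect, using a hypothetical layout
    (the locale identifier and mappings are assumptions for
    illustration only): with the settings below, pressing D01 with
    Shift produces "Q", but pressing D01 with a modifier
    combination that no keyMap lists, such as Ctrl, produces no
    output. Without the fallback attribute, the same key press
    would fall back to the base map and produce "q".</p>
    <pre>
&lt;keyboard locale="und-t-k0-windows"&gt;
    &lt;settings fallback="omit" /&gt;
    &lt;keyMap&gt;
        &lt;map iso="D01" to="q" /&gt;
    &lt;/keyMap&gt;
    &lt;keyMap modifiers="shift"&gt;
        &lt;map iso="D01" to="Q" /&gt;
    &lt;/keyMap&gt;
&lt;/keyboard&gt;</pre>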
573    <dl>
574      <dt>Attribute: transformFailure="omit" (optional)</dt>
575      <dd>This attribute describes the behavior of a transform when
576      it is escaped (see the transform element in the Layout file
577      for more information). A transform is escaped when it can no
578      longer continue due to the entry of an invalid key. For
579      example, suppose the following set of transforms are
580      valid:</dd>
581    </dl>
582    <blockquote>
583      <p>^e → ê</p>
584      <p>^a → â</p>
585    </blockquote>
    <p>Suppose a user now enters the "^" key. The "^" is then stored
    in a buffer and may or may not be shown to the user (see the
    transformPartial attribute).</p>
589    <p>If a user now enters d, then the transform has failed and
590    there are two options for output.</p>
591    <p>1. default behavior - "^d"</p>
592    <p>2. omit - "" (nothing and the buffer is cleared)</p>
593    <p>The default behavior (when this attribute is not present) is
594    to emit the contents of the buffer upon failure of a
595    transform.</p>
596    <p>If this attribute is present, it must have a value of
597    omit.</p>
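    <p>The case above can be sketched as a layout fragment. This is
    an illustration under assumed names (the locale identifier is
    hypothetical, and the transforms element follows the form used
    in the introductory example): with transformFailure="omit",
    typing "^" followed by "d" produces no output and clears the
    buffer; if the attribute were removed, "^d" would be emitted.</p>
    <pre>
&lt;keyboard locale="und-t-k0-windows"&gt;
    &lt;settings transformFailure="omit" /&gt;
    &lt;transforms type="simple"&gt;
        &lt;transform from="^e" to="ê" /&gt;
        &lt;transform from="^a" to="â" /&gt;
    &lt;/transforms&gt;
&lt;/keyboard&gt;</pre>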
598    <dl>
599      <dt>Attribute: transformPartial="hide" (optional)</dt>
      <dd>This attribute describes the behavior of the system while in
      a transform. When this attribute is present, the values of
      the buffer are not shown as the user is typing a transform
603      (this behavior can be seen on Windows or Linux
604      platforms).</dd>
605    </dl>
606    <p>By default (when this attribute is not present), show the
607    values of the buffer as the user is typing a transform (this
608    behavior can be seen on the Mac OSX platform).</p>
609    <p>If this attribute is present, it must have a value of
610    hide.</p>
611    <p>Example</p>
612    <p>&lt;keyboard locale="bg-t-k0-windows-phonetic-trad"&gt;</p>
613    <p>…</p>
    <p>&lt;settings fallback="omit" transformPartial="hide"/&gt;</p>
615    <p>…</p>
616    <p>&lt;/keyboard&gt;</p>
617    <p>Indicates that:</p>
618    <ol>
619      <li>When a modifier combination goes unmatched, do not output
620      anything when a key is pressed.</li>
621      <li>If a transform is escaped, output the contents of the
622      buffer.</li>
623      <li>During a transform, hide the contents of the buffer as
624      the user is typing.</li>
625    </ol>
626    <hr>
627    <h3>5.7 <a name="Element_keyMap" href="#Element_keyMap" id=
628    "Element_keyMap">Element: keyMap</a></h3>
629    <p>This element defines the group of mappings for all the keys
630    that use the same set of modifier keys. It contains one or more
631    map elements.</p>
632    <p>Syntax</p>
633    <p>&lt;keyMap [modifiers="{Set of Modifier
634    Combinations}"]&gt;</p>
635    <p>{a set of map elements}</p>
636    <p>&lt;/keyMap&gt;</p>
637    <dl>
638      <dt>Attribute: modifiers (optional)</dt>
639      <dd>A set of modifier combinations that cause this key map to
640      be "active". Each combination is separated by a space. The
641      interpretation is that there is a match if any of the
642      combinations match, that is, they are ORed. Therefore, the
643      order of the combinations within this attribute does not
644      matter.<br>
645      <br>
646      A combination is simply a concatenation of words to represent
647      the simultaneous activation of one or more modifier keys. The
648      order of the modifier keys within a combination does not
649      matter, although don't care cases are generally added to the
650      end of the string for readability (see next paragraph). For
651      example: "cmd+caps" represents the Caps Lock and Command
      modifier key combination. Some keys have right or left
      variant keys, specified by an 'R' or 'L' suffix. For example:
      "ctrlR+caps" would represent the Right-Control and Caps Lock
      combination. For simplicity, the presence of a modifier
      without an 'R' or 'L' suffix means that either its left or
      right variant is valid. So "ctrl+caps" represents the same
      as "ctrlL+ctrlR?+caps ctrlL?+ctrlR+caps".</dd>
659    </dl>
660    <p>A modifier key may be further specified to be in a "don't
661    care" state using the '?' suffix. The "don't care" state simply
662    means that the preceding modifier key may be either ON or OFF.
663    For example "ctrl+shift?" could be expanded into "ctrl
664    ctrl+shift".</p>
    <p>Within a combination, the presence of a modifier WITHOUT the
    '?' suffix indicates this key MUST be on. The converse is also
    true: the absence of a modifier key means it MUST be off for
    the combination to be active.</p>
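    <p>As an illustrative sketch of these rules (the key position
    and output below are invented, not taken from any real layout),
    the two fragments that follow are alternative ways of writing
    the same key map. The first uses the "don't care" suffix and the
    second spells out both combinations. Only one of them would
    appear in an actual layout file, since together they would
    overlap:</p>
    <pre>&lt;!-- Alternative 1: Ctrl must be ON, Shift may be ON or OFF,
     and every modifier not listed must be OFF --&gt;
&lt;keyMap modifiers="ctrl+shift?"&gt;
  &lt;map iso="C01" to="a" /&gt;
&lt;/keyMap&gt;

&lt;!-- Alternative 2: the same set of combinations written out in full --&gt;
&lt;keyMap modifiers="ctrl ctrl+shift"&gt;
  &lt;map iso="C01" to="a" /&gt;
&lt;/keyMap&gt;</pre>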
669    <p>Here is an exhaustive list of all possible modifier
670    keys:</p>
672    <table>
673      <caption>
674        <a name="Possible_Modifier_Keys" href=
675        "#Possible_Modifier_Keys" id=
676        "Possible_Modifier_Keys">Possible Modifier Keys</a>
677      </caption>
678      <tbody>
679        <tr>
680          <td>
681            <p>Modifier Keys</p>
682          </td>
683          <td>&nbsp;</td>
684          <td>
685            <p>Comments</p>
686          </td>
687        </tr>
688        <tr>
689          <td>
690            <p>altL</p>
691          </td>
692          <td>
693            <p>altR</p>
694          </td>
695          <td>
            <p>x+alt+y → x+altL+altR?+y x+altL?+altR+y</p>
697          </td>
698        </tr>
699        <tr>
700          <td>
701            <p>ctrlL</p>
702          </td>
703          <td>
704            <p>ctrlR</p>
705          </td>
706          <td>
707            <p>ditto for Ctrl</p>
708          </td>
709        </tr>
710        <tr>
711          <td>
712            <p>shiftL</p>
713          </td>
714          <td>
715            <p>shiftR</p>
716          </td>
717          <td>
718            <p>ditto for Shift</p>
719          </td>
720        </tr>
721        <tr>
722          <td>
723            <p>optL</p>
724          </td>
725          <td>
726            <p>optR</p>
727          </td>
728          <td>
729            <p>ditto for Opt</p>
730          </td>
731        </tr>
732        <tr>
733          <td>
734            <p>caps</p>
735          </td>
736          <td>&nbsp;</td>
737          <td>
738            <p>Caps Lock</p>
739          </td>
740        </tr>
741        <tr>
742          <td>
743            <p>cmd</p>
744          </td>
745          <td>&nbsp;</td>
746          <td>
747            <p>Command on the Mac</p>
748          </td>
749        </tr>
750      </tbody>
751    </table>
    <p>All sets of modifier combinations within a layout are
    disjoint, with no overlap between the key maps. That is, for
    every possible modifier combination, there is at most a single
    match within the layout file; there are never multiple matches.
    If no exact match is available, the match falls back to the base
    map unless the fallback="omit" attribute in the settings element
    is set, in which case there is no output at all.</p>
760    <p>To illustrate, the following example produces an invalid
761    layout because pressing the "Ctrl" modifier key produces an
762    indeterminate result:</p>
763    <p>&lt;keyMap modifiers="ctrl+shift?"&gt;</p>
764    <p>…</p>
765    <p>&lt;/keyMap&gt;</p>
766    <p>&lt;keyMap modifiers="ctrl"&gt;</p>
767    <p>…</p>
768    <p>&lt;/keyMap&gt;</p>
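    <p>One possible repair of the layout above, sketched on the
    assumption that the author wants Ctrl alone and Ctrl+Shift to
    produce different output, is to drop the '?' suffix so that the
    two sets of combinations no longer overlap:</p>
    <pre>&lt;!-- Matches only Ctrl+Shift --&gt;
&lt;keyMap modifiers="ctrl+shift"&gt;
  …
&lt;/keyMap&gt;
&lt;!-- Matches only Ctrl, because an absent modifier must be OFF --&gt;
&lt;keyMap modifiers="ctrl"&gt;
  …
&lt;/keyMap&gt;</pre>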
769    <p>Modifier Examples:</p>
770    <p>&lt;keyMap modifiers="cmd?+opt+caps?+shift" /&gt;</p>
771    <p>Caps-Lock may be ON or OFF, Option must be ON, Shift must be
772    ON and Command may be ON or OFF.</p>
773    <p>&lt;keyMap modifiers="shift caps" fallback="true" /&gt;</p>
    <p>Either Shift alone or Caps-Lock alone must be ON. This is
    also the fallback key map.</p>
776    <p>If the modifiers attribute is not present on a keyMap then
777    that particular key map is the base map.</p>
778    <hr>
779    <h3>5.8 <a name="Element_map" href="#Element_map" id=
780    "Element_map">Element: map</a></h3>
781    <p>This element defines a mapping between the base character
782    and the output for a particular set of active modifier keys.
783    This element must have the keyMap element as its parent.</p>
    <p>If no map element has been defined for a particular ISO
    layout position, then pressing that key produces no
    output.</p>
787    <p>Syntax</p>
788    <pre>&lt;map
789 iso="{the iso position}"
790 to="{the output}"
791 [longPress="{long press keys}"]
792 [transform="no"]
793/&gt;&lt;!-- {Comment to improve readability (if needed)} --&gt;</pre>
794    <dl>
795      <dt>Attribute: iso (exactly one of base and iso is
796      required)</dt>
797      <dd>The iso attribute represents the ISO layout position of
798      the key (see the definition at the beginning of the document
799      for more information).</dd>
800    </dl>
801    <dl>
802      <dt>Attribute: to (required)</dt>
803      <dd>The to attribute contains the output sequence of
804      characters that is emitted when pressing this particular key.
805      Control characters, whitespace (other than the regular space
806      character) and combining marks in this attribute are escaped
807      using the \u{...} notation.</dd>
808    </dl>
809    <dl>
810      <dt>Attribute: longPress (optional)</dt>
      <dd>The longPress attribute contains any characters that can
      be emitted by "long-pressing" a key; this feature is
      prominent on mobile devices. The possible sequences of
      characters that can be emitted are whitespace delimited.
      Control characters, combining marks and whitespace (when it
      is intended to be a long-press option) in this attribute are
      escaped using the \u{...} notation.</dd>
818    </dl>
819    <dl>
820      <dt>Attribute: transform="no" (optional)</dt>
      <dd>The transform attribute is used to define a key that
      never participates in a transform, even though its output
      appears as part of a transform. This attribute is necessary
      because two different keys (or modifier combinations) could
      output the same characters, but only one of them is intended
      to be a dead-key and participate in a transform. If this
      attribute is present, its value must be no.</dd>
828    </dl>
829    <dl>
830      <dt>Attribute: multitap (optional)</dt>
      <dd>A space-delimited list of strings, where each successive
      element of the list is produced by the corresponding number
      of quick taps. For example, three taps on the key C01 will
      produce a “c” in the following example (the first tap
      produces “a”, two taps produce “bb”).<br>
835      <br>
836      <em>Example:</em><br>
837      <br>
      &lt;map iso="C01" to="a" multitap="bb c d"/&gt;</dd>
839    </dl>
840    <dl>
841      <dt>Attribute: longPress-status (optional)</dt>
      <dd>Indicates optional longPress values. Must only occur with
      a longPress value. May be suppressed or shown, depending on
      user settings. There can be two map elements that differ only
      by longPress-status, allowing two different sets of
      longPress values.<br>
847      <br>
848      <em>Example:</em><br>
849      <br>
850      &lt;map iso="D01" to="a" longPress="à â % æ á ä ã å ā
851      ª"/&gt;<br>
852      &lt;map iso="D01" to="a" longPress="à â á ä ã å ā"
853      longPress-status="optional"/&gt;</dd>
854    </dl>
855    <dl>
856      <dt>Attribute: optional (optional)</dt>
857      <dd>Indicates optional mappings. May be suppressed or shown,
858      depending on user settings.</dd>
859    </dl>
860    <dl>
861      <dt>Attribute: hint (optional)</dt>
862      <dd>
        Indicates a hint as to long-press contents, such as the
        first character of the longPress value, that can be
        displayed on the key. May be suppressed or shown, depending
        on user settings.<br>
867        <br>
868        <i>Example:</i> where the hint is "{":<br>
869        <div style='text-align: center'><img alt="keycap hint" src=
870        'images/keycapHint.png'></div>
871      </dd>
872    </dl>
873    <p>For example, suppose there are the following keys, their
874    output and one transform:</p>
875    <blockquote>
876      <p>E00 outputs `</p>
877      <p>Option+E00 outputs ` (the dead-version which participates
878      in transforms).</p>
879      <p>`e → è</p>
880    </blockquote>
881    <p>Then the first key must be tagged with transform="no" to
882    indicate that it should never participate in a transform.</p>
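    <p>A minimal sketch of how these two keys might be written
    follows. The use of the "opt" modifier for the Option key and
    the surrounding key maps are assumptions made for
    illustration:</p>
    <pre>&lt;keyMap&gt;
  &lt;!-- Plain E00: outputs ` but never takes part in a transform --&gt;
  &lt;map iso="E00" to="`" transform="no" /&gt;
&lt;/keyMap&gt;
&lt;keyMap modifiers="opt"&gt;
  &lt;!-- Option+E00: the dead-key version, so `e can become è --&gt;
  &lt;map iso="E00" to="`" /&gt;
&lt;/keyMap&gt;</pre>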
883    <p>Comment: US key equivalent, base key, escaped output and
884    escaped longpress</p>
885    <p>In the generated files, a comment is included to help the
886    readability of the document. This comment simply shows the
887    English key equivalent (with prefix key=), the base character
888    (base=), the escaped output (to=) and escaped long-press keys
889    (long=). These comments have been inserted strategically in
    places to improve readability. Not all comments include
    all components since some of them may be obvious.</p>
892    <p>Examples</p>
    <pre>&lt;keyboard locale="fr-BE-t-k0-windows"&gt;
    …
    &lt;keyMap modifiers="shift"&gt;
        &lt;map iso="D01" to="A" /&gt; &lt;!-- key=Q --&gt;
        &lt;map iso="D02" to="Z" /&gt; &lt;!-- key=W --&gt;
        &lt;map iso="D03" to="E" /&gt;
        &lt;map iso="D04" to="R" /&gt;
        &lt;map iso="D05" to="T" /&gt;
        &lt;map iso="D06" to="Y" /&gt;
        …
    &lt;/keyMap&gt;
    …
&lt;/keyboard&gt;

&lt;keyboard locale="ps-t-k0-windows"&gt;
    …
    &lt;keyMap modifiers='altR+caps? ctrl+alt+caps?'&gt;
        &lt;map iso="D04" to="\u{200e}" /&gt; &lt;!-- key=R base=ق --&gt;
        &lt;map iso="D05" to="\u{200f}" /&gt; &lt;!-- key=T base=ف --&gt;
        &lt;map iso="D08" to="\u{670}" /&gt; &lt;!-- key=I base=ه to= ٰ --&gt;
        …
    &lt;/keyMap&gt;
    …
&lt;/keyboard&gt;</pre>
895    <h4>5.8.1 <a name="Element_flicks" href="#Element_flicks" id=
896    "Element_flicks">Elements: flicks, flick</a></h4>
897    <p class='dtd'>&lt;!ELEMENT keyMap ( map | flicks )+ &gt;<br>
898    &lt;!ELEMENT flick EMPTY&gt;<br>
899    &lt;!ATTLIST flick directions NMTOKENS&gt;<br>
900    &lt;!ATTLIST flick to CDATA&gt;<br>
901    &lt;!--@VALUE--&gt;</p>
    <p>The flicks element is used to generate results from a
    "flick" of the finger on a mobile device. The
    <strong>directions</strong> attribute value is a
    space-delimited list of keywords that describe a path,
    currently restricted to the cardinal and intercardinal
    directions {n e s w ne nw se sw}. The <strong>to</strong>
    attribute value is the result of (one or more) flicks.</p>
909    <p>Example: where a flick to the Northeast then South produces
910    two code points.</p>
    <pre>&lt;flicks iso="C01"&gt;
  &lt;flick directions="ne s" to="\uABCD\uDCBA" /&gt;
&lt;/flicks&gt;</pre>
914    <hr>
915    <h3>5.9 <a name="Element_import" href="#Element_import" id=
916    "Element_import">Element: import</a></h3>
917    <p>The import element references another file of the same type
918    and includes all the subelements of the top level element as
919    though the import element were being replaced by those
920    elements, in the appropriate section of the XML file. For
921    example:</p>
    <pre>   &lt;import path="standard_transforms.xml" /&gt;</pre>
923    <dl>
924      <dt>Attribute: path (required)</dt>
      <dd>The value contains a relative path to the included
      ldml file. There is a standard set of directories to be
      searched that an application may provide. This set is always
      prepended with the directory in which the file currently
      being read is stored.</dd>
930    </dl>
931    <p>If two identical elements, as described below, are defined,
932    the later element will take precedence. Thus if a
933    hardwareMap/map for the same keycode on the same page is
934    defined twice (for example once in an included file), the later
935    one will be the resulting mapping.</p>
936    <p>Elements are considered to have three attributes that make
937    them unique: the tag of the element, the parent and the
938    identifying attribute. The parent in its turn is a unique
939    element and so on up the chain. If the distinguishing attribute
940    is optional, its non-existence is represented with an empty
941    value. Here is a list of elements and their defining
942    attributes. If an element is not listed then if it is a leaf
943    element, only one occurs and it is merely replaced. If it has
944    children, then the sub elements are considered, in effect
945    merging the element in question.</p>
946    <table>
947      <!-- nocaption -->
948      <tbody>
949        <tr>
950          <td>
951            <p>Element</p>
952          </td>
953          <td>
954            <p>Parent</p>
955          </td>
956          <td>
957            <p>Distinguishing attribute</p>
958          </td>
959        </tr>
960        <tr>
961          <td>
962            <p>keyMap</p>
963          </td>
964          <td>
965            <p>keyboard</p>
966          </td>
967          <td>
968            <p>@modifiers</p>
969          </td>
970        </tr>
971        <tr>
972          <td>
973            <p>map</p>
974          </td>
975          <td>
976            <p>keyMap</p>
977          </td>
978          <td>
979            <p>@iso</p>
980          </td>
981        </tr>
982        <tr>
983          <td>
984            <p>display</p>
985          </td>
986          <td>
987            <p>displayMap</p>
988          </td>
989          <td>
990            <p>@char (new)</p>
991          </td>
992        </tr>
993        <tr>
994          <td>
995            <p>layout</p>
996          </td>
997          <td>
998            <p>layouts</p>
999          </td>
1000          <td>
1001            <p>@modifier</p>
1002          </td>
1003        </tr>
1004      </tbody>
1005    </table>
    <p>In order to help identify mistakes, it is an error if a file
    contains two elements that override each other. All element
    overrides must come as a result of an &lt;import&gt; element
    either for the element overridden or the element
    overriding.</p>
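    <p>The following sketch illustrates the precedence and merging
    rules; the file names and mappings are hypothetical. The
    importing file redefines the map for D01, and because that
    definition comes later than the imported one, it takes
    precedence. The map for D02 is not redefined, so it is merged in
    from the imported file:</p>
    <pre><i>shared/base.xml</i>
&lt;keyboard&gt;
  &lt;keyMap&gt;
    &lt;map iso="D01" to="q" /&gt;
    &lt;map iso="D02" to="w" /&gt;
  &lt;/keyMap&gt;
&lt;/keyboard&gt;

<i>importing file</i>
&lt;keyboard&gt;
  &lt;import path="shared/base.xml" /&gt;
  &lt;keyMap&gt;
    &lt;map iso="D01" to="a" /&gt; &lt;!-- overrides the imported D01 mapping --&gt;
  &lt;/keyMap&gt;
&lt;/keyboard&gt;
&lt;!-- Resulting base map: D01 → a, D02 → w --&gt;</pre>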
1011    <p>The following elements are not imported from the source
1012    file:</p>
1013    <ul>
1014      <li>version</li>
1015      <li>generation</li>
1016      <li>names</li>
1017      <li>settings</li>
1018    </ul>
1019    <hr>
1020    <h3>5.10 <a name="Element_displayMap" href=
1021    "#Element_displayMap" id="Element_displayMap">Element:
1022    displayMap</a></h3>
    <p>The displayMap can be used to describe what is to be
    displayed on the keytops for various keys. For the most part,
    such explicit information is unnecessary since the @to
    attribute from the keyMap/map element can be used. But there are
    some characters, such as diacritics, that do not display well
    on their own and so explicit overrides for such characters can
    help. The displayMap consists of a list of display sub
    elements.</p>
    <p>DisplayMaps are designed to be shared across many different
    keyboard layout descriptions, and imported where needed.</p>
1033    <hr>
1034    <h3>5.11 <a name="Element_display" href="#Element_display" id=
1035    "Element_display">Element: display</a></h3>
1036    <p>The display element describes how a character, that has come
1037    from a keyMap/map element, should be displayed on a keyboard
1038    layout where such display is possible.</p>
1039    <dl>
1040      <dt>Attribute: mapOutput (required)</dt>
1041      <dd>Specifies the character or character sequence from the
1042      keyMap/map element that is to have a special display.</dd>
1043    </dl>
1044    <dl>
1045      <dt>Attribute: display (required)</dt>
      <dd>Specifies the character sequence that should
      be displayed on the keytop for any key that generates the
      @mapOutput sequence. (It is an error if the value of the
      display attribute is the same as the value of the mapOutput
      attribute.)</dd>
1051    </dl>
    <pre>   &lt;keyboard&gt;
                &lt;keyMap&gt;
                        &lt;map iso="C01" to="a" longPress="\u0301 \u0300"/&gt;
                &lt;/keyMap&gt;
                &lt;displayMap&gt;
                        &lt;display mapOutput="\u0300" display="u\u02CB"/&gt;
                        &lt;display mapOutput="\u0301" display="u\u02CA"/&gt;
                &lt;/displayMap&gt;
        &lt;/keyboard&gt;</pre>
1060    <p>To allow displayMaps to be shared across descriptions, there
1061    is no requirement that @mapOutput matches any @to in any
1062    keyMap/map element in the keyboard description.</p>
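    <p>As a sketch of such sharing (the file name is hypothetical
    and the display values simply mirror the example above), a
    display map can be kept in its own file and pulled into a layout
    with an import element. The entry for \u0300 matches nothing in
    the importing layout, which is permitted:</p>
    <pre><i>shared/display-marks.xml</i>
&lt;keyboard&gt;
  &lt;displayMap&gt;
    &lt;display mapOutput="\u0300" display="u\u02CB" /&gt;
    &lt;display mapOutput="\u0301" display="u\u02CA" /&gt;
  &lt;/displayMap&gt;
&lt;/keyboard&gt;

<i>a layout that produces only one of the two marks</i>
&lt;keyboard&gt;
  &lt;import path="shared/display-marks.xml" /&gt;
  &lt;keyMap&gt;
    &lt;map iso="C01" to="a" longPress="\u0301" /&gt;
  &lt;/keyMap&gt;
&lt;/keyboard&gt;</pre>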
1063    <hr>
1064    <h3>5.12 <a name="Element_layer" href="#Element_layer" id=
1065    "Element_layer">Element: layer</a></h3>
1066    <p>A layer element describes the configuration of keys on a
1067    particular layer of a keyboard. It contains row elements to
1068    describe which keys exist in each row and also switch elements
1069    that describe how keys in the layer switch the layer to
1070    another. In addition, for platforms that require a mapping from
1071    a key to a virtual key (for example Windows or Mac) there is
1072    also a vkeys element to describe the mapping.</p>
1073    <dl>
1074      <dt>Attribute: modifier (required)</dt>
1075      <dd>This has two roles. It acts as an identifier for the
1076      layer element and also provides the linkage into a keyMap. A
1077      modifier is a single modifier combination such that it is
1078      matched by one of the modifier combinations in one of the
1079      keyMap/@modifiers attribute. To indicate that no modifiers
1080      apply the reserved name of "none" is used. For the purposes
1081      of fallback vkey mapping, the following modifier components
1082      are reserved: "shift", "ctrl", "alt", "caps", "cmd", "opt"
1083      along with the "L" and "R" optional single suffixes for the
1084      first 3 in that list. There must be a keyMap whose @modifiers
1085      attribute matches the @modifier attribute of the layer
1086      element. It is an error if there is no such keyMap.</dd>
1087    </dl>
    <p>The keyMap/@modifiers attribute often includes multiple
    combinations that match. It is not necessary (or preferred) to
    include all of these. Instead a minimal matching element should
    be used, such that exactly one keyMap is matched.</p>
1092    <p>The following are examples of situations where the
1093    @modifiers and @modifier do not match, with a different keymap
1094    definition than above.</p>
1095    <table>
1096      <!-- nocaption -->
1097      <tbody>
1098        <tr>
1099          <th>
1100            <p>keyMap/@modifiers</p>
1101          </th>
1102          <th>
1103            <p>layer/@modifier</p>
1104          </th>
1105        </tr>
1106        <tr>
1107          <td>
1108            <p>shiftL</p>
1109          </td>
1110          <td>
1111            <p>shift (ambiguous)</p>
1112          </td>
1113        </tr>
1114        <tr>
1115          <td>
1116            <p>altR</p>
1117          </td>
1118          <td>
1119            <p>alt</p>
1120          </td>
1121        </tr>
1122        <tr>
1123          <td>
1124            <p>shiftL?+shiftR</p>
1125          </td>
1126          <td>
1127            <p>shift</p>
1128          </td>
1129        </tr>
1130      </tbody>
1131    </table>
1132    <p>And these do match:</p>
1133    <table>
1134      <!-- nocaption -->
1135      <tbody>
1136        <tr>
1137          <th>
1138            <p>keyMap/@modifiers</p>
1139          </th>
1140          <th>
1141            <p>layer/@modifier</p>
1142          </th>
1143        </tr>
1144        <tr>
1145          <td>
1146            <p>shiftL shiftR</p>
1147          </td>
1148          <td>
1149            <p>shift</p>
1150          </td>
1151        </tr>
1152      </tbody>
1153    </table>
    <p>The use of @modifier as an identifier for a layer is
    sufficient since it is always unique among the set of layer
    elements in a keyboard.</p>
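    <p>A sketch of the linkage between a layer and a key map, using
    invented mappings: the layer names the single combination
    "shift", which is matched by the first of the two combinations
    listed in the keyMap's modifiers attribute.</p>
    <pre>&lt;keyMap modifiers="shift caps"&gt;
  &lt;map iso="D01" to="Q" /&gt;
&lt;/keyMap&gt;
&lt;!-- "shift" is a single combination and identifies this layer --&gt;
&lt;layer modifier="shift"&gt;
  &lt;row keys="D01-D10" /&gt;
&lt;/layer&gt;</pre>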
1157    <hr>
1158    <h3>5.13 <a name="Element_row" href="#Element_row" id=
1159    "Element_row">Element: row</a></h3>
    <p>A row element describes the keys that are present in the row
    of a keyboard. Row elements are ordered within a layer element
    with the top visual row being stored first. The row element
    introduces the keyId, which may be an ISOKey or a specialKey.
    More formally:</p>
    <pre>keyId = ISOKey | specialKey
ISOKey = [A-Z][0-9][0-9]
specialKey = [a-z][a-zA-Z0-9]{2,7}</pre>
1167    <p>ISOKey denotes a key having an <a href="#Definitions">ISO
1168    Position</a>. SpecialKey is used to identify functional keys
1169    occurring on a virtual keyboard layout.</p>
1170    <dl>
1171      <dt>Attribute: keys (required)</dt>
      <dd>This is a string that lists the keyId for each of the
      keys in a row. Key ranges may be contracted to
      firstkey-lastkey but only for ISOKey type keyIds. The
      interpolation between the first and last key names is
      entirely numeric. Thus D00-D03 is equivalent to D00 D01 D02
      D03. It is an error if the first and last keys do not have
      the same alphabetic prefix or if the numeric component of the
      last key is less than or equal to that of the first
      key.</dd>
1181    </dl>
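    <p>For instance, the two rows in the following sketch are
    equivalent, while the forms listed in the comment would be
    errors under the rules above:</p>
    <pre>&lt;row keys="D00-D03" /&gt;
&lt;row keys="D00 D01 D02 D03" /&gt;
&lt;!-- Errors:
     keys="D03-D00"   last numeric component is less than the first
     keys="D01-D01"   last numeric component is equal to the first
     keys="C01-D03"   alphabetic prefixes differ
     keys="bksp-tab"  ranges are only allowed for ISOKey keyIds --&gt;</pre>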
1182    <p>specialKey type keyIds may take any value within their
1183    syntactic constraint. But the following specialKeys are
1184    reserved to allow applications to identify them and give them
1185    special handling:</p>
1186    <ul>
1187      <li>"bksp", "enter", "space", "tab", "esc", "sym", "num"</li>
1188      <li>all the reserved modifier names</li>
1189      <li>specialKeys starting with the letter "x" for future
1190      reserved names.</li>
1191    </ul>
1192    <p>Here is an example of a row element:</p>
1193    <pre>   &lt;layer modifier="none"&gt;
1194                &lt;row keys="D01-D10"/&gt;
1195                &lt;row keys="C01-C09"/&gt;
1196                &lt;row keys="shift B01-B07 bksp"/&gt;
1197                &lt;row keys="sym A01 smilies A02-A03 enter"/&gt;
1198        &lt;/layer&gt;
1199                </pre>
1200    <hr>
1201    <h3>5.14 <a name="Element_switch" href="#Element_switch" id=
1202    "Element_switch">Element: switch</a></h3>
    <p>The switch element describes a function key that has been
    included in the layout. It specifies which layer is switched to
    when the key is pressed and also what the key looks like.</p>
1206    <dl>
1207      <dt>Attribute: iso (required)</dt>
1208      <dd>The keyId as specified in one of the row elements. This
1209      must be a specialKey and not an ISOKey.</dd>
1210    </dl>
1211    <dl>
1212      <dt>Attribute: layout (required)</dt>
      <dd>The modifier attribute of the layer element
      that describes the layer the user gets switched to.</dd>
1215    </dl>
1216    <dl>
1217      <dt>Attribute: display (required)</dt>
1218      <dd>A string to be displayed on the key.</dd>
1219    </dl>
1220    <p>Here is an example of a switch element for a shift key:</p>
1221    <pre>   &lt;layer modifier="none"&gt;
1222                &lt;row keys="D01-D10"/&gt;
1223                &lt;row keys="C01-C09"/&gt;
1224                &lt;row keys="shift B01-B07 bksp"/&gt;
1225                &lt;row keys="sym A01 smilies A02-A03 enter"/&gt;
1226                &lt;switch iso="shift" layout="shift" display="&amp;#x21EA;"/&gt;
1227        &lt;/layer&gt;
1228        &lt;layer modifier="shift"&gt;
1229                &lt;row keys="D01-D10"/&gt;
1230                &lt;row keys="C01-C09"/&gt;
1231                &lt;row keys="shift B01-B07 bksp"/&gt;
1232                &lt;row keys="sym A01 smilies A02-A03 enter"/&gt;
1233                &lt;switch iso="shift" layout="none" display="&amp;#x21EA;"/&gt;
1234        &lt;/layer&gt;</pre>
1235    <hr>
1236    <h3>5.15 <a name="Element_vkeys" href="#Element_vkeys" id=
1237    "Element_vkeys">Element: vkeys</a></h3>
1238    <p>On some architectures, applications may directly interact
1239    with keys before they are converted to characters. The keys are
1240    identified using a virtual key identifier or vkey. The mapping
1241    between a physical keyboard key and a vkey is keyboard-layout
1242    dependent. For example, a French keyboard would identify the
1243    D01 key as being an 'a' with a vkey of 'a' as opposed to 'q' on
1244    a US English keyboard. While vkeys are layout dependent, they
1245    are not modifier dependent. A shifted key always has the same
1246    vkey as its unshifted counterpart. In effect, a key is
1247    identified by its vkey and the modifiers active at the time the
1248    key was pressed.</p>
1249    <p>For a physical keyboard there is a layout specific default
1250    mapping of keys to vkeys. These are listed in a vkeys element
1251    which takes a list of vkey element mappings and is identified
1252    by a type. There are different vkey mappings required for
1253    different platforms. While type="windows" vkeys are very
1254    similar to type="osx" vkeys, they are not identical and require
1255    their own mapping.</p>
1256    <p>The most common model for specifying vkeys is to import a
1257    standard mapping, say to the US layout, and then to add a vkeys
1258    element to change the mapping appropriately for the specific
1259    layout.</p>
    <p>In addition to describing physical keyboards, vkeys also get
    used in virtual keyboards. Here the vkey mapping is local to a
    layer and therefore a vkeys element may occur within a layer
    element. In the case where a layer element has no vkeys
    element, the resulting mapping may either be empty (none of
    the keys represent keys that have vkey identifiers) or may
    fall back to the layout-wide vkeys mapping. Fallback only occurs
    if the layer's modifier attribute consists only of standard
    modifiers as listed as being reserved in the description of the
    layer/@modifier attribute, and if the modifiers are standard
    for the platform involved. So for Windows, 'cmd' is a reserved
    modifier but it is not standard for Windows. Therefore on
    Windows the vkey mapping for a layer with @modifier="cmd"
    would be empty.</p>
1274    <p>A vkeys element consists of a list of vkey elements.</p>
1275    <hr>
1276    <h3>5.16 <a name="Element_vkey" href="#Element_vkey" id=
1277    "Element_vkey">Element: vkey</a></h3>
1278    <p>A vkey element describes a mapping between a key and a vkey
1279    for a particular platform.</p>
1280    <dl>
1281      <dt>Attribute: iso (required)</dt>
1282      <dd>The ISOkey being mapped.</dd>
1283    </dl>
1284    <dl>
1285      <dt>Attribute: type</dt>
1286      <dd>Current values: android, chromeos, osx, und,
1287      windows.</dd>
1288    </dl>
1289    <dl>
1290      <dt>Attribute: vkey (required)</dt>
1291      <dd>The resultant vkey identifier.</dd>
1292    </dl>
1293    <dl>
1294      <dt>Attribute: modifier</dt>
      <dd>This attribute may only be used if the parent vkeys
      element is a child of a layer element. If present it allows
      an unmodified key from a layer to represent a modified
      virtual key.</dd>
1299    </dl>
1300    <p>This example shows some of the mappings for a French
1301    keyboard layout:</p>
1302    <pre>   <i>shared/win-vkey.xml</i>
1303        &lt;keyboard&gt;
1304                &lt;vkeys type="windows"&gt;
1305                        &lt;vkey iso="D01" vkey="VK_Q"/&gt;
1306                        &lt;vkey iso="D02" vkey="VK_W"/&gt;
1307                        &lt;vkey iso="C01" vkey="VK_A"/&gt;
1308                        &lt;vkey iso="B01" vkey="VK_Z"/&gt;
1309                &lt;/vkeys&gt;
1310        &lt;/keyboard&gt;<br>
1311        <i>shared/win-fr.xml</i>
1312        &lt;keyboard&gt;
                &lt;import path="shared/win-vkey.xml"/&gt;
1314                &lt;keyMap&gt;
1315                        &lt;map iso="D01" to="a"/&gt;
1316                        &lt;map iso="D02" to="z"/&gt;
1317                        &lt;map iso="C01" to="q"/&gt;
1318                        &lt;map iso="B01" to="w"/&gt;
1319                &lt;/keyMap&gt;<br>
1320                &lt;keyMap modifiers="shift"&gt;
1321                        &lt;map iso="D01" to="A"/&gt;
1322                        &lt;map iso="D02" to="Z"/&gt;
1323                        &lt;map iso="C01" to="Q"/&gt;
1324                        &lt;map iso="B01" to="W"/&gt;
1325                &lt;/keyMap&gt;<br>
1326                &lt;vkeys type="windows"&gt;
1327                        &lt;vkey iso="D01" vkey="VK_A"/&gt;
1328                        &lt;vkey iso="D02" vkey="VK_Z"/&gt;
1329                        &lt;vkey iso="C01" vkey="VK_Q"/&gt;
1330                        &lt;vkey iso="B01" vkey="VK_W"/&gt;
1331                &lt;/vkeys&gt;
1332        &lt;/keyboard&gt;</pre>
1333    <p>In the context of a virtual keyboard there might be a symbol
1334    layer with the following layout:</p>
1335    <pre>   &lt;keyboard&gt;
1336                &lt;keyMap&gt;
1337                        &lt;map iso="D01" to="1"/&gt;
1338                        &lt;map iso="D02" to="2"/&gt;
1339                        ...
1340                        &lt;map iso="D09" to="9"/&gt;
1341                        &lt;map iso="D10" to="0"/&gt;
1342                        &lt;map iso="C01" to="!"/&gt;
1343                        &lt;map iso="C02" to="@"/&gt;
1344                        ...
1345                        &lt;map iso="C09" to="("/&gt;
1346                        &lt;map iso="C10" to=")"/&gt;
1347                &lt;/keyMap&gt;<br>
1348                &lt;layer modifier="sym"&gt;
1349                        &lt;row keys="D01-D10"/&gt;
1350                        &lt;row keys="C01-C09"/&gt;
1351                        &lt;row keys="shift B01-B07 bksp"/&gt;
1352                        &lt;row keys="sym A00-A03 enter"/&gt;
1353                        &lt;switch iso="sym" layout="none" display="ABC"/&gt;
1354                        &lt;switch iso="shift" layout="sym+shift" display="&amp;=/&lt;"/&gt;
1355                        &lt;vkeys type="windows"&gt;
1356                                &lt;vkey iso="D01" vkey="VK_1"/&gt;
1357                                ...
1358                                &lt;vkey iso="D10" vkey="VK_0"/&gt;
1359                                &lt;vkey iso="C01" vkey="VK_1" modifier="shift"/&gt;
1360                                ...
1361                                &lt;vkey iso="C10" vkey="VK_0" modifier="shift"/&gt;
1362                        &lt;/vkeys&gt;
1363                &lt;/layer&gt;
1364        &lt;/keyboard&gt;</pre>
1365    <hr>
1366    <h3>5.17 <a name="Element_transforms" href=
1367    "#Element_transforms" id="Element_transforms">Element:
1368    transforms</a></h3>
    <p>This element defines a group of one or more transform
    elements associated with this keyboard layout. It is used to
    support features such as dead-keys using a straightforward
    structure that works for all the keyboards tested, and that
    results in readable source data.</p>
    <p>There can be multiple &lt;transforms&gt; elements.</p>
1375    <p>Syntax</p>
1376    <p>&lt;transforms type="..."&gt;</p>
1377    <p>{a set of transform elements}</p>
1378    <p>&lt;/transforms&gt;</p>
1379    <dl>
1380      <dt>Attribute: type (required)</dt>
1381      <dd>Current values: simple, final.</dd>
1382    </dl>
1383    <hr>
1384    <h3>5.18 <a name="Element_transform" href="#Element_transform"
1385    id="Element_transform">Element: transform</a></h3>
    <p>This element must have the transforms element as its parent.
    It represents a single transform that may be performed using
    the keyboard layout. A transform is an element that specifies a
    set of conversions from sequences of code points into one (or
    more) other code points. For example, in most French keyboards
    hitting the "^" dead-key followed by the "e" key produces
    "ê".</p>
1393    <p>Syntax</p>
1394    <p>&lt;transform from="{combination of characters}"
1395    to="{output}"&gt;</p>
1396    <dl>
1397      <dt>Attribute: from (required)</dt>
1398      <dd>The from attribute consists of a sequence of elements.
1399      Each element matches one character and may consist of a
1400      codepoint or a UnicodeSet (both as defined in <a href=
1401      "https://www.unicode.org/reports/tr35/#Unicode_Sets">UTS#35
1402      section 5.3.3</a>).</dd>
1403    </dl>
1404    <p>For example, suppose there are the following transforms:</p>
1405    <blockquote>
1406      <p>^e → ê</p>
1407      <p>^a → â</p>
1408      <p>^o → ô</p>
1409    </blockquote>
1410    <p>If the user types a key that produces "^", the keyboard
1411    enters a dead state. When the user then types a key that
1412    produces an "e", the transform is invoked, and "ê" is output.
1413    Suppose a user presses keys producing "^" then "u". In this
1414    case, there is no match for the "^u", and the "^" is output if
1415    the failure attribute in the transform element is set to emit.
1416    If there is no transform starting with "u", then it is also
1417    output (again only if failure is set to emit) and the mechanism
1418    leaves the "dead" state.</p>
1419    <p>The UI may show an initial sequence of matching characters
1420    with a special format, as is done with dead-keys on the Mac,
1421    and modify them as the transform completes. This behavior is
1422    specified in the partial attribute in the transform
1423    element.</p>
1424    <p>Most transforms in practice have only a couple of
1425    characters. But for completeness, the behavior is defined on
1426    all strings:</p>
1427    <ol>
1428      <li>If there could be a longer match if the user were to type
1429      additional keys, go into a 'dead' state.</li>
1430      <li>If there could not be a longer match, find the longest
1431      actual match, emit the transformed text (if failure is set to
1432      emit), and start processing again with the remainder.</li>
1433      <li>If there is no possible match, output the first
1434      character, and start processing again with the
1435      remainder.</li>
1436    </ol>
    <p>Suppose that there are the following transforms:</p>
1438    <blockquote>
1439      <p>ab → x</p>
1440      <p>abc → y</p>
1441      <p>abef → z</p>
1442      <p>bc → m</p>
1443      <p>beq → n</p>
1444    </blockquote>
    <p>Here's what happens when the user types various sequences
    of characters:</p>
1447    <table>
1448      <!-- nocaption -->
1449      <tbody>
1450        <tr>
1451          <td>
1452            <p>Input characters</p>
1453          </td>
1454          <td>
1455            <p>Result</p>
1456          </td>
1457          <td>
1458            <p>Comments</p>
1459          </td>
1460        </tr>
1461        <tr>
1462          <td>
1463            <p>ab</p>
1464          </td>
1465          <td>&nbsp;</td>
1466          <td>
1467            <p>No output, since there is a longer transform with
1468            this as prefix.</p>
1469          </td>
1470        </tr>
1471        <tr>
1472          <td>
1473            <p>abc</p>
1474          </td>
1475          <td>
1476            <p>y</p>
1477          </td>
1478          <td>
1479            <p>Complete transform match.</p>
1480          </td>
1481        </tr>
1482        <tr>
1483          <td>
1484            <p>abd</p>
1485          </td>
1486          <td>
1487            <p>xd</p>
1488          </td>
1489          <td>
1490            <p>The longest match is "ab", so that is converted and
1491            output. The 'd' follows, since it is not the start of
1492            any transform.</p>
1493          </td>
1494        </tr>
1495        <tr>
1496          <td>
1497            <p>abeq</p>
1498          </td>
1499          <td>
1500            <p>xeq</p>
1501          </td>
1502          <td>
1503            <p>"ab" wins over "beq", since it comes first. That is,
1504            there is no longer possible match starting with
1505            'a'.</p>
1506          </td>
1507        </tr>
1508        <tr>
1509          <td>
1510            <p>bc</p>
1511          </td>
1512          <td>
1513            <p>m</p>
1514          </td>
1515          <td>&nbsp;</td>
1516        </tr>
1517      </tbody>
1518    </table>
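    <p>As a rough illustration only, the three rules above can be
    sketched in Python. This is not part of the specification: the
    function name <code>process_keys</code> and the dictionary
    representation are assumptions made for the example, every
    from value is treated as a literal string (no UnicodeSets, no
    @before or @after), and failure is assumed to be set to
    emit.</p>

```python
# Hypothetical sketch of the matching rules; not a normative algorithm.
SAMPLE = {"ab": "x", "abc": "y", "abef": "z", "bc": "m", "beq": "n"}


def process_keys(transforms, typed):
    """Feed characters one at a time; return (emitted text, pending buffer)."""
    output = []
    pending = ""
    for ch in typed:
        pending += ch
        while pending:
            # Rule 1: a longer match is still possible, so stay "dead".
            if any(src != pending and src.startswith(pending)
                   for src in transforms):
                break
            # Rule 2: no longer match is possible; use the longest actual match.
            matches = [src for src in transforms if pending.startswith(src)]
            if matches:
                best = max(matches, key=len)
                output.append(transforms[best])
                pending = pending[len(best):]
            else:
                # Rule 3: nothing can match here; pass the first character on.
                output.append(pending[0])
                pending = pending[1:]
    return "".join(output), pending
```

    <p>With the five sample transforms, feeding "abd" to this sketch
    yields "xd" with nothing pending, while "ab" yields no output
    and leaves "ab" pending, which agrees with the table above.</p>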
1519    <p>Control characters, combining marks and whitespace in this
1520    attribute are escaped using the \u{...} notation.</p>
1521    <dl>
1522      <dt>Attribute: to (required)</dt>
1523      <dd>This attribute represents the characters that are output
1524      from the transform. The output can contain more than one
1525      character, so you could have &lt;transform from="´A"
1526      to="Fred"/&gt;</dd>
1527    </dl>
1528    <p>Control characters, whitespace (other than the regular space
1529    character) and combining marks in this attribute are escaped
1530    using the \u{...} notation.</p>
1531    <p>Examples</p>
    <pre>
&lt;keyboard locale="fr-CA-t-k0-CSA-osx"&gt;
    &lt;transforms type="simple"&gt;
        &lt;transform from="´a" to="á" /&gt;
        &lt;transform from="´A" to="Á" /&gt;
        &lt;transform from="´e" to="é" /&gt;
        &lt;transform from="´E" to="É" /&gt;
        &lt;transform from="´i" to="í" /&gt;
        &lt;transform from="´I" to="Í" /&gt;
        &lt;transform from="´o" to="ó" /&gt;
        &lt;transform from="´O" to="Ó" /&gt;
        &lt;transform from="´u" to="ú" /&gt;
        &lt;transform from="´U" to="Ú" /&gt;
    &lt;/transforms&gt;
    ...
&lt;/keyboard&gt;
&lt;keyboard locale="nl-BE-t-k0-chromeos"&gt;
    &lt;transforms type="simple"&gt;
        &lt;transform from="\u{30c}a" to="ǎ" /&gt; &lt;!-- ̌a → ǎ --&gt;
        &lt;transform from="\u{30c}A" to="Ǎ" /&gt; &lt;!-- ̌A → Ǎ --&gt;
        &lt;transform from="\u{30a}a" to="å" /&gt; &lt;!-- ̊a → å --&gt;
        &lt;transform from="\u{30a}A" to="Å" /&gt; &lt;!-- ̊A → Å --&gt;
    &lt;/transforms&gt;
    ...
&lt;/keyboard&gt;</pre>
1534    <dl>
1535      <dt>Attribute: before (optional)</dt>
1536      <dd>This attribute consists of a sequence of elements
1537      (codepoint or UnicodeSet) to match the text up to the current
1538      position in the text (this is similar to a regex "look
1539      behind" assertion: (?&lt;=a)b matches a "b" that is preceded
1540      by an "a"). The attribute must match for the transform to
1541      apply. If missing, no before constraint is applied. The
1542      attribute value must not be empty.</dd>
1543    </dl>
1544    <dl>
1545      <dt>Attribute: after (optional)</dt>
1546      <dd>This attribute consists of a sequence of elements
1547      (codepoint or UnicodeSet) and matches as a zero-width
1548      assertion after the @from sequence. The attribute must match
1549      for the transform to apply. If missing, no after constraint
1550      is applied. The attribute value must not be empty. When the
1551      transform is applied, the string matched by the @from
1552      attribute is replaced by the string in the @to attribute,
1553      with the text matched by the @after attribute left unchanged.
1554      After the change, the current position is reset to just after
1555      the text output from the @to attribute and just before the
1556      text matched by the @after attribute. Warning: some legacy
1557      implementations may not be able to make such an adjustment
1558      and will place the current position after the @after matched
1559      string.</dd>
1560    </dl>
1561    <dl>
1562      <dt>Attribute: error (optional)</dt>
      <dd>If set, this attribute indicates that the keyboarding
1564      application may indicate an error to the user in some way.
1565      Processing may stop and rewind to any state before the key
1566      was pressed. If processing does stop, no further transforms
1567      on the same input are applied. The @error attribute takes the
1568      value "fail", or must be absent. If processing continues, the
1569      @to is used for output as normal. It thus should contain a
1570      reasonable value.</dd>
1571    </dl>
1572    <p>For example:</p>
1573    <blockquote>
1574      &lt;transform from="\u037A\u037A" to="\u037A" error="fail"
1575      /&gt;
1576    </blockquote>
1577    <p>This indicates that it is an error to type two iota
1578    subscripts immediately after each other.</p>
    <p>In terms of how these different attributes work in
    processing a sequence of transforms, consider the
    transform:</p>
1582    <blockquote>
      &lt;transform before="X" from="Y" after="Z" to="B"/&gt;
1584    </blockquote>
1585    <p>This would transform the string:</p>
1586    <blockquote>
1587      XYZ → XBZ
1588    </blockquote>
1589    <p>If we mark where the current match position is before and
1590    after the transform we see:</p>
1591    <blockquote>
1592      X | Y Z → X B | Z
1593    </blockquote>
1594    <p>And a subsequent transform could transform the Z string,
1595    looking back (using @before) to match the B.</p>
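    <p>The interaction of the context attributes and the current
    position can be sketched as follows. This is a minimal Python
    illustration under stated assumptions: the helper name
    <code>apply_transform</code> is hypothetical, all attribute
    values are literal strings rather than UnicodeSets, and the
    example uses "X" as the preceding context, "Y" as the matched
    text, "Z" as the following context and "B" as the output.</p>

```python
# Hypothetical helper; literal strings only, no UnicodeSet support.
def apply_transform(text, pos, before, src, after, to):
    """Try the transform at index pos; return (new_text, new_pos) or None."""
    if not text.startswith(src, pos):
        return None
    # The before context must end exactly at the current position.
    if before and not text.endswith(before, 0, pos):
        return None
    end = pos + len(src)
    # The after context is a zero-width assertion following the match.
    if after and not text.startswith(after, end):
        return None
    new_text = text[:pos] + to + text[end:]
    # The position moves to just after the output, before the after context.
    return new_text, pos + len(to)
```

    <p>Calling <code>apply_transform("XYZ", 1, "X", "Y", "Z",
    "B")</code> returns the new text "XBZ" and the position 2, which
    is the point between "B" and "Z" in the diagram above.</p>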
1596    <p>There are other keying behaviors that are needed
1597    particularly in handling languages and scripts from various
1598    parts of the world. The behaviors intended to be covered by the
1599    transforms are:</p>
1600    <ul>
1601      <li>Reordering combining marks. The order required for
1602      underlying storage may differ considerably from the desired
1603      typing order. In addition, a keyboard may want to allow for
1604      different typing orders.</li>
1605      <li>Error indication. Sometimes a keyboard layout will want
1606      to specify to the application that a particular keying
1607      sequence in a context is in error and that the application
1608      should indicate that that particular keypress is
1609      erroneous.</li>
1610      <li>Backspace handling. There are various approaches to
1611      handling the backspace key. An application may treat it as an
1612      undo of the last key input, or it may simply delete the last
1613      character in the currently output text, or it may use
1614      transform rules to tell it how much to delete.</li>
1615    </ul>
1616    <p>We consider each transform type in turn and consider
1617    attributes to the &lt;transforms&gt; element pertinent to that
1618    type.</p>
1619    <hr>
1620    <h3>5.19 <a name="Element_reorder" href="#Element_reorder" id=
1621    "Element_reorder">Element: reorder</a></h3>
    <p>The reorder transform is applied after all transforms except
    for those with type=“final”.</p>
1624    <p>This transform has the job of reordering sequences of
1625    characters that have been typed, from their typed order to the
1626    desired output order. The primary concern in this transform is
1627    to sort combining marks into their correct relative order after
    a base, as described in this section. Because reorder transforms
    can be quite complex, keyboard layouts will almost always
    import them.</p>
1631    <p>The reordering algorithm consists of four parts:</p>
1632    <ol>
      <li>Create a sort key for each character in the input string.
      A sort key has 4 parts: (primary, index, tertiary, quaternary).
1635        <ul>
1636          <li>The <b>primary weight</b> is the primary order
1637          value.</li>
1638          <li>The <b>secondary weight</b> is the index, a position
1639          in the input string, usually of the character itself, but
1640          it may be of a character earlier in the string.</li>
1641          <li>The <b>tertiary weight</b> is a tertiary order value
1642          (defaulting to 0).</li>
1643          <li>The <b>quaternary weight</b> is the index of the
1644          character in the string. This ensures a stable sort for
1645          sequences of characters with the same tertiary
1646          weight.</li>
1647        </ul>
1648      </li>
1649      <li>Mark each character as to whether it is a prebase
1650      character, one that is typed before the base and logically
1651      stored after. Thus it will have a primary order &gt; 0.</li>
1652      <li>Use the sort key and the prebase mark to identify runs. A
1653      run starts with a prefix that contains any prebase characters
1654      and a single base character whose primary and tertiary key is
1655      0. The run extends until, but not including, the start of the
1656      prefix of the next run or end of the string.
1657        <ul>
1658          <li>run := prebase* (primary=0 &amp;&amp; tertiary=0)
1659          ((primary≠0 || tertiary≠0) &amp;&amp; !prebase)*</li>
1660        </ul>
1661      </li>
1662      <li>Sort the character order of each character in the run
1663      based on its sort key.</li>
1664    </ol>
    <p>The primary order of a character with the Unicode property
    Canonical_Combining_Class (ccc) of 0 may well not be 0. In
    addition, a character may receive a different primary order
    dependent on context. For example, in the Devanagari sequence
    ka halant ka, the first ka would have a primary order 0 while
    the halant ka sequence would give both halant and the second ka
    a primary order &gt; 0, for example 2. Note that a “base”
    character in this discussion is not a Unicode base character.
    It is instead a character with primary=0.</p>
1674    <p>In order to get the characters into the correct relative
1675    order, it is necessary not only to order combining marks
1676    relative to the base character, but also to order some
1677    combining marks in a subsequence following another combining
    mark. For example, in Devanagari, a nukta may follow a consonant
1679    character, but it may also follow a conjunct consisting of a
1680    consonant, halant, consonant. Notice that the second consonant
1681    is not, in this model, the start of a new run because some
1682    characters may need to be reordered to before the first base,
1683    for example repha. The repha would get primary &lt; 0, and be
1684    sorted before the character with order = 0, which is, in the
1685    case of Devanagari, the initial consonant of the orthographic
1686    syllable.</p>
1687    <p>The reorder transform consists of a single element type:
1688    &lt;reorder&gt; encapsulated in a &lt;reorders&gt; element.
1689    Each is a rule that matches against a string of characters with
1690    the action of setting the various ordering attributes (primary,
1691    tertiary, tertiary_base, prebase) for the matched characters in
1692    the string.</p>
1693    <blockquote>
1694      <p><strong>from</strong> This attribute follows the
1695      transform/@from attribute and contains a string of elements.
1696      Each element matches one character and may consist of a
1697      codepoint or a UnicodeSet (both as defined in UTS#35 section
1698      5.3.3). This attribute is required.</p>
1699      <p><strong>before</strong> This attribute follows the
1700      transform/@before attribute and contains the element string
1701      that must match the string immediately preceding the start of
1702      the string that the @from matches.</p>
1703      <p><strong>after</strong> This attribute follows the
1704      transform/@after attribute and contains the element string
1705      that must match the string immediately following the end of
1706      the string that the @from matches.</p>
1707      <p><strong>order</strong> This attribute gives the primary
1708      order for the elements in the matched string in the @from
1709      attribute. The value is a simple integer between -128 and
1710      +127 inclusive, or a space separated list of such integers.
1711      For a single integer, it is applied to all the elements in
1712      the matched string. Details of such list type attributes are
1713      given after all the attributes are described. If missing, the
1714      order value of all the matched characters is 0. We consider
1715      the order value for a matched character in the string.</p>
1716      <ul>
1717        <li>If the value is 0 and its tertiary value is 0, then the
1718        character is the base of a new run.</li>
1719        <li>If the value is 0 and its tertiary value is non-zero,
1720        then it is a normal character in a run, with ordering
1721        semantics as described in the @tertiary attribute.</li>
1722        <li>If the value is negative, then the character is a
1723        primary character and will reorder to be before the base of
1724        the run.</li>
1725        <li>If the value is positive, then the character is a
1726        primary character and is sorted based on the order value as
1727        the primary key following a previous base character.</li>
1728      </ul>
1729      <p>A character with a zero tertiary value is a primary
1730      character and receives a sort key consisting of:</p>
1731      <ul>
1732        <li>Primary weight is the order value</li>
1733        <li>Secondary weight is the index of the character. This
1734        may be any value (character index, codepoint index) such
1735        that its value is greater than the character before it and
1736        less than the character after it.</li>
1737        <li>Tertiary weight is 0.</li>
1738        <li>Quaternary weight is the same as the secondary
1739        weight.</li>
1740      </ul>
1741      <p><strong>tertiary</strong> This attribute gives the
1742      tertiary order value to the characters matched. The value is
1743      a simple integer between -128 and +127 inclusive, or a space
1744      separated list of such integers. If missing, the value for
1745      all the characters matched is 0. We consider the tertiary
1746      value for a matched character in the string.</p>
1747      <ul>
1748        <li>If the value is 0 then the character is considered to
1749        have a primary order as specified in its order value and is
1750        a primary character.</li>
1751        <li>If the value is non zero, then the order value must be
1752        zero otherwise it is an error. The character is considered
1753        as a tertiary character for the purposes of ordering.</li>
1754      </ul>
1755      <p>A tertiary character receives its primary order and index
1756      from a previous character, which it is intended to sort
1757      closely after. The sort key for a tertiary character consists
1758      of:</p>
1759      <ul>
1760        <li>Primary weight is the primary weight of the primary
1761        character</li>
1762        <li>Secondary weight is the index of the primary character,
1763        not the tertiary character</li>
1764        <li>Tertiary weight is the tertiary value for the
1765        character.</li>
1766        <li>Quaternary weight is the index of the tertiary
1767        character.</li>
1768      </ul>
1769      <p><strong>tertiary_base</strong> This attribute is a space
1770      separated list of "true" or "false" values corresponding to
1771      each character matched. It is illegal for a tertiary
1772      character to have a true tertiary_base value. For a primary
1773      character it marks that this character may have tertiary
1774      characters moved after it. When calculating the secondary
1775      weight for a tertiary character, the most recently
1776      encountered primary character with a true tertiary_base
1777      attribute is used. Primary characters with an @order value of
1778      0 automatically are treated as having tertiary_base true
1779      regardless of what is specified for them.</p>
1780      <p><strong>prebase</strong> This attribute gives the prebase
1781      attribute for each character matched. The value may be "true"
1782      or "false" or a space separated list of such values. If
1783      missing the value for all the characters matched is false. It
1784      is illegal for a tertiary character to have a true prebase
1785      value.</p>
1786      <p>If a primary character has a true prebase value then the
1787      character is marked as being typed before the base character
1788      of a run, even though it is intended to be stored after it.
1789      The primary order gives the intended position in the order
1790      after the base character, that the prebase character will end
      up. Thus the primary order (@order) may not be 0. These characters are part of
1792      the run prefix. If such characters are typed then, in order
1793      to give the run a base character after which characters can
1794      be sorted, an appropriate base character, such as a dotted
1795      circle, is inserted into the output run, until a real base
1796      character has been typed. A value of "false" indicates that
1797      the character is not a prebase.</p>
1798    </blockquote>
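    <p>To make the four-part sort key concrete, here is a small
    Python sketch of how primary and tertiary characters could
    receive their keys. It is an illustration under assumptions, not
    a reference implementation: the <code>Char</code> record and
    the functions <code>sort_keys</code> and <code>sort_run</code>
    are hypothetical names, the order values are assumed to have
    been assigned already, prebase handling is omitted, and the
    letters in the usage note are placeholders rather than real
    combining marks.</p>

```python
from typing import NamedTuple


class Char(NamedTuple):
    """Hypothetical record of one character and its ordering attributes."""
    text: str
    order: int = 0
    tertiary: int = 0
    tertiary_base: bool = False


def sort_keys(chars):
    """Build (primary, secondary, tertiary, quaternary) for each character."""
    keys = []
    base_primary, base_index = 0, 0
    for index, ch in enumerate(chars):
        if ch.tertiary != 0 and ch.order != 0:
            raise ValueError("a tertiary character must have order 0")
        if ch.tertiary == 0:
            # Primary character: its own order value and its own index.
            keys.append((ch.order, index, 0, index))
            # Order-0 characters always accept tertiary characters.
            if ch.tertiary_base or ch.order == 0:
                base_primary, base_index = ch.order, index
        else:
            # Tertiary character: borrow primary weight and index from the
            # most recent primary character that accepts tertiaries.
            keys.append((base_primary, base_index, ch.tertiary, index))
    return keys


def sort_run(chars):
    """Sort one run of characters by sort key and return the text."""
    ranked = sorted(zip(sort_keys(chars), chars), key=lambda pair: pair[0])
    return "".join(ch.text for _, ch in ranked)
```

    <p>For a run made of a base "B", a mark "x" with order 10 and
    tertiary_base true, a tertiary character "t" with tertiary 1,
    and a mark "y" with order 5, the sketch sorts the run to "Byxt":
    "y" moves ahead of "x", and "t" stays attached to "x" because it
    shares the primary weight and index of "x".</p>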
1799    <p>There is no @error attribute.</p>
1800    <p>For @from attributes with a match string length greater than
1801    1, the sort key information (@order, @tertiary, @tertiary_base,
1802    @prebase) may consist of a space separated list of values, one
1803    for each element matched. The last value is repeated to fill
1804    out any missing values. Such a list may not contain more values
1805    than there are elements in the @from attribute:</p>
    <pre>  if len(@from) &lt; len(@list) then
    error
  else
    while len(@from) &gt; len(@list)
      append lastitem(@list) to @list
    endwhile
  endif</pre>
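    <p>The same padding rule can be written as a short Python
    function. This is only a sketch: the name
    <code>expand_values</code> and the use of a plain element count
    for the length of @from are assumptions made for the
    example.</p>

```python
# Hypothetical helper mirroring the pseudocode above.
def expand_values(values, match_length):
    """Pad a space separated attribute value to one entry per element."""
    items = values.split()
    if len(items) > match_length:
        raise ValueError("more values than elements in @from")
    # Repeat the last value to fill out any missing entries.
    return items + items[-1:] * (match_length - len(items))
```

    <p>For instance, <code>expand_values("10", 2)</code> gives two
    entries of "10", and a list longer than the match raises an
    error.</p>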
    <p>For example, consider the Northern Thai (nod-Lana) word:
    ᨡ᩠ᩅᩫ᩶ 'roasted'. This is ideally encoded as the
    following:</p>
1812    <table class='simple'>
1813      <tr>
1814        <th>name</th>
1815        <td><em>ka</em></td>
1816        <td><em>asat</em></td>
1817        <td><em>wa</em></td>
1818        <td><em>o</em></td>
1819        <td><em>t2</em></td>
1820      </tr>
1821      <tr>
1822        <th>code</th>
1823        <td>1A21</td>
1824        <td>1A60</td>
1825        <td>1A45</td>
1826        <td>1A6B</td>
1827        <td>1A76</td>
1828      </tr>
1829      <tr>
1830        <th>ccc</th>
1831        <td>0</td>
1832        <td>9</td>
1833        <td>0</td>
1834        <td>0</td>
1835        <td>230</td>
1836      </tr>
1837    </table>
1838    <p>(That sequence is already in NFC format.)</p>
1839    <p>Some users may type the upper component of the vowel first,
1840    and the tone before or after the lower component. Thus someone
1841    might type it as:</p>
1842    <table class='simple'>
1843      <tr>
1844        <th>name</th>
1845        <td><em>ka</em></td>
1846        <td><em>o</em></td>
1847        <td><em>t2</em></td>
1848        <td><em>asat</em></td>
1849        <td><em>wa</em></td>
1850      </tr>
1851      <tr>
1852        <th>code</th>
1853        <td>1A21</td>
1854        <td>1A6B</td>
1855        <td>1A76</td>
1856        <td>1A60</td>
1857        <td>1A45</td>
1858      </tr>
1859      <tr>
1860        <th>ccc</th>
1861        <td>0</td>
1862        <td>0</td>
1863        <td>230</td>
1864        <td>9</td>
1865        <td>0</td>
1866      </tr>
1867    </table>
1868    <p>The Unicode NFC format of that typed value reorders to:</p>
1869    <table class='simple'>
1870      <tr>
1871        <th>name</th>
1872        <td><em>ka</em></td>
1873        <td><em>o</em></td>
1874        <td><em>asat</em></td>
1875        <td><em>t2</em></td>
1876        <td><em>wa</em></td>
1877      </tr>
1878      <tr>
1879        <th>code</th>
1880        <td>1A21</td>
1881        <td>1A6B</td>
1882        <td>1A60</td>
1883        <td>1A76</td>
1884        <td>1A45</td>
1885      </tr>
1886      <tr>
1887        <th>ccc</th>
1888        <td>0</td>
1889        <td>0</td>
1890        <td>9</td>
1891        <td>230</td>
1892        <td>0</td>
1893      </tr>
1894    </table>
1895    <p>Finally, the user might also type in the sequence with the
1896    tone <em>after</em> the lower component.</p>
1897    <table class='simple'>
1898      <tr>
1899        <th>name</th>
1900        <td><em>ka</em></td>
1901        <td><em>o</em></td>
1902        <td><em>asat</em></td>
1903        <td><em>wa</em></td>
1904        <td><em>t2</em></td>
1905      </tr>
1906      <tr>
1907        <th>code</th>
1908        <td>1A21</td>
1909        <td>1A6B</td>
1910        <td>1A60</td>
1911        <td>1A45</td>
1912        <td>1A76</td>
1913      </tr>
1914      <tr>
1915        <th>ccc</th>
1916        <td>0</td>
1917        <td>0</td>
1918        <td>9</td>
1919        <td>0</td>
1920        <td>230</td>
1921      </tr>
1922    </table>
1923    <p>(That sequence is already in NFC format.)</p>
1924    <p>We want all of these sequences to end up ordered as the
1925    first. To do this, we use the following rules:</p>
    <pre>
  &lt;reorder from="\u1A60" order="127"/&gt;      &lt;!-- max possible order --&gt;
  &lt;reorder from="\u1A6B" order="42"/&gt;
  &lt;reorder from="[\u1A75-\u1A7C]" order="55"/&gt;
  &lt;reorder before="\u1A6B" from="\u1A60\u1A45" order="10"/&gt;
  &lt;reorder before="\u1A6B[\u1A75-\u1A7C]" from="\u1A60\u1A45" order="10"/&gt;
  &lt;reorder before="\u1A6B" from="\u1A60[\u1A75-\u1A7C]\u1A45" order="10 55 10"/&gt;</pre>
1930    <p>The first reorder is the default ordering for the
1931    <i>asat</i> which allows for it to be placed anywhere in a
1932    sequence, but moves any non-consonants that may immediately
1933    follow it, back before it in the sequence. The next two rules
1934    give the orders for the top vowel component and tone marks
1935    respectively. The next three rules give the <i>asat</i> and
1936    <i>wa</i> characters a primary order that places them before
1937    the <em>o</em>. Notice particularly the final reorder rule
1938    where the <i>asat</i>+<i>wa</i> is split by the tone mark. This
1939    rule is necessary in case someone types into the middle of
1940    previously normalized text.</p>
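    <p>The effect of these six rules on the sequences above can be
    checked with a small Python sketch. It rests on several
    assumptions that are not stated in this specification: rules
    are tried at each position from left to right, the first
    matching rule in priority order (longest @from, then longest
    context) wins, UnicodeSets are written as regular expression
    character classes, and only non-negative primary orders are
    handled (no tertiary or prebase values). The names
    <code>RULES</code>, <code>assign_orders</code> and
    <code>reorder</code> are hypothetical.</p>

```python
import re

ASAT, WA, O = "\u1A60", "\u1A45", "\u1A6B"
TONE = "[\u1A75-\u1A7C]"  # UnicodeSet written as a regex character class

# (before elements, from elements, order list) for the six rules above.
RULES = [
    ([], [ASAT], [127]),
    ([], [O], [42]),
    ([], [TONE], [55]),
    ([O], [ASAT, WA], [10]),
    ([O, TONE], [ASAT, WA], [10]),
    ([O], [ASAT, TONE, WA], [10, 55, 10]),
]
# Assumed priority: longest @from first, then longest context.
RULES.sort(key=lambda r: (len(r[1]), len(r[0])), reverse=True)


def assign_orders(text):
    """Return the primary order for each character of text."""
    orders = []
    i = 0
    while i < len(text):
        for before, src, values in RULES:
            m = re.compile("".join(src)).match(text, i)
            if m and re.search("(?:" + "".join(before) + r")\Z", text[:i]):
                # Repeat the last value to cover every matched element.
                orders.extend(values + values[-1:] * (len(src) - len(values)))
                i = m.end()
                break
        else:
            orders.append(0)  # unmatched characters default to order 0
            i += 1
    return orders


def reorder(text):
    """Sort each run by (primary order, index); order 0 starts a new run."""
    out, run = [], []
    for index, (ch, order) in enumerate(zip(text, assign_orders(text))):
        if order == 0 and run:
            out.extend(c for _, _, c in sorted(run))
            run = []
        run.append((order, index, ch))
    out.extend(c for _, _, c in sorted(run))
    return "".join(out)
```

    <p>Under these assumptions, the three typed sequences shown
    above (1A21 1A6B 1A76 1A60 1A45, 1A21 1A6B 1A60 1A76 1A45 and
    1A21 1A6B 1A60 1A45 1A76) all come out as 1A21 1A60 1A45 1A6B
    1A76, and the ideal sequence itself is left unchanged because
    the <i>wa</i> then has order 0 and starts a new run.</p>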
1941    <p>&lt;reorder&gt; elements are priority ordered based first on
1942    the length of string their @from attribute matches and then the
1943    sum of the lengths of the strings their @before and @after
1944    attributes match.</p>
    <p>If a layout has two &lt;transforms&gt; elements of type
    reorder, e.g. from importing one and specifying the second,
    then their &lt;reorder&gt; elements are merged. The @from string in
1948    a &lt;reorder&gt; element describes a set of strings that it
1949    matches. This also holds for the @before and @after attributes.
1950    The intersection of two &lt;reorder&gt; elements consists of
1951    the intersections of their @from, @before and @after string
1952    sets. It is illegal for the intersection between any two
1953    &lt;reorder&gt; elements in the same &lt;transforms&gt; element
1954    to be non empty, although implementors are encouraged to have
1955    pity on layout authors when reporting such errors, since they
1956    can be hard to track down.</p>
1957    <p>If two &lt;reorder&gt; elements in two different
1958    &lt;transforms&gt; elements have a non empty intersection, then
1959    they are split and merged. They are split such that where there
1960    were two &lt;reorder&gt; elements, there are, in effect (but
1961    not actuality), three elements consisting of:</p>
1962    <ul>
1963      <li>@from, @before, @after that match the intersection of the
1964      two rules. The other attributes are merged, as described
1965      below.</li>
1966      <li>@from, @before, @after that match the set of strings in
1967      the first rule not in the intersection with the other
1968      attributes from the first rule.</li>
1969      <li>@from, @before, @after that match the set of strings in
1970      the second rule not in the intersection, with the other
1971      attributes from the second rule.</li>
1972    </ul>
1973    <p>When merging the other attributes, the second rule is taken
1974    to have priority (occurring later in the layout description
1975    file). Where the second rule does not define the value for a
1976    character but the first does, it is taken from the first rule,
1977    otherwise it is taken from the second rule.</p>
1978    <p>Notice that it is possible for two rules to match the same
1979    string, but for them not to merge because the distribution of
1980    the string across @before, @from, and @after is different. For
1981    example:</p>
1982    <pre> &lt;reorder before="ab" from="cd" after="e"/&gt;</pre>
1983    <p>would not merge with:</p>
1984    <pre> &lt;reorder before="a" from="bcd" after="e"/&gt;</pre>
1985    <p>When two &lt;reorders&gt; elements merge as the result of an
1986    import, the resulting reorder elements are sorted into priority
1987    order for matching.</p>
1988    <p>Consider this fragment from a shared reordering for the
1989    Myanmar script:</p>
    <pre>&lt;!-- medial-r --&gt;
  &lt;reorder from="\u103C" order="20"/&gt;

&lt;!-- [medial-wa or shan-medial-wa] --&gt;
  &lt;reorder from="[\u103D\u1082]" order="25"/&gt;

&lt;!-- [medial-ha or shan-medial-wa]+asat = Mon <i>asat</i> --&gt;
  &lt;reorder from="[\u103E\u1082]\u103A" order="27"/&gt;

&lt;!-- [medial-ha or mon-medial-wa] --&gt;
  &lt;reorder from="[\u103E\u1060]" order="27"/&gt;

&lt;!-- [e-vowel or shan-e-vowel] --&gt;
  &lt;reorder from="[\u1031\u1084]" order="30"/&gt;

  &lt;reorder from="[\u102D\u102E\u1033-\u1035\u1071-\u1074\u1085\u109D\uA9E5]" order="35"/&gt;</pre>
2002    <p>A particular Myanmar keyboard layout can have this reorders
2003    element:</p>
    <pre>&lt;reorders type="reorder"&gt;
&lt;!-- Kinzi --&gt;
  &lt;reorder from="\u1004\u103A\u1039" order="-1"/&gt;

&lt;!-- e-vowel --&gt;
  &lt;reorder from="\u1031" prebase="1"/&gt;

&lt;!-- medial-r --&gt;
  &lt;reorder from="\u103C" prebase="1"/&gt;
&lt;/reorders&gt;</pre>
    <p>The effect of this is that the <em>e-vowel</em> will be
2013    identified as a prebase and will have an order of 30. Likewise
2014    a <em>medial-r</em> will be identified as a prebase and will
2015    have an order of 20. Notice that a <em>shan-e-vowel</em> will
2016    not be identified as a prebase (even if it should be!). The
2017    <em>kinzi</em> is described in the layout since it moves
2018    something across a run boundary. By separating such movements
2019    (prebase or moving to in front of a base) from the shared
2020    ordering rules, the shared ordering rules become a
2021    self-contained combining order description that can be used in
2022    other keyboards or even in other contexts than keyboarding.</p>
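    <p>The effect described above can be sketched in a few lines of
    Python. This is a simplified illustration, not the normative merge
    algorithm: it assumes every rule has a single-character @from and
    no @before or @after, so the kinzi rule is left out. The names
    merge_char_attrs, SHARED and LAYOUT are hypothetical.</p>

```python
# Per-character attributes taken from the shared Myanmar fragment
# (only the characters discussed in the text are listed).
SHARED = {
    "\u103C": {"order": 20},  # medial-r
    "\u1031": {"order": 30},  # e-vowel
    "\u1084": {"order": 30},  # shan-e-vowel
}

# Per-character attributes taken from the layout's own reorders element.
LAYOUT = {
    "\u1031": {"prebase": 1},  # e-vowel
    "\u103C": {"prebase": 1},  # medial-r
}


def merge_char_attrs(first, second):
    """Merge attributes per character; the later (second) rule wins on conflicts."""
    merged = {ch: dict(attrs) for ch, attrs in first.items()}
    for ch, attrs in second.items():
        merged.setdefault(ch, {}).update(attrs)
    return merged


MERGED = merge_char_attrs(SHARED, LAYOUT)
```

    <p>In this sketch the e-vowel ends up with both an order of 30 and
    the prebase flag, while the shan-e-vowel keeps only its order,
    which matches the behaviour described in the text.</p>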
2023    <hr>
2024    <h3>5.20 <a name="Element_final" href="#Element_final" id=
2025    "Element_final">Element: final</a></h3>
2026    <p>The final transform is applied after the reorder transform.
2027    It executes in a similar way to the simple transform with the
2028    settings ignored, as if there were no settings in the
2029    &lt;settings&gt; element.</p>
2030    <p>This is an example from Khmer where split vowels are
2031    combined after reordering.</p>
2032    <pre>
2033  &lt;transforms type="final"&gt;
2034    &lt;transform from="\u17C1\u17B8" to="\u17BE"/&gt;
2035    &lt;transform from="\u17C1\u17B6" to="\u17C4"/&gt;
2036  &lt;/transforms&gt;</pre>
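    <p>As a rough illustration, the two Khmer rules could be applied
    with a left-to-right scan like the one below. The sketch assumes
    literal @from strings only (no sets, @before or @after), and the
    names FINAL_RULES and apply_final are made up for the example.</p>

```python
# The two Khmer rules from the example, as (from, to) pairs.
FINAL_RULES = [
    ("\u17C1\u17B8", "\u17BE"),
    ("\u17C1\u17B6", "\u17C4"),
]


def apply_final(text, rules=FINAL_RULES):
    """Scan left to right, applying the first rule that matches at each position."""
    out = []
    i = 0
    while i < len(text):
        for src, dst in rules:
            if text.startswith(src, i):
                out.append(dst)
                i += len(src)
                break
        else:
            # No rule matched here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

    <p>For instance, apply_final("\u1780\u17C1\u17B8") yields the
    consonant followed by the single combined vowel \u17BE.</p>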
2037    <p>Another example allows a keyboard implementation to alert or
2038    stop people typing two lower vowels in a Burmese cluster:</p>
2039    <pre>
2040    &lt;transform from="[\u102F\u1030\u1048\u1059][\u102F\u1030\u1048\u1059]" error="fail"/&gt;</pre>
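    <p>One way an implementation might detect this condition is shown
    below. It is only a sketch: it treats the @from value as a Python
    regular expression, and what the implementation does on a match
    (alerting, or refusing the keystroke) is left open, as in the text
    above. The name check_final is hypothetical.</p>

```python
import re

# The @from value of the error rule, used directly as a regular expression.
TWO_LOWER_VOWELS = re.compile(
    "[\u102F\u1030\u1048\u1059][\u102F\u1030\u1048\u1059]"
)


def check_final(text):
    """Return "fail" if the error rule matches anywhere in text, else None."""
    return "fail" if TWO_LOWER_VOWELS.search(text) else None
```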
2041    <hr>
2042    <h3>5.21 <a name="Element_backspaces" href=
2043    "#Element_backspaces" id="Element_backspaces">Element:
2044    backspaces</a></h3>
2045    <p>The backspace transform is an optional transform that is not
2046    applied on input of normal characters, but is only used to
2047    perform extra backspace modifications to previously committed
2048    text.</p>
    <p>Keyboarding applications typically work, but are not required
    to work, in one of two modes:</p>
2051    <dl>
2052      <dt><b>text entry</b></dt>
2053      <dd>text entry happens while a user is typing new text. A
2054      user typically wants the backspace key to undo whatever they
2055      last typed, whether or not they typed things in the 'right'
2056      order.</dd>
2057    </dl>
2058    <dl>
2059      <dt><b>text editing</b></dt>
2060      <dd>text editing happens when a user moves the cursor into
2061      some previously entered text which may have been entered by
2062      someone else. As such, there is no way to know in which order
2063      things were typed, but a user will still want appropriate
2064      behaviour when they press backspace. This may involve
2065      deleting more than one character or replacing a sequence of
2066      characters with a different sequence.</dd>
2067    </dl>
2068    <p>In the text entry mode, there is no need for any special
2069    description of backspace behaviour. A keyboarding application
2070    will typically keep a history of previous output states and
2071    just revert to the previous state when backspace is hit.</p>
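    <p>A minimal sketch of such a history, assuming the application
    records the whole output string after every keystroke. The class
    name EntryBuffer and its methods are illustrative only, and the
    behaviour with an empty history is an assumption.</p>

```python
class EntryBuffer:
    """Keeps previous output states so backspace can undo the last keystroke."""

    def __init__(self):
        self.text = ""
        self._history = []

    def commit(self, new_text):
        """Record the current state, then move to the new output state."""
        self._history.append(self.text)
        self.text = new_text

    def backspace(self):
        """Revert to the previous state; with no history, drop one code point."""
        if self._history:
            self.text = self._history.pop()
        else:
            self.text = self.text[:-1]
        return self.text
```

    <p>Because the previous state is restored as a whole, a keystroke
    that produced several characters, or that rewrote earlier ones, is
    undone in a single step.</p>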
2072    <p>In text editing mode, different keyboard layouts may behave
2073    differently in the same textual context. The backspace
2074    transform allows the keyboard layout to specify the effect of
2075    pressing backspace in a particular textual context. This is
2076    done by specifying a set of backspace rules that match a string
2077    before the cursor and replace it with another string. The rules
2078    are expressed as backspace elements encapsulated in a
2079    backspaces element.</p>
2080    <hr>
2081    <h3>5.22 <a name="Element_backspace" href="#Element_backspace"
2082    id="Element_backspace">Element: backspace</a></h3>
    <p>The backspace element has the same @before, @from, @after,
    @to, and @error attributes as the transform element. The @to is
    optional with backspace.</p>
2086    <p>For example, consider deleting a Devanagari ksha:</p>
2087    <pre>
2088        &lt;backspaces&gt;
2089                &lt;backspace from="\u0915\u094D\u0936"/&gt;
2090        &lt;/backspaces&gt;</pre>
2091    <p>Here there is no @to attribute since the whole string is
2092    being deleted. This is not uncommon in the backspace
2093    transforms.</p>
2094    <p>A more complex example comes from a Burmese visually ordered
2095    keyboard:</p>
    <pre>&lt;backspaces&gt;
&lt;!-- Kinzi --&gt;
  &lt;backspace from="[\u1004\u101B\u105A]\u103A\u1039"/&gt;

&lt;!-- subjoined consonant --&gt;
  &lt;backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021\u1050\u1051\u105A-\u105D]"/&gt;

&lt;!-- tone mark --&gt;
  &lt;backspace from="\u102B\u103A"/&gt;

&lt;!-- Handle prebases --&gt;
&lt;!-- diacritics stored before e-vowel --&gt;
  &lt;backspace from="[\u103A-\u103F\u105E-\u1060\u1082]\u1031" to="\u1031"/&gt;

&lt;!-- diacritics stored before medial r --&gt;
  &lt;backspace from="[\u103A-\u103B\u105E-\u105F]\u103C" to="\u103C"/&gt;

&lt;!-- subjoined consonant before e-vowel --&gt;
  &lt;backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u1031" to="\u1031"/&gt;

&lt;!-- base consonant before e-vowel --&gt;
  &lt;backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u1031" to="\uFDDF\u1031"/&gt;

&lt;!-- subjoined consonant before medial r --&gt;
  &lt;backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u103C" to="\u103C"/&gt;

&lt;!-- base consonant before medial r --&gt;
  &lt;backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u103C" to="\uFDDF\u103C"/&gt;

&lt;!-- delete lone medial r or e-vowel --&gt;
  &lt;backspace from="\uFDDF[\u1031\u103C]"/&gt;
&lt;/backspaces&gt;</pre>
2116    <p>The above example is simplified, and doesn't fully handle
2117    the interaction between medial-r and e-vowel.</p>
2118    <p>The character \uFDDF does not represent a literal character,
2119    but is instead a special placeholder, a "filler string". When a
2120    keyboard implementation handles a user pressing a key that
2121    inserts a prebase character, it also has to insert a special
2122    filler string before the prebase to ensure that the prebase
2123    character does not combine with the previous cluster. See the
2124    reorder transform for details. The precise filler string is
2125    implementation dependent. Rather than requiring keyboard layout
2126    designers to know what the filler string is, we reserve a
2127    special character that the keyboard layout designer may use to
2128    reference this filler string. It is up to the keyboard
2129    implementation to, in effect, replace that character with the
2130    filler string.</p>
    <p>The first three transforms above delete various ligatures
    with a single keypress. The other transforms handle the prebase
    characters, of which there are two in this Burmese keyboard.
    These transforms delete the characters preceding the prebase
    character up to the base, which is replaced with the prebase
    filler string representing a null base. Finally, the prebase
    filler string plus the prebase is deleted as a unit.</p>
2138    <p>The backspace transform is much like other transforms except
2139    in its processing model. If we consider the same transform as
2140    in the simple transform example, but as a backspace:</p>
2141    <blockquote>
2142      &lt;backspace before="X" from="Y" after="Z" to="B"/&gt;
2143    </blockquote>
2144    <p>This would transform the string:</p>
2145    <blockquote>
2146      XYZ → XBZ
2147    </blockquote>
2148    <p>If we mark where the current match position is before and
2149    after the transform we see:</p>
2150    <blockquote>
2151      X Y | Z → X B | Z
2152    </blockquote>
2153    <p>Whereas a simple or final transform would then run other
2154    transforms in the transform list, advancing the processing
2155    position until it gets to the end of the string, the backspace
2156    transform only matches a single backspace rule and then
2157    finishes.</p>
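    <p>The single-match processing model can be sketched as follows.
    Several things here are assumptions rather than part of this
    specification: the rule patterns are treated as Python regular
    expressions, rules are tried in list order, and when no rule
    matches one code point before the cursor is removed. The names
    RULES and backspace are hypothetical.</p>

```python
import re

# Rule tuples are (before, from, after, to); an empty "to" deletes the match.
RULES = [
    ("", "\u0915\u094D\u0936", "", ""),                # Devanagari example
    ("", "[\u1004\u101B\u105A]\u103A\u1039", "", ""),  # Burmese kinzi
    ("X", "Y", "Z", "B"),                              # the XYZ example
]


def backspace(text, cursor, rules=RULES):
    """Apply at most one backspace rule at the cursor; return (text, cursor)."""
    head, tail = text[:cursor], text[cursor:]
    for before, src, after, to in rules:
        # before + from must end exactly at the cursor; after must follow it.
        m = re.search("(?:" + before + ")(" + src + r")\Z", head)
        if m and re.match(after, tail):
            new_head = head[:m.start(1)] + to
            return new_head + tail, len(new_head)
    # Assumed fallback: no rule matched, so delete one code point.
    new_head = head[:-1]
    return new_head + tail, len(new_head)
```

    <p>With the cursor between Y and Z, backspace("XYZ", 2) returns
    ("XBZ", 2): one rule fires and processing stops, with no further
    pass over the string.</p>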
2158    <hr>
2159    <h2>6 <a name="Element_Heirarchy_Platform_File" href=
2160    "#Element_Heirarchy_Platform_File" id=
2161    "Element_Heirarchy_Platform_File">Element Hierarchy - Platform
2162    File</a></h2>
    <p>There is a separate XML structure for platform-specific
    configuration elements. The most notable component is a mapping
    from the hardware key codes to the ISO layout positions for
    that platform.</p>
2167    <h3>6.1 <a name="Element_platform" href="#Element_platform" id=
2168    "Element_platform">Element: platform</a></h3>
2169    <p>This is the top level element. This element contains a set
2170    of elements defined below. A document shall only contain a
2171    single instance of this element.</p>
2172    <p>Syntax</p>
    <pre>&lt;platform&gt;
    {platform-specific elements}
&lt;/platform&gt;</pre>
2176    <h3>6.2 <a name="Element_hardwareMap" href=
2177    "#Element_hardwareMap" id="Element_hardwareMap">Element:
2178    hardwareMap</a></h3>
2179    <p>This element must have a platform element as its parent.
2180    This element contains a set of map elements defined below. A
2181    document shall only contain a single instance of this
2182    element.</p>
2183    <p>Syntax</p>
2184    <pre>&lt;platform&gt;
2185    &lt;hardwareMap&gt;
2186        {a set of map elements}
2187    &lt;/hardwareMap&gt;
2188&lt;/platform&gt;</pre>
2189    <h3>6.3 <a name="Element_hardwareMap_map" href=
2190    "#Element_hardwareMap_map" id=
2191    "Element_hardwareMap_map">Element: map</a></h3>
2192    <p>This element must have a hardwareMap element as its parent.
2193    This element maps between a hardware keycode and the
2194    corresponding ISO layout position of the key.</p>
2195    <p>Syntax</p>
2196    <p>&lt;map keycode="{hardware keycode}" iso="{ISO layout
2197    position}"/&gt;</p>
2198    <dl>
2199      <dt>Attribute: keycode (required)</dt>
2200      <dd>The hardware key code value of the key. This value is an
2201      integer which is provided by the keyboard driver.</dd>
2202    </dl>
2203    <dl>
2204      <dt>Attribute: iso (required)</dt>
2205      <dd>The corresponding position of a key using the ISO layout
2206      convention where rows are identified by letters and columns
2207      are identified by numbers. For example, "D01" corresponds to
2208      the "Q" key on a US keyboard. (See the definition at the
2209      beginning of the document for a diagram).</dd>
2210    </dl>
2211    <p>Examples</p>
    <pre>
&lt;platform&gt;
    &lt;hardwareMap&gt;
        &lt;map keycode="2" iso="E01" /&gt;
        &lt;map keycode="3" iso="E02" /&gt;
        &lt;map keycode="4" iso="E03" /&gt;
        &lt;map keycode="5" iso="E04" /&gt;
        &lt;map keycode="6" iso="E05" /&gt;
        &lt;map keycode="7" iso="E06" /&gt;
        &lt;map keycode="41" iso="E00" /&gt;
    &lt;/hardwareMap&gt;
&lt;/platform&gt;</pre>
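    <p>The following sketch shows one way to build and query such a
    structure with Python's standard xml.etree.ElementTree module. The
    helper names build_platform and iso_position are invented for the
    example; only the element and attribute names come from this
    section.</p>

```python
import xml.etree.ElementTree as ET


def build_platform(keycode_to_iso):
    """Build a platform element holding one hardwareMap with a map per key."""
    platform = ET.Element("platform")
    hardware_map = ET.SubElement(platform, "hardwareMap")
    for keycode, iso in keycode_to_iso.items():
        ET.SubElement(hardware_map, "map", keycode=str(keycode), iso=iso)
    return platform


def iso_position(platform, keycode):
    """Return the ISO layout position for a hardware keycode, or None."""
    for entry in platform.iter("map"):
        if entry.get("keycode") == str(keycode):
            return entry.get("iso")
    return None


# The same keycodes as in the example above.
EXAMPLE = build_platform(
    {2: "E01", 3: "E02", 4: "E03", 5: "E04", 6: "E05", 7: "E06", 41: "E00"}
)
# Serialize and parse again to show that the text form round-trips.
REPARSED = ET.fromstring(ET.tostring(EXAMPLE, encoding="unicode"))
```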
2214    <h2>7 <a name="Invariants" href="#Invariants" id=
2215    "Invariants">Invariants</a></h2>
    <p>Beyond what the DTD imposes, certain other restrictions are
    imposed on the data.</p>
2218    <ol>
2219      <li>For a given platform, every map[@iso] value must be in
2220      the hardwareMap if there is one (_keycodes.xml)</li>
      <li>Every map[@base] value must also be a base[@base]
      value</li>
2223      <li>No keyMap[@modifiers] value can overlap with another
2224      keyMap[@modifiers] value.
2225        <ul>
          <li>eg you can't have "RAlt Ctrl" in one keyMap, and "Alt
          Shift" in another (because Alt = RAlt or LAlt or both).</li>
2228        </ul>
2229      </li>
2230      <li>Every sequence of characters in a transform[@from] value
2231      must be a concatenation of two or more map[@to] values.
2232        <ul>
2233          <li>eg with &lt;transform from="xyz" to="q"&gt; there
2234          must be some map values to get there, such as &lt;map...
2235          to="xy"&gt; &amp; &lt;map... to="z"&gt;</li>
2236        </ul>
2237      </li>
2238      <li>There must be either 0 or 1 of (keyMap[@fallback] or
2239      baseMap[@fallback]) attributes</li>
2240      <li>If the base and chars values for modifiers="" are all
2241      identical, and there are no longpresses, that keyMap must not
2242      appear (??)</li>
2243      <li>There will never be overlaps among modifier values.</li>
2244      <li>A modifier set will never have ? (optional) on all values
2245        <ul>
2246          <li>eg, you'll never have RCtrl?Caps?LShift?</li>
2247        </ul>
2248      </li>
2249      <li>Every base[@base] value must be unique.</li>
      <li>A modifier attribute value will always be minimal,
2251      observing the following simplification rules.<br></li>
2252    </ol>
2253    <table>
2254      <!-- nocaption -->
2255      <tbody>
2256        <tr>
2257          <td>
2258            <p>Notation</p>
2259          </td>
2260          <td>
2261            <p>Notes</p>
2262          </td>
2263        </tr>
2264        <tr>
2265          <td>
2266            <p>Lower case character (eg. <i>x</i> )</p>
2267          </td>
2268          <td>
2269            <p>Interpreted as any combination of modifiers.<br>
2270            (eg. <i>x</i> = CtrlShiftOption)</p>
2271          </td>
2272        </tr>
2273        <tr>
2274          <td>
2275            <p>Upper-case character (eg. <i>Y</i> )</p>
2276          </td>
2277          <td>
2278            <p>Interpreted as a single modifier key (which may or
2279            may not have a L and R variant)<br>
            (eg. <i>Y</i> = Ctrl, <i>RY</i> = RCtrl, etc.)</p>
2281          </td>
2282        </tr>
2283        <tr>
2284          <td>
2285            <p>Y? ⇔ Y ∨ ∅</p>
2286            <p>Y ⇔ LY ∨ RY ∨ LYRY</p>
2287          </td>
2288          <td>
2289            <p>Eg. Opt? ⇔ ROpt ∨ LOpt ∨ ROptLOpt ∨ ∅<br>
2290            Eg. Opt ⇔ ROpt ∨ LOpt ∨ ROptLOpt</p>
2291          </td>
2292        </tr>
2293      </tbody>
2294    </table>
2295    <table>
2296      <!-- nocaption -->
2297      <tbody>
2298        <tr>
2299          <td>
2300            <p>Axiom</p>
2301          </td>
2302          <td>
2303            <p>Example</p>
2304          </td>
2305        </tr>
2306        <tr>
2307          <td>
2308            <p>xY ∨ x ⇒ xY?</p>
2309          </td>
2310          <td>
2311            <p>OptCtrlShift OptCtrl → OptCtrlShift?</p>
2312          </td>
2313        </tr>
2314        <tr>
2315          <td>
2316            <p>xRY ∨ xY? ⇒ xY?</p>
2317            <p>xLY ∨ xY? ⇒ xY?</p>
2318          </td>
2319          <td>
2320            <p>OptCtrlRShift OptCtrlShift? → OptCtrlShift?</p>
2321          </td>
2322        </tr>
2323        <tr>
2324          <td>
2325            <p>xRY? ∨ xY ⇒ xY?</p>
2326            <p>xLY? ∨ xY ⇒ xY?</p>
2327          </td>
2328          <td>
2329            <p>OptCtrlRShift? OptCtrlShift → OptCtrlShift?</p>
2330          </td>
2331        </tr>
2332        <tr>
2333          <td>
2334            <p>xRY? ∨ xY? ⇒ xY?</p>
2335            <p>xLY? ∨ xY? ⇒ xY?</p>
2336          </td>
2337          <td>
2338            <p>OptCtrlRShift? OptCtrlShift? → OptCtrlShift?</p>
2339          </td>
2340        </tr>
2341        <tr>
2342          <td>
2343            <p>xRY ∨ xY ⇒ xY</p>
2344            <p>xLY ∨ xY ⇒ xY</p>
2345          </td>
2346          <td>
            <p>OptCtrlRShift OptCtrlShift → OptCtrlShift</p>
2348          </td>
2349        </tr>
2350        <tr>
2351          <td>
            <p>xLY?RY? ⇒ xY?</p>
2353          </td>
2354          <td>
2355            <p>OptRCtrl?LCtrl? → OptCtrl?</p>
2356          </td>
2357        </tr>
2358        <tr>
2359          <td>
            <p>xLY? ∨ xLY ⇒ xLY?</p>
          </td>
          <td>&nbsp;</td>
        </tr>
        <tr>
          <td>
            <p>xY? ∨ xY ⇒ xY?</p>
          </td>
          <td>&nbsp;</td>
        </tr>
        <tr>
          <td>
            <p>xY? ∨ x ⇒ xY?</p>
          </td>
          <td>&nbsp;</td>
        </tr>
        <tr>
          <td>
            <p>xLY? ∨ x ⇒ xLY?</p>
          </td>
          <td>&nbsp;</td>
        </tr>
        <tr>
          <td>
            <p>xLY ∨ x ⇒ xLY?</p>
2385          </td>
2386          <td>&nbsp;</td>
2387        </tr>
2388      </tbody>
2389    </table>
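    <p>The notation and axioms above can be checked mechanically by
    expanding each modifier expression into the set of concrete key
    combinations it stands for. The sketch below does this under two
    assumptions that are not part of this specification: a combination
    is passed as a list of tokens rather than a concatenated string,
    and only the modifiers listed in SIDED have L and R variants. The
    names SIDED, expand_token and expand are hypothetical.</p>

```python
from itertools import product

# Assumption: these modifiers have left and right variants.
SIDED = {"Ctrl", "Shift", "Opt", "Alt"}


def expand_token(token):
    """Return the alternative sets of pressed keys that one token allows."""
    optional = token.endswith("?")
    name = token.rstrip("?")
    if name in SIDED:
        # Y means LY, RY, or both together.
        states = [
            frozenset({"L" + name}),
            frozenset({"R" + name}),
            frozenset({"L" + name, "R" + name}),
        ]
    else:
        states = [frozenset({name})]
    if optional:
        # Y? additionally allows the key not being pressed at all.
        states.append(frozenset())
    return states


def expand(combo):
    """Expand a list of tokens into the set of concrete key combinations."""
    result = set()
    for choice in product(*(expand_token(t) for t in combo)):
        result.add(frozenset().union(*choice))
    return result
```

    <p>An axiom holds when both sides expand to the same set. For
    example, the union of expand(["Opt", "Ctrl", "Shift"]) and
    expand(["Opt", "Ctrl"]) equals expand(["Opt", "Ctrl", "Shift?"]),
    which corresponds to xY ∨ x ⇒ xY?.</p>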
2390    <h2>8 <a name="Data_Sources" href="#Data_Sources" id=
2391    "Data_Sources">Data Sources</a></h2>
2392    <p>Here is a list of the data sources used to generate the
2393    initial key map layouts:</p>
2394    <table>
2395      <caption>
2396        <a name="Key_Map_Data_Sources" href="#Key_Map_Data_Sources"
2397        id="Key_Map_Data_Sources">Key Map Data Sources</a>
2398      </caption>
2399      <tbody>
2400        <tr>
2401          <td>
2402            <p>Platform</p>
2403          </td>
2404          <td>
2405            <p>Source</p>
2406          </td>
2407          <td>
2408            <p>Notes</p>
2409          </td>
2410        </tr>
2411        <tr>
2412          <td>
2413            <p>Android</p>
2414          </td>
2415          <td>
2416            <p>Android 4.0 - Ice Cream Sandwich<br>
2417            (<a href=
2418            "https://source.android.com/source/downloading.html">https://source.android.com/source/downloading.html</a>)</p>
2419          </td>
2420          <td>
2421            <p>Parsed layout files located in
2422            packages/inputmethods/LatinIME/java/res</p>
2423          </td>
2424        </tr>
2425        <tr>
2426          <td>
2427            <p>ChromeOS</p>
2428          </td>
2429          <td>
2430            <p>XKB (<a href=
2431            "https://www.x.org/wiki/XKB">https://www.x.org/wiki/XKB</a>)</p>
2432          </td>
2433          <td>
            <p>The ChromeOS layouts represent a very small subset of
            the keyboards available from XKB.</p>
2436          </td>
2437        </tr>
2438        <tr>
2439          <td>
2440            <p>Mac OSX</p>
2441          </td>
2442          <td>
2443            <p>Ukelele bundled System Keyboards (<a href=
2444            "https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&amp;id=ukelele">https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&amp;id=ukelele</a>)</p>
2445          </td>
2446          <td>
2447            <p>These layouts date from Mac OSX 10.4 and are
2448            therefore a bit outdated</p>
2449          </td>
2450        </tr>
2451        <tr>
2452          <td>
2453            <p>Windows</p>
2454          </td>
2455          <td>
2456            <p>Generated .klc files from the Microsoft Keyboard
2457            Layout Creator (<a href=
2458            "https://support.microsoft.com/en-us/help/823010/the-microsoft-keyboard-layout-creator">https://support.microsoft.com/en-us/help/823010/the-microsoft-keyboard-layout-creator</a>)</p>
2459          </td>
2460          <td>
2461            <!--waiting for information on the new location for the following-->
2462            <!--<p>For interactive layouts, see also <a href="xxxx">xxxx</a></p>-->
2463          </td>
2464        </tr>
2465      </tbody>
2466    </table>
2467
2468    <h2>9 <a name="Keyboard_IDs" href="#Keyboard_IDs" id=
2469    "Keyboard_IDs">Keyboard IDs</a></h2>
    <p>There is a set of subtags that help identify the keyboards.
    Each of these is used after the "t-k0" subtags. The first tag
    appended is a mandatory platform tag, followed by zero or more
    tags that help differentiate the keyboard from others with the
    same locale code.</p>
2476    <h3>9.1 <a name="Principles_for_Keyboard_Ids" href=
2477    "#Principles_for_Keyboard_Ids" id=
2478    "Principles_for_Keyboard_Ids">Principles for Keyboard
2479    Ids</a></h3>
2480    <p>The following are the design principles for the ids.</p>
2481    <ol>
2482      <li>BCP47 compliant.
2483        <ol>
2484          <li>Eg, "en-t-k0-extended".</li>
2485        </ol>
2486      </li>
2487      <li>Use the minimal language id based on likelySubtags.
2488        <ol>
2489          <li>Eg, instead of en-US-t-k0-xxx, use en-t-k0-xxx.
2490          Because there is &lt;likelySubtag from="en"
2491          to="en_Latn_US"/&gt;, en-US → en.</li>
2492          <li>The data is in <a href=
2493          "https://github.com/unicode-org/cldr/releases/tag/latest/common/supplemental/likelySubtags.xml">
2494          https://github.com/unicode-org/cldr/releases/tag/latest/common/supplemental/likelySubtags.xml</a></li>
2495        </ol>
2496      </li>
      <li>The platform goes first, if it exists. If a keyboard on
      the platform changes over time, both versions are dated, eg
      bg-t-k0-chromeos-2011. When selecting, if there is no date,
      it means the latest one.</li>
      <li>Only keyboards that differ from the "standard for each
      platform" are tagged. That is, for each language on a platform,
      there will be a keyboard with no subtags other than the
      platform. Subtags with a common semantics across platforms are
      used, such as -extended, -phonetic, -qwerty, -qwertz,
      -azerty, …</li>
2507      <li>In order to get to 8 letters, abbreviations are reused
2508      that are already in <a href=
2509      "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/">bcp47</a>
2510      -u/-t extensions and in <a href=
2511      "https://www.iana.org/assignments/language-subtag-registry">language-subtag-registry</a>
2512      variants, eg for Traditional use "-trad" or "-traditio" (both
2513      exist in <a href=
2514      "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/">bcp47</a>).</li>
2515      <li>Multiple languages cannot be indicated, so the
2516      predominant target is used.
2517        <ol>
2518          <li>For Finnish + Sami, use fi-t-k0-smi or
2519          extended-smi</li>
2520        </ol>
2521      </li>
2522      <li>In some cases, there are multiple subtags, like
2523      en-US-t-k0-chromeos-intl-altgr.xml</li>
2524      <li>Otherwise, platform names are used as a guide.</li>
2525    </ol>
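    <p>To make the structure of these ids concrete, here is a small
    parsing sketch. The set of platform names is an assumption (this
    section does not list the exact spellings), as is treating a
    four-digit subtag as the date. The names KNOWN_PLATFORMS and
    parse_keyboard_id are hypothetical.</p>

```python
import re

# Assumed platform subtag spellings; not taken from this section.
KNOWN_PLATFORMS = {"android", "chromeos", "osx", "windows"}


def parse_keyboard_id(keyboard_id):
    """Split an id such as bg-t-k0-chromeos-2011 into its parts."""
    locale, sep, rest = keyboard_id.partition("-t-k0-")
    if not sep or not locale or not rest:
        raise ValueError("not a -t-k0- keyboard id: " + keyboard_id)
    subtags = rest.split("-")
    for subtag in subtags:
        # Each subtag is limited to 3 to 8 letters or digits.
        if not re.fullmatch(r"[A-Za-z0-9]{3,8}", subtag):
            raise ValueError("bad subtag: " + subtag)
    platform = subtags[0] if subtags[0] in KNOWN_PLATFORMS else None
    extras = subtags[1:] if platform else subtags
    date = next((s for s in extras if re.fullmatch(r"\d{4}", s)), None)
    return {
        "locale": locale,
        "platform": platform,
        "date": date,
        "variants": [s for s in extras if s != date],
    }
```

    <p>Under these assumptions, "bg-t-k0-chromeos-2011" parses to the
    locale bg, the platform chromeos and the date 2011, while
    "en-t-k0-extended" has no recognized platform and a single variant
    subtag.</p>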
2526    <h2>10 <a name="Platform_Behaviors_in_Edge_Cases" href=
2527    "#Platform_Behaviors_in_Edge_Cases" id=
2528    "Platform_Behaviors_in_Edge_Cases">Platform Behaviors in Edge
2529    Cases</a></h2>
2530    <table>
2531      <!-- nocaption -->
2532      <tbody>
2533        <tr>
2534          <td>
2535            <p>Platform</p>
2536          </td>
2537          <td>
2538            <p>No modifier combination match is available</p>
2539          </td>
2540          <td>
2541            <p>No map match is available for key position</p>
2542          </td>
2543          <td>
2544            <p>Transform fails (ie. if ^d is pressed when that
2545            transform does not exist)</p>
2546          </td>
2547        </tr>
2548        <tr>
2549          <td>
2550            <p>ChromeOS</p>
2551          </td>
2552          <td>
2553            <p>Fall back to base</p>
2554          </td>
2555          <td>
            <p>Fall back to character in a keyMap with same "level"
            of modifier combination. If this character does not
            exist, fall back to (n-1) level. (This is handled on the
            data-generation side.)<br>
            In the spec: No output</p>
2561          </td>
2562          <td>
2563            <p>No output at all</p>
2564          </td>
2565        </tr>
2566        <tr>
2567          <td>
2568            <p>Mac OSX</p>
2569          </td>
2570          <td>
2571            <p>Fall back to base (unless combination is some sort
2572            of keyboard shortcut, eg. cmd-c)</p>
2573          </td>
2574          <td>
2575            <p>No output</p>
2576          </td>
2577          <td>
2578            <p>Both keys are output separately</p>
2579          </td>
2580        </tr>
2581        <tr>
2582          <td>
2583            <p>Windows</p>
2584          </td>
2585          <td>
2586            <p>No output</p>
2587          </td>
2588          <td>
2589            <p>No output</p>
2590          </td>
2591          <td>
2592            <p>Both keys are output separately</p>
2593          </td>
2594        </tr>
2595      </tbody>
2596    </table>
2597    <p>&nbsp;</p>
2598    <hr>
2599    <p class="copyright">Copyright © 2001–2020 Unicode, Inc. All
2600    Rights Reserved. The Unicode Consortium makes no expressed or
2601    implied warranty of any kind, and assumes no liability for
2602    errors or omissions. No liability is assumed for incidental and
2603    consequential damages in connection with or arising out of the
2604    use of the information or programs contained or accompanying
2605    this technical report. The Unicode <a href=
2606    "https://unicode.org/copyright.html">Terms of Use</a> apply.</p>
2607    <p class="copyright">Unicode and the Unicode logo are
2608    trademarks of Unicode, Inc., and are registered in some
2609    jurisdictions.</p>
2610  </div>
2611</body>
2612</html>
2613