rules.md - OpenGrok cross reference for /third_party/icu/docs/userguide/transforms/general/rules.md

Lines Matching full:rules
26 set of rules. The tutorial does not describe, in detail, the features of
27 transform; instead, it explains the process of building rules and describes the
30 of the rules.
44 since it illustrates more of the issues involved in the rules. (For guidelines
52 In this example, we start with a set of rules for Greek since they provide a
53 real example based on mathematics. We will use the rules that do not involve the
54 pronunciation of Modern Greek; instead, we will use rules that correspond to the
82 Greek and Latin. These rules map between a source string and a target string.
107 We will also add rules for completeness. These provide fallback mappings for
117 We have completed the simple one-to-one mappings and the rules for completeness.
129 All the rules are evaluated in the order they are listed. The transform will
130 first try to match the first four rules. If all of these rules fail, it will use
140 rules for converting γ, κ, ξ, and χ. We must consider how to convert the γ
142 those characters to be converted using their specific rules. This is done with
152 followed when the transform matches the rules against the source text, but
160 Using context, we have the same number of rules. But, by using range, we can
161 collapse the first four rules into one. The following shows how we can use
190 the two types of rules on some sample text:
242 These rules state that if an "s" is surrounded by non-letters, convert it to
247 This makes the rules much easier to write. *
249 To make the rules clearer, you can use variables. Instead of the example above,
272 Elements in a rule can also repeat. For example, in the following rules, the
298 …causes the rest of the rule to fail. For example, suppose we have the following (contrived) rules:*
318 at the end of a range, such as in the following rules:
320 | Rules | [0-9$] { a > b ; a } [0-9$] > b ;|
325 In these rules, an **a** before or after a number -- or at the start or end of a
352 context of your rules. For example:
354 | Rules | $break = [[:Zp:][:Zl:] \u000A-\u000D \u0085 $] ; $break { a > A ;|
379 We could handle each accented character by itself with rules such as the
388 ICU 1.8, we can add other transforms as rules either before or after all the
389 other rules. We then can modify the rules to the following:
397 These modified rules first separate accents from their base characters and then
400 standard canonical form. The inverse uses the transform rules in reverse order,
403 A global filter can also be used with the transform rules. The following example
404 shows a filter used in the rules:
434 In ICU, there are several of these mechanisms for the Greek rules. The ICU rules
436 these disambiguation rules ensure that the rules can pass these tests and handle
449 Rules allow for characters to be revisited after they are replaced. For example,
461 The ability to revisit is particularly useful in reducing the number of rules
465 number of rules with the following pattern:
469 characters within the ICU rules in any event.
490 Using these rules, "kyo" is first converted into "き~yo". Since the "~yo" is then
492 rules (3 + 11 = 14) provide for a large number of cases. If all of the
493 combinations of rules were used instead, it would require 3 x 11 = 33 rules.
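
The saving from revisiting can be sketched with a toy rewriting loop in Python. This is not ICU's engine or rule syntax: the rule table, the `transliterate` helper, and the choice of three consonant rules and three glide rules are all hypothetical, chosen only to mirror the "kyo" → "き~yo" step described above. Each rule emits some final output and pushes a remainder back onto the input, so that shared follow-up rules can finish the job.

```python
# Toy first-match rewriter with "revisit" support (an illustration, not ICU).
# Each rule is (source, emitted, revisit): `emitted` goes to the output,
# `revisit` is pushed back onto the input so later rules can match it.
RULES = [
    ("ky", "き", "~y"),   # hypothetical consonant rules
    ("gy", "ぎ", "~y"),
    ("ny", "に", "~y"),
    ("~ya", "ゃ", ""),    # hypothetical shared glide rules
    ("~yu", "ゅ", ""),
    ("~yo", "ょ", ""),
]


def transliterate(text: str) -> str:
    out = []
    while text:
        for source, emitted, revisit in RULES:
            if text.startswith(source):
                out.append(emitted)
                # Put the revisit text back in front of the unread input.
                text = revisit + text[len(source):]
                break
        else:
            # No rule matched: copy one character through unchanged.
            out.append(text[0])
            text = text[1:]
    return "".join(out)


consonant_rules = 3
glide_rules = 3
print(transliterate("kyo"))            # きょ
print(consonant_rules + glide_rules)   # rules needed with revisiting
print(consonant_rules * glide_rules)   # rules needed if every pair is listed
```

With this toy table, six rules cover nine combinations; the same additive-versus-multiplicative effect is what gives 14 rules instead of 33 in the case above.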
536 Two rules overlap when there is a string that both rules could match at the
544 When rules do not overlap, they will produce the same result no matter what
553 When rules do overlap, order is important. In fact, a rule could be rendered
561 rule will already be matched by previous rules. If a rule is masked, then a
562 warning will be issued when you attempt to build a transform with the rules.
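
Masking can be illustrated with a simplified check in Python. This is a sketch under strong assumptions, not the test ICU performs: it only considers rules whose source is a plain literal with no context, sets, or quantifiers, and the `find_masked` helper is hypothetical. Under first-match ordering, such a rule can never fire if an earlier rule's source is a prefix of its own source.

```python
def find_masked(sources):
    """Return (earlier, later) pairs where `earlier` masks `later`.

    Assumes literal, context-free rule sources tried in list order.
    """
    masked = []
    for i, later in enumerate(sources):
        for earlier in sources[:i]:
            # Any text starting with `later` also starts with `earlier`,
            # so the earlier rule always wins.
            if later.startswith(earlier):
                masked.append((earlier, later))
                break
    return masked


print(find_masked(["th", "t", "thr"]))  # "thr" can never match
print(find_masked(["thr", "th", "t"]))  # longest first: nothing is masked
```

Listing the longer, more specific sources first removes the masking in this simplified model.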
616 words have rough breathing marks. In this case, we would use several rules to
626     # lowercase target before applying forward rules
629     This will allow the rules to work even when they are given a mixture of
644 2.  **Normalization** Always design transform rules so that they work no matter
646     in the case of backwards rules.) Generally, the best way to do this is to
647     have `:: NFD (NFC);` as the first line of the rules, and `:: NFC (NFD);` as the
652     manipulation, we could use `:: NFC (NFC) ;` at the top of the rules instead.
680 > :point_right: **Note**: *Remember that the rules themselves must be in the same normalization for…
681 Otherwise, nothing will match. To do this, run NFD on the rules themselves. In
682 some cases, we must rearrange the order of the rules because of masking. For
683 example, consider the following rules:*
685 *If these rules are put in normalized form, then the second rule will mask the first. To avoid this…
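
The need to keep the rules in the same normalization form as the text can be checked outside ICU with Python's standard `unicodedata` module. This is only a sketch: the `nfd` helper and the single "rule source" below are hypothetical stand-ins, and plain substring matching stands in for rule matching.

```python
import unicodedata


def nfd(text: str) -> str:
    return unicodedata.normalize("NFD", text)


# Hypothetical rule source typed as one precomposed character:
# U+03AC GREEK SMALL LETTER ALPHA WITH TONOS.
rule_source = "\u03ac"

# Input text after an NFD step: alpha + combining acute accent + lambda.
text = nfd("\u03ac\u03bb")

print(rule_source in text)       # False: the composed form never appears in NFD text
print(nfd(rule_source) in text)  # True once the rule source is normalized too
```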