Compound word hyphenation

The Hyphen library supports improved compound word hyphenation, including
the special compound hyphenation rules of German and of other languages
whose compounds can contain an arbitrary number of words. The new options,
COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN, help to set the right
style for the hyphenation of compound words.

Algorithm

The algorithm is an extension of the original pattern-based hyphenation
algorithm. It uses two hyphenation pattern sets, defined in the same
pattern file and separated by the NEXTLEVEL keyword. The first pattern
set is for hyphenation only at compound word boundaries; the second one
is for hyphenation within words or word parts.
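
As an illustration of the two-set layout, the following minimal Python
sketch splits the lines of a pattern file at the NEXTLEVEL keyword. It is
not code from Hyphen: the function name split_pattern_levels and the sample
patterns are hypothetical, and the sketch assumes that NEXTLEVEL stands
alone on its own line.

```python
def split_pattern_levels(lines):
    """Split pattern lines into (first_level, second_level) lists.

    Assumption (not taken from the Hyphen sources): the NEXTLEVEL keyword
    stands alone on its own line, and blank lines carry no meaning.
    """
    first, second = [], []
    current = first
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if entry == "NEXTLEVEL":
            current = second  # everything after the keyword is second level
            continue
        current.append(entry)
    return first, second


# Toy patterns for illustration only, not a real hyphenation dictionary.
sample = ["1t2", "NEXTLEVEL", "a1", "e1"]
compound_patterns, inner_patterns = split_pattern_levels(sample)
print(compound_patterns)  # ['1t2']
print(inner_patterns)     # ['a1', 'e1']
```

In this sketch, input without a NEXTLEVEL line ends up entirely in the
first list; how Hyphen itself treats such a file is not modelled here.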

Recursive compound level hyphenation

The algorithm is recursive: every word part produced by a successful
first (compound) level hyphenation is rehyphenated with the same
(first) pattern set.

Finally, when first level hyphenation is not possible, Hyphen uses
the second level hyphenation for the word or the word parts.
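
The control flow described above can be sketched in Python as follows.
This is a simplified model, not Hyphen's implementation: real pattern
matching is replaced by two hypothetical lookup tables (COMPOUND_BREAKS
and INNER_BREAKS), the hyphenmin options are ignored, and all function
names are assumptions made for the example.

```python
# Toy stand-ins for the two pattern sets (illustration only).
COMPOUND_BREAKS = {
    "footballplayer": ["foot", "ballplayer"],
    "ballplayer": ["ball", "player"],
}
INNER_BREAKS = {"player": ["play", "er"]}


def split_compound(word):
    """Hypothetical first level: split at a compound boundary, if any."""
    return COMPOUND_BREAKS.get(word, [word])


def split_inner(word):
    """Hypothetical second level: split within a word or word part."""
    return INNER_BREAKS.get(word, [word])


def hyphenate(word):
    """Return the parts of `word` after recursive two-level hyphenation."""
    parts = split_compound(word)
    if len(parts) > 1:
        # First level succeeded: rehyphenate every part, starting again
        # with the first (compound) level.
        result = []
        for part in parts:
            result.extend(hyphenate(part))
        return result
    # First level not possible: fall back to the second level.
    return split_inner(word)


print("-".join(hyphenate("footballplayer")))  # foot-ball-play-er
```

The recursion ends because each successful first level split yields
strictly shorter parts.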

Word endings and word parts

Patterns for word endings (patterns with ellipses) match the
word parts, too.

Options

COMPOUNDLEFTHYPHENMIN: minimum hyphenation distance from the left
  compound word boundary
COMPOUNDRIGHTHYPHENMIN: minimum hyphenation distance from the right
  compound word boundary
NEXTLEVEL: marks the start of the second level hyphenation patterns

Default hyphenmin values

The default values of COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN
are 0, and they are applied as 0 during hyphenation, too. (By contrast,
"0" values of LEFTHYPHENMIN and RIGHTHYPHENMIN mean the default "2"
during hyphenation.)
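
The difference between the two kinds of defaults can be written out as a
short Python sketch. The function name effective_hyphenmins and the
dictionary-based option storage are hypothetical, and the sketch assumes
that an option missing from the pattern file counts as 0.

```python
def effective_hyphenmins(options):
    """Map option values from a pattern file to the values used at run time.

    `options` is a dict of option name -> int. Assumption: an option that
    is not set in the file behaves like an explicit 0.
    """
    left = options.get("LEFTHYPHENMIN", 0)
    right = options.get("RIGHTHYPHENMIN", 0)
    return {
        # 0 means "use the default", which is 2 for the word-level options.
        "LEFTHYPHENMIN": left if left > 0 else 2,
        "RIGHTHYPHENMIN": right if right > 0 else 2,
        # The compound options default to 0 and stay 0 during hyphenation.
        "COMPOUNDLEFTHYPHENMIN": options.get("COMPOUNDLEFTHYPHENMIN", 0),
        "COMPOUNDRIGHTHYPHENMIN": options.get("COMPOUNDRIGHTHYPHENMIN", 0),
    }


print(effective_hyphenmins({"RIGHTHYPHENMIN": 3}))
```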

Examples

See tests/compound* test files.

Preparation of hyphenation patterns

There is no special pattern generator tool for compound hyphenation
patterns yet. It is possible to use PATGEN to generate both pattern
sets, concatenate them manually and set the requested HYPHENMIN values.
(But don't forget the preprocessing steps by substrings.pl before
concatenation.) One disadvantage of this method is that PATGEN
doesn't know about the recursive compound hyphenation of Hyphen.
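
The manual concatenation step might be scripted along these lines. This
is a rough sketch under stated assumptions, not a tool shipped with
Hyphen: the function name build_compound_pattern_file is hypothetical,
the input lists are assumed to be already preprocessed, and the file
layout (a character set line, then option lines, then the two pattern
sets separated by NEXTLEVEL) is an assumption that should be checked
against the tests/compound* files.

```python
def build_compound_pattern_file(charset, options, compound_patterns, inner_patterns):
    """Join two pattern sets into the text of a single pattern file.

    Assumed layout (verify against real pattern files before relying on it):
      charset line, "NAME value" option lines, first level patterns,
      NEXTLEVEL, second level patterns.
    """
    lines = [charset]
    lines += [f"{name} {value}" for name, value in options.items()]
    lines += compound_patterns   # first level: compound word boundaries
    lines.append("NEXTLEVEL")    # separator between the two sets
    lines += inner_patterns      # second level: within words or word parts
    return "\n".join(lines) + "\n"


# Toy values for illustration only.
text = build_compound_pattern_file(
    "UTF-8",
    {"COMPOUNDLEFTHYPHENMIN": 2, "COMPOUNDRIGHTHYPHENMIN": 3},
    ["1t2"],
    ["a1", "e1"],
)
print(text, end="")
```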

László Németh
<nemeth (at) openoffice.org>