remember:
set lccode for - and '

TODO checklist:
- no comments were left out during conversion
- same set of patterns
- no commands and \'-like sequences left
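
The last checklist item can be checked mechanically. The following is only a sketch, assuming the body of a converted pattern block is available as a string; the name `find_leftovers` and the regular expression are illustrative and not part of the original conversion tooling.

```python
import re

# Sequences that should not survive conversion to UTF-8:
#   ^^xx        TeX hex notation for an 8-bit character
#   \v, \oe     control words
#   \', \", \^  control symbols used as accent commands
LEFTOVER = re.compile(r"\^\^[0-9a-f]{2}|\\[A-Za-z]+|\\[^A-Za-z\s]")

def find_leftovers(text):
    """Return (line_number, sequence) pairs for every suspicious sequence.

    Anything after an unescaped % is treated as a TeX comment and skipped.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for match in LEFTOVER.finditer(code):
            hits.append((number, match.group(0)))
    return hits
```

An empty result means no `^^xx` or backslash sequences remain outside comments. Wrapper commands such as `\patterns` would also be reported, so this is meant for the pattern body only.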

already converted:

# language	original_file	code	supported_encoding(s)

croatian	hrhyph.tex	hr[_HR]	ec
========
čćđšž
^^a3 -> č	ccaron
^^a2 -> ć	cacute
^^9e -> đ	dcroat
^^b2 -> š	scaron
^^ba -> ž	zcaron
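
A table like the one above translates directly into a substitution pass. This is a minimal sketch, assuming the pattern text is already in memory; `EC_TO_UNICODE` and `expand_hex_notation` are hypothetical names, and the table holds only the five Croatian entries listed here.

```python
import re

# EC (T1) positions used by the Croatian patterns, copied from the table above.
EC_TO_UNICODE = {
    0xA3: "č",  # ccaron
    0xA2: "ć",  # cacute
    0x9E: "đ",  # dcroat
    0xB2: "š",  # scaron
    0xBA: "ž",  # zcaron
}

def expand_hex_notation(text, table=EC_TO_UNICODE):
    """Replace each ^^xx with the character the table assigns to code 0xXX.

    Codes missing from the table are left untouched so they can be reviewed.
    """
    def substitute(match):
        code = int(match.group(1), 16)
        return table.get(code, match.group(0))
    return re.sub(r"\^\^([0-9a-f]{2})", substitute, text)
```

Other EC-encoded languages in these notes would need their own table passed as `table`.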

serbian	sh
=======

latin	shhyphl.tex	sr-latn	ec
-----
\'c -> ć	cacute
\v c -> č	ccaron
\v s -> š	scaron
\v z -> ž	zcaron

^^a3 -> č	ccaron
^^a2 -> ć	cacute
^^9e -> đ	dcroat
^^b2 -> š	scaron
^^ba -> ž	zcaron

cyrillic	srhyphc.tex	sr-cyrl	t2a
--------

convert from iso-8859-5
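
For a source file stored in a standard 8-bit encoding, the conversion is a plain recode. A minimal sketch using Python's built-in codecs; the function name `recode` is an assumption, and reading and writing the actual files is left out.

```python
def recode(data, source_encoding="iso8859_5"):
    """Decode bytes from the source encoding and return them as UTF-8 bytes."""
    return data.decode(source_encoding).encode("utf-8")
```

The same call with `source_encoding="latin_1"` would cover the sources these notes describe as latin1 (spanish, galician).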


dutch	nehyph96.tex	nl[_NL]	ec
=====
^^e4 -> ä adieresis
^^e7 -> ç ccedilla
^^e8 -> è egrave
^^e9 -> é eacute
^^ea -> ê ecircumflex
^^eb -> ë edieresis
^^ef -> ï idieresis
^^ee -> î icircumflex
^^f1 -> ñ ntilde
^^f6 -> ö odieresis
^^fc -> ü udieresis
^^fb -> û ucircumflex

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)


finnish	fihyph.tex	fi[_FI]	ec
=======
åäö
^^e4 -> ä adieresis
^^f6 -> ö odieresis

apparently there is no å in patterns?
(šž may occur in foreign words)
patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)


italian	ithyph.tex	it[_IT]	ascii/none
========
only ' is treated as a letter, patterns don't include any other accented character
how does that interact with mapping=tex-text?


polish	plhyph.tex	pl[_PL]	qx
======
ąćęłńóśźż
/a -> ą	aogonek
/c -> ć	cacute
/e -> ę	eogonek
/l -> ł	lslash
/n -> ń	nacute
/o -> ó	oacute
/s -> ś	sacute
/x -> ź	zacute
/z -> ż	zdotaccent
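
The slash notation can be expanded with a small lookup. A sketch only: `SLASH_TO_UNICODE` and `expand_slash_notation` are hypothetical names, and it assumes that `/` occurs in the pattern text only as this letter prefix.

```python
import re

# Slash notation from plhyph.tex, copied from the table above.
SLASH_TO_UNICODE = {
    "/a": "ą", "/c": "ć", "/e": "ę", "/l": "ł", "/n": "ń",
    "/o": "ó", "/s": "ś", "/x": "ź", "/z": "ż",
}

def expand_slash_notation(text):
    """Replace /a, /c, ... with the Polish letters they stand for.

    Unknown /x-style pairs are left untouched so they can be reviewed.
    """
    return re.sub(r"/[a-z]",
                  lambda m: SLASH_TO_UNICODE.get(m.group(0), m.group(0)),
                  text)
```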


portuguese	pthyph.tex	pt	ec
==========

^^e0 -> à - agrave (not used)
^^e1 -> á - aacute
^^e2 -> â - acircumflex
^^e3 -> ã - atilde
^^e7 -> ç - ccedilla
^^e8 -> è - egrave (not used)
^^e9 -> é - eacute
^^ea -> ê - ecircumflex
^^ed -> í - iacute
^^ee -> î - icircumflex
^^ef -> ï - idieresis (not used)
^^f3 -> ó - oacute
^^f4 -> ô - ocircumflex
^^f5 -> õ - otilde
^^f6 -> ö - odieresis (not used)
^^fa -> ú - uacute
^^fb -> û - ucircumflex (not used)

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)


slovenian (slovene)	sihyph.tex	sl[-si]	ec
=========
čšž
"c -> č	ccaron
"s -> š	scaron
"z -> ž	zcaron


swedish	svhyph.tex	sv[-se]	ec
=======
åäö
^^e5 -> å aring
^^e4 -> ä adieresis
^^f6 -> ö odieresis
àé are considered a variant of the same letter
^^e9 -> é eacute

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)


"ukenglish"	ukhyphen.tex	en-gb	ascii/none
===========

spanish (espanol)	eshyph.tex	es	ec
=======
áéíóúüñ
convert from latin1 to utf-8

^^f1 ntilde
^^e1 aacute
^^e9 eacute
^^ed iacute
^^f3 oacute
^^fa uacute
^^fc udieresis

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)


catalan	cahyph.tex	ca	ec
=======

^^e0 -> à	agrave
^^e7 -> ç - ccedilla
^^e8 -> è - egrave
^^e9 -> é - eacute
^^ed -> í - iacute
^^ef -> ï - idieresis
^^f2 -> ò - ograve
^^f3 -> ó - oacute
^^fa -> ú - uacute
^^fc -> ü - udieresis

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)

galician	glhyph.tex	gl	ec
========

áéíóú ñ üï

source in latin1 -> convert to utf-8

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)

uppersorbian	sorhyph.tex	hsb	ec
============

^^a3 -> č - ccaron
^^a2 -> ć - cacute
^^a5 -> ě - ecaron
^^aa -> ł - lslash
^^ab -> ń - nacute
^^f3 -> ó - oacute
^^b0 -> ř - rcaron
^^b2 -> š - scaron
^^ba -> ž - zcaron
^^b9 -> ź - zacute

welsh	cyhyph.tex	cy	ec
=====
^^e1 -> á - aacute
^^e2 -> â - acircumflex
^^ea -> ê - ecircumflex
^^f4 -> ô - ocircumflex
^^eb -> ë - edieresis
^^ef -> ï - idieresis
^^f6 -> ö - odieresis


irish	gahyph.tex	ga	ec
=====
^^e1 -> á - aacute
^^e9 -> é - eacute
^^ed -> í - iacute
^^f3 -> ó - oacute
^^fa -> ú - uacute

pinyin	pyhyph.tex	zh-latn	ec
======
^^fc -> ü - udieresis

patterns can be loaded as ec only, and the same patterns are usable under texnansi as well (same position of letters)

interlingua	iahyphen.tex	ia	ascii
===========

pure copy

romanian	rohyphen.tex	ro	ec
========
ăâîșț ĂÂÎȘȚ

"a -> ă
"A -> â
"i -> î
"s -> ș
"t -> ț

"a = \u{a}
"A = \^{a}
"i = \^{\i}
"s = \c{s}
"t = \c{t}

%				[-]  \u{A} [not encoded]
%	"A = \^{a}			[-]  \^{A} [not encoded]
%	"i = \^{\i}			"I = \^{I}
%	"s = \c{s}			"S = \c{S}
%	"t = \c{t}			"T = \c{T}

"a -> ^^a0
"A -> ^^e2
"i -> ^^ee
"s -> ^^b3
"t -> ^^b5

estonian	ethyph.tex	et	ec
========
šžäöüõ

^^b2 -> š - scaron
^^ba -> ž - zcaron

^^e4 -> ä - adieresis
^^f6 -> ö - odieresis
^^fc -> ü - udieresis

^^f5 -> õ - otilde

hungarian	huhyphn.tex	hu	ec
=========

saved in ec encoding (that no editor can read)

^^e1
^^e9
^^f3
^^f6
^^ae
^^fc
^^fa
^^b6
^^ed
^^e4

\lccode"E1="E1 % á - aacute
\lccode"E9="E9 % é - eacute
\lccode"ED="ED % í - iacute
\lccode"F3="F3 % ó - oacute
\lccode"FA="FA % ú - uacute

\lccode"AE="AE % ő - ohungarumlaut
\lccode"B6="B6 % ű - uhungarumlaut

\lccode"E4="E4 % ä - adieresis
\lccode"F6="F6 % ö - odieresis
\lccode"FC="FC % ü - udieresis
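
Lines of this shape can be generated from a letter table rather than typed by hand. A sketch under that assumption: `HUNGARIAN_LETTERS` and `lccode_lines` are hypothetical names, and the codes and glyph names are copied from the listing above.

```python
# EC (T1) codes, characters and glyph names as listed in the notes above.
HUNGARIAN_LETTERS = [
    (0xE1, "á", "aacute"),
    (0xE9, "é", "eacute"),
    (0xED, "í", "iacute"),
    (0xF3, "ó", "oacute"),
    (0xFA, "ú", "uacute"),
    (0xAE, "ő", "ohungarumlaut"),
    (0xB6, "ű", "uhungarumlaut"),
    (0xE4, "ä", "adieresis"),
    (0xF6, "ö", "odieresis"),
    (0xFC, "ü", "udieresis"),
]

def lccode_lines(letters):
    """Format one \\lccode assignment per letter, mapping each code to itself."""
    return ['\\lccode"{0:02X}="{0:02X} % {1} - {2}'.format(code, char, name)
            for code, char, name in letters]
```

`print("\n".join(lccode_lines(HUNGARIAN_LETTERS)))` emits the ten assignments without the blank lines that group them above.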

icelandic	icehyph.tex	is	ec
=========

^^e1 á - aacute
^^e9 é - eacute
^^ed í - iacute
^^f3 ó - oacute
^^fa ú - uacute
^^fd ý - yacute
^^fe þ - thorn
^^e6 æ - ae
^^f6 ö - odieresis
^^f0 ð - eth

turkish	tkhyph.tex	tr	ec
=======

weird conversion, see additional notes

^^11 -> ı - dotlessi % error
^^e2 -> â - acircumflex
^^ee -> î - icircumflex
^^f4 -> ô - ocircumflex
^^f6 -> ö - odieresis
^^fc -> ü - udieresis

^^e7 ->  ç - ccedilla
^^a7 ->  ğ - gbreve
^^f1 ->  ñ - ntilde
^^b3 ->  ş - scedilla

---------------------

czech - not to be included this year ? - works fine with ec, but probably includes other tricks as well
=====

\v e -> ě	ecaron
\v c -> č	ccaron
\v d -> ď	dcaron
\v l -> ľ	lcaron (not used)
\v n -> ň	ncaron
\v r -> ř	rcaron
\v s -> š	scaron
\v t -> ť	tcaron
\v z -> ž	zcaron
\r u -> ů	uring
\'a -> á	aacute
\'e -> é	eacute
\'i -> í	iacute
\'o -> ó	oacute
\'u -> ú	uacute
\'r -> ŕ	racute (not used)
\'y -> ý	yacute
\"a -> ä	adieresis (not used)
\^o -> ô	ocircumflex (not used)

slovak
======

\v e -> ě	ecaron (not used)
\v c -> č	ccaron
\v d -> ď	dcaron
\v l -> ľ	lcaron
\v n -> ň	ncaron
\v r -> ř	rcaron (not used)
\v s -> š	scaron
\v t -> ť	tcaron
\v z -> ž	zcaron
\r u -> ů	uring (not used)
\'a -> á	aacute
\'e -> é	eacute
\'i -> í	iacute
\'o -> ó	oacute
\'u -> ú	uacute
\'r -> ŕ	racute
\'y -> ý	yacute
\"a -> ä	adieresis
\^o -> ô	ocircumflex

csaccents that Jonathan has written have some more characters available:
\"o
\"u
\'l
\`a


german - not to be included this year
======
ec, also texnansi; original "supports" ot1 as well
äöü are positioned at equal places in both ec & texnansi, while ß is at a different place in ec/texnansi/OT1

          0x19 0xDF 0xFF
ec         ı    SS   ß
texnansi   ß    ß    ÿ
OT1        ß    -    -

äöüß
"a -> ä
"o -> ö
"u -> ü
/3 -> ß

/9 duplicated entry for ß, only inside \c
\n - keep pattern
\c - delete pattern (duplicated)

german - old	dehypht.tex
german - new (ngerman)	dehyphn.tex

hyphenation exceptions apparently loaded separately


latin
=====
\ae -> æ
\oe -> œ

\n - delete pattern (duplicated)

french	frhyph	fr[_FR]	ec - not to be included this year
======
=patois
=francais

also kind-of-supports \oe in ot1, TODO: check for texnansi

æ - ae (*unused*)
œ - oe

\`a à e0
\`e è e8
\`u ù f9 (*unused*)

\'e é e9

\^a â e2
\^e ê ea
\^i î ee
\^o ô f4
\^u û fb

\"e ë eb (*unused*)
\"i ï ef
\"u ü fc (*unused*)
\"y ÿ b8 (*unused*)

\cc ç e7 ccedilla

\oe œ f7 oe

\n - remove pattern (only duplicated \oe inside)
remove 0 from \oe0

% For \oe which exists in T1 _and_ OT1 encoded fonts but with
% different glyph codes, patterns for both glyphs are included.
% Thus you can use either T1 encoded fonts, or OT1 encoded fonts
% and MLTeX's character substitution definition.

danish
======

the same problem

danish		dkhyph.tex
dkcommon.tex
dkhyph.tex
dkspecial.tex

X -> æ
Y -> ø
Z -> å

esperanto	latin3
=========

^c -> ĉ (E6)
^g -> ĝ (F8)
^h -> ĥ (B6)
^j -> ĵ (BC)
^s -> ŝ (FE)
^u -> ŭ (FD)

^C -> Ĉ (C6)
^G -> Ĝ (D8)
^H -> Ĥ (A6)
^J -> Ĵ (AC)
^S -> Ŝ (DE)
^U -> Ŭ (DD)
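
The Latin-3 codes in parentheses make the caret notation easy to expand with Python's `iso8859_3` codec. A minimal sketch: `CARET_TO_LATIN3` and `expand_caret_notation` are hypothetical names, and it assumes a single `^` is always followed by one of the twelve letters listed above.

```python
import re

# Caret notation -> ISO-8859-3 (Latin-3) code, copied from the table above.
CARET_TO_LATIN3 = {
    "^c": 0xE6, "^g": 0xF8, "^h": 0xB6, "^j": 0xBC, "^s": 0xFE, "^u": 0xFD,
    "^C": 0xC6, "^G": 0xD8, "^H": 0xA6, "^J": 0xAC, "^S": 0xDE, "^U": 0xDD,
}

def expand_caret_notation(text):
    """Replace ^c, ^g, ... with the Esperanto letter at the listed Latin-3 code."""
    def substitute(match):
        return bytes([CARET_TO_LATIN3[match.group(0)]]).decode("iso8859_3")
    return re.sub(r"\^[cghjsuCGHJSU]", substitute, text)
```

Decoding through the codec rather than hard-coding the letters also serves as a cross-check of the hex codes in the table.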

special converter needed (and already done):

coptic		xu-copthyph.tex		utf8-copthyph.tex		copthyph.tex



english		hyphen.tex  % do not change!
=usenglish/USenglish/american
%
% ushyphmax.tex, on the other hand, includes Gerard Kuiken's additional
% patterns; it is not frozen.
usenglishmax	ushyphmax.tex



TODO (delete all the three entries once patterns are converted):

norsk		xu-nohyphbx.tex
=norwegian
nynorsk         nnhyph.tex
bokmal          nbhyph.tex

indonesian	inhyph.tex
welsh		cyhyph.tex

greek variants:
- ibycus ibyhyph.tex
- greek		xu-grphyph4.tex
  =polygreek
- monogreek	xu-grmhyph4.tex
- ancientgreek	xu-grahyph4.tex

cyrillic variants:
- bulgarian	xu-bghyphen.tex
- mongolian	xu-mnhyph.tex
- mongolian2a	mnhyphn.tex
- serbian	xu-srhyphc.tex
- russian	xu-ruhyphen.tex
- ukrainian	xu-ukrhyph.tex


TODO:

waiting list:

czech
slovak
german		xu-dehypht.tex
ngerman		xu-dehyphn.tex
latin		xu-lahyph.tex
esperanto	xu-eohyph.tex

consult the authors:

basque		xu-bahyph.tex
bahyph.sh
bahyph.tex
galician	xu-glhyph.tex
turkish		xu-tkhyph.tex


xu-cp866nav.tex
xu-dehyphtex.tex
xu-eohyph.tex
xu-nohyphbx.tex
xu-ruhyphen.tex
xu-ukrhyph.tex

cyhyph.tex
eohyph.tex
grahyph4.tex
grmhyph4.tex
grphyph4.tex
hyphen.tex
hypht1.tex
ibyhyph.tex
inhyph.tex
ushyphmax.tex

558hyphen.tex
559hypht1.tex
560ibyhyph.tex
561inhyph.tex
562ushyphmax.tex
563