This is a list of things that need to be done to propagate the changes properly.
Write DONE next to each issue once it is resolved.

Wishlist:
- separate language.dat files on a per-engine level (a different language.dat for XeTeX than for pdfTeX)
- different syntax for loading multiple patterns? (Serbian)

Short-term TODO list:
- A web page with simple examples of how to enable hyphenation
- Documentation about hyphenation setup in TeX Live 2011, MiKTeX & W32TeX
- Documentation about conventions (naming guidelines, no-sequences-in-patterns, stable vs. non-stable filenames, ...) of hyph-utf8
- Put licence in the manual (and explain licences of others)
- Hans' examples for comparisons & documentation
- discuss switching to special patterns for pdfTeX
- better solutions for apostrophe

TeX Distributions:
- [DONE] propagate changes into TeX Live
- [DONE] coordinate with Christian Schenk <cs at miktex.org> for changes in MiKTeX
- [DONE] coordinate with Akira Kakuto <kakuto at fsci.fuk.kindai.ac.jp> for changes in W32TeX

Pattern creators:
- make a list of pattern authors, notify them about changes,
  explain to them that they need to modify the UTF-8 file and submit it as UTF-8 in future
- write DONE for each language once you receive the answer from the author

Languages:
- complete languages.html
- unification of Mongolian
  - encoding
  - pattern versions
    - [DONE] a list of differences received from Dorjgotov Batmunkh <bataak at gmail.com>
    - add that list to the repository
    - waiting for answer from Oliver Corff <corff at zedat.fu-berlin.de>
- unification of Spanish (some new version of patterns submitted)
- describe differences between sh-cyrl and sr-cyrl
- [DONE] do something with EO: I did the conversion for "plain" patterns, but left the original version intact;
  we could leave a separate version in "source" and use the "plain version" in the pattern repository
- Russian etc. that are still outsourcing
  - solve encoding mess
- versioning: German etc.
- Norwegian
- Belarusian:
  - Contact the authors of hyphenation patterns on OOo:
    http://extensions.services.openoffice.org/en/project/dict-be-classic
    http://extensions.services.openoffice.org/en/project/dict-be-official
  - Contact the author of support files for LaTeX (Aleksey Novodvorsky):
    http://alt.linux.kiev.ua/srpm/tetex/patches/8
    http://tldp.org/HOWTO/pdf/Belarusian-HOWTO.pdf
  - Convert patterns into UTF-8
  - Babel and Polyglossia support
  - Which version has to be used? Classic or official?
  - Do something about encoding in 8-bit engines (possibly together with Vladimir Volovich)

CTAN:
- move old patterns to obsolete
- [DONE] upload new patterns
- make an easy-to-browse structure
- ask CTAN people to be careful if someone uploads new patterns
- write an automatic notifier for changes in CTAN
- try to update the information in TeX Catalogue
  - http://www.tex.ac.uk/tex-archive/help/Catalogue/hier.html
  - some entries say that patterns are obsolete, but still claim that the package is included in TL/MiKTeX
    (http://www.tex.ac.uk/tex-archive/help/Catalogue/entries/hrhyph.html)
  - [DONE] change script 4TL to generate shortdesc for hyph-xxx.tlpsrc
  - once the shortdesc is written, also make sure to put longdesc or a more extensive description for TeX Catalogue
  - should we release a CTAN-only "browsable" version of hyphenation patterns:
    - one folder per language
    - symbolic links to all the files for that language
    - a nice README for every language
      - lists the author, copyright, ...
      - says that the package is part of hyph-utf8 and redirects properly
    - catalogue could have a reference to that folder and contain all the data
    - discuss the idea with CTAN before doing extensive changes

OpenOffice:
- write an automatic notifier for changes in OOo (more important)
- check every single language on http://extensions.services.openoffice.org/en/dictionaries
- contact the authors & join the efforts
- [DONE] import Indic scripts

Improvements:
- prepare a better version of notes
- auto-conversion into four different files:
  - licence
  - list of used letters for each language (lowercase and uppercase equivalent); difference between letters & others
  - [DONE] list of patterns
  - [DONE] list of exceptions
- that list should be auto-generated from TeX sources and used for [complete a list]:
  - hyphenator
  - ...
- LuaTeX patterns
- [almost DONE] Describe and try to solve the apostrophe problem (users should be able
  to input both U+0027 and U+2019)
  See http://tug.org/pipermail/xetex/2010-October/018914.html ff. for
  one use case.

Documentation:
- authors, history, changes
- how to use patterns in plain TeX
- how to use patterns in LaTeX with polyglossia (or babel?)
- how to use patterns in ConTeXt
- how to add a new language (once we quit the project)

Testing:
- collect lists of words for the languages where that is possible (no copyright problems)
- write a luatex tester for a collection of words for each language

Licences:
- ... complete the list with tasks that need to be done
- we want two licences: one for content and one for packaging
- write a simple licence for packaging:
  - we don't care what others (outside of TeX world) do; just mention the credit to hyph-utf8
    and complete credit to pattern authors
  - respect the licence of pattern authors
  - within TeX world:
    - keep the functionality; implementation may change, but not the end result (unless improved)
    - contact the list first when changes are needed
    - if the team disappears, one should be free to continue the work (mention how that should be done)
    - some general guidelines:
      - no TeX commands (apart from \hyphenation and \patterns)
      - everything in Unicode
      - hyph-xx.tex will keep containing the patterns
      - loadhyph-xx.tex will keep loading the patterns, but may change considerably (ugly tricks should go there)
      - ... (complete the list)
    - language-specific:
      - the author of existing patterns has full right to submit changes
      - when a new author comes and the old one doesn't respond ... what to do?
      - when two versions of patterns appear, try to convince the authors to combine the patterns
        ... write some general guidelines
- collect a few ideas for valid licences (maybe a different licence f)
- write to the authors to ask whether they agree with that licence (give them a few options to choose from)
- write some really nice, easily parsable form and apply to all the patterns
- modify generating scripts for:
  - German
  - gl
  - eu, tk, tr

Example of a licence:

This work consists of two parts:
a) patterns for different languages
b) support files (tex files, generating scripts, documentation, ...)
with two different licences for each:
1) for within the TeX world
2) and outside the TeX world

A short version of licence:
b2) We don't care what you do, though it would be nice of you to mention the source of patterns (hyph-utf8 residing at http://tug.org/tex-hyphen) at some visible place.
b1)

a1)

Collaboration:
- make a list of people that we would want to collaborate with
- some notification system for those using our patterns as a source, for whenever something in the repository changes
- write DONE for everyone we have collaborated with

TeX users:
- [on the way] write DOC about pattern conversion
- [DONE] write an article for TUGboat and/or other TeX magazines
- [DONE] maybe present the main idea at some TUG conference
- notify mailing lists
- CTAN people should make sure that new patterns are submitted in the proper form

Web page:
- split the page into several subpages:
  - short intro & news
  - languages - list all available languages and add proper links
  - collaboration and links
  - description of algorithm (with Mathias' slides)
- nicer design
- apply Mathias' Hyphenator.js
- add a test page to do hyphenation on-the-fly using Mathias' code
  and creating some really nice output
  - a dropdown box chooses the language
  - text input field
  - button for submission (maybe we don't even need that)
  - sample output

Software:
- possibly rewrite patgen to handle UTF-8
  - [DONE] try to convince Mathias Nater to do that; Deadline: BachoTeX 2011 :)
  - try to convince him to finish the project

OpenOffice:
- http://extensions.services.openoffice.org/en/project/RomanianDictionaryPackCedillaVersion
- (Kurdish) http://extensions.services.openoffice.org/en/project/kitandin

- http://www.mail-archive.com/dev@lingucomponent.openoffice.org/msg01104.html


Hello Mojca,

On 16/03/2010 10:21, Mojca Miklavec wrote:
> Hello François,
>
> Not so long ago the author of Mongolian patterns asked us for
> "help about how to use his patterns in TeX Live 2008/09". (They are
> called "mongolian2a" in language.dat since there are two versions of
> Mongolian patterns in TeX Live - both different patterns and different
> encoding. I'm not sure that I know how to enable those patterns in
> LaTeX in some nice way.)
>
> First question: does polyglossia work ok with 8-bit engines (pdftex)
> as a replacement for babel?

No. It assumes Unicode encoding and is currently for XeLaTeX only.
Together with Elie Roux I plan to add support for LuaLaTeX in a later version (not very soon however).

> Second question: would you be willing to write a short introduction
> (in tex or maybe even better in html for http://tug.org/tex-hyphen/)
> about how to use some particular patterns in polyglossia (maybe we
> should start promoting polyglossia via our webpage, in particular if
> the answer to my first question was positive).

Sure.

The short answer is:

Since Babel's hyphen.cfg is built into the xelatex format, hyphenation patterns can be used without even loading polyglossia (or Babel for that matter). At the low level this simply corresponds to setting
   \language=\l@<langname>
(where langname is the string identifying a particular hyphenation file in language.dat). The user command for this is
   \hyphenrules{langname}
or
   \begin{hyphenrules}{langname} ... \end{hyphenrules}.

The above works with any flavour of LaTeX, and I think it is also available in Plain TeX with
   \hyphenrules{langname}
   ...
   \endhyphenrules
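
A minimal sketch of the hyphenrules environment described above, for trying it out in LaTeX. It makes two assumptions that are not part of the message: babel is loaded explicitly (so that the environment is available whatever the format provides), and "ngerman" is a name listed in the language.dat used to build the format, as it is in TeX Live. The test word is arbitrary.

```tex
% Sketch only: assumes "ngerman" is listed in language.dat
% (in TeX Live, for instance: "ngerman loadhyph-de-1996.tex").
\documentclass{article}
\usepackage[ngerman,english]{babel}% last option (english) is the main language
\begin{document}
% Break points under the main language's rules are written to the log:
\showhyphens{Silbentrennung}

% Switch only the hyphenation rules, not captions or dates:
\begin{hyphenrules}{ngerman}
\showhyphens{Silbentrennung}
\end{hyphenrules}
\end{document}
```

Comparing the two \showhyphens entries in the log file shows whether the pattern switch took effect.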

> In particular, we need to provide examples of usage in {plain TeX,
> LaTeX, XeTeX, XeLaTeX, ConTeXt, Lua(La)TeX} for different encodings. I
> can write something about ConTeXt, but I need help for different
> flavours of LaTeX.

Under "plain" pdftex, xetex, and luatex, one can use the macro \uselanguage{langname} for any language mentioned in language.def. It's that simple!
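
A minimal sketch of the \uselanguage macro mentioned above, for a plain document. It assumes a plain format built from etex.src (which is where \uselanguage is defined) and that "german" is one of the names listed in language.def; both the language name and the test word are only illustrative.

```tex
% Sketch only: requires a plain format built from etex.src;
% "german" is assumed to be listed in language.def.
\uselanguage{german}
% Write the break points found with the selected patterns to the log:
\showhyphens{Silbentrennung}
\bye
```

If the name is not known to the format, the run reports an error instead of switching languages, so this also serves as a quick check of which names language.def provides.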

Under polyglossia this is all done automatically, as in Babel, within the individual language definition files, depending on the options associated with each language. Example:

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{asturian}
\setotherlanguage[variant=usmax]{english}% American English with extended hyphenation patterns
\setotherlanguage[spelling=new,latesthyphen=true]{german}% German with experimental patterns "ngerman-x-latest"
\setotherlanguages{spanish,catalan,french}
\begin{document}
Long Asturian text ... (Hyphenation for Asturian is not available, but polyglossia automatically falls back on Catalan for now, which seems to be a reasonable choice.)

\begin{german}
Deutscher Text ... (with the hyphenation patterns selected above: "ngerman-x-latest")
\end{german}

% Options for a language environment go after the environment name.
\begin{german}[script=fraktur,spelling=old]
Deutſcher Text ... (set in Fraktur, with traditional hyphenation).
\end{german}

% etc.
\end{document}

Feel free to adapt and use the above to your liking!

Regards,
François
269
270Feel free to adapt and use the above to your liking!
271
272Regards,
273François
274