This is a list of things that need to be done to propagate the changes properly.
Mark each issue DONE once it is resolved.

Wishlist:
- separate language.dat files on per-engine level (a different language.dat for XeTeX than for pdfTeX)
- different syntax for loading multiple patterns? (Serbian)

Short-term TODO list:
- A web page with simple examples of how to enable hyphenation
- Documentation about hyphenation setup in TeX Live 2011, MiKTeX & W32TeX
- Documentation about conventions (naming guidelines, no-sequences-in-patterns, stable vs. non-stable filenames, ...) of hyph-utf8
- Put licence in the manual (and explain licences of others)
- Hans' examples for comparisons & documentation
- discuss switching to special patterns for pdfTeX
- better solutions for apostrophe



TeX Distributions:
- [DONE] propagate changes into TeX Live
- [DONE] coordinate with Christian Schenk <cs at miktex.org> for changes in MiKTeX
- [DONE] coordinate with Akira Kakuto <kakuto at fsci.fuk.kindai.ac.jp> for changes in W32TeX

Pattern creators:
- make a list of pattern authors, notify them about changes,
  explain to them that they need to modify the UTF-8 file and submit it as UTF-8 in future
- write DONE for each language once you receive the answer from the author

Languages:
- complete languages.html
- unification of Mongolian
  - encoding
  - pattern versions
  - [DONE] a list of differences received from Dorjgotov Batmunkh <bataak at gmail.com>
  - add that list to repository
  - waiting for answer from Oliver Corff <corff at zedat.fu-berlin.de>
- unification of Spanish (some new version of patterns submitted)
- describe differences between sh-cyrl and sr-cyrl
- [DONE] do something with EO: I did the conversion for "plain" patterns, but left the original version intact
  we could leave a separate version in "source" and use the "plain version" in pattern repository
- Russian etc.
  that are still outsourcing
  - solve encoding mess
- versioning: German etc.
- Norwegian
- Belarusian:
  - Contact the authors of hyphenation patterns on OOo:
    http://extensions.services.openoffice.org/en/project/dict-be-classic
    http://extensions.services.openoffice.org/en/project/dict-be-official
  - Contact the author of support files for LaTeX (Aleksey Novodvorsky):
    http://alt.linux.kiev.ua/srpm/tetex/patches/8
    http://tldp.org/HOWTO/pdf/Belarusian-HOWTO.pdf
  - Convert patterns into UTF-8
  - Babel and Polyglossia support
  - Which version has to be used? Classic or official?
  - Do something about encoding in 8-bit engines (possibly together with Vladimir Volovich)

CTAN:
- move old patterns to obsolete
- [DONE] upload new patterns
- make an easy-to-browse structure
- ask CTAN people to be careful if someone uploads new patterns
- write auto-notifier for changes in CTAN
- try to update the information in TeX Catalogue
  - http://www.tex.ac.uk/tex-archive/help/Catalogue/hier.html
  - some entries say that patterns are obsolete, but still claim that the package is included in TL/MiKTeX
    (http://www.tex.ac.uk/tex-archive/help/Catalogue/entries/hrhyph.html)
  - [DONE] change script 4TL to generate shortdesc for hyph-xxx.tlpsrc
    - once the shortdesc is written, also make sure to put longdesc or a more extensive description for TeX Catalogue
  - should we release a CTAN-only "browsable" version of hyphenation patterns:
    - one folder per language
    - symbolic links to all the files for that language
    - a nice README for every language
      - lists the author, copyright, ...
      - says that the package is part of hyph-utf8 and properly redirects
    - catalogue could have a reference to that folder and contain all the data
    - discuss the idea with CTAN before doing extensive changes

OpenOffice:
- write auto-notifier for changes in OOo (more important)
- check every single language on http://extensions.services.openoffice.org/en/dictionaries
- contact the authors & join the efforts
- [DONE] import indic scripts

Improvements:
- prepare a better version of notes
- auto-conversion into four different files:
  - licence
  - list of used letters for each language (lowercase and uppercase equivalent); difference between letters & others
  - [DONE] list of patterns
  - [DONE] list of exceptions
- that list should be auto-generated from TeX sources and used for [complete a list]:
  - hyphenator
  - ...
- LuaTeX patterns
- [almost DONE] Describe and try to solve the apostrophe problem (users should be able
  to input both U+0027 and U+2019)
  See http://tug.org/pipermail/xetex/2010-October/018914.html ff. for
  one use case.

Documentation:
- authors, history, changes
- how to use patterns in plain TeX
- how to use patterns in LaTeX with polyglossia (or babel?)
- how to use patterns in ConTeXt
- how to add a new language (once we quit the project)

Testing:
- collect lists of words for the languages where that is possible (no copyright problems)
- write a luatex tester for a collection of words for each language

Licences:
- ...
  complete the list with tasks that need to be done
- we want two licences: one for content and one for packaging
- write a simple licence for packaging:
  - we don't care what others (outside of TeX world) do; just mention the credit to hyph-utf8
    and complete credit to pattern authors
  - respect the licence of pattern authors
  - within TeX world:
    - keep the functionality; implementation may change, but not the end result (unless improved)
    - contact the list first when changes are needed
    - if team disappears, one should be free to continue the work (mention how that should be done)
    - some general guidelines:
      - no TeX commands (apart from \hyphenation and \patterns)
      - everything in Unicode
      - hyph-xx.tex will keep containing the patterns
      - loadhyph-xx.tex will keep loading the patterns, but may change considerably (ugly tricks should go there)
      - ... (complete the list)
    - language-specific:
      - author of existing patterns has full right to submit changes
      - when a new author comes and the old one doesn't respond ... what to do?
      - when two versions of patterns appear, try to convince the authors to combine the patterns
      ... write some general guidelines
- collect a few ideas for valid licences (maybe a different licence f)
- write to the authors to ask whether they agree with that licence (give them a few options to choose from)
- write some really nice, easily parsable form and apply to all the patterns
- modify generating scripts for:
  - German
  - gl
  - eu, tk, tr

Example of a licence:

This work consists of two parts:
a) patterns for different languages
b) support files (tex files, generating scripts, documentation, ...)
with two different licences for each:
1) for within the TeX world
2) and outside the TeX world

A short version of licence:
b2) We don't care what you do, though it would be nice of you to mention the source of patterns (hyph-utf8 residing at http://tug.org/tex-hyphen) at some visible place.
b1)

a1)

Collaboration:
- make a list of people that we would want to collaborate with
- some notification system for those using our patterns as source for whenever something in repository changes
- write DONE for everyone we have collaborated with

TeX users:
- [on the way] write DOC about pattern conversion
- [DONE] write an article for TUGboat and/or other TeX magazines
- [DONE] maybe present the main idea at some TUG conference
- notify mailing lists
- CTAN people should take care that new patterns are submitted in the proper form

Web page:
- split the page into several subpages:
  - short intro & news
  - languages - list all available languages and add proper links
  - collaboration and links
  - description of algorithm (with Mathias' slides)
- nicer design
- apply Mathias' Hyphenator.js
- add a test page to do hyphenation on-the-fly using Mathias' code
  and creating some really nice output
  - a dropdown box chooses the language
  - text input field
  - button for submission (maybe we don't even need that)
  - sample output

Software:
- possibly rewrite patgen to handle UTF-8
  - [DONE] try to convince Mathias Nater to do that; Deadline: BachoTeX 2011 :)
  - try to convince him to finish the project

OpenOffice:
- http://extensions.services.openoffice.org/en/project/RomanianDictionaryPackCedillaVersion
- (kurdish) http://extensions.services.openoffice.org/en/project/kitandin

- http://www.mail-archive.com/dev@lingucomponent.openoffice.org/msg01104.html



Hello Mojca,


On 16/03/2010 10:21, Mojca Miklavec wrote:
Hello François,

Not so long ago the author of Mongolian patterns asked us for
"help about how to use his patterns in TeX Live 2008/09". (They are
called "mongolian2a" in language.dat since there are two versions of
Mongolian patterns in TeX Live - both different patterns and different
encoding. I'm not sure that I know how to enable those patterns in
LaTeX in some nice way.)

First question: does polyglossia work ok with 8-bit engines (pdftex)
as a replacement for babel?


No. It assumes Unicode encoding and is currently for XeLaTeX only.
Together with Elie Roux I plan to add support for LuaLaTeX in a later version (not very soon however).

Second question: would you be willing to write a short introduction
(in tex or maybe even better in html for http://tug.org/tex-hyphen/)
about how to use some particular patterns in polyglossia (maybe we
should start promoting polyglossia via our webpage, in particular if
the answer to my first question was positive).


Sure.

The short answer is:

Since Babel's hyphen.cfg is built into the xelatex format, hyphenation patterns can be used without even loading polyglossia (or Babel for that matter). At the low level this simply corresponds to setting
    \language=\l@<langname>
(where langname is the string identifying a particular hyphenation file in language.dat). The user command for this is
    \hyphenrules{langname}
or
    \begin{hyphenrules}{langname} ... \end{hyphenrules}.

The above works with any flavour of LaTeX, and I think it is also available in Plain TeX with
    \hyphenrules{langname}
    ...
    \endhyphenrules



In particular, we need to provide examples of usage in {plain TeX,
LaTeX, XeTeX, XeLaTeX, ConTeXt, Lua(La)TeX} for different encodings. I
can write something about ConTeXt, but I need help for different
flavours of LaTeX.


Under "plain" pdftex, xetex, and luatex, one can use the macro \uselanguage{langname} for any language mentioned in language.def. It's that simple!

Under polyglossia this is all done automatically, as in Babel, within the individual language definition files, depending on the options associated with each language. Example:

\usepackage{polyglossia}
\setmainlanguage{asturian}
\setotherlanguage[variant=usmax]{english}% American English with extended hyphenation patterns
\setotherlanguage[spelling=new,latesthyphen=true]{german}% German with experimental patterns "ngerman-x-latest"
\setotherlanguages{spanish,catalan,french}
\begin{document}
Long Asturian text ... (Hyphenation for Asturian is not available, but polyglossia automatically falls back on Catalan for now, which seems to be a reasonable choice.)

\begin{german}
Deutscher Text ... (with the hyphenation patterns selected above: "ngerman-x-latest")
\end{german}

\begin{german}[script=fraktur,spelling=old]
Deutſcher Text ... (set in Fraktur, with traditional hyphenation).
\end{german}

etc.

Feel free to adapt and use the above to your liking!

Regards,
François