BABYL OPTIONS: -*- rmail -*-
Version: 5
Labels:
Note:   This is the header of an rmail file.
Note:   If you are seeing it in rmail,
Note:    it means the file has no messages in it.

1, edited,,
From: terrell@druhi.ATT.COM (TerrellE)
Newsgroups: comp.sys.ibm.pc,sci.astro
Subject: Internationalization of Software?
Date: 30 Jun 89 19:05:23 GMT
Reply-To: terrell@druhi.ATT.COM (TerrellE)
Organization: AT&T, Denver, CO

*** EOOH ***

I know that there are some modifications that I will have to perform to
"internationalize" software products developed for use in the USA.
These changes include the obvious (translate the program
and documentation into the right language).  However, some of the
other changes are more subtle.  I'm sure that I've overlooked some, but
here's what I have so far:

Necessary changes to "internationalize" a software product:

1. Flexible date format:

        dd/mm/yy
        yy/dd/mm
        yy/mm/dd
        mm/dd/yy

2. Handle foreign daylight savings time.

3. Flexible radix (decimal) point (i.e. '.' or ','):

        3.14159
        3,14159

4. Allow English or Metric units.

5. Use "one thousand million" rather than "one billion".

6. Flexible time format:

        hh:mm
        hh.mm
        hh'mm

7. Allow either ' ' or ',' for thousands delimiters:

        1,000,000
        1 000 000


What else is necessary?  Overseas users: what changes would you make
to your "US Version" software to make it appropriate for use in other
countries?

I'll post a summary of the results.
Thanks in advance,



Eric Terrell (att!druhi!terrell)

1,,
Xref: IRO.UMontreal.CA comp.std.c:13991 comp.software.international:607
Path: IRO.UMontreal.CA!CC.UMontreal.CA!newsflash.concordia.ca!utcsri!utnut!cs.utexas.edu!howland.reston.ans.net!nctuccca.edu.tw!news.cc.nctu.edu.tw!mall!ywliu
From: ywliu@beta.wsl.sinica.edu.tw ()
Newsgroups: comp.std.c,comp.software.international
Subject: Re: ANSI C Locale Character Sets
Followup-To: comp.std.c,comp.software.international
Date: 3 Oct 1994 06:39:25 GMT
Organization: Computing Center, Academia Sinica
Lines: 26
Message-ID: <36o8ut$afu@mall.sinica.edu.tw>
References: <Cx0Mpy.7Lo@actrix.gen.nz>
NNTP-Posting-Host: ywliu%@beta.wsl.sinica.edu.tw
X-Newsreader: TIN [version 1.2 PL0]

*** EOOH ***

Gary Houston (ghouston@actrix.gen.nz) wrote:
: It seems to me there are a couple of details missing from the ANSI C
: locale stuff:

: 1/ How can a program find out which character set is being used?


  You may use setlocale(LC_ALL,NULL) to get the language info.

: 2/ How can a program determine whether text files use multibyte or
:    wide characters, or is it to be assumed that multibyte will
:    always be used?

  As far as I am concerned, the wide character is used as the representation
inside your program. That is, wide character is your internal data
representation form, while I/O operates on multibyte characters. So, I always
read multibyte characters and convert them to wide characters, and convert
back to multibyte when writing.
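
The convention described above (multibyte text at the I/O boundary, wide
characters inside the program) can be sketched in standard C as follows.
This is a minimal illustration, not code from the thread: the helper names
mb_to_wide and wide_to_mb are hypothetical, and the sketch assumes the
program has already called setlocale(LC_CTYPE, "") at startup so that
mbstowcs and wcstombs follow the user's encoding.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string (as read from a file)
 * to a newly allocated wide string.  Returns NULL on an invalid
 * sequence or on allocation failure.  The caller frees the result. */
wchar_t *mb_to_wide(const char *mb)
{
    /* A wide string never has more characters than the multibyte
     * string has bytes, so strlen(mb) + 1 elements always suffice. */
    size_t cap = strlen(mb) + 1;
    wchar_t *w = malloc(cap * sizeof *w);
    if (w == NULL)
        return NULL;
    if (mbstowcs(w, mb, cap) == (size_t)-1) {
        free(w);
        return NULL;
    }
    return w;
}

/* Hypothetical helper: convert a wide string back to multibyte form
 * before writing it out.  Returns NULL on failure. */
char *wide_to_mb(const wchar_t *w)
{
    /* Each wide character needs at most MB_CUR_MAX bytes. */
    size_t cap = wcslen(w) * MB_CUR_MAX + 1;
    char *mb = malloc(cap);
    if (mb == NULL)
        return NULL;
    if (wcstombs(mb, w, cap) == (size_t)-1) {
        free(mb);
        return NULL;
    }
    return mb;
}
```

The buffers are sized from worst-case bounds rather than by a counting
pass, which keeps the sketch within what ISO C guarantees.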

: Does anyone know of other standards/conventions/plans which fill
: in this missing information?

  You may check out P.J. Plauger's "Standard C" column in CUJ, May 1993 - July
1993. There is another one, "Internationalization and Localization", in CUJ July
1993 too. I am looking for more material.

Yen-Wei Liu

1, edited, answered,,
Mail-from: From orac.iinet.com.au!pdcruze Thu Nov 24 17:38:19 1994
Return-Path: <orac.iinet.com.au!pdcruze>
Received: by icule (Smail3.1.28.1 #1)
        id m0rAmnw-00009aC; Thu, 24 Nov 94 17:38 EST
Received: from lagrande.iro.umontreal.ca by iros1.IRO.UMontreal.CA (8.6.9) with ESMTP
        id LAA06293; Thu, 24 Nov 1994 11:57:58 -0500
Received: from saguenay.IRO.UMontreal.CA (root@saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id LAA23939 for <pinard@lagrande.IRO.UMontreal.CA>; Thu, 24 Nov 1994 11:57:50 -0500
Received: from uniwa.uwa.edu.au (root@uniwa.uwa.edu.au [130.95.128.1]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id LAA20957 for <pinard@IRO.UMontreal.CA>; Thu, 24 Nov 1994 11:57:46 -0500
Received: from orac.iinet.com.au (orac.iinet.com.au [203.0.178.134]) by uniwa.uwa.edu.au (8.6.9/8.6.9) with ESMTP id AAA09394; Fri, 25 Nov 1994 00:57:29 +0800
Received: from orac.iinet.com.au (pdcruze@localhost [127.0.0.1]) by orac.iinet.com.au (8.6.9/8.6.9) with ESMTP id AAA08605; Fri, 25 Nov 1994 00:57:11 +0800
Message-Id: <199411241657.AAA08605@orac.iinet.com.au>
To: pinard@IRO.UMontreal.CA
cc: meyering@comco.com
Subject: Re: Starting localization of GNU recode
In-reply-to: Your message of "Thu, 24 Nov 1994 01:11:00 EST."
             <m0rAXP2-00008sC@icule>
Date: Fri, 25 Nov 1994 00:57:10 +0800
From: "Patrick D'Cruze" <pdcruze@li.org>

*** EOOH ***

> I met a few points of discussion while doing so:
>
> * I got to decide that, even if the program will eventually make
>   most of its output in the foreign languages, the input syntax,
>   option values, etc., are not to be localized.

Yes.  The purpose of message catalogs was to provide an easy to use method
for displaying language independent messages.  Hence few modifications
need to be made to support this.  However, no easy method exists for
supporting language-independent inputs.  So this will have to be left up to
the developer to decide how they are going to implement this.

> * it is not useful that I modify the lib/ routines if not done in the
>   true sources.  How do you/I/they proceed for getting this job done?
>   I presume that lib/ routines will all use gettext for the time being.

Probably Roland (or another volunteer) will internationalize glibc.  Linux's
libc has already been internationalised and a few message catalogs
already exist - French, German, Polish.  It probably would be useful
modifying the routines in lib/ for those platforms that will be using
the routines located in libc/.

> I was expecting a problem which I did not meet.  All localizable
> strings were luckily in executable positions, that is, assigned
> to variables or given as parameters to functions.  But I will not
> escape this problem in all my things, and will surely hit some
> localizable strings in structured initializations.
> I'll see once
> there, unless you have already thought out a solution for this (?).

I've come across this a few times within diffutils.  Particularly struct
definitions and the like.  I'll send you a list of guidelines to follow when
looking for output messages.  Will send this to you and Jim tomorrow.

Regards,
Patrick



1, edited,,
Mail-from: From pinard Mon Nov 28 12:15:47 1994
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rC9fz-00008uC; Mon, 28 Nov 94 12:15 EST
Message-Id: <m0rC9fz-00008uC@icule>
Date: Mon, 28 Nov 94 12:15 EST
From: pinard (Francois Pinard)
To: Richard M. Stallman <rms@prep.ai.mit.edu>
CC: Jim Meyering <meyering@comco.com>
Subject: GNU standards and localized message catalogs
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

*** EOOH ***

* We also need a uniform convention about where, in the installed
hierarchy, to put translations of manuals in the long term.  The need is
not immediate.  One friend volunteered to translate the GNU recode
manual into French.  If this happens, I would like to know first *if*
the distribution should install it by default, and where it should
install it then.  If not installed by default, what would be the
uniform naming scheme for Makefile goals installing documents?

1, edited,,
Mail-from: From pinard Sat Dec 24 23:51:00 1994
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rLkv4-00009AC; Sat, 24 Dec 94 23:50 EST
Message-Id: <m0rLkv4-00009AC@icule>
Date: Sat, 24 Dec 94 23:50 EST
From: pinard (Francois Pinard)
To: rms@gnu.ai.mit.edu
In-reply-to: <199412250445.XAA25324@mole.gnu.ai.mit.edu> (message from Richard Stallman on Sat, 24 Dec 1994 23:45:19 -0500)
Subject: Re: GNU standards and localized message catalogs
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***

       * We also need a uniform convention about where, in the installed
       hierarchy, to put translations of manuals in the long term.

   I think they should go in the Info tree just like English manuals.

Yes, of course.  Suppose I have a French recode.info, and an
English one.  This kind of thing will not be immediate, but they
will come.  We need some convention to install both.  We are not
to give them different names, presumably.  People will like to
say, on an individual basis: ``if a French version of something is
available, I'll prefer it over the standard English one''.  So we
need a convention to store these, and a convention to select them.

1,,
Mail-from: From gnu.ai.mit.edu!rms Sun Dec 25 05:16:06 1994
Return-Path: <gnu.ai.mit.edu!rms>
Received: by icule (Smail3.1.28.1 #1)
        id m0rLpze-00009IC; Sun, 25 Dec 94 05:16 EST
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id AAA12366 for <icule!pinard>; Sun, 25 Dec 1994 00:01:47 -0500
Received: from saguenay.IRO.UMontreal.CA (root@saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id AAA10584 for <pinard@lagrande.IRO.UMontreal.CA>; Sun, 25 Dec 1994 00:01:46 -0500
Received: from mole.gnu.ai.mit.edu (rms@mole.gnu.ai.mit.edu [128.52.46.33]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id AAA14869 for <pinard@iro.umontreal.ca>; Sun, 25 Dec 1994 00:01:37 -0500
Received: by mole.gnu.ai.mit.edu (8.6.9/4.0)
        id <AAA25411@mole.gnu.ai.mit.edu>; Sun, 25 Dec 1994 00:01:33 -0500
Date: Sun, 25 Dec 1994 00:01:33 -0500
Message-Id: <199412250501.AAA25411@mole.gnu.ai.mit.edu>
From: Richard Stallman <rms@gnu.ai.mit.edu>
To: pinard@iro.umontreal.ca
In-reply-to: <m0rLkv4-00009AC@icule> (pinard@iro.umontreal.ca)
Subject: Re: GNU standards and localized message catalogs

*** EOOH ***

    We need some convention to install both.  We are not
    to give them different names, presumably.

I would give them different names.  They would have
separate menu items in the Info directory.  That is the
easiest way and it seems good enough, so I don't see a reason
to spend time looking for any other way.


1, edited,,
Mail-from: From pinard Tue Jan  3 16:17:29 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rPGbe-00008xC; Tue, 3 Jan 95 16:17 EST
Message-Id: <m0rPGbe-00008xC@icule>
Date: Tue, 3 Jan 95 16:17 EST
From: pinard (Francois Pinard)
To: vern@ee.lbl.gov
In-reply-to: <199501031914.LAA00333@daffy.ee.lbl.gov> (message from Vern Paxson on Tue, 03 Jan 95 11:14:17 PST)
Subject: Re: Internationalization of Flex
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***

There are two categories of patches: a grouped set at initialization
time, and an all-over-the-place one which marks localizable strings.
We can consider them separately (but I will most probably end up
suggesting we give them the same treatment...).

It would be easier if the original Flex sources already
marked all strings which require localization.  The way I do it in my
things is merely replacing each "STRING" by _("STRING") *if* STRING
should be translated.  Flex could then be distributed with:

   #define _(String) (String)

effectively ignoring the marks.  I may provide an initial patch
to you for this.  Later on, the maintenance would be relatively
easy for you: if you add or modify a string, you will have to
ask yourself if the new or altered string requires translation,
and include it within _() if you think it should be translated.
"%s: %d" is an example of a string not requiring translation...
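
The marking scheme described above could look like this in a source file.
It is only a sketch: the ENABLE_NLS switch, the two report functions and
the message text are hypothetical, chosen for illustration, and gettext()
from libintl.h is just one lookup function the macro could map to when
localization is compiled in.

```c
#include <stdio.h>

#ifdef ENABLE_NLS                   /* hypothetical build-time switch */
# include <libintl.h>
# define _(String) gettext (String)
#else
# define _(String) (String)         /* marks are ignored at no run-time cost */
#endif

/* Format a status line into buf; returns what snprintf returns.
 * The format string contains words, so it is marked with _(). */
int report_count(char *buf, size_t size, const char *name, int count)
{
    return snprintf(buf, size, _("%s: %d rules processed"), name, count);
}

/* A bare "%s: %d" carries nothing to translate and stays unmarked. */
int report_pair(char *buf, size_t size, const char *key, int value)
{
    return snprintf(buf, size, "%s: %d", key, value);
}
```

With ENABLE_NLS undefined, the preprocessor reduces _("...") to the plain
literal, so the program behaves exactly as if the marks were absent.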

The remaining work will be handled by a group of volunteers from
different countries.  I took the responsibility of organizing how
these things will be done.  Once in a while, volunteers will provide
you some COUNTRY.tt files which you might accept to distribute
with Flex.  (COUNTRY is a two letter code, like `de' for German.)
If the COUNTRY.tt files ever lag with regard to Flex modifications,
this would not break nationalized Flex: the current mechanics will
merely return the original English string if a proper translation
cannot be found.  So you do not even have to feel tied to the
translators for releasing new distributions of Flex.  And nothing
is subject to the GPL so far :-).

The initialization is not very complex, and can be done within
less than a dozen easy lines of code, hardly GPL'able.  I think
they could be included in the standard Flex distribution, while being
conditionalized out.  The only harder modifications come from me,
and touch Makefile.in, for including all the machinery to prepare
and install locale message catalogs provided the underlying system
has what is needed.  In the way I am now distributing my things, this
machinery automatically cuts itself out when GNU locale is not usable.

Only two modules remain, currently named libintl.h and libintl.c
(this might change), which are covered by the GPL, and which you
do not want to distribute with Flex.  The Flex README could
suggest that installers grab them from any other GNU distribution.
The configuration machinery might automatically check if they have
been copied by the installer and, if not, forget about localization.

This way, Flex will be easily and widely nationalized, the GPL
principle will be safe, Flex will stay free of the GPL, and the
burden on the installers, as well as both you and me, will be
minimal in the long run.
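
The "less than a dozen easy lines" of initialization mentioned above might
look roughly like the following.  This is a guess at the shape, not the
actual patch: it assumes a gettext-style interface (bindtextdomain and
textdomain from libintl.h), and the ENABLE_NLS switch, the init_i18n name
and the directory argument are hypothetical.

```c
#include <locale.h>

#ifdef ENABLE_NLS                   /* hypothetical build-time switch */
# include <libintl.h>
#endif

/* Called once at program startup, before any message is printed. */
void init_i18n(const char *package, const char *localedir)
{
    /* Adopt the locale named by the user's environment. */
    setlocale(LC_ALL, "");
#ifdef ENABLE_NLS
    /* Tell the library where catalogs live and which one to use. */
    bindtextdomain(package, localedir);
    textdomain(package);
#else
    /* Localization conditionalized out: arguments are unused. */
    (void) package;
    (void) localedir;
#endif
}
```

A call such as init_i18n("flex", "/usr/local/share/locale") at the top of
main would be the only change needed outside the string marks; the path
here is only an example.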

There is a difficulty I have not studied yet, and which comes from
the fact that Flex generates C code (Bison has the same problem).
Flex itself could be nationalized, and this is orthogonal to the fact
Flex could generate nationalizable scanners.  Both are desirable.


1, edited,,
Mail-from: From pinard Thu Jan 12 07:41:07 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rSOpt-00007LC; Thu, 12 Jan 95 07:41 EST
Message-Id: <m0rSOpt-00007LC@icule>
Date: Thu, 12 Jan 95 07:41 EST
From: pinard (Francois Pinard)
To: vern@ee.lbl.gov
In-reply-to: <199501051930.LAA04658@daffy.ee.lbl.gov> (message from Vern Paxson on Thu, 05 Jan 95 11:30:54 PST)
Subject: Re: Internationalization of Flex
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***

Besides, not long after having started this i18n effort for my
own things, I realized that the i18n attribute should really be
attached to strings themselves, and not to what we do with them.
A blatant example is an error message produced by formatting.
The format string needs i18n, while the result from sprintf may
have so many different instances that it is impractical to list
them all in some error_string_out() routine.  I also got other
cases forcing me to concentrate on strings for i18n.

There is a stylistic issue here.  I use _("hello"), adding three
characters to each localizable string, while you will most probably
use _( "hello" ), adding five characters per localizable string.
Yet, it has the advantage of being shorter than error_string_out,
and of being done at the right level.

By merely defining _(String) to be (String), you just turn off
localization in standard flex, with not a single nanosecond spent
on it.  But this will then allow me to produce a much smaller and
more maintainable patch for i18n of flex.

      This [error_string_out()] routine could then look up every string
      passed it in a translation table that's compiled into flex
      like the skel[] array.  All that's needed is a public-domain
      description of the format of the COUNTRY.tt translation files
      and the rest should be easy.

If I clearly understand your idea, you will compile a French table
into flex, and obtain a French speaking binary.  You will
produce different binaries for Catalan, Dutch, etc.  That is not
practical on big sites having multinational users.

Right now in my things, the setting of LANG in the environment
decides the language to use, and there is a single binary to handle
all things.  Further, the evolving GNU locale will soon change its
*.tt file format, and will try to use the underlying system's own
localization mechanics, if any good one is found at configure time.

There is no need for you to redo all this and throw new solutions at
this whole set of problems.  The most workable solution, to me, is
that the standard flex distribution already has all _() included -- and
that you accept routinely adding _() to new localizable strings when
you are doing flex maintenance, and that a separately distributed
patch attaches flex to GNU locale complexities, without you having to
discover and solve them anew.

      Let me know if this is workable (I'm willing to do the work).

Let me take one hour this morning to offer you a patch for _() for
2.5.0.6, hoping that you will accept it.  That would be a start.
Let me take care of the remaining organizational problems, synchronizing
with other teams, etc.  I already do this for other GNU packages
and will eventually help with most of them (I've accepted that role).

Once we have had success with i18ned flex for some time, it
would then become easier to convince you to go further for other
aspects (like *producing* i18nable scanners :-).

Let me hope that my pleading for the cause will touch your heart,
somewhere :-).  Keep happy!

--
François Pinard         ``Happy GNU Year!''         pinard@iro.umontreal.ca
A New Year's gift?  Give us Programming Freedom!  Write lpf@uunet.uu.net


1, edited,,
Mail-from: From pinard Thu Jan 12 16:44:56 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rSXKA-00007VC; Thu, 12 Jan 95 16:44 EST
Message-Id: <m0rSXKA-00007VC@icule>
Date: Thu, 12 Jan 95 16:44 EST
From: pinard (Francois Pinard)
To: vern@ee.lbl.gov
In-reply-to: <199501121822.KAA21713@daffy.ee.lbl.gov> (message from Vern Paxson on Thu, 12 Jan 95 10:22:40 PST)
Subject: Re: Internationalization of Flex
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***

      I'm not sure having to remember to use error_string_out() instead
      of fprintf( stderr ... ) is any easier, though.

Not only error strings are being made localizable by the patch I
shipped this morning, but also statistics, version and help, and
some debug output.  These are not always error messages, and not
always sent to stderr.

Sometimes in flex, messages are constructed in pieces using %s to
insert parts.  Translating at the string level is the right approach
in these situations.  I'm not sure error_string_out() would have been
satisfying (but I'm not going to argue, since I have your favor! :-)

1, edited, answered,,
Mail-from: From twinsun.com!eggert Tue Feb 14 05:16:50 1995
Path: bloom-beacon.mit.edu!senator-bedfellow.mit.edu!faqserv
From: mike@vlsivie.tuwien.ac.at
Newsgroups: comp.unix.questions,comp.std.internat,comp.software.international,comp.lang.c,comp.windows.x,comp.std.c,comp.answers,news.answers
Subject: Programming for Internationalization FAQ
Supersedes: <internationalization/programming-faq_787570857@rtfm.mit.edu>
Followup-To: comp.unix.questions,comp.std.internat,comp.software.international,comp.lang.c,comp.windows.x,comp.std.c
Date: 15 Jan 1995 10:26:57 GMT
Organization: TU Wien
Lines: 564
Approved: news-answers-request@MIT.EDU
Expires: 28 Feb 1995 10:26:07 GMT
Message-ID: <internationalization/programming-faq_790165567@rtfm.mit.edu>
NNTP-Posting-Host: bloom-picayune.mit.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Summary: This FAQ discusses writing programs which can handle
        different language conventions/character sets/etc.
        Applicable to all character set encodings; with particular
        emphasis on ISO-8859-1.
X-Last-Updated: 1994/11/15
Originator: faqserv@bloom-picayune.MIT.EDU
Xref: bloom-beacon.mit.edu comp.unix.questions:38263 comp.std.internat:2069 comp.software.international:1289 comp.lang.c:65751 comp.windows.x:34580 comp.std.c:7917 comp.answers:9514 news.answers:33146

*** EOOH ***


Archive-name: internationalization/programming-faq
Posting-Frequency: monthly



                Programming for Internationalization



DISCLAIMER: THE AUTHOR MAKES NO WARRANTY OF ANY KIND WITH REGARD TO
THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

Note: Most of this was tested on a Sun 10, running SunOS 4.1.* - other
systems might differ slightly.

This FAQ discusses topics related to the use of ISO 8859-1 based 8 bit
character sets.
It discusses how to program applications which support
the use of European (and Latin American) national character sets on
UNIX-based systems and standard C environments.



1.  Which coding should I use for accented characters?
Use the internationally standardized ISO-8859-1 character set to type
accented characters.  This character set contains all characters
necessary to type (West) European languages.  This encoding is also the
preferred encoding on the Internet (where accepted - see below).

This character set is also used by AmigaDOS, MS-Windows (Code Page
1252 in Microsoft speak; this is for Windows versions delivered in
the US, Europe (except Eastern Europe) and Latin America; in Windows
3.1 Microsoft added additional characters in the 0x80-0x9F range),
VMS (DEC MCS is a draft version of the current ISO 8859-1 character
set standard and differs in only two characters) and (most) UNIX
implementations.  MS-DOS uses a different character set and is not
compatible with this character set.

ISO 8859-X actually is a family of character set standards.  Basically
all of the information given here is also valid for these standards.
These standards comprise 8859-X:
8859-1   Europe, Latin America
8859-2   Eastern Europe
8859-3   SE Europe/miscellaneous (Esperanto, Maltese, etc.)
8859-4   Scandinavia/Baltic (mostly covered by 8859-1 also)
8859-5   Cyrillic
8859-6   Arabic
8859-7   Greek
8859-8   Hebrew
8859-9   Latin5, same as 8859-1 except for Turkish instead of Icelandic
8859-10  Latin6, for Eskimo/Scandinavian languages

Another nascent standard is UNICODE (ISO 10646).  UNICODE is an
extension of ISO 8859-1 (which itself is an extension of US-ASCII) to
16 bit characters.  Thus most of the world's languages (including
Japanese, Korean, Chinese...) can be covered.

Most of the information given here is independent of the character
encoding used (e.g.
DEC MCS, etc.), and can be applied to any
character set, provided the programming environment has provisions
for this standard.



2.  Getting your environment right
To configure your environment such that you can enter, process and
display 8 bit ISO characters, check out the ISO-8859-1 FAQ available
via anonymous ftp from ftp.vlsivie.tuwien.ac.at in
/pub/8bit/FAQ-ISO-8859-1.



3.  Setting your environment for ISO-C (ANSI-C) programs
The ISO C Standard (ANSI C Standard 4.4) defines several functions for
supporting localization.  To set your international environment on
program startup, you should make one or several calls to the setlocale
function.  Calls to this function will predetermine the reaction of
other localization functions according to your language/country
environment.

To configure a particular aspect of your environment, say the number
representation, you would call
--
setlocale (LC_NUMERIC, "Germany");
--

This call would set all number representation functions defined in the
localization set to return numbers in the format used in Germany.  If
the call was successful, setlocale will return the name of your
locale.  A NULL return value indicates failure.  Note that the
environments are predetermined outside your C program by the system
you run on.  (So the example given here is likely to fail on all but a
few systems.)  Check the setlocale manual page or your system
documentation to find out about the environments available.

There are several locale categories available for different localization
aspects (currency sign, number representation, character sets).  The
values they can take are highly system dependent.  Also, it should be up
to the user to define the local environment he needs.

A C program inherits its locale environment variables when it starts up.
This happens automatically.
However, these variables do not
automatically control the locale used by the library functions, because
ISO/ANSI C says that all programs start by default in the standard C
locale.  To use the locales specified by the environment, the POSIX
standard defines the following call:
-----
setlocale (LC_ALL, "");
-----

Of course, you can also set only part of your environment, by calling, say:
----
setlocale (LC_CTYPE, "");
----
This only affects the character classification functions (declared in
ctype.h).

This is a list of locale categories:

                  Effect of Specifying           Environment Variable
     category     the Value                      Affected
     ________________________________________________________________

     LC_ALL       Sets or queries                LANG
                  entire environment
     LC_COLLATE   Changes or queries             LC_COLLATE
                  collation sequences
     LC_CTYPE     Changes or queries             LC_CTYPE
                  character classification
     LC_NUMERIC   Changes or queries             LC_NUMERIC
                  number format information
     LC_TIME      Changes or queries             LC_TIME
                  time conversion parameters
     LC_MONETARY  Changes or queries             LC_MONETARY
                  monetary information




4.  Using the locale information for character classification
If you write a program which supports international use, you should
use the available standardized functions, as only these will be
influenced by the setlocale call.  Thus, if you want to convert a
capital letter in c to a lower case letter in l, _don't_ write:

l = c - 'A' + 'a';

While this will work for characters in the US-ASCII character set, it
will not work with many other character sets.  The following,
standard-conformant code will:

#include <ctype.h>

....

l = tolower((unsigned char) c);

(The cast matters for 8 bit text: tolower expects a value in the range
of unsigned char or EOF, and a plain char holding an accented character
may be negative.)

Also note that the second version may actually be faster (on most
implementations), even with full locale functionality, than a
hand-written conversion that is correct only in the "C" locale, as it
replaces a conditional expression ( (c<='Z' && c>='A')?
c-'A'+'a' : c ) by a simple
table lookup!

Note that this ISO standard is independent of the character set
encoding used!



5.  Language independent messages
There are two competing mechanisms for language independent messages:
one by X/Open (catgets), and another one originating at Sun (gettext).
POSIX considered both but standardized neither (see 5.3).  The X/Open
mechanism seems to have found a larger following as it has been around
for a longer time.

5.1 X/Open language independent messages
X/Open defines a method for providing language-independent messages.
Error messages are kept in a catalog which is opened upon program
start with a locale specification.  Then the message number and a set
specification are used to index the message catalog.  A default fourth
argument is specified which will be printed if a particular message
cannot be found in the catalog.

Here is the world-famous C program using the language-independent
X/Open message standard:
--------------------------------------------------------------------------
#include <stdio.h>
#include <libgen.h>
#include <locale.h>
#include <nl_types.h>

#define SET 1
#define MSG_HELLO 1

nl_catd catfd;

int main (int argc, char **argv) {
        /* Adopt the locale from the environment; with NL_CAT_LOCALE,
         * catopen consults the current LC_MESSAGES setting.
         */
        setlocale (LC_ALL, "");
        /* Open the message catalog. We use the basename of the program
         * as the catalog name. Of course, several programs can also
         * share a common catalog.
         */
        catfd = catopen (basename (argv [0]), NL_CAT_LOCALE);
        /* catgets returns message MSG_HELLO from set SET from the
         * message catalog catfd. If catfd does not refer to a message
         * catalog, or the requested message cannot be found, the
         * fourth argument is returned.
758 */ 759 printf (catgets (catfd, SET, MSG_HELLO, "hello, world\n")); 760 catclose (catfd); 761 return 0; 762} 763------------------------------------------------------------------------- 764 765For catopen, specify the constant NL_CAT_LOCALE to open the message 766catalog for the locale set for the LC_MESSAGES variable; using 767NL_CAT_LOCALE conforms to the XPG4 standard. You can specify 0 (zero) 768for compatibility with XPG3; when oflag is set to zero, the locale set 769for the LANG variable determines the message catalog locale. 770 771Several utilities exist for generating message catalogs and for 772upgrading programs which contain hard-wired strings: 773* gencat is used to generate message catalogs 774[All other programs are OS-specific:] 775* Ultrix and OSF support the extract program which will extract string 776 constants from the C source code, and has an option to replace these 777 strings with calls to catgets. 778* HP/UX has a similar utility called findmsg. 779* Under OSF, message catalogs may be listed with the dspcat utility. 780* HP/UX calls a similar utility dumpmsg. 
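
The catalog lookup described in 5.1 can be wrapped so that each
message number and its built-in English text live in one table.  The
following is only a sketch under stated assumptions: the names
messages_open, message, messages_close, MSG_SET, MSG_HELLO and MSG_BYE,
and the catalog name passed by the caller, are made up for
illustration and are not part of X/Open.  Only setlocale, catopen,
catgets and catclose are standard calls.

```c
#include <locale.h>
#include <nl_types.h>
#include <stddef.h>

/* Hypothetical numbering: one message set holding two messages. */
enum { MSG_SET = 1 };
enum { MSG_HELLO = 1, MSG_BYE = 2, MSG_COUNT };

/* Built-in English texts, indexed by message number.  Entry 0 is
 * unused because catalog message numbers start at 1. */
static const char *const default_text[MSG_COUNT] = {
    "",
    "hello, world\n",
    "goodbye\n",
};

/* (nl_catd) -1 is the value catopen returns on failure. */
static nl_catd catalog = (nl_catd) -1;

/* Adopt the locale from the environment, then try to open the catalog. */
void messages_open(const char *name)
{
    setlocale(LC_ALL, "");
    catalog = catopen(name, NL_CAT_LOCALE);
}

/* Return the translated text, or the built-in text when no catalog is
 * open or the catalog lacks the message. */
const char *message(int number)
{
    if (number <= 0 || number >= MSG_COUNT)
        return "";
    if (catalog == (nl_catd) -1)
        return default_text[number];
    return catgets(catalog, MSG_SET, number, default_text[number]);
}

/* Close the catalog if one was opened. */
void messages_close(void)
{
    if (catalog != (nl_catd) -1) {
        catclose(catalog);
        catalog = (nl_catd) -1;
    }
}
```

A program would call messages_open once at startup with its catalog
name, print with fputs (message (MSG_HELLO), stdout), and call
messages_close before exiting.  Checking for (nl_catd) -1 before
calling catgets is a defensive choice, in case a library does not
handle a failed catalog descriptor gracefully.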


5.2 Sun/XView
Sun implements a different set of functions to support i18n
of messages (the source is available with the XView code):

You can either use:
-----------------------------------------------
#include <stdio.h>
#include <locale.h>
#include <libintl.h>

int main (void)
{
  /* use the locale named by the environment */
  setlocale (LC_ALL, "");
  /* get the message catalog named "helloprogram"
     for the hello world program */
  textdomain ("helloprogram");

  /* get the translation for the "Hello, world\n" string */
  printf ("%s", gettext ("Hello, world\n"));
  return 0;
}
-----------------------------------------------

or you can roll all in one and write

-----------------------------------------------
#include <stdio.h>
#include <locale.h>
#include <libintl.h>

int main (void)
{
  setlocale (LC_ALL, "");
  /* get the translation for the "Hello, world\n" string
     from the message catalog "helloprogram" */
  printf ("%s", dgettext ("helloprogram", "Hello, world\n"));
  return 0;
}
-----------------------------------------------

The LC_MESSAGES locale category setting determines the locale of
strings that gettext() returns.  The message catalogs are generated
with either the installtxt or gencat commands.

Unlike the older System V and X/Open routines, there is no catalog to
open explicitly, and there are no message numbers that you must keep
in a database to administer.


5.3 POSIX language independent messages
Neither of the previous two mechanisms is in the POSIX standard.
There was much disagreement in the POSIX.1 committee about using the
gettext routines vs. catgets (XPG).  In the end the committee couldn't
agree on anything, so no messaging system was included as part of the
standard.  I believe the informative annex of the standard includes the
XPG3 messaging interfaces, "...as an example of a messaging system
that has been implemented..."

They were very careful not to say anywhere that you should use one set
of interfaces over the other.



6. Other localization aspects in ISO/ANSI C (and POSIX environments)
For a more thorough discussion of localization and
internationalization (a.k.a. i18n), check your system vendor's
documentation, and the C library manual which comes with the FSF's
glibc library (Chapter 19, 'Locales and Internationalization').



7. Internationalization under X11
7.1 Output
To output text encoded with ISO 8859-1 under X11, simply invoke the X
display routines with 8 bit characters as you would use them with
7-bit ASCII.  You should however choose a font which contains bitmaps
for these characters.  You can use the xfd utility to display a font
to verify that it contains a full set of characters.


7.2 Input
If you use a national keyboard (that is, a keyboard which has distinct
keys for your country's special characters), inputting accented
characters is straightforward and you'll get the corresponding
characters by using the X11 input functions.

Sometimes it may be necessary to input characters for which there are
no keys on your keyboard (e.g. if you want to enter the German 'ß'
from a French keyboard).

X11R5 and X11R6 both have extensive support for i18n, but due to a
variety of factors the R5 i18n was not well understood or widely
used.  Many people resorted to a work-around and might have been
disappointed when R6 did not include this misfeature.  It is important
to recognize that the correct use of R5 and R6 i18n features will
ensure maximum portability of your program.

Footnote: Amongst other reasons, the X Consortium's decision not to add
support for input methods to the Xaw Athena widget contributes to this
situation.  Many users (and much of the PD software) live in an
Xaw-only world, so they will not be able to benefit from this i18n
effort.

X11 R5 and R6 support input methods for entering non-ASCII characters,
and for displaying and configuring text, menus etc., for a wide variety
of languages.  This input method has to be installed by the application
by calls to the Xlib library (or an Xt toolkit call).

[Under X11R5, some X servers (notably the Xsun server) will let you
enter ISO characters by supplying a built-in escape mechanism, if no
keys for these characters are on your keyboard, and will pass along
and display ISO 8859-1.  This hack obviated the need to install an
input method, but was less flexible.]


If you are using a toolkit -- Xt and a widget set like Motif or R6 Xaw --
it is quite simple to support localization of your X11 code: you need
only add a single line of code to your source.  Before any other
calls to Xt, add a call to XtSetLanguageProc, e.g.:

    int main (int argc, char** argv)
    {
        ...
        XtSetLanguageProc (NULL, NULL, NULL);
        top = XtAppInitialize ( ... );
        ...
    }

The LANG and LC_xxx environment variables (see section 3) will then be
used to determine the 'input method' for this X application.  This
input method is responsible for managing COMPOSE character sequences
or any other input mechanism for this particular implementation.  Also
see section 9 of ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/FAQ-ISO-8859-1,
the FAQ on ISO 8859-1 usage.


7.3 Toolkits, Widgets, and I18N
The preferred way of inputting national characters when a national
keyboard is not available is to use one or several input methods.
These input methods will then support various kinds of compose
sequences to enter national characters.

The environment variables LANG and/or LC_xxx select the language for
the Input Method (IM), but if several input methods exist, the
environment variable XMODIFIERS can be used to select a specific input
method.

Xlib supports IMs
Xt supports IMs
Xaw does not support IMs

Thus, applications written with Xlib or Xt can support IMs (see
section 7.2 on how to install input methods under Xt), but Xaw based
applications will not.

Motif 1.2 or greater automatically uses the R5/R6 input method APIs.
Thus applications using Motif 1.2+ can be made to support IMs.
Several Motif 1.[01] versions also had similar functionality added to
them by the respective vendors, but these extensions are
vendor-specific and not portable.

FOOTNOTE: If you have comments/corrections for this section and on
          OpenLook, please let me know.


7.4 I18N under X11R6, General Information
Background information from the X11R6 announcement:
Internationalization (also known as I18N, there being 18 letters between the
i and n) of the X Window System, which was originally introduced in Release
5, has been significantly improved in R6.  The R6 I18N architecture follows
that in R5, being based on the locale model used in ANSI C and POSIX, with
most of the I18N capability provided by Xlib.  R5 introduced a fundamental
framework for internationalized input and output.  It could enable basic
localization for left-to-right, non-context sensitive, 8-bit or multi-byte
codeset languages and cultural conventions.  However, it did not deal with
all possible languages and cultural conventions.  R6 also does not cover all
possible languages and cultural conventions, but R6 contains substantial new
Xlib interfaces to support I18N enhancements, in order to enable additional
language support and more practical localization.

The additional support is mainly in the area of text display.  In order to
support multi-byte encodings, the concept of a FontSet was introduced in R5.
In R6, Xlib enhances this concept to a more generalized notion of output
methods and output contexts.  Just as input methods and input contexts
support complex text input, output methods and output contexts support complex
and more intelligent text display, dealing not only with multiple fonts but
also with context dependencies.  The result is a general framework to enable
bi-directional text and context sensitive text display.

The description of the X11R6 internationalization framework is
available via anonymous ftp from ftp.x.org in
/pub/R6untarred/xc/doc/specs/i18n.



8. Supporting I18N Network Protocols
8.1 MIME
MIME is specified in RFC 1521 and RFC 1522 which are available from
ftp.uu.net.  There is also a MIME FAQ which is available via anonymous
ftp from ftp.ics.uci.edu in /mh/contrib/multimedia/mime-faq.txt.gz.
(This file is in compressed format.  You will need the GNU gunzip
program to decompress this file.)

If you want to write applications which support the MIME protocol,
there are several libraries/tools which can ease your task:


8.1.1 metamail
Source for supporting MIME (the `metamail' package) in various mail
readers is available via anonymous ftp from thumper.bellcore.com in
/pub/nsb.  This distribution consists of several utilities, which can
be called by MIME applications to handle MIME types.


8.1.2 MIMElt
A "lightweight" MIME library available via anon ftp from
oslonett.no:Software/MsDos/Comm/Offline/mimeltXX.zip

It is source code (ANSI C) packaged as a library to facilitate
construction of a limited MIME facility (limited == handling only
character-set aspects of MIME, not the multimedia-aspects).  It
includes hooks to recode character sets into whatever system you are
running on (e.g. if you read mail on a MsDos platform using CP-850,
MIMElite may be set up so that QUOTED-PRINTABLE ISO Latin 1 is recoded
into CP-850 for reading and saving to file).
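
To make the QUOTED-PRINTABLE step concrete, here is a minimal sketch
of a decoder in ANSI/ISO C.  It is not the interface of the library
described above; the function names qp_decode and hex_value are made
up for this illustration.  It turns "=XX" escapes back into single
bytes and drops soft line breaks; recoding the resulting ISO Latin 1
bytes into a code page such as CP-850 would then be a separate lookup
through a 256-entry table, which is not shown.

```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode the NUL-terminated quoted-printable string in into out.
 * out must have room for strlen(in) + 1 bytes, since decoding never
 * makes the text longer.  Returns the number of bytes written, not
 * counting the terminating NUL that is appended. */
size_t qp_decode(const char *in, unsigned char *out)
{
    size_t n = 0;

    while (*in != '\0') {
        int hi, lo;

        if (*in != '=') {               /* ordinary character */
            out[n++] = (unsigned char) *in++;
            continue;
        }
        if (in[1] == '\n') {            /* soft line break "=\n" */
            in += 2;
            continue;
        }
        if (in[1] == '\r' && in[2] == '\n') {   /* soft break "=\r\n" */
            in += 3;
            continue;
        }
        hi = hex_value((unsigned char) in[1]);
        lo = (hi >= 0) ? hex_value((unsigned char) in[2]) : -1;
        if (lo >= 0) {                  /* well-formed "=XX" escape */
            out[n++] = (unsigned char) (hi * 16 + lo);
            in += 3;
        } else {                        /* malformed: keep the '=' */
            out[n++] = '=';
            in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, decoding "f=FCr" yields the three bytes 'f', 0xFC, 'r',
where 0xFC is the ISO Latin 1 code for u-umlaut.  Accepting lower-case
hex digits and copying a malformed '=' literally are leniency choices
of this sketch, not requirements taken from the text above.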

Its main use is to provide programmers of so-called "off-line
readers" (used by users who access Internet mail through dial-up
service providers) with the tools needed to include proper support for
QUOTED-PRINTABLE encoding in their product.

The archive also contains a couple of sample applications that
demonstrate how the library may be used.  UNMIME is a stand-alone
utility to decode MIME-encoded messages (e.g. it works like UUDECODE
for binary files with BASE64 encoding), SENDMIME is a simple utility
to send MIME-encoded messages if your service provider doesn't have
PINE or similar tools.

The current version (2.1) is limited to character set issues.  I am
about to release version 2.2, which will support additional
Content-Types (e.g. "application/octet-stream").



9. Programming in Prolog
SICStus Prolog accepts ISO characters as part of atoms, so you can
even define goal names containing accented characters.  I/O of 8 bit
characters is (obviously) also supported.



10. ISO 8859-1 on non-UNIX systems
10.1 MS-DOS
MS-DOS generally uses its own character set.  There are several code
pages; one of them contains the same symbols as ISO 8859-1, albeit at
different character code positions, which can lead to problems with
the transfer of data.

If interoperability without data conversion is your goal, you can
reconfigure your MS-DOS PC to use an ISO-8859-1 code page.  Check out
the anonymous ftp archive ftp.uni-erlangen.de, which contains data on
how to do this (and other ISO-related stuff) in /pub/doc/ISO/charsets.
The README file contains an index of the files you need.

Most (all?) C compilers/libraries for MS-DOS have only minimal support
for the ANSI/POSIX locale mechanism.  The setlocale() and localeconv()
calls (and stuff like strxfrm()) are generally hardwired.
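
One way to find out how much of the locale mechanism a given C
library really implements is simply to ask it.  The sketch below
assumes nothing beyond the standard setlocale() and localeconv()
calls, plus snprintf() from C99, which older compilers may lack.  The
helper names format_number and decimal_point are made up for this
illustration.

```c
#include <locale.h>
#include <stddef.h>
#include <stdio.h>

/* Switch the numeric category to locale_name and format value with
 * five decimals into buf.  Returns the length snprintf reports, or
 * -1 if the library does not accept the locale name at all. */
int format_number(const char *locale_name, double value,
                  char *buf, size_t size)
{
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return -1;
    return snprintf(buf, size, "%.5f", value);
}

/* The radix character of the locale currently in effect. */
const char *decimal_point(void)
{
    return localeconv()->decimal_point;
}
```

In the "C" locale format_number writes 3.14159 with a period.  With a
library that has real locale support, passing "" (the environment's
locale) or a locale name your system provides may produce a comma
instead; with a hardwired library the result stays the same whatever
name is accepted.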


10.2 MS Windows
MS-Windows (using code page 1252) normally uses the first 256
characters of Unicode, which is (for all practical purposes)
equivalent to ISO 8859-1.  Thus, data representation and conversion
for interoperability with other ISO 8859-1 compliant systems is not an
issue.

It seems that C libraries for MS Windows do not support the ANSI/POSIX
locale mechanism.  (If you have any experiences with that, please let
me know.)  There is a POSIX-like mechanism in some Microsoft platform
services, but none in the compilers from any vendor.


10.3 OS/2
Text mode OS/2 programs generally suffer the same limitations as do
MS-DOS programs, because the display hardware is the same.

Presentation Manager OS/2 programs using code page 1004 will order
the font glyphs in the same sequence as ISO 8859-1 (although of
course whether the glyphs will actually look anything like those
from ISO 8859-1 depends entirely on the font).

The IBM CSet++ compiler supports full internationalization, with
several predefined locales.

The Borland C++ compiler supports only the "C" locale.

The Watcom C++ compiler supports only the "C" locale.

The Metaware High C++ compiler supports only the "C" locale.  It
does, however, also support UNICODE, providing UNICODE character
types and UNICODE versions of the appropriate parts of the standard
library (including I/O).



10.4 Apple Macintosh
Macintoshes have their own non-standard character encodings;
the first 128 characters are US-ASCII but the remaining characters are
non-standard.

I do not know whether C libraries (for which compilers?) for the
Macintosh support the ANSI/POSIX locale mechanism.  If you have any
experiences with that, please let me know.


10.5 Amiga
The AmigaOS uses ISO-8859-1.  As of OS version 2.1, Amiga-specific
means of localization are available.



11. Home location of this document
The most recent version of this document is available via anonymous
ftp from ftp.vlsivie.tuwien.ac.at under the file name
/pub/8bit/ISO-programming.

-----------------

Copyright © 1994 Michael Gschwind (mike@vlsivie.tuwien.ac.at)

This document may be copied for non-commercial purposes, provided this
copyright notice appears.  Publication in any other form requires the
author's consent.

Dieses Dokument darf unter Angabe dieser urheberrechtlichen
Bestimmungen zum Zwecke der nicht-kommerziellen Nutzung beliebig
vervielfältigt werden.  Die Publikation in jeglicher anderer Form
erfordert die Zustimmung des Autors.

Michael Gschwind, Institut f. Technische Informatik, TU Wien
snail: Treitlstrasse 3-182-2 || A-1040 Wien || Austria
email: mike@vlsivie.tuwien.ac.at   note: real time != real fast
phone: +(43)(1)58801 8156          fax: +(43)(1)586 9697


1, edited, resent,,
Mail-from: From li.org!owner-li-international Fri Jan 20 08:56:04 1995
Return-Path: <li.org!owner-li-international>
Received: by icule (Smail3.1.28.1 #1)
        id m0rVJon-00009Da; Fri, 20 Jan 95 08:56 EST
Sender: li.org!owner-li-international
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id RAA25970 for <icule!pinard>; Mon, 16 Jan 1995 17:34:02 -0500
Received: from saguenay.IRO.UMontreal.CA (root@saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id RAA14270 for <pinard@lagrande.IRO.UMontreal.CA>; Mon, 16 Jan 1995 17:33:53 -0500
Received: from uniwa.uwa.edu.au (root@uniwa.uwa.edu.au [130.95.128.1]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id RAA07348 for <pinard@iro.umontreal.ca>; Mon, 16 Jan 1995 17:33:41 -0500
Received: from orac.aust.li.org (orac.iinet.com.au [203.0.178.134]) by uniwa.uwa.edu.au (8.6.9/8.6.9) with ESMTP id GAA22040; Tue, 17 Jan 1995 06:29:21 +0800
Received: (from majordom@localhost) by orac.aust.li.org (8.6.9/8.6.9) id FAA01118 for li-international-list; Tue, 17 Jan 1995 05:34:39 +0800
Received: from alcor (alcor.twinsun.com [198.147.65.1]) by orac.aust.li.org (8.6.9/8.6.9) with ESMTP id FAA01112 for <li-international@li.org>; Tue, 17 Jan 1995 05:34:28 +0800
Received: from twinsun.com (twinsun.twinsun.com [192.54.239.2]) by alcor (8.6.5/8.6.5) with SMTP id NAA04793 for <li-international@li.org>; Mon, 16 Jan 1995 13:06:52 -0800
Received: from spot.twinsun.com by twinsun.com (4.1/SMI-4.1)
        id AA06664; Mon, 16 Jan 95 13:33:30 PST
Received: by spot.twinsun.com (4.1/SMI-4.1)
        id AA04256; Mon, 16 Jan 95 13:33:30 PST
Old-From: eggert@twinsun.com (Paul Eggert)
Message-Id: <9501162133.AA04256@spot.twinsun.com>
Date: 16 Jan 1995 13:33:28 -0800
To: li-international@li.org
Subject: ISO Normative Addendum 1 and its effect on the C library
From: International List <li-international@li.org>
Sender: owner-li-international@li.org
Precedence: bulk
Reply-To: LI-international@li.org

*** EOOH ***
From: eggert@twinsun.com (Paul Eggert)
Date: 16 Jan 1995 13:33:28 -0800
To: li-international@li.org
Subject: ISO Normative Addendum 1 and its effect on the C library
Reply-To: LI-international@li.org

Normative Addendum 1 (NA1) to the ISO C standard was approved last year,
and I recently ran across a nice summary written by Clive Feather.
Please see <http://sf.www.lysator.liu.se/c/nal.html> for this.

Most of the changes required by NA1 are to the C library's wide
character and multibyte string support.  I don't see these changes
mentioned in the latest glibc snapshot.  I asked Roland McGrath,
glibc's developer, about this, and he replied:

   Date: Mon, 16 Jan 95 15:53:26 -0500
   From: Roland McGrath <roland@gnu.ai.mit.edu>

   I think if you make the specifications available to the Linux community,
   the new library functions will get written and contributed to glibc.
   Try the mailing list li-international@li.org.

So I'm sending this message to li-international.  I can forward a copy
of the NA1 summary to whoever needs it; just ask.

Two of the NA1 changes (__STDC_VERSION__ and digraphs) require changes
to GCC itself; I've volunteered to do this.  One change (namely
<iso646.h>) can be done either in GCC or in libc, though if GCC does
digraphs it may make more sense for it to do <iso646.h> as well.
But the other changes belong to the C library proper.



1,,
Mail-from: From twinsun.com!eggert Tue Feb 14 05:16:49 1995
Return-Path: <twinsun.com!eggert>
Received: by icule (Smail3.1.28.1 #1)
        id m0reKJK-00009mC; Tue, 14 Feb 95 05:16 EST
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id CAA00816 for <icule!pinard>; Tue, 14 Feb 1995 02:16:27 -0500
Received: from saguenay.IRO.UMontreal.CA (root@saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id CAA02807 for <pinard@lagrande.IRO.UMontreal.CA>; Tue, 14 Feb 1995 02:16:20 -0500
Received: from alcor.twinsun.com (alcor.twinsun.com [198.147.65.1]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id CAA29451 for <pinard@iro.umontreal.ca>; Tue, 14 Feb 1995 02:16:16 -0500
Received: from twinsun.com (twinsun.twinsun.com [192.54.239.2]) by alcor.twinsun.com (8.6.5/8.6.5) with SMTP id WAA03362 for <pinard@iro.umontreal.ca>; Mon, 13 Feb 1995 22:44:50 -0800
Received: from spot.twinsun.com by twinsun.com (4.1/SMI-4.1)
        id AA08130; Mon, 13 Feb 95 23:15:06 PST
Received: by spot.twinsun.com (4.1/SMI-4.1)
        id AA05763; Mon, 13 Feb 95 23:15:05 PST
From: eggert@twinsun.com (Paul Eggert)
Message-Id: <9502140715.AA05763@spot.twinsun.com>
Date: 13 Feb 1995 23:15:04 -0800
To: pinard@iro.umontreal.ca
In-Reply-To: <m0rdrDE-00009QC@icule> (pinard@iro.umontreal.ca)
Subject: Re: glocale and Uniforum gettext simplicity

*** EOOH ***
From: eggert@twinsun.com (Paul Eggert)
Date: 13 Feb 1995 23:15:04 -0800
To: pinard@iro.umontreal.ca
In-Reply-To: <m0rdrDE-00009QC@icule> (pinard@iro.umontreal.ca)
Subject: Re: glocale and Uniforum gettext simplicity


   Date: Sun, 12 Feb 95 22:12 EST
   From: pinard@iro.umontreal.ca (Francois Pinard)

   Hello, Paul.

      For more on this topic please see the Programming
      for Internationalization FAQ (Message-ID:
      <internationalization/programming-faq_784901999@rtfm.mit.edu>)
      which I can forward to you if you like.

   Would you do this, please?

Sure, the latest revision will be in my next message.  For future
reference, the coordinates are
<ftp://rtfm.mit.edu/pub/usenet-by-group/comp.answers/internationalization/programming-faq>.

Alas, I haven't had time to work on this much lately -- beset with hardware
problems at home and no time to fix them....


1, edited,,
Mail-from: From pinard Tue Mar 21 12:53:53 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0rr87q-00009TC; Tue, 21 Mar 95 12:53 EST
Message-Id: <m0rr87q-00009TC@icule>
Date: Tue, 21 Mar 95 12:53 EST
From: pinard (François Pinard)
To: meyering@comco.com
CC: drepper@ipd.info.uni-karlsruhe.de
In-reply-to: <199503211712.LAA25472@idefix.comco.com> (message from Jim Meyering on Tue, 21 Mar 1995 11:12:49 -0600)
Subject: Re: international fileutils
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***
Date: Tue, 21 Mar 95 12:53 EST
From: pinard (François Pinard)
To: meyering@comco.com
CC: drepper@ipd.info.uni-karlsruhe.de
In-reply-to: <199503211712.LAA25472@idefix.comco.com> (message from Jim Meyering on Tue, 21 Mar 1995 11:12:49 -0600)
Subject: Re: international fileutils
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

There are three things to do for each package:

* Adjust Autoconf and Makefiles
* Mark all localizable strings in sources and do other adjustments
* Translate messages for French (and maybe, let's be fair, German :-).

1, edited,,
Mail-from: From pinard Sun Apr 23 13:26:30 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0s35QR-00008FC; Sun, 23 Apr 95 13:26 EDT
Message-Id: <m0s35QR-00008FC@icule>
Date: Sun, 23 Apr 95 13:26 EDT
From: pinard (François Pinard)
To: Jim Meyering <meyering@comco.com>,
        Ulrich Drepper <drepper@gnu.ai.mit.edu>,
        Roland McGrath <roland@gnu.ai.mit.edu>,
        Paul Eggert <eggert@twinsun.com>
Subject: GNU locale and Ulrich's effort
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***
Date: Sun, 23 Apr 95 13:26 EDT
From: pinard (François Pinard)
To: Jim Meyering <meyering@comco.com>,
        Ulrich Drepper <drepper@gnu.ai.mit.edu>,
        Roland McGrath <roland@gnu.ai.mit.edu>,
        Paul Eggert <eggert@twinsun.com>
Subject: GNU locale and Ulrich's effort
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

I'm trying to get started the overall effort for GNU localization,
by offering translators GNU packages to translate, and the means
to do so.  I also do not want to spoil the energies being offered.
Many pieces of the puzzle are in place already and, as usual, I
contemplate them all trying to see what is missing, and working
towards the complete picture.

Surely to me, GNU locale (glocale, as a package) has to provide a
fairly complete set of self-contained tools for helping package
maintainers to internationalize their product, and also for
localizers to translate message catalogs.  Further, being itself
internationalized, it should be a very carefully crafted example
for maintainers, about how one might set his/her own package to be
easily installed while localization is effective, and portably!



1,,
Mail-from: From pinard Mon May  1 22:16:31 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0s67Vl-00008NC; Mon, 1 May 95 22:16 EDT
Message-Id: <m0s67Vl-00008NC@icule>
Date: Mon, 1 May 95 22:16 EDT
From: pinard (=?ISO-8859-1?Q?Fran=E7ois_Pinard?=)
To: gnu@prep.ai.mit.edu
CC: rms@gnu.ai.mit.edu
In-reply-to: <9505020044.AA12891@pizza> (gnu@ai.mit.edu)
Subject: Re: [pinard@iro.umontreal.ca: Internationalizing GNU: the maintainer side]
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***
Date: Mon, 1 May 95 22:16 EDT
From: pinard (François Pinard)
To: gnu@prep.ai.mit.edu
CC: rms@gnu.ai.mit.edu
In-reply-to: <9505020044.AA12891@pizza> (gnu@ai.mit.edu)
Subject: Re: [pinard@iro.umontreal.ca: Internationalizing GNU: the maintainer side]
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

   It contains some statements that are harsh and, I believe,
   not true.  The practice of using gettext to mark strings is
   *not* just "for the time being."

   Fran\cois: Could you work with rms to update the GNU coding
   standards to describe what GNUers need to do to make their
   GNU programs use GNU Locale.

I may try, but do not know exactly how to proceed.  I also confess
I've rewritten this paragraph twenty times, to merely censor myself.

   We can then post that section of the GNU coding standards, so
   all the GNUers know what to do.

If GNU ever publishes utilities for Native Language Support, their
own documentation should explain how to proceed, and maintainers
should find in there the information they need about what to do.
GNU standards might state the general principle, something like:
``GNU programs and packages should be opened to Native Language
Support (NLS) and, in particular, be able to write their messages
translated into native languages, as selected at run time by
environment variables''.

--
François Pinard       ``Vivement GNU!''       <pinard@iro.umontreal.ca>
Email lpf@uunet.uu.net for info about the League for Programming Freedom.


1,,
Mail-from: From IRO.UMontreal.CA!pinard Tue May  2 05:16:32 1995
Return-Path: <IRO.UMontreal.CA!pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0s6E4E-0000CaC; Tue, 2 May 95 05:16 EDT
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id AAA19507 for <icule!pinard>; Tue, 2 May 1995 00:02:38 -0400
Received: (from pinard@localhost) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) id AAA00659 for icule!pinard; Tue, 2 May 1995 00:02:37 -0400
Received: from saguenay.IRO.UMontreal.CA (saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id AAA00657 for <pinard@lagrande.IRO.UMontreal.CA>; Tue, 2 May 1995 00:02:34 -0400
Received: from mole.gnu.ai.mit.edu (mole.gnu.ai.mit.edu [128.52.46.33]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id AAA08792 for <pinard@iro.umontreal.ca>; Tue, 2 May 1995 00:02:33 -0400
Received: by mole.gnu.ai.mit.edu (8.6.12/8.6.12GNU) id AAA07143; Tue, 2 May 1995 00:02:31 -0400
Date: Tue, 2 May 1995 00:02:31 -0400
Message-Id: <199505020402.AAA07143@mole.gnu.ai.mit.edu>
From: Richard Stallman <rms@gnu.ai.mit.edu>
To: pinard@IRO.UMontreal.CA
In-reply-to: <m0s67Vl-00008NC@icule> (pinard@iro.umontreal.ca)
Subject: Re: [pinard@iro.umontreal.ca: Internationalizing GNU: the maintainer side]

*** EOOH ***
Date: Tue, 2 May 1995 00:02:31 -0400
From: Richard Stallman <rms@gnu.ai.mit.edu>
To: pinard@IRO.UMontreal.CA
In-reply-to: <m0s67Vl-00008NC@icule> (pinard@iro.umontreal.ca)
Subject: Re: [pinard@iro.umontreal.ca: Internationalizing GNU: the maintainer side]

    ``GNU programs and packages should be opened to Native Language
    Support (NLS) and, in particular, be able to write their messages
    translated into native languages, as selected at run time by
    environment variables''.

I think that is too vague to be useful.  I'd rather put in some
variant of what you sent before.  But I don't have time right now
to fix it.


1, answered, edited,,
Mail-from: From IRO.UMontreal.CA!pinard Wed May  3 00:19:10 1995
Return-Path: <IRO.UMontreal.CA!pinard>
Received: by icule (Smail3.1.28.1 #1)
        id m0s6Vty-0000CSC; Wed, 3 May 95 00:19 EDT
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id XAA19717 for <icule!pinard>; Tue, 2 May 1995 23:51:54 -0400
Received: (from pinard@localhost) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) id XAA20985 for icule!pinard; Tue, 2 May 1995 23:51:52 -0400
Received: from saguenay.IRO.UMontreal.CA (saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id XAA20983 for <pinard@lagrande.IRO.UMontreal.CA>; Tue, 2 May 1995 23:51:49 -0400
Received: from nz11.rz.uni-karlsruhe.de (nz11.rz.uni-karlsruhe.de [129.13.64.7]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id XAA12985 for <pinard@iro.umontreal.ca>; Tue, 2 May 1995 23:51:15 -0400
Received: from ipd.info.uni-karlsruhe.de (actually i44ms.info.uni-karlsruhe.de)
          by nz11.rz.uni-karlsruhe.de with SMTP (PP);
          Wed, 3 May 1995 03:54:26 +0200
Received: from i44pc2.info.uni-karlsruhe.de (i44pc2.info.uni-karlsruhe.de [129.13.171.31])
          by ipd.info.uni-karlsruhe.de (8.6.4/8.6.4) with SMTP id DAA00768;
          Wed, 3 May 1995 03:57:08 +0200
Message-Id: <199505030157.DAA00768@ipd.info.uni-karlsruhe.de>
To: "ois \"Pinard)\""@rz.uni-karlsruhe.de, meyering@comco.com (Jim Meyering),
    eggert@twinsun.com (Paul Eggert),
    roland@gnu.ai.mit.edu (Roland McGrath)
Original-To: pinard@iro.umontreal.ca (François Pinard),
    meyering@comco.com (Jim Meyering),
    eggert@twinsun.com (Paul Eggert),
    roland@gnu.ai.mit.edu (Roland McGrath)
PP-Warning: Parse error in original version of preceding To line
Subject: nlsutils-0.4.2
Date: Wed, 03 May 1995 03:56:24 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>

*** EOOH ***
To: "ois \"Pinard)\""@rz.uni-karlsruhe.de, meyering@comco.com (Jim Meyering),
    eggert@twinsun.com (Paul Eggert),
    roland@gnu.ai.mit.edu (Roland McGrath)
Original-To: pinard@iro.umontreal.ca (François Pinard),
    meyering@comco.com (Jim Meyering),
    eggert@twinsun.com (Paul Eggert),
    roland@gnu.ai.mit.edu (Roland McGrath)
PP-Warning: Parse error in original version of preceding To line
Subject: nlsutils-0.4.2
Date: Wed, 03 May 1995 03:56:24 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>

I tried hard to limit all external things in the libgintl directory.
You have to copy this, some variation of my code in aclocal.m4
and acconfig.h.  This should be all.

1, answered,,
Mail-from: From IRO.UMontreal.CA!pinard Thu May 4 08:22:15 1995
Return-Path: <IRO.UMontreal.CA!pinard>
Received: by icule (Smail3.1.28.1 #1)
    id m0s6zv4-0000CSC; Thu, 4 May 95 08:22 EDT
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id HAA19349 for <icule!pinard>; Thu, 4 May 1995 07:48:32 -0400
Received: (from pinard@localhost) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) id HAA24822 for icule!pinard; Thu, 4 May 1995 07:47:28 -0400
Received: from saguenay.IRO.UMontreal.CA (saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id HAA24816 for <pinard@lagrande.IRO.UMontreal.CA>; Thu, 4 May 1995 07:47:25 -0400
Received: from nz11.rz.uni-karlsruhe.de (nz11.rz.uni-karlsruhe.de [129.13.64.7]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id HAA17159 for <pinard@iro.umontreal.ca>; Thu, 4 May 1995 07:48:25 -0400
Received: from ipd.info.uni-karlsruhe.de (actually i44ms.info.uni-karlsruhe.de)
    by nz11.rz.uni-karlsruhe.de with SMTP (PP);
    Thu, 4 May 1995 13:45:17 +0200
Received: from i44pc2.info.uni-karlsruhe.de (i44pc2.info.uni-karlsruhe.de [129.13.171.31])
    by ipd.info.uni-karlsruhe.de (8.6.4/8.6.4) with SMTP id NAA06097
    for <pinard@iro.umontreal.ca>; Thu, 4 May 1995 13:48:06 +0200
Message-Id: <199505041148.NAA06097@ipd.info.uni-karlsruhe.de>
To: pinard@IRO.UMontreal.CA
Subject: Re: Path to message?
In-Reply-To: Your message of "Thu, 4 May 95 00:45 EDT"
References: <m0s6snG-00008NC@icule>
X-Mailer: Mew beta version 0.89 on Emacs 19.28.1
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Date: Thu, 04 May 1995 13:47:46 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
Content-Transfer-Encoding: 8bit
X-Original-Encoding: quoted-printable

*** EOOH ***
To: pinard@IRO.UMontreal.CA
Subject: Re: Path to message?
In-Reply-To: Your message of "Thu, 4 May 95 00:45 EDT"
References: <m0s6snG-00008NC@icule>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Date: Thu, 04 May 1995 13:47:46 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
Content-Transfer-Encoding: 8bit
X-Original-Encoding: quoted-printable

From: pinard@iro.umontreal.ca (François Pinard)
Subject: Path to message?
Date: Thu, 4 May 95 00:45 EDT

> Ulrich, it's me again. I do not understand why xgettext --help writes:
>
>   Suchpfad ist: /usr/local/share/nls/src
>
> while /usr/local/share/locale/de/LC_MESSAGES is indeed searched.
> Could we solve this inconsistency?
>

Not quite. /usr/local/share/locale/de/LC_MESSAGES is the path where
the .mo/.cat files will go. The search path (Suchpfad :) represents
the path to additional directories where other .po files can be found.

I thought of using this feature for standard .po files for, say,
libiberty etc. Otherwise each package would have to translate these
strings again and again; but if we install the standard files somewhere,
we can use the -x option to exclude these strings from the generation.

Perhaps I should use a different description?

-- Uli
________---------------------------------------------------------------
\      / Ulrich Drepper / Univ. at Karlsruhe, Germany / CS Dept. / IPD
L\inux/  email: drepper@gnu.ai.mit.edu             smail: Rubensstr. 5
 \    /         drepper@ipd.info.uni-karlsruhe.de        76149 Karlsruhe
  \  /1.2.7 ------------------------------------------- Germany --------


1, forwarded, edited,,
Mail-from: From pinard Thu May 4 15:27:13 1995
Return-Path: <pinard>
Received: by icule (Smail3.1.28.1 #1)
    id m0s76YH-00008NC; Thu, 4 May 95 15:27 EDT
Message-Id: <m0s76YH-00008NC@icule>
Date: Thu, 4 May 95 15:27 EDT
From: pinard (=?ISO-8859-1?Q?Fran=E7ois_Pinard?=)
To: ajc@di.uminho.pt
In-reply-to: <9505041601.AA20254@shiva.di.uminho.pt> (ajc@di.uminho.pt)
Subject: Re: tar is ready for pt
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

*** EOOH ***
Date: Thu, 4 May 95 15:27 EDT
From: pinard (François Pinard)
To: ajc@di.uminho.pt
In-reply-to: <9505041601.AA20254@shiva.di.uminho.pt> (ajc@di.uminho.pt)
Subject: Re: tar is ready for pt
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

Even if it is not completely official yet in GNU, the format of
translation files is being revised, and the extension is being
changed from `.tt' to `.po'. This should bring the format closer
to one of the few standards in existence for translation files.
We hope that translation files will be more easily manageable
afterwards. We do not want to make a religious issue of this
format selection, as each standard has proponents and opponents.
Please help us by being receptive to the format GNU uses.

Existing `.tt' translation files are being converted to `.po' files
by maintainers. Translators should switch to using the `.po' format
as soon as possible. This is an easy job. The `.po' translation
file format is quite approachable. Schematically, it looks like:

     msgid STRING-TO-TRANSLATE
     msgstr TRANSLATED-STRING

     msgid STRING-TO-TRANSLATE
     msgstr TRANSLATED-STRING

     msgid STRING-TO-TRANSLATE
     msgstr TRANSLATED-STRING
     [...]

`msgid' and `msgstr' are keywords of a sort, written at the beginning
of a line. Each STRING-TO-TRANSLATE or TRANSLATED-STRING respects
the C syntax for a character string, including the surrounding
quotes, escape sequences, and usual techniques for writing multi-line
C strings.

Outside strings, blank lines and comments may be used freely.
In the schema, blank lines preceding the msgid lines are optional.
Comments start at the beginning of a line with `#' and extend until
the end of line. Comments written by translators should have the
initial `#' immediately followed by some white space. If the `#'
is not immediately followed by white space, this comment is most
likely generated and managed by specialized GNU tools.

There is a conventional, uniform way of presenting a `.po' file, but
a description of this format is not yet available. It will be
easy to make the suggested adjustments at a later time, so do not worry
right now about precise conventions. Further, there are normalizing
tools automating conformance to a great extent, to be published soon.

   And another question: what happens when new versions of the
   program are released, with new messages?

Usually, most GNU packages are pretested before being released.
All teams of translators are made aware of localizable prereleases.
A special tool regenerates a `.po' file with obsolescent strings
commented out, and new strings highlighted.
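
The `.po' layout described earlier in this message (msgid/msgstr
pairs, `#' comments, adjacent quoted strings joined as in C) can be
illustrated with a small reader. This is only a sketch in Python, not
one of the GNU tools mentioned here: the names `unquote' and
`parse_po' are hypothetical, only a handful of C escape sequences are
decoded, and the sample catalog is made up for illustration.

```python
import re

# Only a few C escapes are decoded in this sketch; others map to themselves.
_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def unquote(token):
    """Decode one double-quoted string written in (a subset of) C syntax."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("expected a quoted string: %r" % token)
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                  token[1:-1])


def parse_po(text):
    """Return the (msgid, msgstr) pairs found in `text`, in file order."""
    entries = []
    msgid = msgstr = None
    current = None  # keyword that continuation strings belong to
    for raw in text.splitlines():
        line = raw.strip()
        if not line or raw.startswith("#"):
            continue  # blank lines and comments may be used freely
        if line.startswith("msgid"):
            if msgid is not None:
                entries.append((msgid, msgstr or ""))
            msgid, msgstr, current = unquote(line[5:].strip()), None, "msgid"
        elif line.startswith("msgstr"):
            msgstr, current = unquote(line[6:].strip()), "msgstr"
        elif line.startswith('"') and current == "msgid":
            msgid += unquote(line)   # adjacent strings concatenate
        elif line.startswith('"') and current == "msgstr":
            msgstr += unquote(line)
        else:
            raise ValueError("unexpected line: %r" % raw)
    if msgid is not None:
        entries.append((msgid, msgstr or ""))
    return entries


# Made-up sample catalog, for illustration only.
SAMPLE = '''\
# a translator comment
msgid "No newline at end of file"
msgstr "Kein Zeilenumbruch am Dateiende"

msgid "two "
      "pieces\\n"
msgstr ""
'''

if __name__ == "__main__":
    for source, translation in parse_po(SAMPLE):
        print(repr(source), "->", repr(translation))
```

An entry whose msgstr is empty, like the second one in the sample, is
what a translator would still have to fill in.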

Further, for those of us using GNU Emacs, a special editing mode is
being written for `.po' files, in which translators will be able
to navigate easily in the `.po' file, find untranslated entries,
examine at will the context of these strings in the program sources,
and also observe other translations already made in other languages,
for the string being translated.

Team members should share their translations and resolve linguistic
or terminological issues. When they reach something satisfying,
the team should formally submit the translation to the package
maintainer for the final release. The precise formalities are not
organized yet, and there are many details to clear up. Some legal
aspects also have to be addressed; this is under study right now.

Special means should be used for transmitting translation files
over email. The simplest way is using GNU shar in default mode,
or else, uuencoding the `.po' file prior to mailing.

--
François Pinard ``Vivement GNU!'' <pinard@iro.umontreal.ca>
Email lpf@uunet.uu.net for info about the League for Programming Freedom.


1, edited,,
Mail-from: From IRO.UMontreal.CA!pinard Thu Apr 20 16:54:03 1995
Return-Path: <IRO.UMontreal.CA!pinard>
Received: by icule (Smail3.1.28.1 #1)
    id m0s23Ea-0000CxC; Thu, 20 Apr 95 16:53 EDT
Received: from lagrande.iro.umontreal.ca (lagrande.IRO.UMontreal.CA [132.204.32.32]) by iros1.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id KAA12085 for <icule!pinard>; Thu, 20 Apr 1995 10:13:02 -0400
Received: (from pinard@localhost) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) id KAA08298 for icule!pinard; Thu, 20 Apr 1995 10:12:34 -0400
Received: from saguenay.IRO.UMontreal.CA (saguenay32.IRO.UMontreal.CA [132.204.32.54]) by lagrande.iro.umontreal.ca (8.6.9/8.6.9) with ESMTP id KAA08254 for <pinard@lagrande.IRO.UMontreal.CA>; Thu, 20 Apr 1995 10:10:49 -0400
Received: from nz11.rz.uni-karlsruhe.de (nz11.rz.uni-karlsruhe.de [129.13.64.7]) by saguenay.IRO.UMontreal.CA (8.6.9/8.6.9) with ESMTP id KAA20778 for <pinard@iro.umontreal.ca>; Thu, 20 Apr 1995 10:10:25 -0400
Received: from ipd.info.uni-karlsruhe.de (actually i44ms.info.uni-karlsruhe.de)
    by nz11.rz.uni-karlsruhe.de with SMTP (PP);
    Thu, 20 Apr 1995 16:05:34 +0200
Received: from i44pc2.info.uni-karlsruhe.de (i44pc2.info.uni-karlsruhe.de [129.13.171.31])
    by ipd.info.uni-karlsruhe.de (8.6.4/8.6.4) with SMTP id QAA28513;
    Thu, 20 Apr 1995 16:08:10 +0200
Message-Id: <199504201408.QAA28513@ipd.info.uni-karlsruhe.de>
To: pinard@IRO.UMontreal.CA (Francois Pinard),
    meyering@comco.com (Jim Meyering),
    roland@gnu.ai.mit.edu (Roland McGrath)
Subject: more points to discuss
X-Mailer: Mew beta version 0.89 on Emacs 19.28.1
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Date: Thu, 20 Apr 1995 16:08:55 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
Content-Transfer-Encoding: 8bit
X-Original-Encoding: quoted-printable

*** EOOH ***
To: pinard@IRO.UMontreal.CA (Francois Pinard),
    meyering@comco.com (Jim Meyering),
    roland@gnu.ai.mit.edu (Roland McGrath)
Subject: more points to discuss
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Date: Thu, 20 Apr 1995 16:08:55 +0200
From: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
Content-Transfer-Encoding: 8bit
X-Original-Encoding: quoted-printable

BTW my implementation will be able to process a lot of strange situations:
- strings in preprocessor macros
- something like gettext ("jkh" "jkhlk")
or even
- gettext ("jkkjh\
sdfsdf")

1, edited,,
Received: from titan.comco.com (root@titan.comco.com [198.214.63.11]) by idefix.comco.com (8.6.9/8.6.9) with ESMTP id QAA16073 for <meyering@idefix.comco.com>; Sat, 19 Nov 1994 16:03:48 -0600
Received: from alcor.twinsun.com (alcor.twinsun.com [198.147.65.1]) by titan.comco.com (8.6.9/8.6.9) with ESMTP id QAA03006 for <meyering@comco.com>; Sat, 19 Nov 1994 16:04:38 -0600
Received: from twinsun.com (twinsun.twinsun.com [192.54.239.2]) by alcor.twinsun.com (8.6.5/8.6.5) with SMTP id NAA19013; Sat, 19 Nov 1994 13:55:18 -0800
Received: from spot.twinsun.com by twinsun.com (4.1/SMI-4.1)
    id AA29144; Sat, 19 Nov 94 14:01:01 PST
Received: by spot.twinsun.com (4.1/SMI-4.1)
    id AA02990; Sat, 19 Nov 94 14:01:00 PST
From: eggert@twinsun.com (Paul Eggert)
Message-Id: <9411192201.AA02990@spot.twinsun.com>
Date: 19 Nov 1994 14:00:59 -0800
To: rms@gnu.ai.mit.edu (Richard Stallman)
Cc: meyering@comco.com, pdcruze@orac.iinet.com.au
Subject: Re: glocale and diffutils
Status: RO

*** EOOH ***
From: eggert@twinsun.com (Paul Eggert)
Date: 19 Nov 1994 14:00:59 -0800
To: rms@gnu.ai.mit.edu (Richard Stallman)
Cc: meyering@comco.com, pdcruze@orac.iinet.com.au
Subject: Re: glocale and diffutils

The Uniforum proposal addresses this problem by partitioning message
catalogs into ``textdomains''. Each textdomain can be maintained
separately. Programs can share textdomains. Messages in different
textdomains cannot clash. With diffutils, for example, I would expect
one textdomain for diffutils programs and another for libc. The main
module would use the default textdomain and invoke `gettext ("No
newline at end of file")' just as diffutils-2.7.1 does; libc modules
would use a system textdomain and would invoke something like
`dgettext ("SYS_libc", "No such file or directory")'.
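
The split between a default textdomain and an explicitly named one
can be tried out with Python's standard `gettext' module, which
mirrors the C calls quoted above. This is only a sketch under stated
assumptions: the domain name "diffutils" and the two helper functions
are hypothetical, and both domains are bound to an empty temporary
directory, so no catalog is found and each call falls back to
returning its msgid unchanged.

```python
import gettext
import tempfile

# Assumption for this sketch: an empty directory stands in for the
# installed catalogs, so lookups always fall back to the msgid itself.
LOCALE_DIR = tempfile.mkdtemp()
gettext.bindtextdomain("diffutils", LOCALE_DIR)   # hypothetical domain name
gettext.bindtextdomain("SYS_libc", LOCALE_DIR)    # name taken from the mail

# The program selects its own domain once; plain gettext() then uses it.
gettext.textdomain("diffutils")


def program_message():
    """Message owned by the program, looked up in the default textdomain."""
    return gettext.gettext("No newline at end of file")


def library_message():
    """Message owned by a library, looked up in an explicitly named domain."""
    return gettext.dgettext("SYS_libc", "No such file or directory")


if __name__ == "__main__":
    print(program_message())
    print(library_message())
```

Because each lookup names (or inherits) exactly one domain, the same
msgid could carry different translations in the two catalogs without
clashing.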