---
layout: default
title: Cobol
nav_order: 1
parent: Use From...
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# How To Use ICU4C From COBOL
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

This document describes how to use ICU functions within a COBOL program. It is
assumed that the programmer understands the concepts behind ICU, and is able to
identify which ICU APIs are appropriate for the task at hand. The programmer must
also understand the meaning of the arguments passed to these APIs and of the
returned value, if any. This is all explained in the ICU documentation, although
in C/C++ style. The objective of this document is to help adapt
these explanations to COBOL syntax.

The packaging of ICU data and executable code into
libraries is platform dependent. Consequently, the calling conventions between
COBOL programs and the C/C++ functions in ICU may vary from platform to
platform. To a lesser extent, the C/C++ types of arguments and return values may
have different equivalents in COBOL, depending on the platform and even the
specific COBOL compiler used.

This document is supplemented with three [sample
programs](https://sourceforge.net/projects/icu/files/OldFiles/samples/ICU-COBOL.zip)
that illustrate the use of ICU APIs for code page conversion, collation and
normalization. A description of the sample programs appears at the
end of this document.

## ICU API invocation in COBOL

1. Invocation of ICU APIs is done with the COBOL “CALL” statement.

2. Variables, pointers and constants appearing in ICU \*.H files (for C/C++)
   must be defined in the WORKING-STORAGE section for COBOL.

3. Arguments to a C/C++ API translate into arguments to a COBOL CALL statement,
   passed by value or by reference as will be detailed below.

4. For a C/C++ API with a non-void return value, the RETURNING clause will be
   used for the CALL statement.

5. Character string arguments to C/C++ must be null-terminated. In COBOL, this
   means using the `Z"xxx"` format for literals, and adding `X"00"` at the end of
   the content of variables.

6. Special consideration must be given when a pointer is the value returned by
   an API, since COBOL implements a more limited concept of pointers than
   C/C++. How to handle this case will be explained below.

### COBOL and C/C++ Data Types

The following table (extracted from IBM VisualAge COBOL documentation) shows the
correspondence between the data types available in COBOL and C/C++.

> :point_right: **Note**: Parts of identifier names in COBOL are separated by `-`, not by `_` as in C.

| C/C++ data types          | COBOL data types                                                                                   |
|---------------------------|----------------------------------------------------------------------------------------------------|
| wchar_t                   | DISPLAY-1 (PICTURE N, G). wchar_t is the processing code, whereas DISPLAY-1 is the file code.      |
| char                      | PIC X.                                                                                             |
| signed char               | No appropriate COBOL equivalent.                                                                   |
| unsigned char             | No appropriate COBOL equivalent.                                                                   |
| short signed int          | PIC S9-S9(4) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.     |
| short unsigned int        | PIC 9-9(4) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.       |
| long int                  | PIC 9(5)-9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.    |
| long long int             | PIC 9(10)-9(18) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| float                     | COMP-1.                                                                                            |
| double                    | COMP-2.                                                                                            |
| enumeration               | Equivalent to level 88, but not identical.                                                         |
| char(n)                   | PICTURE X(n).                                                                                      |
| array pointer (*) to type | No appropriate COBOL equivalent.                                                                   |
| pointer(*) to function    | PROCEDURE-POINTER.                                                                                 |

A number of C type definitions used by ICU (and common to many compilers on POSIX
platforms) are not presented in the table above but can also be translated into
COBOL definitions.

| C/C++ data types                         | COBOL data types                                                                             |
|------------------------------------------|----------------------------------------------------------------------------------------------|
| int8_t                                   | PIC X. Not really equivalent.                                                                |
| uint8_t                                  | PIC X. Not really equivalent.                                                                |
| int16_t                                  | PIC S9(4) BINARY. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| uint16_t                                 | PIC 9(4) BINARY. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.   |
| int32_t                                  | PIC S9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| uint32_t                                 | PIC 9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.   |
| UChar                                    | PIC 9(4) BINARY. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.   |
| UChar32                                  | PIC 9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.   |
| UNormalizationMode                       | PIC S9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| UErrorCode                               | PIC S9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| pointer(*) to object (e.g. UConverter *) | PIC S9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |
| Windows Handle                           | PIC S9(9) COMP-5. Can be COMP, COMP-4, or BINARY if you use the TRUNC(BIN) compiler option.  |

### Enumerations (first possibility)

C enumeration types do not translate very well into COBOL. There are two
possible ways to simulate these enumerations. The first one defines each
enumeration constant as an independent level 77 item.

#### C example

```c
    typedef enum {
      /** No decomposition/composition. @draft ICU 1.8 */
      UNORM_NONE = 1,
      /** Canonical decomposition. @draft ICU 1.8 */
      UNORM_NFD = 2,
      . . .
    } UNormalizationMode;
```

#### COBOL example

```cobol
       WORKING-STORAGE section.
      *--------------- Ported from unorm.h ------------
      * enum UNormalizationMode {
       77  UNORM-NONE            PIC S9(9) Binary value 1.
       77  UNORM-NFD             PIC S9(9) Binary value 2.
       …
```

### Enumerations (second possibility)

The second one defines a single variable and lists the enumeration constants
as level 88 condition names under it.

#### C example

```c
    /*==== utypes.h ========*/
    typedef enum UErrorCode {
      U_USING_FALLBACK_WARNING = -128, /* (not an error) */
      U_USING_DEFAULT_WARNING  = -127, /* (not an error) */
      . . .
    } UErrorCode;
```

#### COBOL example

```cobol
      *==== utypes.h ========
       01  UErrorCode            PIC S9(9) Binary value 0.
      * A resource bundle lookup returned a fallback
      * (not an error)
           88  U-USING-FALLBACK-WARNING   value -128.
      * (not an error)
           88  U-USING-DEFAULT-WARNING    value -127.
       . . .
```

## Call statement, calling by value or by reference

In general, arguments defined in C as pointers (`*`) must be listed in the
COBOL Call statement with the `using by reference` clause. Arguments which are not
pointers must be passed with the `using by value` clause. The exception to
this requirement is an argument that is a pointer already held in a
COBOL variable (e.g. as a value returned by an ICU API): it must be passed
by value. This is the case, for instance, for a pointer to a converter passed
as an argument to the conversion APIs.

### Conversion Declaration Examples

#### C (API definition in \*.h file)

```c
    /*--------------------- UCNV.H ---------------------------*/
    U_CAPI int32_t U_EXPORT2
    ucnv_toUChars(UConverter * cnv,
                  UChar * dest,
                  int32_t destCapacity,
                  const char * src,
                  int32_t srcLength,
                  UErrorCode * pErrorCode);
```

#### COBOL

```cobol
       PROCEDURE DIVISION.
           Call API-Pointer using
                by value     Converter-toU-Pointer
                by reference Unicode-Input-Buffer
                by value     destCapacity
                by reference Input-Buffer
                by value     srcLength
                by reference UErrorCode
                Returning    Text-Length.
```

## Call statement, Returning clause

### Returned value is Pointer or Binary

#### C (API definition in \*.h file)

```c
    U_CAPI UConverter * U_EXPORT2
    ucnv_open(const char * converterName,
              UErrorCode * err);
```

#### COBOL

```cobol
       WORKING-STORAGE section.
       01  Converter-Pointer     PIC S9(9) BINARY.
       PROCEDURE DIVISION.
           Move Z"iso-8859-8" to converterNameSource.
           . . .
           Call API-Pointer using
                by reference converterNameSource
                by reference UErrorCode
                Returning    Converter-Pointer.
```

### Returned value is a Pointer to string

If the returned value in C is a string pointer (`char *`), then in COBOL we
must use a pointer to a string defined in the Linkage section.

#### C (API definition in \*.h file)

```c
    U_CAPI const char * U_EXPORT2
    ucnv_getAvailableName(int32_t n);
```

#### COBOL

```cobol
       DATA DIVISION.
       WORKING-STORAGE section.
       01  Converter-Name-Link-Pointer  Usage is Pointer.
       LINKAGE section.
       01  Converter-Name-Link.
           03  Converter-Name-String    pic X(80).
       PROCEDURE DIVISION using Converter-Name-Link.
           Call API-Pointer using by value Converters-Index
                Returning Converter-Name-Link-Pointer.
           SET Address of Converter-Name-Link
               to Converter-Name-Link-Pointer.
           . . .
           Move Converter-Name-String to Debug-Value.
```

## How to invoke ICU APIs

Inter-language communication is often problematic. This is certainly the case
when calling C/C++ functions from COBOL, because of the very different roots of
the two languages.
How to invoke the ICU APIs from a COBOL program is likely to
depend on the operating system and even on the specific compilers in use. The
section below deals with COBOL to C calls on a Windows platform. Similar
sections should be added for other platforms.

### Windows platforms

The following instructions were tested on a Windows 2000 platform, with the IBM
VisualAge COBOL compiler and the Microsoft Visual C/C++ compiler.

For Windows, ICU APIs are normally packaged as DLLs (Dynamic-Link Libraries).
For technical reasons, COBOL calls to C/C++ functions need to be done via
dynamic loading of the DLLs at execution time (load on call).

The COBOL program must be compiled with the following compiler options:

    * options CBL PGMNAME(MIXED) CALLINT(SYSTEM) NODYNAM

In order to call an ICU API, two preparation steps are needed:

1. Load in memory the DLL which contains the API

2. Get the address of the API

For performance, it is better to perform these steps once before the first call
and to save the returned values for future use. The sample programs get the
address of APIs for each call, for the sake of logging; production programs
should get the address once and reuse it as many times as needed.

When no more APIs from a DLL are needed, the DLL should be unloaded in order to
free the associated memory.

#### Load DLL Into Memory

This is done as follows:

    Call "LoadLibraryA" using by reference DLL-Name
         Returning DLL-Handle.
    IF DLL-Handle = ZEROS
         Perform error handling. . .

Return value: DLL Handle, defined as `PIC S9(9) BINARY`

Input Value: DLL Name (null-terminated string)

Errors may happen if the DLL name is not correct, or the string is not
null-terminated, or the DLL file is not available (in the current directory or
in a directory included in the PATH system variable).
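
The load step above, together with the address lookup and unload steps described
in the following subsections, corresponds to a short sequence of C calls. The
sketch below is not taken from the sample programs. It is a minimal illustration
of that sequence as seen from the C side, which may help when checking a DLL name
or an API name before writing the COBOL declarations. The names `lib_handle`,
`api_fn`, `load_library`, `get_api_address` and `unload_library` are hypothetical,
and the non-Windows branch assumes that the POSIX `dlopen` family can stand in
for the Win32 calls.

```c
/* Sketch of the load / look up / unload sequence, seen from C.
 * All wrapper names below are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
typedef HMODULE lib_handle;
#else
#include <dlfcn.h>  /* assumption: POSIX dlopen/dlsym/dlclose replace the Win32 calls */
typedef void *lib_handle;
#endif

/* Generic function pointer; cast it to the real API signature before calling. */
typedef void (*api_fn)(void);

/* Step 1: load the library. The name must be a null-terminated string.
 * A NULL result is the failure that the COBOL code detects with "= ZEROS". */
lib_handle load_library(const char *name)
{
#ifdef _WIN32
    return LoadLibraryA(name);
#else
    return dlopen(name, RTLD_NOW);
#endif
}

/* Step 2: get the address of an API from its case-sensitive,
 * null-terminated name. Returns NULL if the name is not exported. */
api_fn get_api_address(lib_handle lib, const char *api_name)
{
    assert(api_name != NULL);
#ifdef _WIN32
    return (api_fn)GetProcAddress(lib, api_name);
#else
    void *sym = dlsym(lib, api_name);
    api_fn fn = NULL;
    if (sym != NULL)
        memcpy(&fn, &sym, sizeof fn);  /* object pointer to function pointer */
    return fn;
#endif
}

/* Step 3: unload the library. Returns 1 on success, 0 on failure. */
int unload_library(lib_handle lib)
{
#ifdef _WIN32
    return FreeLibrary(lib) != 0;
#else
    return dlclose(lib) == 0;
#endif
}
```

A caller would pass the DLL name to `load_library`, look up an API by name with
`get_api_address`, cast the result to the signature declared in the ICU header,
and call `unload_library` when done. Note that both the handle and the API
address are pointer-sized in C. On 64-bit Windows builds they are 8 bytes wide,
so a 4-byte `PIC S9(9) BINARY` item would be too small there. The declarations in
this document match the 32-bit setup on which the instructions were tested.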

#### Get API address

This is done as follows:

    Call "GetProcAddress" using by value DLL-Handle
                                by reference API-Name
         Returning API-Pointer.
    IF API-Pointer = NULL
         Perform error handling...

Return value: API address, defined as `PROCEDURE-POINTER`

Input Values: DLL Handle (returned by call to LoadLibraryA) and
Procedure Name (null-terminated string)

Errors may happen if the API name is not correct (remember that API names are
case-sensitive), or the string is not null-terminated, or the API is not
included in the specified DLL. If the API pointer is not null, the API is
called as follows, with the `using` and `returning` clauses written according to
the arguments and return value of the API.

    Call API-Pointer using . . . returning . . .

After calling an API, the returned error code should be checked when relevant.
Code to check for error conditions is illustrated in the sample programs.

#### Unload DLL from Memory

This is done as follows (the handle is a value held in a COBOL variable, so it
is passed by value, as in the call to GetProcAddress):

    Call "FreeLibrary" using by value DLL-Handle.

Return value: none is retrieved here (FreeLibrary itself returns a nonzero
value on success)

Input Value: DLL Handle (returned by call to LoadLibraryA)

## Sample Programs

Three sample programs are supplied with this document. The sample programs were
developed on and for a Windows 2000 platform. Some adaptations may be necessary
for other platforms.

Before running the sample programs, you must perform the following steps:

1. Install the version of ICU appropriate for your platform

2. Build ICU libraries if needed (see the ICU Readme file)

3. Make the libraries accessible (for instance on Windows systems, add the
   directory containing the libraries to the PATH system variable)

4. Compile the sample programs with appropriate compiler options

5. Copy the test files to a work directory

Each program is supplied with input test files and with a model log file.
If the
log file that you create by running a sample program is equivalent to the model
log file, your setup is probably correct.

Each of the three sample programs focuses on one area of ICU functionality:

1. Conversion

2. Collation

3. Normalization

### Conversion sample program

    * The sample program includes the following steps:
    *  - Display the names of the converters from a list of all
    *    converters contained in the alias file.
    *  - Display the current default converter name.
    *  - Set new default converter name.
    *
    *  - Read a string from Input file "ICU_Conv_Input_8.txt"
    *    (File in UTF-8 Format)
    *  - Convert this string from UTF-8 to code page iso-8859-8
    *  - Write the result to output file "ICU_Conv_Output.txt"
    *
    *  - Read a line from Input file "ICU_Conv_Input.txt"
    *    (File in ANSI Format, code page 862)
    *  - Convert this string from code page ibm-862 to UTF-16
    *  - Convert the resulting string from UTF-16 to code page windows-1255
    *  - Write the result to output file "ICU_Conv_Output.txt"
    *  - Write debugging information to Display and
    *    log file "ICU_Conv_Log.txt" (File in ANSI Format)
    *  - Repeat for all lines in Input file
    **
    * The following ICU APIs are used:
    *    ucnv_countAvailable
    *    ucnv_getAvailableName
    *    ucnv_getDefaultName
    *    ucnv_setDefaultName
    *    ucnv_convert
    *    ucnv_open
    *    ucnv_toUChars
    *    ucnv_fromUChars
    *    ucnv_close

The ucnv_xxx APIs are documented in file "UCNV.H".

### Collation sample program

    * The sample program includes the following steps:
    *  - Read a string array from Input file "ICU_Coll_Input.txt"
    *    (file in ANSI format)
    *  - Convert string array from code page into UTF-16 format
    *  - Compare the string array into the canonical composed
    *  - Perform bubble sort of string array, according
    *    to Unicode string equivalence comparisons
    *  - Convert string array from Unicode into code page format
    *  - Write the result to output file "ICU_Coll_Output.txt"
    *    (file in ANSI format)
    *  - Write debugging information to Display and
    *    log file "ICU_Coll_Log.txt" (file in ANSI format)
    **
    * The following ICU APIs are used:
    *    ucol_open
    *    ucol_strcoll
    *    ucol_close
    *    ucnv_open
    *    ucnv_toUChars
    *    ucnv_fromUChars
    *    ucnv_close

The ucol_xxx APIs are documented in file "UCOL.H".

The ucnv_xxx APIs are documented in file "UCNV.H".

### Normalization sample program

    * The sample includes the following steps:
    *  - Read a string from input file "ICU_NORM_Input.txt"
    *    (file in ANSI format)
    *  - Convert the string from code page into UTF-16 format
    *  - Perform quick check on the string, to determine if the
    *    string is in NFD (Canonical decomposition)
    *    normalization format.
    *  - Normalize the string into canonical composed form
    *    (FCD and decomposed)
    *  - Perform quick check on the result string, to determine
    *    if the string is in NFD normalization form
    *  - Convert the string from Unicode into the code page format
    *  - Write the result to output file "ICU_NORM_Output.txt"
    *    (file in ANSI format)
    *  - Write debugging information to Display and
    *    log file "ICU_NORM_Log.txt" (file in ANSI format)
    **
    * The following ICU APIs are used:
    *    ucnv_open
    *    ucnv_toUChars
    *    unorm_normalize
    *    unorm_quickCheck
    *    ucnv_fromUChars
    *    ucnv_close

The unorm_xxx APIs are documented in file "UNORM.H".

The ucnv_xxx APIs are documented in file "UCNV.H".