# Copyright (C) 2016 and later: Unicode, Inc. and others.
# License & terms of use: http://www.unicode.org/copyright.html
# Copyright (C) 2010, International Business Machines
# Corporation and others. All Rights Reserved.
#
# file name: testnorm.txt
# encoding: US-ASCII
# tab size: 8 (not used)
# indentation:4
#
# created on: 2010feb15
# created by: Markus W. Scherer
#
# Normalization test data, for improving code coverage.

# Selection of Canonical_Combining_Class (ccc) values
0300..0314:230
0315:232
0316..0319:220
031A:232
031B:216
031C..0320:220
0321..0322:202
0323..0326:220
0327..0328:202
0329..0333:220
0334..0338:1
0339..033C:220
033D..0344:230
0345:240
0346:230
0347..0349:220
034A..034C:230
034D..034E:220
0350..0352:230
0353..0356:220
0357:230
0358:232
0359..035A:220
035B:230
035C:233
035D..035E:234
035F:233
0360..0361:234
0362:233
0363..036F:230
# ICU 63 normalization with UCPTrie requires inert surrogate code points.
# D802:2  # surrogates with non-zero combining classes
# D803:3
# D804:4
110B9:9
110BA:7

# Some interesting mappings
00C0=0041 0300
00C1=0041 0301
00C2=0041 0302
00C3=0041 0303
00C4=0041 0308
00C5=0041 030A
00C7=0043 0327
# ICU 63 normalization with UCPTrie requires inert surrogate code points.
# D800>D7FF  # surrogates with mappings, and mappings to empty strings
# D801>
# DFFE>
# DFFF>FFFF
E000>
E001=61 338  # composition with trail<=33FF and composite>7FFF
E002=E001 308  # recursive mapping needs reordering
E003>62 307 327 337  # mapping needs reordering
E011=E010 F0011  # composition of BMP+supplementary, and F0011 is maybe & combines-fwd
E111>1101  # mapping ends in Jamo L
E112>1102 62  # mapping starts with Jamo L
FFF3>FFF4
FFF4>FFF5
FFF5>FFF7
FFF7>10037
10036>FFF6
10077>10037
1109A=11099 110BA
1109C=1109B 110BA
110AB=110A5 110BA
F0010=F0011 E012  # composition of supplementary+BMP