| /third_party/python/Lib/ |
| D | locale.py | 3 The module provides low-level access to the C lib's locale APIs and adds high 25 # Yuck: LC_MESSAGES is non-standard: can't tell whether it exists before 34 """ strcoll(string,string) -> int. 37 return (a > b) - (a < b) 40 """ strxfrm(string) -> string. 41 Returns a string that behaves for cmp locale-aware. 64 """ localeconv() -> dict. 65 Returns numeric and monetary locale-specific parameters. 88 """ setlocale(integer,string=None) -> string. 125 # if grouping is -1, we are done [all …]
|
| /third_party/icu/docs/userguide/strings/ |
| D | utf-8.md | 1 --- 3 title: UTF-8 6 --- 7 <!-- 10 --> 12 # UTF-8 chapter 15 UTF-16, except for conversion from bytes to strings (via InputStreamReader or 18 While most of ICU works with UTF-16 strings and uses data structures optimized 19 for UTF-16, there are APIs that facilitate working with UTF-8, or are optimized 20 for UTF-8, or work with Unicode code points (21-bit integer values) regardless [all …]
|
| /third_party/lzma/CPP/Common/ |
| D | UTFConvert.h | 49 if (NonUtf) s.Add_OptSpaced("non-UTF8"); in PrintStatus() 84 if (allowReduced == false) - all UTF-8 character sequences must be finished. 85 if (allowReduced == true) - it allows truncated last character-Utf8-sequence 100 it processes SINGLE-SURROGATE-8 as valid Unicode point. 101 it converts SINGLE-SURROGATE-8 to SINGLE-SURROGATE-16 102 Note: some sequencies of two SINGLE-SURROGATE-8 points 103 will generate correct SURROGATE-16-PAIR, and 104 that SURROGATE-16-PAIR later will be converted to correct 105 UTF8-SURROGATE-21 point. So we don't restore original 106 STR-8 sequence in that case. [all …]
|
| /third_party/icu/ohos_icu4j/src/main/tests/resources/ohos/global/icu/dev/test/charsetdet/ |
| D | CharsetDetectionTests.xml | 1 <?xml version="1.0" encoding="UTF-8"?> 3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. --> 4 <!-- License & terms of use: http://www.unicode.org/copyright.html#License --> 5 <!-- Copyright (c) 2005-2015 IBM Corporation and others. All rights reserved --> 6 <!-- See individual test cases for their specific copyright. --> 8 <charset-detection-tests> 9 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window… 10 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. --> 15 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس, 23 </test-case> [all …]
|
| /third_party/icu/icu4j/main/tests/core/src/com/ibm/icu/dev/test/charsetdet/ |
| D | CharsetDetectionTests.xml | 1 <?xml version="1.0" encoding="UTF-8"?> 3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. --> 4 <!-- License & terms of use: http://www.unicode.org/copyright.html --> 5 <!-- Copyright (c) 2005-2015 IBM Corporation and others. All rights reserved --> 6 <!-- See individual test cases for their specific copyright. --> 8 <charset-detection-tests> 9 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window… 10 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. --> 15 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس, 23 </test-case> [all …]
|
| /third_party/icu/icu4c/source/test/testdata/ |
| D | csdetest.xml | 1 <?xml version="1.0" encoding="UTF-8"?> 3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.uni… 4 <!-- Copyright (c) 2005-2013 IBM Corporation and others. All rights reserved --> 5 <!-- See individual test cases for their specific copyright. --> 7 <charset-detection-tests> 8 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window… 9 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. --> 14 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس, 22 </test-case> 24 … <test-case id="IUC10-da-Q" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE windows-1252/da"> [all …]
|
| /third_party/skia/m133/third_party/externals/icu/source/test/testdata/ |
| D | csdetest.xml | 1 <?xml version="1.0" encoding="UTF-8"?> 3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.uni… 4 <!-- Copyright (c) 2005-2013 IBM Corporation and others. All rights reserved --> 5 <!-- See individual test cases for their specific copyright. --> 7 <charset-detection-tests> 8 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window… 9 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. --> 14 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس, 22 </test-case> 24 … <test-case id="IUC10-da-Q" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE windows-1252/da"> [all …]
|
| /third_party/cups/cups/ |
| D | transcode.c | 4 * Copyright © 2020-2024 by OpenPrinting. 5 * Copyright 2007-2014 by Apple Inc. 6 * Copyright 1997-2007 by Easy Software Products. 15 #include "cups-private.h" 16 #include "debug-internal.h" 31 static iconv_t map_from_utf8 = (iconv_t)-1; 32 /* Convert from UTF-8 to charset */ 33 static iconv_t map_to_utf8 = (iconv_t)-1; 34 /* Convert from charset to UTF-8 */ 41 * '_cupsCharmapFlush()' - Flush all character set maps out of cache. [all …]
|
| /third_party/pcre2/pcre2/testdata/ |
| D | testoutput10 | 1 # This set of tests is for UTF-8 support and Unicode property support, with 2 # relevance only for the 8-bit library. 6 # The next 5 patterns have UTF-8 errors 8 /[�]/utf 9 Failed: error -8 at offset 1: UTF-8 error: byte 2 top bits not 0x80 11 /�/utf 12 Failed: error -3 at offset 0: UTF-8 error: 1 byte missing at end 14 /���xxx/utf 15 Failed: error -8 at offset 0: UTF-8 error: byte 2 top bits not 0x80 17 /Â��������/utf [all …]
|
| D | testoutput14-8 | 1 # These test special UTF and UCP features of DFA matching. The output is 6 # ---------------------------------------------------- 8 # non-DFA matching. 10 /X/utf 12 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2 14 Error -36 (bad UTF-8 offset) 18 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2 22 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2 26 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2 30 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2 [all …]
|
| D | testinput10 | 1 # This set of tests is for UTF-8 support and Unicode property support, with 2 # relevance only for the 8-bit library. 6 # The next 5 patterns have UTF-8 errors 8 /[�]/utf 10 /�/utf 12 /���xxx/utf 14 /Â��������/utf 20 /badutf/utf 21 \= Expect UTF-8 errors 62 /badutf/utf [all …]
|
| /third_party/grpc/third_party/utf8_range/utf8_corpus_dir/ |
| D | utf8_corpus_kuhn.txt | 1 UTF-8 decoder capability and stress test 2 ---------------------------------------- 4 Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0 6 This test file can help you examine, how your UTF-8 decoder handles 7 various types of correct, malformed, or otherwise interesting UTF-8 12 help you think about, and test, the behaviour of your UTF-8 decoder on a 14 that most first-time authors of UTF-8 decoders find at least one 17 The test lines below cover boundary conditions, malformed UTF-8 18 sequences, as well as correctly encoded UTF-8 sequences of Unicode code 19 points that should never occur in a correct UTF-8 file. [all …]
|
| /third_party/protobuf/third_party/utf8_range/utf8_corpus_dir/ |
| D | utf8_corpus_kuhn.txt | 1 UTF-8 decoder capability and stress test 2 ---------------------------------------- 4 Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0 6 This test file can help you examine, how your UTF-8 decoder handles 7 various types of correct, malformed, or otherwise interesting UTF-8 12 help you think about, and test, the behaviour of your UTF-8 decoder on a 14 that most first-time authors of UTF-8 decoders find at least one 17 The test lines below cover boundary conditions, malformed UTF-8 18 sequences, as well as correctly encoded UTF-8 sequences of Unicode code 19 points that should never occur in a correct UTF-8 file. [all …]
|
| /third_party/protobuf/java/core/src/test/java/com/google/protobuf/ |
| D | CheckUtf8Test.java | 1 // Protocol Buffers - Google's data interchange format 4 // Use of this source code is governed by a BSD-style 6 // https://developers.google.com/open-source/licenses/bsd 24 * UTF-8 checks. 51 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildRequiredStringWithBadUtf8() 53 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildRequiredStringWithBadUtf8() 61 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildOptionalStringWithBadUtf8() 63 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildOptionalStringWithBadUtf8() 71 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildRepeatedStringWithBadUtf8() 73 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildRepeatedStringWithBadUtf8() [all …]
|
| /third_party/python/Lib/test/test_email/ |
| D | test__encoded_words.py | 62 _ew.decode('=?utf-8?X?somevalue?=') 64 def _test(self, source, result, charset='us-ascii', lang='', defects=[]): 72 self._test('=?us-ascii?q?foo?=', 'foo') 75 self._test('=?us-ascii?b?dmk=?=', 'vi') 78 self._test('=?us-ascii?Q?foo?=', 'foo') 81 self._test('=?us-ascii?B?dmk=?=', 'vi') 84 self._test('=?latin-1?q?=20F=fcr=20Elise=20?=', ' Für Elise ', 'latin-1') 87 self._test(b'=?us-ascii?q?=20\xACfoo?='.decode('us-ascii', 93 self._test(b'=?us-ascii?b?dm\xACk?='.decode('us-ascii', 101 self._test('=?us-ascii?b?dm\x01k===?=', [all …]
|
| /third_party/python/Lib/test/ |
| D | test_utf8_mode.py | 2 Test the implementation of the PEP 540: the UTF-8 Mode. 46 out = self.get_output('-c', code, LC_ALL=loc) 52 out = self.get_output('-X', 'utf8', '-c', code) 55 # undocumented but accepted syntax: -X utf8=1 56 out = self.get_output('-X', 'utf8=1', '-c', code) 59 out = self.get_output('-X', 'utf8=0', '-c', code) 63 # PYTHONLEGACYWINDOWSFSENCODING disables the UTF-8 Mode 64 # and has the priority over -X utf8 65 out = self.get_output('-X', 'utf8', '-c', code, 72 out = self.get_output('-c', code, PYTHONUTF8='1') [all …]
|
| /third_party/PyYAML/tests/legacy_tests/ |
| D | test_input_output.py | 7 data = file.read().decode('utf-8') 13 for input in [data.encode('utf-8'), 14 codecs.BOM_UTF8+data.encode('utf-8'), 15 codecs.BOM_UTF16_BE+data.encode('utf-16-be'), 16 codecs.BOM_UTF16_LE+data.encode('utf-16-le')]: 28 data = file.read().decode('utf-8') 29 for input in [data.encode('utf-16-be'), 30 data.encode('utf-16-le'), 31 codecs.BOM_UTF8+data.encode('utf-16-be'), 32 codecs.BOM_UTF8+data.encode('utf-16-le')]: [all …]
|
| /third_party/rust/crates/regex/regex-capi/ |
| D | README.md | 19 -------- 20 There are readable examples in the `ctest` and `examples` sub-directories. 23 [Rust and Cargo installed](https://www.rust-lang.org/downloads.html) 27 $ git clone git://github.com/rust-lang/regex 28 $ cd regex/regex-capi/examples 35 ----------- 45 https://github.com/rust-lang/regex/blob/master/PERFORMANCE.md 49 ------------- 50 All regular expressions must be valid UTF-8. 53 approximation, haystacks should be UTF-8. In fact, UTF-8 (and, one [all …]
|
| /third_party/icu/icu4j/perf-tests/ |
| D | normperf.pl | 5 # * Copyright (C) 2002-2007 International Business Machines Corporation and * 15 #--------------------------------------------------------------------- 39 [ "TestNames_SerbianSH.txt", "UTF-8", "b"], 40 # [ "arabic.txt", "UTF-8", "b"], 41 # [ "french.txt", "UTF-8", "b"], 42 # [ "greek.txt", "UTF-8", "b"], 43 # [ "hebrew.txt", "UTF-8", "b"], 44 # [ "hindi.txt" , "UTF-8", "b"], 45 # [ "japanese.txt", "UTF-8", "b"], 46 # [ "korean.txt", "UTF-8", "b"], [all …]
|
| /third_party/pcre2/pcre2/src/ |
| D | pcre2_error.c | 2 * Perl-Compatible Regular Expressions * 9 Original API code Copyright (c) 1997-2012 University of Cambridge 10 New API code Copyright (c) 2016-2024 University of Cambridge 12 ----------------------------------------------------------------------------- 38 ----------------------------------------------------------------------------- 51 /* The texts of compile-time error messages. Compile-time error numbers start 58 pcre2_get_error_message() counts through to the one it wants - this isn't a 79 "unrecognized character after (? or (?-\0" 84 "reference to non-existent subpattern\0" 85 "pattern passed as NULL with non-zero length\0" [all …]
|
| /third_party/mindspore/mindspore-src/source/tests/ut/python/dataset/ |
| D | test_save_op.py | 1 # Copyright 2020-2022 Huawei Technologies Co., Ltd 7 # http://www.apache.org/licenses/LICENSE-2.0 49 file_name = os.environ.get('PYTEST_CURRENT_TEST').split(':')[-1].split(' ')[0] 50 data = [{"image1": bytes("image1 bytes abcddddd", encoding='UTF-8'), 51 "image2": bytes("image1 bytes def", encoding='UTF-8'), 52 "image3": bytes("image1 bytes ghixxxxxxxxxx", encoding='UTF-8'), 53 "image4": bytes("image1 bytes jklzz", encoding='UTF-8'), 54 "image5": bytes("image1 bytes mno", encoding='UTF-8')}, 55 {"image1": bytes("image2 bytes abca", encoding='UTF-8'), 56 "image2": bytes("image2 bytes defbb", encoding='UTF-8'), [all …]
|
| /third_party/tex-hyphen/hyph-utf8/source/generic/hyph-utf8/lib/tex/hyphen/ |
| D | packages.yml | 3 Hyphenation patterns for Finnish in T1 and UTF-8 encodings. 5 while the newer ones (fi-x-school) implements the simpler rules taught at Finnish school. 8 description: |- 9 Hyphenation patterns for German in T1/EC and UTF-8 encodings, 11 The package includes the latest patterns from dehyph-exptl 13 however 8-bit engines still load old versions of patterns 14 for 'german' and 'ngerman' for backward-compatibility reasons. 27 description: |- 29 spelling in LGR and UTF-8 encodings. Patterns in UTF-8 use two code 40 description: |- [all …]
|
| /third_party/cups/examples/ |
| D | ipp-everywhere.test | 4 # Copyright © 2020-2024 by OpenPrinting. 5 # Copyright © 2007-2018 by Apple Inc. 6 # Copyright © 2001-2006 by Easy Software Products. All rights reserved. 13 # ./ipptool -V 2.0 -tf filename.ext printer-uri ipp-everywhere.test 17 INCLUDE "ipp-2.0.test" 24 NAME "PWG 5100.14 section 5.1/5.2 - Required Operations and Attributes" 25 OPERATION Get-Printer-Attributes 26 GROUP operation-attributes-tag 27 ATTR charset attributes-charset utf-8 28 ATTR naturalLanguage attributes-natural-language en [all …]
|
| /third_party/jerryscript/jerry-core/lit/ |
| D | lit-globals.h | 7 * http://www.apache.org/licenses/LICENSE-2.0 22 * ECMAScript standard defines terms "code unit" and "character" as 16-bit unsigned value 23 …* used to represent 16-bit unit of text, this is the same as code unit in UTF-16 (See ECMA-262 5.1… 26 …* than 16 bits: 0x0 - 0x10FFFFF). One code point could be represented with one ore two 16-bit code… 32 …* Internally JerryScript engine uses UTF-8 representation of strings to reduce memory overhead. Un… 33 * occupies from one to four bytes in UTF-8 representation. 35 * Unicode scalar value | Bytes in UTF-8 | Bytes in UTF-16 37 * ---------------------------------------------------------------------- 38 * 0x0 - 0x7F | 1 byte | 2 bytes 39 * 0x80 - 0x7FF | 2 bytes | 2 bytes [all …]
|
| /third_party/icu/docs/userguide/icu/ |
| D | unicode.md | 1 --- 6 --- 7 <!-- 10 --> 16 {: .no_toc .text-delta } 21 --- 41 Go to the [online ICU demos](https://icu4c-demos.unicode.org/icu-bin/icudemos) to 42 see how a Unicode-based server application can handle text in many languages and 47 Representing text-format data in computers is a matter of defining a set of 67 graphic, displayable characters. It was designed to represent English-language [all …]
|