Searched +full:utf +full:- +full:8 (Results 1 – 25 of 1126) sorted by relevance

/third_party/python/Lib/
Dlocale.py3 The module provides low-level access to the C lib's locale APIs and adds high
25 # Yuck: LC_MESSAGES is non-standard: can't tell whether it exists before
34 """ strcoll(string,string) -> int.
37 return (a > b) - (a < b)
40 """ strxfrm(string) -> string.
41 Returns a string that behaves for cmp locale-aware.
64 """ localeconv() -> dict.
65 Returns numeric and monetary locale-specific parameters.
88 """ setlocale(integer,string=None) -> string.
125 # if grouping is -1, we are done
[all …]
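The `(a > b) - (a < b)` expression in the `strcoll` snippet above is a three-way comparison: it yields -1, 0, or 1. A minimal sketch of the idiom follows, assuming plain code point ordering rather than locale-aware collation; `three_way_compare` is a hypothetical name used only for illustration.

```python
from functools import cmp_to_key


def three_way_compare(a: str, b: str) -> int:
    """Return -1 if a < b, 0 if equal, 1 if a > b (code point order)."""
    # Booleans are integers in Python, so the subtraction yields -1, 0, or 1.
    return (a > b) - (a < b)


# cmp_to_key adapts a three-way comparison function for use with sorted().
words = ["pear", "Apple", "apple"]
sorted_words = sorted(words, key=cmp_to_key(three_way_compare))
```

For locale-aware ordering, `locale.strcoll` can be passed to `cmp_to_key` in the same way once a locale has been set.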
/third_party/icu/docs/userguide/strings/
Dutf-8.md1 ---
3 title: UTF-8
6 ---
7 <!--
10 -->
12 # UTF-8 chapter
15 UTF-16, except for conversion from bytes to strings (via InputStreamReader or
18 While most of ICU works with UTF-16 strings and uses data structures optimized
19 for UTF-16, there are APIs that facilitate working with UTF-8, or are optimized
20 for UTF-8, or work with Unicode code points (21-bit integer values) regardless
[all …]
/third_party/lzma/CPP/Common/
DUTFConvert.h49 if (NonUtf) s.Add_OptSpaced("non-UTF8"); in PrintStatus()
84 if (allowReduced == false) - all UTF-8 character sequences must be finished.
85 if (allowReduced == true) - it allows truncated last character-Utf8-sequence
100 it processes SINGLE-SURROGATE-8 as valid Unicode point.
101 it converts SINGLE-SURROGATE-8 to SINGLE-SURROGATE-16
102 Note: some sequencies of two SINGLE-SURROGATE-8 points
103 will generate correct SURROGATE-16-PAIR, and
104 that SURROGATE-16-PAIR later will be converted to correct
105 UTF8-SURROGATE-21 point. So we don't restore original
106 STR-8 sequence in that case.
[all …]
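The `allowReduced` comments above distinguish input whose last UTF-8 sequence must be complete from input whose last sequence may be truncated. The sketch below is only an analogy built on Python's incremental decoder, not the 7-Zip API; `check_utf8` and its `allow_reduced` parameter are hypothetical names. Unlike the behaviour described in the snippet, Python's strict decoder rejects encoded surrogates.

```python
import codecs


def check_utf8(data: bytes, allow_reduced: bool) -> bool:
    """Return True if data is valid UTF-8.

    With allow_reduced=True, an unfinished multi-byte sequence at the very
    end of the input is tolerated; any other malformed byte is still an error.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False lets the decoder buffer an incomplete trailing sequence
        # instead of raising; final=True requires every sequence to be finished.
        decoder.decode(data, final=not allow_reduced)
    except UnicodeDecodeError:
        return False
    return True
```

For example, `b"ab\xe2\x82"` (the first two bytes of a three-byte sequence) passes only when `allow_reduced` is true.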
/third_party/icu/ohos_icu4j/src/main/tests/resources/ohos/global/icu/dev/test/charsetdet/
DCharsetDetectionTests.xml1 <?xml version="1.0" encoding="UTF-8"?>
3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. -->
4 <!-- License & terms of use: http://www.unicode.org/copyright.html#License -->
5 <!-- Copyright (c) 2005-2015 IBM Corporation and others. All rights reserved -->
6 <!-- See individual test cases for their specific copyright. -->
8 <charset-detection-tests>
9 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window…
10 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. -->
15 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس,
23 </test-case>
[all …]
/third_party/icu/icu4j/main/tests/core/src/com/ibm/icu/dev/test/charsetdet/
DCharsetDetectionTests.xml1 <?xml version="1.0" encoding="UTF-8"?>
3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. -->
4 <!-- License & terms of use: http://www.unicode.org/copyright.html -->
5 <!-- Copyright (c) 2005-2015 IBM Corporation and others. All rights reserved -->
6 <!-- See individual test cases for their specific copyright. -->
8 <charset-detection-tests>
9 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window…
10 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. -->
15 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس,
23 </test-case>
[all …]
/third_party/icu/icu4c/source/test/testdata/
Dcsdetest.xml1 <?xml version="1.0" encoding="UTF-8"?>
3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.uni…
4 <!-- Copyright (c) 2005-2013 IBM Corporation and others. All rights reserved -->
5 <!-- See individual test cases for their specific copyright. -->
7 <charset-detection-tests>
8 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window…
9 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. -->
14 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس,
22 </test-case>
24 … <test-case id="IUC10-da-Q" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE windows-1252/da">
[all …]
/third_party/skia/m133/third_party/externals/icu/source/test/testdata/
Dcsdetest.xml1 <?xml version="1.0" encoding="UTF-8"?>
3 <!-- Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.uni…
4 <!-- Copyright (c) 2005-2013 IBM Corporation and others. All rights reserved -->
5 <!-- See individual test cases for their specific copyright. -->
7 <charset-detection-tests>
8 …<test-case id="IUC10-ar" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE ISO-8859-6/ar window…
9 <!-- Copyright © 1991-2005 Unicode, Inc. All rights reserved. -->
14 تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود, الذي سيعقد في 10-12 آذار 1997 بمدينة ماينتس,
22 </test-case>
24 … <test-case id="IUC10-da-Q" encodings="UTF-8 UTF-16LE UTF-16BE UTF-32BE UTF-32LE windows-1252/da">
[all …]
/third_party/cups/cups/
Dtranscode.c4 * Copyright © 2020-2024 by OpenPrinting.
5 * Copyright 2007-2014 by Apple Inc.
6 * Copyright 1997-2007 by Easy Software Products.
15 #include "cups-private.h"
16 #include "debug-internal.h"
31 static iconv_t map_from_utf8 = (iconv_t)-1;
32 /* Convert from UTF-8 to charset */
33 static iconv_t map_to_utf8 = (iconv_t)-1;
34 /* Convert from charset to UTF-8 */
41 * '_cupsCharmapFlush()' - Flush all character set maps out of cache.
[all …]
/third_party/pcre2/pcre2/testdata/
Dtestoutput101 # This set of tests is for UTF-8 support and Unicode property support, with
2 # relevance only for the 8-bit library.
6 # The next 5 patterns have UTF-8 errors
8 /[�]/utf
9 Failed: error -8 at offset 1: UTF-8 error: byte 2 top bits not 0x80
11 /�/utf
12 Failed: error -3 at offset 0: UTF-8 error: 1 byte missing at end
14 /���xxx/utf
15 Failed: error -8 at offset 0: UTF-8 error: byte 2 top bits not 0x80
17 /��������/utf
[all …]
Dtestoutput14-81 # These test special UTF and UCP features of DFA matching. The output is
6 # ----------------------------------------------------
8 # non-DFA matching.
10 /X/utf
12 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
14 Error -36 (bad UTF-8 offset)
18 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
22 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
26 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
30 Failed: error -16: UTF-8 error: code points 0xd800-0xdfff are not defined at offset 2
[all …]
Dtestinput101 # This set of tests is for UTF-8 support and Unicode property support, with
2 # relevance only for the 8-bit library.
6 # The next 5 patterns have UTF-8 errors
8 /[�]/utf
10 /�/utf
12 /���xxx/utf
14 /��������/utf
20 /badutf/utf
21 \= Expect UTF-8 errors
62 /badutf/utf
[all …]
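The PCRE2 outputs above report the offset and kind of each UTF-8 error (a bad continuation byte, a missing byte at the end, or a code point in the surrogate range 0xd800-0xdfff). As a rough illustration of the same classes of error, the sketch below uses Python's strict decoder; it is not PCRE2, its error text differs, and the byte strings in the usage note are hypothetical inputs, since the original test bytes are not recoverable from the listing. `utf8_error` is a hypothetical helper name.

```python
def utf8_error(data: bytes):
    """Return (offset, reason) for the first UTF-8 error, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where decoding failed.
        return exc.start, exc.reason
    return None
```

With this helper, `b"[\xc3]"` fails at offset 1 (the lead byte is not followed by a continuation byte), `b"\xe2\x82"` fails because the sequence is cut short, and `b"\xed\xa0\x80"` fails because it would encode the surrogate U+D800.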
/third_party/grpc/third_party/utf8_range/utf8_corpus_dir/
Dutf8_corpus_kuhn.txt1 UTF-8 decoder capability and stress test
2 ----------------------------------------
4 Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0
6 This test file can help you examine, how your UTF-8 decoder handles
7 various types of correct, malformed, or otherwise interesting UTF-8
12 help you think about, and test, the behaviour of your UTF-8 decoder on a
14 that most first-time authors of UTF-8 decoders find at least one
17 The test lines below cover boundary conditions, malformed UTF-8
18 sequences, as well as correctly encoded UTF-8 sequences of Unicode code
19 points that should never occur in a correct UTF-8 file.
[all …]
/third_party/protobuf/third_party/utf8_range/utf8_corpus_dir/
Dutf8_corpus_kuhn.txt1 UTF-8 decoder capability and stress test
2 ----------------------------------------
4 Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0
6 This test file can help you examine, how your UTF-8 decoder handles
7 various types of correct, malformed, or otherwise interesting UTF-8
12 help you think about, and test, the behaviour of your UTF-8 decoder on a
14 that most first-time authors of UTF-8 decoders find at least one
17 The test lines below cover boundary conditions, malformed UTF-8
18 sequences, as well as correctly encoded UTF-8 sequences of Unicode code
19 points that should never occur in a correct UTF-8 file.
[all …]
/third_party/protobuf/java/core/src/test/java/com/google/protobuf/
DCheckUtf8Test.java1 // Protocol Buffers - Google's data interchange format
4 // Use of this source code is governed by a BSD-style
6 // https://developers.google.com/open-source/licenses/bsd
24 * UTF-8 checks.
51 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildRequiredStringWithBadUtf8()
53 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildRequiredStringWithBadUtf8()
61 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildOptionalStringWithBadUtf8()
63 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildOptionalStringWithBadUtf8()
71 assertWithMessage("Expected IllegalArgumentException for non UTF-8 byte string.").fail(); in testBuildRepeatedStringWithBadUtf8()
73 assertThat(exception).hasMessageThat().isEqualTo("Byte string is not UTF-8."); in testBuildRepeatedStringWithBadUtf8()
[all …]
/third_party/python/Lib/test/test_email/
Dtest__encoded_words.py62 _ew.decode('=?utf-8?X?somevalue?=')
64 def _test(self, source, result, charset='us-ascii', lang='', defects=[]):
72 self._test('=?us-ascii?q?foo?=', 'foo')
75 self._test('=?us-ascii?b?dmk=?=', 'vi')
78 self._test('=?us-ascii?Q?foo?=', 'foo')
81 self._test('=?us-ascii?B?dmk=?=', 'vi')
84 self._test('=?latin-1?q?=20F=fcr=20Elise=20?=', ' Für Elise ', 'latin-1')
87 self._test(b'=?us-ascii?q?=20\xACfoo?='.decode('us-ascii',
93 self._test(b'=?us-ascii?b?dm\xACk?='.decode('us-ascii',
101 self._test('=?us-ascii?b?dm\x01k===?=',
[all …]
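The test strings above are RFC 2047 encoded words of the form `=?charset?encoding?payload?=`, where the encoding is `q` (quoted-printable style) or `b` (base64). The tests exercise a private module; the sketch below instead uses the public `email.header.decode_header` function, so it illustrates the format rather than the code under test. `decode_encoded_word` is a hypothetical helper name.

```python
from email.header import decode_header


def decode_encoded_word(text: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded words."""
    parts = []
    for payload, charset in decode_header(text):
        if isinstance(payload, bytes):
            # Encoded words come back as bytes plus the declared charset.
            parts.append(payload.decode(charset or "ascii"))
        else:
            # Unencoded text may already be a str.
            parts.append(payload)
    return "".join(parts)
```

Applied to two of the inputs shown above, `=?latin-1?q?=20F=fcr=20Elise=20?=` decodes to `' Für Elise '` and `=?us-ascii?b?dmk=?=` decodes to `'vi'`.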
/third_party/python/Lib/test/
Dtest_utf8_mode.py2 Test the implementation of the PEP 540: the UTF-8 Mode.
46 out = self.get_output('-c', code, LC_ALL=loc)
52 out = self.get_output('-X', 'utf8', '-c', code)
55 # undocumented but accepted syntax: -X utf8=1
56 out = self.get_output('-X', 'utf8=1', '-c', code)
59 out = self.get_output('-X', 'utf8=0', '-c', code)
63 # PYTHONLEGACYWINDOWSFSENCODING disables the UTF-8 Mode
64 # and has the priority over -X utf8
65 out = self.get_output('-X', 'utf8', '-c', code,
72 out = self.get_output('-c', code, PYTHONUTF8='1')
[all …]
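The test snippets above toggle the UTF-8 Mode with `-X utf8`, `-X utf8=1`, `-X utf8=0`, and `PYTHONUTF8`. A minimal sketch of observing the effect follows; it launches a child interpreter and reads `sys.flags.utf8_mode`. The helper name `utf8_mode_flag` is hypothetical, and the sketch assumes Python 3.7 or later with no conflicting settings such as `PYTHONLEGACYWINDOWSFSENCODING`.

```python
import subprocess
import sys


def utf8_mode_flag(*interpreter_args: str) -> str:
    """Run a child interpreter and return its sys.flags.utf8_mode as text."""
    code = "import sys; print(sys.flags.utf8_mode)"
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `utf8_mode_flag("-X", "utf8")` should report the mode as enabled, and `utf8_mode_flag("-X", "utf8=0")` as disabled.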
/third_party/PyYAML/tests/legacy_tests/
Dtest_input_output.py7 data = file.read().decode('utf-8')
13 for input in [data.encode('utf-8'),
14 codecs.BOM_UTF8+data.encode('utf-8'),
15 codecs.BOM_UTF16_BE+data.encode('utf-16-be'),
16 codecs.BOM_UTF16_LE+data.encode('utf-16-le')]:
28 data = file.read().decode('utf-8')
29 for input in [data.encode('utf-16-be'),
30 data.encode('utf-16-le'),
31 codecs.BOM_UTF8+data.encode('utf-16-be'),
32 codecs.BOM_UTF8+data.encode('utf-16-le')]:
[all …]
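The first loop in the test above builds inputs by prefixing encoded text with the matching byte order mark (BOM). The sketch below shows how Python's standard codecs treat such input; it does not use PyYAML, and the sample string is an arbitrary assumption. The second loop in the snippet pairs `BOM_UTF8` with UTF-16 data, which is a deliberate mismatch and is not reproduced here.

```python
import codecs

text = "caf\u00e9"

# Prefix each encoding with its matching byte order mark.
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
utf16_be_with_bom = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
utf16_le_with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

decoded = [
    # "utf-8-sig" strips a leading UTF-8 BOM if one is present.
    utf8_with_bom.decode("utf-8-sig"),
    # The generic "utf-16" codec reads the BOM to pick the byte order.
    utf16_be_with_bom.decode("utf-16"),
    utf16_le_with_bom.decode("utf-16"),
]

# The plain "utf-8" codec keeps the BOM as the character U+FEFF.
kept_bom = utf8_with_bom.decode("utf-8")
```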
/third_party/rust/crates/regex/regex-capi/
DREADME.md19 --------
20 There are readable examples in the `ctest` and `examples` sub-directories.
23 [Rust and Cargo installed](https://www.rust-lang.org/downloads.html)
27 $ git clone git://github.com/rust-lang/regex
28 $ cd regex/regex-capi/examples
35 -----------
45 https://github.com/rust-lang/regex/blob/master/PERFORMANCE.md
49 -------------
50 All regular expressions must be valid UTF-8.
53 approximation, haystacks should be UTF-8. In fact, UTF-8 (and, one
[all …]
/third_party/icu/icu4j/perf-tests/
Dnormperf.pl5 # * Copyright (C) 2002-2007 International Business Machines Corporation and *
15 #---------------------------------------------------------------------
39 [ "TestNames_SerbianSH.txt", "UTF-8", "b"],
40 # [ "arabic.txt", "UTF-8", "b"],
41 # [ "french.txt", "UTF-8", "b"],
42 # [ "greek.txt", "UTF-8", "b"],
43 # [ "hebrew.txt", "UTF-8", "b"],
44 # [ "hindi.txt" , "UTF-8", "b"],
45 # [ "japanese.txt", "UTF-8", "b"],
46 # [ "korean.txt", "UTF-8", "b"],
[all …]
/third_party/pcre2/pcre2/src/
Dpcre2_error.c2 * Perl-Compatible Regular Expressions *
9 Original API code Copyright (c) 1997-2012 University of Cambridge
10 New API code Copyright (c) 2016-2024 University of Cambridge
12 -----------------------------------------------------------------------------
38 -----------------------------------------------------------------------------
51 /* The texts of compile-time error messages. Compile-time error numbers start
58 pcre2_get_error_message() counts through to the one it wants - this isn't a
79 "unrecognized character after (? or (?-\0"
84 "reference to non-existent subpattern\0"
85 "pattern passed as NULL with non-zero length\0"
[all …]
/third_party/mindspore/mindspore-src/source/tests/ut/python/dataset/
Dtest_save_op.py1 # Copyright 2020-2022 Huawei Technologies Co., Ltd
7 # http://www.apache.org/licenses/LICENSE-2.0
49 file_name = os.environ.get('PYTEST_CURRENT_TEST').split(':')[-1].split(' ')[0]
50 data = [{"image1": bytes("image1 bytes abcddddd", encoding='UTF-8'),
51 "image2": bytes("image1 bytes def", encoding='UTF-8'),
52 "image3": bytes("image1 bytes ghixxxxxxxxxx", encoding='UTF-8'),
53 "image4": bytes("image1 bytes jklzz", encoding='UTF-8'),
54 "image5": bytes("image1 bytes mno", encoding='UTF-8')},
55 {"image1": bytes("image2 bytes abca", encoding='UTF-8'),
56 "image2": bytes("image2 bytes defbb", encoding='UTF-8'),
[all …]
/third_party/tex-hyphen/hyph-utf8/source/generic/hyph-utf8/lib/tex/hyphen/
Dpackages.yml3 Hyphenation patterns for Finnish in T1 and UTF-8 encodings.
5 while the newer ones (fi-x-school) implements the simpler rules taught at Finnish school.
8 description: |-
9 Hyphenation patterns for German in T1/EC and UTF-8 encodings,
11 The package includes the latest patterns from dehyph-exptl
13 however 8-bit engines still load old versions of patterns
14 for 'german' and 'ngerman' for backward-compatibility reasons.
27 description: |-
29 spelling in LGR and UTF-8 encodings. Patterns in UTF-8 use two code
40 description: |-
[all …]
/third_party/cups/examples/
Dipp-everywhere.test4 # Copyright © 2020-2024 by OpenPrinting.
5 # Copyright © 2007-2018 by Apple Inc.
6 # Copyright © 2001-2006 by Easy Software Products. All rights reserved.
13 # ./ipptool -V 2.0 -tf filename.ext printer-uri ipp-everywhere.test
17 INCLUDE "ipp-2.0.test"
24 NAME "PWG 5100.14 section 5.1/5.2 - Required Operations and Attributes"
25 OPERATION Get-Printer-Attributes
26 GROUP operation-attributes-tag
27 ATTR charset attributes-charset utf-8
28 ATTR naturalLanguage attributes-natural-language en
[all …]
/third_party/jerryscript/jerry-core/lit/
Dlit-globals.h7 * http://www.apache.org/licenses/LICENSE-2.0
22 * ECMAScript standard defines terms "code unit" and "character" as 16-bit unsigned value
23 …* used to represent 16-bit unit of text, this is the same as code unit in UTF-16 (See ECMA-262 5.1…
26 …* than 16 bits: 0x0 - 0x10FFFFF). One code point could be represented with one ore two 16-bit code…
32 …* Internally JerryScript engine uses UTF-8 representation of strings to reduce memory overhead. Un…
33 * occupies from one to four bytes in UTF-8 representation.
35 * Unicode scalar value | Bytes in UTF-8 | Bytes in UTF-16
37 * ----------------------------------------------------------------------
38 * 0x0 - 0x7F | 1 byte | 2 bytes
39 * 0x80 - 0x7FF | 2 bytes | 2 bytes
[all …]
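The table in the comment above relates Unicode scalar value ranges to encoded size. The sketch below reproduces those sizes using Python's built-in codecs, not JerryScript's own routines; `encoded_lengths` is a hypothetical helper name. It assumes the input is a scalar value, meaning at most 0x10FFFF (the end of the Unicode code space) and outside the surrogate range, which cannot be encoded on its own.

```python
def encoded_lengths(code_point: int):
    """Return (UTF-8 byte count, UTF-16 byte count) for one scalar value."""
    ch = chr(code_point)
    # "utf-16-le" is used so that no byte order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))
```

The boundaries behave as the table states: 0x7F takes 1 byte in UTF-8 and 2 in UTF-16, 0x80 and 0x7FF take 2 and 2, 0x800 takes 3 and 2, and 0x10000 takes 4 and 4 (a surrogate pair in UTF-16).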
/third_party/icu/docs/userguide/icu/
Dunicode.md1 ---
6 ---
7 <!--
10 -->
16 {: .no_toc .text-delta }
21 ---
41 Go to the [online ICU demos](https://icu4c-demos.unicode.org/icu-bin/icudemos) to
42 see how a Unicode-based server application can handle text in many languages and
47 Representing text-format data in computers is a matter of defining a set of
67 graphic, displayable characters. It was designed to represent English-language
[all …]