Lines Matching +full:stable +full:- +full:branch
5 * Copyright (C) 2010-2012, International Business Machines
9 * encoding: UTF-8
22 * \brief C++ API: Trie for mapping Unicode strings (or 16-bit-unit sequences)
41 * Light-weight, non-const reader class for a UCharsTrie.
42 * Traverses a char16_t-serialized data structure with minimal state,
43 * for mapping strings (16-bit-unit sequences) to non-negative integer values.
51 * @stable ICU 4.8
67 * @stable ICU 4.8
71 pos_(uchars_), remainingMatchLength_(-1) {} in UCharsTrie()
75 * @stable ICU 4.8
83 * @stable ICU 4.8
92 * @stable ICU 4.8
96 remainingMatchLength_=-1; in reset()
101 * Returns the state of this trie as a 64-bit integer.
106 * @stable ICU 65
110 static_cast<uint64_t>(pos_ - uchars_); in getState64()
115 * Unlike resetToState(State), the 64-bit state value
125 * @stable ICU 65
128 remainingMatchLength_ = static_cast<int32_t>(state >> kState64RemainingShift) - 2; in resetToState64()
136 * @stable ICU 4.8
142 * @stable ICU 4.8
158 * @stable ICU 4.8
175 * @stable ICU 4.8
189 * @stable ICU 4.8
198 * @stable ICU 4.8
201 remainingMatchLength_=-1; in first()
207 * one or two UTF-16 code units for this input code point.
211 * @stable ICU 4.8
219 * @stable ICU 4.8
225 * one or two UTF-16 code units for this input code point.
228 * @stable ICU 4.8
243 * @param length The length of the string. Can be -1 if NUL-terminated.
245 * @stable ICU 4.8
256 * @stable ICU 4.8
270 * (output-only)
273 * @stable ICU 4.8
277 // Skip the rest of a pending linear-match node. in hasUniqueValue()
286 * @stable ICU 4.8
292 * @stable ICU 4.8
297 * Iterates from the root of a char16_t-serialized UCharsTrie.
305 * @stable ICU 4.8
318 * @stable ICU 4.8
324 * @stable ICU 4.8
331 * @stable ICU 4.8
337 * @stable ICU 4.8
345 * have a real value, then the value is set to -1.
347 * a real value of -1.
353 * @stable ICU 4.8
359 * @stable ICU 4.8
364 * @stable ICU 4.8
371 value_=-1; // no real value for str in truncateAndStop()
389 // outbound edge of a branch node.
392 // and the remaining branch length in bits 31..16.
393 … // (We could store the remaining branch length minus 1 in bits 30..16 and not use the sign bit,
409 pos_(uchars_), remainingMatchLength_(-1) {} in UCharsTrie()
418 // Reads a compact 32-bit integer.
425 value=((leadUnit-kMinTwoUnitValueLead)<<16)|*pos; in readValue()
450 value=(leadUnit>>6)-1; in readNodeValue()
452 value=(((leadUnit&0x7fc0)-kMinTwoUnitNodeValueLead)<<10)|*pos; in readNodeValue()
477 delta=((delta-kMinTwoUnitDeltaLead)<<16)|*pos++; in jumpByDelta()
496 return static_cast<UStringTrieResult>(USTRINGTRIE_INTERMEDIATE_VALUE - (node >> 15)); in valueResult()
499 // Handles a branch node for both next(uchar) and next(string).
507 // from a branch.
515 // getNextUChars() when pos is on a branch node.
520 // The trie consists of a series of char16_t-serialized nodes for incremental
521 // Unicode string/char16_t sequence matching. (char16_t=16-bit unsigned integer)
525 // After each node, except a final-value node, another node follows to
529 // - Final-value node: Stores a 32-bit integer in a compact, variable-length format.
531 // - Match node, optionally with an intermediate value in a different compact format.
536 // - Linear-match node: Matches a number of units.
537 // - Branch node: Branches to other nodes according to the current input unit.
538 // The node unit is the length of the branch (number of units to select from)
539 // minus 1. It is followed by a sub-node:
540 // - If the length is at most kMaxBranchLinearSubNodeLength, then
541 // there are length-1 (key, value) pairs and then one more comparison unit.
545 // (Values have the same encoding as final-value nodes.)
546 // - If the length is greater than kMaxBranchLinearSubNodeLength, then
548 // If the input unit is less than the sub-node unit, then "jump" by delta to
549 // the next sub-node which will have a length of length/2.
551 // Otherwise, skip the "jump" delta to the next sub-node
552 // which will have a length of length-length/2.
554 // Match-node lead unit values, after masking off intermediate-value bits:
556 // 0000..002f: Branch node. If node!=0 then the length is node+1, otherwise
559 // For a branch sub-node with at most this many entries, we drop down
563 // 0030..003f: Linear-match node, match 1..16 units and continue reading the next node.
567 // Match-node lead unit bits 14..6 for the optional intermediate value.
571 static const int32_t kNodeTypeMask=kMinValueLead-1; // 0x003f
573 // A final-value node has bit 15 set.
582 …static const int32_t kMaxTwoUnitValue=((kThreeUnitValueLead-kMinTwoUnitValueLead)<<16)-1; // 0x3f…
584 // Compact intermediate-value integer, lead unit shared with a branch or linear-match node.
590 ((kThreeUnitNodeValueLead-kMinTwoUnitNodeValueLead)<<10)-1; // 0xfdffff
597 …static const int32_t kMaxTwoUnitDelta=((kThreeUnitDeltaLead-kMinTwoUnitDeltaLead)<<16)-1; // 0x03…
600 // The remainingMatchLength_ is -1..14=(kMaxLinearMatchLength=0x10)-2
604 static constexpr uint64_t kState64PosMask = (UINT64_C(1) << kState64RemainingShift) - 1;
615 // Remaining length of a linear-match node, minus 1. Negative if not in such a node.