public final class BytesToNameCanonicalizer extends Object
Symbol table that canonicalizes Names constructed directly from a byte-based
 input source.
 Complications arise from the efficient reuse and merging of
 symbol tables, which lets subsequent parsing runs benefit from their
 usually shared vocabulary.

| Modifier and Type | Field and Description |
|---|---|
| `protected int` | `_collCount` Total number of Names in collision buckets (included in `_count` along with primary entries) |
| `protected int` | `_collEnd` Index of the first unused collision bucket entry (== size of the used portion of the collision list): less than or equal to 0xFF (255), since the maximum number of entries is 255 (8-bit, minus 0 used as the 'empty' marker) |
| `protected org.codehaus.jackson.sym.BytesToNameCanonicalizer.Bucket[]` | `_collList` Array of heads of collision bucket chains; size dynamically |
| `protected int` | `_count` Total number of Names in the symbol table; only used for child tables. |
| `protected boolean` | `_intern` Whether canonical symbol Strings are to be intern()ed before being added to the table or not |
| `protected int` | `_longestCollisionList` We need to keep track of the longest collision list; this is needed both to indicate problems with attacks and to allow flushing for other cases. |
| `protected int[]` | `_mainHash` Array of 2^N size, which contains a combination of 24 bits of hash (0 to indicate an 'empty' slot) and an 8-bit collision bucket index (0 to indicate an empty collision bucket chain; otherwise subtract one from the index) |
| `protected int` | `_mainHashMask` Mask used to truncate the 32-bit hash value to the current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N). |
| `protected Name[]` | `_mainNames` Array that contains Name instances matching entries in `_mainHash`. |
| `protected BytesToNameCanonicalizer` | `_parent` Reference to the root symbol table, for child tables, so that they can merge table information back as necessary. |
| `protected AtomicReference<org.codehaus.jackson.sym.BytesToNameCanonicalizer.TableInfo>` | `_tableInfo` Member that is only used by the root table instance: root passes immutable state into child instances, and children may return new state if they add entries to the table. |
| `protected static int` | `DEFAULT_TABLE_SIZE` |
| `protected static int` | `MAX_TABLE_SIZE` Let's not expand symbol tables past some maximum size; this should protect against OOMEs caused by large documents with unique (~= random) names. |
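
The packing described for `_mainHash` and `_mainHashMask` can be illustrated with a small stand-alone sketch. The class and method names are hypothetical, and placing the hash in the upper 24 bits with the bucket index in the low 8 bits is an assumption; the field descriptions only state that both parts share one int.

```java
// Illustrative sketch only; not the library's implementation.
final class MainHashSlotSketch {

    // Packs the low 24 bits of a hash and a collision bucket index into one int.
    // bucketIndex is -1 when the slot has no collision chain, otherwise 0..254.
    // ASSUMPTION: hash occupies the upper 24 bits, bucket field the low 8 bits.
    static int pack(int hash, int bucketIndex) {
        if (bucketIndex < -1 || bucketIndex > 254) {
            throw new IllegalArgumentException("bucketIndex out of range: " + bucketIndex);
        }
        // The stored bucket field is index + 1, so 0 can mean "no chain".
        return (hash << 8) | (bucketIndex + 1);
    }

    // Extracts the 24 hash bits from a packed slot value.
    static int hashBits(int slot) {
        return slot >>> 8;
    }

    // Extracts the bucket index; "subtract one" from the stored field, -1 means no chain.
    static int bucketIndex(int slot) {
        return (slot & 0xFF) - 1;
    }

    // Truncates a hash to a slot index; tableSize must be a power of two,
    // so tableSize - 1 plays the role of the mask.
    static int slotIndex(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }
}
```

Storing the bucket index plus one is what limits the collision list to 255 entries: an 8-bit field has 256 values, and 0 is reserved as the 'empty' marker.
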

| Modifier and Type | Method and Description |
|---|---|
| `Name` | `addName(String symbolStr, int[] quads, int qlen)` |
| `Name` | `addName(String symbolStr, int q1, int q2)` |
| `int` | `bucketCount()` |
| `int` | `calcHash(int firstQuad)` |
| `int` | `calcHash(int[] quads, int qlen)` |
| `int` | `calcHash(int firstQuad, int secondQuad)` |
| `protected static int[]` | `calcQuads(byte[] wordBytes)` |
| `int` | `collisionCount()` Method mostly needed by unit tests; calculates the number of entries that are in the collision list. |
| `static BytesToNameCanonicalizer` | `createRoot()` Factory method to call to create a symbol table instance with a randomized seed value. |
| `protected static BytesToNameCanonicalizer` | `createRoot(int hashSeed)` Factory method that should only be called from unit tests, where the seed value should remain the same. |
| `Name` | `findName(int firstQuad)` Finds and returns the name matching the specified symbol, if such a name already exists in the table. |
| `Name` | `findName(int[] quads, int qlen)` Finds and returns the name matching the specified symbol, if such a name already exists in the table; or if not, creates a name object, adds it to the table, and returns it. |
| `Name` | `findName(int firstQuad, int secondQuad)` Finds and returns the name matching the specified symbol, if such a name already exists in the table. |
| `static Name` | `getEmptyName()` |
| `int` | `hashSeed()` |
| `BytesToNameCanonicalizer` | `makeChild(boolean canonicalize, boolean intern)` Factory method used to create the actual symbol table instance to use for parsing. |
| `int` | `maxCollisionLength()` Method mostly needed by unit tests; calculates the length of the longest collision chain. |
| `boolean` | `maybeDirty()` Method called to quickly check whether a child symbol table may have gotten additional entries. |
| `void` | `release()` Method called by the using code to indicate it is done with this instance. |
| `protected void` | `reportTooManyCollisions(int maxLen)` |
| `int` | `size()` |

protected static final int DEFAULT_TABLE_SIZE
protected static final int MAX_TABLE_SIZE
protected final BytesToNameCanonicalizer _parent
protected final AtomicReference<org.codehaus.jackson.sym.BytesToNameCanonicalizer.TableInfo> _tableInfo
protected final boolean _intern
protected int _count
protected int _longestCollisionList
protected int _mainHashMask
protected int[] _mainHash
protected Name[] _mainNames
Array that contains Name instances matching
 entries in _mainHash. Contains nulls for unused
 entries.
protected org.codehaus.jackson.sym.BytesToNameCanonicalizer.Bucket[] _collList
protected int _collCount
Total number of Names in collision buckets (included in
 _count along with primary entries)
protected int _collEnd
public static BytesToNameCanonicalizer createRoot()
protected static BytesToNameCanonicalizer createRoot(int hashSeed)
public BytesToNameCanonicalizer makeChild(boolean canonicalize, boolean intern)
intern - Whether canonical symbol Strings should be interned
   or not
public void release()
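
The root/child lifecycle implied by createRoot(), makeChild(), maybeDirty() and release() can be sketched with a simplified stand-in. Everything below is hypothetical: the class uses a plain map of Strings instead of hashed Name entries, and the merge policy (the root keeps whichever state has more entries) is an assumption, not something this page states.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only; not the library's implementation.
final class SharedTableSketch {

    // Immutable snapshot that the root hands to each child.
    static final class TableInfo {
        final Map<String, String> names;

        TableInfo(Map<String, String> names) {
            this.names = Collections.unmodifiableMap(new HashMap<>(names));
        }
    }

    private final SharedTableSketch parent;              // null for the root
    private final AtomicReference<TableInfo> tableInfo;  // only used by the root
    private Map<String, String> names;                   // child's working view
    private boolean dirty;

    private SharedTableSketch(SharedTableSketch parent, TableInfo state) {
        this.parent = parent;
        this.tableInfo = (parent == null) ? new AtomicReference<>(state) : null;
        this.names = state.names; // shared and read-only until the first addition
    }

    static SharedTableSketch createRoot() {
        return new SharedTableSketch(null, new TableInfo(new HashMap<>()));
    }

    SharedTableSketch makeChild() {
        if (parent != null) {
            throw new IllegalStateException("children are created from the root");
        }
        return new SharedTableSketch(this, tableInfo.get());
    }

    // Returns the canonical instance for a name, adding it if missing.
    String canonicalize(String name) {
        if (parent == null) {
            throw new IllegalStateException("parse with a child table, not the root");
        }
        String existing = names.get(name);
        if (existing != null) {
            return existing;
        }
        if (!dirty) {
            names = new HashMap<>(names); // copy on first write
            dirty = true;
        }
        names.put(name, name);
        return name;
    }

    boolean maybeDirty() {
        return dirty;
    }

    int size() {
        return (parent == null) ? tableInfo.get().names.size() : names.size();
    }

    // Child hands its state back to the root if it added anything.
    void release() {
        if (parent != null && dirty) {
            parent.mergeChild(new TableInfo(names));
            dirty = false;
        }
    }

    // ASSUMED policy: keep the larger of the current and the returned state.
    private void mergeChild(TableInfo childState) {
        TableInfo current;
        do {
            current = tableInfo.get();
            if (childState.names.size() <= current.names.size()) {
                return;
            }
        } while (!tableInfo.compareAndSet(current, childState));
    }
}
```

In this sketch a child that only reads never copies the shared map, and a later child created after release() starts with the vocabulary the earlier child collected.
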
public int size()
public int bucketCount()
public boolean maybeDirty()
public int hashSeed()
public int collisionCount()
The number of entries in the collision list: at most size() - 1, but should usually be much lower, ideally 0.
public int maxCollisionLength()
The length of the longest collision chain: size() - 1 in the pathological case.
public static Name getEmptyName()
public Name findName(int firstQuad)
Note: separate methods exist to optimize the common case of short element/attribute names (4 or fewer ASCII characters)
firstQuad - int32 containing the first 4 bytes of the name;
   if the whole name is shorter than 4 bytes, padded with zero bytes
   in front (zero MSBs, i.e. right-aligned)
public Name findName(int firstQuad, int secondQuad)
Note: separate methods exist to optimize the common case of relatively short element/attribute names (8 or fewer ASCII characters)
firstQuad - int32 containing the first 4 bytes of the name.
secondQuad - int32 containing bytes 5 through 8 of the
   name; if the name is shorter than 8 bytes, padded with up to 3 zero bytes
   in front (zero MSBs, i.e. right-aligned)
public Name findName(int[] quads, int qlen)
Note: this is the general-purpose method that can be called for names of any length. However, if the name is less than 9 bytes long, it is preferable to call the version optimized for short names.
quads - Array of int32s, each of which contains 4 bytes of the
   encoded name
qlen - Number of int32s, starting from index 0, in the quads
   parameter
public final int calcHash(int firstQuad)
public final int calcHash(int firstQuad, int secondQuad)
public final int calcHash(int[] quads, int qlen)
protected static int[] calcQuads(byte[] wordBytes)
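
The quad encoding described for the findName parameters (4 name bytes per int32, with a short final quad right-aligned so that its most significant bytes are zero) can be sketched as follows. The helper below is a hypothetical stand-alone illustration of that layout; it is not claimed to match the behavior of calcQuads itself.

```java
// Illustrative sketch only; not the library's implementation.
final class QuadPackingSketch {

    // Packs name bytes into int32 "quads", 4 bytes per int, first byte in the
    // most significant position. A final quad with fewer than 4 bytes ends up
    // right-aligned, i.e. padded with zero bytes in front.
    static int[] packQuads(byte[] nameBytes) {
        int[] quads = new int[(nameBytes.length + 3) / 4];
        for (int i = 0; i < nameBytes.length; i++) {
            int q = i >> 2; // index of the quad this byte belongs to
            quads[q] = (quads[q] << 8) | (nameBytes[i] & 0xFF);
        }
        return quads;
    }
}
```

With this layout a name of up to 4 bytes fits in one quad and a name of up to 8 bytes in two, which corresponds to the single-quad and two-quad findName variants; longer names need the array form.
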
protected void reportTooManyCollisions(int maxLen)
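
The two collision metrics reported by collisionCount() and maxCollisionLength() can be illustrated with a small model. The sketch assumes that the first name hashing to a slot becomes its primary entry and that every later name for the same slot joins that slot's collision chain; the class name and this chaining model are assumptions made for illustration.

```java
// Illustrative sketch only; not the library's implementation.
final class CollisionStatsSketch {

    // Returns { collisionCount, maxCollisionLength } for the given hashes.
    // tableSize must be a power of two so that tableSize - 1 works as a mask.
    static int[] stats(int[] hashes, int tableSize) {
        int mask = tableSize - 1;
        boolean[] primaryUsed = new boolean[tableSize];
        int[] chainLength = new int[tableSize];
        int collisions = 0;
        int longest = 0;
        for (int hash : hashes) {
            int ix = hash & mask;
            if (!primaryUsed[ix]) {
                primaryUsed[ix] = true; // first entry takes the primary slot
                continue;
            }
            collisions++;               // later entries go to the collision chain
            chainLength[ix]++;
            longest = Math.max(longest, chainLength[ix]);
        }
        return new int[] { collisions, longest };
    }
}
```

When every hash lands in the same slot, both values equal the number of entries minus one, which is the pathological case mentioned for these methods.
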