xxHash - Extremely fast hash algorithm
======================================

xxHash is an extremely fast hash algorithm, running at RAM speed limits.
It successfully completes the [SMHasher](http://code.google.com/p/smhasher/wiki/SMHasher) test suite,
which evaluates the collision, dispersion and randomness qualities of hash functions.
The code is highly portable, and hashes are identical on all platforms (little / big endian).

|Branch      |Status   |
|------------|---------|
|master      | [![Build Status](https://travis-ci.org/Cyan4973/xxHash.svg?branch=master)](https://travis-ci.org/Cyan4973/xxHash?branch=master) |
|dev         | [![Build Status](https://travis-ci.org/Cyan4973/xxHash.svg?branch=dev)](https://travis-ci.org/Cyan4973/xxHash?branch=dev) |



Benchmarks
-------------------------

The benchmark uses the SMHasher speed test, compiled with Visual Studio 2010 on a Windows 7 32-bit box.
The reference system uses a Core 2 Duo @3GHz.


| Name          | Speed     | Quality | Author            |
|---------------|-----------|:-------:|-------------------|
| [xxHash]      | 5.4 GB/s  |   10    | Y.C.              |
| MurmurHash 3a | 2.7 GB/s  |   10    | Austin Appleby    |
| SBox          | 1.4 GB/s  |    9    | Bret Mulvey       |
| Lookup3       | 1.2 GB/s  |    9    | Bob Jenkins       |
| CityHash64    | 1.05 GB/s |   10    | Pike & Alakuijala |
| FNV           | 0.55 GB/s |    5    | Fowler, Noll, Vo  |
| CRC32         | 0.43 GB/s |    9    |                   |
| MD5-32        | 0.33 GB/s |   10    | Ronald L.Rivest   |
| SHA1-32       | 0.28 GB/s |   10    |                   |

[xxHash]: http://www.xxhash.com

Q.Score (the Quality column above) measures the quality of the hash function.
It depends on successfully passing the SMHasher test set.
10 is a perfect score.
Algorithms with a score < 5 are not listed in this table.

A more recent version, XXH64, has been created thanks to [Mathias Westerdahl](https://github.com/JCash).
It offers superior speed and dispersion on 64-bit systems.
Note, however, that 32-bit applications will still run faster using the 32-bit version.

The figures below come from the SMHasher speed test, compiled using GCC 4.8.2 on Linux Mint 64-bit.
The reference system uses a Core i5-3340M @2.7GHz.

| Version    | Speed on 64-bit  | Speed on 32-bit  |
|------------|------------------|------------------|
| XXH64      | 13.8 GB/s        | 1.9 GB/s         |
| XXH32      | 6.8 GB/s         | 6.0 GB/s         |

This project also includes a command line utility, named `xxhsum`, offering features similar to `md5sum`,
thanks to [Takayuki Matsuoka](https://github.com/t-mat)'s contributions.


### License

The library files `xxhash.c` and `xxhash.h` are BSD licensed.
The utility `xxhsum` is GPL licensed.


### Build modifiers

The following macros can be set at compilation time
to modify xxhash behavior. They are all disabled by default.

- `XXH_INLINE_ALL` : Make all functions `inline`, with bodies directly included within `xxhash.h`.
  There is no need for an `xxhash.o` module in this case.
  Inlining functions is generally beneficial for speed on small keys.
  It's especially effective when key length is a compile time constant,
  with observed performance improvement in the +200% range.
  See [this article](https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html) for details.
- `XXH_ACCEPT_NULL_INPUT_POINTER` : if set to `1`, when input is a null pointer,
  xxhash returns the same result as for a zero-length key
  (instead of a dereference segfault).
- `XXH_FORCE_MEMORY_ACCESS` : default method `0` uses a portable `memcpy()` notation.
  Method `1` uses a gcc-specific `packed` attribute, which can provide better performance for some targets.
  Method `2` forces unaligned reads, which is not standard compliant, but might sometimes be the only way to extract better performance.
- `XXH_CPU_LITTLE_ENDIAN` : by default, endianness is determined at compile time.
  It's possible to skip auto-detection and force the format to little-endian by setting this macro to `1`.
  Setting it to `0` forces big-endian.
- `XXH_FORCE_NATIVE_FORMAT` : on big-endian systems, use native number representation.
  Breaks consistency with little-endian results.
- `XXH_PRIVATE_API` : same impact as `XXH_INLINE_ALL`.
  The name underlines that symbols will not be published on the library's public interface.
- `XXH_NAMESPACE` : prefix all symbols with the value of `XXH_NAMESPACE`.
  Useful to avoid symbol naming collisions
  in case of multiple inclusions of xxHash source code.
  Client applications can still use the regular function names;
  symbols are automatically translated through `xxhash.h`.
- `XXH_STATIC_LINKING_ONLY` : gives access to state declaration for static allocation.
  Incompatible with dynamic linking, due to risks of ABI changes.
- `XXH_NO_LONG_LONG` : removes support for XXH64,
  for targets without 64-bit support.
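
As a rough illustration of what `XXH_FORCE_MEMORY_ACCESS` method `0` and the endianness handling amount to, the sketch below reads a 32-bit little-endian value from an arbitrarily aligned address using `memcpy()`. It is not taken from `xxhash.c`: all `demo_` names are hypothetical, and the byte order is probed at run time here for simplicity, whereas the library determines it at compile time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes at any alignment through memcpy(),
 * the portable approach described for method 0. */
static uint32_t demo_read32(const void* ptr)
{
    uint32_t value;
    memcpy(&value, ptr, sizeof(value));   /* no alignment requirement */
    return value;
}

/* Hypothetical helper: run-time byte order probe (an assumption of this
 * sketch; the library decides this at compile time). */
static int demo_is_little_endian(void)
{
    const uint32_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);              /* lowest-addressed byte of 1 */
    return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swap32(uint32_t x)
{
    return ((x << 24) & 0xff000000u) |
           ((x <<  8) & 0x00ff0000u) |
           ((x >>  8) & 0x0000ff00u) |
           ((x >> 24) & 0x000000ffu);
}

/* Read a 32-bit little-endian value, whatever the host byte order. */
static uint32_t demo_readLE32(const void* ptr)
{
    uint32_t const native = demo_read32(ptr);
    return demo_is_little_endian() ? native : demo_swap32(native);
}
```

On a little-endian host the value is returned as read; on a big-endian host the bytes are swapped first. That kind of normalization is what keeps results identical across platforms, and skipping the swap is roughly what a native-format build gives up.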


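
The automatic translation of symbol names can be pictured with the preprocessor's token-pasting operator. The following is a minimal sketch of that technique under assumed names (`DEMO_NAMESPACE`, `demo_sum` and the `myapp_` prefix are all hypothetical), not a copy of what `xxhash.h` contains.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefix, standing in for a user-supplied namespace value. */
#define DEMO_NAMESPACE myapp_

/* Two-level expansion, so that DEMO_NAMESPACE is replaced by its value
 * before the ## operator pastes the two tokens together. */
#define DEMO_CAT(a, b)   a##b
#define DEMO_NAME2(a, b) DEMO_CAT(a, b)

/* From here on, every mention of demo_sum refers to myapp_demo_sum. */
#define demo_sum DEMO_NAME2(DEMO_NAMESPACE, demo_sum)

/* Placeholder function (a plain byte sum, not a real hash).
 * After preprocessing, the symbol defined here is myapp_demo_sum. */
uint32_t demo_sum(const void* data, size_t length)
{
    const unsigned char* p = (const unsigned char*)data;
    uint32_t total = 0;
    size_t i;
    for (i = 0; i < length; i++) total += p[i];
    return total;
}
```

Client code keeps calling `demo_sum(data, length)`, while the symbol emitted by the compiler is `myapp_demo_sum`, so two differently prefixed copies of the same source can be linked into one program.
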
### Example

Calling the xxhash 64-bit variant from a C program:

```c
#include "xxhash.h"

unsigned long long calcul_hash(const void* buffer, size_t length)
{
    unsigned long long const seed = 0;   /* or any other value */
    unsigned long long const hash = XXH64(buffer, length, seed);
    return hash;
}
```

Using the streaming variant is more involved, but makes it possible to provide data in multiple rounds:

```c
#include <stdlib.h>   /* abort(), malloc(), free() */
#include "xxhash.h"


unsigned long long calcul_hash_streaming(someCustomType handler)
{
    XXH64_state_t* const state = XXH64_createState();
    if (state==NULL) abort();

    size_t const bufferSize = SOME_VALUE;
    void* const buffer = malloc(bufferSize);
    if (buffer==NULL) abort();

    unsigned long long const seed = 0;   /* or any other value */
    XXH_errorcode const resetResult = XXH64_reset(state, seed);
    if (resetResult == XXH_ERROR) abort();

    /* ... */
    while ( /* any condition */ ) {
        size_t const length = get_more_data(buffer, bufferSize, handler);   /* undescribed */
        XXH_errorcode const addResult = XXH64_update(state, buffer, length);
        if (addResult == XXH_ERROR) abort();
        /* ... */
    }

    /* ... */
    unsigned long long const hash = XXH64_digest(state);

    free(buffer);
    XXH64_freeState(state);

    return hash;
}
```
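
The point of the streaming variant is that the digest does not depend on how the input was split across rounds. To show the reset / update / digest pattern in a form that compiles on its own, the sketch below deliberately swaps in FNV-1a (listed in the benchmark table above) as a stand-in for xxHash; the `demo_` names are hypothetical, and the real xxHash state is more elaborate than this single accumulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical streaming state: FNV-1a only needs a running 32-bit value. */
typedef struct { uint32_t acc; } demo_state_t;

/* Start a new hashing session. */
static void demo_reset(demo_state_t* state)
{
    state->acc = 2166136261u;             /* FNV-1a 32-bit offset basis */
}

/* Feed one more chunk of input; may be called any number of times. */
static void demo_update(demo_state_t* state, const void* input, size_t length)
{
    const unsigned char* p = (const unsigned char*)input;
    size_t i;
    for (i = 0; i < length; i++) {
        state->acc ^= p[i];
        state->acc *= 16777619u;          /* FNV 32-bit prime */
    }
}

/* Produce the hash of everything fed so far. */
static uint32_t demo_digest(const demo_state_t* state)
{
    return state->acc;
}

/* One-shot variant, expressed through the streaming functions. */
static uint32_t demo_hash(const void* input, size_t length)
{
    demo_state_t state;
    demo_reset(&state);
    demo_update(&state, input, length);
    return demo_digest(&state);
}
```

Feeding a message as two chunks through `demo_update()` gives the same digest as a single `demo_hash()` call over the whole message, which is the property the xxHash streaming functions provide relative to the one-shot `XXH64()`.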


### Other programming languages

Beyond the C reference version,
xxHash is also available in many programming languages,
thanks to great contributors.
They are [listed here](http://www.xxhash.com/#other-languages).


### Branch Policy

> - The "master" branch is considered stable, at all times.
> - The "dev" branch is the one where all contributions must be merged
    before being promoted to master.
>   + If you plan to propose a patch, please commit into the "dev" branch,
      or its own feature branch.
      Direct commits to "master" are not permitted.
