• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1 
2 /* Dictionary object implementation using a hash table */
3 
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5    describing explorations into dictionary design and optimization.
6    It covers typical dictionary use patterns, the parameters for
7    tuning dictionaries, and several ideas for possible optimizations.
8 */
9 
10 #include "Python.h"
11 
12 
13 /* Set a key error with the specified argument, wrapping it in a
14  * tuple automatically so that tuple keys are not unpacked as the
15  * exception arguments. */
16 static void
set_key_error(PyObject * arg)17 set_key_error(PyObject *arg)
18 {
19     PyObject *tup;
20     tup = PyTuple_Pack(1, arg);
21     if (!tup)
22         return; /* caller will expect error to be set anyway */
23     PyErr_SetObject(PyExc_KeyError, tup);
24     Py_DECREF(tup);
25 }
26 
27 /* Define this out if you don't want conversion statistics on exit. */
28 #undef SHOW_CONVERSION_COUNTS
29 
30 /* See large comment block below.  This must be >= 1. */
31 #define PERTURB_SHIFT 5
32 
33 /*
34 Major subtleties ahead:  Most hash schemes depend on having a "good" hash
35 function, in the sense of simulating randomness.  Python doesn't:  its most
36 important hash functions (for strings and ints) are very regular in common
37 cases:
38 
39 >>> map(hash, (0, 1, 2, 3))
40 [0, 1, 2, 3]
41 >>> map(hash, ("namea", "nameb", "namec", "named"))
42 [-1658398457, -1658398460, -1658398459, -1658398462]
43 >>>
44 
45 This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
46 the low-order i bits as the initial table index is extremely fast, and there
47 are no collisions at all for dicts indexed by a contiguous range of ints.
48 The same is approximately true when keys are "consecutive" strings.  So this
49 gives better-than-random behavior in common cases, and that's very desirable.
50 
51 OTOH, when collisions occur, the tendency to fill contiguous slices of the
52 hash table makes a good collision resolution strategy crucial.  Taking only
53 the last i bits of the hash code is also vulnerable:  for example, consider
54 [i << 16 for i in range(20000)] as a set of keys.  Since ints are their own
55 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
56 hash code are all 0:  they *all* map to the same table index.
57 
58 But catering to unusual cases should not slow the usual ones, so we just take
59 the last i bits anyway.  It's up to collision resolution to do the rest.  If
60 we *usually* find the key we're looking for on the first try (and, it turns
61 out, we usually do -- the table load factor is kept under 2/3, so the odds
62 are solidly in our favor), then it makes best sense to keep the initial index
63 computation dirt cheap.
64 
65 The first half of collision resolution is to visit table indices via this
66 recurrence:
67 
68     j = ((5*j) + 1) mod 2**i
69 
70 For any initial j in range(2**i), repeating that 2**i times generates each
71 int in range(2**i) exactly once (see any text on random-number generation for
72 proof).  By itself, this doesn't help much:  like linear probing (setting
73 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
74 order.  This would be bad, except that's not the only thing we do, and it's
75 actually *good* in the common cases where hash keys are consecutive.  In an
76 example that's really too small to make this entirely clear, for a table of
77 size 2**3 the order of indices is:
78 
79     0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
80 
81 If two things come in at index 5, the first place we look after is index 2,
82 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
83 Linear probing is deadly in this case because there the fixed probe order
84 is the *same* as the order consecutive keys are likely to arrive.  But it's
85 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
86 and certain that consecutive hash codes do not.
87 
88 The other half of the strategy is to get the other bits of the hash code
89 into play.  This is done by initializing a (unsigned) vrbl "perturb" to the
90 full hash code, and changing the recurrence to:
91 
92     j = (5*j) + 1 + perturb;
93     perturb >>= PERTURB_SHIFT;
94     use j % 2**i as the next table index;
95 
96 Now the probe sequence depends (eventually) on every bit in the hash code,
97 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
98 because it quickly magnifies small differences in the bits that didn't affect
99 the initial index.  Note that because perturb is unsigned, if the recurrence
100 is executed often enough perturb eventually becomes and remains 0.  At that
101 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
102 that's certain to find an empty slot eventually (since it generates every int
103 in range(2**i), and we make sure there's always at least one empty slot).
104 
105 Selecting a good value for PERTURB_SHIFT is a balancing act.  You want it
106 small so that the high bits of the hash code continue to affect the probe
107 sequence across iterations; but you want it large so that in really bad cases
108 the high-order hash bits have an effect on early iterations.  5 was "the
109 best" in minimizing total collisions across experiments Tim Peters ran (on
110 both normal and pathological cases), but 4 and 6 weren't significantly worse.
111 
112 Historical:  Reimer Behrends contributed the idea of using a polynomial-based
113 approach, using repeated multiplication by x in GF(2**n) where an irreducible
114 polynomial for each table size was chosen such that x was a primitive root.
115 Christian Tismer later extended that to use division by x instead, as an
116 efficient way to get the high bits of the hash code into play.  This scheme
117 also gave excellent collision statistics, but was more expensive:  two
118 if-tests were required inside the loop; computing "the next" index took about
119 the same number of operations but without as much potential parallelism
120 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
121 above, and then shifting perturb can be done while the table index is being
122 masked); and the PyDictObject struct required a member to hold the table's
123 polynomial.  In Tim's experiments the current scheme ran faster, produced
124 equally good collision statistics, needed less code & used less memory.
125 
126 Theoretical Python 2.5 headache:  hash codes are only C "long", but
127 sizeof(Py_ssize_t) > sizeof(long) may be possible.  In that case, and if a
128 dict is genuinely huge, then only the slots directly reachable via indexing
129 by a C long can be the first slot in a probe sequence.  The probe sequence
130 will still eventually reach every slot in the table, but the collision rate
131 on initial probes may be much higher than this scheme was designed for.
132 Getting a hash code as fat as Py_ssize_t is the only real cure.  But in
133 practice, this probably won't make a lick of difference for many years (at
134 which point everyone will have terabytes of RAM on 64-bit boxes).
135 */
136 
137 /* Object used as dummy key to fill deleted entries */
138 static PyObject *dummy = NULL; /* Initialized by first call to newPyDictObject() */
139 
140 #ifdef Py_REF_DEBUG
141 PyObject *
_PyDict_Dummy(void)142 _PyDict_Dummy(void)
143 {
144     return dummy;
145 }
146 #endif
147 
148 /* forward declarations */
149 static PyDictEntry *
150 lookdict_string(PyDictObject *mp, PyObject *key, long hash);
151 
152 #ifdef SHOW_CONVERSION_COUNTS
153 static long created = 0L;
154 static long converted = 0L;
155 
156 static void
show_counts(void)157 show_counts(void)
158 {
159     fprintf(stderr, "created %ld string dicts\n", created);
160     fprintf(stderr, "converted %ld to normal dicts\n", converted);
161     fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
162 }
163 #endif
164 
165 /* Debug statistic to compare allocations with reuse through the free list */
166 #undef SHOW_ALLOC_COUNT
167 #ifdef SHOW_ALLOC_COUNT
168 static size_t count_alloc = 0;
169 static size_t count_reuse = 0;
170 
171 static void
show_alloc(void)172 show_alloc(void)
173 {
174     fprintf(stderr, "Dict allocations: %" PY_FORMAT_SIZE_T "d\n",
175         count_alloc);
176     fprintf(stderr, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
177         "d\n", count_reuse);
178     fprintf(stderr, "%.2f%% reuse rate\n\n",
179         (100.0*count_reuse/(count_alloc+count_reuse)));
180 }
181 #endif
182 
183 /* Debug statistic to count GC tracking of dicts */
184 #ifdef SHOW_TRACK_COUNT
185 static Py_ssize_t count_untracked = 0;
186 static Py_ssize_t count_tracked = 0;
187 
188 static void
show_track(void)189 show_track(void)
190 {
191     fprintf(stderr, "Dicts created: %" PY_FORMAT_SIZE_T "d\n",
192         count_tracked + count_untracked);
193     fprintf(stderr, "Dicts tracked by the GC: %" PY_FORMAT_SIZE_T
194         "d\n", count_tracked);
195     fprintf(stderr, "%.2f%% dict tracking rate\n\n",
196         (100.0*count_tracked/(count_untracked+count_tracked)));
197 }
198 #endif
199 
200 
201 /* Initialization macros.
202    There are two ways to create a dict:  PyDict_New() is the main C API
203    function, and the tp_new slot maps to dict_new().  In the latter case we
204    can save a little time over what PyDict_New does because it's guaranteed
205    that the PyDictObject struct is already zeroed out.
206    Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
207    an excellent reason not to).
208 */
209 
210 #define INIT_NONZERO_DICT_SLOTS(mp) do {                                \
211     (mp)->ma_table = (mp)->ma_smalltable;                               \
212     (mp)->ma_mask = PyDict_MINSIZE - 1;                                 \
213     } while(0)
214 
215 #define EMPTY_TO_MINSIZE(mp) do {                                       \
216     memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable));        \
217     (mp)->ma_used = (mp)->ma_fill = 0;                                  \
218     INIT_NONZERO_DICT_SLOTS(mp);                                        \
219     } while(0)
220 
221 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
222 #ifndef PyDict_MAXFREELIST
223 #define PyDict_MAXFREELIST 80
224 #endif
225 static PyDictObject *free_list[PyDict_MAXFREELIST];
226 static int numfree = 0;
227 
228 void
PyDict_Fini(void)229 PyDict_Fini(void)
230 {
231     PyDictObject *op;
232 
233     while (numfree) {
234         op = free_list[--numfree];
235         assert(PyDict_CheckExact(op));
236         PyObject_GC_Del(op);
237     }
238 }
239 
240 PyObject *
PyDict_New(void)241 PyDict_New(void)
242 {
243     register PyDictObject *mp;
244     if (dummy == NULL) { /* Auto-initialize dummy */
245         dummy = PyString_FromString("<dummy key>");
246         if (dummy == NULL)
247             return NULL;
248 #ifdef SHOW_CONVERSION_COUNTS
249         Py_AtExit(show_counts);
250 #endif
251 #ifdef SHOW_ALLOC_COUNT
252         Py_AtExit(show_alloc);
253 #endif
254 #ifdef SHOW_TRACK_COUNT
255         Py_AtExit(show_track);
256 #endif
257     }
258     if (numfree) {
259         mp = free_list[--numfree];
260         assert (mp != NULL);
261         assert (Py_TYPE(mp) == &PyDict_Type);
262         _Py_NewReference((PyObject *)mp);
263         if (mp->ma_fill) {
264             EMPTY_TO_MINSIZE(mp);
265         } else {
266             /* At least set ma_table and ma_mask; these are wrong
267                if an empty but presized dict is added to freelist */
268             INIT_NONZERO_DICT_SLOTS(mp);
269         }
270         assert (mp->ma_used == 0);
271         assert (mp->ma_table == mp->ma_smalltable);
272         assert (mp->ma_mask == PyDict_MINSIZE - 1);
273 #ifdef SHOW_ALLOC_COUNT
274         count_reuse++;
275 #endif
276     } else {
277         mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
278         if (mp == NULL)
279             return NULL;
280         EMPTY_TO_MINSIZE(mp);
281 #ifdef SHOW_ALLOC_COUNT
282         count_alloc++;
283 #endif
284     }
285     mp->ma_lookup = lookdict_string;
286 #ifdef SHOW_TRACK_COUNT
287     count_untracked++;
288 #endif
289 #ifdef SHOW_CONVERSION_COUNTS
290     ++created;
291 #endif
292     return (PyObject *)mp;
293 }
294 
295 /*
296 The basic lookup function used by all operations.
297 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
298 Open addressing is preferred over chaining since the link overhead for
299 chaining would be substantial (100% with typical malloc overhead).
300 
301 The initial probe index is computed as hash mod the table size. Subsequent
302 probe indices are computed as explained earlier.
303 
304 All arithmetic on hash should ignore overflow.
305 
306 (The details in this version are due to Tim Peters, building on many past
307 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
308 Christian Tismer).
309 
310 lookdict() is general-purpose, and may return NULL if (and only if) a
311 comparison raises an exception (this was new in Python 2.5).
312 lookdict_string() below is specialized to string keys, comparison of which can
313 never raise an exception; that function can never return NULL.  For both, when
314 the key isn't found a PyDictEntry* is returned for which the me_value field is
315 NULL; this is the slot in the dict at which the key would have been found, and
316 the caller can (if it wishes) add the <key, value> pair to the returned
317 PyDictEntry*.
318 */
319 static PyDictEntry *
lookdict(PyDictObject * mp,PyObject * key,register long hash)320 lookdict(PyDictObject *mp, PyObject *key, register long hash)
321 {
322     register size_t i;
323     register size_t perturb;
324     register PyDictEntry *freeslot;
325     register size_t mask = (size_t)mp->ma_mask;
326     PyDictEntry *ep0 = mp->ma_table;
327     register PyDictEntry *ep;
328     register int cmp;
329     PyObject *startkey;
330 
331     i = (size_t)hash & mask;
332     ep = &ep0[i];
333     if (ep->me_key == NULL || ep->me_key == key)
334         return ep;
335 
336     if (ep->me_key == dummy)
337         freeslot = ep;
338     else {
339         if (ep->me_hash == hash) {
340             startkey = ep->me_key;
341             Py_INCREF(startkey);
342             cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
343             Py_DECREF(startkey);
344             if (cmp < 0)
345                 return NULL;
346             if (ep0 == mp->ma_table && ep->me_key == startkey) {
347                 if (cmp > 0)
348                     return ep;
349             }
350             else {
351                 /* The compare did major nasty stuff to the
352                  * dict:  start over.
353                  * XXX A clever adversary could prevent this
354                  * XXX from terminating.
355                  */
356                 return lookdict(mp, key, hash);
357             }
358         }
359         freeslot = NULL;
360     }
361 
362     /* In the loop, me_key == dummy is by far (factor of 100s) the
363        least likely outcome, so test for that last. */
364     for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
365         i = (i << 2) + i + perturb + 1;
366         ep = &ep0[i & mask];
367         if (ep->me_key == NULL)
368             return freeslot == NULL ? ep : freeslot;
369         if (ep->me_key == key)
370             return ep;
371         if (ep->me_hash == hash && ep->me_key != dummy) {
372             startkey = ep->me_key;
373             Py_INCREF(startkey);
374             cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
375             Py_DECREF(startkey);
376             if (cmp < 0)
377                 return NULL;
378             if (ep0 == mp->ma_table && ep->me_key == startkey) {
379                 if (cmp > 0)
380                     return ep;
381             }
382             else {
383                 /* The compare did major nasty stuff to the
384                  * dict:  start over.
385                  * XXX A clever adversary could prevent this
386                  * XXX from terminating.
387                  */
388                 return lookdict(mp, key, hash);
389             }
390         }
391         else if (ep->me_key == dummy && freeslot == NULL)
392             freeslot = ep;
393     }
394     assert(0);          /* NOT REACHED */
395     return 0;
396 }
397 
398 /*
399  * Hacked up version of lookdict which can assume keys are always strings;
400  * this assumption allows testing for errors during PyObject_RichCompareBool()
401  * to be dropped; string-string comparisons never raise exceptions.  This also
402  * means we don't need to go through PyObject_RichCompareBool(); we can always
403  * use _PyString_Eq() directly.
404  *
405  * This is valuable because dicts with only string keys are very common.
406  */
407 static PyDictEntry *
lookdict_string(PyDictObject * mp,PyObject * key,register long hash)408 lookdict_string(PyDictObject *mp, PyObject *key, register long hash)
409 {
410     register size_t i;
411     register size_t perturb;
412     register PyDictEntry *freeslot;
413     register size_t mask = (size_t)mp->ma_mask;
414     PyDictEntry *ep0 = mp->ma_table;
415     register PyDictEntry *ep;
416 
417     /* Make sure this function doesn't have to handle non-string keys,
418        including subclasses of str; e.g., one reason to subclass
419        strings is to override __eq__, and for speed we don't cater to
420        that here. */
421     if (!PyString_CheckExact(key)) {
422 #ifdef SHOW_CONVERSION_COUNTS
423         ++converted;
424 #endif
425         mp->ma_lookup = lookdict;
426         return lookdict(mp, key, hash);
427     }
428     i = hash & mask;
429     ep = &ep0[i];
430     if (ep->me_key == NULL || ep->me_key == key)
431         return ep;
432     if (ep->me_key == dummy)
433         freeslot = ep;
434     else {
435         if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
436             return ep;
437         freeslot = NULL;
438     }
439 
440     /* In the loop, me_key == dummy is by far (factor of 100s) the
441        least likely outcome, so test for that last. */
442     for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
443         i = (i << 2) + i + perturb + 1;
444         ep = &ep0[i & mask];
445         if (ep->me_key == NULL)
446             return freeslot == NULL ? ep : freeslot;
447         if (ep->me_key == key
448             || (ep->me_hash == hash
449             && ep->me_key != dummy
450             && _PyString_Eq(ep->me_key, key)))
451             return ep;
452         if (ep->me_key == dummy && freeslot == NULL)
453             freeslot = ep;
454     }
455     assert(0);          /* NOT REACHED */
456     return 0;
457 }
458 
459 #ifdef SHOW_TRACK_COUNT
460 #define INCREASE_TRACK_COUNT \
461     (count_tracked++, count_untracked--);
462 #define DECREASE_TRACK_COUNT \
463     (count_tracked--, count_untracked++);
464 #else
465 #define INCREASE_TRACK_COUNT
466 #define DECREASE_TRACK_COUNT
467 #endif
468 
469 #define MAINTAIN_TRACKING(mp, key, value) \
470     do { \
471         if (!_PyObject_GC_IS_TRACKED(mp)) { \
472             if (_PyObject_GC_MAY_BE_TRACKED(key) || \
473                 _PyObject_GC_MAY_BE_TRACKED(value)) { \
474                 _PyObject_GC_TRACK(mp); \
475                 INCREASE_TRACK_COUNT \
476             } \
477         } \
478     } while(0)
479 
480 void
_PyDict_MaybeUntrack(PyObject * op)481 _PyDict_MaybeUntrack(PyObject *op)
482 {
483     PyDictObject *mp;
484     PyObject *value;
485     Py_ssize_t mask, i;
486     PyDictEntry *ep;
487 
488     if (!PyDict_CheckExact(op) || !_PyObject_GC_IS_TRACKED(op))
489         return;
490 
491     mp = (PyDictObject *) op;
492     ep = mp->ma_table;
493     mask = mp->ma_mask;
494     for (i = 0; i <= mask; i++) {
495         if ((value = ep[i].me_value) == NULL)
496             continue;
497         if (_PyObject_GC_MAY_BE_TRACKED(value) ||
498             _PyObject_GC_MAY_BE_TRACKED(ep[i].me_key))
499             return;
500     }
501     DECREASE_TRACK_COUNT
502     _PyObject_GC_UNTRACK(op);
503 }
504 
505 
506 /*
507 Internal routine to insert a new item into the table.
508 Used both by the internal resize routine and by the public insert routine.
509 Eats a reference to key and one to value.
510 Returns -1 if an error occurred, or 0 on success.
511 */
512 static int
insertdict(register PyDictObject * mp,PyObject * key,long hash,PyObject * value)513 insertdict(register PyDictObject *mp, PyObject *key, long hash, PyObject *value)
514 {
515     PyObject *old_value;
516     register PyDictEntry *ep;
517     typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
518 
519     assert(mp->ma_lookup != NULL);
520     ep = mp->ma_lookup(mp, key, hash);
521     if (ep == NULL) {
522         Py_DECREF(key);
523         Py_DECREF(value);
524         return -1;
525     }
526     MAINTAIN_TRACKING(mp, key, value);
527     if (ep->me_value != NULL) {
528         old_value = ep->me_value;
529         ep->me_value = value;
530         Py_DECREF(old_value); /* which **CAN** re-enter */
531         Py_DECREF(key);
532     }
533     else {
534         if (ep->me_key == NULL)
535             mp->ma_fill++;
536         else {
537             assert(ep->me_key == dummy);
538             Py_DECREF(dummy);
539         }
540         ep->me_key = key;
541         ep->me_hash = (Py_ssize_t)hash;
542         ep->me_value = value;
543         mp->ma_used++;
544     }
545     return 0;
546 }
547 
548 /*
549 Internal routine used by dictresize() to insert an item which is
550 known to be absent from the dict.  This routine also assumes that
551 the dict contains no deleted entries.  Besides the performance benefit,
552 using insertdict() in dictresize() is dangerous (SF bug #1456209).
553 Note that no refcounts are changed by this routine; if needed, the caller
554 is responsible for incref'ing `key` and `value`.
555 */
556 static void
insertdict_clean(register PyDictObject * mp,PyObject * key,long hash,PyObject * value)557 insertdict_clean(register PyDictObject *mp, PyObject *key, long hash,
558                  PyObject *value)
559 {
560     register size_t i;
561     register size_t perturb;
562     register size_t mask = (size_t)mp->ma_mask;
563     PyDictEntry *ep0 = mp->ma_table;
564     register PyDictEntry *ep;
565 
566     MAINTAIN_TRACKING(mp, key, value);
567     i = hash & mask;
568     ep = &ep0[i];
569     for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) {
570         i = (i << 2) + i + perturb + 1;
571         ep = &ep0[i & mask];
572     }
573     assert(ep->me_value == NULL);
574     mp->ma_fill++;
575     ep->me_key = key;
576     ep->me_hash = (Py_ssize_t)hash;
577     ep->me_value = value;
578     mp->ma_used++;
579 }
580 
581 /*
582 Restructure the table by allocating a new table and reinserting all
583 items again.  When entries have been deleted, the new table may
584 actually be smaller than the old one.
585 */
586 static int
dictresize(PyDictObject * mp,Py_ssize_t minused)587 dictresize(PyDictObject *mp, Py_ssize_t minused)
588 {
589     Py_ssize_t newsize;
590     PyDictEntry *oldtable, *newtable, *ep;
591     Py_ssize_t i;
592     int is_oldtable_malloced;
593     PyDictEntry small_copy[PyDict_MINSIZE];
594 
595     assert(minused >= 0);
596 
597     /* Find the smallest table size > minused. */
598     for (newsize = PyDict_MINSIZE;
599          newsize <= minused && newsize > 0;
600          newsize <<= 1)
601         ;
602     if (newsize <= 0) {
603         PyErr_NoMemory();
604         return -1;
605     }
606 
607     /* Get space for a new table. */
608     oldtable = mp->ma_table;
609     assert(oldtable != NULL);
610     is_oldtable_malloced = oldtable != mp->ma_smalltable;
611 
612     if (newsize == PyDict_MINSIZE) {
613         /* A large table is shrinking, or we can't get any smaller. */
614         newtable = mp->ma_smalltable;
615         if (newtable == oldtable) {
616             if (mp->ma_fill == mp->ma_used) {
617                 /* No dummies, so no point doing anything. */
618                 return 0;
619             }
620             /* We're not going to resize it, but rebuild the
621                table anyway to purge old dummy entries.
622                Subtle:  This is *necessary* if fill==size,
623                as lookdict needs at least one virgin slot to
624                terminate failing searches.  If fill < size, it's
625                merely desirable, as dummies slow searches. */
626             assert(mp->ma_fill > mp->ma_used);
627             memcpy(small_copy, oldtable, sizeof(small_copy));
628             oldtable = small_copy;
629         }
630     }
631     else {
632         newtable = PyMem_NEW(PyDictEntry, newsize);
633         if (newtable == NULL) {
634             PyErr_NoMemory();
635             return -1;
636         }
637     }
638 
639     /* Make the dict empty, using the new table. */
640     assert(newtable != oldtable);
641     mp->ma_table = newtable;
642     mp->ma_mask = newsize - 1;
643     memset(newtable, 0, sizeof(PyDictEntry) * newsize);
644     mp->ma_used = 0;
645     i = mp->ma_fill;
646     mp->ma_fill = 0;
647 
648     /* Copy the data over; this is refcount-neutral for active entries;
649        dummy entries aren't copied over, of course */
650     for (ep = oldtable; i > 0; ep++) {
651         if (ep->me_value != NULL) {             /* active entry */
652             --i;
653             insertdict_clean(mp, ep->me_key, (long)ep->me_hash,
654                              ep->me_value);
655         }
656         else if (ep->me_key != NULL) {          /* dummy entry */
657             --i;
658             assert(ep->me_key == dummy);
659             Py_DECREF(ep->me_key);
660         }
661         /* else key == value == NULL:  nothing to do */
662     }
663 
664     if (is_oldtable_malloced)
665         PyMem_DEL(oldtable);
666     return 0;
667 }
668 
669 /* Create a new dictionary pre-sized to hold an estimated number of elements.
670    Underestimates are okay because the dictionary will resize as necessary.
671    Overestimates just mean the dictionary will be more sparse than usual.
672 */
673 
674 PyObject *
_PyDict_NewPresized(Py_ssize_t minused)675 _PyDict_NewPresized(Py_ssize_t minused)
676 {
677     PyObject *op = PyDict_New();
678 
679     if (minused>5 && op != NULL && dictresize((PyDictObject *)op, minused) == -1) {
680         Py_DECREF(op);
681         return NULL;
682     }
683     return op;
684 }
685 
686 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
687  * that may occur (originally dicts supported only string keys, and exceptions
688  * weren't possible).  So, while the original intent was that a NULL return
689  * meant the key wasn't present, in reality it can mean that, or that an error
690  * (suppressed) occurred while computing the key's hash, or that some error
691  * (suppressed) occurred when comparing keys in the dict's internal probe
692  * sequence.  A nasty example of the latter is when a Python-coded comparison
693  * function hits a stack-depth error, which can cause this to return NULL
694  * even if the key is present.
695  */
696 PyObject *
PyDict_GetItem(PyObject * op,PyObject * key)697 PyDict_GetItem(PyObject *op, PyObject *key)
698 {
699     long hash;
700     PyDictObject *mp = (PyDictObject *)op;
701     PyDictEntry *ep;
702     PyThreadState *tstate;
703     if (!PyDict_Check(op))
704         return NULL;
705     if (!PyString_CheckExact(key) ||
706         (hash = ((PyStringObject *) key)->ob_shash) == -1)
707     {
708         hash = PyObject_Hash(key);
709         if (hash == -1) {
710             PyErr_Clear();
711             return NULL;
712         }
713     }
714 
715     /* We can arrive here with a NULL tstate during initialization: try
716        running "python -Wi" for an example related to string interning.
717        Let's just hope that no exception occurs then...  This must be
718        _PyThreadState_Current and not PyThreadState_GET() because in debug
719        mode, the latter complains if tstate is NULL. */
720     tstate = _PyThreadState_Current;
721     if (tstate != NULL && tstate->curexc_type != NULL) {
722         /* preserve the existing exception */
723         PyObject *err_type, *err_value, *err_tb;
724         PyErr_Fetch(&err_type, &err_value, &err_tb);
725         ep = (mp->ma_lookup)(mp, key, hash);
726         /* ignore errors */
727         PyErr_Restore(err_type, err_value, err_tb);
728         if (ep == NULL)
729             return NULL;
730     }
731     else {
732         ep = (mp->ma_lookup)(mp, key, hash);
733         if (ep == NULL) {
734             PyErr_Clear();
735             return NULL;
736         }
737     }
738     return ep->me_value;
739 }
740 
741 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
742  * dictionary if it's merely replacing the value for an existing key.
743  * This means that it's safe to loop over a dictionary with PyDict_Next()
744  * and occasionally replace a value -- but you can't insert new keys or
745  * remove them.
746  */
747 int
PyDict_SetItem(register PyObject * op,PyObject * key,PyObject * value)748 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
749 {
750     register PyDictObject *mp;
751     register long hash;
752     register Py_ssize_t n_used;
753 
754     if (!PyDict_Check(op)) {
755         PyErr_BadInternalCall();
756         return -1;
757     }
758     assert(key);
759     assert(value);
760     mp = (PyDictObject *)op;
761     if (PyString_CheckExact(key)) {
762         hash = ((PyStringObject *)key)->ob_shash;
763         if (hash == -1)
764             hash = PyObject_Hash(key);
765     }
766     else {
767         hash = PyObject_Hash(key);
768         if (hash == -1)
769             return -1;
770     }
771     assert(mp->ma_fill <= mp->ma_mask);  /* at least one empty slot */
772     n_used = mp->ma_used;
773     Py_INCREF(value);
774     Py_INCREF(key);
775     if (insertdict(mp, key, hash, value) != 0)
776         return -1;
777     /* If we added a key, we can safely resize.  Otherwise just return!
778      * If fill >= 2/3 size, adjust size.  Normally, this doubles or
779      * quaduples the size, but it's also possible for the dict to shrink
780      * (if ma_fill is much larger than ma_used, meaning a lot of dict
781      * keys have been * deleted).
782      *
783      * Quadrupling the size improves average dictionary sparseness
784      * (reducing collisions) at the cost of some memory and iteration
785      * speed (which loops over every possible entry).  It also halves
786      * the number of expensive resize operations in a growing dictionary.
787      *
788      * Very large dictionaries (over 50K items) use doubling instead.
789      * This may help applications with severe memory constraints.
790      */
791     if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
792         return 0;
793     return dictresize(mp, (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used);
794 }
795 
796 int
PyDict_DelItem(PyObject * op,PyObject * key)797 PyDict_DelItem(PyObject *op, PyObject *key)
798 {
799     register PyDictObject *mp;
800     register long hash;
801     register PyDictEntry *ep;
802     PyObject *old_value, *old_key;
803 
804     if (!PyDict_Check(op)) {
805         PyErr_BadInternalCall();
806         return -1;
807     }
808     assert(key);
809     if (!PyString_CheckExact(key) ||
810         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
811         hash = PyObject_Hash(key);
812         if (hash == -1)
813             return -1;
814     }
815     mp = (PyDictObject *)op;
816     ep = (mp->ma_lookup)(mp, key, hash);
817     if (ep == NULL)
818         return -1;
819     if (ep->me_value == NULL) {
820         set_key_error(key);
821         return -1;
822     }
823     old_key = ep->me_key;
824     Py_INCREF(dummy);
825     ep->me_key = dummy;
826     old_value = ep->me_value;
827     ep->me_value = NULL;
828     mp->ma_used--;
829     Py_DECREF(old_value);
830     Py_DECREF(old_key);
831     return 0;
832 }
833 
834 void
PyDict_Clear(PyObject * op)835 PyDict_Clear(PyObject *op)
836 {
837     PyDictObject *mp;
838     PyDictEntry *ep, *table;
839     int table_is_malloced;
840     Py_ssize_t fill;
841     PyDictEntry small_copy[PyDict_MINSIZE];
842 #ifdef Py_DEBUG
843     Py_ssize_t i, n;
844 #endif
845 
846     if (!PyDict_Check(op))
847         return;
848     mp = (PyDictObject *)op;
849 #ifdef Py_DEBUG
850     n = mp->ma_mask + 1;
851     i = 0;
852 #endif
853 
854     table = mp->ma_table;
855     assert(table != NULL);
856     table_is_malloced = table != mp->ma_smalltable;
857 
858     /* This is delicate.  During the process of clearing the dict,
859      * decrefs can cause the dict to mutate.  To avoid fatal confusion
860      * (voice of experience), we have to make the dict empty before
861      * clearing the slots, and never refer to anything via mp->xxx while
862      * clearing.
863      */
864     fill = mp->ma_fill;
865     if (table_is_malloced)
866         EMPTY_TO_MINSIZE(mp);
867 
868     else if (fill > 0) {
869         /* It's a small table with something that needs to be cleared.
870          * Afraid the only safe way is to copy the dict entries into
871          * another small table first.
872          */
873         memcpy(small_copy, table, sizeof(small_copy));
874         table = small_copy;
875         EMPTY_TO_MINSIZE(mp);
876     }
877     /* else it's a small table that's already empty */
878 
879     /* Now we can finally clear things.  If C had refcounts, we could
880      * assert that the refcount on table is 1 now, i.e. that this function
881      * has unique access to it, so decref side-effects can't alter it.
882      */
883     for (ep = table; fill > 0; ++ep) {
884 #ifdef Py_DEBUG
885         assert(i < n);
886         ++i;
887 #endif
888         if (ep->me_key) {
889             --fill;
890             Py_DECREF(ep->me_key);
891             Py_XDECREF(ep->me_value);
892         }
893 #ifdef Py_DEBUG
894         else
895             assert(ep->me_value == NULL);
896 #endif
897     }
898 
899     if (table_is_malloced)
900         PyMem_DEL(table);
901 }
902 
903 /*
904  * Iterate over a dict.  Use like so:
905  *
906  *     Py_ssize_t i;
907  *     PyObject *key, *value;
908  *     i = 0;   # important!  i should not otherwise be changed by you
909  *     while (PyDict_Next(yourdict, &i, &key, &value)) {
910  *              Refer to borrowed references in key and value.
911  *     }
912  *
913  * CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
914  * mutates the dict.  One exception:  it is safe if the loop merely changes
915  * the values associated with the keys (but doesn't insert new keys or
916  * delete keys), via PyDict_SetItem().
917  */
918 int
PyDict_Next(PyObject * op,Py_ssize_t * ppos,PyObject ** pkey,PyObject ** pvalue)919 PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue)
920 {
921     register Py_ssize_t i;
922     register Py_ssize_t mask;
923     register PyDictEntry *ep;
924 
925     if (!PyDict_Check(op))
926         return 0;
927     i = *ppos;
928     if (i < 0)
929         return 0;
930     ep = ((PyDictObject *)op)->ma_table;
931     mask = ((PyDictObject *)op)->ma_mask;
932     while (i <= mask && ep[i].me_value == NULL)
933         i++;
934     *ppos = i+1;
935     if (i > mask)
936         return 0;
937     if (pkey)
938         *pkey = ep[i].me_key;
939     if (pvalue)
940         *pvalue = ep[i].me_value;
941     return 1;
942 }
943 
944 /* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
945 int
_PyDict_Next(PyObject * op,Py_ssize_t * ppos,PyObject ** pkey,PyObject ** pvalue,long * phash)946 _PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue, long *phash)
947 {
948     register Py_ssize_t i;
949     register Py_ssize_t mask;
950     register PyDictEntry *ep;
951 
952     if (!PyDict_Check(op))
953         return 0;
954     i = *ppos;
955     if (i < 0)
956         return 0;
957     ep = ((PyDictObject *)op)->ma_table;
958     mask = ((PyDictObject *)op)->ma_mask;
959     while (i <= mask && ep[i].me_value == NULL)
960         i++;
961     *ppos = i+1;
962     if (i > mask)
963         return 0;
964     *phash = (long)(ep[i].me_hash);
965     if (pkey)
966         *pkey = ep[i].me_key;
967     if (pvalue)
968         *pvalue = ep[i].me_value;
969     return 1;
970 }
971 
972 /* Methods */
973 
974 static void
dict_dealloc(register PyDictObject * mp)975 dict_dealloc(register PyDictObject *mp)
976 {
977     register PyDictEntry *ep;
978     Py_ssize_t fill = mp->ma_fill;
979     PyObject_GC_UnTrack(mp);
980     Py_TRASHCAN_SAFE_BEGIN(mp)
981     for (ep = mp->ma_table; fill > 0; ep++) {
982         if (ep->me_key) {
983             --fill;
984             Py_DECREF(ep->me_key);
985             Py_XDECREF(ep->me_value);
986         }
987     }
988     if (mp->ma_table != mp->ma_smalltable)
989         PyMem_DEL(mp->ma_table);
990     if (numfree < PyDict_MAXFREELIST && Py_TYPE(mp) == &PyDict_Type)
991         free_list[numfree++] = mp;
992     else
993         Py_TYPE(mp)->tp_free((PyObject *)mp);
994     Py_TRASHCAN_SAFE_END(mp)
995 }
996 
997 static int
dict_print(register PyDictObject * mp,register FILE * fp,register int flags)998 dict_print(register PyDictObject *mp, register FILE *fp, register int flags)
999 {
1000     register Py_ssize_t i;
1001     register Py_ssize_t any;
1002     int status;
1003 
1004     status = Py_ReprEnter((PyObject*)mp);
1005     if (status != 0) {
1006         if (status < 0)
1007             return status;
1008         Py_BEGIN_ALLOW_THREADS
1009         fprintf(fp, "{...}");
1010         Py_END_ALLOW_THREADS
1011         return 0;
1012     }
1013 
1014     Py_BEGIN_ALLOW_THREADS
1015     fprintf(fp, "{");
1016     Py_END_ALLOW_THREADS
1017     any = 0;
1018     for (i = 0; i <= mp->ma_mask; i++) {
1019         PyDictEntry *ep = mp->ma_table + i;
1020         PyObject *pvalue = ep->me_value;
1021         if (pvalue != NULL) {
1022             /* Prevent PyObject_Repr from deleting value during
1023                key format */
1024             Py_INCREF(pvalue);
1025             if (any++ > 0) {
1026                 Py_BEGIN_ALLOW_THREADS
1027                 fprintf(fp, ", ");
1028                 Py_END_ALLOW_THREADS
1029             }
1030             if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
1031                 Py_DECREF(pvalue);
1032                 Py_ReprLeave((PyObject*)mp);
1033                 return -1;
1034             }
1035             Py_BEGIN_ALLOW_THREADS
1036             fprintf(fp, ": ");
1037             Py_END_ALLOW_THREADS
1038             if (PyObject_Print(pvalue, fp, 0) != 0) {
1039                 Py_DECREF(pvalue);
1040                 Py_ReprLeave((PyObject*)mp);
1041                 return -1;
1042             }
1043             Py_DECREF(pvalue);
1044         }
1045     }
1046     Py_BEGIN_ALLOW_THREADS
1047     fprintf(fp, "}");
1048     Py_END_ALLOW_THREADS
1049     Py_ReprLeave((PyObject*)mp);
1050     return 0;
1051 }
1052 
1053 static PyObject *
dict_repr(PyDictObject * mp)1054 dict_repr(PyDictObject *mp)
1055 {
1056     Py_ssize_t i;
1057     PyObject *s, *temp, *colon = NULL;
1058     PyObject *pieces = NULL, *result = NULL;
1059     PyObject *key, *value;
1060 
1061     i = Py_ReprEnter((PyObject *)mp);
1062     if (i != 0) {
1063         return i > 0 ? PyString_FromString("{...}") : NULL;
1064     }
1065 
1066     if (mp->ma_used == 0) {
1067         result = PyString_FromString("{}");
1068         goto Done;
1069     }
1070 
1071     pieces = PyList_New(0);
1072     if (pieces == NULL)
1073         goto Done;
1074 
1075     colon = PyString_FromString(": ");
1076     if (colon == NULL)
1077         goto Done;
1078 
1079     /* Do repr() on each key+value pair, and insert ": " between them.
1080        Note that repr may mutate the dict. */
1081     i = 0;
1082     while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
1083         int status;
1084         /* Prevent repr from deleting value during key format. */
1085         Py_INCREF(value);
1086         s = PyObject_Repr(key);
1087         PyString_Concat(&s, colon);
1088         PyString_ConcatAndDel(&s, PyObject_Repr(value));
1089         Py_DECREF(value);
1090         if (s == NULL)
1091             goto Done;
1092         status = PyList_Append(pieces, s);
1093         Py_DECREF(s);  /* append created a new ref */
1094         if (status < 0)
1095             goto Done;
1096     }
1097 
1098     /* Add "{}" decorations to the first and last items. */
1099     assert(PyList_GET_SIZE(pieces) > 0);
1100     s = PyString_FromString("{");
1101     if (s == NULL)
1102         goto Done;
1103     temp = PyList_GET_ITEM(pieces, 0);
1104     PyString_ConcatAndDel(&s, temp);
1105     PyList_SET_ITEM(pieces, 0, s);
1106     if (s == NULL)
1107         goto Done;
1108 
1109     s = PyString_FromString("}");
1110     if (s == NULL)
1111         goto Done;
1112     temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
1113     PyString_ConcatAndDel(&temp, s);
1114     PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
1115     if (temp == NULL)
1116         goto Done;
1117 
1118     /* Paste them all together with ", " between. */
1119     s = PyString_FromString(", ");
1120     if (s == NULL)
1121         goto Done;
1122     result = _PyString_Join(s, pieces);
1123     Py_DECREF(s);
1124 
1125 Done:
1126     Py_XDECREF(pieces);
1127     Py_XDECREF(colon);
1128     Py_ReprLeave((PyObject *)mp);
1129     return result;
1130 }
1131 
1132 static Py_ssize_t
dict_length(PyDictObject * mp)1133 dict_length(PyDictObject *mp)
1134 {
1135     return mp->ma_used;
1136 }
1137 
1138 static PyObject *
dict_subscript(PyDictObject * mp,register PyObject * key)1139 dict_subscript(PyDictObject *mp, register PyObject *key)
1140 {
1141     PyObject *v;
1142     long hash;
1143     PyDictEntry *ep;
1144     assert(mp->ma_table != NULL);
1145     if (!PyString_CheckExact(key) ||
1146         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1147         hash = PyObject_Hash(key);
1148         if (hash == -1)
1149             return NULL;
1150     }
1151     ep = (mp->ma_lookup)(mp, key, hash);
1152     if (ep == NULL)
1153         return NULL;
1154     v = ep->me_value;
1155     if (v == NULL) {
1156         if (!PyDict_CheckExact(mp)) {
1157             /* Look up __missing__ method if we're a subclass. */
1158             PyObject *missing, *res;
1159             static PyObject *missing_str = NULL;
1160             missing = _PyObject_LookupSpecial((PyObject *)mp,
1161                                               "__missing__",
1162                                               &missing_str);
1163             if (missing != NULL) {
1164                 res = PyObject_CallFunctionObjArgs(missing,
1165                                                    key, NULL);
1166                 Py_DECREF(missing);
1167                 return res;
1168             }
1169             else if (PyErr_Occurred())
1170                 return NULL;
1171         }
1172         set_key_error(key);
1173         return NULL;
1174     }
1175     else
1176         Py_INCREF(v);
1177     return v;
1178 }
1179 
1180 static int
dict_ass_sub(PyDictObject * mp,PyObject * v,PyObject * w)1181 dict_ass_sub(PyDictObject *mp, PyObject *v, PyObject *w)
1182 {
1183     if (w == NULL)
1184         return PyDict_DelItem((PyObject *)mp, v);
1185     else
1186         return PyDict_SetItem((PyObject *)mp, v, w);
1187 }
1188 
1189 static PyMappingMethods dict_as_mapping = {
1190     (lenfunc)dict_length, /*mp_length*/
1191     (binaryfunc)dict_subscript, /*mp_subscript*/
1192     (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
1193 };
1194 
1195 static PyObject *
dict_keys(register PyDictObject * mp)1196 dict_keys(register PyDictObject *mp)
1197 {
1198     register PyObject *v;
1199     register Py_ssize_t i, j;
1200     PyDictEntry *ep;
1201     Py_ssize_t mask, n;
1202 
1203   again:
1204     n = mp->ma_used;
1205     v = PyList_New(n);
1206     if (v == NULL)
1207         return NULL;
1208     if (n != mp->ma_used) {
1209         /* Durnit.  The allocations caused the dict to resize.
1210          * Just start over, this shouldn't normally happen.
1211          */
1212         Py_DECREF(v);
1213         goto again;
1214     }
1215     ep = mp->ma_table;
1216     mask = mp->ma_mask;
1217     for (i = 0, j = 0; i <= mask; i++) {
1218         if (ep[i].me_value != NULL) {
1219             PyObject *key = ep[i].me_key;
1220             Py_INCREF(key);
1221             PyList_SET_ITEM(v, j, key);
1222             j++;
1223         }
1224     }
1225     assert(j == n);
1226     return v;
1227 }
1228 
1229 static PyObject *
dict_values(register PyDictObject * mp)1230 dict_values(register PyDictObject *mp)
1231 {
1232     register PyObject *v;
1233     register Py_ssize_t i, j;
1234     PyDictEntry *ep;
1235     Py_ssize_t mask, n;
1236 
1237   again:
1238     n = mp->ma_used;
1239     v = PyList_New(n);
1240     if (v == NULL)
1241         return NULL;
1242     if (n != mp->ma_used) {
1243         /* Durnit.  The allocations caused the dict to resize.
1244          * Just start over, this shouldn't normally happen.
1245          */
1246         Py_DECREF(v);
1247         goto again;
1248     }
1249     ep = mp->ma_table;
1250     mask = mp->ma_mask;
1251     for (i = 0, j = 0; i <= mask; i++) {
1252         if (ep[i].me_value != NULL) {
1253             PyObject *value = ep[i].me_value;
1254             Py_INCREF(value);
1255             PyList_SET_ITEM(v, j, value);
1256             j++;
1257         }
1258     }
1259     assert(j == n);
1260     return v;
1261 }
1262 
1263 static PyObject *
dict_items(register PyDictObject * mp)1264 dict_items(register PyDictObject *mp)
1265 {
1266     register PyObject *v;
1267     register Py_ssize_t i, j, n;
1268     Py_ssize_t mask;
1269     PyObject *item, *key, *value;
1270     PyDictEntry *ep;
1271 
1272     /* Preallocate the list of tuples, to avoid allocations during
1273      * the loop over the items, which could trigger GC, which
1274      * could resize the dict. :-(
1275      */
1276   again:
1277     n = mp->ma_used;
1278     v = PyList_New(n);
1279     if (v == NULL)
1280         return NULL;
1281     for (i = 0; i < n; i++) {
1282         item = PyTuple_New(2);
1283         if (item == NULL) {
1284             Py_DECREF(v);
1285             return NULL;
1286         }
1287         PyList_SET_ITEM(v, i, item);
1288     }
1289     if (n != mp->ma_used) {
1290         /* Durnit.  The allocations caused the dict to resize.
1291          * Just start over, this shouldn't normally happen.
1292          */
1293         Py_DECREF(v);
1294         goto again;
1295     }
1296     /* Nothing we do below makes any function calls. */
1297     ep = mp->ma_table;
1298     mask = mp->ma_mask;
1299     for (i = 0, j = 0; i <= mask; i++) {
1300         if ((value=ep[i].me_value) != NULL) {
1301             key = ep[i].me_key;
1302             item = PyList_GET_ITEM(v, j);
1303             Py_INCREF(key);
1304             PyTuple_SET_ITEM(item, 0, key);
1305             Py_INCREF(value);
1306             PyTuple_SET_ITEM(item, 1, value);
1307             j++;
1308         }
1309     }
1310     assert(j == n);
1311     return v;
1312 }
1313 
1314 static PyObject *
dict_fromkeys(PyObject * cls,PyObject * args)1315 dict_fromkeys(PyObject *cls, PyObject *args)
1316 {
1317     PyObject *seq;
1318     PyObject *value = Py_None;
1319     PyObject *it;       /* iter(seq) */
1320     PyObject *key;
1321     PyObject *d;
1322     int status;
1323 
1324     if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
1325         return NULL;
1326 
1327     d = PyObject_CallObject(cls, NULL);
1328     if (d == NULL)
1329         return NULL;
1330 
1331     if (PyDict_CheckExact(d) && PyDict_CheckExact(seq)) {
1332         PyDictObject *mp = (PyDictObject *)d;
1333         PyObject *oldvalue;
1334         Py_ssize_t pos = 0;
1335         PyObject *key;
1336         long hash;
1337 
1338         if (dictresize(mp, Py_SIZE(seq)))
1339             return NULL;
1340 
1341         while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash)) {
1342             Py_INCREF(key);
1343             Py_INCREF(value);
1344             if (insertdict(mp, key, hash, value))
1345                 return NULL;
1346         }
1347         return d;
1348     }
1349 
1350     if (PyDict_CheckExact(d) && PyAnySet_CheckExact(seq)) {
1351         PyDictObject *mp = (PyDictObject *)d;
1352         Py_ssize_t pos = 0;
1353         PyObject *key;
1354         long hash;
1355 
1356         if (dictresize(mp, PySet_GET_SIZE(seq)))
1357             return NULL;
1358 
1359         while (_PySet_NextEntry(seq, &pos, &key, &hash)) {
1360             Py_INCREF(key);
1361             Py_INCREF(value);
1362             if (insertdict(mp, key, hash, value))
1363                 return NULL;
1364         }
1365         return d;
1366     }
1367 
1368     it = PyObject_GetIter(seq);
1369     if (it == NULL){
1370         Py_DECREF(d);
1371         return NULL;
1372     }
1373 
1374     if (PyDict_CheckExact(d)) {
1375         while ((key = PyIter_Next(it)) != NULL) {
1376             status = PyDict_SetItem(d, key, value);
1377             Py_DECREF(key);
1378             if (status < 0)
1379                 goto Fail;
1380         }
1381     } else {
1382         while ((key = PyIter_Next(it)) != NULL) {
1383             status = PyObject_SetItem(d, key, value);
1384             Py_DECREF(key);
1385             if (status < 0)
1386                 goto Fail;
1387         }
1388     }
1389 
1390     if (PyErr_Occurred())
1391         goto Fail;
1392     Py_DECREF(it);
1393     return d;
1394 
1395 Fail:
1396     Py_DECREF(it);
1397     Py_DECREF(d);
1398     return NULL;
1399 }
1400 
1401 static int
dict_update_common(PyObject * self,PyObject * args,PyObject * kwds,char * methname)1402 dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
1403 {
1404     PyObject *arg = NULL;
1405     int result = 0;
1406 
1407     if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
1408         result = -1;
1409 
1410     else if (arg != NULL) {
1411         if (PyObject_HasAttrString(arg, "keys"))
1412             result = PyDict_Merge(self, arg, 1);
1413         else
1414             result = PyDict_MergeFromSeq2(self, arg, 1);
1415     }
1416     if (result == 0 && kwds != NULL)
1417         result = PyDict_Merge(self, kwds, 1);
1418     return result;
1419 }
1420 
1421 static PyObject *
dict_update(PyObject * self,PyObject * args,PyObject * kwds)1422 dict_update(PyObject *self, PyObject *args, PyObject *kwds)
1423 {
1424     if (dict_update_common(self, args, kwds, "update") != -1)
1425         Py_RETURN_NONE;
1426     return NULL;
1427 }
1428 
1429 /* Update unconditionally replaces existing items.
1430    Merge has a 3rd argument 'override'; if set, it acts like Update,
1431    otherwise it leaves existing items unchanged.
1432 
1433    PyDict_{Update,Merge} update/merge from a mapping object.
1434 
1435    PyDict_MergeFromSeq2 updates/merges from any iterable object
1436    producing iterable objects of length 2.
1437 */
1438 
1439 int
PyDict_MergeFromSeq2(PyObject * d,PyObject * seq2,int override)1440 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1441 {
1442     PyObject *it;       /* iter(seq2) */
1443     Py_ssize_t i;       /* index into seq2 of current element */
1444     PyObject *item;     /* seq2[i] */
1445     PyObject *fast;     /* item as a 2-tuple or 2-list */
1446 
1447     assert(d != NULL);
1448     assert(PyDict_Check(d));
1449     assert(seq2 != NULL);
1450 
1451     it = PyObject_GetIter(seq2);
1452     if (it == NULL)
1453         return -1;
1454 
1455     for (i = 0; ; ++i) {
1456         PyObject *key, *value;
1457         Py_ssize_t n;
1458 
1459         fast = NULL;
1460         item = PyIter_Next(it);
1461         if (item == NULL) {
1462             if (PyErr_Occurred())
1463                 goto Fail;
1464             break;
1465         }
1466 
1467         /* Convert item to sequence, and verify length 2. */
1468         fast = PySequence_Fast(item, "");
1469         if (fast == NULL) {
1470             if (PyErr_ExceptionMatches(PyExc_TypeError))
1471                 PyErr_Format(PyExc_TypeError,
1472                     "cannot convert dictionary update "
1473                     "sequence element #%zd to a sequence",
1474                     i);
1475             goto Fail;
1476         }
1477         n = PySequence_Fast_GET_SIZE(fast);
1478         if (n != 2) {
1479             PyErr_Format(PyExc_ValueError,
1480                          "dictionary update sequence element #%zd "
1481                          "has length %zd; 2 is required",
1482                          i, n);
1483             goto Fail;
1484         }
1485 
1486         /* Update/merge with this (key, value) pair. */
1487         key = PySequence_Fast_GET_ITEM(fast, 0);
1488         value = PySequence_Fast_GET_ITEM(fast, 1);
1489         if (override || PyDict_GetItem(d, key) == NULL) {
1490             int status = PyDict_SetItem(d, key, value);
1491             if (status < 0)
1492                 goto Fail;
1493         }
1494         Py_DECREF(fast);
1495         Py_DECREF(item);
1496     }
1497 
1498     i = 0;
1499     goto Return;
1500 Fail:
1501     Py_XDECREF(item);
1502     Py_XDECREF(fast);
1503     i = -1;
1504 Return:
1505     Py_DECREF(it);
1506     return Py_SAFE_DOWNCAST(i, Py_ssize_t, int);
1507 }
1508 
1509 int
PyDict_Update(PyObject * a,PyObject * b)1510 PyDict_Update(PyObject *a, PyObject *b)
1511 {
1512     return PyDict_Merge(a, b, 1);
1513 }
1514 
1515 int
PyDict_Merge(PyObject * a,PyObject * b,int override)1516 PyDict_Merge(PyObject *a, PyObject *b, int override)
1517 {
1518     register PyDictObject *mp, *other;
1519     register Py_ssize_t i;
1520     PyDictEntry *entry;
1521 
1522     /* We accept for the argument either a concrete dictionary object,
1523      * or an abstract "mapping" object.  For the former, we can do
1524      * things quite efficiently.  For the latter, we only require that
1525      * PyMapping_Keys() and PyObject_GetItem() be supported.
1526      */
1527     if (a == NULL || !PyDict_Check(a) || b == NULL) {
1528         PyErr_BadInternalCall();
1529         return -1;
1530     }
1531     mp = (PyDictObject*)a;
1532     if (PyDict_Check(b)) {
1533         other = (PyDictObject*)b;
1534         if (other == mp || other->ma_used == 0)
1535             /* a.update(a) or a.update({}); nothing to do */
1536             return 0;
1537         if (mp->ma_used == 0)
1538             /* Since the target dict is empty, PyDict_GetItem()
1539              * always returns NULL.  Setting override to 1
1540              * skips the unnecessary test.
1541              */
1542             override = 1;
1543         /* Do one big resize at the start, rather than
1544          * incrementally resizing as we insert new items.  Expect
1545          * that there will be no (or few) overlapping keys.
1546          */
1547         if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1548            if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1549                return -1;
1550         }
1551         for (i = 0; i <= other->ma_mask; i++) {
1552             entry = &other->ma_table[i];
1553             if (entry->me_value != NULL &&
1554                 (override ||
1555                  PyDict_GetItem(a, entry->me_key) == NULL)) {
1556                 Py_INCREF(entry->me_key);
1557                 Py_INCREF(entry->me_value);
1558                 if (insertdict(mp, entry->me_key,
1559                                (long)entry->me_hash,
1560                                entry->me_value) != 0)
1561                     return -1;
1562             }
1563         }
1564     }
1565     else {
1566         /* Do it the generic, slower way */
1567         PyObject *keys = PyMapping_Keys(b);
1568         PyObject *iter;
1569         PyObject *key, *value;
1570         int status;
1571 
1572         if (keys == NULL)
1573             /* Docstring says this is equivalent to E.keys() so
1574              * if E doesn't have a .keys() method we want
1575              * AttributeError to percolate up.  Might as well
1576              * do the same for any other error.
1577              */
1578             return -1;
1579 
1580         iter = PyObject_GetIter(keys);
1581         Py_DECREF(keys);
1582         if (iter == NULL)
1583             return -1;
1584 
1585         for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1586             if (!override && PyDict_GetItem(a, key) != NULL) {
1587                 Py_DECREF(key);
1588                 continue;
1589             }
1590             value = PyObject_GetItem(b, key);
1591             if (value == NULL) {
1592                 Py_DECREF(iter);
1593                 Py_DECREF(key);
1594                 return -1;
1595             }
1596             status = PyDict_SetItem(a, key, value);
1597             Py_DECREF(key);
1598             Py_DECREF(value);
1599             if (status < 0) {
1600                 Py_DECREF(iter);
1601                 return -1;
1602             }
1603         }
1604         Py_DECREF(iter);
1605         if (PyErr_Occurred())
1606             /* Iterator completed, via error */
1607             return -1;
1608     }
1609     return 0;
1610 }
1611 
1612 static PyObject *
dict_copy(register PyDictObject * mp)1613 dict_copy(register PyDictObject *mp)
1614 {
1615     return PyDict_Copy((PyObject*)mp);
1616 }
1617 
1618 PyObject *
PyDict_Copy(PyObject * o)1619 PyDict_Copy(PyObject *o)
1620 {
1621     PyObject *copy;
1622 
1623     if (o == NULL || !PyDict_Check(o)) {
1624         PyErr_BadInternalCall();
1625         return NULL;
1626     }
1627     copy = PyDict_New();
1628     if (copy == NULL)
1629         return NULL;
1630     if (PyDict_Merge(copy, o, 1) == 0)
1631         return copy;
1632     Py_DECREF(copy);
1633     return NULL;
1634 }
1635 
1636 Py_ssize_t
PyDict_Size(PyObject * mp)1637 PyDict_Size(PyObject *mp)
1638 {
1639     if (mp == NULL || !PyDict_Check(mp)) {
1640         PyErr_BadInternalCall();
1641         return -1;
1642     }
1643     return ((PyDictObject *)mp)->ma_used;
1644 }
1645 
1646 PyObject *
PyDict_Keys(PyObject * mp)1647 PyDict_Keys(PyObject *mp)
1648 {
1649     if (mp == NULL || !PyDict_Check(mp)) {
1650         PyErr_BadInternalCall();
1651         return NULL;
1652     }
1653     return dict_keys((PyDictObject *)mp);
1654 }
1655 
1656 PyObject *
PyDict_Values(PyObject * mp)1657 PyDict_Values(PyObject *mp)
1658 {
1659     if (mp == NULL || !PyDict_Check(mp)) {
1660         PyErr_BadInternalCall();
1661         return NULL;
1662     }
1663     return dict_values((PyDictObject *)mp);
1664 }
1665 
1666 PyObject *
PyDict_Items(PyObject * mp)1667 PyDict_Items(PyObject *mp)
1668 {
1669     if (mp == NULL || !PyDict_Check(mp)) {
1670         PyErr_BadInternalCall();
1671         return NULL;
1672     }
1673     return dict_items((PyDictObject *)mp);
1674 }
1675 
1676 /* Subroutine which returns the smallest key in a for which b's value
1677    is different or absent.  The value is returned too, through the
1678    pval argument.  Both are NULL if no key in a is found for which b's status
1679    differs.  The refcounts on (and only on) non-NULL *pval and function return
1680    values must be decremented by the caller (characterize() increments them
1681    to ensure that mutating comparison and PyDict_GetItem calls can't delete
1682    them before the caller is done looking at them). */
1683 
1684 static PyObject *
characterize(PyDictObject * a,PyDictObject * b,PyObject ** pval)1685 characterize(PyDictObject *a, PyDictObject *b, PyObject **pval)
1686 {
1687     PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1688     PyObject *aval = NULL; /* a[akey] */
1689     Py_ssize_t i;
1690     int cmp;
1691 
1692     for (i = 0; i <= a->ma_mask; i++) {
1693         PyObject *thiskey, *thisaval, *thisbval;
1694         if (a->ma_table[i].me_value == NULL)
1695             continue;
1696         thiskey = a->ma_table[i].me_key;
1697         Py_INCREF(thiskey);  /* keep alive across compares */
1698         if (akey != NULL) {
1699             cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1700             if (cmp < 0) {
1701                 Py_DECREF(thiskey);
1702                 goto Fail;
1703             }
1704             if (cmp > 0 ||
1705                 i > a->ma_mask ||
1706                 a->ma_table[i].me_value == NULL)
1707             {
1708                 /* Not the *smallest* a key; or maybe it is
1709                  * but the compare shrunk the dict so we can't
1710                  * find its associated value anymore; or
1711                  * maybe it is but the compare deleted the
1712                  * a[thiskey] entry.
1713                  */
1714                 Py_DECREF(thiskey);
1715                 continue;
1716             }
1717         }
1718 
1719         /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1720         thisaval = a->ma_table[i].me_value;
1721         assert(thisaval);
1722         Py_INCREF(thisaval);   /* keep alive */
1723         thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1724         if (thisbval == NULL)
1725             cmp = 0;
1726         else {
1727             /* both dicts have thiskey:  same values? */
1728             cmp = PyObject_RichCompareBool(
1729                                     thisaval, thisbval, Py_EQ);
1730             if (cmp < 0) {
1731                 Py_DECREF(thiskey);
1732                 Py_DECREF(thisaval);
1733                 goto Fail;
1734             }
1735         }
1736         if (cmp == 0) {
1737             /* New winner. */
1738             Py_XDECREF(akey);
1739             Py_XDECREF(aval);
1740             akey = thiskey;
1741             aval = thisaval;
1742         }
1743         else {
1744             Py_DECREF(thiskey);
1745             Py_DECREF(thisaval);
1746         }
1747     }
1748     *pval = aval;
1749     return akey;
1750 
1751 Fail:
1752     Py_XDECREF(akey);
1753     Py_XDECREF(aval);
1754     *pval = NULL;
1755     return NULL;
1756 }
1757 
1758 static int
dict_compare(PyDictObject * a,PyDictObject * b)1759 dict_compare(PyDictObject *a, PyDictObject *b)
1760 {
1761     PyObject *adiff, *bdiff, *aval, *bval;
1762     int res;
1763 
1764     /* Compare lengths first */
1765     if (a->ma_used < b->ma_used)
1766         return -1;              /* a is shorter */
1767     else if (a->ma_used > b->ma_used)
1768         return 1;               /* b is shorter */
1769 
1770     /* Same length -- check all keys */
1771     bdiff = bval = NULL;
1772     adiff = characterize(a, b, &aval);
1773     if (adiff == NULL) {
1774         assert(!aval);
1775         /* Either an error, or a is a subset with the same length so
1776          * must be equal.
1777          */
1778         res = PyErr_Occurred() ? -1 : 0;
1779         goto Finished;
1780     }
1781     bdiff = characterize(b, a, &bval);
1782     if (bdiff == NULL && PyErr_Occurred()) {
1783         assert(!bval);
1784         res = -1;
1785         goto Finished;
1786     }
1787     res = 0;
1788     if (bdiff) {
1789         /* bdiff == NULL "should be" impossible now, but perhaps
1790          * the last comparison done by the characterize() on a had
1791          * the side effect of making the dicts equal!
1792          */
1793         res = PyObject_Compare(adiff, bdiff);
1794     }
1795     if (res == 0 && bval != NULL)
1796         res = PyObject_Compare(aval, bval);
1797 
1798 Finished:
1799     Py_XDECREF(adiff);
1800     Py_XDECREF(bdiff);
1801     Py_XDECREF(aval);
1802     Py_XDECREF(bval);
1803     return res;
1804 }
1805 
1806 /* Return 1 if dicts equal, 0 if not, -1 if error.
1807  * Gets out as soon as any difference is detected.
1808  * Uses only Py_EQ comparison.
1809  */
1810 static int
dict_equal(PyDictObject * a,PyDictObject * b)1811 dict_equal(PyDictObject *a, PyDictObject *b)
1812 {
1813     Py_ssize_t i;
1814 
1815     if (a->ma_used != b->ma_used)
1816         /* can't be equal if # of entries differ */
1817         return 0;
1818 
1819     /* Same # of entries -- check all of 'em.  Exit early on any diff. */
1820     for (i = 0; i <= a->ma_mask; i++) {
1821         PyObject *aval = a->ma_table[i].me_value;
1822         if (aval != NULL) {
1823             int cmp;
1824             PyObject *bval;
1825             PyObject *key = a->ma_table[i].me_key;
1826             /* temporarily bump aval's refcount to ensure it stays
1827                alive until we're done with it */
1828             Py_INCREF(aval);
1829             /* ditto for key */
1830             Py_INCREF(key);
1831             bval = PyDict_GetItem((PyObject *)b, key);
1832             Py_DECREF(key);
1833             if (bval == NULL) {
1834                 Py_DECREF(aval);
1835                 return 0;
1836             }
1837             cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1838             Py_DECREF(aval);
1839             if (cmp <= 0)  /* error or not equal */
1840                 return cmp;
1841         }
1842     }
1843     return 1;
1844  }
1845 
1846 static PyObject *
dict_richcompare(PyObject * v,PyObject * w,int op)1847 dict_richcompare(PyObject *v, PyObject *w, int op)
1848 {
1849     int cmp;
1850     PyObject *res;
1851 
1852     if (!PyDict_Check(v) || !PyDict_Check(w)) {
1853         res = Py_NotImplemented;
1854     }
1855     else if (op == Py_EQ || op == Py_NE) {
1856         cmp = dict_equal((PyDictObject *)v, (PyDictObject *)w);
1857         if (cmp < 0)
1858             return NULL;
1859         res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1860     }
1861     else {
1862         /* Py3K warning if comparison isn't == or !=  */
1863         if (PyErr_WarnPy3k("dict inequality comparisons not supported "
1864                            "in 3.x", 1) < 0) {
1865             return NULL;
1866         }
1867         res = Py_NotImplemented;
1868     }
1869     Py_INCREF(res);
1870     return res;
1871  }
1872 
1873 static PyObject *
dict_contains(register PyDictObject * mp,PyObject * key)1874 dict_contains(register PyDictObject *mp, PyObject *key)
1875 {
1876     long hash;
1877     PyDictEntry *ep;
1878 
1879     if (!PyString_CheckExact(key) ||
1880         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1881         hash = PyObject_Hash(key);
1882         if (hash == -1)
1883             return NULL;
1884     }
1885     ep = (mp->ma_lookup)(mp, key, hash);
1886     if (ep == NULL)
1887         return NULL;
1888     return PyBool_FromLong(ep->me_value != NULL);
1889 }
1890 
1891 static PyObject *
dict_has_key(register PyDictObject * mp,PyObject * key)1892 dict_has_key(register PyDictObject *mp, PyObject *key)
1893 {
1894     if (PyErr_WarnPy3k("dict.has_key() not supported in 3.x; "
1895                        "use the in operator", 1) < 0)
1896         return NULL;
1897     return dict_contains(mp, key);
1898 }
1899 
1900 static PyObject *
dict_get(register PyDictObject * mp,PyObject * args)1901 dict_get(register PyDictObject *mp, PyObject *args)
1902 {
1903     PyObject *key;
1904     PyObject *failobj = Py_None;
1905     PyObject *val = NULL;
1906     long hash;
1907     PyDictEntry *ep;
1908 
1909     if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1910         return NULL;
1911 
1912     if (!PyString_CheckExact(key) ||
1913         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1914         hash = PyObject_Hash(key);
1915         if (hash == -1)
1916             return NULL;
1917     }
1918     ep = (mp->ma_lookup)(mp, key, hash);
1919     if (ep == NULL)
1920         return NULL;
1921     val = ep->me_value;
1922     if (val == NULL)
1923         val = failobj;
1924     Py_INCREF(val);
1925     return val;
1926 }
1927 
1928 
1929 static PyObject *
dict_setdefault(register PyDictObject * mp,PyObject * args)1930 dict_setdefault(register PyDictObject *mp, PyObject *args)
1931 {
1932     PyObject *key;
1933     PyObject *failobj = Py_None;
1934     PyObject *val = NULL;
1935     long hash;
1936     PyDictEntry *ep;
1937 
1938     if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1939         return NULL;
1940 
1941     if (!PyString_CheckExact(key) ||
1942         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1943         hash = PyObject_Hash(key);
1944         if (hash == -1)
1945             return NULL;
1946     }
1947     ep = (mp->ma_lookup)(mp, key, hash);
1948     if (ep == NULL)
1949         return NULL;
1950     val = ep->me_value;
1951     if (val == NULL) {
1952         val = failobj;
1953         if (PyDict_SetItem((PyObject*)mp, key, failobj))
1954             val = NULL;
1955     }
1956     Py_XINCREF(val);
1957     return val;
1958 }
1959 
1960 
1961 static PyObject *
dict_clear(register PyDictObject * mp)1962 dict_clear(register PyDictObject *mp)
1963 {
1964     PyDict_Clear((PyObject *)mp);
1965     Py_RETURN_NONE;
1966 }
1967 
1968 static PyObject *
dict_pop(PyDictObject * mp,PyObject * args)1969 dict_pop(PyDictObject *mp, PyObject *args)
1970 {
1971     long hash;
1972     PyDictEntry *ep;
1973     PyObject *old_value, *old_key;
1974     PyObject *key, *deflt = NULL;
1975 
1976     if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1977         return NULL;
1978     if (mp->ma_used == 0) {
1979         if (deflt) {
1980             Py_INCREF(deflt);
1981             return deflt;
1982         }
1983         set_key_error(key);
1984         return NULL;
1985     }
1986     if (!PyString_CheckExact(key) ||
1987         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1988         hash = PyObject_Hash(key);
1989         if (hash == -1)
1990             return NULL;
1991     }
1992     ep = (mp->ma_lookup)(mp, key, hash);
1993     if (ep == NULL)
1994         return NULL;
1995     if (ep->me_value == NULL) {
1996         if (deflt) {
1997             Py_INCREF(deflt);
1998             return deflt;
1999         }
2000         set_key_error(key);
2001         return NULL;
2002     }
2003     old_key = ep->me_key;
2004     Py_INCREF(dummy);
2005     ep->me_key = dummy;
2006     old_value = ep->me_value;
2007     ep->me_value = NULL;
2008     mp->ma_used--;
2009     Py_DECREF(old_key);
2010     return old_value;
2011 }
2012 
2013 static PyObject *
dict_popitem(PyDictObject * mp)2014 dict_popitem(PyDictObject *mp)
2015 {
2016     Py_ssize_t i = 0;
2017     PyDictEntry *ep;
2018     PyObject *res;
2019 
2020     /* Allocate the result tuple before checking the size.  Believe it
2021      * or not, this allocation could trigger a garbage collection which
2022      * could empty the dict, so if we checked the size first and that
2023      * happened, the result would be an infinite loop (searching for an
2024      * entry that no longer exists).  Note that the usual popitem()
2025      * idiom is "while d: k, v = d.popitem()". so needing to throw the
2026      * tuple away if the dict *is* empty isn't a significant
2027      * inefficiency -- possible, but unlikely in practice.
2028      */
2029     res = PyTuple_New(2);
2030     if (res == NULL)
2031         return NULL;
2032     if (mp->ma_used == 0) {
2033         Py_DECREF(res);
2034         PyErr_SetString(PyExc_KeyError,
2035                         "popitem(): dictionary is empty");
2036         return NULL;
2037     }
2038     /* Set ep to "the first" dict entry with a value.  We abuse the hash
2039      * field of slot 0 to hold a search finger:
2040      * If slot 0 has a value, use slot 0.
2041      * Else slot 0 is being used to hold a search finger,
2042      * and we use its hash value as the first index to look.
2043      */
2044     ep = &mp->ma_table[0];
2045     if (ep->me_value == NULL) {
2046         i = ep->me_hash;
2047         /* The hash field may be a real hash value, or it may be a
2048          * legit search finger, or it may be a once-legit search
2049          * finger that's out of bounds now because it wrapped around
2050          * or the table shrunk -- simply make sure it's in bounds now.
2051          */
2052         if (i > mp->ma_mask || i < 1)
2053             i = 1;              /* skip slot 0 */
2054         while ((ep = &mp->ma_table[i])->me_value == NULL) {
2055             i++;
2056             if (i > mp->ma_mask)
2057                 i = 1;
2058         }
2059     }
2060     PyTuple_SET_ITEM(res, 0, ep->me_key);
2061     PyTuple_SET_ITEM(res, 1, ep->me_value);
2062     Py_INCREF(dummy);
2063     ep->me_key = dummy;
2064     ep->me_value = NULL;
2065     mp->ma_used--;
2066     assert(mp->ma_table[0].me_value == NULL);
2067     mp->ma_table[0].me_hash = i + 1;  /* next place to start */
2068     return res;
2069 }
2070 
2071 static int
dict_traverse(PyObject * op,visitproc visit,void * arg)2072 dict_traverse(PyObject *op, visitproc visit, void *arg)
2073 {
2074     Py_ssize_t i = 0;
2075     PyObject *pk;
2076     PyObject *pv;
2077 
2078     while (PyDict_Next(op, &i, &pk, &pv)) {
2079         Py_VISIT(pk);
2080         Py_VISIT(pv);
2081     }
2082     return 0;
2083 }
2084 
2085 static int
dict_tp_clear(PyObject * op)2086 dict_tp_clear(PyObject *op)
2087 {
2088     PyDict_Clear(op);
2089     return 0;
2090 }
2091 
2092 
2093 extern PyTypeObject PyDictIterKey_Type; /* Forward */
2094 extern PyTypeObject PyDictIterValue_Type; /* Forward */
2095 extern PyTypeObject PyDictIterItem_Type; /* Forward */
2096 static PyObject *dictiter_new(PyDictObject *, PyTypeObject *);
2097 
2098 static PyObject *
dict_iterkeys(PyDictObject * dict)2099 dict_iterkeys(PyDictObject *dict)
2100 {
2101     return dictiter_new(dict, &PyDictIterKey_Type);
2102 }
2103 
2104 static PyObject *
dict_itervalues(PyDictObject * dict)2105 dict_itervalues(PyDictObject *dict)
2106 {
2107     return dictiter_new(dict, &PyDictIterValue_Type);
2108 }
2109 
2110 static PyObject *
dict_iteritems(PyDictObject * dict)2111 dict_iteritems(PyDictObject *dict)
2112 {
2113     return dictiter_new(dict, &PyDictIterItem_Type);
2114 }
2115 
2116 static PyObject *
dict_sizeof(PyDictObject * mp)2117 dict_sizeof(PyDictObject *mp)
2118 {
2119     Py_ssize_t res;
2120 
2121     res = sizeof(PyDictObject);
2122     if (mp->ma_table != mp->ma_smalltable)
2123         res = res + (mp->ma_mask + 1) * sizeof(PyDictEntry);
2124     return PyInt_FromSsize_t(res);
2125 }
2126 
2127 PyDoc_STRVAR(has_key__doc__,
2128 "D.has_key(k) -> True if D has a key k, else False");
2129 
2130 PyDoc_STRVAR(contains__doc__,
2131 "D.__contains__(k) -> True if D has a key k, else False");
2132 
2133 PyDoc_STRVAR(getitem__doc__, "x.__getitem__(y) <==> x[y]");
2134 
2135 PyDoc_STRVAR(sizeof__doc__,
2136 "D.__sizeof__() -> size of D in memory, in bytes");
2137 
2138 PyDoc_STRVAR(get__doc__,
2139 "D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.");
2140 
2141 PyDoc_STRVAR(setdefault_doc__,
2142 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
2143 
2144 PyDoc_STRVAR(pop__doc__,
2145 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n\
2146 If key is not found, d is returned if given, otherwise KeyError is raised");
2147 
2148 PyDoc_STRVAR(popitem__doc__,
2149 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2150 2-tuple; but raise KeyError if D is empty.");
2151 
2152 PyDoc_STRVAR(keys__doc__,
2153 "D.keys() -> list of D's keys");
2154 
2155 PyDoc_STRVAR(items__doc__,
2156 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
2157 
2158 PyDoc_STRVAR(values__doc__,
2159 "D.values() -> list of D's values");
2160 
2161 PyDoc_STRVAR(update__doc__,
2162 "D.update(E, **F) -> None.  Update D from dict/iterable E and F.\n"
2163 "If E has a .keys() method, does:     for k in E: D[k] = E[k]\n\
2164 If E lacks .keys() method, does:     for (k, v) in E: D[k] = v\n\
2165 In either case, this is followed by: for k in F: D[k] = F[k]");
2166 
2167 PyDoc_STRVAR(fromkeys__doc__,
2168 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
2169 v defaults to None.");
2170 
2171 PyDoc_STRVAR(clear__doc__,
2172 "D.clear() -> None.  Remove all items from D.");
2173 
2174 PyDoc_STRVAR(copy__doc__,
2175 "D.copy() -> a shallow copy of D");
2176 
2177 PyDoc_STRVAR(iterkeys__doc__,
2178 "D.iterkeys() -> an iterator over the keys of D");
2179 
2180 PyDoc_STRVAR(itervalues__doc__,
2181 "D.itervalues() -> an iterator over the values of D");
2182 
2183 PyDoc_STRVAR(iteritems__doc__,
2184 "D.iteritems() -> an iterator over the (key, value) items of D");
2185 
2186 /* Forward */
2187 static PyObject *dictkeys_new(PyObject *);
2188 static PyObject *dictitems_new(PyObject *);
2189 static PyObject *dictvalues_new(PyObject *);
2190 
2191 PyDoc_STRVAR(viewkeys__doc__,
2192              "D.viewkeys() -> a set-like object providing a view on D's keys");
2193 PyDoc_STRVAR(viewitems__doc__,
2194              "D.viewitems() -> a set-like object providing a view on D's items");
2195 PyDoc_STRVAR(viewvalues__doc__,
2196              "D.viewvalues() -> an object providing a view on D's values");
2197 
2198 static PyMethodDef mapp_methods[] = {
2199     {"__contains__",(PyCFunction)dict_contains,         METH_O | METH_COEXIST,
2200      contains__doc__},
2201     {"__getitem__", (PyCFunction)dict_subscript,        METH_O | METH_COEXIST,
2202      getitem__doc__},
2203     {"__sizeof__",      (PyCFunction)dict_sizeof,       METH_NOARGS,
2204      sizeof__doc__},
2205     {"has_key",         (PyCFunction)dict_has_key,      METH_O,
2206      has_key__doc__},
2207     {"get",         (PyCFunction)dict_get,          METH_VARARGS,
2208      get__doc__},
2209     {"setdefault",  (PyCFunction)dict_setdefault,   METH_VARARGS,
2210      setdefault_doc__},
2211     {"pop",         (PyCFunction)dict_pop,          METH_VARARGS,
2212      pop__doc__},
2213     {"popitem",         (PyCFunction)dict_popitem,      METH_NOARGS,
2214      popitem__doc__},
2215     {"keys",            (PyCFunction)dict_keys,         METH_NOARGS,
2216     keys__doc__},
2217     {"items",           (PyCFunction)dict_items,        METH_NOARGS,
2218      items__doc__},
2219     {"values",          (PyCFunction)dict_values,       METH_NOARGS,
2220      values__doc__},
2221     {"viewkeys",        (PyCFunction)dictkeys_new,      METH_NOARGS,
2222      viewkeys__doc__},
2223     {"viewitems",       (PyCFunction)dictitems_new,     METH_NOARGS,
2224      viewitems__doc__},
2225     {"viewvalues",      (PyCFunction)dictvalues_new,    METH_NOARGS,
2226      viewvalues__doc__},
2227     {"update",          (PyCFunction)dict_update,       METH_VARARGS | METH_KEYWORDS,
2228      update__doc__},
2229     {"fromkeys",        (PyCFunction)dict_fromkeys,     METH_VARARGS | METH_CLASS,
2230      fromkeys__doc__},
2231     {"clear",           (PyCFunction)dict_clear,        METH_NOARGS,
2232      clear__doc__},
2233     {"copy",            (PyCFunction)dict_copy,         METH_NOARGS,
2234      copy__doc__},
2235     {"iterkeys",        (PyCFunction)dict_iterkeys,     METH_NOARGS,
2236      iterkeys__doc__},
2237     {"itervalues",      (PyCFunction)dict_itervalues,   METH_NOARGS,
2238      itervalues__doc__},
2239     {"iteritems",       (PyCFunction)dict_iteritems,    METH_NOARGS,
2240      iteritems__doc__},
2241     {NULL,              NULL}   /* sentinel */
2242 };
2243 
2244 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
2245 int
PyDict_Contains(PyObject * op,PyObject * key)2246 PyDict_Contains(PyObject *op, PyObject *key)
2247 {
2248     long hash;
2249     PyDictObject *mp = (PyDictObject *)op;
2250     PyDictEntry *ep;
2251 
2252     if (!PyString_CheckExact(key) ||
2253         (hash = ((PyStringObject *) key)->ob_shash) == -1) {
2254         hash = PyObject_Hash(key);
2255         if (hash == -1)
2256             return -1;
2257     }
2258     ep = (mp->ma_lookup)(mp, key, hash);
2259     return ep == NULL ? -1 : (ep->me_value != NULL);
2260 }
2261 
2262 /* Internal version of PyDict_Contains used when the hash value is already known */
2263 int
_PyDict_Contains(PyObject * op,PyObject * key,long hash)2264 _PyDict_Contains(PyObject *op, PyObject *key, long hash)
2265 {
2266     PyDictObject *mp = (PyDictObject *)op;
2267     PyDictEntry *ep;
2268 
2269     ep = (mp->ma_lookup)(mp, key, hash);
2270     return ep == NULL ? -1 : (ep->me_value != NULL);
2271 }
2272 
2273 /* Hack to implement "key in dict" */
2274 static PySequenceMethods dict_as_sequence = {
2275     0,                          /* sq_length */
2276     0,                          /* sq_concat */
2277     0,                          /* sq_repeat */
2278     0,                          /* sq_item */
2279     0,                          /* sq_slice */
2280     0,                          /* sq_ass_item */
2281     0,                          /* sq_ass_slice */
2282     PyDict_Contains,            /* sq_contains */
2283     0,                          /* sq_inplace_concat */
2284     0,                          /* sq_inplace_repeat */
2285 };
2286 
2287 static PyObject *
dict_new(PyTypeObject * type,PyObject * args,PyObject * kwds)2288 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
2289 {
2290     PyObject *self;
2291 
2292     assert(type != NULL && type->tp_alloc != NULL);
2293     self = type->tp_alloc(type, 0);
2294     if (self != NULL) {
2295         PyDictObject *d = (PyDictObject *)self;
2296         /* It's guaranteed that tp->alloc zeroed out the struct. */
2297         assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
2298         INIT_NONZERO_DICT_SLOTS(d);
2299         d->ma_lookup = lookdict_string;
2300         /* The object has been implicitly tracked by tp_alloc */
2301         if (type == &PyDict_Type)
2302             _PyObject_GC_UNTRACK(d);
2303 #ifdef SHOW_CONVERSION_COUNTS
2304         ++created;
2305 #endif
2306 #ifdef SHOW_TRACK_COUNT
2307         if (_PyObject_GC_IS_TRACKED(d))
2308             count_tracked++;
2309         else
2310             count_untracked++;
2311 #endif
2312     }
2313     return self;
2314 }
2315 
2316 static int
dict_init(PyObject * self,PyObject * args,PyObject * kwds)2317 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
2318 {
2319     return dict_update_common(self, args, kwds, "dict");
2320 }
2321 
2322 static PyObject *
dict_iter(PyDictObject * dict)2323 dict_iter(PyDictObject *dict)
2324 {
2325     return dictiter_new(dict, &PyDictIterKey_Type);
2326 }
2327 
2328 PyDoc_STRVAR(dictionary_doc,
2329 "dict() -> new empty dictionary\n"
2330 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2331 "    (key, value) pairs\n"
2332 "dict(iterable) -> new dictionary initialized as if via:\n"
2333 "    d = {}\n"
2334 "    for k, v in iterable:\n"
2335 "        d[k] = v\n"
2336 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2337 "    in the keyword argument list.  For example:  dict(one=1, two=2)");
2338 
2339 PyTypeObject PyDict_Type = {
2340     PyVarObject_HEAD_INIT(&PyType_Type, 0)
2341     "dict",
2342     sizeof(PyDictObject),
2343     0,
2344     (destructor)dict_dealloc,                   /* tp_dealloc */
2345     (printfunc)dict_print,                      /* tp_print */
2346     0,                                          /* tp_getattr */
2347     0,                                          /* tp_setattr */
2348     (cmpfunc)dict_compare,                      /* tp_compare */
2349     (reprfunc)dict_repr,                        /* tp_repr */
2350     0,                                          /* tp_as_number */
2351     &dict_as_sequence,                          /* tp_as_sequence */
2352     &dict_as_mapping,                           /* tp_as_mapping */
2353     (hashfunc)PyObject_HashNotImplemented,      /* tp_hash */
2354     0,                                          /* tp_call */
2355     0,                                          /* tp_str */
2356     PyObject_GenericGetAttr,                    /* tp_getattro */
2357     0,                                          /* tp_setattro */
2358     0,                                          /* tp_as_buffer */
2359     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
2360         Py_TPFLAGS_BASETYPE | Py_TPFLAGS_DICT_SUBCLASS,         /* tp_flags */
2361     dictionary_doc,                             /* tp_doc */
2362     dict_traverse,                              /* tp_traverse */
2363     dict_tp_clear,                              /* tp_clear */
2364     dict_richcompare,                           /* tp_richcompare */
2365     0,                                          /* tp_weaklistoffset */
2366     (getiterfunc)dict_iter,                     /* tp_iter */
2367     0,                                          /* tp_iternext */
2368     mapp_methods,                               /* tp_methods */
2369     0,                                          /* tp_members */
2370     0,                                          /* tp_getset */
2371     0,                                          /* tp_base */
2372     0,                                          /* tp_dict */
2373     0,                                          /* tp_descr_get */
2374     0,                                          /* tp_descr_set */
2375     0,                                          /* tp_dictoffset */
2376     dict_init,                                  /* tp_init */
2377     PyType_GenericAlloc,                        /* tp_alloc */
2378     dict_new,                                   /* tp_new */
2379     PyObject_GC_Del,                            /* tp_free */
2380 };
2381 
2382 /* For backward compatibility with old dictionary interface */
2383 
2384 PyObject *
PyDict_GetItemString(PyObject * v,const char * key)2385 PyDict_GetItemString(PyObject *v, const char *key)
2386 {
2387     PyObject *kv, *rv;
2388     kv = PyString_FromString(key);
2389     if (kv == NULL)
2390         return NULL;
2391     rv = PyDict_GetItem(v, kv);
2392     Py_DECREF(kv);
2393     return rv;
2394 }
2395 
2396 int
PyDict_SetItemString(PyObject * v,const char * key,PyObject * item)2397 PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
2398 {
2399     PyObject *kv;
2400     int err;
2401     kv = PyString_FromString(key);
2402     if (kv == NULL)
2403         return -1;
2404     PyString_InternInPlace(&kv); /* XXX Should we really? */
2405     err = PyDict_SetItem(v, kv, item);
2406     Py_DECREF(kv);
2407     return err;
2408 }
2409 
2410 int
PyDict_DelItemString(PyObject * v,const char * key)2411 PyDict_DelItemString(PyObject *v, const char *key)
2412 {
2413     PyObject *kv;
2414     int err;
2415     kv = PyString_FromString(key);
2416     if (kv == NULL)
2417         return -1;
2418     err = PyDict_DelItem(v, kv);
2419     Py_DECREF(kv);
2420     return err;
2421 }
2422 
2423 /* Dictionary iterator types */
2424 
2425 typedef struct {
2426     PyObject_HEAD
2427     PyDictObject *di_dict; /* Set to NULL when iterator is exhausted */
2428     Py_ssize_t di_used;
2429     Py_ssize_t di_pos;
2430     PyObject* di_result; /* reusable result tuple for iteritems */
2431     Py_ssize_t len;
2432 } dictiterobject;
2433 
2434 static PyObject *
dictiter_new(PyDictObject * dict,PyTypeObject * itertype)2435 dictiter_new(PyDictObject *dict, PyTypeObject *itertype)
2436 {
2437     dictiterobject *di;
2438     di = PyObject_GC_New(dictiterobject, itertype);
2439     if (di == NULL)
2440         return NULL;
2441     Py_INCREF(dict);
2442     di->di_dict = dict;
2443     di->di_used = dict->ma_used;
2444     di->di_pos = 0;
2445     di->len = dict->ma_used;
2446     if (itertype == &PyDictIterItem_Type) {
2447         di->di_result = PyTuple_Pack(2, Py_None, Py_None);
2448         if (di->di_result == NULL) {
2449             Py_DECREF(di);
2450             return NULL;
2451         }
2452     }
2453     else
2454         di->di_result = NULL;
2455     _PyObject_GC_TRACK(di);
2456     return (PyObject *)di;
2457 }
2458 
2459 static void
dictiter_dealloc(dictiterobject * di)2460 dictiter_dealloc(dictiterobject *di)
2461 {
2462     Py_XDECREF(di->di_dict);
2463     Py_XDECREF(di->di_result);
2464     PyObject_GC_Del(di);
2465 }
2466 
2467 static int
dictiter_traverse(dictiterobject * di,visitproc visit,void * arg)2468 dictiter_traverse(dictiterobject *di, visitproc visit, void *arg)
2469 {
2470     Py_VISIT(di->di_dict);
2471     Py_VISIT(di->di_result);
2472     return 0;
2473 }
2474 
2475 static PyObject *
dictiter_len(dictiterobject * di)2476 dictiter_len(dictiterobject *di)
2477 {
2478     Py_ssize_t len = 0;
2479     if (di->di_dict != NULL && di->di_used == di->di_dict->ma_used)
2480         len = di->len;
2481     return PyInt_FromSize_t(len);
2482 }
2483 
2484 PyDoc_STRVAR(length_hint_doc, "Private method returning an estimate of len(list(it)).");
2485 
2486 static PyMethodDef dictiter_methods[] = {
2487     {"__length_hint__", (PyCFunction)dictiter_len, METH_NOARGS, length_hint_doc},
2488     {NULL,              NULL}           /* sentinel */
2489 };
2490 
dictiter_iternextkey(dictiterobject * di)2491 static PyObject *dictiter_iternextkey(dictiterobject *di)
2492 {
2493     PyObject *key;
2494     register Py_ssize_t i, mask;
2495     register PyDictEntry *ep;
2496     PyDictObject *d = di->di_dict;
2497 
2498     if (d == NULL)
2499         return NULL;
2500     assert (PyDict_Check(d));
2501 
2502     if (di->di_used != d->ma_used) {
2503         PyErr_SetString(PyExc_RuntimeError,
2504                         "dictionary changed size during iteration");
2505         di->di_used = -1; /* Make this state sticky */
2506         return NULL;
2507     }
2508 
2509     i = di->di_pos;
2510     if (i < 0)
2511         goto fail;
2512     ep = d->ma_table;
2513     mask = d->ma_mask;
2514     while (i <= mask && ep[i].me_value == NULL)
2515         i++;
2516     di->di_pos = i+1;
2517     if (i > mask)
2518         goto fail;
2519     di->len--;
2520     key = ep[i].me_key;
2521     Py_INCREF(key);
2522     return key;
2523 
2524 fail:
2525     Py_DECREF(d);
2526     di->di_dict = NULL;
2527     return NULL;
2528 }
2529 
2530 PyTypeObject PyDictIterKey_Type = {
2531     PyVarObject_HEAD_INIT(&PyType_Type, 0)
2532     "dictionary-keyiterator",                   /* tp_name */
2533     sizeof(dictiterobject),                     /* tp_basicsize */
2534     0,                                          /* tp_itemsize */
2535     /* methods */
2536     (destructor)dictiter_dealloc,               /* tp_dealloc */
2537     0,                                          /* tp_print */
2538     0,                                          /* tp_getattr */
2539     0,                                          /* tp_setattr */
2540     0,                                          /* tp_compare */
2541     0,                                          /* tp_repr */
2542     0,                                          /* tp_as_number */
2543     0,                                          /* tp_as_sequence */
2544     0,                                          /* tp_as_mapping */
2545     0,                                          /* tp_hash */
2546     0,                                          /* tp_call */
2547     0,                                          /* tp_str */
2548     PyObject_GenericGetAttr,                    /* tp_getattro */
2549     0,                                          /* tp_setattro */
2550     0,                                          /* tp_as_buffer */
2551     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2552     0,                                          /* tp_doc */
2553     (traverseproc)dictiter_traverse,            /* tp_traverse */
2554     0,                                          /* tp_clear */
2555     0,                                          /* tp_richcompare */
2556     0,                                          /* tp_weaklistoffset */
2557     PyObject_SelfIter,                          /* tp_iter */
2558     (iternextfunc)dictiter_iternextkey,         /* tp_iternext */
2559     dictiter_methods,                           /* tp_methods */
2560     0,
2561 };
2562 
dictiter_iternextvalue(dictiterobject * di)2563 static PyObject *dictiter_iternextvalue(dictiterobject *di)
2564 {
2565     PyObject *value;
2566     register Py_ssize_t i, mask;
2567     register PyDictEntry *ep;
2568     PyDictObject *d = di->di_dict;
2569 
2570     if (d == NULL)
2571         return NULL;
2572     assert (PyDict_Check(d));
2573 
2574     if (di->di_used != d->ma_used) {
2575         PyErr_SetString(PyExc_RuntimeError,
2576                         "dictionary changed size during iteration");
2577         di->di_used = -1; /* Make this state sticky */
2578         return NULL;
2579     }
2580 
2581     i = di->di_pos;
2582     mask = d->ma_mask;
2583     if (i < 0 || i > mask)
2584         goto fail;
2585     ep = d->ma_table;
2586     while ((value=ep[i].me_value) == NULL) {
2587         i++;
2588         if (i > mask)
2589             goto fail;
2590     }
2591     di->di_pos = i+1;
2592     di->len--;
2593     Py_INCREF(value);
2594     return value;
2595 
2596 fail:
2597     Py_DECREF(d);
2598     di->di_dict = NULL;
2599     return NULL;
2600 }
2601 
2602 PyTypeObject PyDictIterValue_Type = {
2603     PyVarObject_HEAD_INIT(&PyType_Type, 0)
2604     "dictionary-valueiterator",                 /* tp_name */
2605     sizeof(dictiterobject),                     /* tp_basicsize */
2606     0,                                          /* tp_itemsize */
2607     /* methods */
2608     (destructor)dictiter_dealloc,               /* tp_dealloc */
2609     0,                                          /* tp_print */
2610     0,                                          /* tp_getattr */
2611     0,                                          /* tp_setattr */
2612     0,                                          /* tp_compare */
2613     0,                                          /* tp_repr */
2614     0,                                          /* tp_as_number */
2615     0,                                          /* tp_as_sequence */
2616     0,                                          /* tp_as_mapping */
2617     0,                                          /* tp_hash */
2618     0,                                          /* tp_call */
2619     0,                                          /* tp_str */
2620     PyObject_GenericGetAttr,                    /* tp_getattro */
2621     0,                                          /* tp_setattro */
2622     0,                                          /* tp_as_buffer */
2623     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2624     0,                                          /* tp_doc */
2625     (traverseproc)dictiter_traverse,            /* tp_traverse */
2626     0,                                          /* tp_clear */
2627     0,                                          /* tp_richcompare */
2628     0,                                          /* tp_weaklistoffset */
2629     PyObject_SelfIter,                          /* tp_iter */
2630     (iternextfunc)dictiter_iternextvalue,       /* tp_iternext */
2631     dictiter_methods,                           /* tp_methods */
2632     0,
2633 };
2634 
dictiter_iternextitem(dictiterobject * di)2635 static PyObject *dictiter_iternextitem(dictiterobject *di)
2636 {
2637     PyObject *key, *value, *result = di->di_result;
2638     register Py_ssize_t i, mask;
2639     register PyDictEntry *ep;
2640     PyDictObject *d = di->di_dict;
2641 
2642     if (d == NULL)
2643         return NULL;
2644     assert (PyDict_Check(d));
2645 
2646     if (di->di_used != d->ma_used) {
2647         PyErr_SetString(PyExc_RuntimeError,
2648                         "dictionary changed size during iteration");
2649         di->di_used = -1; /* Make this state sticky */
2650         return NULL;
2651     }
2652 
2653     i = di->di_pos;
2654     if (i < 0)
2655         goto fail;
2656     ep = d->ma_table;
2657     mask = d->ma_mask;
2658     while (i <= mask && ep[i].me_value == NULL)
2659         i++;
2660     di->di_pos = i+1;
2661     if (i > mask)
2662         goto fail;
2663 
2664     if (result->ob_refcnt == 1) {
2665         Py_INCREF(result);
2666         Py_DECREF(PyTuple_GET_ITEM(result, 0));
2667         Py_DECREF(PyTuple_GET_ITEM(result, 1));
2668     } else {
2669         result = PyTuple_New(2);
2670         if (result == NULL)
2671             return NULL;
2672     }
2673     di->len--;
2674     key = ep[i].me_key;
2675     value = ep[i].me_value;
2676     Py_INCREF(key);
2677     Py_INCREF(value);
2678     PyTuple_SET_ITEM(result, 0, key);
2679     PyTuple_SET_ITEM(result, 1, value);
2680     return result;
2681 
2682 fail:
2683     Py_DECREF(d);
2684     di->di_dict = NULL;
2685     return NULL;
2686 }
2687 
2688 PyTypeObject PyDictIterItem_Type = {
2689     PyVarObject_HEAD_INIT(&PyType_Type, 0)
2690     "dictionary-itemiterator",                  /* tp_name */
2691     sizeof(dictiterobject),                     /* tp_basicsize */
2692     0,                                          /* tp_itemsize */
2693     /* methods */
2694     (destructor)dictiter_dealloc,               /* tp_dealloc */
2695     0,                                          /* tp_print */
2696     0,                                          /* tp_getattr */
2697     0,                                          /* tp_setattr */
2698     0,                                          /* tp_compare */
2699     0,                                          /* tp_repr */
2700     0,                                          /* tp_as_number */
2701     0,                                          /* tp_as_sequence */
2702     0,                                          /* tp_as_mapping */
2703     0,                                          /* tp_hash */
2704     0,                                          /* tp_call */
2705     0,                                          /* tp_str */
2706     PyObject_GenericGetAttr,                    /* tp_getattro */
2707     0,                                          /* tp_setattro */
2708     0,                                          /* tp_as_buffer */
2709     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2710     0,                                          /* tp_doc */
2711     (traverseproc)dictiter_traverse,            /* tp_traverse */
2712     0,                                          /* tp_clear */
2713     0,                                          /* tp_richcompare */
2714     0,                                          /* tp_weaklistoffset */
2715     PyObject_SelfIter,                          /* tp_iter */
2716     (iternextfunc)dictiter_iternextitem,        /* tp_iternext */
2717     dictiter_methods,                           /* tp_methods */
2718     0,
2719 };
2720 
2721 /***********************************************/
2722 /* View objects for keys(), items(), values(). */
2723 /***********************************************/
2724 
2725 /* The instance lay-out is the same for all three; but the type differs. */
2726 
2727 typedef struct {
2728     PyObject_HEAD
2729     PyDictObject *dv_dict;
2730 } dictviewobject;
2731 
2732 
2733 static void
dictview_dealloc(dictviewobject * dv)2734 dictview_dealloc(dictviewobject *dv)
2735 {
2736     Py_XDECREF(dv->dv_dict);
2737     PyObject_GC_Del(dv);
2738 }
2739 
2740 static int
dictview_traverse(dictviewobject * dv,visitproc visit,void * arg)2741 dictview_traverse(dictviewobject *dv, visitproc visit, void *arg)
2742 {
2743     Py_VISIT(dv->dv_dict);
2744     return 0;
2745 }
2746 
2747 static Py_ssize_t
dictview_len(dictviewobject * dv)2748 dictview_len(dictviewobject *dv)
2749 {
2750     Py_ssize_t len = 0;
2751     if (dv->dv_dict != NULL)
2752         len = dv->dv_dict->ma_used;
2753     return len;
2754 }
2755 
2756 static PyObject *
dictview_new(PyObject * dict,PyTypeObject * type)2757 dictview_new(PyObject *dict, PyTypeObject *type)
2758 {
2759     dictviewobject *dv;
2760     if (dict == NULL) {
2761         PyErr_BadInternalCall();
2762         return NULL;
2763     }
2764     if (!PyDict_Check(dict)) {
2765         /* XXX Get rid of this restriction later */
2766         PyErr_Format(PyExc_TypeError,
2767                      "%s() requires a dict argument, not '%s'",
2768                      type->tp_name, dict->ob_type->tp_name);
2769         return NULL;
2770     }
2771     dv = PyObject_GC_New(dictviewobject, type);
2772     if (dv == NULL)
2773         return NULL;
2774     Py_INCREF(dict);
2775     dv->dv_dict = (PyDictObject *)dict;
2776     _PyObject_GC_TRACK(dv);
2777     return (PyObject *)dv;
2778 }
2779 
2780 /* TODO(guido): The views objects are not complete:
2781 
2782  * support more set operations
2783  * support arbitrary mappings?
2784    - either these should be static or exported in dictobject.h
2785    - if public then they should probably be in builtins
2786 */
2787 
2788 /* Return 1 if self is a subset of other, iterating over self;
2789    0 if not; -1 if an error occurred. */
2790 static int
all_contained_in(PyObject * self,PyObject * other)2791 all_contained_in(PyObject *self, PyObject *other)
2792 {
2793     PyObject *iter = PyObject_GetIter(self);
2794     int ok = 1;
2795 
2796     if (iter == NULL)
2797         return -1;
2798     for (;;) {
2799         PyObject *next = PyIter_Next(iter);
2800         if (next == NULL) {
2801             if (PyErr_Occurred())
2802                 ok = -1;
2803             break;
2804         }
2805         ok = PySequence_Contains(other, next);
2806         Py_DECREF(next);
2807         if (ok <= 0)
2808             break;
2809     }
2810     Py_DECREF(iter);
2811     return ok;
2812 }
2813 
2814 static PyObject *
dictview_richcompare(PyObject * self,PyObject * other,int op)2815 dictview_richcompare(PyObject *self, PyObject *other, int op)
2816 {
2817     Py_ssize_t len_self, len_other;
2818     int ok;
2819     PyObject *result;
2820 
2821     assert(self != NULL);
2822     assert(PyDictViewSet_Check(self));
2823     assert(other != NULL);
2824 
2825     if (!PyAnySet_Check(other) && !PyDictViewSet_Check(other)) {
2826         Py_INCREF(Py_NotImplemented);
2827         return Py_NotImplemented;
2828     }
2829 
2830     len_self = PyObject_Size(self);
2831     if (len_self < 0)
2832         return NULL;
2833     len_other = PyObject_Size(other);
2834     if (len_other < 0)
2835         return NULL;
2836 
2837     ok = 0;
2838     switch(op) {
2839 
2840     case Py_NE:
2841     case Py_EQ:
2842         if (len_self == len_other)
2843             ok = all_contained_in(self, other);
2844         if (op == Py_NE && ok >= 0)
2845             ok = !ok;
2846         break;
2847 
2848     case Py_LT:
2849         if (len_self < len_other)
2850             ok = all_contained_in(self, other);
2851         break;
2852 
2853       case Py_LE:
2854           if (len_self <= len_other)
2855               ok = all_contained_in(self, other);
2856           break;
2857 
2858     case Py_GT:
2859         if (len_self > len_other)
2860             ok = all_contained_in(other, self);
2861         break;
2862 
2863     case Py_GE:
2864         if (len_self >= len_other)
2865             ok = all_contained_in(other, self);
2866         break;
2867 
2868     }
2869     if (ok < 0)
2870         return NULL;
2871     result = ok ? Py_True : Py_False;
2872     Py_INCREF(result);
2873     return result;
2874 }
2875 
2876 static PyObject *
dictview_repr(dictviewobject * dv)2877 dictview_repr(dictviewobject *dv)
2878 {
2879     PyObject *seq;
2880     PyObject *seq_str;
2881     PyObject *result;
2882 
2883     seq = PySequence_List((PyObject *)dv);
2884     if (seq == NULL)
2885         return NULL;
2886 
2887     seq_str = PyObject_Repr(seq);
2888     result = PyString_FromFormat("%s(%s)", Py_TYPE(dv)->tp_name,
2889                                  PyString_AS_STRING(seq_str));
2890     Py_DECREF(seq_str);
2891     Py_DECREF(seq);
2892     return result;
2893 }
2894 
2895 /*** dict_keys ***/
2896 
2897 static PyObject *
dictkeys_iter(dictviewobject * dv)2898 dictkeys_iter(dictviewobject *dv)
2899 {
2900     if (dv->dv_dict == NULL) {
2901         Py_RETURN_NONE;
2902     }
2903     return dictiter_new(dv->dv_dict, &PyDictIterKey_Type);
2904 }
2905 
2906 static int
dictkeys_contains(dictviewobject * dv,PyObject * obj)2907 dictkeys_contains(dictviewobject *dv, PyObject *obj)
2908 {
2909     if (dv->dv_dict == NULL)
2910         return 0;
2911     return PyDict_Contains((PyObject *)dv->dv_dict, obj);
2912 }
2913 
2914 static PySequenceMethods dictkeys_as_sequence = {
2915     (lenfunc)dictview_len,              /* sq_length */
2916     0,                                  /* sq_concat */
2917     0,                                  /* sq_repeat */
2918     0,                                  /* sq_item */
2919     0,                                  /* sq_slice */
2920     0,                                  /* sq_ass_item */
2921     0,                                  /* sq_ass_slice */
2922     (objobjproc)dictkeys_contains,      /* sq_contains */
2923 };
2924 
2925 static PyObject*
dictviews_sub(PyObject * self,PyObject * other)2926 dictviews_sub(PyObject* self, PyObject *other)
2927 {
2928     PyObject *result = PySet_New(self);
2929     PyObject *tmp;
2930     if (result == NULL)
2931         return NULL;
2932 
2933     tmp = PyObject_CallMethod(result, "difference_update", "O", other);
2934     if (tmp == NULL) {
2935         Py_DECREF(result);
2936         return NULL;
2937     }
2938 
2939     Py_DECREF(tmp);
2940     return result;
2941 }
2942 
2943 static PyObject*
dictviews_and(PyObject * self,PyObject * other)2944 dictviews_and(PyObject* self, PyObject *other)
2945 {
2946     PyObject *result = PySet_New(self);
2947     PyObject *tmp;
2948     if (result == NULL)
2949         return NULL;
2950 
2951     tmp = PyObject_CallMethod(result, "intersection_update", "O", other);
2952     if (tmp == NULL) {
2953         Py_DECREF(result);
2954         return NULL;
2955     }
2956 
2957     Py_DECREF(tmp);
2958     return result;
2959 }
2960 
2961 static PyObject*
dictviews_or(PyObject * self,PyObject * other)2962 dictviews_or(PyObject* self, PyObject *other)
2963 {
2964     PyObject *result = PySet_New(self);
2965     PyObject *tmp;
2966     if (result == NULL)
2967         return NULL;
2968 
2969     tmp = PyObject_CallMethod(result, "update", "O", other);
2970     if (tmp == NULL) {
2971         Py_DECREF(result);
2972         return NULL;
2973     }
2974 
2975     Py_DECREF(tmp);
2976     return result;
2977 }
2978 
2979 static PyObject*
dictviews_xor(PyObject * self,PyObject * other)2980 dictviews_xor(PyObject* self, PyObject *other)
2981 {
2982     PyObject *result = PySet_New(self);
2983     PyObject *tmp;
2984     if (result == NULL)
2985         return NULL;
2986 
2987     tmp = PyObject_CallMethod(result, "symmetric_difference_update", "O",
2988                               other);
2989     if (tmp == NULL) {
2990         Py_DECREF(result);
2991         return NULL;
2992     }
2993 
2994     Py_DECREF(tmp);
2995     return result;
2996 }
2997 
2998 static PyNumberMethods dictviews_as_number = {
2999     0,                                  /*nb_add*/
3000     (binaryfunc)dictviews_sub,          /*nb_subtract*/
3001     0,                                  /*nb_multiply*/
3002     0,                                  /*nb_divide*/
3003     0,                                  /*nb_remainder*/
3004     0,                                  /*nb_divmod*/
3005     0,                                  /*nb_power*/
3006     0,                                  /*nb_negative*/
3007     0,                                  /*nb_positive*/
3008     0,                                  /*nb_absolute*/
3009     0,                                  /*nb_nonzero*/
3010     0,                                  /*nb_invert*/
3011     0,                                  /*nb_lshift*/
3012     0,                                  /*nb_rshift*/
3013     (binaryfunc)dictviews_and,          /*nb_and*/
3014     (binaryfunc)dictviews_xor,          /*nb_xor*/
3015     (binaryfunc)dictviews_or,           /*nb_or*/
3016 };
3017 
3018 static PyMethodDef dictkeys_methods[] = {
3019     {NULL,              NULL}           /* sentinel */
3020 };
3021 
3022 PyTypeObject PyDictKeys_Type = {
3023     PyVarObject_HEAD_INIT(&PyType_Type, 0)
3024     "dict_keys",                                /* tp_name */
3025     sizeof(dictviewobject),                     /* tp_basicsize */
3026     0,                                          /* tp_itemsize */
3027     /* methods */
3028     (destructor)dictview_dealloc,               /* tp_dealloc */
3029     0,                                          /* tp_print */
3030     0,                                          /* tp_getattr */
3031     0,                                          /* tp_setattr */
3032     0,                                          /* tp_reserved */
3033     (reprfunc)dictview_repr,                    /* tp_repr */
3034     &dictviews_as_number,                       /* tp_as_number */
3035     &dictkeys_as_sequence,                      /* tp_as_sequence */
3036     0,                                          /* tp_as_mapping */
3037     0,                                          /* tp_hash */
3038     0,                                          /* tp_call */
3039     0,                                          /* tp_str */
3040     PyObject_GenericGetAttr,                    /* tp_getattro */
3041     0,                                          /* tp_setattro */
3042     0,                                          /* tp_as_buffer */
3043     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
3044         Py_TPFLAGS_CHECKTYPES,                  /* tp_flags */
3045     0,                                          /* tp_doc */
3046     (traverseproc)dictview_traverse,            /* tp_traverse */
3047     0,                                          /* tp_clear */
3048     dictview_richcompare,                       /* tp_richcompare */
3049     0,                                          /* tp_weaklistoffset */
3050     (getiterfunc)dictkeys_iter,                 /* tp_iter */
3051     0,                                          /* tp_iternext */
3052     dictkeys_methods,                           /* tp_methods */
3053     0,
3054 };
3055 
3056 static PyObject *
dictkeys_new(PyObject * dict)3057 dictkeys_new(PyObject *dict)
3058 {
3059     return dictview_new(dict, &PyDictKeys_Type);
3060 }
3061 
3062 /*** dict_items ***/
3063 
3064 static PyObject *
dictitems_iter(dictviewobject * dv)3065 dictitems_iter(dictviewobject *dv)
3066 {
3067     if (dv->dv_dict == NULL) {
3068         Py_RETURN_NONE;
3069     }
3070     return dictiter_new(dv->dv_dict, &PyDictIterItem_Type);
3071 }
3072 
3073 static int
dictitems_contains(dictviewobject * dv,PyObject * obj)3074 dictitems_contains(dictviewobject *dv, PyObject *obj)
3075 {
3076     PyObject *key, *value, *found;
3077     if (dv->dv_dict == NULL)
3078         return 0;
3079     if (!PyTuple_Check(obj) || PyTuple_GET_SIZE(obj) != 2)
3080         return 0;
3081     key = PyTuple_GET_ITEM(obj, 0);
3082     value = PyTuple_GET_ITEM(obj, 1);
3083     found = PyDict_GetItem((PyObject *)dv->dv_dict, key);
3084     if (found == NULL) {
3085         if (PyErr_Occurred())
3086             return -1;
3087         return 0;
3088     }
3089     return PyObject_RichCompareBool(value, found, Py_EQ);
3090 }
3091 
3092 static PySequenceMethods dictitems_as_sequence = {
3093     (lenfunc)dictview_len,              /* sq_length */
3094     0,                                  /* sq_concat */
3095     0,                                  /* sq_repeat */
3096     0,                                  /* sq_item */
3097     0,                                  /* sq_slice */
3098     0,                                  /* sq_ass_item */
3099     0,                                  /* sq_ass_slice */
3100     (objobjproc)dictitems_contains,     /* sq_contains */
3101 };
3102 
3103 static PyMethodDef dictitems_methods[] = {
3104     {NULL,              NULL}           /* sentinel */
3105 };
3106 
3107 PyTypeObject PyDictItems_Type = {
3108     PyVarObject_HEAD_INIT(&PyType_Type, 0)
3109     "dict_items",                               /* tp_name */
3110     sizeof(dictviewobject),                     /* tp_basicsize */
3111     0,                                          /* tp_itemsize */
3112     /* methods */
3113     (destructor)dictview_dealloc,               /* tp_dealloc */
3114     0,                                          /* tp_print */
3115     0,                                          /* tp_getattr */
3116     0,                                          /* tp_setattr */
3117     0,                                          /* tp_reserved */
3118     (reprfunc)dictview_repr,                    /* tp_repr */
3119     &dictviews_as_number,                       /* tp_as_number */
3120     &dictitems_as_sequence,                     /* tp_as_sequence */
3121     0,                                          /* tp_as_mapping */
3122     0,                                          /* tp_hash */
3123     0,                                          /* tp_call */
3124     0,                                          /* tp_str */
3125     PyObject_GenericGetAttr,                    /* tp_getattro */
3126     0,                                          /* tp_setattro */
3127     0,                                          /* tp_as_buffer */
3128     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
3129         Py_TPFLAGS_CHECKTYPES,                  /* tp_flags */
3130     0,                                          /* tp_doc */
3131     (traverseproc)dictview_traverse,            /* tp_traverse */
3132     0,                                          /* tp_clear */
3133     dictview_richcompare,                       /* tp_richcompare */
3134     0,                                          /* tp_weaklistoffset */
3135     (getiterfunc)dictitems_iter,                /* tp_iter */
3136     0,                                          /* tp_iternext */
3137     dictitems_methods,                          /* tp_methods */
3138     0,
3139 };
3140 
3141 static PyObject *
dictitems_new(PyObject * dict)3142 dictitems_new(PyObject *dict)
3143 {
3144     return dictview_new(dict, &PyDictItems_Type);
3145 }
3146 
3147 /*** dict_values ***/
3148 
3149 static PyObject *
dictvalues_iter(dictviewobject * dv)3150 dictvalues_iter(dictviewobject *dv)
3151 {
3152     if (dv->dv_dict == NULL) {
3153         Py_RETURN_NONE;
3154     }
3155     return dictiter_new(dv->dv_dict, &PyDictIterValue_Type);
3156 }
3157 
3158 static PySequenceMethods dictvalues_as_sequence = {
3159     (lenfunc)dictview_len,              /* sq_length */
3160     0,                                  /* sq_concat */
3161     0,                                  /* sq_repeat */
3162     0,                                  /* sq_item */
3163     0,                                  /* sq_slice */
3164     0,                                  /* sq_ass_item */
3165     0,                                  /* sq_ass_slice */
3166     (objobjproc)0,                      /* sq_contains */
3167 };
3168 
3169 static PyMethodDef dictvalues_methods[] = {
3170     {NULL,              NULL}           /* sentinel */
3171 };
3172 
3173 PyTypeObject PyDictValues_Type = {
3174     PyVarObject_HEAD_INIT(&PyType_Type, 0)
3175     "dict_values",                              /* tp_name */
3176     sizeof(dictviewobject),                     /* tp_basicsize */
3177     0,                                          /* tp_itemsize */
3178     /* methods */
3179     (destructor)dictview_dealloc,               /* tp_dealloc */
3180     0,                                          /* tp_print */
3181     0,                                          /* tp_getattr */
3182     0,                                          /* tp_setattr */
3183     0,                                          /* tp_reserved */
3184     (reprfunc)dictview_repr,                    /* tp_repr */
3185     0,                                          /* tp_as_number */
3186     &dictvalues_as_sequence,                    /* tp_as_sequence */
3187     0,                                          /* tp_as_mapping */
3188     0,                                          /* tp_hash */
3189     0,                                          /* tp_call */
3190     0,                                          /* tp_str */
3191     PyObject_GenericGetAttr,                    /* tp_getattro */
3192     0,                                          /* tp_setattro */
3193     0,                                          /* tp_as_buffer */
3194     Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
3195     0,                                          /* tp_doc */
3196     (traverseproc)dictview_traverse,            /* tp_traverse */
3197     0,                                          /* tp_clear */
3198     0,                                          /* tp_richcompare */
3199     0,                                          /* tp_weaklistoffset */
3200     (getiterfunc)dictvalues_iter,               /* tp_iter */
3201     0,                                          /* tp_iternext */
3202     dictvalues_methods,                         /* tp_methods */
3203     0,
3204 };
3205 
3206 static PyObject *
dictvalues_new(PyObject * dict)3207 dictvalues_new(PyObject *dict)
3208 {
3209     return dictview_new(dict, &PyDictValues_Type);
3210 }
3211