Monitor inflation and the monitor pool
--------------------------------------

Every Java object conceptually has an associated monitor that is used to implement `synchronized` blocks as well as `Object.wait()` and related functionality. Initially, the state of this monitor is represented by the 32-bit `LockWord` in every object. This same word is also used to represent the system hash code if one has been computed for the object. See `lock_word.h` for details.

The `LockWord` suffices for representing the state of unlocked objects with no waiters, whether or not they have a system hash code. It can also represent a locked (via a `synchronized` block) object that has neither a system hash code, nor an `Object.wait()` waiter. In all other cases, the lock must be "inflated". In that case, the `LockWord` contents reflect the "fat lock" state, and it contains primarily a `MonitorId`. We also inflate significantly contended locks, since the lock word itself does not have space for a wait queue or the like.

(With the Concurrent Copying garbage collector, the `LockWord` supports another "forwarding address" state. In that case the above information is instead contained in the `LockWord` for the to-space objects referenced by the forwarding address.)

MonitorIds
----------

A `MonitorId` is logically a pointer to a `Monitor` structure. The `Monitor` data structure is allocated when needed, and holds all the data needed to support any hashcode or `Object` synchronization operation provided by Java.

On 32-bit implementations, it is essentially a pointer, except for some shifting and masking to avoid collisions with a few bits in the lock word required for other purposes. In a 64-bit environment, the situation is more complicated, since we don't have enough bits in the lock word to store a 64-bit pointer. In the 64-bit case, `Monitor`s are stored in "chunks" of 4K bytes. The bottom bits of a `MonitorId` are the index of the `Monitor` within its chunk.
The remaining bits are used to find the correct chunk.

Continuing with the 64-bit case, chunk addresses are stored in the `monitor_chunks_` data structure, which is logically a two-dimensional, vaguely triangular, array of pointers to chunks. Each row is allocated separately, and is twice as long as the previous one. (Thus the array is not really triangular.) The high bits of a `MonitorId` are interpreted as row and column indices into this array. Both the second-level "rows" of this array and the chunks themselves are allocated on demand. By making the rows grow exponentially in size, and keeping a free list of recycled `Monitor` slots, we can accommodate a large number of `Monitor`s without ever allocating more than twice the index size that was actually required. (This doesn't count the top-level index needed to find the rows, but exponential growth of the rows makes that tiny.) There is also never any need to move `Monitor` objects, which would significantly complicate the logic.

The above data structure is distinct from the `MonitorList` data structure, which is used simply to enumerate all monitors. (It might be possible to save a bit of space in the 64-bit case, and have the `monitor_chunks_` data structure handle this as well.)

Monitor inflation
-----------------

Monitors are inflated, and `Monitor` structs are allocated on demand, when an operation is performed that cannot be accommodated with just the lock word. This normally happens when we need to store a hash code in a locked object, when there is lock contention, or when `Object.wait` is executed. (Notification on an object with no waiters is trivial and does not require inflation.) In the 64-bit case, the `monitor_chunks_` data structure may also need to be extended at this time to allow mapping of an additional `MonitorId`.

If we have to inflate a lock that is currently held by another thread B, we must suspend B while updating the data structure representing the lock B holds.
When the lock is later released by B, it will notice the change and operate on the fat lock representation instead.

Monitor deflation
-----------------

Monitors are deflated, and the `Monitor` structs associated with deflated monitors are reclaimed as part of `Heap::Trim()` by invoking `SweepMonitorList()` with an `IsMarkedVisitor` that deflates unheld monitors with no waiters. This is done with all other threads suspended.

Monitors are also reclaimed, again via `SweepMonitorList()`, in `SweepSystemWeaks()`, if the corresponding object was not marked.

(There is one other use of monitor deflation from `ImageWriter`. That does not maintain `MonitorList`. It relies on the fact that the dex2oat process is single-threaded, and the heap is about to be discarded.)
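
Illustrative sketch: decoding a 64-bit `MonitorId`
--------------------------------------------------

The 64-bit `MonitorId` layout described under "MonitorIds" can be sketched roughly as below. This is a minimal illustration, not the runtime's actual code: the field widths `kIndexBits` and `kColumnBits`, the constants `kInitialRowLength` and `kMaxRows`, and the names `DecodedMonitorId`, `Decode`, and `RowLength` are all hypothetical and chosen only to make the arithmetic easy to follow.

```cpp
#include <cstddef>
#include <cstdint>

// All constants below are hypothetical; they are not the runtime's values.
constexpr uint32_t kIndexBits = 6;           // low bits: slot index within a chunk
constexpr uint32_t kColumnBits = 12;         // middle bits: column within a row
constexpr uint32_t kInitialRowLength = 256;  // number of chunk pointers in row 0
constexpr uint32_t kMaxRows = 5;             // rows 0..4 hold 256..4096 chunk pointers

struct DecodedMonitorId {
  uint32_t row;             // which separately allocated row of chunk pointers
  uint32_t column;          // which chunk pointer within that row
  uint32_t index_in_chunk;  // which Monitor slot within the chunk
};

// Splits an id into its three fields: the bottom bits select the slot within
// a chunk, and the remaining high bits select the chunk via (row, column).
constexpr DecodedMonitorId Decode(uint32_t monitor_id) {
  return DecodedMonitorId{
      monitor_id >> (kIndexBits + kColumnBits),
      (monitor_id >> kIndexBits) & ((1u << kColumnBits) - 1u),
      monitor_id & ((1u << kIndexBits) - 1u)};
}

// Each row is twice as long as the previous one, so a row only needs to be
// allocated once all shorter rows are in use.
constexpr size_t RowLength(uint32_t row) {
  return static_cast<size_t>(kInitialRowLength) << row;
}
```

With these assumed widths, an id built as `(1u << 18) | (3u << 6) | 5u` decodes to row 1, column 3, slot 5, and row 1 has room for `RowLength(1)` chunk pointers. Because a `Monitor` is located purely by these indices, growing the table only ever appends rows or chunks and never moves an existing `Monitor`.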