
JVM Anatomy: Compressed References


Aleksey Shipilёv

It says that the heap starts at the 0x0000000080000000 mark, closer to 2 GB. Now the reference field only takes 4 bytes, and the instance size is down to 16 bytes. [3]

In generated code, the access is still in the same form, because the hardware itself just accepts the 32-bit pointer and extends it to 64 bits when doing the access.

Can we encode larger heaps into 32 bits? The easiest way is to bit-shift-right the reference bits, and this gives us 2^(32+shift) bytes of heap encodeable into 32 bits. With the default object alignment of 8 bytes, the shift is 3 (2^3 = 8), therefore we can represent references into a 2^35-byte = 32 GB heap. Again, the same problem with the heap base address surfaces here and makes the actual limit a bit lower.

In Hotspot, this mode is called "zero-based compressed oops".

The access via the reference is now a bit more complicated. Getting the field o.x involves executing mov 0xc(%r12,%r11,8),%eax: "Take the reference from %r11, multiply it by 8, add the heap base from %r12, and that would be the object that you can now read at offset 0xc; put that value into %eax, please." The fact that %r12 is zero in this mode can be used by the code generator in other places too. To simplify the internal implementation, Hotspot usually carries only uncompressed references in registers, and that is why the access to field o is just the plain access from this (which is in %rsi) at offset 0xc.

But zero-based compressed references still rely on the assumption that the heap is mapped at lower addresses.
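The arithmetic above can be sketched in plain Java; this is an illustrative calculation, not HotSpot code, and the names are made up for the example:

```java
public class CompressedOopsMath {
    public static void main(String[] args) {
        // Object alignment of 8 bytes means the low 3 bits of every
        // object address are zero, so they can be shifted out.
        int alignment = 8;
        int shift = Integer.numberOfTrailingZeros(alignment); // 3

        // A 32-bit compressed reference, shifted left by 'shift' on decode,
        // can address 2^(32+shift) bytes of heap.
        long encodeable = 1L << (32 + shift);
        System.out.println("Max heap with 8-byte alignment: "
                + (encodeable >> 30) + " GB"); // prints 32 GB

        // Decoding mirrors mov 0xc(%r12,%r11,8),%eax:
        // address = heapBase + (compressedRef << shift)
        long heapBase = 0;                 // zero-based mode: %r12 == 0
        long compressedRef = 0x12345678L;  // hypothetical 32-bit reference
        long address = heapBase + (compressedRef << shift);
        System.out.printf("Decoded address: 0x%x%n", address);
    }
}
```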
I will say this again: the same dataset does not fit anymore just because we requested an excessively large heap size, even though we do not even use it. If we figure out the minimum heap size required to fit the dataset beyond 32 GB, we see that we used to take ~26 GB for the dataset, and now we are taking ~35 GB: almost a 40% increase!

Compressed references are a nice optimization that keeps the memory footprint at bay for reference-heavy workloads.
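A back-of-the-envelope sketch of where the increase comes from; the 12-byte and 16-byte header sizes are typical HotSpot values assumed for illustration:

```java
public class FootprintSketch {
    public static void main(String[] args) {
        // Assumed typical HotSpot layout for a one-reference-field object:
        // compressed oops:   12-byte header + 4-byte reference
        // uncompressed oops: 16-byte header + 8-byte reference
        long compressedInstance = align(12 + 4, 8);   // 16 bytes
        long uncompressedInstance = align(16 + 8, 8); // 24 bytes

        System.out.println("compressed:   " + compressedInstance + " bytes");
        System.out.println("uncompressed: " + uncompressedInstance + " bytes");
        System.out.printf("per-object increase: %.0f%%%n",
                100.0 * (uncompressedInstance - compressedInstance)
                      / compressedInstance); // prints 50%
    }

    // Round size up to the given power-of-two alignment.
    static long align(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }
}
```

The per-object increase here is 50%; the measured dataset grows less than that (~26 GB to ~35 GB, about 40%) because not all of the footprint consists of reference-heavy objects.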
