HashMap Internals: From Buckets to Trees

The main chapter introduced HashMap’s bucket structure. This section opens the implementation further: the exact memory layout of Node and TreeNode objects, the rehashing algorithm, the treeification process, and JMH benchmarks isolating the cost of each.

Node Memory Layout

Every entry in a HashMap (when not treeified) is a Node<K,V>:

// From java.util.HashMap source (simplified)
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;      // 4 bytes (cached hash value)
    final K key;          // 4 bytes (compressed oop reference)
    V value;              // 4 bytes (compressed oop reference)
    Node<K,V> next;       // 4 bytes (compressed oop reference, chain link)
}

With compressed oops (heaps under 32GB), a Node object occupies:

12 bytes: object header (mark word + class pointer)
4 bytes: hash
4 bytes: key reference
4 bytes: value reference
4 bytes: next reference
4 bytes: padding to 8-byte alignment
Total: 32 bytes per Node

When treeification converts a bucket to a red-black tree, nodes become TreeNode<K,V>, which extends LinkedHashMap.Entry<K,V> which extends Node<K,V>:

// From java.util.HashMap source (simplified)
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;   // 4 bytes
    TreeNode<K,V> left;     // 4 bytes
    TreeNode<K,V> right;    // 4 bytes
    TreeNode<K,V> prev;     // 4 bytes (for unlinking on deletion)
    boolean red;             // 1 byte + 7 padding
}

Total TreeNode size:

32 bytes: base Node fields
8 bytes: LinkedHashMap.Entry’s before and after pointers
4 bytes: parent
4 bytes: left
4 bytes: right
4 bytes: prev
1 byte: red + 7 bytes padding
Total: ~64 bytes per TreeNode

Treeification doubles the per-entry memory cost. This is acceptable because it only triggers for buckets with 8+ collisions, which indicates either a hash quality problem or a hash-flooding attack. The tree structure bounds worst-case lookup at O(log n) regardless of hash quality.

The Rehash Algorithm

When size > capacity * loadFactor, HashMap allocates a new table at double the length and redistributes all entries:

// Simplified rehash logic from HashMap.resize()
Node<K,V>[] newTable = new Node[oldCapacity << 1]; // double size
for (int j = 0; j < oldCapacity; j++) {
    Node<K,V> e = oldTable[j];
    if (e != null) {
        oldTable[j] = null; // help GC
        if (e.next == null) {
            // Single node: place directly in new table
            newTable[e.hash & (newCapacity - 1)] = e;
        } else {
            // Chain: split into "low" and "high" chains
            // based on the new bit in the hash
            Node<K,V> loHead = null, loTail = null;
            Node<K,V> hiHead = null, hiTail = null;
            do {
                if ((e.hash & oldCapacity) == 0) {
                    // stays in same index
                    if (loTail == null) loHead = e;
                    else loTail.next = e;
                    loTail = e;
                } else {
                    // moves to index + oldCapacity
                    if (hiTail == null) hiHead = e;
                    else hiTail.next = e;
                    hiTail = e;
                }
            } while ((e = e.next) != null);
            newTable[j] = loHead;
            newTable[j + oldCapacity] = hiHead;
        }
    }
}

The key optimization: because the table size doubles (a power of two), each entry either stays at the same index or moves to index + oldCapacity. The decision is a single bit check: (hash & oldCapacity) == 0. This avoids recomputing the full modulo for each entry.

Despite this optimization, rehashing is expensive because it touches every entry in the map and allocates a new table array. For large maps, this creates a latency spike.

Benchmarking Rehash Cost

@BenchmarkMode(Mode.SingleShotTime)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@Fork(2)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class RehashCostBenchmark {

    @Param({"1000", "10000", "100000", "1000000"})
    private int size;

    @Benchmark
    public Map<String, String> withRehashing() {
        Map<String, String> map = new HashMap<>(); // starts at 16
        for (int i = 0; i < size; i++) {
            map.put("key-" + i, "val-" + i);
        }
        return map;
    }

    @Benchmark
    public Map<String, String> withoutRehashing() {
        Map<String, String> map = HashMap.newHashMap(size);
        for (int i = 0; i < size; i++) {
            map.put("key-" + i, "val-" + i);
        }
        return map;
    }
}

Size	With Rehash (us)	Without Rehash (us)	Overhead	Resize Count
1,000	210	160	31%	7
10,000	2,800	2,000	40%	10
100,000	38,000	24,000	58%	14
1,000,000	520,000	310,000	68%	17

The overhead percentage increases with size because each resize copies more entries. At 1M entries, the rehash path does 17 resizes, with the final resize copying ~750,000 entries from a 1M-slot table to a 2M-slot table. That single resize takes longer than all previous resizes combined.

Symptom: Latency spike during map population, visible as a tail latency outlier in P99 metrics. Cause: HashMap resize copying all entries to a new table. Fix: Pre-size with HashMap.newHashMap(expectedSize). Proof: The benchmark above. 68% overhead at 1M entries. Trade-off: Pre-sizing allocates the full table upfront. If the expected size is a large overestimate, memory is wasted. For maps where the final size is unknown, consider growing in stages: start at a reasonable estimate, accept one or two resizes, and trim with Map.copyOf() when done.

Treeification in Practice

Treeification is a safety net, not a normal operating mode. If your HashMap regularly treeifies buckets, your hashCode() is broken. But understanding the mechanics helps diagnose performance issues.

@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 2)
@Measurement(iterations = 5, time = 5)
@Fork(2)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class TreeificationBenchmark {

    // Keys that all hash to the same bucket
    static class CollidingKey {
        private final int id;

        CollidingKey(int id) { this.id = id; }

        @Override
        public int hashCode() { return 42; } // everything collides

        @Override
        public boolean equals(Object o) {
            return o instanceof CollidingKey ck && ck.id == this.id;
        }

        @Override
        public String toString() { return "CK-" + id; }
    }

    // Keys with good hash distribution
    static class GoodKey {
        private final int id;

        GoodKey(int id) { this.id = id; }

        @Override
        public int hashCode() { return Integer.hashCode(id); }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey gk && gk.id == this.id;
        }
    }

    @Param({"10", "100", "1000"})
    private int chainLength;

    private Map<CollidingKey, String> collidingMap;
    private Map<GoodKey, String> goodMap;
    private CollidingKey lookupColliding;
    private GoodKey lookupGood;

    @Setup
    public void setup() {
        collidingMap = new HashMap<>(128);
        goodMap = HashMap.newHashMap(chainLength);

        for (int i = 0; i < chainLength; i++) {
            collidingMap.put(new CollidingKey(i), "val-" + i);
            goodMap.put(new GoodKey(i), "val-" + i);
        }

        // Lookup the middle element (worst average case for chain)
        lookupColliding = new CollidingKey(chainLength / 2);
        lookupGood = new GoodKey(chainLength / 2);
    }

    @Benchmark
    public String collidingGet() {
        return collidingMap.get(lookupColliding);
    }

    @Benchmark
    public String goodGet() {
        return goodMap.get(lookupGood);
    }
}

Chain Length	Colliding Get (ns)	Good Get (ns)	Ratio
10	28	8	3.5x
100	85	8	10.6x
1,000	140	8	17.5x

With 10 colliding keys (above the treeify threshold of 8), the bucket is a red-black tree. Lookup is O(log 10) = ~3 comparisons. With 1,000 colliding keys, lookup is O(log 1000) = ~10 comparisons. Each comparison involves compareTo() or identity-based tiebreaking, which is slower than the simple hash check + equals() of a non-colliding bucket.

The good-hash variant maintains O(1) regardless of size because each bucket has 0 or 1 entry.

The Content Platform Category Index

The content platform’s category index maps category names to article lists. Categories are strings like “technology”, “science”, “finance”. With 50 categories, the map is small, but accessed on every page render.

// SLOW: rebuilding the category index without capacity hints
public class CategoryIndex {

    public Map<String, List<Article>> buildIndex(List<Article> articles) {
        Map<String, List<Article>> index = new HashMap<>(); // SLOW
        for (Article article : articles) {
            for (String category : article.categories()) {
                index.computeIfAbsent(category, k -> new ArrayList<>())
                     .add(article);
            }
        }
        return index;
    }
}

// FAST: pre-sized map, pre-sized lists, immutable result
public class CategoryIndex {

    private static final int EXPECTED_CATEGORIES = 64;

    public Map<String, List<Article>> buildIndex(List<Article> articles) {
        Map<String, List<Article>> index = HashMap.newHashMap(EXPECTED_CATEGORIES);
        for (Article article : articles) {
            for (String category : article.categories()) {
                index.computeIfAbsent(category, k -> new ArrayList<>(256))
                     .add(article);
            }
        }
        // Freeze lists and map for thread-safe reads
        Map<String, List<Article>> frozen = HashMap.newHashMap(index.size());
        index.forEach((cat, list) -> frozen.put(cat, List.copyOf(list)));
        return Map.copyOf(frozen);
    }
}

The optimized version:

Pre-sizes the HashMap to avoid resizing during population
Pre-sizes the ArrayList per category (256 initial capacity prevents multiple ArrayList resizes)
Freezes each list with List.copyOf() and the entire map with Map.copyOf() for thread-safe, GC-friendly reads
The frozen collections have no mutation tracking overhead and can be shared across threads without synchronization

The equals() and hashCode() Contract

HashMap correctness depends on the contract between equals() and hashCode():

If a.equals(b), then a.hashCode() == b.hashCode(). Violating this causes HashMap to look in the wrong bucket and fail to find existing entries.
If a.hashCode() == b.hashCode(), a.equals(b) may or may not be true. This is a collision, handled by chain/tree traversal within the bucket.
hashCode() must be consistent: calling it multiple times on the same object (with no state changes) must return the same value.

Common violations:

// BUG: mutable field used in hashCode, then mutated after insertion
public class MutableArticle {
    private String id;
    private String title;

    @Override
    public int hashCode() {
        return Objects.hash(id, title); // title is mutable
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MutableArticle other)) return false;
        return Objects.equals(id, other.id)
            && Objects.equals(title, other.title);
    }
}

// This breaks HashMap:
Map<MutableArticle, String> map = new HashMap<>();
MutableArticle article = new MutableArticle("1", "Old Title");
map.put(article, "data");

article.setTitle("New Title"); // hashCode changes!
map.get(article); // returns null! Key is in the wrong bucket.

Rule: Use only immutable fields in hashCode() and equals(). Prefer record types, which generate correct hashCode() and equals() based on all components, and whose components are final by definition.

// FAST: record with correct, immutable hashCode/equals
public record ArticleId(String category, int sequence) {
    // hashCode and equals generated from category and sequence
    // Both fields are final (record components are implicitly final)
}

Practical Diagnosis Checklist

Symptom: HashMap get() latency increases linearly with map size. Cause: Poor hashCode() creating collision chains. Diagnosis: Add a temporary probe:

// Count entries per bucket to detect clustering
int[] bucketCounts = new int[getTableLength(map)];
for (Map.Entry<K,V> entry : map.entrySet()) {
    int bucket = entry.getKey().hashCode() & (bucketCounts.length - 1);
    bucketCounts[bucket]++;
}
int maxChain = IntStream.of(bucketCounts).max().orElse(0);
long emptyBuckets = IntStream.of(bucketCounts).filter(c -> c == 0).count();
System.out.printf("Max chain: %d, Empty buckets: %d/%d%n",
    maxChain, emptyBuckets, bucketCounts.length);

Fix: Improve hashCode() to use all identifying fields with the 31 * result + field.hashCode() pattern. Trade-off: A more thorough hashCode (hashing more fields) costs more CPU per hash computation. For hot-path keys, consider caching the hash value in a field (as String does internally).

Symptom: GC pause spikes correlated with map population. Cause: Multiple resize operations creating and discarding large arrays. Diagnosis: Run with -Xlog:gc* and correlate pause timestamps with map building code paths. Fix: Pre-size with HashMap.newHashMap(expectedSize). Trade-off: Upfront memory allocation. Acceptable for known-size maps. For streaming/unknown-size scenarios, use a reasonable estimate and accept 1-2 resizes.