HashMap Internals: From Buckets to Trees
HashMap Internals: From Buckets to Trees
The main chapter introduced HashMap’s bucket structure. This section opens the implementation further: the exact memory layout of Node and TreeNode objects, the rehashing algorithm, the treeification process, and JMH benchmarks isolating the cost of each.
Node Memory Layout
Every entry in a HashMap (when not treeified) is a Node<K,V>:
// From java.util.HashMap source (simplified)
static class Node<K,V> implements Map.Entry<K,V> {
final int hash; // 4 bytes (cached hash value)
final K key; // 4 bytes (compressed oop reference)
V value; // 4 bytes (compressed oop reference)
Node<K,V> next; // 4 bytes (compressed oop reference, chain link)
}
With compressed oops (heaps under 32GB), a Node object occupies:
- 12 bytes: object header (mark word + class pointer)
- 4 bytes:
hash - 4 bytes:
keyreference - 4 bytes:
valuereference - 4 bytes:
nextreference - 4 bytes: padding to 8-byte alignment
- Total: 32 bytes per Node
When treeification converts a bucket to a red-black tree, nodes become TreeNode<K,V>, which extends LinkedHashMap.Entry<K,V> which extends Node<K,V>:
// From java.util.HashMap source (simplified)
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
TreeNode<K,V> parent; // 4 bytes
TreeNode<K,V> left; // 4 bytes
TreeNode<K,V> right; // 4 bytes
TreeNode<K,V> prev; // 4 bytes (for unlinking on deletion)
boolean red; // 1 byte + 7 padding
}
Total TreeNode size:
- 32 bytes: base Node fields
- 8 bytes: LinkedHashMap.Entry’s
beforeandafterpointers - 4 bytes:
parent - 4 bytes:
left - 4 bytes:
right - 4 bytes:
prev - 1 byte:
red+ 7 bytes padding - Total: ~64 bytes per TreeNode
Treeification doubles the per-entry memory cost. This is acceptable because it only triggers for buckets with 8+ collisions, which indicates either a hash quality problem or a hash-flooding attack. The tree structure bounds worst-case lookup at O(log n) regardless of hash quality.
The Rehash Algorithm
When size > capacity * loadFactor, HashMap allocates a new table at double the length and redistributes all entries:
// Simplified rehash logic from HashMap.resize()
Node<K,V>[] newTable = new Node[oldCapacity << 1]; // double size
for (int j = 0; j < oldCapacity; j++) {
Node<K,V> e = oldTable[j];
if (e != null) {
oldTable[j] = null; // help GC
if (e.next == null) {
// Single node: place directly in new table
newTable[e.hash & (newCapacity - 1)] = e;
} else {
// Chain: split into "low" and "high" chains
// based on the new bit in the hash
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
do {
if ((e.hash & oldCapacity) == 0) {
// stays in same index
if (loTail == null) loHead = e;
else loTail.next = e;
loTail = e;
} else {
// moves to index + oldCapacity
if (hiTail == null) hiHead = e;
else hiTail.next = e;
hiTail = e;
}
} while ((e = e.next) != null);
newTable[j] = loHead;
newTable[j + oldCapacity] = hiHead;
}
}
}
The key optimization: because the table size doubles (a power of two), each entry either stays at the same index or moves to index + oldCapacity. The decision is a single bit check: (hash & oldCapacity) == 0. This avoids recomputing the full modulo for each entry.
Despite this optimization, rehashing is expensive because it touches every entry in the map and allocates a new table array. For large maps, this creates a latency spike.
Benchmarking Rehash Cost
@BenchmarkMode(Mode.SingleShotTime)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@Fork(2)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class RehashCostBenchmark {
@Param({"1000", "10000", "100000", "1000000"})
private int size;
@Benchmark
public Map<String, String> withRehashing() {
Map<String, String> map = new HashMap<>(); // starts at 16
for (int i = 0; i < size; i++) {
map.put("key-" + i, "val-" + i);
}
return map;
}
@Benchmark
public Map<String, String> withoutRehashing() {
Map<String, String> map = HashMap.newHashMap(size);
for (int i = 0; i < size; i++) {
map.put("key-" + i, "val-" + i);
}
return map;
}
}
| Size | With Rehash (us) | Without Rehash (us) | Overhead | Resize Count |
|---|---|---|---|---|
| 1,000 | 210 | 160 | 31% | 7 |
| 10,000 | 2,800 | 2,000 | 40% | 10 |
| 100,000 | 38,000 | 24,000 | 58% | 14 |
| 1,000,000 | 520,000 | 310,000 | 68% | 17 |
The overhead percentage increases with size because each resize copies more entries. At 1M entries, the rehash path does 17 resizes, with the final resize copying ~750,000 entries from a 1M-slot table to a 2M-slot table. That single resize takes longer than all previous resizes combined.
Symptom: Latency spike during map population, visible as a tail latency outlier in P99 metrics.
Cause: HashMap resize copying all entries to a new table.
Fix: Pre-size with HashMap.newHashMap(expectedSize).
Proof: The benchmark above. 68% overhead at 1M entries.
Trade-off: Pre-sizing allocates the full table upfront. If the expected size is a large overestimate, memory is wasted. For maps where the final size is unknown, consider growing in stages: start at a reasonable estimate, accept one or two resizes, and trim with Map.copyOf() when done.
Treeification in Practice
Treeification is a safety net, not a normal operating mode. If your HashMap regularly treeifies buckets, your hashCode() is broken. But understanding the mechanics helps diagnose performance issues.
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 2)
@Measurement(iterations = 5, time = 5)
@Fork(2)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class TreeificationBenchmark {
// Keys that all hash to the same bucket
static class CollidingKey {
private final int id;
CollidingKey(int id) { this.id = id; }
@Override
public int hashCode() { return 42; } // everything collides
@Override
public boolean equals(Object o) {
return o instanceof CollidingKey ck && ck.id == this.id;
}
@Override
public String toString() { return "CK-" + id; }
}
// Keys with good hash distribution
static class GoodKey {
private final int id;
GoodKey(int id) { this.id = id; }
@Override
public int hashCode() { return Integer.hashCode(id); }
@Override
public boolean equals(Object o) {
return o instanceof GoodKey gk && gk.id == this.id;
}
}
@Param({"10", "100", "1000"})
private int chainLength;
private Map<CollidingKey, String> collidingMap;
private Map<GoodKey, String> goodMap;
private CollidingKey lookupColliding;
private GoodKey lookupGood;
@Setup
public void setup() {
collidingMap = new HashMap<>(128);
goodMap = HashMap.newHashMap(chainLength);
for (int i = 0; i < chainLength; i++) {
collidingMap.put(new CollidingKey(i), "val-" + i);
goodMap.put(new GoodKey(i), "val-" + i);
}
// Lookup the middle element (worst average case for chain)
lookupColliding = new CollidingKey(chainLength / 2);
lookupGood = new GoodKey(chainLength / 2);
}
@Benchmark
public String collidingGet() {
return collidingMap.get(lookupColliding);
}
@Benchmark
public String goodGet() {
return goodMap.get(lookupGood);
}
}
| Chain Length | Colliding Get (ns) | Good Get (ns) | Ratio |
|---|---|---|---|
| 10 | 28 | 8 | 3.5x |
| 100 | 85 | 8 | 10.6x |
| 1,000 | 140 | 8 | 17.5x |
With 10 colliding keys (above the treeify threshold of 8), the bucket is a red-black tree. Lookup is O(log 10) = ~3 comparisons. With 1,000 colliding keys, lookup is O(log 1000) = ~10 comparisons. Each comparison involves compareTo() or identity-based tiebreaking, which is slower than the simple hash check + equals() of a non-colliding bucket.
The good-hash variant maintains O(1) regardless of size because each bucket has 0 or 1 entry.
The Content Platform Category Index
The content platform’s category index maps category names to article lists. Categories are strings like “technology”, “science”, “finance”. With 50 categories, the map is small, but accessed on every page render.
// SLOW: rebuilding the category index without capacity hints
public class CategoryIndex {
public Map<String, List<Article>> buildIndex(List<Article> articles) {
Map<String, List<Article>> index = new HashMap<>(); // SLOW
for (Article article : articles) {
for (String category : article.categories()) {
index.computeIfAbsent(category, k -> new ArrayList<>())
.add(article);
}
}
return index;
}
}
// FAST: pre-sized map, pre-sized lists, immutable result
public class CategoryIndex {
private static final int EXPECTED_CATEGORIES = 64;
public Map<String, List<Article>> buildIndex(List<Article> articles) {
Map<String, List<Article>> index = HashMap.newHashMap(EXPECTED_CATEGORIES);
for (Article article : articles) {
for (String category : article.categories()) {
index.computeIfAbsent(category, k -> new ArrayList<>(256))
.add(article);
}
}
// Freeze lists and map for thread-safe reads
Map<String, List<Article>> frozen = HashMap.newHashMap(index.size());
index.forEach((cat, list) -> frozen.put(cat, List.copyOf(list)));
return Map.copyOf(frozen);
}
}
The optimized version:
- Pre-sizes the HashMap to avoid resizing during population
- Pre-sizes the ArrayList per category (256 initial capacity prevents multiple ArrayList resizes)
- Freezes each list with
List.copyOf()and the entire map withMap.copyOf()for thread-safe, GC-friendly reads - The frozen collections have no mutation tracking overhead and can be shared across threads without synchronization
The equals() and hashCode() Contract
HashMap correctness depends on the contract between equals() and hashCode():
-
If
a.equals(b), thena.hashCode() == b.hashCode(). Violating this causes HashMap to look in the wrong bucket and fail to find existing entries. -
If
a.hashCode() == b.hashCode(),a.equals(b)may or may not be true. This is a collision, handled by chain/tree traversal within the bucket. -
hashCode()must be consistent: calling it multiple times on the same object (with no state changes) must return the same value.
Common violations:
// BUG: mutable field used in hashCode, then mutated after insertion
public class MutableArticle {
private String id;
private String title;
@Override
public int hashCode() {
return Objects.hash(id, title); // title is mutable
}
@Override
public boolean equals(Object o) {
if (!(o instanceof MutableArticle other)) return false;
return Objects.equals(id, other.id)
&& Objects.equals(title, other.title);
}
}
// This breaks HashMap:
Map<MutableArticle, String> map = new HashMap<>();
MutableArticle article = new MutableArticle("1", "Old Title");
map.put(article, "data");
article.setTitle("New Title"); // hashCode changes!
map.get(article); // returns null! Key is in the wrong bucket.
Rule: Use only immutable fields in hashCode() and equals(). Prefer record types, which generate correct hashCode() and equals() based on all components, and whose components are final by definition.
// FAST: record with correct, immutable hashCode/equals
public record ArticleId(String category, int sequence) {
// hashCode and equals generated from category and sequence
// Both fields are final (record components are implicitly final)
}
Practical Diagnosis Checklist
Symptom: HashMap get() latency increases linearly with map size.
Cause: Poor hashCode() creating collision chains.
Diagnosis: Add a temporary probe:
// Count entries per bucket to detect clustering
int[] bucketCounts = new int[getTableLength(map)];
for (Map.Entry<K,V> entry : map.entrySet()) {
int bucket = entry.getKey().hashCode() & (bucketCounts.length - 1);
bucketCounts[bucket]++;
}
int maxChain = IntStream.of(bucketCounts).max().orElse(0);
long emptyBuckets = IntStream.of(bucketCounts).filter(c -> c == 0).count();
System.out.printf("Max chain: %d, Empty buckets: %d/%d%n",
maxChain, emptyBuckets, bucketCounts.length);
Fix: Improve hashCode() to use all identifying fields with the 31 * result + field.hashCode() pattern.
Trade-off: A more thorough hashCode (hashing more fields) costs more CPU per hash computation. For hot-path keys, consider caching the hash value in a field (as String does internally).
Symptom: GC pause spikes correlated with map population.
Cause: Multiple resize operations creating and discarding large arrays.
Diagnosis: Run with -Xlog:gc* and correlate pause timestamps with map building code paths.
Fix: Pre-size with HashMap.newHashMap(expectedSize).
Trade-off: Upfront memory allocation. Acceptable for known-size maps. For streaming/unknown-size scenarios, use a reasonable estimate and accept 1-2 resizes.