Heaps
SummaryCovers binary heap properties, array representation, sift-up/sift-down operations,...
Covers binary heap properties, array representation, sift-up/sift-down operations,...
Covers binary heap properties, array representation, sift-up/sift-down operations, heapify in O(n), priority queue applications, and top-K element patterns.
Heaps
A heap gives you instant access to the most extreme element in a collection — the smallest or the largest — while still supporting efficient insertions and removals. Every time you need to repeatedly grab the “best” item from a changing dataset, a heap is your weapon of choice. Priority queues, scheduling algorithms, graph shortest-path computations, and streaming analytics all lean on heaps as their foundational data structure.
The Heap Property
A binary heap is a complete binary tree that satisfies one ordering invariant:
- Max-Heap: every parent node holds a value greater than or equal to both its children.
- Min-Heap: every parent node holds a value less than or equal to both its children.
“Complete” means every level is fully filled except possibly the last, which fills left to right. This completeness constraint is what makes heaps so powerful — it guarantees the tree stays balanced without any rotations, and it enables a remarkably elegant array representation.
Array Representation
Because a binary heap is always complete, you can store it in a flat array with zero pointer overhead:
| Relationship | Formula |
|---|---|
Left child of node at index i | 2 * i + 1 |
Right child of node at index i | 2 * i + 2 |
Parent of node at index i | (i - 1) / 2 |
The root sits at index 0. Children of index 0 are at indices 1 and 2. Children of index 1 are at indices 3 and 4. The pattern continues predictably.
This layout eliminates pointer chasing entirely. Adjacent heap levels occupy contiguous memory, which makes traversal cache-friendly — a real performance win on modern hardware where a cache miss costs 100x more than an arithmetic operation. No left and right references, no Node objects, no garbage collector pressure. One array does everything.
Core Operations
Insert (Sift-Up)
To insert a new element:
- Place it at the end of the array (maintaining completeness).
- Sift up: compare it with its parent, swap if the heap property is violated, and repeat until the element settles into its correct position.
Time complexity: O(log n) — the element travels at most from a leaf to the root, which is the height of a complete binary tree.
Extract Min/Max (Sift-Down)
To remove the root (the minimum in a min-heap, maximum in a max-heap):
- Swap the root with the last element in the array.
- Remove the last element (it was the old root).
- Sift down: compare the new root with its children, swap with the smaller child (min-heap), and repeat.
Time complexity: O(log n) — the element travels at most from the root to a leaf.
Peek
Return the root element without removing it. Time complexity: O(1).
Complete MinHeap Implementation
public class MinHeap<T extends Comparable<T>> {
private final List<T> data;
public MinHeap() {
this.data = new ArrayList<>();
}
public MinHeap(Collection<T> elements) {
this.data = new ArrayList<>(elements);
buildHeap();
}
public void insert(T value) {
data.add(value);
siftUp(data.size() - 1);
}
public T extractMin() {
if (data.isEmpty()) {
throw new NoSuchElementException("Heap is empty");
}
T min = data.getFirst();
T last = data.removeLast();
if (!data.isEmpty()) {
data.set(0, last);
siftDown(0);
}
return min;
}
public T peek() {
if (data.isEmpty()) {
throw new NoSuchElementException("Heap is empty");
}
return data.getFirst();
}
public int size() {
return data.size();
}
public boolean isEmpty() {
return data.isEmpty();
}
private void siftUp(int index) {
while (index > 0) {
int parent = (index - 1) / 2;
if (data.get(index).compareTo(data.get(parent)) >= 0) {
break;
}
swap(index, parent);
index = parent;
}
}
private void siftDown(int index) {
int size = data.size();
while (true) {
int smallest = index;
int left = 2 * index + 1;
int right = 2 * index + 2;
if (left < size && data.get(left).compareTo(data.get(smallest)) < 0) {
smallest = left;
}
if (right < size && data.get(right).compareTo(data.get(smallest)) < 0) {
smallest = right;
}
if (smallest == index) {
break;
}
swap(index, smallest);
index = smallest;
}
}
private void buildHeap() {
for (int i = (data.size() / 2) - 1; i >= 0; i--) {
siftDown(i);
}
}
private void swap(int i, int j) {
T temp = data.get(i);
data.set(i, data.get(j));
data.set(j, temp);
}
}
Notice how siftUp walks toward the root and siftDown walks toward the leaves. These two operations are the building blocks for everything a heap does.
Build Heap: O(n), Not O(n log n)
A common mistake is claiming that building a heap from an unsorted array takes O(n log n) — inserting n elements one at a time, each costing O(log n). That approach works, but buildHeap does it faster.
The Heapify Algorithm
Start from the last non-leaf node (index n/2 - 1) and sift down each node. Leaves are already valid heaps of size 1, so you skip them entirely.
private void buildHeap() {
for (int i = (data.size() / 2) - 1; i >= 0; i--) {
siftDown(i);
}
}
Why O(n)?
The proof relies on counting the total work across all levels:
- At height 0 (leaves):
n/2nodes, each doing 0 swaps. - At height 1:
n/4nodes, each doing at most 1 swap. - At height 2:
n/8nodes, each doing at most 2 swaps. - At height h:
n / 2^(h+1)nodes, each doing at most h swaps.
Total work:
$$\sum_{h=0}^{\log n} \frac{n}{2^{h+1}} \cdot h = \frac{n}{2} \sum_{h=0}^{\log n} \frac{h}{2^h}$$
The series $\sum_{h=0}^{\infty} h/2^h$ converges to 2. So total work = $\frac{n}{2} \cdot 2 = n$, giving us O(n).
The key insight: most nodes sit near the bottom of the tree where the sift-down distance is short. The few nodes near the top that require long sift-downs are vastly outnumbered by the many nodes at the bottom that require no work at all.
Java’s PriorityQueue
Java provides PriorityQueue<E> in java.util, which implements a min-heap by default:
// Min-heap (natural ordering)
var minHeap = new PriorityQueue<Integer>();
minHeap.offer(5);
minHeap.offer(2);
minHeap.offer(8);
System.out.println(minHeap.poll()); // 2
// Max-heap (reversed comparator)
var maxHeap = new PriorityQueue<Integer>(Comparator.reverseOrder());
maxHeap.offer(5);
maxHeap.offer(2);
maxHeap.offer(8);
System.out.println(maxHeap.poll()); // 8
// Custom objects
record Task(String name, int priority) {}
var taskQueue = new PriorityQueue<Task>(Comparator.comparingInt(Task::priority));
Key details:
offer()andpoll()are O(log n).peek()is O(1).remove(Object)is O(n) — it does a linear scan.- Not thread-safe. Use
PriorityBlockingQueuefor concurrent access. - Iteration order is not sorted. Only
poll()guarantees ordering.
Classic Interview Problems
Kth Largest Element in an Array
Strategy: maintain a min-heap of size K. After processing all elements, the root is the Kth largest.
public int findKthLargest(int[] nums, int k) {
var minHeap = new PriorityQueue<Integer>();
for (int num : nums) {
minHeap.offer(num);
if (minHeap.size() > k) {
minHeap.poll();
}
}
return minHeap.peek();
}
Time: O(n log k). Space: O(k). This beats sorting at O(n log n) when k is much smaller than n.
Why a min-heap for the Kth largest? Because you want to discard the smallest elements. The min-heap’s root is the gatekeeper — anything smaller gets kicked out, and the K largest survivors remain.
Top K Frequent Elements
Strategy: count frequencies with a map, then use a min-heap of size K to keep only the most frequent.
public int[] topKFrequent(int[] nums, int k) {
var frequencyMap = new HashMap<Integer, Integer>();
for (int num : nums) {
frequencyMap.merge(num, 1, Integer::sum);
}
var minHeap = new PriorityQueue<Map.Entry<Integer, Integer>>(
Comparator.comparingInt(Map.Entry::getValue)
);
for (var entry : frequencyMap.entrySet()) {
minHeap.offer(entry);
if (minHeap.size() > k) {
minHeap.poll();
}
}
return minHeap.stream()
.mapToInt(Map.Entry::getKey)
.toArray();
}
Time: O(n log k). Space: O(n) for the frequency map plus O(k) for the heap.
Merge K Sorted Lists
Strategy: put the head of each list into a min-heap. Repeatedly extract the minimum, append it to the result, and insert that node’s next pointer into the heap.
public ListNode mergeKLists(ListNode[] lists) {
var minHeap = new PriorityQueue<ListNode>(
Comparator.comparingInt(node -> node.val)
);
for (ListNode head : lists) {
if (head != null) {
minHeap.offer(head);
}
}
var dummy = new ListNode(0);
ListNode current = dummy;
while (!minHeap.isEmpty()) {
ListNode smallest = minHeap.poll();
current.next = smallest;
current = current.next;
if (smallest.next != null) {
minHeap.offer(smallest.next);
}
}
return dummy.next;
}
Time: O(N log k) where N is the total number of nodes across all lists and k is the number of lists. The heap never holds more than k elements. Space: O(k).
Find Median from Data Stream
Strategy: split the stream into two halves using two heaps:
- maxHeap (lower half): stores the smaller half of numbers, root is the largest of that half.
- minHeap (upper half): stores the larger half, root is the smallest of that half.
The median is either the root of the maxHeap (odd count) or the average of both roots (even count).
public class MedianFinder {
private final PriorityQueue<Integer> maxHeap; // lower half
private final PriorityQueue<Integer> minHeap; // upper half
public MedianFinder() {
maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
minHeap = new PriorityQueue<>();
}
public void addNum(int num) {
maxHeap.offer(num);
minHeap.offer(maxHeap.poll());
// Keep maxHeap same size or one larger
if (minHeap.size() > maxHeap.size()) {
maxHeap.offer(minHeap.poll());
}
}
public double findMedian() {
if (maxHeap.size() > minHeap.size()) {
return maxHeap.peek();
}
return (maxHeap.peek() + minHeap.peek()) / 2.0;
}
}
Time: O(log n) per addNum, O(1) per findMedian. Space: O(n).
The elegance here is in addNum: every new number passes through the maxHeap first, then its largest element moves to the minHeap, and rebalancing ensures the heaps stay within one element of each other. This two-step dance guarantees correct partitioning regardless of insertion order.
Heap vs BST vs Sorted Array
| Operation | Heap | Balanced BST | Sorted Array |
|---|---|---|---|
| Find min/max | O(1) | O(log n) | O(1) |
| Insert | O(log n) | O(log n) | O(n) |
| Delete min/max | O(log n) | O(log n) | O(1) amortized |
| Search arbitrary | O(n) | O(log n) | O(log n) |
| Find median | O(n) | O(log n) with augmentation | O(1) |
| Space overhead | None (array) | Pointers per node | None |
Use a heap when you only need the extreme element repeatedly and insertions/deletions are frequent. Use a BST when you need ordered traversal, rank queries, or range searches. Use a sorted array when the dataset is static and you need fast lookups.
Edge Cases and Interviewer Tips
- Empty heap: always guard
extractMinandpeekwith an emptiness check. Interviewers watch for this. - Single element: insert then extract should return that element. Verify your swap logic handles size-1 correctly.
- Duplicate values: heaps handle duplicates naturally — they are not unique-element structures.
- Integer overflow: when computing
(a + b) / 2.0for the median, usea + (b - a) / 2.0if dealing with values nearInteger.MAX_VALUE. - Choosing min vs max heap: a surprisingly common source of bugs. When finding the Kth largest, use a min-heap of size K. When finding the Kth smallest, use a max-heap of size K. Think about which elements you want to discard.
- PriorityQueue.remove(): avoid it in tight loops — it is O(n). If you need efficient arbitrary removal, consider an indexed priority queue or a lazy deletion strategy with a secondary set tracking invalidated entries.
- Stability: Java’s
PriorityQueueis not stable. Equal-priority elements can come out in any order. If insertion order matters for ties, add a sequence number to your comparator.
When your interviewer says “find the top K” or “find the Kth something” or “continuously track the minimum/maximum,” reach for a heap before anything else. It is the canonical data structure for these patterns, and demonstrating fluency with it signals strong algorithmic foundations.