Median of Two Sorted Arrays
SummaryDerives the optimal binary search approach by partitioning...
Derives the optimal binary search approach by partitioning...
Derives the optimal binary search approach by partitioning both arrays such that left halves contain all elements smaller than right halves, achieving logarithmic time complexity.
Median of Two Sorted Arrays
This is one of the most feared interview problems — classified as “Hard” on every major platform. It appears at Google, Apple, and quantitative finance firms where algorithmic precision matters. The brute force solution is trivial. The optimal solution requires a conceptual leap that separates strong candidates from exceptional ones.
Problem Statement
Given two sorted arrays nums1 of size m and nums2 of size n, return the median of the two sorted arrays. The overall run time complexity must be O(log(m + n)).
Examples:
Input: nums1 = [1, 3], nums2 = [2]
Output: 2.0
Explanation: Merged = [1, 2, 3], median = 2
Input: nums1 = [1, 2], nums2 = [3, 4]
Output: 2.5
Explanation: Merged = [1, 2, 3, 4], median = (2 + 3) / 2 = 2.5
The median of a sorted array is the middle element if the length is odd, or the average of the two middle elements if the length is even.
Naive Approach: Merge — O(m + n)
The obvious solution merges both sorted arrays and picks the middle element(s).
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
int m = nums1.length, n = nums2.length;
int[] merged = new int[m + n];
int i = 0, j = 0, k = 0;
while (i < m && j < n) {
if (nums1[i] <= nums2[j]) {
merged[k++] = nums1[i++];
} else {
merged[k++] = nums2[j++];
}
}
while (i < m) merged[k++] = nums1[i++];
while (j < n) merged[k++] = nums2[j++];
int total = m + n;
if (total % 2 == 1) {
return merged[total / 2];
}
return (merged[total / 2 - 1] + merged[total / 2]) / 2.0;
}
Why this fails the requirement: The problem demands O(log(m + n)) time. The merge approach runs in O(m + n) — linear in the total size. For arrays with millions of elements, this is unacceptable.
When to mention it in an interview: Present this approach in 30 seconds as a baseline. Say: “The merge approach gives O(m + n). To hit logarithmic time, I need binary search.” This shows you recognize the gap and have a plan.
You could optimize the merge to stop after finding the median position — still O(m + n) but with O(1) space and better constants. It still does not meet the requirement.
Key Insight: Binary Search on Partitions
The median divides a sorted collection into two equal halves: all elements in the left half are smaller than or equal to all elements in the right half.
For two sorted arrays, you do not need to merge them physically. You need to find partition points i in nums1 and j in nums2 such that:
- The left half contains
(m + n + 1) / 2elements total - Every element in the left half ≤ every element in the right half
Visually:
nums1: [a₁, a₂, ..., aᵢ₋₁ | aᵢ, aᵢ₊₁, ..., aₘ₋₁]
nums2: [b₁, b₂, ..., bⱼ₋₁ | bⱼ, bⱼ₊₁, ..., bₙ₋₁]
← left half → | ← right half →
Left half: {a₁..aᵢ₋₁} ∪ {b₁..bⱼ₋₁}
Right half: {aᵢ..aₘ₋₁} ∪ {bⱼ..bₙ₋₁}
Since both arrays are individually sorted, the condition “every left element ≤ every right element” reduces to two cross-boundary checks:
nums1[i-1] <= nums2[j](left side of A ≤ right side of B)nums2[j-1] <= nums1[i](left side of B ≤ right side of A)
The within-array conditions (nums1[i-1] <= nums1[i] and nums2[j-1] <= nums2[j]) are guaranteed because the arrays are sorted.
Deriving the Algorithm
Step 1: Always Binary Search on the Smaller Array
Let nums1 be the smaller array. If nums1 is larger, swap them. This ensures you binary-search over at most min(m, n) positions, giving O(log(min(m, n))) time.
Step 2: Define the Search Space
The partition index i in nums1 ranges from 0 (take nothing from nums1 into the left half) to m (take everything from nums1 into the left half). Set low = 0 and high = m.
Step 3: Compute the Dependent Partition
Given i, the partition in nums2 is:
$$j = \frac{m + n + 1}{2} - i$$
This ensures the left half has exactly (m + n + 1) / 2 elements. The +1 floors correctly for both odd and even total lengths.
Step 4: Check the Cross-Boundary Conditions
At each binary search iteration, compute:
maxLeftA = (i == 0) ? Integer.MIN_VALUE : nums1[i - 1]minRightA = (i == m) ? Integer.MAX_VALUE : nums1[i]maxLeftB = (j == 0) ? Integer.MIN_VALUE : nums2[j - 1]minRightB = (j == n) ? Integer.MAX_VALUE : nums2[j]
The sentinel values handle edge cases where one partition takes all or no elements from an array.
Valid partition: maxLeftA <= minRightB AND maxLeftB <= minRightA
If maxLeftA > minRightB: We took too many elements from nums1. Move left: high = i - 1.
If maxLeftB > minRightA: We took too few elements from nums1. Move right: low = i + 1.
Step 5: Compute the Median
Once you find the valid partition:
- Odd total: median =
max(maxLeftA, maxLeftB)— the largest element in the left half - Even total: median =
(max(maxLeftA, maxLeftB) + min(minRightA, minRightB)) / 2.0
Optimal Solution — O(log(min(m, n)))
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
// Ensure nums1 is the smaller array for O(log(min(m,n)))
if (nums1.length > nums2.length) {
return findMedianSortedArrays(nums2, nums1);
}
int m = nums1.length;
int n = nums2.length;
int low = 0;
int high = m;
while (low <= high) {
// Partition index in nums1: i elements go to left half
int i = low + (high - low) / 2;
// Partition index in nums2: determined by total left-half size
int j = (m + n + 1) / 2 - i;
// Elements at the partition boundaries
// Sentinels handle cases where partition is at array edge
int maxLeftA = (i == 0) ? Integer.MIN_VALUE : nums1[i - 1];
int minRightA = (i == m) ? Integer.MAX_VALUE : nums1[i];
int maxLeftB = (j == 0) ? Integer.MIN_VALUE : nums2[j - 1];
int minRightB = (j == n) ? Integer.MAX_VALUE : nums2[j];
if (maxLeftA <= minRightB && maxLeftB <= minRightA) {
// Found valid partition — compute median
if ((m + n) % 2 == 1) {
// Odd total: median is the max of left half
return Math.max(maxLeftA, maxLeftB);
} else {
// Even total: average of max-left and min-right
return (Math.max(maxLeftA, maxLeftB)
+ Math.min(minRightA, minRightB)) / 2.0;
}
} else if (maxLeftA > minRightB) {
// Too many elements from nums1 in left half — shrink
high = i - 1;
} else {
// Too few elements from nums1 in left half — expand
low = i + 1;
}
}
// Should never reach here if inputs are valid sorted arrays
throw new IllegalArgumentException("Input arrays are not sorted");
}
Line-by-line breakdown:
The swap at the top is not cosmetic — it guarantees the binary search runs over the smaller array. If m = 3 and n = 1,000,000, you search 4 positions instead of 1,000,001.
The sentinel values (Integer.MIN_VALUE and Integer.MAX_VALUE) elegantly handle boundary conditions. When i = 0, you take nothing from nums1 — there is no left element, so maxLeftA cannot constrain anything. Setting it to Integer.MIN_VALUE ensures maxLeftA <= minRightB is always true in that case.
The (m + n + 1) / 2 formula uses integer division. For odd totals, this puts one extra element in the left half, making the median the maximum of the left half. For even totals, both halves have equal size.
Using Records for Clarity
You can use a Java 25 record to bundle partition boundary elements and make the code more self-documenting:
record Partition(int maxLeft, int minRight) {}
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
if (nums1.length > nums2.length) {
return findMedianSortedArrays(nums2, nums1);
}
int m = nums1.length, n = nums2.length;
int low = 0, high = m;
while (low <= high) {
int i = low + (high - low) / 2;
int j = (m + n + 1) / 2 - i;
var partA = new Partition(
i == 0 ? Integer.MIN_VALUE : nums1[i - 1],
i == m ? Integer.MAX_VALUE : nums1[i]
);
var partB = new Partition(
j == 0 ? Integer.MIN_VALUE : nums2[j - 1],
j == n ? Integer.MAX_VALUE : nums2[j]
);
if (partA.maxLeft() <= partB.minRight()
&& partB.maxLeft() <= partA.minRight()) {
int leftMax = Math.max(partA.maxLeft(), partB.maxLeft());
int rightMin = Math.min(partA.minRight(), partB.minRight());
return (m + n) % 2 == 1
? leftMax
: (leftMax + rightMin) / 2.0;
} else if (partA.maxLeft() > partB.minRight()) {
high = i - 1;
} else {
low = i + 1;
}
}
throw new IllegalArgumentException("Input arrays are not sorted");
}
The Partition record bundles the two boundary values, making the comparison logic read naturally: “Is A’s left ≤ B’s right, and B’s left ≤ A’s right?”
Dry Run
Input: nums1 = [1, 3, 8, 9, 15], nums2 = [7, 11, 18, 19, 21, 25]
Total elements: 11 (odd). Left half needs (11 + 1) / 2 = 6 elements.
nums1 is smaller (m = 5), so we binary-search on nums1.
Iteration 1: low=0, high=5
i = 2, j = 6 - 2 = 4
maxLeftA = nums1[1] = 3 minRightA = nums1[2] = 8
maxLeftB = nums2[3] = 19 minRightB = nums2[4] = 21
Check: maxLeftA(3) <= minRightB(21) ✓
maxLeftB(19) <= minRightA(8) ✗ → too few from nums1
→ low = 3
Iteration 2: low=3, high=5
i = 4, j = 6 - 4 = 2
maxLeftA = nums1[3] = 9 minRightA = nums1[4] = 15
maxLeftB = nums2[1] = 11 minRightB = nums2[2] = 18
Check: maxLeftA(9) <= minRightB(18) ✓
maxLeftB(11) <= minRightA(15) ✓ → valid partition!
Left half: {1, 3, 8, 9} ∪ {7, 11} → max = max(9, 11) = 11
Odd total → median = 11
Verification: Merged = [1, 3, 7, 8, 9, 11, 15, 18, 19, 21, 25]. Middle element (index 5) = 11. ✓
The binary search found the answer in 2 iterations instead of merging 11 elements.
Complexity Analysis
Time Complexity: O(log(min(m, n)))
The binary search operates on the smaller array. At each step, the search space halves. Starting from min(m, n) + 1 positions, the search completes in at most ⌈log₂(min(m, n) + 1)⌉ iterations. Each iteration performs O(1) work (comparisons and arithmetic).
Why binary-search on the smaller array? If m = 5 and n = 10⁶, searching nums1 takes at most ⌈log₂(6)⌉ = 3 iterations. Searching nums2 would take up to 20 iterations. Both are logarithmic, but searching the smaller array is strictly faster and is the tighter bound: O(log(min(m, n))) ⊆ O(log(m + n)).
Space Complexity: O(1)
The algorithm uses a fixed number of integer variables (low, high, i, j, boundary values). No auxiliary data structures are allocated. The recursive swap call is a tail call that any JIT compiler eliminates.
Edge Cases
One Empty Array
If nums1 is empty (m = 0), all elements come from nums2. The partition in nums1 is i = 0, so maxLeftA = Integer.MIN_VALUE and minRightA = Integer.MAX_VALUE. The median depends only on nums2. The sentinel values handle this without special-case code.
nums1 = [], nums2 = [2, 3]
i = 0, j = 1
maxLeftA = -∞, minRightA = +∞
maxLeftB = 2, minRightB = 3
Valid: -∞ <= 3 ✓, 2 <= +∞ ✓
Even total → (max(-∞, 2) + min(+∞, 3)) / 2 = (2 + 3) / 2 = 2.5 ✓
Arrays of Very Different Sizes
The algorithm remains efficient. A nums1 of size 1 and nums2 of size 10⁹ requires at most 1 binary search iteration on nums1. The formula j = (m + n + 1) / 2 - i computes the correct partition in nums2 without touching its elements.
All Elements in One Array Smaller Than the Other
nums1 = [1, 2, 3], nums2 = [10, 11, 12]
The partition that takes all of nums1 and none of nums2 into the left half: i = 3, j = 0. Sentinels handle j = 0: maxLeftB = Integer.MIN_VALUE. Cross-check: maxLeftA = 3 <= minRightB = 10 and maxLeftB = -∞ <= minRightA = +∞. Valid.
Duplicate Elements
Duplicates do not affect correctness. The <= comparisons (not <) ensure that equal elements can appear on either side of the partition. The algorithm makes no uniqueness assumption.
nums1 = [1, 1, 1], nums2 = [1, 1, 1]
Any valid partition works. Median = 1.0.
Interviewer Tips
This is a HARD problem — showing your thought process matters more than getting the optimal solution on the first attempt. Many candidates who fail this problem do so because they jump straight to binary search without building up intuition. Follow this progression:
-
State the merge approach (10 seconds): “Merging gives O(m + n). The problem requires logarithmic time, so I need binary search.”
-
Explain the partition insight (2 minutes): “The median splits the combined array into two equal halves. If I find the right cut in one array, the other cut is determined. I need to binary-search for that cut.”
-
Derive the conditions (2 minutes): Draw the partition diagram. Explain the two cross-boundary checks. Explain why sentinels handle edge cases.
-
Code the solution (10-15 minutes): Write the binary search loop. Handle odd and even totals.
-
Dry run an example (3 minutes): Walk the interviewer through one iteration to prove it works.
Common follow-up: Kth element of two sorted arrays. Instead of finding the median, find the kth smallest element. The same partition approach works, but now the left half must contain exactly k elements. The binary search logic is identical — only the target partition size changes.
Another follow-up: More than two arrays. For k sorted arrays, a min-heap merge in O(N log k) is the standard approach (covered in the Merge K Sorted Lists chapter). The binary search partition technique does not generalize cleanly beyond two arrays.
Precision note: When computing the average for even-length results, use / 2.0 (double division), not / 2 (integer division). This is a common bug that costs candidates. The return type is double for a reason.