Insertion Sort

Insertion sort is the algorithm you already use intuitively. Picture holding a hand of playing cards: you pick up each new card and slide it into its correct position among the cards you’ve already arranged. This natural strategy translates directly into code — and it happens to be one of the most important sorting algorithms in practice, powering the inner loops of Java’s Arrays.sort, Python’s sorted, and C++‘s std::sort.

The Algorithm

Insertion sort maintains a sorted prefix at the beginning of the array. For each new element:

Pick the next element from the unsorted region.
Compare it with elements in the sorted prefix, moving right to left.
Shift larger elements one position to the right.
Insert the element into the gap created.

After processing all elements, the sorted prefix spans the entire array.

Core invariant: After processing element at index i, elements arr[0..i] are sorted relative to each other.

Note the distinction from selection sort’s invariant: insertion sort’s prefix contains the first i elements in sorted order, but they are not necessarily in their globally final positions. New elements arriving later could shift existing elements. Selection sort’s prefix contains the globally smallest elements in their final positions.

Java 25 Implementation

void insertionSort(int[] arr) {
    int n = arr.length;
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        // Shift elements greater than key to the right
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;
    }
}

Three details worth highlighting:

Start at index 1: A single element (arr[0]) is trivially sorted.
Save the key: Store arr[i] in key before shifting. Shifting overwrites arr[i], so without this save, the value is lost.
Strict inequality (arr[j] > key): Equal elements don’t trigger shifts, preserving stability.

Visual Walkthrough

Sorting [5, 2, 4, 6, 1, 3]:

i = 1: key = 2. Compare with 5 → shift 5 right. Insert 2 at index 0.

[2, 5, 4, 6, 1, 3]   sorted prefix: [2, 5]

i = 2: key = 4. Compare with 5 → shift right. Compare with 2 → stop. Insert 4 at index 1.

[2, 4, 5, 6, 1, 3]   sorted prefix: [2, 4, 5]

i = 3: key = 6. Compare with 5 → no shift needed. Insert 6 at index 3 (already there).

[2, 4, 5, 6, 1, 3]   sorted prefix: [2, 4, 5, 6]

i = 4: key = 1. Shift 6, 5, 4, 2 all right. Insert 1 at index 0.

[1, 2, 4, 5, 6, 3]   sorted prefix: [1, 2, 4, 5, 6]

i = 5: key = 3. Shift 6, 5, 4 right. Compare with 2 → stop. Insert 3 at index 2.

[1, 2, 3, 4, 5, 6]   sorted prefix: [1, 2, 3, 4, 5, 6]

Result: [1, 2, 3, 4, 5, 6] — sorted.

Notice how key = 6 at i = 3 required zero shifts — the element was already in the correct position. This is the adaptive behavior that makes insertion sort fast on nearly sorted data.

Complexity Analysis

Time Complexity

Best case — O(n): The array is already sorted. For each element, the while loop condition arr[j] > key fails immediately (the previous element is smaller or equal). Zero shifts per element, one comparison per element. Total: n - 1 comparisons, 0 shifts = O(n).

Average case — O(n²): Each element shifts past half the sorted prefix on average. Element at position i shifts approximately i/2 positions. Total shifts:

$$\sum_{i=1}^{n-1} \frac{i}{2} = \frac{n(n-1)}{4} = O(n^2)$$

Worst case — O(n²): The array is reverse sorted. Each element shifts past the entire sorted prefix. Element at position i requires i shifts and comparisons:

$$\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = O(n^2)$$

Space Complexity

O(1) — Insertion sort is in-place. Uses only key, j, and loop variables.

Stability

Stable — Equal elements never pass each other. The condition arr[j] > key (strict inequality) ensures that when arr[j] == key, the shift stops, keeping equal elements in their original relative order.

Adaptive Behavior

Insertion sort’s running time is O(n + d), where d is the number of inversions in the array. An inversion is a pair (i, j) where i < j but arr[i] > arr[j].

Sorted array: 0 inversions → O(n).
Array with O(n) inversions (a few elements out of place): O(n).
Random array: O(n²) inversions → O(n²).
Reverse sorted: O(n²) inversions → O(n²).

This makes insertion sort the ideal choice when you know the data is “almost sorted” — a common real-world scenario.

Summary

Metric	Value
Best time	O(n) — sorted input
Average time	O(n²)
Worst time	O(n²) — reverse sorted
Space	O(1)
Stable	Yes
Adaptive	Yes — O(n + inversions)

Binary Insertion Sort

The standard insertion sort finds the correct insertion position through sequential comparison — scanning right to left until finding a smaller element. Since the prefix is already sorted, you can use binary search to find the insertion position in O(log n) comparisons instead of O(n).

void binaryInsertionSort(int[] arr) {
    int n = arr.length;
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        // Binary search for insertion position in sorted prefix
        int insertPos = findInsertionPoint(arr, 0, i, key);
        // Shift elements right to make room
        System.arraycopy(arr, insertPos, arr, insertPos + 1, i - insertPos);
        arr[insertPos] = key;
    }
}

int findInsertionPoint(int[] arr, int low, int high, int key) {
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] <= key) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}

You can also use Arrays.binarySearch from the standard library, but the manual implementation above gives you precise control over the insertion point and maintains stability (the <= condition ensures equal elements insert after existing ones).

Analysis of Binary Insertion Sort

Comparisons: O(n log n) total — each of the n elements uses O(log n) comparisons via binary search.
Shifts: Still O(n²) in the worst case — you still need to shift elements right to create the insertion gap.
Overall time: O(n²) — the shift cost dominates.
Benefit: Fewer comparisons. When comparison is expensive (comparing strings, complex objects), binary insertion sort provides a measurable speedup despite the same asymptotic complexity.

Binary insertion sort is a great interview talking point: it demonstrates that you think about optimizing constants and operations within the same big-O class, not solely about asymptotic improvements.

Why Insertion Sort Powers Hybrid Algorithms

Here’s where insertion sort transcends textbook status and becomes a production algorithm.

TimSort (Java’s Arrays.sort for Objects)

Java’s Arrays.sort(Object[]) and Arrays.sort(T[], Comparator) use TimSort — a hybrid algorithm invented by Tim Peters for Python. TimSort’s strategy:

Divide the array into small “runs” (naturally ascending or descending subsequences).
Sort each run using insertion sort when the run is shorter than a threshold (typically 32 elements).
Merge runs using a modified merge sort.

Why insertion sort for small runs? Three reasons:

Low overhead: No recursion, no partitioning, no auxiliary arrays. The constant factors in insertion sort are tiny.
Cache efficiency: Insertion sort accesses elements sequentially and locally. On modern CPUs, this means excellent cache hit rates — data stays in L1 cache.
Adaptive: Natural runs in real-world data (timestamps, sequential IDs) are already partially sorted. Insertion sort handles these in O(n).

IntroSort (C++‘s std::sort)

C++‘s std::sort uses IntroSort — a hybrid of quicksort, heapsort, and insertion sort:

Start with quicksort.
If recursion depth exceeds 2·log₂(n), switch to heapsort (guaranteeing O(n log n) worst case).
When the partition size falls below a threshold (~16 elements), switch to insertion sort.

The same rationale applies: for small partitions, insertion sort’s low overhead beats quicksort’s recursive machinery.

The Takeaway

Insertion sort is not “a beginner’s algorithm.” It is a critical component of every major sorting implementation in production today. Understanding why — cache locality, low overhead, adaptivity — separates candidates who memorize from candidates who think.

When to Use Insertion Sort

Scenario	Why Insertion Sort
Small arrays (< 32 elements)	Lower constant factors than O(n log n) algorithms
Nearly sorted data	O(n) performance due to adaptive behavior
Online sorting	Process elements one at a time as they arrive
Stable sort requirement	Maintains relative order of equal elements
Within hybrid algorithms	Powers TimSort and IntroSort for small partitions

Online sorting deserves special mention. Insertion sort processes elements incrementally — each new element is inserted into an already-sorted prefix. If your data arrives one item at a time (streaming sensor data, real-time event processing), insertion sort maintains a sorted collection without re-sorting from scratch.

Comparison: Insertion Sort vs Bubble Sort vs Selection Sort

Property	Bubble Sort	Selection Sort	Insertion Sort
Best case	O(n)	O(n²)	O(n)
Average case	O(n²)	O(n²)	O(n²)
Worst case	O(n²)	O(n²)	O(n²)
Swaps (worst)	O(n²)	O(n)	O(n²) shifts
Stable	Yes	No	Yes
Adaptive	Yes	No	Yes
Cache performance	Poor	Poor	Good
Used in production	No	Rarely	Yes (TimSort, IntroSort)

Insertion sort dominates this trio. It matches or beats bubble sort in every metric and offers adaptivity and stability that selection sort lacks. The only edge selection sort holds is minimizing swap count — relevant for write-limited hardware.

Interviewer Tips

Lead with insertion sort for small data. When an interviewer says “the input has at most 50 elements,” insertion sort is a perfectly valid — and efficient — answer. Explain why: constant factors dominate at small sizes, and insertion sort has the smallest overhead.

Know the TimSort connection. If asked “What sorting algorithm does Java use?”, the full answer is: “TimSort for objects (stable, O(n log n)), dual-pivot quicksort for primitives (in-place, O(n log n) average).” Follow up with “TimSort uses insertion sort for runs shorter than 32 elements because of cache locality and adaptive behavior.”

Explain the inversion connection. Insertion sort’s runtime depends on the number of inversions — O(n + d). This is a deeper answer than “O(n) best case” and shows algorithmic maturity.

Mention binary insertion sort as an optimization. This demonstrates you think about reducing comparison costs when comparisons are expensive, even within the same complexity class. It’s the kind of nuance that separates strong candidates from the rest.