Skip Lists
SummaryCovers multi-level linked list design, probabilistic level assignment,...
Covers multi-level linked list design, probabilistic level assignment,...
Covers multi-level linked list design, probabilistic level assignment, search/insert/delete in expected O(log n) time, and comparison with balanced trees and concurrent skip lists.
Skip Lists
Balanced binary search trees — AVL trees, Red-Black trees — deliver guaranteed O(log n) search, insert, and delete. They achieve this through rotations: structural adjustments that keep the tree height logarithmic. The problem? Rotations are tricky to implement correctly. One wrong pointer update in a Red-Black tree’s double rotation, and your entire structure collapses.
What if you could get O(log n) expected performance without ever rotating anything? Skip lists achieve exactly that through randomization — a coin flip replaces the complexity of rebalancing.
The Core Idea: Express Lanes
Think of a sorted linked list as a local train that stops at every station. Searching for an element requires visiting every node in sequence — O(n). Now add an express line that stops at every other station. You ride the express line until you overshoot your target, drop down to the local line, and walk a few stops. This cuts the search roughly in half.
Stack more express lines on top — one that stops every 4th station, another every 8th, another every 16th — and search becomes O(log n).
Level 3: HEAD ─────────────────────────────── 47 ───────────────────── NIL
│ │
Level 2: HEAD ──────────── 17 ──────────────── 47 ──────── 73 ─────── NIL
│ │ │ │
Level 1: HEAD ───── 8 ──── 17 ──── 31 ──────── 47 ── 55 ── 73 ── 89 NIL
│ │ │ │ │ │ │ │
Level 0: HEAD ─ 3 ─ 8 ─ 12 17 ─ 25 31 ─ 39 ── 47 ── 55 ── 73 ── 89 NIL
Level 0 contains all elements in sorted order. Each higher level is a sparser subset. The height of an element — how many levels it appears in — is determined randomly at insertion time. No rebalancing required, ever.
Structure
A skip list consists of:
- A sentinel head node with maximum height (forward pointers at every level)
- Multiple levels of sorted linked lists layered on top of each other
- Each node has an array of forward pointers, one per level the node participates in
Node Definition
class SkipListNode {
final int value;
final SkipListNode[] forward; // forward[i] = next node at level i
SkipListNode(int value, int level) {
this.value = value;
this.forward = new SkipListNode[level + 1];
}
}
The forward array’s length equals the node’s height (number of levels it spans). A node at level 3 has forward[0] through forward[3] — four pointers, one for each level.
Probabilistic Level Assignment
When inserting a new element, you flip a coin to decide its height:
- Level 0: always (probability 1)
- Level 1: with probability p (typically 0.5)
- Level 2: with probability p² (0.25)
- Level 3: with probability p³ (0.125)
- …
Keep flipping until you get tails. The number of heads determines the level.
private int randomLevel() {
int level = 0;
while (level < maxLevel && random.nextDouble() < P) {
level++;
}
return level;
}
With p = 0.5, the expected level of any node is 1 (it appears in 2 levels on average: level 0 and level 1). The expected maximum level across n elements is O(log n). This probabilistic structure mirrors the deterministic balance of a BST — without rotations.
Why does this work? At level 0, you have n elements. At level 1, roughly n/2. At level 2, roughly n/4. The expected number of elements at level k is n/2^k. This geometric decay means the top levels are extremely sparse, enabling fast traversal.
Operations
Search
Start at the highest level of the head node. Move right along the current level as long as the next node’s value is less than the target. When the next node is greater than or equal to the target (or null), drop down one level. Repeat until you reach level 0 and either find the target or determine it’s absent.
public boolean search(int target) {
SkipListNode current = head;
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < target) {
current = current.forward[i];
}
}
current = current.forward[0];
return current != null && current.value == target;
}
Trace for searching 55 in the example above:
- Level 3: HEAD → 47 (47 < 55, advance). 47 → NIL (stop, drop down).
- Level 2: 47 → 73 (73 > 55, stop, drop down).
- Level 1: 47 → 55 (55 = target, but we drop to level 0 to confirm).
- Level 0: forward[0].value == 55 ✓ → Found!
At each level, you skip over large sections of the list. The total number of nodes visited across all levels is O(log n) in expectation.
Insert
Insertion has three phases:
- Find the position at every level (track predecessor nodes)
- Generate random level for the new node
- Splice the node into each level it participates in
public void insert(int value) {
SkipListNode[] update = new SkipListNode[maxLevel + 1];
SkipListNode current = head;
// Phase 1: find predecessors at each level
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < value) {
current = current.forward[i];
}
update[i] = current; // predecessor at level i
}
// Phase 2: determine new node's level
int newLevel = randomLevel();
// If new node is taller than current max level, update head pointers
if (newLevel > level) {
for (int i = level + 1; i <= newLevel; i++) {
update[i] = head;
}
level = newLevel;
}
// Phase 3: create and splice node
SkipListNode newNode = new SkipListNode(value, newLevel);
for (int i = 0; i <= newLevel; i++) {
newNode.forward[i] = update[i].forward[i];
update[i].forward[i] = newNode;
}
size++;
}
The update array stores the predecessor of the new node at each level. Splicing works identically to linked list insertion — for each level, the new node’s forward pointer takes the predecessor’s old forward pointer, and the predecessor points to the new node.
Delete
Deletion mirrors insertion: find the node, record predecessors, and unlink at every level.
public boolean delete(int value) {
SkipListNode[] update = new SkipListNode[maxLevel + 1];
SkipListNode current = head;
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < value) {
current = current.forward[i];
}
update[i] = current;
}
current = current.forward[0];
if (current == null || current.value != value) {
return false; // not found
}
// Unlink at each level
for (int i = 0; i <= level; i++) {
if (update[i].forward[i] != current) break;
update[i].forward[i] = current.forward[i];
}
// Reduce level if top levels are now empty
while (level > 0 && head.forward[level] == null) {
level--;
}
size--;
return true;
}
After unlinking, check if the top level(s) became empty — if so, reduce the skip list’s current level. This keeps traversal efficient even after many deletions.
Complete Implementation
import java.util.Random;
public class SkipList {
private static final double P = 0.5;
private static final int MAX_LEVEL = 16; // supports up to 2^16 = 65,536 elements comfortably
private final SkipListNode head;
private final Random random;
private int level; // current highest level in use
private int size;
static class SkipListNode {
final int value;
final SkipListNode[] forward;
SkipListNode(int value, int level) {
this.value = value;
this.forward = new SkipListNode[level + 1];
}
}
public SkipList() {
head = new SkipListNode(Integer.MIN_VALUE, MAX_LEVEL);
random = new Random();
level = 0;
size = 0;
}
private int randomLevel() {
int lvl = 0;
while (lvl < MAX_LEVEL && random.nextDouble() < P) {
lvl++;
}
return lvl;
}
public boolean search(int target) {
SkipListNode current = head;
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < target) {
current = current.forward[i];
}
}
current = current.forward[0];
return current != null && current.value == target;
}
public void insert(int value) {
SkipListNode[] update = new SkipListNode[MAX_LEVEL + 1];
SkipListNode current = head;
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < value) {
current = current.forward[i];
}
update[i] = current;
}
int newLevel = randomLevel();
if (newLevel > level) {
for (int i = level + 1; i <= newLevel; i++) {
update[i] = head;
}
level = newLevel;
}
SkipListNode newNode = new SkipListNode(value, newLevel);
for (int i = 0; i <= newLevel; i++) {
newNode.forward[i] = update[i].forward[i];
update[i].forward[i] = newNode;
}
size++;
}
public boolean delete(int value) {
SkipListNode[] update = new SkipListNode[MAX_LEVEL + 1];
SkipListNode current = head;
for (int i = level; i >= 0; i--) {
while (current.forward[i] != null && current.forward[i].value < value) {
current = current.forward[i];
}
update[i] = current;
}
current = current.forward[0];
if (current == null || current.value != value) {
return false;
}
for (int i = 0; i <= level; i++) {
if (update[i].forward[i] != current) break;
update[i].forward[i] = current.forward[i];
}
while (level > 0 && head.forward[level] == null) {
level--;
}
size--;
return true;
}
public int size() {
return size;
}
}
The entire implementation fits comfortably in an interview setting. No rotations, no color flipping, no case analysis — pointer manipulation and a coin flip.
Expected Complexity Analysis
Expected Number of Levels
Each element reaches level k with probability p^k. The expected maximum level across n elements:
$$E[\text{max level}] = \log_{1/p}(n) = O(\log n)$$
With p = 0.5, the expected maximum level for 1,000 elements is about 10. For 1,000,000 elements, about 20.
Expected Space
Each element appears at level 0 (always), level 1 (with probability p), level 2 (with probability p²), and so on. The expected number of total node appearances across all levels:
$$n \cdot \sum_{k=0}^{\infty} p^k = n \cdot \frac{1}{1 - p}$$
With p = 0.5, this equals 2n. Each element has, on average, 2 forward pointers. Total space: O(n).
Expected Search Time
The classic analysis uses a backwards argument. Start at the found node at level 0 and trace the search path backwards. At each step, you either:
- Climb up a level (probability p) — this happened because the current node was promoted
- Move left on the same level (probability 1 - p) — this happened because the current node was not promoted
The expected number of left moves at any level is 1/(1-p) = 2 (for p = 0.5). Since there are O(log n) levels, the total expected number of nodes visited is:
$$O\left(\frac{\log n}{1 - p}\right) = O(\log n)$$
This is an expected bound, not a worst-case guarantee. The probability of the search taking significantly longer than O(log n) drops exponentially — bad luck is vanishingly rare.
Summary
| Operation | Expected Time | Space |
|---|---|---|
| Search | O(log n) | — |
| Insert | O(log n) | O(log n) per node (expected) |
| Delete | O(log n) | — |
| Space | — | O(n) total |
Skip List vs. Balanced BST
| Aspect | Skip List | Balanced BST (AVL/RB) |
|---|---|---|
| Time guarantee | O(log n) expected | O(log n) worst-case |
| Implementation | Straightforward pointer manipulation | Complex rotations and case analysis |
| Concurrency | Excellent — lock-free variants exist | Difficult — rotations touch multiple nodes |
| Space | O(n) expected, ~2n pointers with p=0.5 | O(n), exactly 2 child pointers per node |
| Cache behavior | Pointer-heavy, less cache-friendly | Similar; both involve pointer chasing |
| Ordered iteration | Level 0 is a sorted linked list — natural | In-order traversal required |
| Determinism | Probabilistic — rare worst cases possible | Deterministic guarantees |
When to pick a skip list:
- Concurrent access is critical (lock-free algorithms are well-studied for skip lists)
- Implementation simplicity matters more than worst-case guarantees
- You need sorted iteration and the level-0 linked list is convenient
When to pick a balanced BST:
- Deterministic O(log n) is non-negotiable
- You’re using a standard library (Java’s
TreeMapuses a Red-Black tree) - Memory predictability matters
ConcurrentSkipListMap in Java
Java’s java.util.concurrent package includes ConcurrentSkipListMap — a thread-safe, sorted map backed by a skip list. It uses lock-free algorithms based on compare-and-swap (CAS) operations.
import java.util.concurrent.ConcurrentSkipListMap;
var map = new ConcurrentSkipListMap<Integer, String>();
map.put(3, "three");
map.put(1, "one");
map.put(7, "seven");
map.put(5, "five");
// Sorted iteration — guaranteed order
map.forEach((k, v) -> System.out.println(k + " → " + v));
// 1 → one, 3 → three, 5 → five, 7 → seven
// Range views
var subMap = map.subMap(2, 6); // keys in [2, 6)
System.out.println(subMap); // {3=three, 5=five}
// Concurrent-safe updates
map.computeIfAbsent(4, k -> "four");
ConcurrentSkipListMap vs. ConcurrentHashMap
| Aspect | ConcurrentSkipListMap | ConcurrentHashMap |
|---|---|---|
| Ordering | Sorted by key | No ordering |
| Get/Put | O(log n) | O(1) amortized |
| Range queries | subMap, headMap, tailMap | Not supported |
| Implementation | Lock-free skip list | Segmented hash table |
| Use case | Need sorted order + concurrency | Need fast lookup + concurrency |
Choose ConcurrentSkipListMap when you need both thread safety and sorted key order. If you only need fast concurrent key-value access without ordering, ConcurrentHashMap is the better choice.
Why Lock-Free Works Well for Skip Lists
In a balanced BST, rotations modify parent, child, and sibling pointers atomically — coordinating this without locks is extremely difficult. In a skip list, insertion and deletion affect only a chain of independent next-pointers at different levels. Each pointer update can be performed with a single CAS operation, making lock-free implementation natural.
The algorithm marks nodes for logical deletion before physically unlinking them, allowing concurrent readers to traverse safely even while modifications occur.
Real-World Application: Redis Sorted Sets
Redis, the in-memory data store used by millions of applications, chose skip lists for its sorted set implementation. Every ZADD, ZRANK, and ZRANGE command operates on a skip list internally.
Why skip lists over balanced trees? Redis creator Salvatore Sanfilippo offered three reasons:
- Simpler to implement — fewer bugs in a system that must be rock-solid
- Easy to modify for custom operations — like getting all elements in a score range
- Competitive performance — the constant factors are small enough that skip lists match or beat balanced trees in practice
When a battle-tested production system serving billions of requests chooses skip lists, that’s a strong signal about the structure’s practical value.
Edge Cases and Interviewer Tips
Edge cases to handle:
- Duplicate values — decide upfront whether your skip list allows them. The implementation above inserts duplicates as separate nodes. To disallow duplicates, check after search and reject.
- Empty skip list — search returns
false, delete returnsfalse. Insert works on the first call. - Searching for values smaller than all elements or larger than all elements — the loop naturally handles these by staying at the head or reaching null respectively.
- Very unlucky random level — a single node could theoretically get a very high level. Cap the maximum level (as done with
MAX_LEVEL = 16) to bound memory per node.
What interviewers want to hear:
- Explain the probabilistic balancing clearly: “Each node is promoted to the next level with probability p. This creates a geometric distribution of heights that mirrors the balance of a complete binary tree — without any structural adjustments.”
- Articulate the expected vs. worst-case distinction. Skip list search is O(n) in the worst case (all nodes at level 0), but this event has exponentially small probability.
- Know the backwards analysis argument for O(log n) expected search time — even a sketch shows deep understanding.
- Mention
ConcurrentSkipListMap— it demonstrates awareness of how the data structure appears in production Java code. - If asked “why not use a balanced BST?”, discuss the concurrency advantage and implementation simplicity trade-offs.
Time complexity summary:
- Search: O(log n) expected, O(n) worst-case
- Insert: O(log n) expected
- Delete: O(log n) expected
- Space: O(n) expected
- Level of a node: O(log n) expected maximum
Skip lists prove that randomization is a legitimate design tool, not a hack. A coin flip per element produces the same logarithmic behavior that balanced trees achieve through intricate restructuring — and it does so with code clean enough to write on a whiteboard under pressure.