Skip to main content
cracking the tech interview system design and algorithms in java 25

Depth-First Search (DFS)

9 min read Chapter 59 of 75
Summary

Covers recursive and iterative DFS, pre/post-order visits, DFS...

Covers recursive and iterative DFS, pre/post-order visits, DFS on grids, cycle detection, topological sort, and connected component identification.

Depth-First Search (DFS)

Depth-First Search is the algorithmic equivalent of exploring a maze by always taking the next unexplored turn and only backtracking when you hit a dead end. It goes deep before going wide — exhausting one branch completely before trying another. This single strategy unlocks cycle detection, topological sorting, connected component counting, and dozens of grid-based interview problems.

Recursive DFS

The most natural DFS implementation mirrors the algorithm’s definition directly: visit a node, mark it as visited, then recurse on every unvisited neighbor.

import java.util.*;

void dfs(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
    visited.add(node);
    System.out.println("Visited: " + node);

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited.contains(neighbor)) {
            dfs(graph, neighbor, visited);
        }
    }
}

// Usage
var graph = Map.of(
    0, List.of(1, 2),
    1, List.of(3),
    2, List.of(4),
    3, List.of(),
    4, List.of()
);
var visited = new HashSet<Integer>();
dfs(graph, 0, visited);

The adjacency list representation Map<Integer, List<Integer>> stores each vertex as a key and its neighbors as the value list. Recursive DFS pushes frames onto the call stack — one frame per vertex on the current path.

Iterative DFS

Recursive DFS has a limit: the JVM call stack. A graph with 100,000 nodes in a chain triggers a StackOverflowError. The fix is to replace the implicit call stack with an explicit Deque:

void dfsIterative(Map<Integer, List<Integer>> graph, int start) {
    var visited = new HashSet<Integer>();
    Deque<Integer> stack = new ArrayDeque<>();
    stack.push(start);

    while (!stack.isEmpty()) {
        int node = stack.pop();
        if (visited.contains(node)) continue;

        visited.add(node);
        System.out.println("Visited: " + node);

        for (int neighbor : graph.getOrDefault(node, List.of())) {
            if (!visited.contains(neighbor)) {
                stack.push(neighbor);
            }
        }
    }
}

When to prefer iterative DFS: Use the iterative version whenever the graph is deep or the depth is unpredictable. Interview problems involving large grids (1000×1000) or long chains are prime candidates. The iterative approach also makes it easier to manage additional state alongside the stack — for example, tracking path length or accumulated cost.

Pre-order vs Post-order DFS

The distinction between pre-order and post-order comes down to when you process a node relative to its descendants.

Pre-order processes the node before recursing into children:

void dfsPreOrder(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
    visited.add(node);
    process(node); // ← BEFORE children

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited.contains(neighbor)) {
            dfsPreOrder(graph, neighbor, visited);
        }
    }
}

Use pre-order when you need to record entry order — copying a tree, serializing a structure, or assigning discovery timestamps.

Post-order processes the node after all descendants have been visited:

void dfsPostOrder(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
    visited.add(node);

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited.contains(neighbor)) {
            dfsPostOrder(graph, neighbor, visited);
        }
    }

    process(node); // ← AFTER children
}

Use post-order when the result depends on children being processed first — computing subtree sizes, evaluating expression trees, or building topological orderings (more on this below).

DFS on Grids

Many interview problems present a 2D grid as an implicit graph. Each cell is a vertex; its up/down/left/right neighbors are edges. A direction array record keeps the movement logic clean:

record Direction(int dr, int dc) {}

static final Direction[] DIRS = {
    new Direction(-1, 0), // up
    new Direction(1, 0),  // down
    new Direction(0, -1), // left
    new Direction(0, 1)   // right
};

Number of Islands (Flood Fill)

Given a grid of '1' (land) and '0' (water), count the number of islands. Each island is a group of connected '1' cells.

int numIslands(char[][] grid) {
    int rows = grid.length, cols = grid[0].length;
    int count = 0;

    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r][c] == '1') {
                count++;
                floodFill(grid, r, c, rows, cols);
            }
        }
    }
    return count;
}

void floodFill(char[][] grid, int r, int c, int rows, int cols) {
    if (r < 0 || r >= rows || c < 0 || c >= cols || grid[r][c] != '1') return;

    grid[r][c] = '0'; // mark visited by sinking the island

    for (var d : DIRS) {
        floodFill(grid, r + d.dr(), c + d.dc(), rows, cols);
    }
}

The trick: instead of maintaining a separate visited set, mutate the grid itself — change '1' to '0' as you visit. This saves space and is the standard approach for “destructive” DFS on grids.

Related grid problems that use the same DFS flood-fill pattern:

  • Surrounded Regions: Identify 'O' regions NOT connected to the border, then flip them to 'X'. Start DFS from border 'O' cells to mark safe regions first.
  • Pacific Atlantic Water Flow: Run DFS from each ocean’s border inward, tracking which cells can flow to each ocean. The answer is cells reachable from both.

DFS for Cycle Detection

Directed Graphs: Three-Color Marking

In a directed graph, a cycle exists if and only if DFS encounters a back edge — an edge pointing to a vertex currently on the recursion stack. The three-color scheme tracks this precisely:

enum Color { WHITE, GRAY, BLACK }

boolean hasCycleDirected(Map<Integer, List<Integer>> graph, int numVertices) {
    var color = new Color[numVertices];
    Arrays.fill(color, Color.WHITE);

    for (int v = 0; v < numVertices; v++) {
        if (color[v] == Color.WHITE && dfsCycleDirected(graph, v, color)) {
            return true;
        }
    }
    return false;
}

boolean dfsCycleDirected(Map<Integer, List<Integer>> graph, int node, Color[] color) {
    color[node] = Color.GRAY; // currently on recursion stack

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (color[neighbor] == Color.GRAY) return true;  // back edge → cycle
        if (color[neighbor] == Color.WHITE && dfsCycleDirected(graph, neighbor, color)) {
            return true;
        }
    }

    color[node] = Color.BLACK; // fully processed
    return false;
}
  • WHITE: not yet visited.
  • GRAY: currently being explored (on the recursion stack).
  • BLACK: fully processed — all descendants explored.

An edge from a GRAY node to another GRAY node is a back edge, confirming a cycle.

Undirected Graphs: Parent Tracking

In undirected graphs, every edge appears in both directions. To avoid false positives, track the parent of each node during DFS. A cycle exists if you reach a visited node that is NOT your parent:

boolean hasCycleUndirected(Map<Integer, List<Integer>> graph, int numVertices) {
    var visited = new boolean[numVertices];

    for (int v = 0; v < numVertices; v++) {
        if (!visited[v] && dfsCycleUndirected(graph, v, -1, visited)) {
            return true;
        }
    }
    return false;
}

boolean dfsCycleUndirected(Map<Integer, List<Integer>> graph, int node, int parent,
                           boolean[] visited) {
    visited[node] = true;

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited[neighbor]) {
            if (dfsCycleUndirected(graph, neighbor, node, visited)) return true;
        } else if (neighbor != parent) {
            return true; // visited neighbor that isn't our parent → cycle
        }
    }
    return false;
}

Topological Sort with DFS

A topological ordering arranges vertices in a directed acyclic graph (DAG) so that every edge points from an earlier vertex to a later one. Think of it as a valid execution order for tasks with dependencies.

The DFS approach exploits post-order: when a vertex finishes (all descendants explored), push it onto a result stack. The reversed post-order IS the topological sort:

List<Integer> topologicalSort(Map<Integer, List<Integer>> graph, int numVertices) {
    var visited = new boolean[numVertices];
    Deque<Integer> stack = new ArrayDeque<>();

    for (int v = 0; v < numVertices; v++) {
        if (!visited[v]) {
            dfsTopoSort(graph, v, visited, stack);
        }
    }

    return new ArrayList<>(stack);
}

void dfsTopoSort(Map<Integer, List<Integer>> graph, int node, boolean[] visited,
                 Deque<Integer> stack) {
    visited[node] = true;

    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited[neighbor]) {
            dfsTopoSort(graph, neighbor, visited, stack);
        }
    }

    stack.push(node); // post-order: add AFTER all descendants
}

Application — Course Schedule II: Given numCourses and prerequisite pairs, return a valid course order. Model courses as vertices and prerequisites as directed edges, then run topological sort. If a cycle exists (checked with three-color DFS), no valid ordering exists — return an empty array.

DFS Tree and Edge Classification

When DFS traverses a directed graph, it classifies every edge into one of four types:

Edge TypeDescriptionSignificance
Tree edgeEdge to a WHITE (unvisited) vertexForms the DFS tree
Back edgeEdge to a GRAY (in-progress) vertexIndicates a cycle
Forward edgeEdge to a BLACK vertex with higher discovery timeSkips ahead in the DFS tree
Cross edgeEdge to a BLACK vertex with lower discovery timeConnects different branches

The critical insight: a directed graph has a cycle if and only if DFS finds a back edge. This is why the three-color scheme works for cycle detection — GRAY-to-GRAY edges are exactly the back edges.

Complexity Analysis

MetricValue
TimeO(V + E) — each vertex visited once, each edge examined once
SpaceO(V) — visited set + recursion stack (or explicit stack)

For grids with R rows and C columns: V = R × C, E = up to 4 × R × C, so overall O(R × C).

DFS vs BFS: When to Choose DFS

ScenarioChoose DFS
Exhaustive search (find ALL solutions)✓ — DFS naturally explores all branches
Cycle detection✓ — three-color marking with DFS
Topological sort✓ — post-order gives topological ordering
Path finding in mazes✓ — goes deep, finds A path (not necessarily shortest)
Connected components✓ — one DFS per component
Shortest path (unweighted)✗ — use BFS instead
Level-order processing✗ — use BFS instead

Interviewer Tips

  • State the traversal strategy first: “I will use DFS because I need to explore all possibilities” or “I need post-order processing for topological sort.” This shows intentional algorithm selection.
  • Choose iterative for large inputs: Mention the StackOverflowError risk with recursive DFS on deep graphs. Interviewers appreciate candidates who consider practical constraints.
  • Clarify the graph type: Directed vs undirected changes the cycle detection approach. Always ask.
  • Grid problems are graph problems: Explicitly say “I will treat this grid as a graph where each cell is a vertex with up to 4 neighbors.” This reframing signals strong problem-solving skills.
  • Watch for implicit visited marking: On grid problems, mutating the input (sinking islands) eliminates the need for a separate visited set. Mention the trade-off: faster but destructive.
  • Know when DFS is wrong: If the interviewer asks for the shortest path in an unweighted graph, DFS gives A path but not the shortest. Switch to BFS immediately.