The N+1 You Didn’t Write

The Lie

Hibernate tells you that lazy loading is an optimization. Load the parent entity, and the children will be fetched only when you actually need them. Sounds reasonable. The documentation even frames FetchType.LAZY as the responsible default, the grown-up choice compared to eager loading’s indiscriminate data hoarding.

The Reality

Lazy loading defers queries. It does not eliminate them. When you iterate over a collection of parent entities and touch any lazy association on each one, Hibernate fires a separate SELECT for every single parent. You wrote zero loops over the database. Hibernate wrote them for you.

This is not a bug. It is the direct consequence of how Hibernate’s proxy mechanism works. When you declare a @OneToMany relationship as lazy, Hibernate does not load the child collection. Instead, it injects a proxy (a PersistentBag, PersistentSet, or PersistentList) that intercepts the first access and fires a query. Each proxy instance is bound to its parent entity’s foreign key. There is no mechanism in the proxy to batch-load across parents.

The result: one query for the parent list, then N queries for the children. The N+1.

N+1 Query Execution Flow

Every bar in the right column represents a round trip your application makes to the database. The gap between 1 expected query and 101 actual queries is the N+1 problem in concrete numbers. The fix (JOIN FETCH) collapses that to a single bar, at the cost of a wider result set that the database handles in one pass.

The Evidence

@Entity
@Table(name = "orders")
public class Order {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String customerName;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<OrderItem> items = new ArrayList<>();

    // getters
}

@Entity
@Table(name = "order_items")
public class OrderItem {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String productName;
    private int quantity;
    private BigDecimal price;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "order_id")
    private Order order;

    // getters
}

// BAD: This triggers N+1
@Service
public class OrderService {
    @Autowired
    private OrderRepository orderRepository;

    @Transactional(readOnly = true)
    public List<OrderSummary> getOrderSummaries() {
        List<Order> orders = orderRepository.findAll(); // 1 query
        return orders.stream()
            .map(order -> new OrderSummary(
                order.getId(),
                order.getCustomerName(),
                order.getItems().size() // N queries, one per order
            ))
            .toList();
    }
}

// Generated SQL:
// select o1_0.id, o1_0.customer_name from orders o1_0
// select i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity from order_items i1_0 where i1_0.order_id=?
// select i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity from order_items i1_0 where i1_0.order_id=?
// select i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity from order_items i1_0 where i1_0.order_id=?
// ... repeated for every order in the database

With hibernate.generate_statistics=true, you see the damage in the log:

Session Metrics {
    23456 nanoseconds spent acquiring 1 JDBC connections;
    0 nanoseconds spent releasing 0 JDBC connections;
    3456789 nanoseconds spent preparing 101 JDBC statements;
    56789012 nanoseconds spent executing 101 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
}

101 statements. One for the parent query, 100 for 100 orders.

The Fix

Three approaches, each with different characteristics.

1. JOIN FETCH in JPQL

// BETTER: Single query with JOIN FETCH
public interface OrderRepository extends JpaRepository<Order, Long> {
    @Query("SELECT o FROM Order o JOIN FETCH o.items")
    List<Order> findAllWithItems();
}

// Generated SQL:
// select o1_0.id, o1_0.customer_name, i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity
// from orders o1_0
// join order_items i1_0 on o1_0.id=i1_0.order_id

One query. The join fetches everything in a single round trip. But there is a cost: the result set contains duplicated parent columns for every child row. If each order has 10 items, the parent columns are repeated 10 times. At scale, this inflates network traffic and memory consumption.

2. @EntityGraph

// BETTER: EntityGraph approach
public interface OrderRepository extends JpaRepository<Order, Long> {
    @EntityGraph(attributePaths = {"items"})
    @Query("SELECT o FROM Order o")
    List<Order> findAllWithItemsGraph();
}

// Generated SQL:
// select o1_0.id, o1_0.customer_name, i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity
// from orders o1_0
// left join order_items i1_0 on o1_0.id=i1_0.order_id

EntityGraph produces a LEFT JOIN by default, which means orders without items are still included. JOIN FETCH produces an inner join. Pick based on whether you need orphan parents.

3. @BatchSize

// BETTER: Batch loading approach
@Entity
@Table(name = "orders")
public class Order {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String customerName;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    @BatchSize(size = 25)
    private List<OrderItem> items = new ArrayList<>();
}

// Generated SQL:
// select o1_0.id, o1_0.customer_name from orders o1_0
// select i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity
//   from order_items i1_0 where i1_0.order_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
// select i1_0.order_id, i1_0.id, i1_0.price, i1_0.product_name, i1_0.quantity
//   from order_items i1_0 where i1_0.order_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

@BatchSize keeps lazy loading but batches the proxy initialization queries. Instead of 100 individual SELECTs, Hibernate fires ceil(100/25) = 4 queries using IN clauses. This is the least invasive fix: you do not change your query, your repository interface, or your service code.

The Cost Model

Approach	Queries (100 orders, 10 items each)	Result Set Rows	Memory Overhead
N+1 (default)	101	100 + 1000	Low per query, high total latency
JOIN FETCH	1	1000 (with duplication)	High: duplicated parent columns
EntityGraph	1	1000 (with duplication)	High: duplicated parent columns
@BatchSize(25)	5	100 + 1000	Moderate: separate result sets

At 100 orders: the N+1 costs roughly 100ms of additional network round-trip time on a local database, 2-5 seconds on a remote database with 20ms latency per round trip. JOIN FETCH brings it to one round trip but transfers more bytes.

At 10,000 orders: N+1 is catastrophic. 10,001 queries. Even at 1ms each, that is 10 seconds of pure query execution, ignoring network. JOIN FETCH at this scale produces a result set with 100,000 rows of duplicated parent data. @BatchSize with size 100 produces 101 queries, which is manageable, but a JOIN FETCH or subselect fetch is still faster.

At 100,000 orders: none of these approaches work well. You need pagination, and the answer is in Chapter 7 (projection queries) and Chapter 4 (bulk operations).

Hibernate N+1 Query Flow

The diagram above traces how a single findAll() call fans out into N+1 database round trips. The persistence context holds proxy references for each lazy collection. When application code touches any proxy, it triggers an individual SELECT. The key insight: the decision to fire those queries happens inside the proxy, not in your code. You never wrote a loop over the database. Hibernate did.