Core Data Structures in Java 21+ - java interview engineering first principles to system design • Dev|Journal

Introduction to Core Data Structures

Building on the Java 21+ foundations established in Chapter 1—Records for immutable data carriers, sealed classes for type-safe hierarchies, pattern matching for exhaustive handling, and virtual threads for scalable concurrency—this chapter focuses on core data structures essential for technical interviews. The goal is to enable rapid selection between structures like ArrayList and LinkedList or HashMap and TreeMap based on operation frequencies and complexity requirements, with justifications delivered in under 30 seconds. Mastery involves not only implementation but also explicit analysis of time and space trade-offs, leveraging modern Java features to enhance code clarity and performance.

Data structures in Java are primarily defined through the Collections Framework interfaces: List, Set, and Map. As introduced in prerequisite summaries, Java Records reduce boilerplate for immutable data carriers, sealed classes enable closed hierarchies, and pattern matching simplifies data extraction—all critical for implementing custom structures when standard classes fall short. For instance, the Expression interface from CH1-S1 demonstrates how sealed hierarchies with Records can model recursive data like trees, while DuplicateFinder from CH1-S3 shows HashMap usage for duplicate detection with O(n) average time complexity. This chapter extends these concepts to core structures, emphasizing analytical comparisons and production-ready code.

Immutable Data Structures with Java 21+ Features

Immutable data structures offer thread-safety and predictability, often required in interview scenarios where state changes must be controlled. Java 21+ enhances this with Records and sealed classes, allowing concise implementations. For example, an immutable linked list can be built using a sealed interface and Records, as shown below. This implementation uses pattern matching in switch expressions for exhaustive handling, ensuring type safety and reducing errors.

public sealed interface ImmutableList<T> permits Nil, Cons {}
record Nil<T>() implements ImmutableList<T> {}
record Cons<T>(T head, ImmutableList<T> tail) implements ImmutableList<T> {}

static <T> int size(ImmutableList<T> list) {
    return switch(list) {
        case Nil<T> n -> 0;
        case Cons<T> c -> 1 + size(c.tail());
    };
}
// Time Complexity: O(n) due to linear traversal, Space Complexity: O(n) due to recursion stack in worst case.

This code defines an ImmutableList with two permitted subtypes: Nil for an empty list and Cons for a node with a head and tail. The size method uses pattern matching to traverse the list recursively. Time complexity is O(n) because each element is visited once, and space complexity is O(n) in the worst case due to recursion depth matching the list length. Common interview mistakes include forgetting to handle all cases in the switch or using mutable fields in Records, which violates immutability—addressed in failure modes later. This approach contrasts with mutable structures like ArrayList, where internal array resizing can lead to O(n) amortized insertion costs.

Time and Space Complexity Analysis

Selecting the optimal data structure requires rigorous complexity analysis using Big-O notation. The following table summarizes key performance characteristics for core structures, based on average and worst-case scenarios. This aligns with the analytical synthesis style, comparing access, insertion, and deletion operations.

Data Structure	Access (get)	Insertion (beginning)	Deletion (beginning)	Space Complexity
ArrayList	O(1)	O(n)	O(n)	O(n)
LinkedList	O(n)	O(1)	O(1)	O(n)
HashMap	O(1) avg	O(1) avg	O(1) avg	O(n)
TreeMap	O(log n)	O(log n)	O(log n)	O(n)

ArrayList provides O(1) time complexity for get(int index) due to direct array access, but insertions or deletions at arbitrary positions are O(n) because elements must be shifted. LinkedList, in contrast, offers O(1) for addFirst() and addLast() by adjusting node pointers, but index-based access is O(n) due to traversal. HashMap has average O(1) operations for put and get, but worst-case O(n) when hash collisions occur—a topic reserved for Chapter 4. TreeMap guarantees O(log n) operations due to its Red-Black tree structure, maintaining sorted order.

Memory overhead is equally critical. ArrayList memory layout includes an object header for the instance plus an internal array of references to elements; each reference costs 4-8 bytes on 32/64-bit JVMs, with resizing overhead during growth. LinkedList memory layout involves per-node overhead: an object header, data reference, next pointer, and previous pointer, leading to higher per-element costs. This JVM-specific consideration influences space complexity, which is O(n) for both but with different constant factors. For I/O-bound tasks, virtual threads in Java 21+ can reduce memory overhead to about 2KB per thread, improving scalability when data structures involve blocking operations, as seen in the ParallelFileProcessor from CH1-S2.

Trade-offs in Data Structure Selection

Explicit trade-offs guide rapid selection. The matrix below compares pros and cons, helping candidates justify choices based on operation frequencies.

Data Structure	Pros	Cons	Best For
ArrayList	Fast random access (O(1)), lower memory overhead	Slow insertions/deletions in middle (O(n))	Frequent access, infrequent modifications
LinkedList	Fast insertions/deletions at ends (O(1))	Slow random access (O(n)), higher memory use	Frequent insertions/deletions, queues
HashMap	Fast average operations (O(1)), flexible keys	Worst-case O(n) collisions, no order	Key-value lookups, caching
TreeMap	Sorted order, predictable O(log n) operations	Slower than HashMap on average, more memory	Range queries, ordered traversals

ArrayList provides fast random access at the cost of slow modifications, making it ideal for scenarios with frequent reads but rare writes. LinkedList excels in frequent insertions or deletions at ends, such as in queue implementations, but suffers from higher memory use and slow access. HashMap offers flexibility and average O(1) performance, best for caching or lookups, but lacks order and can degrade to O(n) in collisions. TreeMap ensures sorted traversal with predictable O(log n) operations, suitable for range queries, at the expense of slower average performance compared to HashMap. These trade-offs must be stated explicitly: for example, ArrayList provides O(1) access at the cost of O(n) insertions, while LinkedList provides O(1) insertions at the cost of O(n) access.

Common Failure Modes and Edge Cases

Interview pitfalls often stem from oversight. The following checklist identifies failure modes to avoid during implementation and selection:

Not handling null inputs in data structure methods, leading to NullPointerException.
Incorrect time complexity analysis, e.g., stating O(1) for LinkedList get without considering traversal.
Ignoring space complexity, such as forgetting auxiliary storage in HashMap for duplicate detection.
Using mutable fields in Records, violating immutability and causing unintended side effects.
Missing edge cases like empty collections or single-element scenarios in implementations.
Thread-safety issues when using non-concurrent data structures in multi-threaded environments without synchronization.

For instance, in the ImmutableList implementation, failing to validate inputs in Record constructors could lead to invalid states. Similarly, when using HashMap, not accounting for worst-case O(n) collisions might mislead complexity analysis. Edge cases include empty arrays in ArrayList operations or null keys in TreeMap, which must be tested rigorously. By adhering to Java 21+ features like compact constructors in Records for validation, candidates can mitigate these risks.

Interview Patterns for Rapid Selection

A structured approach ensures efficient decision-making. The template below, derived from interview patterns, guides candidates from problem analysis to implementation:

Understand problem constraints: Identify required operations (access, insertion, deletion) and their frequencies.
Analyze time and space complexity: Use Big-O notation to evaluate candidate data structures.
Select optimal structure: Choose based on trade-offs (e.g., ArrayList for fast access, LinkedList for frequent insertions).
Implement with Java 21+ features: Use Records for data carriers, sealed classes for hierarchies, pattern matching for handling.
Test edge cases: Include null checks, empty inputs, and performance boundaries.
Re-verify complexities: Ensure implementation matches analyzed time and space complexity.

For example, given a problem requiring frequent insertions at the beginning and infrequent access, LinkedList is optimal due to O(1) insertion complexity, while ArrayList would incur O(n) costs. Implementation should leverage modern Java: if a custom immutable structure is needed, use Records and sealed classes as shown earlier. This pattern integrates with prerequisite knowledge, such as the six-step template from CH1-S3 for complexity analysis, ensuring a cohesive approach.

Conclusion

Mastering core data structures in Java 21+ involves not only memorizing complexities but also applying modern language features to build robust, interview-ready solutions. By analytically comparing structures like ArrayList vs. LinkedList and HashMap vs. TreeMap, candidates can justify selections in under 30 seconds based on operation trade-offs. The integration of Records, sealed classes, pattern matching, and virtual threads enhances code quality, while explicit complexity analysis and failure mode awareness prevent common pitfalls. This chapter sets the foundation for advanced topics in subsequent chapters, where algorithmic patterns and concurrency will be explored.