Java ArrayList — 38 Resizes, 225M Copies, Monday OOMs
A 15M-row ArrayList without initialCapacity triggered 38 resizes and heap fragmentation.
- ArrayList is a resizable array implementation of the List interface — grows automatically when full, unlike a primitive array
- Key components: backing Object[] array, size field (element count), capacity (array length, always ≥ size)
- Performance: add() amortized O(1) — occasional resize cost O(n) but rare. get(index) O(1). add(index, element) O(n). remove(index) O(n)
- Growth factor: 1.5x (increase by 50%) — prevents excessive memory waste while keeping amortized cost low
- Production trap: No initial capacity hint in loop that adds millions of elements — repeated resizes cause O(n²) copies and massive GC pressure
- Biggest mistake: Removing elements inside for-each loop — throws ConcurrentModificationException at runtime; use iterator.remove() or removeIf()
Imagine you're organising a birthday party and you write a guest list on a sticky note — but you only have space for 10 names. Every time someone new RSVPs, you have to grab a bigger sticky note and copy everyone over. An ArrayList is Java's way of handling that automatically. You just keep adding names, and it quietly grabs more space behind the scenes whenever it runs out. No manual copying, no guessing how many guests you'll have upfront.
Every non-trivial Java application manages collections of data — a shopping cart full of products, a feed full of posts, a queue full of tasks. The humble array can do this, but only if you know exactly how many items you'll have before you write a single line of code. In the real world, you almost never do. ArrayList exists precisely because the world is unpredictable, and your code needs to keep up.
ArrayList solves the fixed-size problem of plain arrays by wrapping one internally and automatically resizing it whenever capacity is exhausted. It gives you the fast random access of an array (jump straight to index 7 without touching 0–6) while also letting you add and remove items freely at runtime. That combination — speed plus flexibility — is why ArrayList is the most-used collection class in Java by a significant margin.
By the end you'll understand not just how to declare and populate an ArrayList, but why it resizes the way it does, how that affects performance, when you should reach for something else (like LinkedList or a plain array), and the specific gotchas that cause subtle bugs. You'll also have a handful of interview-ready answers.
Creating and Populating an ArrayList — The Right Way from Day One
Declaring an ArrayList takes one line, but that one line contains decisions that matter more than they look.
First, always use the generic type parameter — ArrayList<String> instead of the raw ArrayList. Without it, Java lets you shove any object in, and the compiler can't warn you when you mix types. You'll get a ClassCastException at runtime instead of a compile-time error, and runtime surprises are the worst kind.
Second, favour the List<String> interface type on the left side of the declaration: List<String> guests = new ArrayList<>(). This is called programming to the interface. It means if you later decide to swap ArrayList for a LinkedList, you change exactly one word. Every method call you wrote still compiles. This is not just an academic best practice — it's the pattern you'll see in every professional codebase.
Third, if you have a rough idea of your list's final size, pass it as the initial capacity: new ArrayList<>(50). This pre-allocates space and avoids the internal resizing penalty we'll dig into shortly. You're not locking in a maximum — you're just giving ArrayList a head start.
List<String> items = new ArrayList<>() instead of ArrayList<String> items = new ArrayList<>(). The left-hand type is what your method signatures, fields and return types will use — keeping it as the interface means you can swap the implementation in one place without touching anything else. Interviewers notice this immediately.List<T> list = new ArrayList<>() — the left side is interface, right side is implementation. This is not optional style; it is maintainability.new ArrayList<>(estimatedCapacity) in any loop that builds large lists to avoid repeated O(n) copy operations.new ArrayList<>(exactSize). Prevents all resizes. Example: new ArrayList<>(listOfOtherThings.size()).new ArrayList<>(estimatedSize). Reduces resizes from O(log n) to O(1) or O(2). 50% over-estimate is fine; 50% under-estimate still helps.ArrayList with default capacity, but call ensureCapacity(largerEstimate) before batch add if you learn size later.trimToSize() to release unused capacity. Reduces memory footprint by up to 50% if growth factor overshot.How ArrayList Actually Resizes — Why This Changes How You Write Code
This is the section most tutorials skip, and it's exactly where performance bugs hide.
An ArrayList is backed by a plain Object[] array internally. When you create new ArrayList<>() without specifying a capacity, Java allocates an internal array of size 10. The moment you add an 11th element, ArrayList creates a new array that is 50% larger (so size 15), copies all 10 existing elements into it, and then adds your new element. This copy operation is O(n). It happens again at 16, then 24, then 36, and so on.
Most of the time this is invisible because modern hardware is fast. But if you're building a list of 10,000 items inside a tight loop — say, parsing a large CSV file — those repeated copy operations add up. The fix is simple: if you know the approximate final size, pass it to the constructor upfront. You don't need to be exact. Even a rough estimate dramatically reduces the number of resize events.
The key mental model: at the end is O(1) amortised (fast on average), but occasionally triggers an O(n) resize. add()add(index, element) in the middle is always O(n) because elements must shift. get(index) is always O(1). Understanding this lets you write code that stays fast even at scale.
ArrayList.add()?', the correct answer is O(1) amortised, not just O(1). The word 'amortised' signals that you know occasional resizes happen but that the average cost per operation over many calls remains constant. Candidates who just say 'O(1)' are half right. Candidates who explain amortised analysis stand out.new ArrayList<>(estimatedSize) to eliminate resizes entirely. For 100,000 elements, capacity hint can cut copy operations from 3N to N, a 3x performance improvement.new ArrayList<>(estimatedCapacity) in any loop that builds large lists to avoid repeated O(n) copy operations.add() is true on average, but the constant factor matters. Each resize is O(n) and can cause GC pressure.new ArrayList<>(finalSize). No resizes, no copying overhead. Ideal for CSV parsing, DB result sets.new ArrayList<>(upperBoundEstimate * 1.2). Adds a bit of waste but prevents resizes. Better to over-allocate slightly than to resize.ensureCapacity(currentSize + collection.size()) before addAll. This pre-sizes precisely, avoiding resizes during the batch copy.trimToSize() to release any unused capacity. List.copyOf() also trims implicitly.clear() (keeps backing array) rather than creating new ArrayList. This reuses capacity. To shrink, call clear(); trimToSize();Iterating, Removing and Sorting — Real-World Patterns That Don't Break
Iterating an ArrayList seems trivial until you try to remove items while iterating — then it breaks in ways that are genuinely confusing the first time you hit them.
The safest way to iterate is the enhanced for-loop when you're just reading. When you need the index, use a classic for-loop with get(i). When you need to remove items during iteration, you have two good options: use an Iterator explicitly and call , or use iterator.remove()removeIf() with a lambda (available since Java 8). Never call inside an enhanced for-loop — that causes a list.remove()ConcurrentModificationException.
Sorting an ArrayList is one line: Collections.sort(list) for natural ordering, or list.sort(Comparator.comparing(...)) when you need custom logic. The sort is backed by TimSort, an adaptive algorithm that's particularly efficient on partially-sorted data — which is exactly what real-world data tends to be.
For filtering and transformation, streams give you a clean declarative pipeline. They don't modify the original list — they produce a new one. This immutability makes your code much easier to reason about, especially in concurrent contexts.
list.remove(item) inside a for (Item i : list) loop throws ConcurrentModificationException at runtime — not at compile time, so the compiler won't save you. Use list.removeIf(predicate) for simple cases, or an explicit Iterator with iterator.remove() when you need more control. This is one of the most common ArrayList bugs in code reviews.list.remove() increments modCount but doesn't update the iterator's expectedModCount.next() call after modification, not immediately. This makes debugging harder.remove(), or use removeIf(). For concurrent access, use CopyOnWriteArrayList.list.remove() inside a for-each loop — use list.removeIf() or an explicit Iterator to avoid ConcurrentModificationException.list.removeIf(element -> condition). Cleanest, most readable, single line. Available since Java 8.Iterator<T> it = list.iterator(); while (it.hasNext()) { T t = it.next(); if (complexCondition(t, index)) it.remove(); }it.remove() removes the last element returned by it.next(), which is perfect.CopyOnWriteArrayList (if reads dominate) or synchronise externally with Collections.synchronizedList(list). Neither is ideal; re-evaluate design if concurrent modification is common.for (int i = list.size()-1; i >= 0; i--) { if (condition) list.remove(i); }ArrayList vs Array — Which One to Use?
The decision between ArrayList and a plain array comes down to three factors: whether the size is fixed, whether you need primitives, and performance requirements.
Size flexibility: Arrays have a fixed length set at creation new int[10]. You cannot add or remove elements. ArrayList grows and shrinks dynamically. If you know the exact number of elements and that count won't change, arrays are simpler and faster. But in most business applications, the size is unknown ahead of time, making ArrayList the default choice.
Type safety: Arrays support primitives (int[], char[]) directly, saving memory and avoiding boxing overhead. ArrayList always works with objects — primitives are autoboxed into wrapper types (Integer, Double, etc.), which adds memory and a small performance cost. For large numeric collections, arrays are significantly more efficient.
Performance: Array access is slightly faster than ArrayList (no method call overhead, no bounds checking beyond what the JVM already does). For read-heavy workloads the difference is negligible. For write-heavy workloads with many add/remove operations in the middle, the difference matters because arrays require manual resizing and shifting.
API convenience: ArrayList provides a rich set of methods (add, remove, contains, sort, stream) that you'd have to implement manually on an array. This reduces boilerplate and bugs.
Memory: Array of 1000 int elements takes ~4KB. ArrayList of 1000 Integer elements takes ~24KB (object headers + reference pointers) plus the array object itself. The overhead is significant.
When to choose array: (1) Fixed size known at compile time. (2) Working with large numbers of primitives (int, long, double). (3) Ultra-low-latency code where every nanosecond counts — though ArrayList is already extremely fast. (4) When you need multidimensional structures with consistent dimensions.
When to choose ArrayList: (1) Size is not known upfront. (2) You need add/remove/contains/sort out of the box. (3) You are storing objects (not primitives) — the overhead of ArrayList vs array of references is minimal. (4) You want to use streams or other modern Java features.
List<Integer> with 1 million elements takes about 24 MB of memory due to Integer object overhead. An int[] with the same 1 million ints takes only 4 MB. If you're processing millions of numbers, use int[] or consider libraries like Trove that provide primitive collections. For most enterprise apps with objects (Strings, Products, Orders), ArrayList is the right choice.ArrayList vs LinkedList — When to Choose Which
ArrayList and LinkedList are the two most common List implementations, but they have dramatically different performance characteristics. Choosing the wrong one can make your application an order of magnitude slower.
Random access (get/set): ArrayList provides O(1) random access — direct array indexing. LinkedList requires walking from the head or tail, O(n). If your code frequently reads elements by index, ArrayList is the clear winner.
Insert/delete at ends: ArrayList's add() at end is O(1) amortised (fast). LinkedList's add() at end is O(1) always (just append a node). However, ArrayList's add at beginning is O(n) because all elements shift. LinkedList's add at beginning is O(1). Use LinkedList if you frequently insert or remove at the head of the list — for example, a FIFO queue.
Insert/delete in middle: ArrayList requires shifting elements — O(n). LinkedList requires O(1) if you already have the node (via iterator), but O(n) to find the position otherwise. In practice, LinkedList is slower for middle operations because traversing to the index is O(n) anyway.
Memory: ArrayList stores only the object references in a contiguous array (plus some overhead for unused capacity). LinkedList stores each element as a separate Node object with two extra references (prev, next). For 1 million elements, ArrayList uses ~8 MB (plus capacity overhead), LinkedList uses ~24 MB for nodes alone (plus element overhead). LinkedList also has poor cache locality — traversing nodes follows pointers scattered across heap memory, while ArrayList's contiguous array is CPU-cache-friendly.
Conclusion: Choose ArrayList as the default. Choose LinkedList only when (1) you are adding/removing at both ends frequently, and (2) you never need random access by index. For most real-world applications, ArrayList is faster even when LinkedList's theoretical complexity suggests otherwise, because CPU cache hits matter more than algorithmic complexity.
Advantages and Disadvantages of ArrayList
Every developer should know the trade-offs of ArrayList before using it in production.
Advantages: 1. Fast random access — get(index) is O(1). This is the #1 reason to use ArrayList over other List implementations. 2. Append is fast — add() at end is O(1) amortised. For most workloads, this is the operation you do most. 3. Cache-friendly — elements are stored contiguously in memory, so iterating sequentially is CPU-cache efficient. 4. Rich API — built-in methods for sorting, searching, sublisting, and streaming. 5. Flexible growth — automatically resizes, so you don't need to manage capacity manually (though you should hint it). 6. Serializable — ArrayList implements Serializable, making it easy to persist or send over the network.
Disadvantages: 1. Insert/delete in middle is slow — O(n) because elements shift. If your code frequently inserts or removes from the middle of large lists, ArrayList is a poor choice. 2. No primitives — ArrayList<Integer> boxes each int to an Integer object, increasing memory and GC pressure. For large numeric collections, use primitive arrays or specialised libraries. 3. Not thread-safe — concurrent modifications cause ConcurrentModificationException or data corruption. Must synchronize externally or use CopyOnWriteArrayList. 4. Resizing cost — occasional O(n) resizes cause latency spikes. Pre-sizing mitigates this, but if you can't guess the size, you pay the cost. 5. Unused capacity overhead — the backing array may be larger than the actual size, wasting memory. trimToSize() helps but is not always called. 6. No fast contains for custom objects — contains() does a linear scan using equals(). For large lists, consider HashSet for uniqueness checks.
- Insert/remove frequently at the beginning of the list (use LinkedList or ArrayDeque).
- Need to store millions of primitive values (use primitive arrays).
- Multithreaded writes without external synchronization (use CopyOnWriteArrayList).
- Need to check existence frequently and order doesn't matter (use HashSet).
Verdict: ArrayList is the Swiss Army knife of Java collections — it's good for most tasks, but not all. Understand its limits and you'll choose the right tool every time.
Sorting ArrayList with Comparator — Sort by Name, Price, or Custom Logic
Sorting an ArrayList is trivial with or list.sort()Collections.sort(). When you need a custom order (by name alphabetically, by price descending, by date, etc.), you provide a Comparator.
Java 8 introduced the Comparator.comparing() factory methods, making custom sorting highly readable. You chain methods like .thenComparing() for tie-breaking and .reversed() for descending order.
For simple natural order (String alphabetical, Integer numeric), implement Comparable or use Collections.sort(list). For objects like products, orders, or employees, define a Comparator either as a lambda or using method references.
Important: Sorting modifies the list in place. It does not create a new list. If you need to preserve the original order, copy the list first: new ArrayList<>(original).
Stability: The sort is stable — equal elements maintain their relative order. This matters when you sort by multiple fields: a primary sort (e.g., price) then secondary sort (e.g., name) will keep ties in name order from the original list.
Performance: Sorting is O(n log n) backed by TimSort, which is adaptive — it detects already-sorted runs and performs nearly O(n) on partially sorted data. This is huge for real-world data that's often nearly sorted (e.g., a CSV that's already sorted by ID and you re-sort by date).
Comparator.comparing(ClassName::field) for simple field comparisons — it's concise and readable. Use lambdas like (a, b) -> Double.compare(a.price(), b.price()) when you need custom logic (e.g., handling nulls, ignoring case). The .reversed() and .thenComparing() methods chain naturally for multi-field sorts.list.sort(Comparator.comparing(...)) for clean, readable custom sorting. Chain .reversed() and .thenComparing() for multi-field sorts. Remember that sort is in-place — copy the list if you need to keep the original order.Practice Problems — Test Your ArrayList Knowledge
Working through these problems will solidify your understanding of ArrayList's strengths and pitfalls. Try to solve each one without looking at hints first — then check the concepts they test.
Problem 1: Merge Two Sorted Lists Given two ArrayList<Integer> that are individually sorted in ascending order, merge them into a single sorted ArrayList without using Collections.sort(). This tests your understanding of two-pointer merge and ArrayList's add() with index.
Problem 2: Remove Duplicates in Place Given an ArrayList of integers, remove all duplicate values while preserving the order of first occurrence. Do not use a Set. This tests you on iteration and removal patterns (use Iterator or removeIf).
Problem 3: Rotate the List Write a method that rotates an ArrayList by k positions to the right. For example, [1,2,3,4,5] rotated by 2 becomes [4,5,1,2,3]. Try to do this in O(n) time and O(1) extra space (excluding the list itself). You'll use subList and manual index manipulation.
Problem 4: Find the Middle Element Given an ArrayList, find the middle element(s). Use two-pointer technique: one moves one step, the other moves two steps. This mimics the classic linked list slow/fast pointer but on an ArrayList you also can just use index math. However, the challenge is to do it without using size() more than once.
Problem 5: Custom Sort by Frequency Given an ArrayList<String>, sort it so that elements are ordered by frequency (most frequent first), and for ties, alphabetically. Use a HashMap to count frequencies, then implement a Comparator that uses the map. This combines maps, comparators, and ArrayList sorting.
Why these problems matter: Each one forces you to reason about ArrayList's internal structure. Problem 1 teaches you that index-based insertion is O(n) — merge using add at end, not insert at front. Problem 2 drives home the safe removal pattern. Problem 3 tests your ability to manipulate the backing array mentally. Problem 4 shows you the limits of direct indexing. Problem 5 mixes external data structures with ArrayList's sort.
The 47 OutOfMemoryError Emails on Monday Morning
List<String> values = new ArrayList<>();. For a 15M-row file, ArrayList started at capacity 10, then resized at 10, 15, 22, 33, 49, 73, 109, 163, 244, 366, 549, 823, 1234, 1851, 2776, 4164, 6246, 9369, 14053, 21079, 31618, 47427, 71140, 106710, 160065, 240097, 360145, 540217, 810325, 1,215,487, 1,823,230, 2,734,845, 4,102,267, 6,153,400, 9,230,100, 13,845,150, 20,767,725. That's 38 resize events. Final allocation at 15M required a 15M-element array, but the heap had been fragmented by the 37 previous temporary arrays that were GC'd but left gaps. The JVM couldn't find a contiguous 15M-slot block. The Friday file with 1M rows had only 20 resizes and succeeded.List<String> values = new ArrayList<>(expectedRowCount + 1000); The expected row count was known from file metadata — use it.
2. For unknown sizes, used ArrayList with initial capacity of 1M for large imports, plus ensureCapacity() when new max discovered.
3. Switched to LinkedList for intermediate processing when final size wasn't needed as array, but that wasn't the fix.
4. The real fix: processed the CSV in streaming fashion — parse, transform, write to DB, discard — never held all rows in memory. ArrayList wasn't needed at all for final storage.
5. Added a metrics alert when ArrayList growth factor causes capacity to double more than 5 times within a single operation.- Always provide an initial capacity hint when you know expected size, especially for large lists. Even a rough estimate (like 'size will be > 1000') dramatically reduces resizes.
- ArrayList's 1.5x growth factor triggers O(n log n) total copies — for 15M elements, that's ~225M element moves, not 15M. Each move is a full array copy.
- Large ArrayLists are not just memory-heavy — they cause heap fragmentation. Each resize allocates a new array, leaving the old array for GC, fragmenting heap and making large contiguous allocations impossible.
- If you don't need random access by index, consider processing data in streaming fashion instead of loading all into memory. Often, ArrayList is the wrong abstraction for batch jobs.
trimToSize() after building list. If size is huge, you may need streaming instead of in-memory collection.Iterator with iterator.remove() or list.removeIf(predicate). For multi-threaded access, use CopyOnWriteArrayList or synchronised wrapper.new ArrayList<>(expectedSize). Use ensureCapacity() before bulk add.list.indexOf(element) returned -1 (not found), and you used result as index. Always check int idx = list.indexOf(x); if (idx >= 0) { list.get(idx); }trimToSize() after finishing modifications to release unused slots. For repeated patterns, re-use list and call clear() (doesn't release backing array) or assign new instance.new ArrayList<>(estimatedSize). For unknown size, process in chunks or use streaming.Common mistakes to avoid
5 patternsRemoving elements inside a for-each loop
list.removeIf(predicate) for simple filter-and-remove scenarios, or use an explicit Iterator with iterator.remove() when you need conditional logic during the loop. For multi-threaded access, use CopyOnWriteArrayList.Using `==` to compare ArrayList contents instead of `.equals()`
if (listA == listB) checks whether both variables point to the exact same object in memory, not whether they contain the same elements. Two separate ArrayLists with identical contents will always return false with ==.listA.equals(listB) instead, which compares element-by-element using each element's own .equals() method. For nested lists, use Objects.deepEquals(listA, listB).Storing Integer indexes from `indexOf()` without checking for -1
list.indexOf(element) returns -1 when the element isn't found. If you pass that -1 directly into list.get(-1) or use it in an off-by-one calculation, you'll get IndexOutOfBoundsException or silently wrong results.int idx = list.indexOf(target); if (idx != -1) { ... } before using the result. For binary search, check idx >= 0.Building large ArrayList without capacity hint
System.arraycopy consuming CPU in ArrayList.grow(). Memory usage spikes during resizes.new ArrayList<>(expectedSize). If size unknown, use ensureCapacity(growthEstimate) before batch additions. For really large lists (>10M elements), consider streaming instead of in-memory list.Using raw ArrayList without generics
ClassCastException at runtime when retrieving elements. The error stack trace points to a cast line far from the actual mistake.List<String> list = new ArrayList<>(); Never use raw ArrayList. Enable compiler warnings with -Xlint:unchecked to catch these at compile time.Interview Questions on This Topic
What is the time complexity of ArrayList.add()? Why is it called amortised O(1)?
ArrayList.add() is amortised O(1), not strictly O(1). Most add operations simply write to the next available slot in the internal array, which is O(1). However, when the internal array is full, ArrayList creates a new array 1.5x larger, copies all existing elements (O(n)), then adds the new element. This resize happens about O(log n) times over n adds. The total cost of all resizes over n adds is O(n), so the average cost per add is O(1). This is amortised analysis: occasional expensive operations are averaged over many cheap ones. The growth factor of 1.5 minimizes copies while avoiding the memory waste of doubling each time. For 1000 adds, total copies are about 1500 — average 1.5 copies per add.That's Collections. Mark it forged?
11 min read · try the examples if you haven't