C++ Iterator Invalidation — The Vector Reallocation Crash
Vector reallocation invalidates all iterators — the classic midnight crash.
20+ years shipping performance-critical C and C++ systems. Written from production experience, not tutorials.
- Iterators are the glue between algorithms and containers: a common interface that works for vector, list, map, and more.
- Five categories: Input (read-once), Output (write-once), Forward (multi-pass), Bidirectional (forward+back), Random Access (jump by index).
- Each category is a superset – Random Access can do everything below it, but algorithms require specific categories. Try using list with std::sort and the compiler stops you.
- Performance hit: std::advance on a list is O(n), but on a vector it’s O(1). The difference matters when iterating over a million elements.
- Production insight: iterator invalidation (especially on vector reallocation) is the #1 cause of silent corruption. Always re-acquire iterators after modifying the container.
- Biggest mistake: assuming all iterators support arithmetic or comparison. Pointer arithmetic only works for Random Access; for list you get a compile error.
Imagine a music playlist on your phone. You can tap the forward button to go to the next song, tap back to return to the previous one, or jump straight to song number 7. An STL iterator is exactly that — it's your 'current position' pointer inside a container like a vector or list. It lets your code navigate through data without knowing or caring how that data is stored underneath. Some iterators can only move forward (like a cassette tape), while others can jump anywhere instantly (like a digital track scrubber).
Every serious C++ program deals with collections of data — user records, sensor readings, queued tasks. The moment you need to visit, transform, or search those collections, you hit a fundamental question: how does my algorithm talk to my data structure in a way that works regardless of whether the data lives in a contiguous array, a doubly-linked list, or a balanced tree? That question is exactly what STL iterators were designed to answer, and it's why they sit at the very heart of the C++ Standard Library.
Before iterators existed, writing a generic sort or find meant either duplicating code for every container type or sacrificing type safety with raw void pointers. Iterators solve this by acting as a common contract — a standardised interface that any container can fulfil and any algorithm can depend on. The algorithm doesn't need to know it's talking to a vector; it just needs an iterator that supports the operations it requires. This separation is what makes std::sort work on a vector but std::list::sort exist as a member function — the details matter, and understanding them saves you from subtle bugs and wasted CPU cycles.
By the end of this article you'll know the five iterator categories and why the distinction between them is not just academic trivia, you'll be able to choose the right iterator for the job, write range-based patterns that professional C++ engineers use daily, avoid the two most common iterator invalidation crashes, and walk into a technical interview able to answer iterator questions with genuine depth rather than surface-level recitation.
How Vector Iterator Invalidation Works — and Why It Crashes
Iterator invalidation is the rule that an iterator, pointer, or reference to a container element becomes invalid after certain container operations. For std::vector, the most common crash trigger is reallocation: when push_back, insert, or resize exceeds capacity, the vector allocates new memory, moves all elements, and deallocates the old buffer. Every iterator into the old buffer is now dangling — dereferencing it is undefined behavior.
Reallocation happens when size == capacity. The vector typically grows by a factor of 1.5–2× (implementation-defined). After reallocation, all iterators, pointers, and references to any element are invalidated — not just those past the insertion point. This differs from list or map, where only iterators to the erased element are invalidated. The cost is O(n) copy/move per reallocation, but the real danger is silent corruption: a stale iterator may appear to work until the memory is reused.
Use vector when you need contiguous storage and amortized O(1) push_back. Reserve capacity upfront if you know the final size — this avoids reallocation entirely. When iterating and inserting, either recompute iterators after each insertion or switch to a container like deque or list that doesn't invalidate iterators on growth. In performance-critical paths, measure reallocation frequency with a memory profiler.
The Five Iterator Categories — What They Can Do and Why It Matters
The C++ standard defines iterators not by their type name but by the operations they support. Think of them as tiers of capability, each a superset of the one below it.
Input Iterator — read-only, single-pass. You can dereference it once and advance forward. Classic example: reading from std::cin via std::istream_iterator. Once you've moved past a value, it's gone.
Output Iterator — write-only, single-pass. You can assign through it and advance. std::ostream_iterator writes to a stream this way.
Forward Iterator — read and write, multi-pass. You can make a copy of the iterator, advance one copy, and the original still points to the same element. std::forward_list uses these.
Bidirectional Iterator — everything a forward iterator can do, plus operator-- to step backwards. std::list and std::map give you these.
Random Access Iterator — the full package. Jump forward or backward by any amount with +, -, [], and compare positions with < and >. std::vector and std::deque provide these, and they're what std::sort requires.
Why does the categorisation matter? Because algorithms advertise which category they require. If you try to pass a std::list iterator to std::sort, you get a compile-time error — not a silent wrong answer at runtime. The type system is doing safety work for you.
batch_process(first, last, n) that used it += n directly.it += n with std::advance(it, n).std::advance in generic code — it's free for random access, works for all categories.list.sort() (mergesort).rbegin() and rend().Iterator Invalidation — The Silent Crash You Must Understand
Here's the scenario that causes production bugs: you grab an iterator to an element, do some work, then modify the container — and suddenly your iterator is pointing at garbage memory or a completely wrong element. This is called iterator invalidation, and the rules differ between containers.
For std::vector, any operation that causes a reallocation (push_back when size == capacity, insert, erase, resize) invalidates ALL iterators, pointers, and references to elements. Even erasing a single element invalidates iterators to every element after it because they all shift left.
For std::list, insert never invalidates any existing iterator — nodes just get linked in. Erase only invalidates the iterator to the element being erased; everything else stays valid. This is why std::list exists: predictable iterator stability.
For std::map and std::unordered_map, insert doesn't invalidate any iterators. Erase only invalidates the iterator to the erased element.
The practical takeaway is this: if you need to erase elements while iterating over a vector, you need a different pattern (erase-remove idiom or index-based loop going backwards). For lists and maps you can erase inside the loop safely — but you must capture the next iterator before you erase.
Reverse Iterators, const_iterators, and Writing Generic Code
Once you're comfortable with forward iteration, two refinements immediately pay dividends: iterating in reverse and enforcing read-only access.
Reverse iterators wrap a regular iterator and flip the direction of ++ and --. Every standard container that supports bidirectional or random-access iteration provides rbegin() and rend(). The mental model: rbegin() points at the last element, rend() points one-before-the-first.
const_iterators (returned by cbegin() and cend()) give you a read-only view of the container's elements. Dereferencing one gives a const reference — the compiler won't let you accidentally modify data you promised to leave alone. Use them in any function that only reads, and you communicate intent clearly to the next developer.
Writing generic code with iterators means templating on the iterator type itself rather than on the container. This is how the STL algorithms are written. Your own functions can follow the same pattern — accept an iterator range [first, last) and work on anything that satisfies the required iterator category.
begin() on a const-qualified container already returns a const_iterator — the compiler enforces it. cbegin() exists so you can explicitly get a const_iterator even from a non-const container. It's a signal to human readers: 'I chose read-only access here deliberately.' Prefer cbegin()/cend() in read-only functions to make intent unmistakable.std::vector<int>& and only read it, but used begin() and end(). Later, a developer passed a const vector and the function didn't compile because begin() on const returns const_iterator but the internal code tried to assign to a non-const iterator. The fix: change the function to accept const std::vector<int>& and use cbegin()/cend() to be explicit. Better yet: template on the iterator type and accept any range.cbegin()/cend() when you only read data.cbegin() for clarity.cbegin()/cend() to enforce const-correctness and prevent accidental modifications.Iterator Traits and Tag Dispatching — The Engine Behind Generic Algorithms
The real power of iterators lies in how the STL dispatches different implementations based on iterator category. This is accomplished through iterator_traits and tag dispatching.
std::iterator_traits<T> is a template that exposes five type aliases: value_type, difference_type, reference, pointer, and iterator_category. For raw pointers, the library provides a specialization so that int* is treated as a Random Access Iterator.
Tag dispatching: STL algorithms use the iterator_category to choose the most efficient implementation. For example, std::distance can be O(1) for Random Access, but O(n) for others. The implementation uses a helper function overloaded on tag types.
``cpp namespace detail { template<typename It> auto distance_impl(It first, It last, std::random_access_iterator_tag) -> decltype(last - first) { return last - first; } template<typename It> auto distance_impl(It first, It last, std::input_iterator_tag) -> typename std::iterator_traits<It>::difference_type { typename std::iterator_traits<It>::difference_type n = 0; while (first != last) { ++first; ++n; } return n; } } template<typename It> auto distance(It first, It last) -> typename std::iterator_traits<It>::difference_type { return detail::distance_impl(first, last, typename std::iterator_traits<It>::iterator_category()); } ``
You can use this technique in your own code. If you have an algorithm that can be optimized for forward vs random access, tag dispatching gives you compile-time polymorphism without runtime overhead.
process_impl with an empty tag object is completely inlined and optimized away. The compiler sees two different functions and picks the right one at compile time. It's zero-cost abstraction.std::distance inside a loop, it will be O(n) per call for list and O(1) for vector. Usually fine, but if you call it repeatedly, you can get O(n^2) unintentionally. Capture the distance once outside the loop when possible.std::distance(begin, end) inside a loop iterating over a list of a million entries – it caused O(n^2) runtime. Fix: precompute auto count = std::distance(begin, end) once, then loop.C++20 Ranges and Views — A Modern Iteration Paradigm
C++20 introduced the Ranges library, which lifts iteration to a higher level of abstraction. Instead of passing two iterators everywhere, you pass a range object that bundles begin and end together. This leads to composable operations via 'adaptors'.
For example, filtering and transforming a vector:
```cpp #include <ranges> #include <vector> #include <iostream>
int main() { std::vector<int> numbers = {1,2,3,4,5,6}; // Lazy pipeline: no temporary containers created auto result = numbers | std::views::filter([](int n) { return n % 2 == 0; }) // keep evens | std::views::transform([](int n) { return n * 10; }); // multiply by 10 for (int v : result) { std::cout << v << " "; // outputs 20 40 60 } } ```
Behind the scenes, std::views::filter returns a range whose iterators are filtered iterators. The underlying iterator category is usually preserved, but constraints apply: filter requires a forward iterator, transform works on input iterators.
Understanding the traditional iterator categories is essential to using Ranges correctly. For instance, if you apply views::reverse to a vector, the resulting iterators are still random access. But apply views::reverse to a list, they remain bidirectional.
Ranges also solve the dreaded 'sentinel' problem: C++20 allows the end of a range to be a different type from begin (e.g., a null-terminated string). This is done via sentinel types, which are compared with the iterator using a custom predicate.
std::views::filter requires at least a forward iterator because it needs to cache the next matching element. If you try to apply it to an input iterator, you'll get a compile-time error. Same with take, drop, reverse. Always check the category requirements.Iterator Operations — Pointer Arithmetic You Can Actually Trust
Every iterator in C++ supports at least two operations: dereference (*it) and increment (++it). That's the contract. But what else you get depends on the iterator category. Random-access iterators (vector, deque) let you do it + 5, it - 3, or even it1 - it2 to get the distance. Bidirectional iterators (list, set) only allow ++ and --. Forward iterators (forward_list) only ++. Input iterators can only be read once, then they're gone. Output iterators only write.
Why this matters: algorithms like std::sort require random-access iterators. If you pass a list's bidirectional iterator, the code won't compile. The compiler catches this at compile time — not at 3 AM in production. Always check the category before writing generic code. Use std::iterator_traits to query the category at compile time. Don't guess. The type system is your ally.
Iterator Adaptors — Real-World Patterns for Tricky Iteration
Production code rarely iterates forward from begin to end. You need reverse iteration, read-only access, or to filter out elements. That's where iterator adaptors save you.
std::reverse_iterator wraps any bidirectional iterator and reverses its direction. Call rbegin() and rend() on any container — you get a reverse view without copying. std::move_iterator converts assignment to move semantics — critical for moving unique_ptrs out of a container.
Then there's std::back_insert_iterator: instead of overwriting elements, it calls push_back. Use it with std::copy to append results. Same pattern with std::front_insert_iterator and std::inserter for arbitrary positions.
These adaptors live in production code every day. They eliminate manual loops and reduce the chance of off-by-one errors. They also document intent: seeing std::reverse_iterator says "we're going backward" without reading loop logic. Your future self will thank you.
Iterator Utility Functions — std::advance, std::distance, std::next, std::prev
Raw iterator arithmetic breaks generic code. std::list's bidirectional iterator doesn't support +, but you still need to advance by N positions. Solution: utility functions that work with any iterator category.
std::advance(it, n) moves the iterator by n positions. For random-access iterators, it's O(1). For bidirectional, it's O(n). The function knows the category via iterator_traits and picks the optimal path — no template metaprogramming on your part.
std::distance(first, last) returns the number of increments needed to go from first to last. Again, O(1) for random-access, O(n) otherwise. Use it when you need the size of a subrange.
std::next(it, n) returns a new iterator advanced by n without modifying the original. std::prev does the reverse. These are const-correct and avoid mutating state when you just need to peek ahead.
In production: never write it + 5 for anything but raw arrays and vector. Use std::next. It documents the cost and works everywhere. Your code reviewer will nod approvingly.
end(), it's undefined behavior — your program will silently corrupt memory. Always check container size before calling std::advance with distances you don't control.The Midnight Reallocation Crash: Vector Iterator Invalidation
- Never store iterators to a vector across a modification that may cause reallocation (push_back, insert, resize, reserve).
- Use indices instead of iterators when you need stable references to elements in a growing vector.
- If iterator stability is critical, prefer std::list or std::forward_list, or use a container like Boost.Container's stable_vector.
begin(). If using a loop, re-fetch begin() after each modification.list.sort() member function which is optimized for bidirectional iterators (O(n log n) with mergesort).static_assert(std::is_same_v<typename std::iterator_traits<It>::iterator_category, std::random_access_iterator_tag>, "Random access required");template<typename It, typename = std::enable_if_t<std::is_base_of_v<std::random_access_iterator_tag, typename std::iterator_traits<It>::iterator_category>>>Key takeaways
cbegin()/cend() in any function that only reads dataCommon mistakes to avoid
5 patternsUsing an invalidated vector iterator after push_back
reserve() upfront to prevent reallocation entirely. If you must store a position, store an index instead.Erasing elements inside a range-based for loop on a vector
Comparing iterators from two different containers
if (vecA.begin() == vecB.begin()) compiles fine but invokes undefined behavior. The result may appear true or false but is meaningless.Assuming std::list iterators support arithmetic (it + n)
Using std::sort on std::list
list.sort() member function, which is O(n log n) and uses mergesort that doesn't require random access. Alternatively, copy list to vector, sort, then copy back if you need the list structure.Interview Questions on This Topic
Explain the internal differences between a Bidirectional Iterator and a Random Access Iterator. Why is std::sort limited to the latter?
(first + (last-first)/2) and swaps elements from both ends using indices. Without O(1) jump, partitioning would be O(n^2). For bidirectional iterators, std::list provides its own sort member that uses mergesort.Frequently Asked Questions
20+ years shipping performance-critical C and C++ systems. Written from production experience, not tutorials.
That's STL. Mark it forged?
9 min read · try the examples if you haven't