C++ STL Vectors Explained — Usage, Internals and Pitfalls
reserve() and emplace_back().- A vector is a contiguous heap array — that contiguity is why iteration is cache-friendly and fast, but it's also why reallocation invalidates every iterator you held.
- Call
reserve()before filling a vector when you know the approximate size — it collapses potentially dozens of reallocations into a single allocation and is one of the highest-return-on-effort optimisations in C++. - Prefer
emplace_back()overpush_back()for new element construction — it builds the object directly in the buffer, skipping the temporary object thatpush_back()would otherwise create and move.
Imagine you're packing books into a bookshelf. A regular array is like buying a fixed-size shelf — you decide upfront how many books fit and you're stuck with that. A vector is like a magical expanding shelf: it starts small, and whenever it runs out of room it quietly moves everything to a bigger shelf behind the scenes. You just keep adding books without ever worrying about the shelf size yourself.
Every non-trivial C++ program needs to store and manipulate collections of data. Whether you're building a game engine tracking active enemies, a web server queuing incoming requests, or a trading system buffering price ticks, you need a container that's fast, flexible, and predictable. std::vector is the default answer to that need — and for good reason. It's the most widely used container in the entire C++ Standard Library.
At its heart, the vector solves the 'fixed-size' problem of traditional arrays while maintaining their greatest strength: contiguous memory. In this guide, we’ll peel back the abstraction to see how the buffer actually grows, why your iterators might suddenly 'die', and how to write high-performance code that the CPU's cache will love.
How Vectors Actually Work — The Dynamic Array Under the Hood
A std::vector is a wrapper around a dynamically allocated array. It tracks three things: a pointer to the heap-allocated buffer, the current number of elements (size), and the total allocated capacity. When you push_back() an element and size equals capacity, the vector allocates a brand-new buffer (typically 1.5x or 2x the old capacity depending on the implementation), copies or moves all existing elements into it, then destroys the old buffer. This is called a reallocation.
That reallocation is the most important thing to understand about vectors. It's O(n) work — every existing element must be moved. It's also the event that invalidates every iterator, pointer, and reference you held into the vector. This isn't a bug; it's the fundamental trade-off that lets vectors remain a contiguous block of memory, which is what makes them cache-friendly and blazing fast for iteration.
So the mental model is: a vector is a resizable contiguous array. 'Resizable' is the feature. 'Contiguous' is the performance guarantee. Both matter.
#include <iostream> #include <vector> namespace io_thecodeforge { /** * Demonstrates how capacity grows dynamically as elements are added. * Note: Growth factor (usually 1.5 or 2) is implementation-defined. */ void demonstrateInternals() { std::vector<int> scores; std::cout << "Initial state: size=" << scores.size() << ", capacity=" << scores.capacity() << "\n"; for (int i = 1; i <= 5; ++i) { scores.push_back(i * 10); std::cout << "Added " << i * 10 << " | size: " << scores.size() << " | capacity: " << scores.capacity() << "\n"; } } } int main() { io_thecodeforge::demonstrateInternals(); return 0; }
Added 10 | size: 1 | capacity: 1
Added 20 | size: 2 | capacity: 2
Added 30 | size: 3 | capacity: 4
Added 40 | size: 4 | capacity: 4
Added 50 | size: 5 | capacity: 8
push_back() at O(1). If the vector grew by 1 slot each time, every insertion would be O(n) — a 10,000-element vector would do ~50 million copy operations total just to fill up.reserve() and emplace_back() — Writing Vector Code That Doesn't Waste Time
Now that you know reallocation is expensive, you can prevent it. If you know (or can estimate) how many elements you'll store, call reserve() before your loop. It pre-allocates the buffer without changing size, so no reallocations happen during insertion. This is not a micro-optimisation — on a vector of 1 million objects it's the difference between milliseconds and seconds.
The second upgrade is emplace_back() over push_back(). push_back() takes an already-constructed object and copies or moves it into the vector. emplace_back() takes constructor arguments and builds the object directly inside the vector's buffer — zero copies, zero temporaries. For types with expensive constructors (strings, custom objects), emplace_back() is the right default.
#include <iostream> #include <vector> #include <string> namespace io_thecodeforge { struct UserProfile { std::string username; int id; // Constructor used by emplace_back to build in-place UserProfile(std::string name, int userId) : username(std::move(name)), id(userId) { std::cout << "Constructed: " << username << "\n"; } }; void optimizedFill() { std::vector<UserProfile> users; // Performance critical: collapse N reallocations into 1 users.reserve(10); // emplace_back passes args directly to UserProfile constructor // Result: No temporary object created, no copy/move overhead users.emplace_back("senior_editor", 101); } } int main() { io_thecodeforge::optimizedFill(); return 0; }
emplace_back() your muscle memory for vector insertion. The only time you'd prefer push_back() is when you already have a constructed object you want to move in — and even then, std::move() with push_back() is equivalent to emplace_back(). For new construction, emplace_back() is always the cleaner choice.Iterating, Modifying and Erasing — Patterns That Actually Appear in Production
Iterating a vector is straightforward, but modifying it while iterating is where most bugs are born. The erase() method removes an element by iterator and returns an iterator to the element that took its place. If you ignore that return value and keep using your old iterator, you're in undefined behaviour territory — the iterator is now invalid.
The erase-remove idiom is the idiomatic C++ pattern for removing elements that match a condition without hand-rolling an index-shuffling loop. std::remove() (from <algorithm>) shuffles all 'keep' elements to the front and returns an iterator pointing to the start of the 'trash' region. You then call erase() on that range. It's a single-pass O(n) operation and it's in every production codebase.
#include <iostream> #include <vector> #include <algorithm> namespace io_thecodeforge { void cleanData() { std::vector<int> data = {1, 2, 3, 4, 5, 6}; // Erase-remove idiom: remove all even numbers // std::remove_if shuffles elements; erase() clips the tail data.erase(std::remove_if(data.begin(), data.end(), [](int n){ return n % 2 == 0; }), data.end()); for (int n : data) { std::cout << n << " "; } } } int main() { io_thecodeforge::cleanData(); return 0; }
erase() inside a manual for-loop and increment your iterator unconditionally, you'll skip the element that slid into the erased position. Always reassign: it = vec.erase(it) and only increment it in the else branch. The erase-remove idiom avoids this trap entirely — prefer it whenever possible.Vectors of Objects vs Vectors of Pointers — Choosing the Right Layout
This is the decision that separates intermediate C++ developers from seniors. You have two options: store objects directly (std::vector<Player>) or store pointers to heap objects (std::vector<Player*> or std::vector<std::unique_ptr<Player>>). They have completely different performance profiles.
Storing objects directly keeps everything in a single contiguous block of memory. Iterating fires up the CPU's prefetcher and tears through the cache. Storing pointers enables polymorphism but results in 'pointer chasing' across the heap, which kills performance on large collections due to cache misses.
#include <iostream> #include <vector> #include <memory> namespace io_thecodeforge { class Base { public: virtual void identify() = 0; virtual ~Base() = default; }; class Derived : public Base { public: void identify() override { std::cout << "Derived Instance identified.\n"; } }; void layoutChoice() { // Polymorphic vector using smart pointers for safety // Memory layout: Vector holds contiguous unique_ptr objects, // which point to non-contiguous Derived objects on the heap. std::vector<std::unique_ptr<Base>> heterogenousContainer; heterogenousContainer.push_back(std::make_unique<Derived>()); for (const auto& item : heterogenousContainer) { item->identify(); } } } int main() { io_thecodeforge::layoutChoice(); return 0; }
| Aspect | std::vector | std::array | Raw C Array |
|---|---|---|---|
| Size | Dynamic — grows at runtime | Fixed at compile time | Fixed at compile time |
| Memory layout | Contiguous heap allocation | Contiguous stack allocation | Contiguous stack or heap |
| Bounds checking | at() throws, [] is unchecked | at() throws, [] is unchecked | No bounds checking at all |
| Reallocation cost | O(n) on growth, amortized O(1) | N/A — size never changes | N/A — size never changes |
| Iterator invalidation | On any reallocation | Never | N/A — no iterators |
| STL algorithm support | Full — begin()/end() available | Full — begin()/end() available | Partial — pointer arithmetic only |
| Ownership semantics | RAII — auto memory management | RAII — auto memory management | Manual — you manage the lifetime |
| Best use case | Most runtime collections | Fixed config, stack-local data | C interop, legacy code |
🎯 Key Takeaways
- A vector is a contiguous heap array — that contiguity is why iteration is cache-friendly and fast, but it's also why reallocation invalidates every iterator you held.
- Call
reserve()before filling a vector when you know the approximate size — it collapses potentially dozens of reallocations into a single allocation and is one of the highest-return-on-effort optimisations in C++. - Prefer
emplace_back()overpush_back()for new element construction — it builds the object directly in the buffer, skipping the temporary object thatpush_back()would otherwise create and move. - Use the erase-remove idiom (std::remove_if +
erase()) for conditional deletion — hand-rolling an index-based erase loop almost always produces a subtle iterator-invalidation bug sooner or later.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- QExplain the 'Amortized O(1)' time complexity of
push_back(). If reallocation is O(n), why is the overall operation considered constant time? - QHow does the exponential growth strategy (2x or 1.5x) affect memory fragmentation compared to a linear growth strategy (+100 slots)?
- QWhat is the time complexity of std::vector::insert at an arbitrary position? Explain why it differs from std::list::insert.
- QWhy does C++ forbid std::vector<bool> from being a 'real' container of bools? Discuss the proxy object returned by its operator[] and why it breaks generic code.
- QLeetCode Challenge: Given a vector of integers, how do you efficiently remove all elements that are less than a threshold in O(n) time and O(1) space?
Frequently Asked Questions
What happens to size and capacity when I call vector::clear()?
Calling .clear() sets the size of the vector to zero, destroying all contained elements (calling their destructors). However, it does NOT change the capacity. The heap-allocated buffer remains allocated to allow for future reuse without a new allocation. To truly release memory back to the OS, use .shrink_to_fit() or the swap idiom.
Is std::vector thread-safe for concurrent operations?
Generally, no. Multiple threads can read from a vector simultaneously without locks. However, if even one thread modifies the vector (insertion, deletion, or any operation that could trigger reallocation), all access must be synchronized with a mutex. Concurrent writes to different indices are only safe if the vector's size does not change during the process.
When should I use std::vector::at() instead of the subscript operator []?
Use .at() when you want the safety of bounds checking; it throws a std::out_of_range exception if the index is invalid. Use [] for maximum performance in tight loops where you are 100% certain the index is within bounds (0 to size-1), as it skips the overhead of the check.
How do I pre-allocate a vector with actual values instead of just reserving space?
If you want the vector to immediately contain N elements (effectively changing the size), use the constructor std::vector<int> vec(N, initial_value); or call vec.resize(N);. This is different from , which only changes the capacity (reserved memory) without actually creating elements.reserve()
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.