Home C / C++ C++ STL Vectors Explained — Usage, Internals and Pitfalls

C++ STL Vectors Explained — Usage, Internals and Pitfalls

In Plain English 🔥
Imagine you're packing books into a bookshelf. A regular array is like buying a fixed-size shelf — you decide upfront how many books fit and you're stuck with that. A vector is like a magical expanding shelf: it starts small, and whenever it runs out of room it quietly moves everything to a bigger shelf behind the scenes. You just keep adding books without ever worrying about the shelf size yourself.
⚡ Quick Answer
Imagine you're packing books into a bookshelf. A regular array is like buying a fixed-size shelf — you decide upfront how many books fit and you're stuck with that. A vector is like a magical expanding shelf: it starts small, and whenever it runs out of room it quietly moves everything to a bigger shelf behind the scenes. You just keep adding books without ever worrying about the shelf size yourself.

Every non-trivial C++ program needs to store and manipulate collections of data. Whether you're building a game engine tracking active enemies, a web server queuing incoming requests, or a trading system buffering price ticks, you need a container that's fast, flexible, and predictable. std::vector is the default answer to that need — and for good reason. It's the most widely used container in the entire C++ Standard Library.

How Vectors Actually Work — The Dynamic Array Under the Hood

A std::vector is a wrapper around a dynamically allocated array. It tracks three things: a pointer to the heap-allocated buffer, the current number of elements (size), and the total allocated capacity. When you push_back() an element and size equals capacity, the vector allocates a brand-new buffer (typically 1.5x or 2x the old capacity depending on the implementation), copies or moves all existing elements into it, then destroys the old buffer. This is called a reallocation.

That reallocation is the most important thing to understand about vectors. It's O(n) work — every existing element must be moved. It's also the event that invalidates every iterator, pointer, and reference you held into the vector. This isn't a bug; it's the fundamental trade-off that lets vectors remain a contiguous block of memory, which is what makes them cache-friendly and blazing fast for iteration.

So the mental model is: a vector is a resizable contiguous array. 'Resizable' is the feature. 'Contiguous' is the performance guarantee. Both matter.

vector_internals.cpp · CPP
12345678910111213141516171819202122232425262728
#include <iostream>
#include <vector>

int main() {
    std::vector<int> scores;

    // Print size and capacity before adding anything
    std::cout << "Initial state:\n";
    std::cout << "  size:     " << scores.size()     << "\n";
    std::cout << "  capacity: " << scores.capacity() << "\n\n";

    // Watch capacity grow in steps as we push elements in
    for (int i = 1; i <= 10; ++i) {
        scores.push_back(i * 10);

        // capacity jumps only when the buffer needs to be reallocated
        std::cout << "After push_back(" << i * 10 << "):"
                  << "  size=" << scores.size()
                  << "  capacity=" << scores.capacity() << "\n";
    }

    // Accessing elements is identical to a raw array — O(1)
    std::cout << "\nFirst score  (index 0): " << scores[0]     << "\n";
    std::cout << "Last score  (back()): "   << scores.back()  << "\n";
    std::cout << "Safe access (at(2)):  "   << scores.at(2)   << "\n";

    return 0;
}
▶ Output
Initial state:
size: 0
capacity: 0

After push_back(10): size=1 capacity=1
After push_back(20): size=2 capacity=2
After push_back(30): size=3 capacity=4
After push_back(40): size=4 capacity=4
After push_back(50): size=5 capacity=8
After push_back(60): size=6 capacity=8
After push_back(70): size=7 capacity=8
After push_back(80): size=8 capacity=8
After push_back(90): size=9 capacity=16
After push_back(100): size=10 capacity=16

First score (index 0): 10
Last score (back()): 100
Safe access (at(2)): 30
🔥
Why capacity doubles:Doubling capacity on each reallocation keeps the amortized cost of push_back() at O(1). If the vector grew by 1 slot each time, every insertion would be O(n) — a 10,000-element vector would do ~50 million copy operations total just to fill up.

reserve() and emplace_back() — Writing Vector Code That Doesn't Waste Time

Now that you know reallocation is expensive, you can prevent it. If you know (or can estimate) how many elements you'll store, call reserve() before your loop. It pre-allocates the buffer without changing size, so no reallocations happen during insertion. This is not a micro-optimisation — on a vector of 1 million objects it's the difference between milliseconds and seconds.

The second upgrade is emplace_back() over push_back(). push_back() takes an already-constructed object and copies or moves it into the vector. emplace_back() takes constructor arguments and builds the object directly inside the vector's buffer — zero copies, zero temporaries. For small types like int the difference is negligible. For types with expensive constructors (strings, custom objects), emplace_back() is the right default.

There's also shrink_to_fit(), which hints to the implementation to release excess capacity after you're done filling a vector. It's a hint, not a guarantee, but it's worth knowing when memory footprint matters.

vector_reserve_emplace.cpp · CPP
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
#include <iostream>
#include <vector>
#include <string>
#include <chrono>

// A simple Player struct to demonstrate emplace_back vs push_back
struct Player {
    std::string name;
    int         level;
    float       health;

    Player(std::string n, int lvl, float hp)
        : name(std::move(n)), level(lvl), health(hp) {
        // In a real codebase you might load assets here — construction is NOT free
    }
};

void fillWithoutReserve(int playerCount) {
    std::vector<Player> roster;
    // No reserve() — vector will reallocate multiple times
    for (int i = 0; i < playerCount; ++i) {
        // push_back constructs a temporary Player, then moves it in
        roster.push_back(Player("Warrior_" + std::to_string(i), i % 60 + 1, 100.0f));
    }
    std::cout << "Without reserve — final capacity: " << roster.capacity() << "\n";
}

void fillWithReserve(int playerCount) {
    std::vector<Player> roster;
    roster.reserve(playerCount); // single allocation up front
    for (int i = 0; i < playerCount; ++i) {
        // emplace_back constructs the Player DIRECTLY inside the buffer — no temporary
        roster.emplace_back("Warrior_" + std::to_string(i), i % 60 + 1, 100.0f);
    }
    std::cout << "With reserve    — final capacity: " << roster.capacity() << "\n";
    std::cout << "First player: " << roster.front().name
              << "  level " << roster.front().level << "\n";
    std::cout << "Last  player: " << roster.back().name
              << "  level " << roster.back().level  << "\n";

    // Done adding — release wasted capacity (a hint, not guaranteed)
    roster.shrink_to_fit();
    std::cout << "After shrink_to_fit capacity: " << roster.capacity() << "\n";
}

int main() {
    const int PLAYER_COUNT = 500;
    fillWithoutReserve(PLAYER_COUNT);
    fillWithReserve(PLAYER_COUNT);
    return 0;
}
▶ Output
Without reserve — final capacity: 512
With reserve — final capacity: 500
First player: Warrior_0 level 1
Last player: Warrior_499 level 20
After shrink_to_fit capacity: 500
⚠️
Pro Tip — Default to emplace_back:Make emplace_back() your muscle memory for vector insertion. The only time you'd prefer push_back() is when you already have a constructed object you want to move in — and even then, std::move() with push_back() is equivalent to emplace_back(). For new construction, emplace_back() is always the cleaner choice.

Iterating, Modifying and Erasing — Patterns That Actually Appear in Production

Iterating a vector is straightforward, but modifying it while iterating is where most bugs are born. The erase() method removes an element by iterator and returns an iterator to the element that took its place. If you ignore that return value and keep using your old iterator, you're in undefined behaviour territory — the iterator is now invalid.

The erase-remove idiom is the idiomatic C++ pattern for removing elements that match a condition without hand-rolling an index-shuffling loop. std::remove() (from ) shuffles all 'keep' elements to the front and returns an iterator pointing to the start of the 'trash' region. You then call erase() on that range. It sounds odd the first time you see it, but it's a single-pass O(n) operation and it's in every production codebase.

For read-only traversal, prefer range-for loops. For index-based access where you need the index itself, use a size_t loop variable — not int — to avoid signed/unsigned comparison warnings that some teams treat as errors.

vector_iteration_erase.cpp · CPP
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051
#include <iostream>
#include <vector>
#include <algorithm>   // std::remove_if
#include <string>

struct LogEntry {
    std::string message;
    int         severity;  // 1=info, 2=warning, 3=error
};

void printLog(const std::vector<LogEntry>& log) {
    for (const auto& entry : log) {   // range-for, const ref — no copies
        std::cout << "  [" << entry.severity << "] " << entry.message << "\n";
    }
}

int main() {
    std::vector<LogEntry> eventLog = {
        {"Server started",           1},
        {"Disk usage at 80%",        2},
        {"Failed login attempt",     3},
        {"Health check OK",          1},
        {"Memory pressure detected", 2},
        {"Database unreachable",     3},
        {"Cache warmed up",          1}
    };

    std::cout << "Full log (" << eventLog.size() << " entries):\n";
    printLog(eventLog);

    // Erase-remove idiom: purge all info-level (severity==1) entries
    // std::remove_if shuffles matching elements to the end and returns
    // an iterator to the start of the junk region
    auto junkStart = std::remove_if(
        eventLog.begin(), eventLog.end(),
        [](const LogEntry& e) { return e.severity == 1; }
    );
    // erase() actually removes the junk region from the vector
    eventLog.erase(junkStart, eventLog.end());

    std::cout << "\nAfter removing info entries (" << eventLog.size() << " remain):\n";
    printLog(eventLog);

    // Safe indexed iteration when you need the index
    std::cout << "\nWarnings and errors by index:\n";
    for (std::size_t i = 0; i < eventLog.size(); ++i) {
        std::cout << "  index " << i << ": " << eventLog[i].message << "\n";
    }

    return 0;
}
▶ Output
Full log (7 entries):
[1] Server started
[2] Disk usage at 80%
[3] Failed login attempt
[1] Health check OK
[2] Memory pressure detected
[3] Database unreachable
[1] Cache warmed up

After removing info entries (4 remain):
[2] Disk usage at 80%
[3] Failed login attempt
[2] Memory pressure detected
[3] Database unreachable

Warnings and errors by index:
index 0: Disk usage at 80%
index 1: Failed login attempt
index 2: Memory pressure detected
index 3: Database unreachable
⚠️
Watch Out — Iterator Invalidation After erase():If you call erase() inside a manual for-loop and increment your iterator unconditionally, you'll skip the element that slid into the erased position. Always reassign: it = vec.erase(it) and only increment it in the else branch. The erase-remove idiom avoids this trap entirely — prefer it whenever possible.

Vectors of Objects vs Vectors of Pointers — Choosing the Right Layout

This is the decision that separates intermediate C++ developers from seniors. You have two options: store objects directly (std::vector) or store pointers to heap objects (std::vector or std::vector>). They have completely different performance profiles.

Storing objects directly keeps everything in a single contiguous block of memory. Iterating fires up the CPU's prefetcher and tears through the cache. This is almost always what you want for plain data types and anything without polymorphism. The downside: every reallocation moves every object, so your objects must be moveable.

Storing pointers enables polymorphism (a vector of base-class pointers can hold derived-class objects) and avoids moving large objects on reallocation. The downside: each element is a separate heap allocation scattered across memory. Iterating chases pointers across the heap — cache misses pile up and performance tanks on large collections. In game engines and high-frequency systems this distinction alone can mean a 5–10x speed difference on hot loops.

The rule of thumb: store by value unless you need polymorphism or the objects are truly non-moveable.

vector_value_vs_pointer.cpp · CPP
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960
#include <iostream>
#include <vector>
#include <memory>    // std::unique_ptr
#include <string>

// Base class for a simple shape hierarchy
struct Shape {
    virtual std::string describe() const = 0;
    virtual ~Shape() = default;
};

struct Circle : Shape {
    float radius;
    explicit Circle(float r) : radius(r) {}
    std::string describe() const override {
        return "Circle(r=" + std::to_string(radius) + ")";
    }
};

struct Rectangle : Shape {
    float width, height;
    Rectangle(float w, float h) : width(w), height(h) {}
    std::string describe() const override {
        return "Rect(" + std::to_string(width) + "x" + std::to_string(height) + ")";
    }
};

int main() {
    // --- Scenario 1: store by value (no polymorphism needed) ---
    // Contiguous memory, cache-friendly, zero pointer chasing
    struct Particle { float x, y, z, mass; };

    std::vector<Particle> particles;
    particles.reserve(5);
    particles.emplace_back(Particle{0.0f, 1.0f, 2.0f, 1.5f});
    particles.emplace_back(Particle{3.0f, 0.5f, 1.0f, 2.0f});
    particles.emplace_back(Particle{1.0f, 2.0f, 0.0f, 0.8f});

    std::cout << "Particles (by value — contiguous memory):\n";
    for (const auto& p : particles) {
        std::cout << "  pos=(" << p.x << "," << p.y << "," << p.z
                  << ") mass=" << p.mass << "\n";
    }

    // --- Scenario 2: store by unique_ptr (polymorphism required) ---
    // Each Shape is a separate heap allocation — necessary for virtual dispatch
    std::vector<std::unique_ptr<Shape>> canvas;
    canvas.push_back(std::make_unique<Circle>(5.0f));
    canvas.push_back(std::make_unique<Rectangle>(3.0f, 4.0f));
    canvas.push_back(std::make_unique<Circle>(2.5f));

    std::cout << "\nShapes (by unique_ptr — polymorphism enabled):\n";
    for (const auto& shape : canvas) {
        // Virtual dispatch works correctly because we're going through a pointer
        std::cout << "  " << shape->describe() << "\n";
    }
    // unique_ptr: memory is automatically freed when canvas goes out of scope

    return 0;
}
▶ Output
Particles (by value — contiguous memory):
pos=(0,1,2) mass=1.5
pos=(3,0.5,1) mass=2
pos=(1,2,0) mass=0.8

Shapes (by unique_ptr — polymorphism enabled):
Circle(r=5.000000)
Rect(3.000000x4.000000)
Circle(r=2.500000)
🔥
Interview Gold — Why unique_ptr in a vector?Raw pointers in a vector (std::vector) mean you own the memory but must manually delete every element before the vector is destroyed. Miss one and you leak. std::vector> gives you the same polymorphism with automatic, exception-safe cleanup — always prefer it over raw owning pointers.
Aspectstd::vectorstd::arrayRaw C Array
SizeDynamic — grows at runtimeFixed at compile timeFixed at compile time
Memory layoutContiguous heap allocationContiguous stack allocationContiguous stack or heap
Bounds checkingat() throws, [] is uncheckedat() throws, [] is uncheckedNo bounds checking at all
Reallocation costO(n) on growth, amortized O(1)N/A — size never changesN/A — size never changes
Iterator invalidationOn any reallocationNeverN/A — no iterators
STL algorithm supportFull — begin()/end() availableFull — begin()/end() availablePartial — pointer arithmetic only
Ownership semanticsRAII — auto memory managementRAII — auto memory managementManual — you manage the lifetime
Best use caseMost runtime collectionsFixed config, stack-local dataC interop, legacy code

🎯 Key Takeaways

  • A vector is a contiguous heap array — that contiguity is why iteration is cache-friendly and fast, but it's also why reallocation invalidates every iterator you held.
  • Call reserve() before filling a vector when you know the approximate size — it collapses potentially dozens of reallocations into a single allocation and is one of the highest-return-on-effort optimisations in C++.
  • Prefer emplace_back() over push_back() for new element construction — it builds the object directly in the buffer, skipping the temporary object that push_back() would otherwise create and move.
  • Use the erase-remove idiom (std::remove_if + erase()) for conditional deletion — hand-rolling an index-based erase loop almost always produces a subtle iterator-invalidation bug sooner or later.

⚠ Common Mistakes to Avoid

  • Mistake 1: Storing an iterator or pointer into a vector, then modifying the vector — After any push_back() or insert() that triggers reallocation, every iterator, pointer, and reference into that vector is dangling. Using them is undefined behaviour — your program may crash, silently corrupt data, or 'work' in debug and explode in release. Fix: never hold iterators across mutating operations, or call reserve() upfront so no reallocation occurs during the insertions.
  • Mistake 2: Using operator[] with an out-of-bounds index — Unlike at(), operator[] does zero bounds checking. Accessing scores[scores.size()] doesn't throw; it reads whatever memory happens to be there. This is a silent data corruption bug that's notoriously hard to track down. Fix: use at() during development and testing. Switch to [] only in hot loops after you've verified correctness, or enable address sanitiser (compile with -fsanitize=address) to catch it automatically.
  • Mistake 3: Calling erase() in a range-for loop — Erasing an element inside a range-for loop is undefined behaviour because the hidden iterator the loop is using becomes invalid. The code may appear to work sometimes and crash others. Fix: either collect elements to remove in a separate vector of indices and erase in reverse order, or use the erase-remove idiom with std::remove_if() which handles this correctly in a single pass.

Interview Questions on This Topic

  • QWhat is the difference between size() and capacity() in a std::vector, and why does capacity sometimes grow by more than one when you push a single element?
  • QExplain iterator invalidation in std::vector. Under what operations does it occur, and how would you write safe code that needs to erase multiple elements matching a condition?
  • QWhen would you choose std::vector> over std::vector, and what are the performance implications of that choice on cache behaviour during iteration?

Frequently Asked Questions

What is the difference between push_back and emplace_back in C++ vectors?

push_back() takes an already-constructed object and copies or moves it into the vector, creating at least one temporary. emplace_back() forwards its arguments directly to the element's constructor inside the vector's buffer, building the object in-place with no temporary. For anything more complex than a primitive type, emplace_back() is more efficient and should be your default.

Does std::vector store elements on the heap or the stack?

The vector object itself (the three-pointer control block) lives wherever you declare it — on the stack for local variables. But the actual element data is always heap-allocated. When you call reserve() or when the vector grows, it calls new[] under the hood to allocate a contiguous block on the heap.

Why does iterating over a vector invalidate iterators after erase()?

When you erase an element, every element after it shifts one position to the left to fill the gap. Any iterator pointing to a position at or after the erased element now points to a different element than it did before — or past the end. The erase() function returns a valid iterator to the element that moved into the erased position, so always reassign: it = vec.erase(it).

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

← PreviousSTL in C++ — Standard Template LibraryNext →STL Maps and Sets in C++
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged