
Structures and Unions in C Explained — Memory, Use Cases and Pitfalls

In Plain English 🔥
Think of a struct like a passport — it holds your name, date of birth, nationality, and photo all in one booklet, each piece of information living in its own dedicated slot. A union is more like a whiteboard that only one person can write on at a time — the same physical space gets reused for different types of information depending on who needs it. The passport always has room for every field; the whiteboard only ever holds the most recent thing written on it. That single difference in 'shared vs dedicated memory' is the entire story of structs vs unions.

Every real-world program deals with grouped data. A game needs to track a player's name, health, score, and position together. A network driver needs to interpret the same 4 bytes as either an IPv4 address, a 32-bit integer, or four individual octets depending on context. Trying to manage all of that with loose individual variables is like trying to run a hospital with sticky notes instead of patient records — technically possible, catastrophically unmanageable. Structures and unions are C's answer to that chaos.

The problem they solve is fundamentally about organisation and memory semantics. A struct gives you a custom data type that bundles related variables under one name, each with its own guaranteed memory slot. A union takes that idea and flips the memory model — all members share the same block of memory, which means you get type-reinterpretation and memory efficiency at the cost of only being able to use one member at a time. These aren't just syntax features; they're tools that let you model the real world accurately in code.

By the end of this article you'll understand exactly how struct and union memory layouts work, when each is the right tool, how to combine them for practical patterns like tagged unions, and the exact mistakes that trip up even experienced C developers. You'll also be able to confidently answer the interview questions that separate candidates who've read about C from those who've actually used it.

A struct (short for structure) lets you define a composite data type — a single named container that holds multiple members, each with its own type. The compiler allocates memory for every member independently, so all fields exist simultaneously and can be read or written in any order.

The real power isn't just convenience — it's that a struct becomes a first-class type. You can pass it to functions, return it, put it in arrays, and point to it. This lets you model domain concepts directly. A 'Player' struct isn't just three variables that happen to be related; it's a single coherent entity your code can reason about.

Under the hood, struct members are laid out sequentially in memory, but the compiler is allowed to insert padding bytes between members to satisfy alignment requirements of the target CPU. This means sizeof(struct Player) might be larger than you expect, and it's the first thing you need to internalise before you do anything serious with structs in systems programming or binary file I/O.

Use structs whenever you have data that naturally belongs together and needs all its fields present at the same time — think database records, configuration objects, game entities, or network packet headers.

player_struct.c · C
#include <stdio.h>
#include <string.h>

// Define a struct that models a game player.
// Every field gets its OWN memory slot — all exist at the same time.
typedef struct {
    char  username[32];  // 32 bytes
    int   health;        // 4 bytes (offset 32 is already aligned, so no padding here)
    float position_x;   // 4 bytes
    float position_y;   // 4 bytes
    int   score;         // 4 bytes
} Player;

// Function that accepts a Player by VALUE (a copy is made)
void print_player_status(Player p) {
    printf("--- Player Status ---\n");
    printf("Username : %s\n",  p.username);
    printf("Health   : %d\n",  p.health);
    printf("Position : (%.1f, %.1f)\n", p.position_x, p.position_y);
    printf("Score    : %d\n",  p.score);
}

// Function that accepts a POINTER to Player — modifies original, no copy overhead
void apply_damage(Player *p, int damage_amount) {
    p->health -= damage_amount;  // Arrow operator dereferences pointer AND accesses member
    if (p->health < 0) {
        p->health = 0;           // Clamp health to zero — no negative health
    }
}

int main(void) {
    // Initialise a Player using designated initialisers (C99+)
    // Any field not listed is zero-initialised automatically
    Player hero = {
        .username   = "Aria_Stormblade",
        .health     = 100,
        .position_x = 12.5f,
        .position_y = 7.0f,
        .score      = 0
    };

    print_player_status(hero);

    // Apply damage via pointer — mutates the original struct
    apply_damage(&hero, 35);
    hero.score += 500;  // Direct member access with dot operator

    printf("\nAfter combat:\n");
    print_player_status(hero);

    // See how much memory the struct actually occupies
    printf("\nsizeof(Player) = %zu bytes\n", sizeof(Player));
    // Individual fields sum to: 32 + 4 + 4 + 4 + 4 = 48 bytes
    // Actual size may match or exceed this due to alignment padding

    return 0;
}
▶ Output
--- Player Status ---
Username : Aria_Stormblade
Health   : 100
Position : (12.5, 7.0)
Score    : 0

After combat:
--- Player Status ---
Username : Aria_Stormblade
Health   : 65
Position : (12.5, 7.0)
Score    : 500

sizeof(Player) = 48 bytes
⚠️ Pro Tip: Prefer Designated Initialisers
Using `.fieldname = value` syntax (C99+) instead of positional initialisation means that inserting or reordering fields in your struct won't silently shift values into the wrong members. It also makes the code self-documenting: you can see exactly which field each value maps to without counting commas.

Memory Layout and Padding — Why sizeof Surprises You

This is the section most tutorials skip, and it's the one that causes the most real-world bugs. CPUs are picky about alignment — a 4-byte int wants to live at a memory address that's divisible by 4. A double wants an address divisible by 8. When the compiler lays out struct members sequentially, it inserts invisible padding bytes to honour these constraints.

Consider a struct with a char (1 byte) followed by an int (4 bytes). The char sits at offset 0, but the int needs to start at offset 4 — so 3 bytes of padding are inserted silently. The struct's total size also gets padded at the end so that arrays of the struct keep every element aligned.

This matters enormously in three situations: serialising structs to binary files or network packets (padding bytes hold unspecified values, and the layout can differ between compilers and platforms), computing offsets manually, and squeezing memory in embedded systems. For memory, reordering your members largest-to-smallest often eliminates most padding naturally. For wire and file formats, __attribute__((packed)) or #pragma pack remove padding entirely, but they are compiler extensions rather than standard C, and you should use them only when you truly need to, because unaligned access is slower on many architectures and can fault on some. For offsets, use the offsetof macro instead of adding up sizes by hand.

The code below makes this concrete by printing the byte offset of each member so you can see exactly where the padding lives.

struct_padding_demo.c · C
#include <stdio.h>
#include <stddef.h>  // for offsetof macro

// BAD layout: fields ordered in a way that wastes memory via padding
typedef struct {
    char   is_active;    // 1 byte at offset 0
                         // 3 bytes of PADDING inserted here by compiler
    int    user_id;      // 4 bytes at offset 4
    char   grade;        // 1 byte at offset 8
                         // 7 bytes of PADDING inserted here
    double score;        // 8 bytes at offset 16
} StudentUnoptimised;    // Total: 24 bytes

// GOOD layout: reordered largest-to-smallest — padding mostly eliminated
typedef struct {
    double score;        // 8 bytes at offset 0
    int    user_id;      // 4 bytes at offset 8
    char   is_active;    // 1 byte at offset 12
    char   grade;        // 1 byte at offset 13
                         // 2 bytes of padding at end to round up to multiple of 8
} StudentOptimised;      // Total: 16 bytes

int main(void) {
    printf("=== Unoptimised Layout ===\n");
    printf("sizeof = %zu bytes\n",         sizeof(StudentUnoptimised));
    printf("is_active offset : %zu\n",     offsetof(StudentUnoptimised, is_active));
    printf("user_id   offset : %zu\n",     offsetof(StudentUnoptimised, user_id));
    printf("grade     offset : %zu\n",     offsetof(StudentUnoptimised, grade));
    printf("score     offset : %zu\n",     offsetof(StudentUnoptimised, score));

    printf("\n=== Optimised Layout (reordered) ===\n");
    printf("sizeof = %zu bytes\n",         sizeof(StudentOptimised));
    printf("score     offset : %zu\n",     offsetof(StudentOptimised, score));
    printf("user_id   offset : %zu\n",     offsetof(StudentOptimised, user_id));
    printf("is_active offset : %zu\n",     offsetof(StudentOptimised, is_active));
    printf("grade     offset : %zu\n",     offsetof(StudentOptimised, grade));

    printf("\nMemory saved per instance: %zu bytes\n",
           sizeof(StudentUnoptimised) - sizeof(StudentOptimised));
    // With 1 million students in memory, that's 8 MB saved

    return 0;
}
▶ Output
=== Unoptimised Layout ===
sizeof = 24 bytes
is_active offset : 0
user_id   offset : 4
grade     offset : 8
score     offset : 16

=== Optimised Layout (reordered) ===
sizeof = 16 bytes
score     offset : 0
user_id   offset : 8
is_active offset : 12
grade     offset : 13

Memory saved per instance: 8 bytes
⚠️ Watch Out: Never memcpy a padded struct over a network
Padding bytes hold unspecified values, typically whatever was in that memory before. If you send a struct directly over a socket or write it to a binary file, you transmit those bytes too, and the receiver may misread the fields if its compiler or platform uses a different layout or byte order. Always serialise field-by-field, or use a packed struct only for the wire/file format and copy into a normal struct for local processing.

Unions: One Memory Location, Many Interpretations

A union looks syntactically identical to a struct but operates on a completely different principle: all members share the same starting address and the same block of memory. The union is at least as large as its largest member, and the compiler may add trailing padding so that the strictest member alignment is satisfied. Writing to one member and reading from a different one reinterprets the raw bytes, which is either a powerful tool or a disaster, depending on whether you do it intentionally.

The classic legitimate use cases are: type-punning (reinterpreting the raw bytes of a float as a uint32_t, for example), memory-mapped hardware registers where the same address has different meanings, and building tagged unions (also called discriminated unions) where a type tag tells you which member is currently valid.

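The illegitimate use is writing member A and reading member B while expecting a meaningful value conversion. The C standard (C11 6.5.2.3, footnote 95) says such a read reinterprets the stored bytes as the new type: you get the bit pattern, never a converted value, and the result can be a trap representation if those bytes are not a valid value of that type. The same code is undefined behaviour in C++, so it is unsafe in headers shared between the two languages. Inspecting any object's raw bytes through unsigned char is always allowed.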

Unions shine in embedded systems, protocol parsers, and anywhere you need to look at the same memory through different lenses. They're also the foundation of variant types — the C equivalent of what Rust calls enums with data.

network_packet_union.c · C
#include <stdio.h>
#include <stdint.h>  // for uint8_t, uint32_t

// A union that lets us treat a 4-byte IPv4 address as either
// a single 32-bit integer OR four individual octets.
// This is a textbook legitimate use of a union.
typedef union {
    uint32_t as_integer;    // View the address as one 32-bit number
    uint8_t  octets[4];     // View the same bytes as four separate octets
} IPv4Address;

// A tagged union — the 'type' field tells us which union member is valid.
// This is the safe pattern for using unions in application code.
typedef enum {
    SENSOR_TEMPERATURE,
    SENSOR_PRESSURE,
    SENSOR_HUMIDITY
} SensorType;

typedef struct {
    SensorType type;         // The TAG — always check this before reading the union
    union {
        float   temperature_celsius;  // Valid when type == SENSOR_TEMPERATURE
        float   pressure_hpa;         // Valid when type == SENSOR_PRESSURE
        uint8_t humidity_percent;     // Valid when type == SENSOR_HUMIDITY
    } reading;               // All three share the same 4 bytes
} SensorReading;

void print_sensor_reading(const SensorReading *s) {
    switch (s->type) {
        case SENSOR_TEMPERATURE:
            printf("Temperature : %.2f C\n", s->reading.temperature_celsius);
            break;
        case SENSOR_PRESSURE:
            printf("Pressure    : %.2f hPa\n", s->reading.pressure_hpa);
            break;
        case SENSOR_HUMIDITY:
            printf("Humidity    : %u %%\n", s->reading.humidity_percent);
            break;
    }
}

int main(void) {
    // --- IPv4 union demo ---
    IPv4Address addr;
    addr.as_integer = 0xC0A80101;  // 192.168.1.1 written most significant octet first

    printf("=== IPv4 Union Demo ===\n");
    printf("As integer : 0x%08X\n", addr.as_integer);
    // NOTE: octet order depends on host endianness (little-endian shown here)
    printf("As octets  : %u.%u.%u.%u\n",
           addr.octets[3], addr.octets[2],
           addr.octets[1], addr.octets[0]);

    printf("sizeof(IPv4Address) = %zu bytes\n", sizeof(IPv4Address));

    // --- Tagged union demo ---
    printf("\n=== Sensor Tagged Union Demo ===\n");

    SensorReading temp_reading = {
        .type = SENSOR_TEMPERATURE,
        .reading.temperature_celsius = 23.7f
    };

    SensorReading pressure_reading = {
        .type = SENSOR_PRESSURE,
        .reading.pressure_hpa = 1013.25f
    };

    SensorReading humidity_reading = {
        .type = SENSOR_HUMIDITY,
        .reading.humidity_percent = 68
    };

    print_sensor_reading(&temp_reading);
    print_sensor_reading(&pressure_reading);
    print_sensor_reading(&humidity_reading);

    // Show that all three sensor readings share the same memory footprint
    printf("\nsizeof(SensorReading) = %zu bytes\n", sizeof(SensorReading));

    return 0;
}
▶ Output
=== IPv4 Union Demo ===
As integer : 0xC0A80101
As octets  : 192.168.1.1
sizeof(IPv4Address) = 4 bytes

=== Sensor Tagged Union Demo ===
Temperature : 23.70 C
Pressure    : 1013.25 hPa
Humidity    : 68 %

sizeof(SensorReading) = 8 bytes
🔥 Interview Gold: The Tagged Union Pattern
A tagged union (a struct containing an enum tag plus a union) is C's version of a type-safe variant. Interviewers love asking how you'd implement a value that can be one of several types, and this is the answer. The same idea underlies how many compilers represent AST nodes and how SQLite represents dynamically typed values internally.

Combining Structs and Unions — Building Real Data Structures

In production C code, structs and unions almost always appear together. A pure union with no tag is hard to use safely. A struct with no unions is sometimes wasteful. Combine them and you get expressive, memory-efficient data models.

A common real-world pattern is a variant record — a struct that represents one of several possible entity types, where the correct interpretation depends on a discriminator field. This pattern powers everything from protocol buffer implementations to expression trees in compilers.

Another key pattern is bit fields inside structs, which let you pack boolean flags and small integers into individual bits rather than full bytes. This is critical in embedded systems where a microcontroller might have only 2KB of RAM.

The example below builds a minimal expression tree node — the kind of structure you'd find inside a C compiler or calculator app. Each node is either a number literal, a binary operation, or a variable reference. The tagged union makes this safe and self-describing.

expression_tree.c · C
#include <stdio.h>
#include <stdlib.h>

// Every node in our expression tree is one of these types
typedef enum {
    NODE_NUMBER,     // A literal integer value like 42
    NODE_VARIABLE,   // A named variable like 'x'
    NODE_BINARY_OP   // An operation like left + right
} NodeType;

typedef enum {
    OP_ADD,
    OP_SUBTRACT,
    OP_MULTIPLY
} BinaryOperator;

// Forward declaration so we can have self-referential pointers
typedef struct ExprNode ExprNode;

struct ExprNode {
    NodeType type;   // The tag — ALWAYS set this, ALWAYS check it before reading

    union {
        // Used when type == NODE_NUMBER
        int literal_value;

        // Used when type == NODE_VARIABLE
        char variable_name[16];

        // Used when type == NODE_BINARY_OP
        struct {
            BinaryOperator  op;
            ExprNode       *left;   // Pointer — nodes can't contain themselves directly
            ExprNode       *right;
        } binary;
    } data;
};

// Helper constructors — clean API hides the union details from callers
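ExprNode *make_number(int value) {
    ExprNode *node = malloc(sizeof(ExprNode));
    if (node == NULL) {          // malloc can fail; never dereference NULL
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->type = NODE_NUMBER;
    node->data.literal_value = value;
    return node;
}

ExprNode *make_binary(BinaryOperator op, ExprNode *left, ExprNode *right) {
    ExprNode *node = malloc(sizeof(ExprNode));
    if (node == NULL) {          // malloc can fail; never dereference NULL
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->type = NODE_BINARY_OP;
    node->data.binary.op    = op;
    node->data.binary.left  = left;
    node->data.binary.right = right;
    return node;
}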

// Recursive evaluator — the type tag drives every decision
int evaluate(const ExprNode *node) {
    switch (node->type) {
        case NODE_NUMBER:
            return node->data.literal_value;

        case NODE_BINARY_OP: {
            int left_val  = evaluate(node->data.binary.left);
            int right_val = evaluate(node->data.binary.right);
            switch (node->data.binary.op) {
                case OP_ADD:      return left_val + right_val;
                case OP_SUBTRACT: return left_val - right_val;
                case OP_MULTIPLY: return left_val * right_val;
            }
            return 0;  // Unknown operator: do not fall through into the next case
        }

        case NODE_VARIABLE:
            // Variable lookup not implemented in this demo
            return 0;
    }
    return 0;  // Unreachable — silences compiler warning
}

void free_tree(ExprNode *node) {
    if (node == NULL) return;
    if (node->type == NODE_BINARY_OP) {
        free_tree(node->data.binary.left);
        free_tree(node->data.binary.right);
    }
    free(node);
}

int main(void) {
    // Build the expression tree for: (3 + 4) * (10 - 2)
    //
    //         MULTIPLY
    //        /        \
    //      ADD       SUBTRACT
    //     /   \      /      \
    //    3     4   10        2

    ExprNode *left_branch  = make_binary(OP_ADD,      make_number(3),  make_number(4));
    ExprNode *right_branch = make_binary(OP_SUBTRACT, make_number(10), make_number(2));
    ExprNode *root         = make_binary(OP_MULTIPLY, left_branch, right_branch);

    int result = evaluate(root);
    printf("(3 + 4) * (10 - 2) = %d\n", result);

    // Show that each node is the same size regardless of which union member is active
    printf("sizeof(ExprNode)    = %zu bytes\n", sizeof(ExprNode));

    free_tree(root);  // Clean up — always free what you malloc
    return 0;
}
▶ Output
(3 + 4) * (10 - 2) = 56
sizeof(ExprNode) = 32 bytes
⚠️ Pro Tip: Hide Union Complexity Behind Constructor Functions
Never let callers manually set a union member without also setting the tag: that's how you get type confusion bugs that don't surface until months later in production. Wrap every union-containing struct in small constructor functions (like make_number and make_binary above) that set the tag and the matching member together in one place, so the two can never drift apart.
| Feature / Aspect | struct | union |
| --- | --- | --- |
| Memory allocation | Each member gets its own dedicated memory slot | All members share a single memory block |
| Total size | Sum of all member sizes plus padding bytes | Size of the largest member, plus any trailing padding |
| Simultaneous members | All members are valid and accessible at all times | Only the last-written member is valid |
| Primary use case | Grouping related data that all needs to coexist | Type-punning, variant types, memory-mapped registers |
| Safety | No conflicts between members | Error-prone unless paired with a type tag (discriminator) |
| Padding behaviour | Padding inserted between members and at the end for alignment | Padding added only at the end, to satisfy the strictest member alignment |
| Array of elements | Common and straightforward; each element is independent | Possible but unusual; every element has the size of the largest member |
| Nested usage | Can contain unions as members (tagged union pattern) | Can contain structs as members (anonymous struct inside union) |
| Typical domains | Application data models, protocol headers, game entities | Embedded systems, compilers, network protocol parsers |

🎯 Key Takeaways

  • A struct allocates independent memory for every member — all fields coexist. A union allocates memory for only its largest member — all fields overlap. This single difference defines every use case for each.
  • The compiler inserts silent padding bytes between struct members for CPU alignment. Reordering fields largest-to-smallest typically reduces or eliminates padding, which matters at scale and in embedded systems.
  • A bare union is almost always a bug waiting to happen. Always pair a union with an enum tag inside a struct — this creates a tagged union (discriminated union) that's the only safe pattern for using unions in application code.
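  • Never send raw structs across a network boundary or write them to a binary file, and never compare them with memcmp: padding bytes hold unspecified values. Serialise and compare field-by-field. Zeroing the struct with memset before populating it reduces the risk, but = {0} is only guaranteed to zero the members, not the padding.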

⚠ Common Mistakes to Avoid

  • Mistake 1: Reading a union member you didn't write to. If you write to u.float_value and then read u.int_value expecting an implicit conversion, you'll get the raw bit reinterpretation of the float, not a converted integer. C permits the read, but the value is almost never what you wanted and can be a trap representation for some types; in C++ the same code is undefined behaviour. Fix: always track which member is active using a tag enum alongside the union, and only read the member that matches the current tag.
  • Mistake 2: Using memcmp or memcpy on padded structs for equality checks or serialisation. Padding bytes hold unspecified values, usually whatever was in memory when the struct was allocated. Two structs with identical field values but different padding bytes will fail a memcmp equality check. Sending the raw struct over a network or to a file transmits those bytes too. Fix: write field-by-field comparison and serialisation functions that never touch padding. If you need byte-wise comparison, memset the struct to zero before filling it in; a = {0} initialiser is not guaranteed to clear padding.
  • Mistake 3: Casting between struct types that merely share a first field. The C standard guarantees that a pointer to a struct, suitably converted, points to its first member, so embedding a base struct as the first member and converting between the two pointers is valid. What is not valid is casting between unrelated struct types that just happen to start with the same field and then reading through the wrong type: that breaks the aliasing rules and is undefined behaviour. This bites people who try to implement polymorphism with look-alike structs. Fix: embed a common base struct as the first member, or use a proper tagged union, rather than casting between unrelated struct types.

Interview Questions on This Topic

  • Q: If a struct has a char, a double, and an int in that order, what is likely to be its size on a 64-bit system and why? Walk me through the padding the compiler inserts.
  • Q: What is a tagged union, when would you use one over a plain union, and can you sketch a real example where it would be the right data structure?
  • Q: Is it ever valid to write to one union member and read from a different one? What does the C standard actually say, and are there any exceptions?

Frequently Asked Questions

What is the difference between a struct and a union in C?

A struct allocates separate memory for each member, so all fields exist simultaneously and can be read or written independently. A union allocates one shared block of memory sized for its largest member, meaning only one member holds a valid value at any given time. Structs model entities with multiple concurrent properties; unions model a single value that can be interpreted as different types.

Why is sizeof(struct) larger than the sum of its members?

The compiler inserts padding bytes between struct members to satisfy CPU alignment requirements — for example, a 4-byte int must start at an address divisible by 4. There may also be trailing padding at the end so that arrays of the struct keep each element correctly aligned. You can see exact offsets using the offsetof macro from stddef.h.

Can I use a union to convert between types, like writing a float and reading an int?

This is called type-punning and the rules are nuanced. In C, reading a union member other than the one last written is allowed and reinterprets the stored bytes as the new type, so you get the bit pattern rather than a converted value, and the result may be a trap representation for some types. In C++ the same code is undefined behaviour. Reading any object through an unsigned char pointer is always allowed and gives you the raw bytes. For deliberate type-punning (like inspecting the bit pattern of a float), memcpy into an object of the target type, such as a uint32_t, is well defined in both languages, and modern compilers typically optimise the copy away.

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.
