C Struct Padding — 3-Byte Pad Corrupted 50% Packets
A 3-byte padding mismatch between x86 and ARM silently corrupted 50% of network packets.
20+ years shipping performance-critical C and C++ systems. Written from production experience, not tutorials.
- A struct gives each member its own memory slot; a union makes all members share one block.
- Structs are for data that coexists; unions for data that is mutually exclusive.
- Padding aligns members to CPU boundaries — sizeof(struct) often exceeds sum of its members.
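- Reading a union member other than the one last written reinterprets the stored bytes in C (and is undefined behavior in C++); inspecting any object via char/unsigned char is always allowed.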
- Always pair a union with an enum tag to track the active member.
- Reorder struct fields largest-to-smallest to minimize padding and save memory.
Think of a struct like a passport — it holds your name, date of birth, nationality, and photo all in one booklet, each piece of information living in its own dedicated slot. A union is more like a whiteboard that only one person can write on at a time — the same physical space gets reused for different types of information depending on who needs it. The passport always has room for every field; the whiteboard only ever holds the most recent thing written on it. That single difference in 'shared vs dedicated memory' is the entire story of structs vs unions.
Every real-world program deals with grouped data. A game needs to track a player's name, health, score, and position together. A network driver needs to interpret the same 4 bytes as either an IPv4 address, a 32-bit integer, or four individual octets depending on context. Trying to manage all of that with loose individual variables is like trying to run a hospital with sticky notes instead of patient records — technically possible, catastrophically unmanageable. Structures and unions are C's answer to that chaos.
The problem they solve is fundamentally about organisation and memory semantics. A struct gives you a custom data type that bundles related variables under one name, each with its own guaranteed memory slot. A union takes that idea and flips the memory model — all members share the same block of memory, which means you get type-reinterpretation and memory efficiency at the cost of only being able to use one member at a time. These aren't just syntax features; they're tools that let you model the real world accurately in code.
By the end of this article you'll understand exactly how struct and union memory layouts work, when each is the right tool, how to combine them for practical patterns like tagged unions, and the exact mistakes that trip up even experienced C developers. You'll also be able to confidently answer the interview questions that separate candidates who've read about C from those who've actually used it.
How C Struct Padding Corrupts Your Data
C structs are composite data types that group related variables under one name. The core mechanic is that the compiler may insert unused bytes between members to satisfy the alignment requirements of the target architecture. This padding ensures each member starts at an address that is a multiple of its alignment requirement, which for scalar types usually equals the size (e.g., a 4-byte int typically must start at an address divisible by 4). The result is that the in-memory size of a struct is not simply the sum of its members' sizes; it can be larger, and the offsets are compiler- and platform-dependent.
When you define a struct with members of different sizes (e.g., char, int, short), the compiler aligns each member to its natural boundary. For example, a struct with a char followed by an int will have 3 bytes of padding after the char so the int starts at a 4-byte boundary. The total size becomes 8 bytes instead of 5. This padding is invisible in source code but critical when serializing structs to a buffer, sending over a network, or writing to a file. If you memcpy the struct directly, you copy the padding bytes — which may contain garbage or stale data.
Control padding deliberately when you need an exact memory layout for hardware registers, network protocols, or binary file formats. __attribute__((packed)) (GCC and Clang) or #pragma pack(1) (MSVC, also accepted by GCC and Clang) eliminates padding, but at the cost of potential misaligned access penalties on some architectures. In real systems, ignoring padding leads to buffer overruns, checksum mismatches, and corrupted packets, especially when crossing language boundaries (e.g., a C struct sent to Java via JNI).
__attribute__((packed)) can cause unaligned memory access, which on ARM or SPARC can trigger a bus error or silently degrade performance.
Check sizeof() and verify it against the wire format; never assume the in-memory layout matches the declaration byte for byte.
Use sizeof() and offsetof() to determine the actual layout; never hardcode offsets.
Structs: Grouping Related Data With Dedicated Memory
A struct (short for structure) lets you define a composite data type — a single named container that holds multiple members, each with its own type. The compiler allocates memory for every member independently, so all fields exist simultaneously and can be read or written in any order.
The real power isn't just convenience — it's that a struct becomes a first-class type. You can pass it to functions, return it, put it in arrays, and point to it. This lets you model domain concepts directly. A 'Player' struct isn't just three variables that happen to be related; it's a single coherent entity your code can reason about.
Under the hood, struct members are laid out sequentially in memory, but the compiler is allowed to insert padding bytes between members to satisfy alignment requirements of the target CPU. This means sizeof(struct Player) might be larger than you expect, and it's the first thing you need to internalise before you do anything serious with structs in systems programming or binary file I/O.
Use structs whenever you have data that naturally belongs together and needs all its fields present at the same time — think database records, configuration objects, game entities, or network packet headers.
Using the .fieldname = value syntax (C99+) instead of positional initialisation means adding a new field to your struct won't silently corrupt all your existing initialisers. It also makes the code self-documenting: you can see exactly which field each value maps to without counting commas.
sizeof() may be larger than the sum of the member sizes.
Check sizeof() before serializing or allocating arrays of structs.
Nested Structures: Structs Within Structs
Real-world data is rarely flat. A Person doesn't just have a name and age — they have an address, which itself has street, city, and zip code. C lets you model this naturally by placing one struct inside another as a member. This is called nesting, and it's how you build hierarchical data models without losing type safety.
When you nest a struct, the inner struct's fields are laid out as a contiguous block inside the outer struct, subject to alignment rules. The total size of the outer struct includes the full size of the inner struct, plus any padding needed after it. Accessing a nested member requires multiple dot operators: person.address.zip. You can also use designated initializers with nested structs: .address.city = "Boston" — but each nested level must be initialized in a separate brace-enclosed block or with the dot notation.
Nested structs appear everywhere: database records with sub-objects, network packet headers with inner protocol fields, and configuration trees. They are also the foundation of the tagged union pattern when combined with unions.
One common pitfall is assuming the inner struct starts immediately after the previous member. It may not: the compiler can insert padding before the inner struct so that it meets its own alignment requirement. Always check sizeof and offsetof if you rely on binary layout.
Use .addr.street = ... in your designated initialisers. It makes the hierarchy explicit and avoids mismatched braces. This is C99 and later, and every modern C compiler supports it.
Pointer to Structure and the Arrow Operator
When you work with large structs in production C, you almost never pass them by value. Copying a 100-byte struct on the stack is expensive and unnecessary. Instead, you pass a pointer to the struct — typically a 4 or 8-byte value — and access members through that pointer. This is where the arrow operator -> comes in.
The arrow operator is syntactic sugar. ptr->member is exactly equivalent to (*ptr).member. But it's not just about saving keystrokes: it makes pointer semantics explicit and reduces the chance of precedence errors (. binds tighter than *, so *ptr.member parses as *(ptr.member) and dereferences the wrong thing).
Using pointers to structs is essential for dynamic allocation (malloc(sizeof(MyStruct))), linked lists, trees, and any function that needs to modify the original struct. When you pass a struct by value, the function works on a copy — modifications are lost. When you pass by pointer with ->, the function can mutate the original.
Always check for NULL before dereferencing a struct pointer that could be null. A null pointer dereference is undefined behaviour and on most hosted platforms crashes the process immediately. Use if (p != NULL) { p->field; } religiously.
One advanced pattern: you can have pointers to nested struct members. For example, Address *a = &person.addr; then a->zip = 90210; modifies the original. This is how you build flexible hierarchical APIs.
When you combine -> with unary operators (* for dereference, & for address-of), remember that -> binds tighter than both: &hero->health is the address of health, and *hero->name dereferences name. When in doubt, add parentheses, for example &(hero->health); they never hurt.
Memory Layout and Padding: Why sizeof Surprises You
This is the section most tutorials skip, and it's the one that causes the most real-world bugs. CPUs are picky about alignment — a 4-byte int wants to live at a memory address that's divisible by 4. A double wants an address divisible by 8. When the compiler lays out struct members sequentially, it inserts invisible padding bytes to honour these constraints.
Consider a struct with a char (1 byte) followed by an int (4 bytes). The char sits at offset 0, but the int needs to start at offset 4 — so 3 bytes of padding are inserted silently. The struct's total size also gets padded at the end so that arrays of the struct keep every element aligned.
This matters enormously in three situations: serialising structs to binary files or network packets (padding bytes contain garbage), computing offsets manually, and squeezing memory in embedded systems. The fix in the first two cases is either reordering your members largest-to-smallest (which often eliminates padding naturally) or using __attribute__((packed)) / #pragma pack — but only when you truly need it, because unaligned access is slower on most architectures and outright illegal on some.
Unions: One Memory Location, Many Interpretations
A union looks syntactically identical to a struct but operates on a completely different principle: all members share the same starting address and the same block of memory. The union's size is at least the size of its largest member, rounded up if needed to satisfy alignment. Writing to one member and reading from a different one reinterprets the raw bytes, which is either a powerful tool or a disaster, depending on whether you do it intentionally.
The classic legitimate use cases are: type-punning (reinterpreting the raw bytes of a float as a uint32_t, for example), memory-mapped hardware registers where the same address has different meanings, and building tagged unions (also called discriminated unions) where a type tag tells you which member is currently valid.
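The illegitimate use is writing member A and reading member B expecting a meaningful 'conversion'. C (since C99 TC3) defines such a read as reinterpreting the stored bytes in the new type, not as a value conversion, and the result can be a trap representation; in C++ it is undefined behaviour. Inspecting an object's raw bytes through char/unsigned char is always allowed.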
Combining Structs and Unions — Building Real Data Structures
In production C code, structs and unions almost always appear together. A pure union with no tag is hard to use safely. A struct with no unions is sometimes wasteful. Combine them and you get expressive, memory-efficient data models.
A common real-world pattern is a variant record — a struct that represents one of several possible entity types, where the correct interpretation depends on a discriminator field. This pattern powers everything from protocol buffer implementations to expression trees in compilers.
Another key pattern is bit fields inside structs, which let you pack boolean flags and small integers into individual bits rather than full bytes. This is critical in embedded systems where a microcontroller might have only 2KB of RAM.
Bit Fields and Packed Structs: Fine-Grained Control of Memory Layout
Bit fields let you specify the exact number of bits each member occupies. They're invaluable for hardware register maps, protocol flags, and any scenario where every byte counts. The syntax unsigned int flag : 1; declares a 1-bit field. Multiple bit fields can be packed into the same underlying storage unit.
However, bit fields are highly implementation-defined. The compiler decides whether fields are allocated from left to right or right to left, whether they span storage unit boundaries, and whether int bit fields are signed or unsigned. This makes them non-portable across compilers and even across compiler versions.
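Packed structs (__attribute__((packed)) on GCC and Clang, or #pragma pack(1)) force the compiler to remove all padding. They guarantee byte-exact layout, which is essential for wire protocols and binary file formats. The cost: members may no longer be naturally aligned. On x86 an unaligned access is usually only slightly slower; on strict-alignment targets the compiler has to fall back to byte-by-byte loads and stores, and dereferencing an ordinary pointer taken to a packed member can fault (for example on older ARM cores or SPARC). Always benchmark before deploying packed structs in hot paths.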
Creating a Union: One Memory Slot, Many Hats
You define a union like a struct, but every member shares the same memory address. The compiler sizes the union to fit the largest member, plus any trailing padding needed for alignment. That's it. No separate slots. This isn't a bug; it's a deliberate tool for memory multiplexing.
When you write to one union member, you overwrite all others. There's no magic safety net. The last write wins. This is brilliant for protocol buffers, variant types, or register maps where you need to reinterpret the same bytes differently at different times.
Here's the production grade pattern: wrap the union in a struct with a discriminator field. That enum tells you which member is currently valid. Without that tag, you're gambling on runtime state. Don't gamble.
Declare with union Tag { ... };. Access with dot operator. Pay attention to alignment — unions don't escape padding rules.
Enumeration: Named Constants That Don't Suck
Enums are integer constants with names. They replace magic numbers with something a human can read six months later. In C++, enum class gives you type safety — no implicit conversion to int, no accidental pollution of the enclosing namespace.
Plain enum leaks into the parent scope. That means enum Color { RED, GREEN, BLUE }; puts RED at file scope. Two enums can't share a name without colliding. enum class fixes that: enum class Color { RED, GREEN, BLUE }; requires Color::RED to access.
Use enums for state machines, error codes, configuration flags. They compile down to integers — zero runtime cost. But watch the underlying type. By default it's int. For embedded work or packed structures, specify the type: enum class Flag : uint8_t { ... };.
Don't use enums for bitmask combinations. That's what constexpr or using with bitwise operators is for. Enums are for mutually exclusive choices.
Prefer enum class over plain enum in C++ unless you specifically need implicit int conversion for legacy APIs. It prevents accidental comparisons and namespace pollution.
enum class adds type safety and scope control at zero runtime cost.
The 3-Byte Pad That Silently Corrupted 50% of Our Network Packets
- Never memcpy a struct across a network or file boundary. Padding bytes are uninitialized and platform-dependent.
- Always define a wire format with explicit offsets and sizes, and serialize field-by-field.
- Use static_assert to catch layout surprises at compile time — not at 3am during an outage.
printf("offset of field: %zu\n", offsetof(MyStruct, field));
printf("sizeof struct: %zu\n", sizeof(MyStruct));
Key takeaways
Common mistakes to avoid
- Reading a union member that wasn't the last written
- Using memcmp or memcpy on padded structs for equality or serialization
- Assuming pointer cast between struct types with same first field is safe
- Applying __attribute__((packed)) to every struct thinking it saves memory everywhere
Interview Questions on This Topic
Explain memory alignment and padding. Why might a struct containing a char and a double occupy 16 bytes instead of 9?