
Strings in C Explained — How They Work, Why They're Tricky, and How to Use Them Safely

Where developers are forged. · Structured learning · Free forever.
📍 Part of: C Basics → Topic 7 of 17
Master C strings: learn about the null terminator, memory layout, buffer overflows, and safe input using fgets.
🧑‍💻 Beginner-friendly — no prior C / C++ experience needed
In this tutorial, you'll learn
  • A C string is just a char array in contiguous memory with a '\0' byte at the end — there's no magic, just a convention every standard function depends on.
  • strlen() and sizeof() measure different things: strlen counts characters before the null terminator; sizeof counts the total bytes of the array variable including the null terminator.
  • Never use gets() or unconstrained scanf("%s") for user input — use fgets(buffer, sizeof(buffer), stdin) to prevent buffer overflows.
Quick Answer

Imagine you're writing letters on a long strip of paper, one letter per box, and at the very end you draw a big red STOP sign so whoever's reading knows the message is finished. That's exactly how C stores text — one character per memory slot, with a special invisible 'stop' character at the end. Without that stop sign, your program wouldn't know where your message ends and would keep reading random garbage off the paper.

Every program that talks to a human needs text. Whether it's a login prompt, an error message, a username, or a file path — text is everywhere. In languages like Python or JavaScript, strings are cosy, fully managed objects that do a lot of heavy lifting for you. C, on the other hand, hands you the raw tools and trusts you to build the house yourself. That might sound scary, but understanding how C handles text under the hood makes you a dramatically better programmer in any language.

The core problem C strings solve is deceptively simple: how do you store a sequence of characters in memory and then find where that sequence ends? Memory is just a giant numbered grid of bytes. There's no built-in concept of 'a word' or 'a sentence'. C's answer is a convention called the null-terminated string — store your characters in consecutive memory slots and place a special zero-value byte at the end as a sentinel. Every standard library function that works with strings relies on this single rule.

By the end of this article you'll know exactly how C strings are stored in memory, how to declare and initialise them correctly, how to manipulate them using the standard library, and — most importantly — how to avoid the buffer overflows and undefined behaviour that trip up even experienced developers. You'll be reading real code, seeing real output, and walking away with a mental model that actually sticks.

What a C String Actually Is in Memory

A C string is not a special type. It's just a sequence of 'char' values stored in contiguous memory, ending with '\0' (the null terminator, a byte whose value is 0), and it is usually handled through a pointer to its first character. That's it. There's no hidden length field and no magic object, just raw bytes in a row.

Think of RAM as a long street of numbered houses. Each house holds one character. When C stores the word 'Hello', it rents five houses in a row — one for 'H', one for 'e', one for 'l', one for 'l', one for 'o' — and then immediately rents one more house where it places a STOP sign (the '\0'). So 'Hello' actually occupies 6 bytes, not 5.

This is why the length of a string and the memory it needs are different numbers. strlen() counts the characters before the stop sign. sizeof() tells you the total space including the stop sign. Confusing these two is one of the most common beginner mistakes, so burn that distinction into your memory right now.

Whenever a standard library function like printf or strcpy reads a C string, it starts at the first character and keeps going until it hits that '\0'. That's the contract every piece of C string code relies on. Break that contract — forget the null terminator — and your program wanders into memory it doesn't own.

string_memory_layout.c · C
#include <stdio.h>
#include <string.h>

/**
 * io.thecodeforge package-style demonstration
 * Showing the internal byte representation of a C string
 */
void debug_string_memory(void) {
    char greeting[] = "Hello";
    size_t char_count = strlen(greeting);
    size_t byte_count = sizeof(greeting);

    printf("String: %s\n", greeting);
    printf("strlen: %zu (visible chars)\n", char_count);
    printf("sizeof: %zu (total memory bytes)\n", byte_count);

    printf("\nByte Map:\n");
    for (size_t i = 0; i < byte_count; i++) {
        printf("  index [%zu]: '%c' (Hex: 0x%02X)\n", 
               i, (greeting[i] ? greeting[i] : '?'), (unsigned char)greeting[i]);
    }
}

int main(void) {
    debug_string_memory();
    return 0;
}
▶ Output
String: Hello
strlen: 5 (visible chars)
sizeof: 6 (total memory bytes)

Byte Map:
  index [0]: 'H' (Hex: 0x48)
  index [1]: 'e' (Hex: 0x65)
  index [2]: 'l' (Hex: 0x6C)
  index [3]: 'l' (Hex: 0x6C)
  index [4]: 'o' (Hex: 0x6F)
  index [5]: '?' (Hex: 0x00)
⚠ Watch Out: strlen vs sizeof
Never use sizeof() to get the number of characters in a string; use strlen(). sizeof gives you the byte size of the array variable, not the logical length. For an array holding a valid string, sizeof is always at least one more than strlen because of the null terminator, and for a pointer it is just the pointer's size. This mix-up causes off-by-one bugs that are incredibly hard to track down.

Three Ways to Declare a String — and Which One to Use When

C gives you three different ways to create a string, and each one behaves differently in memory. Picking the wrong one at the wrong time is a classic source of bugs.

The first way is a character array initialised with a string literal: 'char name[] = "Alice";'. The compiler figures out the right size, copies the characters including the null terminator into stack memory, and gives you a mutable buffer you can change. This is the go-to choice when you need to modify the string later.

The second way is to give the array an explicit size: 'char name[50] = "Alice";'. Now you've got 50 bytes reserved, with 'Alice\0' at the start and the rest zeroed out. This is what you want when you're planning to read user input into the buffer — you're pre-allocating the space.

The third way is a pointer to a string literal: 'const char *message = "Hello";'. This does NOT copy the string into a regular variable. Instead, the string 'Hello\0' lives in a read-only section of your program's memory, and 'message' is just a pointer to it. Trying to modify this string causes undefined behaviour — the program might crash, might silently corrupt data, or might appear to work fine on your machine and explode on someone else's. Always mark these 'const'.

string_declarations.c · C
#include <stdio.h>
#include <string.h>

int main(void) {
    // 1. Stack Array (Mutable)
    char mutable_str[] = "Forge";
    mutable_str[0] = 'f'; // Valid

    // 2. Pre-allocated Buffer
    char buffer[128] = "io.thecodeforge";

    // 3. String Literal Pointer (Read-Only)
    const char *readonly_msg = "Strictly Read Only";

    printf("Array: %s\n", mutable_str);
    printf("Buffer: %s\n", buffer);
    printf("Pointer: %s\n", readonly_msg);

    return 0;
}
▶ Output
Array: forge
Buffer: io.thecodeforge
Pointer: Strictly Read Only
🔥Pro Tip: Always Use const for Literal Pointers
The compiler won't always stop you from writing 'char *msg = "hello";' (without const), but it's lying to you: that memory is typically read-only at runtime. Always write 'const char *msg = "hello";'. It makes your intent clear, and the compiler will reject any accidental attempt to modify the string through that pointer.

The Essential String Functions You'll Use Every Day

C's standard library ships with a set of string functions in <string.h> that cover the operations you'll need constantly — measuring length, copying, joining, comparing, and searching. They're thin, fast, and they all depend on that null terminator contract we talked about.

strlen(s) walks the string from the start until it hits '\0' and returns how many steps it took. O(n) — it actually loops through every character each time you call it, so don't call it inside a loop's condition if you can avoid it.

strcpy(destination, source) copies every character from source into destination, including the final '\0'. The danger: it blindly trusts that destination is big enough. If it isn't, you've just written past the end of your buffer, which is a classic buffer overflow. strncpy caps the number of bytes written, but it does not add a '\0' when the source is too long, so you must terminate the buffer yourself. strlcpy (where available) and snprintf always terminate.

strcmp(a, b) returns 0 if the strings are identical, a negative number if a sorts before b, and a positive number if a sorts after b. The order is decided byte by byte on character codes, so in ASCII every uppercase letter sorts before every lowercase one. Do NOT use == to compare strings in C: it compares pointer addresses, not content.

string_functions_demo.c · C
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *src = "thecodeforge";
    char dest[20];

    // Safe Copying
    strncpy(dest, src, sizeof(dest) - 1);
    dest[sizeof(dest) - 1] = '\0'; // Manual safety termination

    // Comparison
    if (strcmp(dest, "thecodeforge") == 0) {
        printf("Strings match exactly.\n");
    }

    // Substring Search
    char *found = strstr(dest, "forge");
    if (found) {
        // Pointer subtraction yields a ptrdiff_t, which is printed with %td
        printf("Found substring at index: %td\n", found - dest);
    }

    return 0;
}
▶ Output
Strings match exactly.
Found substring at index: 7
💡Interview Gold: Why Can't You Use == to Compare Strings?
Because in an expression, a C string (an array or a literal) turns into a pointer to its first character. Writing 'str1 == str2' compares the two memory addresses, not the characters stored there. Two strings with identical content can sit at different addresses, and the comparison is then false. Always use strcmp(), and always check its return value against 0 rather than treating it as a boolean.

Reading Strings from the User Safely with fgets

This is where beginners cause the most damage. The classic first instinct is to use scanf("%s", buffer) to read a string from the keyboard. It works — until your user types more characters than your buffer holds, and now you've written past the end of your array into memory you don't own. That's a buffer overflow, and it's one of the most exploited classes of security vulnerabilities in the history of software.

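fgets is the safe alternative. It takes three arguments: the buffer to write into, the size of that buffer in bytes, and the stream to read from (stdin for keyboard input). It reads at most size - 1 characters, stops early after a newline or at end-of-file, and always appends the null terminator, so it never writes past the end of your buffer. If the newline fit, it is stored in the buffer too, which is why the example strips it.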

safe_string_input.c · C
#include <stdio.h>
#include <string.h>

int main(void) {
    char input_buffer[32];

    printf("Enter code tag: ");

    // fgets stores at most sizeof(input_buffer) - 1 characters plus the '\0'
    if (fgets(input_buffer, sizeof(input_buffer), stdin)) {
        // Strip the trailing newline often left by enter key
        input_buffer[strcspn(input_buffer, "\n")] = 0;
        
        printf("Processing: [%s]\n", input_buffer);
    }

    return 0;
}
▶ Output
Enter code tag: feature-request
Processing: [feature-request]
⚠ Watch Out: Never Use gets()
gets() was removed from the language in the C11 standard because it cannot be used safely: there is no way to tell it your buffer size, so any input longer than the buffer causes undefined behaviour. Compilers and linkers commonly warn when they see it. Use fgets(buffer, sizeof(buffer), stdin) every single time.
| Aspect | char array (char name[]) | char pointer (const char *) |
| --- | --- | --- |
| Memory location | Stack (local) or data segment | Read-only data segment (the literal itself) |
| Can you modify the content? | Yes, it's your buffer | No, undefined behaviour if you try |
| Size known at compile time? | Yes, sizeof() works correctly | No, sizeof() gives pointer size, not string length |
| Good for user input? | Yes, use with fgets() | No, a literal can't receive input |
| Good for fixed messages? | Works, but wastes a copy | Yes, ideal; mark it const |
| Null terminator required? | Yes, always | Yes, always; it's the law of C strings |
| Comparison method | strcmp() only | strcmp() only |
| Common beginner trap | Forgetting to allocate +1 for the null terminator | Omitting const, then modifying the literal with no compiler warning |

🎯 Key Takeaways

  • A C string is just a char array in contiguous memory with a '\0' byte at the end — there's no magic, just a convention every standard function depends on.
  • strlen() and sizeof() measure different things: strlen counts characters before the null terminator; sizeof counts the total bytes of the array variable including the null terminator.
  • Never use gets() or unconstrained scanf("%s") for user input — use fgets(buffer, sizeof(buffer), stdin) to prevent buffer overflows.
  • Always use strcmp() to compare strings, never ==. In an expression a string decays to a pointer, so == compares addresses, not the characters they point to.

⚠ Common Mistakes to Avoid

    Forgetting to allocate space for the null terminator: if you write 'char word[5] = "Hello";' you've asked for exactly 5 bytes but 'Hello' needs 6 (5 chars + '\0'). C accepts this and silently drops the null terminator (C++ rejects it), so the array is no longer a valid string and every string function called on it reads past the end. Fix: always declare char arrays with length = expected characters + 1, or let the compiler count for you with 'char word[] = "Hello";'.

    Using == to compare strings: writing 'if (input == "yes")' compares two memory addresses, not the characters they point to. It will almost always be false even when the strings look identical, causing logic bugs that are maddening to diagnose. Fix: always use 'strcmp(input, "yes") == 0'. A return value of 0 means the strings are identical.

    Using strcpy or strcat without checking buffer capacity: if the source string is longer than the destination buffer, these functions will happily write past the end of your array, overwriting adjacent variables or return addresses. This is a buffer overflow. Fix: use strncpy(dest, src, sizeof(dest) - 1) followed by 'dest[sizeof(dest) - 1] = '\0';' to guarantee null termination, or use the safer snprintf() for building strings.

Interview Questions on This Topic

  • Q: How does the null terminator affect the time complexity of the strlen() function? Explain the difference between O(1) and O(n) in this context.
  • Q: Explain why 'char *p = "Hello"; p[0] = 'h';' leads to a Segmentation Fault on most modern operating systems.
  • Q: Given a character array char buf[10], what happens if you attempt to store the string "IDENTIFICATION" using strcpy? Describe the impact on the stack frame.
  • Q: How would you implement a basic version of strlen without using any library functions? Write the code using a while loop and pointer arithmetic.
  • Q: What is the 'Off-by-one' error specifically related to C strings and the null terminator?

Frequently Asked Questions

What is a null terminator in C strings and why is it needed?

The null terminator is a byte with the value zero ('\0') placed at the end of every C string. Because C has no built-in string type and strings are just arrays of characters in raw memory, the null terminator is the only signal that tells functions like printf, strlen, and strcpy where the string ends. Without it, those functions keep reading memory past your string until they accidentally find a zero byte somewhere, causing unpredictable bugs.

What is the difference between a string literal and a char array in C?

A string literal like "Hello" is stored in a read-only section of your program's memory and should never be modified. A char array like 'char greeting[] = "Hello";' copies those characters into a mutable buffer on the stack that you can freely change. The literal is the source of truth; the array is your working copy.

Why does sizeof() give the wrong length for a string pointer in C?

When you have 'const char *msg = "Hello";', msg is a pointer variable — typically 8 bytes on a 64-bit system. sizeof(msg) gives you the size of the pointer itself, not the size of the string it points to. To get the character count of the string, use strlen(msg). This is one of the most common beginner confusions in C.

How do I clear a C string buffer efficiently?

The most common way is using memset(buffer, 0, sizeof(buffer)); which fills the entire array with null characters. Alternatively, simply setting buffer[0] = '\0'; effectively makes it an 'empty' string from the perspective of standard C functions, though the old data remains in the subsequent memory slots.

Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
