Strings in C Explained — How They Work, Why They're Tricky, and How to Use Them Safely
- A C string is just a char array in contiguous memory with a '\0' byte at the end: there's no magic, just a convention every standard function depends on.
- strlen() and sizeof() measure different things: strlen counts characters before the null terminator; sizeof counts the total bytes of the array variable including the null terminator.
- Never use gets() or unconstrained scanf("%s") for user input; use fgets(buffer, sizeof(buffer), stdin) to prevent buffer overflows.
Imagine you're writing letters on a long strip of paper, one letter per box, and at the very end you draw a big red STOP sign so whoever's reading knows the message is finished. That's exactly how C stores text — one character per memory slot, with a special invisible 'stop' character at the end. Without that stop sign, your program wouldn't know where your message ends and would keep reading random garbage off the paper.
Every program that talks to a human needs text. Whether it's a login prompt, an error message, a username, or a file path — text is everywhere. In languages like Python or JavaScript, strings are cosy, fully managed objects that do a lot of heavy lifting for you. C, on the other hand, hands you the raw tools and trusts you to build the house yourself. That might sound scary, but understanding how C handles text under the hood makes you a dramatically better programmer in any language.
The core problem C strings solve is deceptively simple: how do you store a sequence of characters in memory and then find where that sequence ends? Memory is just a giant numbered grid of bytes. There's no built-in concept of 'a word' or 'a sentence'. C's answer is a convention called the null-terminated string — store your characters in consecutive memory slots and place a special zero-value byte at the end as a sentinel. Every standard library function that works with strings relies on this single rule.
By the end of this article you'll know exactly how C strings are stored in memory, how to declare and initialise them correctly, how to manipulate them using the standard library, and — most importantly — how to avoid the buffer overflows and undefined behaviour that trip up even experienced developers. You'll be reading real code, seeing real output, and walking away with a mental model that actually sticks.
What a C String Actually Is in Memory
A C string is not a special type. It's just a sequence of 'char' values stored in contiguous memory, where the last character is always '\0' (the null terminator, a byte with value 0). You usually refer to it through a pointer to its first character. That's it. There's no hidden length field, no magic object, just raw bytes in a row.
Think of RAM as a long street of numbered houses. Each house holds one character. When C stores the word 'Hello', it rents five houses in a row — one for 'H', one for 'e', one for 'l', one for 'l', one for 'o' — and then immediately rents one more house where it places a STOP sign (the '\0'). So 'Hello' actually occupies 6 bytes, not 5.
This is why the length of a string and the memory it needs are different numbers. strlen() counts the characters before the stop sign. sizeof(), applied to the array itself, tells you the total space including the stop sign. Confusing these two is one of the most common beginner mistakes, so burn that distinction into your memory right now.
Whenever a standard library function like printf or strcpy reads a C string, it starts at the first character and keeps going until it hits that '\0'. That's the contract every piece of C string code relies on. Break that contract — forget the null terminator — and your program wanders into memory it doesn't own.
```c
#include <stdio.h>
#include <string.h>

/**
 * io.thecodeforge package-style demonstration
 * Showing the internal byte representation of a C string
 */
void debug_string_memory(void) {
    char greeting[] = "Hello";

    size_t char_count = strlen(greeting);  // characters before '\0'
    size_t byte_count = sizeof(greeting);  // whole array, including '\0'

    printf("String: %s\n", greeting);
    printf("strlen: %zu (visible chars)\n", char_count);
    printf("sizeof: %zu (total memory bytes)\n", byte_count);

    printf("\nByte Map:\n");
    for (size_t i = 0; i < byte_count; i++) {
        // Print '?' in place of the unprintable null terminator
        printf(" index [%zu]: '%c' (Hex: 0x%02X)\n",
               i,
               (greeting[i] ? greeting[i] : '?'),
               (unsigned char)greeting[i]);
    }
}

int main(void) {
    debug_string_memory();
    return 0;
}
```
Output:
```
String: Hello
strlen: 5 (visible chars)
sizeof: 6 (total memory bytes)

Byte Map:
 index [0]: 'H' (Hex: 0x48)
 index [1]: 'e' (Hex: 0x65)
 index [2]: 'l' (Hex: 0x6C)
 index [3]: 'l' (Hex: 0x6C)
 index [4]: 'o' (Hex: 0x6F)
 index [5]: '?' (Hex: 0x00)
```
Don't use sizeof() to get the number of characters in a string; use strlen(). sizeof gives you the byte size of the array variable, not the logical length. For an array sized from its initialiser, sizeof is always one more than strlen because of the null terminator, and for a pointer it is simply the size of the pointer. This mix-up causes off-by-one bugs that are incredibly hard to track down.
Three Ways to Declare a String, and Which One to Use When
C gives you three different ways to create a string, and each one behaves differently in memory. Picking the wrong one at the wrong time is a classic source of bugs.
The first way is a character array initialised with a string literal: 'char name[] = "Alice";'. The compiler figures out the right size, copies the characters including the null terminator into stack memory, and gives you a mutable buffer you can change. This is the go-to choice when you need to modify the string later.
The second way is to give the array an explicit size: 'char name[50] = "Alice";'. Now you've got 50 bytes reserved, with 'Alice\0' at the start and the rest zeroed out. This is what you want when you're planning to read user input into the buffer — you're pre-allocating the space.
The third way is a pointer to a string literal: 'const char *message = "Hello";'. This does NOT copy the string into a regular variable. Instead, the string 'Hello\0' lives in a read-only section of your program's memory, and 'message' is just a pointer to it. Trying to modify this string causes undefined behaviour — the program might crash, might silently corrupt data, or might appear to work fine on your machine and explode on someone else's. Always mark these 'const'.
```c
#include <stdio.h>
#include <string.h>

int main(void) {
    // 1. Stack Array (Mutable)
    char mutable_str[] = "Forge";
    mutable_str[0] = 'f'; // Valid

    // 2. Pre-allocated Buffer
    char buffer[128] = "io.thecodeforge";

    // 3. String Literal Pointer (Read-Only)
    const char *readonly_msg = "Strictly Read Only";

    printf("Array: %s\n", mutable_str);
    printf("Buffer: %s\n", buffer);
    printf("Pointer: %s\n", readonly_msg);
    return 0;
}
```
Output:
```
Array: forge
Buffer: io.thecodeforge
Pointer: Strictly Read Only
```
The Essential String Functions You'll Use Every Day
C's standard library ships with a set of string functions in <string.h> that cover the operations you'll need constantly — measuring length, copying, joining, comparing, and searching. They're thin, fast, and they all depend on that null terminator contract we talked about.
strlen(s) walks the string from the start until it hits '\0' and returns how many steps it took. O(n) — it actually loops through every character each time you call it, so don't call it inside a loop's condition if you can avoid it.
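strcpy(destination, source) copies every character from source into destination, including the final '\0'. The danger: it blindly trusts that destination is big enough. If it isn't, you've just written past the end of your buffer, a classic buffer overflow. Prefer a bounded copy instead. strncpy limits how many bytes are written, but it does not add a '\0' when the source fills the whole limit, so you must terminate the buffer yourself. strlcpy always terminates, but it is an extension (originally from BSD) rather than part of ISO C, so it is not available everywhere.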
strcmp(a, b) returns 0 if the strings are identical, a negative number if a sorts before b, and a positive number if a sorts after b. The ordering is by character code rather than dictionary order, so in ASCII 'Z' sorts before 'a'. Do NOT use == to compare strings in C: it compares pointer addresses, not content.
```c
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *src = "thecodeforge";
    char dest[20];

    // Safe Copying
    strncpy(dest, src, sizeof(dest) - 1);
    dest[sizeof(dest) - 1] = '\0'; // Manual safety termination

    // Comparison
    if (strcmp(dest, "thecodeforge") == 0) {
        printf("Strings match exactly.\n");
    }

    // Substring Search
    char *found = strstr(dest, "forge");
    if (found) {
        // Pointer subtraction yields a ptrdiff_t, which prints with %td
        printf("Found substring at index: %td\n", found - dest);
    }
    return 0;
}
```
Output:
```
Strings match exactly.
Found substring at index: 7
```
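Compare strings with strcmp(), never ==, and always check its return value against 0 rather than treating it as a boolean. strcmp returns 0 for a match, so 'if (strcmp(a, b))' is true when the strings differ.
Reading Strings from the User Safely with fgets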
This is where beginners cause the most damage. The classic first instinct is to use scanf("%s", buffer) to read a string from the keyboard. It works — until your user types more characters than your buffer holds, and now you've written past the end of your array into memory you don't own. That's a buffer overflow, and it's one of the most exploited classes of security vulnerabilities in the history of software.
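fgets is the safe alternative. It takes three arguments: the buffer to write into, the size of that buffer in bytes, and the stream to read from (stdin for keyboard input). It reads at most one character fewer than that size and always appends the null terminator, so it never writes past the end of your buffer. If the line fits, the newline from the Enter key is stored in the buffer too, which is why you usually strip it afterwards.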
```c
#include <stdio.h>
#include <string.h>

int main(void) {
    char input_buffer[32];

    printf("Enter code tag: ");

    // fgets is safe: it stores at most sizeof(input_buffer) - 1 characters
    // and then adds the null terminator
    if (fgets(input_buffer, sizeof(input_buffer), stdin)) {
        // Strip the trailing newline left by the Enter key, if present
        input_buffer[strcspn(input_buffer, "\n")] = 0;
        printf("Processing: [%s]\n", input_buffer);
    }
    return 0;
}
```
Output (with feature-request typed at the prompt):
```
Enter code tag: feature-request
Processing: [feature-request]
```
| Aspect | char array (char name[]) | char pointer (const char *) |
|---|---|---|
| Memory location | Stack (local) or data segment | Read-only data segment |
| Can you modify the content? | Yes — it's your buffer | No — undefined behaviour if you try |
| Size known at compile time? | Yes — sizeof() works correctly | No — sizeof() gives pointer size, not string length |
| Good for user input? | Yes — use with fgets() | No — never point this at mutable input |
| Good for fixed messages? | Works, but wastes a copy | Yes — ideal, mark const |
| Null terminator required? | Yes, always | Yes, always — it's the law of C strings |
| Comparison method | strcmp() only | strcmp() only |
| Common beginner trap | Forgetting to allocate +1 for the null terminator | Modifying the literal through a pointer not marked const |
🎯 Key Takeaways
- A C string is just a char array in contiguous memory with a '\0' byte at the end: there's no magic, just a convention every standard function depends on.
- strlen() and sizeof() measure different things: strlen counts characters before the null terminator; sizeof counts the total bytes of the array variable including the null terminator.
- Never use gets() or unconstrained scanf("%s") for user input; use fgets(buffer, sizeof(buffer), stdin) to prevent buffer overflows.
- Always use strcmp() to compare strings, never ==. In an expression a string is handled as a pointer to its first character, and == compares addresses, not the characters they point to.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- Q: How does the null terminator affect the time complexity of the strlen() function? Explain the difference between O(1) and O(n) in this context.
- Q: Explain why 'char *p = "Hello"; p[0] = 'h';' leads to a segmentation fault on most modern operating systems.
- Q: Given a character array char buf[10], what happens if you attempt to store the string "IDENTIFICATION" using strcpy? Describe the impact on the stack frame.
- Q: How would you implement a basic version of strlen without using any library functions? Write the code using a while loop and pointer arithmetic.
- Q: What is the 'off-by-one' error specifically related to C strings and the null terminator?
Frequently Asked Questions
What is a null terminator in C strings and why is it needed?
The null terminator is a byte with the value zero ('\0') placed at the end of every C string. Because C has no built-in string type and strings are just arrays of characters in raw memory, the null terminator is the only signal that tells functions like printf, strlen, and strcpy where the string ends. Without it, those functions keep reading memory past your string until they accidentally find a zero byte somewhere, causing unpredictable bugs.
What is the difference between a string literal and a char array in C?
A string literal like "Hello" is stored in a read-only section of your program's memory and should never be modified. A char array like 'char greeting[] = "Hello";' copies those characters into a mutable buffer on the stack that you can freely change. The literal is the source of truth; the array is your working copy.
Why does sizeof() give the wrong length for a string pointer in C?
When you have 'const char *msg = "Hello";', msg is a pointer variable — typically 8 bytes on a 64-bit system. sizeof(msg) gives you the size of the pointer itself, not the size of the string it points to. To get the character count of the string, use strlen(msg). This is one of the most common beginner confusions in C.
How do I clear a C string buffer efficiently?
The most common way is using memset(buffer, 0, sizeof(buffer)); which fills the entire array with null characters. Alternatively, simply setting buffer[0] = '\0'; effectively makes it an 'empty' string from the perspective of standard C functions, though the old data remains in the subsequent memory slots.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.