Strings in C Explained — How They Work, Why They're Tricky, and How to Use Them Safely
Every program that talks to a human needs text. Whether it's a login prompt, an error message, a username, or a file path — text is everywhere. In languages like Python or JavaScript, strings are cosy, fully managed objects that do a lot of heavy lifting for you. C, on the other hand, hands you the raw tools and trusts you to build the house yourself. That might sound scary, but understanding how C handles text under the hood makes you a dramatically better programmer in any language.
The core problem C strings solve is deceptively simple: how do you store a sequence of characters in memory and then find where that sequence ends? Memory is just a giant numbered grid of bytes. There's no built-in concept of 'a word' or 'a sentence'. C's answer is a convention called the null-terminated string — store your characters in consecutive memory slots and place a special zero-value byte at the end as a sentinel. Every standard library function that works with strings relies on this single rule.
By the end of this article you'll know exactly how C strings are stored in memory, how to declare and initialise them correctly, how to manipulate them using the standard library, and — most importantly — how to avoid the buffer overflows and undefined behaviour that trip up even experienced developers. You'll be reading real code, seeing real output, and walking away with a mental model that actually sticks.
What a C String Actually Is in Memory
A C string is not a special type. It is a sequence of 'char' values stored in contiguous memory and ending with '\0' (the null terminator, a byte whose value is 0), and you usually handle it through a pointer to its first character. That's it. There's no hidden length field and no magic object, just raw bytes in a row.
Think of RAM as a long street of numbered houses. Each house holds one character. When C stores the word 'Hello', it rents five houses in a row — one for 'H', one for 'e', one for 'l', one for 'l', one for 'o' — and then immediately rents one more house where it places a STOP sign (the '\0'). So 'Hello' actually occupies 6 bytes, not 5.
This is why the length of a string and the memory it needs are different numbers. strlen() counts the characters before the stop sign. sizeof(), applied to the array itself, tells you the total space including the stop sign. Confusing these two is one of the most common beginner mistakes, so burn that distinction into your memory right now.
Whenever a standard library function like printf or strcpy reads a C string, it starts at the first character and keeps going until it hits that '\0'. That's the contract every piece of C string code relies on. Break that contract — forget the null terminator — and your program wanders into memory it doesn't own.
    #include <stdio.h>
    #include <string.h>   // needed for strlen()

    int main(void) {
        // Declare a string literal: the compiler automatically adds '\0' at the end
        char greeting[] = "Hello";

        // strlen() counts characters UP TO (but not including) the null terminator
        size_t char_count = strlen(greeting);

        // sizeof() counts the TOTAL bytes the array occupies, INCLUDING '\0'
        size_t byte_count = sizeof(greeting);

        printf("String : %s\n", greeting);
        printf("strlen : %zu (characters only, no stop sign)\n", char_count);
        printf("sizeof : %zu (bytes in memory, includes stop sign)\n", byte_count);

        // Let's peek at each byte to prove '\0' is really there
        printf("\nMemory layout (character : ASCII value):\n");
        for (size_t i = 0; i < byte_count; i++) {
            // Cast to unsigned char to print the raw numeric value of each byte
            printf(" greeting[%zu] = '%c' (ASCII %d)\n",
                   i,
                   greeting[i] == '\0' ? '?' : greeting[i],  // show ? for invisible null
                   (unsigned char)greeting[i]);
        }

        return 0;
    }
String : Hello
strlen : 5 (characters only, no stop sign)
sizeof : 6 (bytes in memory, includes stop sign)
Memory layout (character : ASCII value):
greeting[0] = 'H' (ASCII 72)
greeting[1] = 'e' (ASCII 101)
greeting[2] = 'l' (ASCII 108)
greeting[3] = 'l' (ASCII 108)
greeting[4] = 'o' (ASCII 111)
greeting[5] = '?' (ASCII 0)
Three Ways to Declare a String — and Which One to Use When
C gives you three different ways to create a string, and each one behaves differently in memory. Picking the wrong one at the wrong time is a classic source of bugs.
The first way is a character array initialised with a string literal: 'char name[] = "Alice";'. The compiler figures out the right size, copies the characters including the null terminator into stack memory, and gives you a mutable buffer you can change. This is the go-to choice when you need to modify the string later.
The second way is to give the array an explicit size: 'char name[50] = "Alice";'. Now you've got 50 bytes reserved, with 'Alice\0' at the start and the rest zeroed out. This is what you want when you're planning to read user input into the buffer — you're pre-allocating the space.
The third way is a pointer to a string literal: 'const char *message = "Hello";'. This does NOT copy the string into a regular variable. Instead, the string 'Hello\0' lives in a read-only section of your program's memory, and 'message' is just a pointer to it. Trying to modify this string causes undefined behaviour — the program might crash, might silently corrupt data, or might appear to work fine on your machine and explode on someone else's. Always mark these 'const'.
The rule of thumb: if you need to modify the string, use an array. If it's a fixed message you'll never change, use a const pointer to a literal.
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        // --- Method 1: char array, compiler determines size ---
        // Safe to modify. Stored on the stack. Size = 6 (5 chars + null)
        char username[] = "Alice";
        username[0] = 'a';  // perfectly fine: this memory is ours to change
        printf("Method 1 (auto-sized array) : %s\n", username);

        // --- Method 2: char array with explicit size ---
        // We reserved 50 bytes. Only 7 are used initially, the rest are zero.
        // Perfect for reading user input with fgets() later.
        char city[50] = "London";
        // We can safely append or overwrite because we have room
        strcat(city, ", UK");  // adds ", UK" after "London"
        printf("Method 2 (fixed-size array) : %s\n", city);

        // --- Method 3: pointer to string literal ---
        // The string "Error: file not found" lives in read-only memory.
        // We can READ it freely, but must NEVER write to it.
        const char *error_message = "Error: file not found";
        printf("Method 3 (const pointer) : %s\n", error_message);

        // Because of const, the compiler rejects the next line. Without const
        // it would compile and be undefined behaviour (likely a crash):
        // error_message[0] = 'e';  // DO NOT DO THIS

        // Show sizes to highlight the difference
        printf("\nsizeof(username) = %zu (array — includes null)\n", sizeof(username));
        printf("sizeof(city) = %zu (full reserved buffer)\n", sizeof(city));
        printf("sizeof(error_message) = %zu (pointer size, NOT string length!)\n",
               sizeof(error_message));

        return 0;
    }
Method 1 (auto-sized array) : alice
Method 2 (fixed-size array) : London, UK
Method 3 (const pointer) : Error: file not found
sizeof(username) = 6 (array — includes null)
sizeof(city) = 50 (full reserved buffer)
sizeof(error_message) = 8 (pointer size, NOT string length!)
The Essential String Functions You'll Use Every Day
C's standard library ships with a set of string functions in <string.h>; the formatted-output functions sprintf and snprintf live in <stdio.h>.
strlen(s) walks the string from the start until it hits '\0' and returns how many steps it took. O(n) — it actually loops through every character each time you call it, so don't call it inside a loop's condition if you can avoid it.
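strcpy(destination, source) copies every character from source into destination, including the final '\0'. The danger: it blindly trusts that destination is big enough. If it isn't, you've just written past the end of your buffer, a classic buffer overflow. strncpy caps the number of bytes written, but it leaves the destination without a '\0' when the source is at least as long as the limit, so you must terminate it yourself. snprintf, or the non-standard strlcpy where available, always terminates a non-empty buffer.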
strcat(destination, source) finds the '\0' at the end of destination and starts copying source from there. Same overflow danger as strcpy. Always check you have enough room.
strcmp(a, b) returns 0 if the strings are identical, a negative number if a sorts before b, and a positive number if a sorts after b. The ordering is by character code, not dictionary order, so in ASCII every uppercase letter sorts before every lowercase one. Do NOT use == to compare strings in C: it compares pointer addresses, not content.
sprintf and snprintf let you build strings with formatted data, much like printf but writing into a buffer instead of the screen. snprintf is the safe version because it accepts a maximum byte count.
    #include <stdio.h>
    #include <string.h>   // strlen, strcpy, strcat, strcmp, strstr

    int main(void) {
        // --- strlen: measure a string ---
        const char *language = "C Programming";
        printf("strlen(\"%s\") = %zu\n", language, strlen(language));

        // --- strcpy: copy a string into a buffer ---
        // We allocate enough space for the source + null terminator
        char buffer[50];
        strcpy(buffer, "Hello");  // copies 'H','e','l','l','o','\0' into buffer
        printf("After strcpy, buffer = \"%s\"\n", buffer);

        // --- strcat: append one string onto another ---
        // buffer currently holds "Hello", so we'll add ", World!"
        strcat(buffer, ", World!");
        printf("After strcat, buffer = \"%s\"\n", buffer);

        // --- strcmp: compare two strings (not their addresses!) ---
        const char *password_entered = "secret123";
        const char *password_stored = "secret123";
        const char *wrong_password = "wrongpass";

        // strcmp returns 0 when strings are identical
        if (strcmp(password_entered, password_stored) == 0) {
            printf("Passwords match — access granted\n");
        }
        if (strcmp(wrong_password, password_stored) != 0) {
            printf("Passwords differ — access denied\n");
        }

        // --- strstr: find a substring inside a string ---
        const char *sentence = "The quick brown fox jumps over the lazy dog";
        const char *target = "fox";
        // returns pointer to first match, or NULL; kept const because sentence is const
        const char *found_at = strstr(sentence, target);
        if (found_at != NULL) {
            // Pointer arithmetic: subtract start address to get the index
            printf("Found \"%s\" at index %td\n", target, found_at - sentence);
        }

        // --- snprintf: build a string safely with formatted data ---
        char welcome_message[100];
        const char *username = "Jordan";
        int login_count = 42;

        // snprintf will NEVER write more than 100 bytes (including null terminator)
        snprintf(welcome_message, sizeof(welcome_message),
                 "Welcome back, %s! You have logged in %d times.",
                 username, login_count);
        printf("%s\n", welcome_message);

        return 0;
    }
strlen("C Programming") = 13
After strcpy, buffer = "Hello"
After strcat, buffer = "Hello, World!"
Passwords match — access granted
Passwords differ — access denied
Found "fox" at index 16
Welcome back, Jordan! You have logged in 42 times.
Reading Strings from the User Safely with fgets
This is where beginners cause the most damage. The classic first instinct is to use scanf("%s", buffer) to read a string from the keyboard. It works — until your user types more characters than your buffer holds, and now you've written past the end of your array into memory you don't own. That's a buffer overflow, and it's one of the most exploited classes of security vulnerabilities in the history of software.
fgets is the safe alternative. It takes three arguments: the buffer to write into, the maximum number of bytes to read (including the null terminator), and the stream to read from (stdin for keyboard input). It will never write more than that maximum, so your buffer stays intact.
There are two small quirks to be aware of. First, fgets includes the newline character '\n' in the string if there's room: the user pressed Enter, and fgets captures that too. You'll often want to strip it. Second, fgets returns NULL if it hits an error or end-of-file, so always check the return value before using the buffer.
gets() is the dangerous old alternative — it has literally no size limit and was officially removed from the C standard in C11. Never use it. If you see it in old code, replace it with fgets immediately.
    #include <stdio.h>
    #include <string.h>   // strcspn, strlen

    // A small helper that strips the trailing newline fgets might leave behind
    void strip_newline(char *str) {
        // strcspn returns the index of the first '\n' in str
        // We replace that character with '\0' to end the string there
        str[strcspn(str, "\n")] = '\0';
    }

    int main(void) {
        // Reserve a buffer large enough for a reasonable name
        // 64 bytes means we'll accept up to 63 characters + null terminator
        char full_name[64];

        printf("Enter your full name: ");

        // fgets: safe because we tell it the maximum bytes to read
        // It will NEVER overflow the buffer, no matter what the user types
        if (fgets(full_name, sizeof(full_name), stdin) == NULL) {
            // fgets returns NULL on error or end-of-file, so always handle this
            printf("Failed to read input.\n");
            return 1;
        }

        // Remove the trailing '\n' that fgets includes when the user presses Enter
        strip_newline(full_name);

        // Now the string is clean and safe to use
        printf("Hello, %s! Your name is %zu characters long.\n",
               full_name, strlen(full_name));

        return 0;
    }
Enter your full name: Ada Lovelace
Hello, Ada Lovelace! Your name is 12 characters long.
| Aspect | char array (char name[]) | char pointer (const char *) |
|---|---|---|
| Memory location | Stack (local) or data segment | Read-only data segment |
| Can you modify the content? | Yes — it's your buffer | No — undefined behaviour if you try |
| Size known at compile time? | Yes — sizeof() works correctly | No — sizeof() gives pointer size, not string length |
| Good for user input? | Yes; use with fgets() | No; a literal cannot be written to |
| Good for fixed messages? | Works, but wastes a copy | Yes — ideal, mark const |
| Null terminator required? | Yes, always | Yes, always — it's the law of C strings |
| Comparison method | strcmp() only | strcmp() only |
| Common beginner trap | Forgetting to allocate +1 for null | Leaving off const, then writing through the pointer |
🎯 Key Takeaways
- A C string is just a char array in contiguous memory with a '\0' byte at the end — there's no magic, just a convention every standard function depends on.
- strlen() and sizeof() measure different things: strlen counts characters before the null terminator; sizeof counts the total bytes of the array variable including the null terminator.
- Never use gets() or unconstrained scanf("%s") for user input — use fgets(buffer, sizeof(buffer), stdin) to prevent buffer overflows.
- Always use strcmp() to compare strings, never ==. In an expression a string is handled as a pointer to its first character, so == compares addresses, not the characters they point to.
⚠ Common Mistakes to Avoid
- Mistake 1: Forgetting to allocate space for the null terminator. If you write 'char word[5] = "Hello";' you've asked for exactly 5 bytes but 'Hello' needs 6 (5 chars + '\0'). C accepts this and silently drops the null terminator (C++ rejects it), so every string function you then call on the array reads past its end. Fix: always declare char arrays with length = expected characters + 1, or let the compiler count for you with 'char word[] = "Hello";'.
- Mistake 2: Using == to compare strings. Writing 'if (input == "yes")' compares two memory addresses, not the characters they point to. It will almost always be false even when the strings look identical, causing logic bugs that are maddening to diagnose. Fix: always use 'strcmp(input, "yes") == 0'; the return value of 0 means the strings are identical.
- Mistake 3: Using strcpy or strcat without checking buffer capacity. If the source string is longer than the destination buffer, these functions will happily write past the end of your array, overwriting adjacent variables or return addresses. This is a buffer overflow. Fix: use strncpy(dest, src, sizeof(dest) - 1) followed by dest[sizeof(dest) - 1] = '\0'; to guarantee null termination (this works only when dest is an actual array, not a pointer), or use the safer snprintf() for building strings.
Interview Questions on This Topic
- What is the null terminator in C, why does it exist, and what happens if a string is missing it?
- Why can't you use the == operator to compare two strings in C, and what should you use instead?
- What is a buffer overflow in the context of C strings? Can you give an example of code that causes one, and how would you fix it?
Frequently Asked Questions
What is a null terminator in C strings and why is it needed?
The null terminator is a byte with the value zero ('\0') placed at the end of every C string. Because C has no built-in string type and strings are just arrays of characters in raw memory, the null terminator is the only signal that tells functions like printf, strlen, and strcpy where the string ends. Without it, those functions keep reading memory past your string until they accidentally find a zero byte somewhere, causing unpredictable bugs.
What is the difference between a string literal and a char array in C?
A string literal like "Hello" is stored in a read-only section of your program's memory and should never be modified. A char array like 'char greeting[] = "Hello";' copies those characters into a mutable buffer on the stack that you can freely change. The literal is the source of truth; the array is your working copy.
Why does sizeof() give the wrong length for a string pointer in C?
When you have 'const char *msg = "Hello";', msg is a pointer variable — typically 8 bytes on a 64-bit system. sizeof(msg) gives you the size of the pointer itself, not the size of the string it points to. To get the character count of the string, use strlen(msg). This is one of the most common beginner confusions in C.