PHP Strings — XSS from Unescaped Product Descriptions
Using echo $review->text instead of htmlspecialchars caused fake login pop-ups on a major retailer's site.
- A PHP string is a sequence of characters wrapped in quotes
- Single quotes treat everything literally; double quotes parse variables inside
- PHP offers 100+ built-in string functions: strlen, strpos, str_replace, explode, and more
- String functions return new values; they don't modify the original variable unless you reassign it
- Use double quotes for variable embedding; single quotes when the text should be taken literally (any speed difference is negligible)
- Production risk: forgetting strict comparison with strpos() leads to logic bugs
Imagine a string is just a piece of text written on a sticky note. PHP lets you stick notes together, cut them up, search inside them, count the letters, and swap words out — all without touching the original paper. Every time a website shows your name, a search result, or an error message, it's doing exactly that: manipulating strings. PHP has a huge built-in toolkit of functions that do the heavy lifting so you don't have to write the logic yourself.
Every single thing a user reads on a webpage is text — usernames, blog posts, error messages, product names, email addresses. That text has to come from somewhere, get processed, and then get shown in exactly the right format. PHP is one of the most widely-used languages for powering that backend text-processing work, and strings are at the absolute heart of it. If you can't confidently work with strings in PHP, you'll hit a wall fast.
What Is a PHP String and How Do You Create One?
A string is just a sequence of characters — letters, numbers, spaces, symbols — wrapped in quotes. Think of it like a sentence inside a box. PHP needs those quotes to know where the text starts and where it ends. Without them, PHP would try to read your text as a command and get very confused.
PHP gives you two ways to wrap that box: single quotes and double quotes. The difference matters more than beginners expect. Inside double quotes, PHP looks for variables and replaces them with their values — this is called variable interpolation. Inside single quotes, PHP takes everything literally. No substitutions, no special behaviour — what you type is exactly what you get.
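The difference is easiest to see side by side. This is a minimal sketch; the variable names are made up for illustration.

```php
<?php
$name = 'Ada';

// Double quotes: variables (and escapes such as \n) are interpreted.
$greeting = "Hello, $name!";

// Single quotes: the text is kept as typed, dollar sign and all.
$literal = 'Hello, $name!';

// Curly braces mark where the variable name ends inside double quotes.
$plural = "{$name}s";

echo $greeting, PHP_EOL; // Hello, Ada!
echo $literal, PHP_EOL;  // Hello, $name!
echo $plural, PHP_EOL;   // Adas
```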
There's also a third syntax called Heredoc, which is like a multi-line box for long chunks of text. It's less common day-to-day but incredibly useful when you're building HTML templates or long email bodies inside PHP.
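As a rough illustration, a Heredoc for a short email body might look like this. The customer name, order number, and the EOT label are arbitrary choices for the example.

```php
<?php
$customer = 'Ada';
$orderId = 1042;

// Heredoc behaves like a double-quoted string, so variables are interpolated.
// The closing label (EOT here) sits on its own line.
$email = <<<EOT
Hi $customer,

Your order #$orderId has shipped.
Thanks for shopping with us.
EOT;

echo $email, PHP_EOL;
```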
For now, start with double quotes for most things. Use single quotes when you want to be strict and explicit, or when your string contains a lot of dollar signs (like currency) and you don't want PHP trying to treat them as variable names.
The Most Useful PHP String Functions You'll Use Every Day
PHP ships with over 100 built-in string functions. That sounds overwhelming, but honestly about a dozen of them handle 90% of real-world work. These aren't arbitrary — each one exists because web developers kept writing the same helper code over and over, and the PHP team baked those solutions into the language.
Here's the mental model: every string function takes at least one string as input and gives you something back, whether that's a modified string, a number, or a true/false answer. You're never changing the original variable unless you explicitly overwrite it. PHP strings are not strictly immutable (you can assign to a single character with $text[0] = 'X'), but the built-in string functions behave as if they were: each one returns a new value and leaves its input untouched.
The functions below are grouped by what job they do: measuring strings, transforming their case, trimming whitespace, searching inside them, and replacing content. Learn them in that order and they'll stick.
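A quick tour of one function from each group, as a sketch with an invented sample string. Note that $raw is never modified.

```php
<?php
$raw = "  Hello, World!  ";

// Trimming: strip whitespace from both ends.
$trimmed = trim($raw);                            // "Hello, World!"

// Measuring: length in bytes.
$length = strlen($trimmed);                       // 13

// Transforming case.
$upper = strtoupper($trimmed);                    // "HELLO, WORLD!"

// Searching: returns the zero-based position, or false when not found.
$pos = strpos($trimmed, 'World');                 // 7
$startsAt = strpos($trimmed, 'Hello');            // 0, which is falsy in an if()
$found = $startsAt !== false;                     // true: compare strictly

// Replacing and splitting.
$swapped = str_replace('World', 'PHP', $trimmed); // "Hello, PHP!"
$words = explode(', ', $trimmed);                 // ["Hello", "World!"]
```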
Before PHP 8, checking whether a string contains another meant writing strpos() !== false, an ugly pattern that trips up beginners (see Common mistakes to avoid below). If you're on PHP 8+, use str_contains(), str_starts_with(), and str_ends_with() instead. They return plain true/false and read like plain English.
A common mistake is reaching for strpos() instead of str_contains() when you just need a boolean.
Any speed difference between strpos() and str_contains() is negligible, so let readability decide.
To check for a substring, use str_contains() (PHP 8+) or strpos() !== false.
Formatting Strings for Output: sprintf and number_format
There's a big difference between storing a number and displaying it nicely. The number 49999.5 in your database needs to look like $49,999.50 on a receipt. PHP's sprintf() function acts like a template engine for strings — you write a pattern with placeholders, then tell PHP what values to slot in.
Think of sprintf() like a Mad Libs game: you write a sentence with blank spaces, and then supply the words separately. The format codes starting with % are the blanks. %s means 'put a string here', %d means 'put a whole number here', %.2f means 'put a decimal number here with exactly 2 decimal places'.
This approach is cleaner than manually concatenating with dots, especially when building things like price labels or log messages. It also separates your template from your data, which makes the code much easier to read and maintain. Don't use it to assemble SQL queries from user input, though: sprintf() does no escaping, so use prepared statements there.
number_format() is a simpler companion function focused purely on formatting numbers — adding thousands separators and controlling decimal places. Every e-commerce site uses it.
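A short sketch of both functions, using the receipt figure from above. The other values are invented for the example.

```php
<?php
$price = 49999.5;

// number_format(): thousands separators plus a fixed number of decimals.
$pretty = number_format($price, 2);                     // "49,999.50"

// sprintf(): %s = string, %d = whole number, %.2f = two decimal places.
$label = sprintf('Total: $%s', $pretty);                // "Total: $49,999.50"
$line = sprintf('%s ordered %d items for %.2f', 'Ada', 3, 59.5);

// Optional arguments set the decimal point and thousands separator.
$european = number_format(1234.5678, 2, ',', '.');      // "1.234,57"

echo $label, PHP_EOL;
echo $line, PHP_EOL; // Ada ordered 3 items for 59.50
```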
Searching and Replacing Text — The Real-World Power Move
The ability to search inside strings and replace content is where PHP strings stop being a theory exercise and start solving real problems. Imagine you're building a blog platform: users write posts with a custom shortcode like [author-name], and your system replaces it with the actual author's name before display. That's str_replace() in action.
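The shortcode idea could be sketched like this. The [author-email] tag and the sample values are assumptions added for illustration.

```php
<?php
$post = 'Written by [author-name]. Contact [author-name] at [author-email].';

// Arrays replace several placeholders in one call.
// The optional fourth argument receives the number of replacements made.
$rendered = str_replace(
    ['[author-name]', '[author-email]'],
    ['Ada Lovelace', 'ada@example.com'],
    $post,
    $count
);

echo $rendered, PHP_EOL; // Written by Ada Lovelace. Contact Ada Lovelace at ada@example.com.
echo $count, PHP_EOL;    // 3
```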
For simple replacements, str_replace() is perfect. But sometimes you need more power — maybe you want to find all phone numbers in a block of text, or validate that an email address actually looks like an email. That's when you reach for regular expressions via preg_match() and preg_replace(). Regex is its own deep topic, but it's built on the same string foundation you're learning now.
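Here is a minimal regex sketch. The pattern is a deliberately simplified stand-in (three digits, a dash, four digits), not a real phone number validator.

```php
<?php
$text = 'Call 555-0100 or 555-0199 before Friday.';

// Simplified illustrative pattern: 3 digits, a dash, 4 digits.
$pattern = '/\b\d{3}-\d{4}\b/';

// preg_match() returns 1 if the pattern matches, 0 if it does not.
$hasNumber = preg_match($pattern, $text) === 1;

// preg_match_all() collects every match into $matches[0].
$total = preg_match_all($pattern, $text, $matches);

// preg_replace() swaps each match for the replacement text.
$masked = preg_replace($pattern, '[redacted]', $text);

echo $masked, PHP_EOL; // Call [redacted] or [redacted] before Friday.
```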
One practical pattern every PHP developer uses: sanitising user input before storing or displaying it. The htmlspecialchars() function converts dangerous characters like < and > into safe HTML entities, stopping cross-site scripting (XSS) attacks. It's not glamorous, but skipping it is a serious security mistake. Always, always sanitise before you echo user-supplied data.
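To make that concrete, here is a sketch of what escaping does to a hostile comment. The payload is an invented example of the kind of markup an attacker might submit.

```php
<?php
// Pretend this came from a form field.
$comment = '<img src=x onerror="alert(\'hi\')">';

// ENT_QUOTES escapes both double and single quotes, not just < > and &.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// The browser now displays the markup as text instead of executing it.
echo $safe, PHP_EOL;
// &lt;img src=x onerror=&quot;alert(&#039;hi&#039;)&quot;&gt;
```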
If a user submits a script and you echo it without htmlspecialchars(), their script runs in every visitor's browser. This is called XSS (Cross-Site Scripting) and it's one of the most common web vulnerabilities. One line, htmlspecialchars($value, ENT_QUOTES, 'UTF-8'), fixes it for HTML output.
Missing htmlspecialchars() is a common root cause of XSS vulnerabilities in PHP applications.
str_replace() for simple string replacement is fast, but for patterns use preg_replace() with a carefully written regex to avoid catastrophic backtracking.
- Always call htmlspecialchars() before echo: non-negotiable.
- str_replace() for simple replacements; preg_match()/preg_replace() for patterns.
- For case-insensitive replacement, use str_ireplace() or a regex with the /i flag.
String Comparison, Sorting, and Multibyte Safety
Comparing strings in PHP isn't always straightforward. The == operator performs type coercion, which can lead to unexpected results. '123' == 123 is true. '123abc' == 123 is also true on PHP 7, because the string is converted to a number, but it is false on PHP 8+, where a non-numeric string is compared to a number as text. For strict comparison, use ===, which checks both value and type. For string-specific comparison, use strcmp(), which returns 0 if the strings are equal, a negative number if the first is less, and a positive number if the first is greater. It is binary safe.
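The following sketch sticks to comparisons that behave the same on PHP 7 and PHP 8. The sample values are chosen for illustration.

```php
<?php
// Loose comparison coerces numeric strings to numbers.
$looseMixed = ('123' == 123);          // true
$looseStrings = ('1e3' == '1000');     // true: both are numeric strings

// Strict comparison checks type and value, with no coercion.
$strictMixed = ('123' === 123);        // false
$strictStrings = ('1e3' === '1000');   // false

// strcmp() compares byte by byte: 0 means equal, the sign gives the order.
$same = strcmp('apple', 'apple');      // 0
$before = strcmp('apple', 'banana') < 0; // true

// strcasecmp() does the same while ignoring ASCII case.
$sameIgnoringCase = strcasecmp('PHP', 'php') === 0; // true
```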
When dealing with accented characters (é, ñ, ü) or non-Latin scripts (Cyrillic, Chinese, Arabic), byte-oriented functions like strlen() and strtolower() give wrong results on UTF-8 text. strlen() counts bytes, not characters, and strtolower() does not reliably convert non-ASCII letters. A single emoji like 😀 is 4 bytes. Use the mb_ (multibyte) family of functions: mb_strlen(), mb_strtolower(), mb_substr(), etc. They respect character encoding and count actual characters.
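The byte versus character gap can be sketched as follows. This assumes the mbstring extension is loaded and PHP 7 or later (for the \u{...} escapes, which keep the example independent of the file's own encoding).

```php
<?php
$word = "caf\u{00E9}";   // "café" encoded as UTF-8
$emoji = "\u{1F600}";    // the grinning face emoji

// strlen() counts bytes; mb_strlen() counts characters.
$wordBytes = strlen($word);                 // 5
$wordChars = mb_strlen($word, 'UTF-8');     // 4
$emojiBytes = strlen($emoji);               // 4
$emojiChars = mb_strlen($emoji, 'UTF-8');   // 1

// mb_strtoupper() handles the accented letter as well.
$upper = mb_strtoupper($word, 'UTF-8');     // "CAFÉ"

// mb_substr() cuts on character boundaries, never mid-character.
$firstThree = mb_substr($word, 0, 3, 'UTF-8'); // "caf"
```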
Sorting user-generated lists? Use strnatcmp() for natural order (e.g., 'img2.jpg' before 'img10.jpg') and setlocale() with strcoll() for locale-aware sorting.
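A small sketch of plain versus natural ordering, using invented file names:

```php
<?php
$files = ['img10.jpg', 'img2.jpg', 'img1.jpg'];

// Plain string sort compares character by character, so "10" lands before "2".
$plain = $files;
sort($plain, SORT_STRING);
// ['img1.jpg', 'img10.jpg', 'img2.jpg']

// strnatcmp() compares runs of digits as whole numbers.
$natural = $files;
usort($natural, 'strnatcmp');
// ['img1.jpg', 'img2.jpg', 'img10.jpg']

print_r($natural);
```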
Use mb_strlen(), mb_strtolower(), mb_substr(), and the other mb_ versions for multibyte text. The standard functions will silently produce wrong results for multi-byte characters.
Using strlen() on UTF-8 text in a multilingual app leads to incorrect character counts, breaking validation, truncation, and search.
Checking password length with strlen() misbehaves when the password contains emojis: a 4-character password can be 16 bytes, causing false rejections.
- strlen() counts bytes, not characters.
- natsort() prevents 'img10.jpg' appearing before 'img2.jpg'.
XSS via Unsanitised Product Descriptions Took Down a Major Retailer
The template used echo $review->text; instead of echo htmlspecialchars($review->text, ENT_QUOTES, 'UTF-8');. PHP's default behavior is to output raw text, and the CMS validation only blocked explicit <script> tags but allowed other HTML like <img onerror>.
The fix applied htmlspecialchars() to all user-generated text in views, added a global output filter using PHP's output buffering with callback, and set Content-Security-Policy HTTP headers.
- Never trust that input validation alone secures output.
- Always apply a context-appropriate encoding function (HTML, JS, URL) before rendering.
- Automate XSS scanning in CI/CD with tools like OWASP ZAP or SonarQube rules.
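"Context-appropriate" means the encoder depends on where the value ends up. The sketch below shows one common choice per context using standard PHP functions; the input string is invented, and real templates may need more than this.

```php
<?php
$userInput = 'Tom & "Jerry" <b>';

// HTML body or attribute value: escape markup characters.
$forHtml = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');
// Tom &amp; &quot;Jerry&quot; &lt;b&gt;

// Inline JavaScript: emit a JSON string literal, hex-escaping HTML-sensitive characters.
$forJs = json_encode($userInput, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// URL path segment or query value: percent-encode.
$forUrl = rawurlencode($userInput);
// Tom%20%26%20%22Jerry%22%20%3Cb%3E

echo $forHtml, PHP_EOL, $forJs, PHP_EOL, $forUrl, PHP_EOL;
```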
- Symptom: echo $someString outputs nothing (blank) but you expected text. Fix: run var_dump($someString); and set error_reporting(E_ALL); in development. Also check if output buffering (ob_start) is interfering.
- Symptom: if (strpos($haystack, $needle)) evaluates to false when the match is at position 0, but you expected true. Fix: replace if (strpos(...)) with if (strpos(...) !== false). Upgrade to PHP 8+ and use str_contains() for clarity.
- Symptom: user-supplied markup appears unescaped in the page. Fix: check that htmlspecialchars() is applied before echo. Look for a missing ENT_QUOTES flag or wrong encoding parameter.
- Symptom: strlen() returns a different count than expected for multi-byte characters (e.g., é, 日本). Fix: use mb_strlen() instead of strlen(). Ensure the mbstring extension is enabled and default_charset is set to UTF-8.
- Symptom: strtotime() returns false for date strings. Fix: use DateTime::createFromFormat() for custom formats. Enable error reporting to see warnings.
- General hardening: use htmlspecialchars($var, ENT_QUOTES, 'UTF-8') and add a CSP header via PHP header() or server config.
Key takeaways
- Use str_contains() on PHP 8+.
- Use mb_strlen() instead of strlen()
Common mistakes to avoid
Using == to check if strpos() returned false
Use strict comparison for strpos() checks: if (strpos($haystack, $needle) === false). Better yet, use str_contains() on PHP 8+.
Forgetting that string indexes start at 0
Echoing unsanitised user input
Interview Questions on This Topic
What is the difference between single-quoted and double-quoted strings in PHP, and when would you deliberately choose one over the other?
Frequently Asked Questions