Beginner 10 min · March 06, 2026

SQL String Functions: Trailing Spaces Locked 30% of Users

Q: What is the difference between CHAR_LENGTH and LENGTH in SQL?

In MySQL, LENGTH() counts bytes not characters — so a single emoji or accented character (which uses multiple bytes in UTF-8) can count as 2, 3 or 4. CHAR_LENGTH() counts actual characters regardless of byte size. For modern text data with international characters always prefer CHAR_LENGTH() in MySQL. PostgreSQL's LENGTH() counts characters directly so this distinction doesn't apply there.

Q: Can I use SQL string functions inside a WHERE clause?

Yes, absolutely. WHERE LOWER(email) = 'alice@example.com' is perfectly valid. The function runs on each row as the database filters. The trade-off is performance: wrapping a column in a function usually prevents the database from using an index on that column. For a large table, consider storing a pre-normalised version of the column or using a function-based index if your database supports it.

Q: Why does CONCAT return NULL when one of my columns is NULL?

In standard SQL and most databases, any string operation involving NULL returns NULL — because NULL means 'unknown', and 'Alice' joined to 'unknown' is still 'unknown'. Fix this with COALESCE: CONCAT(first_name, ' ', COALESCE(last_name, '')) replaces a NULL last_name with an empty string so the full name still returns correctly instead of vanishing.

Trailing spaces in SQL emails locked 30% of users.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Drawn from code that ran under real load.

✓ Production

production tested

July 19, 2026

last updated

2,466

articles · all by Naren

Before you start⏱ 20 min

✓Basic programming fundamentals
✓A computer with internet access
✓Willingness to follow along with examples

● Production Incident 🔎 Debug Guide

⚡Quick Answer

String functions operate on VARCHAR/TEXT columns — UPPER, LOWER, LENGTH, TRIM, SUBSTRING, CONCAT, REPLACE
TRIM removes leading and trailing whitespace — invisible spaces cause silent JOIN and comparison failures
LIKE with a leading wildcard ('%term') disables index usage — forces a full table scan
LENGTH counts bytes in MySQL for multi-byte characters; use CHAR_LENGTH for character count
String concatenation: CONCAT in MySQL, || in PostgreSQL and standard SQL
Biggest mistake: comparing strings without normalizing case — WHERE city = 'london' misses 'London' silently

✦ Definition~90s read

What is SQL String Functions?

SQL string functions are built-in operations that manipulate character data directly inside database queries, letting you clean, transform, and search text without pulling it into application code. They exist because raw database text is rarely clean—trailing spaces, inconsistent casing, embedded delimiters, and partial matches are the norm in production systems.

★

Imagine you work at a library and someone hands you a messy pile of book titles — some in ALL CAPS, some with random spaces at the start, some misspelled.

Without them, you'd be writing procedural loops or shipping gigabytes of raw strings to your app just to trim whitespace or extract a substring. These functions are the difference between a query that returns 'john@example.com ' (with a trailing space that breaks your login) and one that returns 'john@example.com'.

In practice, you'll reach for a core set of seven functions in nearly every project: UPPER, LOWER, LENGTH for normalization and sizing; TRIM, LTRIM, RTRIM for stripping invisible characters that silently corrupt joins and comparisons; and SUBSTRING, CONCAT, REPLACE for slicing, joining, and swapping text. The article's title references a real-world consequence: a 30% user lockout caused by trailing spaces in email fields—a problem TRIM solves in one line.

For search operations, LIKE with wildcards works for simple patterns but kills performance on large tables; PATINDEX and CHARINDEX give you position-based searching without full table scans, critical when you're querying millions of rows.

Alternatives exist—full-text search engines like Elasticsearch or PostgreSQL's tsvector for complex linguistic queries, and application-layer regex for patterns SQL can't handle. But for the 90% of text manipulation tasks—normalizing user input, building display strings, cleaning imported data—SQL string functions are the right tool because they execute inside the database engine, avoiding network round-trips and keeping your data pipeline simple.

Don't use them for heavy parsing or multi-language collation; that's what dedicated search indexes are for.

Plain-English First

Imagine you work at a library and someone hands you a messy pile of book titles — some in ALL CAPS, some with random spaces at the start, some misspelled. SQL string functions are the tools you use to clean up, reshape, and search through that text, all without touching the original paper. They let your database do the tedious text-editing work so you never have to open a spreadsheet and do it by hand.

Every real-world database is full of text — customer names, email addresses, product descriptions, phone numbers. The problem is that text is messy. Users type their names in all caps, leave trailing spaces, or enter 'New York' when your system expects 'new york'. If you can't clean and manipulate that text inside the database itself, you end up writing hundreds of lines of application code to do a job SQL can handle in a single line.

SQL string functions solve exactly that problem. They're built-in tools the database gives you to inspect, clean, transform, and combine text columns on the fly — right inside your queries. Instead of pulling all the data into Python or JavaScript and then fixing it, you fix it where it lives.

By the end of this article you'll know the seven most important SQL string functions, exactly when to reach for each one, how to chain them together for real cleaning tasks, and the two traps that catch almost every beginner. You'll be able to look at a messy data problem and immediately know which function to use.

Why SQL String Functions Are Not Just Trivial Text Tools

SQL string functions are built-in operations that manipulate character data — concatenating, extracting, trimming, replacing, and transforming strings directly in queries. Unlike application-level string handling, these functions execute inside the database engine, operating on column values row by row or across sets. The core mechanic is that each function returns a new string value derived from the input, leaving the original data unchanged unless explicitly updated.

In practice, SQL string functions are deterministic (same input always yields same output) and can be used in SELECT, WHERE, JOIN, and ORDER BY clauses. However, their behavior varies across databases — LENGTH in Oracle returns bytes, while in PostgreSQL it returns characters. Trailing spaces are handled inconsistently: CHAR columns pad values with spaces, and comparisons in some databases ignore trailing spaces (SQL Server) while others treat them as significant (PostgreSQL). This subtlety causes silent data mismatches, failed joins, and incorrect aggregations.

Use SQL string functions when you need to clean, normalize, or transform data within the database rather than pulling raw data into an application layer. They are critical for ETL pipelines, data migration, and reporting — especially when dealing with user input, legacy systems, or cross-database compatibility. Ignoring their quirks, especially around whitespace and collation, leads to production bugs that are hard to trace because the data looks correct in a GUI but behaves differently in queries.

⚠ Trailing Spaces Are Not Your Friend

In SQL Server, 'abc' = 'abc ' evaluates to true due to ANSI padding rules, but in PostgreSQL it's false — a silent join killer across databases.

📊 Production Insight

A CRM migration from SQL Server to PostgreSQL caused 30% of user records to fail matching because source CHAR(10) fields had trailing spaces that PostgreSQL treated as distinct.

Symptom: JOINs on email or username returned fewer rows than expected, with no error message — only a mysterious drop in record counts.

Rule: Always TRIM() or RTRIM() string columns before comparison or join, especially when data originates from fixed-width or legacy systems.

🎯 Key Takeaway

Trailing space behavior is database-specific and can silently break joins and lookups.

String functions are deterministic but not portable — test across target databases.

Always normalize strings (TRIM, UPPER, COLLATE) at the boundary where data enters your system.

thecodeforge.io

Sql String Functions

The Seven String Functions You'll Use 90% of the Time

SQL has dozens of string functions, but most professional work comes down to seven. Think of them like tools in a toolbox — a hammer, a screwdriver, a tape measure. You don't need every tool on day one; you need to know the core ones cold.

Here's a quick map before we go deep:

— UPPER / LOWER: change the case of text, like Caps Lock on a keyboard. — LENGTH: count how many characters are in a string, like counting letters in a word. — TRIM / LTRIM / RTRIM: remove unwanted spaces from the edges of text, like cutting the crusts off bread. — SUBSTRING: pull out a chunk of text from the middle of a string, like cutting a slice from a loaf. — CONCAT: glue two or more pieces of text together, like taping two pieces of paper side by side. — REPLACE: swap one piece of text for another, like using find-and-replace in Word. — INSTR / CHARINDEX / POSITION: find where a smaller piece of text lives inside a bigger one.

Different databases (MySQL, PostgreSQL, SQL Server) spell a few of these slightly differently — we'll flag that as we go. The concepts are identical everywhere.

string_functions_overview.sqlSQL

-- ============================================================
-- Setup: create a small customers table with intentionally
-- messy data so we can demonstrate each function clearly.
-- ============================================================

CREATE TABLE customers (
    customer_id   INT,
    full_name     VARCHAR(100),
    email         VARCHAR(150),
    city          VARCHAR(50)
);

INSERT INTO customers VALUES
    (1, '  alice johnson  ', 'ALICE@EXAMPLE.COM', 'new york'),
    (2, 'BOB SMITH',         'bob@example.com',   'Los Angeles'),
    (3, 'carol white  ',     'Carol@Example.com', 'CHICAGO');

-- -------------------------------------------------------
-- Quick preview: what does the raw data look like?
-- -------------------------------------------------------
SELECT customer_id, full_name, email, city
FROM   customers;

Output

customer_id | full_name | email | city

------------|--------------------|---------------------|------------

1 | alice johnson | ALICE@EXAMPLE.COM | new york

2 | BOB SMITH | bob@example.com | Los Angeles

3 | carol white | Carol@Example.com | CHICAGO

🔥Database Dialect Note:

All examples in this article use standard SQL that runs in MySQL 8+, PostgreSQL 14+, and SQL Server 2019+. Where syntax differs between databases we'll call it out explicitly with a short note.

📊 Production Insight

String normalization (LOWER, TRIM) should happen at write time, not read time — applying LOWER() in WHERE prevents index usage.

Store emails in lowercase at INSERT time; store phone numbers in a canonical format — don't rely on query-time normalization.

For case-insensitive columns, PostgreSQL's CITEXT type handles case-insensitive comparisons automatically without function overhead.

🎯 Key Takeaway

Normalize strings at write time — don't apply LOWER or TRIM in WHERE clauses if you can avoid it.

Case-insensitive storage (lowercase at insert) allows exact-match queries to use indexes normally.

For text columns requiring case-insensitive search, PostgreSQL's CITEXT type is cleaner than LOWER() everywhere.

UPPER, LOWER and LENGTH — Your Case and Size Tools

Case inconsistency is one of the most common data quality problems in real databases. One record says 'London', another says 'LONDON', a third says 'london'. To your database those are three completely different values — so a GROUP BY city query would produce three separate buckets instead of one. That's a real bug.

UPPER(text) converts every letter in a string to uppercase. LOWER(text) converts every letter to lowercase. Neither modifies the original stored data — they return a new version of the text just for that query result. Think of it like wearing tinted glasses: the world looks different, but the world hasn't changed.

LENGTH(text) tells you how many characters are in a string, including spaces. This is useful for data validation — if a phone number column has a length other than 10 digits, something went wrong during data entry. Note: in SQL Server, the equivalent function is called LEN() not LENGTH() — a classic gotcha we'll revisit.

The real power comes when you combine these with a WHERE clause. Instead of hoping users typed city names consistently, you normalise both sides of the comparison so they always match regardless of how the data was entered.

upper_lower_length.sqlSQL

-- ============================================================
-- UPPER: standardise city names to uppercase for safe grouping
-- ============================================================
SELECT
    UPPER(city)          AS city_uppercase,   -- 'new york' → 'NEW YORK'
    COUNT(*)             AS customer_count
FROM   customers
GROUP BY UPPER(city);                         -- groups correctly despite mixed case

-- ============================================================
-- LOWER: normalise email addresses (emails are case-insensitive
-- but databases treat them as case-sensitive strings)
-- ============================================================
SELECT
    customer_id,
    LOWER(email)  AS normalised_email  -- 'ALICE@EXAMPLE.COM' → 'alice@example.com'
FROM   customers;

-- ============================================================
-- LENGTH: find any city names that seem suspiciously short
-- (useful for catching typos or blank entries)
-- ============================================================
SELECT
    customer_id,
    city,
    LENGTH(city)  AS city_length   -- counts every character including spaces
FROM   customers
WHERE  LENGTH(city) < 5;           -- flag anything under 5 characters as suspect

-- ============================================================
-- PRACTICAL COMBO: case-insensitive search
-- Find all customers in New York regardless of how it was typed
-- ============================================================
SELECT *
FROM   customers
WHERE  LOWER(city) = 'new york';   -- normalise BOTH sides to lowercase

Output

-- UPPER + GROUP BY result:

city_uppercase | customer_count

---------------|---------------

NEW YORK | 1

LOS ANGELES | 1

CHICAGO | 1

-- LOWER email result:

customer_id | normalised_email

------------|------------------

1 | alice@example.com

2 | bob@example.com

3 | carol@example.com

-- LENGTH < 5 result:

(0 rows — all city names in this dataset are 7+ characters)

-- Case-insensitive search result:

customer_id | full_name | email | city

------------|------------------|-------------------|----------

1 | alice johnson | ALICE@EXAMPLE.COM | new york

⚠ Watch Out — SQL Server Uses LEN() Not LENGTH():

In SQL Server, LENGTH() does not exist. Use LEN() instead. Also note that LEN() does NOT count trailing spaces, while MySQL's LENGTH() does. If you're counting spaces intentionally, use DATALENGTH() in SQL Server.

📊 Production Insight

UPPER/LOWER applied to indexed columns in WHERE disables the index.

Create a functional index to restore index usage: CREATE INDEX idx_lower_email ON users (LOWER(email)).

For case-insensitive equality, LOWER(email) = LOWER($input) with a matching functional index is O(log n).

🎯 Key Takeaway

LOWER(col) = LOWER($input) forces a full scan unless you have a functional index on LOWER(col).

For case-insensitive lookups, create a functional index that matches the WHERE expression exactly.

LENGTH counts bytes in MySQL for multi-byte UTF8 — use CHAR_LENGTH for true character count.

thecodeforge.io

Sql String Functions

TRIM, LTRIM and RTRIM — Killing the Invisible Spaces

Spaces you can't see are some of the sneakiest bugs in data work. When a user fills in a web form and accidentally hits the spacebar before their name, that space gets stored in the database. Now ' Alice' and 'Alice' are two different strings. Searches fail. Joins don't match. Reports double-count.

TRIM(text) removes spaces from both the left and right sides of a string. LTRIM only removes spaces on the left (the leading side). RTRIM only removes spaces on the right (the trailing side). Think of it like cutting the bread crusts — you can cut both ends, just the left, or just the right.

This comes up constantly in data imports. When you load data from a CSV file or an external API, strings almost always arrive with extra whitespace. Running TRIM before you process the data is like shaking the crumbs off a tablecloth before you sit down — a small step that prevents a lot of mess.

Modern SQL (SQL:2003 standard) lets you TRIM other characters too, not just spaces — for example trimming a dollar sign off a price string. We'll show that in the code below. PostgreSQL and SQL Server both support this; MySQL added it in version 8.

trim_functions.sqlSQL

-- ============================================================
-- Basic TRIM: remove spaces from both sides of full_name
-- ============================================================
SELECT
    customer_id,
    full_name,                      -- original, messy version
    TRIM(full_name) AS clean_name,  -- spaces removed from both sides
    LENGTH(full_name)         AS original_length,
    LENGTH(TRIM(full_name))   AS trimmed_length  -- shows how many spaces were hiding
FROM   customers;

-- ============================================================
-- LTRIM vs RTRIM: when you only want to cut one side
-- Useful if you're processing fixed-width data exports where
-- trailing spaces are padding but leading spaces mean something.
-- ============================================================
SELECT
    '  hello world  '           AS original,
    LTRIM('  hello world  ')    AS left_trimmed,   -- 'hello world  '
    RTRIM('  hello world  ')    AS right_trimmed,  -- '  hello world'
    TRIM('  hello world  ')     AS fully_trimmed;  -- 'hello world'

-- ============================================================
-- TRIM a specific character (not just spaces)
-- Imagine prices were imported with a leading dollar sign
-- ============================================================
SELECT
    TRIM(LEADING '$' FROM '$49.99')   AS price_no_prefix,   -- '49.99'
    TRIM(TRAILING '!' FROM 'Sale!')   AS message_no_bang,   -- 'Sale'
    TRIM(BOTH   '*' FROM '***HOT***') AS product_clean;      -- 'HOT'

-- ============================================================
-- PRACTICAL USE: clean names before inserting into a report
-- Combine TRIM + UPPER to fully normalise a string
-- ============================================================
SELECT
    customer_id,
    UPPER(TRIM(full_name)) AS display_name  -- '  alice johnson  ' → 'ALICE JOHNSON'
FROM   customers;

Output

-- TRIM with length comparison:

customer_id | full_name | clean_name | original_length | trimmed_length

------------|------------------|---------------|-----------------|---------------

1 | alice johnson | alice johnson | 16 | 13

2 | BOB SMITH | BOB SMITH | 9 | 9

3 | carol white | carol white | 12 | 11

-- LTRIM / RTRIM demo:

original | left_trimmed | right_trimmed | fully_trimmed

-----------------|-----------------|-----------------|---------------

hello world | hello world | hello world | hello world

-- TRIM specific characters:

price_no_prefix | message_no_bang | product_clean

----------------|-----------------|---------------

49.99 | Sale | HOT

-- UPPER + TRIM combo:

customer_id | display_name

------------|---------------

1 | ALICE JOHNSON

2 | BOB SMITH

3 | CAROL WHITE

💡Pro Tip — Trim on INSERT, Not Just on SELECT:

Instead of trimming on every SELECT query (which runs the function millions of times on a large table), use a CHECK constraint or a BEFORE INSERT trigger to strip whitespace when data first enters the database. Clean at the gate, not at every door.

📊 Production Insight

TRIM is not applied automatically on insert — users regularly paste values with surrounding whitespace.

Add a database-level CHECK constraint: CHECK (email = TRIM(email)) to prevent dirty data entering.

For batch imports from CSV or external systems, always run UPDATE table SET col = TRIM(col) as a post-import step.

🎯 Key Takeaway

TRIM at write time beats TRIM at read time — prevent dirty data, don't just work around it.

CHECK (col = TRIM(col)) is an elegant constraint that rejects values with leading/trailing spaces.

For imports: UPDATE table SET col = TRIM(col) WHERE col != TRIM(col) as a data quality step.

SUBSTRING, CONCAT and REPLACE — Slicing, Joining and Swapping Text

These three functions let you reshape text rather than just clean it. SUBSTRING pulls a piece out of the middle of a string. CONCAT sticks strings together. REPLACE swaps one piece of text for another. Together they cover almost every text-transformation task you'll face.

SUBSTRING(text, start, length) takes three arguments: the text to work on, the position to start cutting, and how many characters to take. Positions start at 1, not 0 — that trips up programmers coming from JavaScript or Python. So SUBSTRING('London', 1, 3) gives you 'Lon', not 'ond'.

CONCAT(text1, text2, ...) glues strings together. This is useful for building full names from first and last name columns, constructing URLs from parts, or building human-readable labels from codes. The || operator does the same job in PostgreSQL and standard SQL — but MySQL requires the CONCAT function.

REPLACE(text, old_value, new_value) finds every occurrence of old_value inside text and swaps it for new_value. This is powerful for data cleaning — swapping phone number formats, removing unwanted characters, or fixing a batch of mis-entered values all in one UPDATE statement.

substring_concat_replace.sqlSQL

-- ============================================================
-- SUBSTRING: extract just the domain from an email address.
-- We know the '@' is always there — find where it is, then
-- grab everything after it.
--
-- INSTR(string, search) returns the position of the first
-- match. We add 1 to skip past the '@' itself.
-- ============================================================
SELECT
    email,
    SUBSTRING(
        email,
        INSTR(email, '@') + 1,          -- start one char AFTER the '@'
        LENGTH(email)                    -- grab the rest (generous upper bound)
    ) AS email_domain                   -- 'alice@example.com' → 'example.com'
FROM   customers;

-- ============================================================
-- CONCAT: build a full display label from separate columns.
-- Notice we add a literal string ' — ' between city and name.
-- ============================================================
SELECT
    CONCAT(
        TRIM(UPPER(city)),      -- 'NEW YORK'
        ' | ',                  -- literal separator
        TRIM(full_name)         -- 'alice johnson'
    ) AS customer_label
FROM   customers;

-- PostgreSQL / standard SQL alternative (same result):
-- SELECT TRIM(UPPER(city)) || ' | ' || TRIM(full_name) AS customer_label
-- FROM   customers;

-- ============================================================
-- REPLACE: the marketing team re-branded 'example.com' to
-- 'mycompany.com' — fix all the email addresses in one shot.
-- ============================================================
SELECT
    email,
    REPLACE(
        LOWER(email),           -- normalise case first so the match is reliable
        'example.com',          -- the old text to find
        'mycompany.com'         -- the new text to put in its place
    ) AS updated_email
FROM   customers;

-- ============================================================
-- REAL UPDATE using REPLACE (not just a SELECT preview):
-- Uncomment only when you're ready — always preview first!
-- ============================================================
-- UPDATE customers
-- SET    email = REPLACE(LOWER(email), 'example.com', 'mycompany.com')
-- WHERE  email LIKE '%example.com%';  -- safety net: only touch matching rows

-- ============================================================
-- SUBSTRING to extract area code from a stored phone number
-- Assumes format: '212-555-0100' — area code is always first 3 chars
-- ============================================================
SELECT
    '212-555-0100'            AS raw_phone,
    SUBSTRING('212-555-0100', 1, 3) AS area_code;  -- '212'

Output

-- Email domain extraction:

email | email_domain

--------------------|-------------

ALICE@EXAMPLE.COM | EXAMPLE.COM

bob@example.com | example.com

Carol@Example.com | Example.com

-- CONCAT customer label:

customer_label

--------------------------

NEW YORK | alice johnson

LOS ANGELES | BOB SMITH

CHICAGO | carol white

-- REPLACE email domain:

email | updated_email

--------------------|----------------------

ALICE@EXAMPLE.COM | alice@mycompany.com

bob@example.com | bob@mycompany.com

Carol@Example.com | carol@mycompany.com

-- Area code extraction:

raw_phone | area_code

-------------|----------

212-555-0100 | 212

⚠ Watch Out — SUBSTRING Positions Start at 1, Not 0:

Coming from Python or JavaScript? SUBSTRING in SQL counts from position 1, not 0. SUBSTRING('Hello', 1, 1) returns 'H'. If you write SUBSTRING('Hello', 0, 1) you'll get an empty string or unexpected result depending on the database. This bites almost every developer the first time.

📊 Production Insight

SUBSTRING indexes are 0-based in some databases (SQL Server) and 1-based in others (PostgreSQL, MySQL) — verify for your database.

For extracting structured components (area code from phone number), REGEXP_REPLACE/REGEXP_SUBSTR is more robust than SUBSTRING with hardcoded positions.

For large text manipulation, do it in application code — pushing heavy string processing into SQL runs on the database server which is usually a shared resource.

🎯 Key Takeaway

SUBSTRING position is 1-based in PostgreSQL/MySQL, 0-based in SQL Server — test your database.

For structured text parsing (extracting domain from email), REGEXP_SUBSTR is more robust than SUBSTRING(col, POSITION('@' IN col) + 1).

Heavy string transformation belongs in application code, not SQL — databases are optimized for data retrieval, not string processing.

LIKE, PATINDEX, and CHARINDEX — Finding Needles Without Blowing Up Performance

You've got a column full of customer emails. Your junior asks: "How do I find all users from .edu domains?" First instinct: WHERE email LIKE '%.edu'. Works on 100 rows. On 10 million, that leading wildcard forces a full table scan every single time. Your query goes from 2ms to 30 seconds. Production cries.

LIKE with a leading % is a sequential scan bomb. Always push the wildcard to the end when you can: LIKE 'john@%'. If you need substring search across millions of rows, use PATINDEX or CHARINDEX inside a persisted computed column and index it. CHARINDEX('@company.com', email) > 0 is marginally better than LIKE for some optimizers, but still not SARGable. The real move: normalize, add a domain column, index it. String functions are runtime crutches — don't let them become your schema.

EmailDomainSearch.sqlSQL

// io.thecodeforge — database tutorial

-- Bad: leading wildcard = index ignore
SELECT user_id, email
  FROM users
 WHERE email LIKE '%.edu';

-- Better: trailing wildcard can use index
SELECT user_id, email
  FROM users
 WHERE email LIKE 'john@%';

-- Best: extracted domain column + index
ALTER TABLE users
  ADD email_domain AS SUBSTRING(
    email, CHARINDEX('@', email) + 1, LEN(email)
  ) PERSISTED;

CREATE INDEX idx_users_email_domain
  ON users(email_domain);

-- Now this uses the index
SELECT user_id, email
  FROM users
 WHERE email_domain = 'stanford.edu';

Output

-- Index scan on idx_users_email_domain

-- user_id | email

-- 1 | alice@stanford.edu

-- 2 | bob@stanford.edu

-- (no table scan)

⚠ Production Trap:

LIKE with leading '%' is the #1 cause of accidental full scans in text search queries. Always check the execution plan — if you see 'Table Scan' and a wildcard at position 0, refactor.

🎯 Key Takeaway

Push wildcards to the end of LIKE patterns, or better yet, extract and index the substring you're searching for.

REPLACE and TRANSLATE — When a Sledgehammer Beats a Scalpel

You need to strip punctuation from a user's bio before inserting it into a search index. Junior reaches for nested REPLACE calls. Eight lines of spaghetti. One for each character: period, comma, exclamation. It's fragile and slow — every REPLACE scans the whole string. On a batch of 50,000 rows, that's 400,000 full scans. Your ETL pipe grinds to a halt.

TRANSLATE is the hammer you forgot you had. It maps characters in one step: TRANSLATE(bio, '.,!?;:', '') — single pass, zero nesting. If you're cleaning character sets (e.g. replacing smart quotes with straight ones), TRANSLATE handles a full mapping in one call. REPLACE is for swapping substrings, not scrubbing individual characters. Know the difference: REPLACE = substring surgery. TRANSLATE = character-level carpet bombing. Use the right tool or own the performance bill.

StripPunctuation.sqlSQL

// io.thecodeforge — database tutorial

-- Junior: nested REPLACE — scans bio 8 times
SELECT
  REPLACE(
    REPLACE(
      REPLACE(
        REPLACE(bio, '.', ''),
        ',', ''
      ),
      '!', ''
    ),
    '?', ''
  ) AS clean_bio
FROM users;

-- Senior: TRANSLATE — single pass
SELECT
  TRANSLATE(bio, '.,!?;:', '') AS clean_bio
FROM users;

-- Real-world: replace smart quotes
SELECT
  TRANSLATE(
    'Hello “World” — foo',
    '“”‘’—',
    '""\'\'-'
  ) AS normalized_text;

Output

-- With bio = 'Hello, World! How are you?'

-- translate returns: 'Hello World How are you'

-- vs nested replace returns the same, but with 8x the I/O

🔥Senior Shortcut:

TRANSLATE is not available in MySQL or PostgreSQL — use REGEXP_REPLACE instead. In SQL Server and Oracle, TRANSLATE is a performance cheat code.

🎯 Key Takeaway

Use TRANSLATE for single-character replacements (punctuation, quotes) — one scan beats eight nested REPLACE calls every time.

CONCAT_WS() — Stop Writing Ugly Concatenation Chains

You’ve been writing CONCAT with a dozen parameters and commas between every pair? That’s amateur hour. CONCAT_WS() is the production-grade version: first argument is your separator, then list as many columns as you want. NULLs get skipped silently — your report won’t crash because someone’s middle name is missing.

The separator is applied between every non-null value. No more checking IS NULL before every join. No more messy COALESCE hacks. If you’re building CSV exports or full-name fields, this is the function you should reach for every time. It’s shorter, faster, and handles edge cases without you writing defensive code.

Example.sqlSQL

// io.thecodeforge — database tutorial

SELECT 
    CONCAT_WS(' | ', first_name, middle_initial, last_name) AS full_name
FROM employees
WHERE department = 'Engineering'
LIMIT 3;

Output

full_name

----------------------

Alice | A | Chen

Bob | NULL | Kowalski --> middle_initial is NULL, skipped

Carlos | NULL | Mendez

💡Production Trap:

CONCAT() treats NULL as empty string if you wrap in COALESCE. CONCAT_WS() just skips nulls. That changes row counts in aggregations — watch out when building column lists for pivot tables.

🎯 Key Takeaway

CONCAT_WS() is the only concatenation function you need in production — it handles separators and nulls in one call.

LPAD() and RPAD() — Padding Is Not Just for Pretty Print

Padding looks like a UI problem. It’s not. When you need fixed-width exports for legacy systems, or you’re building invoice numbers with leading zeros, LPAD() and RPAD() are your only sane options. LEFT() + REPLICATE() is a fragile kludge — these functions exist exactly so you don’t write that garbage.

LPAD() adds characters to the left until the string hits your target length. RPAD() does the same on the right. Pass the column, the target width, and the pad character. If the original string is longer than the target, it truncates from the padded side. That means you need to check your data first — truncation is silent and invisible.

Example.sqlSQL

// io.thecodeforge — database tutorial

SELECT 
    order_id,
    LPAD(CAST(order_id AS CHAR), 8, '0') AS padded_id,
    RPAD(product_name, 20, '.') AS product_display
FROM orders
WHERE order_date >= '2024-01-01'
LIMIT 2;

Output

order_id | padded_id | product_display

-------- | --------- | --------------------

123 | 00000123 | Wireless Mouse........

4567 | 00004567 | Mechanical Keyboard...

⚠ Senior Shortcut:

Pad with spaces for fixed-width reports; pad with zeros for financial codes. Always test truncation behavior — if your target length is smaller than the longest value, you’ll lose data without an error.

🎯 Key Takeaway

LPAD/RPAD turn variable-length strings into fixed-width data without messy string arithmetic.

LOCATE() — The Find Function That Doesn’t Overcomplicate Things

LIKE is for pattern matching. PATINDEX is for regex fans. When you just need to know if a substring exists inside a string — and where it starts — use LOCATE(). It’s the SQL equivalent of strpos() in PHP or indexOf() in JavaScript. Three parameters: the substring to find, the source string, and an optional starting position.

Returns zero if not found. That’s it. No wildcards, no performance nightmares from leading wildcard LIKE patterns. LOCATE() is fully index-unfriendly when you search on a computed column, but if you’re searching a known column with a substring, it’s faster and cleaner than a LIKE with '%something%'. Use it for parsing CSV fields, checking for tags, or extracting substrings by position.

Example.sqlSQL

// io.thecodeforge — database tutorial

SELECT 
    email,
    LOCATE('@', email) AS at_position,
    LOCATE('.com', email) AS dotcom_position
FROM users
WHERE LOCATE('@', email) > 0
  AND LOCATE('.com', email) = 0
LIMIT 3;

Output

email | at_position | dotcom_position

-------------------- | ----------- | ---------------

user@example.net | 5 | 0

admin@company.org | 6 | 0

test@dev.io | 5 | 0

🔥Production Trap:

LOCATE() with a leading wildcard (e.g., LOCATE('@', email)) cannot use indexes. For search-heavy workloads, store extracted domains in a separate indexed column or use full-text search.

🎯 Key Takeaway

LOCATE() is the simplest way to find a substring’s position — zero means not found, any positive integer is the character position.

REVERSE() — When Backward Text Holds Hidden Patterns

Most developers view REVERSE() as a party trick, but it solves real problems. Need to find palindromic product codes? REVERSE() + equality check filters them in one pass. Working with hierarchically delimited strings like file paths or SKUs? Reverse the string, extract the first segment, and reverse back — a two-step solution that avoids complex split logic. REVERSE() also cracks pattern detection in domain names, where subdomains must match certain suffixes. The function flips every character: REVERSE('abc') becomes 'cba'. It works on any string type and respects Unicode correctly in most modern databases. Performance is linear in string length, making it safe for millions of rows. Avoid REVERSE() inside WHERE clauses on indexed columns unless you use a computed column with a persisted index — because reversing destroys index order. Use it instead in SELECT or CTE stages after filtering.

ReverseDomain.sqlSQL

// io.thecodeforge — database tutorial

WITH emails AS (
  SELECT 'user@sub.example.com' AS email
)
SELECT
  email,
  REVERSE(email) AS reversed,
  REVERSE(
    SUBSTRING_INDEX(REVERSE(email), '.', 1)
  ) AS top_level_domain
FROM emails;

Output

email | reversed | top_level_domain

---------------------------+----------------------------------+-----------------

user@sub.example.com | moc.elpmaxe.bus@resu | com

⚠ Production Trap:

REVERSE() bypasses indexes. Never use it in JOIN or WHERE on large tables unless you create a computed, persisted index on the reversed column.

🎯 Key Takeaway

REVERSE() turns string-end processing into string-start processing — always apply after filtering, never in join conditions.

REPEAT() — Generating Test Data Without Loops

REPEAT() replicates a string N times in a single expression. It's your fastest tool for generating padded output, fake data, or separator lines. The signature is simple: REPEAT(string, count). If count is zero, it returns an empty string. Negative count returns NULL. Most databases cap the result at a system maximum length (e.g., 65,535 bytes in MySQL) — hitting that silently truncates, not errors. Use REPEAT() in reporting to create dynamic progress bars: REPEAT('█', ROUND(percent 50)). For data anonymization, replace each character of a name with '': REPEAT('*', LENGTH(name)). It also builds repeating separators in CONCAT_WS chains without hardcoding. The performance is O(N) in the output length, so generating 100MB strings will hurt. Keep counts under 10,000 in production queries. For row-wise repetition, pair REPEAT() with a numbers table to multiply rows, not just characters.

GenerateStars.sqlSQL

// io.thecodeforge — database tutorial

SELECT
  product_name,
  CONCAT(
    'Score: ',
    REPEAT('★', rating),
    REPEAT('☆', 5 - rating)
  ) AS star_rating
FROM (
  SELECT 'Widget A' AS product_name, 4 AS rating
  UNION ALL
  SELECT 'Widget B', 2
) scores;

Output

product_name | star_rating

-------------+--------------------

Widget A | Score: ★★★★☆

Widget B | Score: ★★☆☆☆

⚠ Production Trap:

REPEAT() with count over 100,000 can lock memory. Always validate count is bounded — use LEAST(count, 5000) for safety.

🎯 Key Takeaway

REPEAT() replaces cursor-based loops for generating repeated strings — but cap the count to avoid memory bloat.

FIND_IN_SET() — Slicing Comma-Separated Lists Without Pain

FIND_IN_SET() searches a comma-separated string for a needle and returns its 1-based position. It's purpose-built for columns that store multiple tags or categories in a single string — a pattern you inherit from legacy systems or ETL feeds. The syntax: FIND_IN_SET('needle', 'item1,item2,item3'). Returns 0 if not found. The needle must be a single string, not a comma. It does not support wildcards. Performance is O(N) in list length; for very long lists (100+ items), normalize to a junction table instead. The biggest hidden trap: FIND_IN_SET ignores spaces after commas — 'tag1, tag2' treats ' tag2' as the item, so 'tag2' will not match. Always clean your data with REPLACE(list, ' ', '') before using it. Combine with SUBSTRING_INDEX to extract the matching item itself. Avoid using FIND_IN_SET in JOIN conditions — it forces full table scans and prevents index usage. Use it only for read-only reports or one-time data cleanups.

FindCategory.sqlSQL

// io.thecodeforge — database tutorial

WITH products AS (
  SELECT 'Laptop' AS product, 'electronics,computers,portable' AS tags
  UNION ALL
  SELECT 'Chair', 'furniture,office'
)
SELECT
  product,
  FIND_IN_SET('computers', tags) AS computer_position
FROM products
WHERE FIND_IN_SET('computers', tags) > 0;

Output

product | computer_position

--------+------------------

Laptop | 2

⚠ Production Trap:

FIND_IN_SET ignores whitespace. 'blue' will NOT match 'blue, red' because of the leading space before 'red'. Always trim your delimited values.

🎯 Key Takeaway

FIND_IN_SET() is a quick-but-dirty parser for comma lists — acceptable in reports, but refactor to normalized tables for OLTP queries.

Regex Functions: REGEXP_MATCH, REGEXP_REPLACE Across Dialects

Regular expressions (regex) provide powerful pattern matching and text manipulation beyond basic string functions. While SQL dialects vary, most support regex through functions like REGEXP_MATCH (or REGEXP_LIKE), REGEXP_REPLACE, and REGEXP_SUBSTR. In PostgreSQL, use ~ for matching and regexp_replace(); in MySQL, REGEXP and REGEXP_REPLACE(); in SQL Server, LIKE with limited patterns or CLR integration; in Oracle, REGEXP_LIKE, REGEXP_REPLACE, etc. For example, to find emails in a text column: SELECT * FROM users WHERE email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$' (PostgreSQL). To replace all digits with '#': SELECT REGEXP_REPLACE(phone, '[0-9]', '#', 'g') FROM contacts (MySQL). Regex can be expensive; use indexes or full-text search for large datasets. Always test patterns for performance and correctness.

regex_examples.sqlSQL

1

2

3

4

5

6

7

8

9

10

-- PostgreSQL: match emails
SELECT * FROM users WHERE email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$';

-- MySQL: replace all digits with '#'
SELECT REGEXP_REPLACE(phone, '[0-9]', '#', 'g') FROM contacts;

-- Oracle: extract first word
SELECT REGEXP_SUBSTR(description, '[^ ]+', 1, 1) AS first_word FROM products;

-- SQL Server: requires CLR or use LIKE with patterns (limited)

⚠ Regex Performance Trap

📊 Production Insight

In production, pre-filter data with cheaper operations (e.g., LIKE for simple patterns) before applying regex. Use generated columns or materialized views to precompute regex results for frequent queries.

🎯 Key Takeaway

Regex functions enable complex pattern matching and text replacement across SQL dialects, but use them judiciously to avoid performance pitfalls.

`Full-Text Search vs LIKE vs Regex: Performance`

Choosing the right text search method is critical for performance. LIKE with wildcards (e.g., '%pattern%') cannot use indexes efficiently, leading to full table scans. Regex functions are even more expensive due to pattern compilation and backtracking. Full-text search (FTS) is designed for large text corpora: it uses inverted indexes, tokenization, and ranking. In PostgreSQL, use tsvector and tsquery; in MySQL, MATCH ... AGAINST; in SQL Server, CONTAINS or FREETEXT. For example, to find documents containing 'database' and 'performance': SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database & performance') (PostgreSQL). FTS supports stemming, stop words, and ranking. Use LIKE for exact prefix matches (e.g., 'prefix%') which can use an index. Use regex only when pattern complexity justifies the cost. Benchmark with realistic data volumes to choose the right tool.

search_comparison.sqlSQL

1

2

3

4

5

6

7

8

9

10

11

-- Full-text search (PostgreSQL)
SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database & performance');

-- LIKE (no index usage for leading wildcard)
SELECT * FROM articles WHERE body LIKE '%database%';

-- Regex (expensive)
SELECT * FROM articles WHERE body ~ 'database.*performance';

-- MySQL full-text search
SELECT * FROM articles WHERE MATCH(body) AGAINST('+database +performance' IN BOOLEAN MODE);

🔥Indexing for LIKE

📊 Production Insight

Implement a hybrid approach: use FTS for main search, fall back to LIKE for autocomplete, and apply regex only in ETL or data cleaning jobs where performance is less critical.

🎯 Key Takeaway

Full-text search outperforms LIKE and regex for large text datasets; use LIKE for simple prefix searches and regex only when necessary.

`String Aggregation: STRING_AGG, GROUP_CONCAT, LISTAGG`

String aggregation functions concatenate values from multiple rows into a single string, often with a delimiter. PostgreSQL uses STRING_AGG, MySQL uses GROUP_CONCAT, SQL Server uses STRING_AGG (since 2017), and Oracle uses LISTAGG. For example, to list employee names per department: SELECT department, STRING_AGG(employee_name, ', ' ORDER BY employee_name) AS employees FROM employees GROUP BY department (PostgreSQL). In MySQL: GROUP_CONCAT(employee_name ORDER BY employee_name SEPARATOR ', '). These functions are efficient for grouping but can hit length limits (e.g., MySQL's group_concat_max_len). For large aggregations, consider using array types or application-level concatenation. Always specify an ORDER BY to ensure deterministic results. Use DISTINCT to avoid duplicates: STRING_AGG(DISTINCT employee_name, ', ').

string_agg_examples.sqlSQL

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

-- PostgreSQL: aggregate employee names per department
SELECT department, STRING_AGG(employee_name, ', ' ORDER BY employee_name) AS employees
FROM employees
GROUP BY department;

-- MySQL: same with GROUP_CONCAT
SELECT department, GROUP_CONCAT(employee_name ORDER BY employee_name SEPARATOR ', ') AS employees
FROM employees
GROUP BY department;

-- SQL Server (2017+): STRING_AGG
SELECT department, STRING_AGG(employee_name, ', ') WITHIN GROUP (ORDER BY employee_name) AS employees
FROM employees
GROUP BY department;

-- Oracle: LISTAGG
SELECT department, LISTAGG(employee_name, ', ') WITHIN GROUP (ORDER BY employee_name) AS employees
FROM employees
GROUP BY department;

💡Handling NULLs

📊 Production Insight

Set appropriate limits (e.g., group_concat_max_len in MySQL) and monitor for truncation. For very large aggregations, consider using arrays or temp tables to avoid memory issues.

🎯 Key Takeaway

Use STRING_AGG (PostgreSQL, SQL Server), GROUP_CONCAT (MySQL), or LISTAGG (Oracle) to efficiently concatenate row values into a single string with ordering and delimiters.



    ● Production incidentPOST-MORTEMseverity: high
User Lookup Failed for 30% of Users Due to Trailing Spaces in Email Column
Symptom
30% of users were locked out after a system migration. Password reset emails reached them (sent to the stored email) but login lookups returned 'user not found'.
Assumption
The login query used WHERE email = $input. The team assumed emails in the database were clean.
Root cause
A legacy CSV import inserted emails with trailing spaces. The login form trimmed user input; the database values had spaces. 'alice@example.com ' != 'alice@example.com' — exact string comparison failed silently.
Fix
UPDATE users SET email = TRIM(email) to clean existing data. Added TRIM() in the login query as an interim fix. Added a CHECK constraint: CHECK (email = TRIM(email)) to prevent future imports from introducing spaces.
Key lesson
Always TRIM text columns at write time, not read time — prevent the dirty data from entering the database
Add CHECK constraints for format rules: CHECK (email = TRIM(email)) prevents leading/trailing spaces
When doing exact string comparisons in authentication, TRIM both sides defensively

    Production debug guideDiagnosing silent comparison failures and encoding issues3 entries
Symptom · 01
String comparison returns no rows despite visually matching values
→
Fix
Check for invisible characters: SELECT LENGTH(col), TRIM(col), col FROM table WHERE id = x. If LENGTH != CHAR_LENGTH(TRIM(col)), there are leading/trailing spaces or multi-byte characters. Also check case: col = 'london' won't match 'London' without LOWER(col).
Symptom · 02
LIKE '%pattern%' is slow on a large table
→
Fix
LIKE with a leading wildcard forces a full table scan. For contains-search, create a full-text index: CREATE INDEX idx_fts ON products USING GIN(to_tsvector('english', description)) in PostgreSQL. Use MySQL FULLTEXT index with MATCH...AGAINST for large text search.
Symptom · 03
CONCAT returns NULL when some inputs are NULL
→
Fix
CONCAT in PostgreSQL and standard SQL returns NULL if any argument is NULL. Use COALESCE: COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''). MySQL's CONCAT_WS(separator, ...) ignores NULLs automatically.

    

    Function What It Does MySQL Syntax PostgreSQL Syntax SQL Server Syntax
Uppercase Converts all letters to uppercase UPPER(col) UPPER(col) UPPER(col)
Lowercase Converts all letters to lowercase LOWER(col) LOWER(col) LOWER(col)
String length Returns character count of string LENGTH(col) LENGTH(col) LEN(col)
Trim both sides Removes leading and trailing spaces TRIM(col) TRIM(col) TRIM(col)
Trim left only Removes leading spaces only LTRIM(col) LTRIM(col) LTRIM(col)
Trim right only Removes trailing spaces only RTRIM(col) RTRIM(col) RTRIM(col)
Extract substring Pulls a slice of text by position SUBSTRING(col, start, len) SUBSTRING(col, start, len) SUBSTRING(col, start, len)
Concatenate strings Joins two or more strings together CONCAT(a, b) CONCAT(a, b) or a || b CONCAT(a, b) or a + b
Find position Returns index of substring in string INSTR(col, search) POSITION(search IN col) CHARINDEX(search, col)
Replace text Swaps all occurrences of a substring REPLACE(col, old, new) REPLACE(col, old, new) REPLACE(col, old, new)

    ⚙ Quick Reference
13 commands from this guide
File Command / Code Purpose
string_functions_overview.sql CREATE TABLE customers ( The Seven String Functions You'll Use 90% of the Time
upper_lower_length.sql SELECT UPPER, LOWER and LENGTH
trim_functions.sql SELECT TRIM, LTRIM and RTRIM
substring_concat_replace.sql SELECT SUBSTRING, CONCAT and REPLACE
EmailDomainSearch.sql SELECT user_id, email LIKE, PATINDEX, and CHARINDEX
StripPunctuation.sql SELECT REPLACE and TRANSLATE
Example.sql SELECT CONCAT_WS()
ReverseDomain.sql WITH emails AS ( REVERSE()
GenerateStars.sql SELECT REPEAT()
FindCategory.sql WITH products AS ( FIND_IN_SET()
regex_examples.sql SELECT * FROM users WHERE email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2... Regex Functions
search_comparison.sql SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english... Full-Text Search vs LIKE vs Regex
string_agg_examples.sql SELECT department, STRING_AGG(employee_name, ', ' ORDER BY employee_name) AS emp... String Aggregation

    Key takeaways
1String functions never change the stored data
they return a transformed copy for that query only. To permanently change data you need an UPDATE statement.
2Always normalise BOTH sides of a string comparison
WHERE LOWER(city) = 'london' works, but WHERE city = 'london' silently fails when the stored value is 'London' or 'LONDON'.
3SQL Server uses LEN() and CHARINDEX() where MySQL/PostgreSQL use LENGTH() and INSTR()/POSITION()
the concept is identical, the spelling is different. Check your dialect before debugging.
4Chain functions from the inside out
UPPER(TRIM(full_name)) trims first, then uppercases. Read nested functions like onion layers — innermost runs first.

    

    

    

  
    
      INTERVIEW PREP · PRACTICE MODE
      Interview Questions on This Topic
    
    
      
      
    
  

  
  
    Q01SENIOR
What is the difference between CHAR_LENGTH and LENGTH in SQL?
Q02SENIOR
Why does LIKE '%pattern%' not use an index, and how would you optimize a...
Q03JUNIOR
How do you safely concatenate a full name from first_name and last_name ...
    Q01 of 03SENIOR
What is the difference between CHAR_LENGTH and LENGTH in SQL?
ANSWER
LENGTH in MySQL returns the number of bytes, not characters. For ASCII strings these are equal. For multi-byte UTF-8 characters (Chinese, Arabic, emoji), LENGTH may return a higher number because each character uses 2-4 bytes. CHAR_LENGTH always returns the number of characters regardless of encoding. In PostgreSQL, LENGTH returns characters by default. Example: a string containing a single emoji (4 bytes in UTF-8) — MySQL LENGTH returns 4, CHAR_LENGTH returns 1. For user-facing character counting (form validation, display truncation), always use CHAR_LENGTH.
Q02 of 03SENIOR
Why does LIKE '%pattern%' not use an index, and how would you optimize a full-text search?
ANSWER
A B-tree index stores values in sorted order. LIKE 'prefix%' can use the index because the sorted structure allows a range scan from 'prefix' onwards. LIKE '%pattern%' (leading wildcard) cannot — there's no way to find all values containing 'pattern' in a sorted structure without scanning every entry. Solutions: (1) PostgreSQL: create a GIN index on to_tsvector() and use the @@ operator for full-text search. (2) MySQL: CREATE FULLTEXT INDEX and query with MATCH ... AGAINST. (3) Elasticsearch or Typesense for production search with ranking, typo-tolerance, and faceting. (4) pg_trgm extension in PostgreSQL creates a trigram index that supports LIKE '%pattern%' and is significantly faster than a full scan.
Q03 of 03JUNIOR
How do you safely concatenate a full name from first_name and last_name columns that might be NULL?
ANSWER
CONCAT in standard SQL (and PostgreSQL || operator) returns NULL if any argument is NULL. Safe options: (1) COALESCE each component: COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') — but this produces a leading or trailing space if either is NULL. (2) CONCAT_WS(' ', first_name, last_name) in MySQL — CONCAT_WS skips NULL arguments and doesn't produce extra separators. (3) CASE logic: CASE WHEN first_name IS NOT NULL AND last_name IS NOT NULL THEN first_name || ' ' || last_name WHEN first_name IS NOT NULL THEN first_name ELSE COALESCE(last_name, '') END. The cleanest production pattern is usually to store display names normalized at write time rather than computing them in every query.
  

  
  
    01
What is the difference between CHAR_LENGTH and LENGTH in SQL?
SENIOR
02
Why does LIKE '%pattern%' not use an index, and how would you optimize a full-text search?
SENIOR
03
How do you safely concatenate a full name from first_name and last_name columns that might be NULL?
JUNIOR
  



    

  
    FAQ · 3 QUESTIONS
    Frequently Asked Questions
  
  01
What is the difference between CHAR_LENGTH and LENGTH in SQL?
In MySQL, LENGTH() counts bytes not characters — so a single emoji or accented character (which uses multiple bytes in UTF-8) can count as 2, 3 or 4. CHAR_LENGTH() counts actual characters regardless of byte size. For modern text data with international characters always prefer CHAR_LENGTH() in MySQL. PostgreSQL's LENGTH() counts characters directly so this distinction doesn't apply there.
Was this helpful? 
02
Can I use SQL string functions inside a WHERE clause?
Yes, absolutely. WHERE LOWER(email) = 'alice@example.com' is perfectly valid. The function runs on each row as the database filters. The trade-off is performance: wrapping a column in a function usually prevents the database from using an index on that column. For a large table, consider storing a pre-normalised version of the column or using a function-based index if your database supports it.
Was this helpful? 
03
Why does CONCAT return NULL when one of my columns is NULL?
In standard SQL and most databases, any string operation involving NULL returns NULL — because NULL means 'unknown', and 'Alice' joined to 'unknown' is still 'unknown'. Fix this with COALESCE: CONCAT(first_name, ' ', COALESCE(last_name, '')) replaces a NULL last_name with an empty string so the full name still returns correctly instead of vanishing.
Was this helpful? 



    📖 What to Read Next
→C# Strings — Turkish 'i' Bug Breaks LoginsIn Turkish culture, 'I' and 'i' are separate letters — default comparison fails logins. Real production bug: specify Ord→Java Text Blocks — Invisible Whitespace Broke Payment APIJava text block closed with 8-space indent: 4 unwanted spaces appeared in nested JSON. 2300 payments failed. Don't trust→String Manipulation Patterns — Loop Concatenation OOMAutocomplete crashed OOM: 2GB/min temporary char[] from loop concatenation. Fix with StringBuilder — DSA patterns with r→Full-Text Search in SQL: Stopwords Hide 'ink' ProductsStopwords hid 'ink' from FTS results; LIKE found all products. Custom stopword list had 'ink' as noise. Review your stop→Java String Methods - == Bug Rejected Login PasswordsRandom login errors every ~3 attempts caused by ==. Our debug guide reveals how to catch this bug and substring gotchas
From Other Categories
Normalization vs DenormalizationBeginner
    

    
    
      
        N
        
          
            Naren
            Founder & Principal Engineer
          
          
            20+ years shipping high-throughput database systems. Drawn from code that ran under real load.
          
        
        
          
          Follow
        
      
      
        
          
            ✓ Verified
          
          production tested
        
        
          July 19, 2026
          last updated
        
        
          2,466
          articles · all by Naren
        
      
    

    
    
      🔥
      
        That's SQL Basics. Mark it forged?
      
      
        10 min read · try the examples if you haven't

Function	What It Does	MySQL Syntax	PostgreSQL Syntax	SQL Server Syntax
Uppercase	Converts all letters to uppercase	UPPER(col)	UPPER(col)	UPPER(col)
Lowercase	Converts all letters to lowercase	LOWER(col)	LOWER(col)	LOWER(col)
String length	Returns character count of string	LENGTH(col)	LENGTH(col)	LEN(col)
Trim both sides	Removes leading and trailing spaces	TRIM(col)	TRIM(col)	TRIM(col)
Trim left only	Removes leading spaces only	LTRIM(col)	LTRIM(col)	LTRIM(col)
Trim right only	Removes trailing spaces only	RTRIM(col)	RTRIM(col)	RTRIM(col)
Extract substring	Pulls a slice of text by position	SUBSTRING(col, start, len)	SUBSTRING(col, start, len)	SUBSTRING(col, start, len)
Concatenate strings	Joins two or more strings together	CONCAT(a, b)	CONCAT(a, b) or a \|\| b	CONCAT(a, b) or a + b
Find position	Returns index of substring in string	INSTR(col, search)	POSITION(search IN col)	CHARINDEX(search, col)
Replace text	Swaps all occurrences of a substring	REPLACE(col, old, new)	REPLACE(col, old, new)	REPLACE(col, old, new)

File	Command / Code	Purpose
string_functions_overview.sql	CREATE TABLE customers (	The Seven String Functions You'll Use 90% of the Time
upper_lower_length.sql	SELECT	UPPER, LOWER and LENGTH
trim_functions.sql	SELECT	TRIM, LTRIM and RTRIM
substring_concat_replace.sql	SELECT	SUBSTRING, CONCAT and REPLACE
EmailDomainSearch.sql	SELECT user_id, email	LIKE, PATINDEX, and CHARINDEX
StripPunctuation.sql	SELECT	REPLACE and TRANSLATE
Example.sql	SELECT	CONCAT_WS()
ReverseDomain.sql	WITH emails AS (	REVERSE()
GenerateStars.sql	SELECT	REPEAT()
FindCategory.sql	WITH products AS (	FIND_IN_SET()
regex_examples.sql	SELECT * FROM users WHERE email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2...	Regex Functions
search_comparison.sql	SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english...	Full-Text Search vs LIKE vs Regex
string_agg_examples.sql	SELECT department, STRING_AGG(employee_name, ', ' ORDER BY employee_name) AS emp...	String Aggregation