TRIM removes leading and trailing whitespace — invisible spaces cause silent JOIN and comparison failures
LIKE with a leading wildcard ('%term') disables index usage — forces a full table scan
LENGTH counts bytes in MySQL for multi-byte characters; use CHAR_LENGTH for character count
String concatenation: CONCAT in MySQL, || in PostgreSQL and standard SQL
Biggest mistake: comparing strings without normalizing case — WHERE city = 'london' misses 'London' silently
Plain-English First
Imagine you work at a library and someone hands you a messy pile of book titles — some in ALL CAPS, some with random spaces at the start, some misspelled. SQL string functions are the tools you use to clean up, reshape, and search through that text, all without touching the original paper. They let your database do the tedious text-editing work so you never have to open a spreadsheet and do it by hand.
Every real-world database is full of text — customer names, email addresses, product descriptions, phone numbers. The problem is that text is messy. Users type their names in all caps, leave trailing spaces, or enter 'New York' when your system expects 'new york'. If you can't clean and manipulate that text inside the database itself, you end up writing hundreds of lines of application code to do a job SQL can handle in a single line.
SQL string functions solve exactly that problem. They're built-in tools the database gives you to inspect, clean, transform, and combine text columns on the fly — right inside your queries. Instead of pulling all the data into Python or JavaScript and then fixing it, you fix it where it lives.
By the end of this article you'll know the seven most important SQL string functions, exactly when to reach for each one, how to chain them together for real cleaning tasks, and the two traps that catch almost every beginner. You'll be able to look at a messy data problem and immediately know which function to use.
The Seven String Functions You'll Use 90% of the Time
SQL has dozens of string functions, but most professional work comes down to seven. Think of them like tools in a toolbox — a hammer, a screwdriver, a tape measure. You don't need every tool on day one; you need to know the core ones cold.
Here's a quick map before we go deep:
— UPPER / LOWER: change the case of text, like Caps Lock on a keyboard. — LENGTH: count how many characters are in a string, like counting letters in a word. — TRIM / LTRIM / RTRIM: remove unwanted spaces from the edges of text, like cutting the crusts off bread. — SUBSTRING: pull out a chunk of text from the middle of a string, like cutting a slice from a loaf. — CONCAT: glue two or more pieces of text together, like taping two pieces of paper side by side. — REPLACE: swap one piece of text for another, like using find-and-replace in Word. — INSTR / CHARINDEX / POSITION: find where a smaller piece of text lives inside a bigger one.
Different databases (MySQL, PostgreSQL, SQL Server) spell a few of these slightly differently — we'll flag that as we go. The concepts are identical everywhere.
string_functions_overview.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- ============================================================-- Setup: create a small customers table with intentionally-- messy data so we can demonstrate each function clearly.-- ============================================================CREATETABLEcustomers (
customer_id INT,
full_name VARCHAR(100),
email VARCHAR(150),
city VARCHAR(50)
);
INSERTINTO customers VALUES
(1, ' alice johnson ', 'ALICE@EXAMPLE.COM', 'new york'),
(2, 'BOB SMITH', 'bob@example.com', 'Los Angeles'),
(3, 'carol white ', 'Carol@Example.com', 'CHICAGO');
-- --------------------------------------------------------- Quick preview: what does the raw data look like?-- -------------------------------------------------------SELECT customer_id, full_name, email, city
FROM customers;
All examples in this article use standard SQL that runs in MySQL 8+, PostgreSQL 14+, and SQL Server 2019+. Where syntax differs between databases we'll call it out explicitly with a short note.
Production Insight
String normalization (LOWER, TRIM) should happen at write time, not read time — applying LOWER() in WHERE prevents index usage.
Store emails in lowercase at INSERT time; store phone numbers in a canonical format — don't rely on query-time normalization.
For case-insensitive columns, PostgreSQL's CITEXT type handles case-insensitive comparisons automatically without function overhead.
Key Takeaway
Normalize strings at write time — don't apply LOWER or TRIM in WHERE clauses if you can avoid it.
Case-insensitive storage (lowercase at insert) allows exact-match queries to use indexes normally.
For text columns requiring case-insensitive search, PostgreSQL's CITEXT type is cleaner than LOWER() everywhere.
UPPER, LOWER and LENGTH — Your Case and Size Tools
Case inconsistency is one of the most common data quality problems in real databases. One record says 'London', another says 'LONDON', a third says 'london'. To your database those are three completely different values — so a GROUP BY city query would produce three separate buckets instead of one. That's a real bug.
UPPER(text) converts every letter in a string to uppercase. LOWER(text) converts every letter to lowercase. Neither modifies the original stored data — they return a new version of the text just for that query result. Think of it like wearing tinted glasses: the world looks different, but the world hasn't changed.
LENGTH(text) tells you how many characters are in a string, including spaces. This is useful for data validation — if a phone number column has a length other than 10 digits, something went wrong during data entry. Note: in SQL Server, the equivalent function is called LEN() not LENGTH() — a classic gotcha we'll revisit.
The real power comes when you combine these with a WHERE clause. Instead of hoping users typed city names consistently, you normalise both sides of the comparison so they always match regardless of how the data was entered.
upper_lower_length.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
-- ============================================================-- UPPER: standardise city names to uppercase for safe grouping-- ============================================================SELECTUPPER(city) AS city_uppercase, -- 'new york' → 'NEW YORK'COUNT(*) AS customer_count
FROM customers
GROUPBYUPPER(city); -- groups correctly despite mixed case-- ============================================================-- LOWER: normalise email addresses (emails are case-insensitive-- but databases treat them as case-sensitive strings)-- ============================================================SELECT
customer_id,
LOWER(email) AS normalised_email -- 'ALICE@EXAMPLE.COM' → 'alice@example.com'FROM customers;
-- ============================================================-- LENGTH: find any city names that seem suspiciously short-- (useful for catching typos or blank entries)-- ============================================================SELECT
customer_id,
city,
LENGTH(city) AS city_length -- counts every character including spacesFROM customers
WHERELENGTH(city) < 5; -- flag anything under 5 characters as suspect-- ============================================================-- PRACTICAL COMBO: case-insensitive search-- Find all customers in New York regardless of how it was typed-- ============================================================SELECT *
FROM customers
WHERELOWER(city) = 'new york'; -- normalise BOTH sides to lowercase
Output
-- UPPER + GROUP BY result:
city_uppercase | customer_count
---------------|---------------
NEW YORK | 1
LOS ANGELES | 1
CHICAGO | 1
-- LOWER email result:
customer_id | normalised_email
------------|------------------
1 | alice@example.com
2 | bob@example.com
3 | carol@example.com
-- LENGTH < 5 result:
(0 rows — all city names in this dataset are 7+ characters)
In SQL Server, LENGTH() does not exist. Use LEN() instead. Also note that LEN() does NOT count trailing spaces, while MySQL's LENGTH() does. If you're counting spaces intentionally, use DATALENGTH() in SQL Server.
Production Insight
UPPER/LOWER applied to indexed columns in WHERE disables the index.
Create a functional index to restore index usage: CREATE INDEX idx_lower_email ON users (LOWER(email)).
For case-insensitive equality, LOWER(email) = LOWER($input) with a matching functional index is O(log n).
Key Takeaway
LOWER(col) = LOWER($input) forces a full scan unless you have a functional index on LOWER(col).
For case-insensitive lookups, create a functional index that matches the WHERE expression exactly.
LENGTH counts bytes in MySQL for multi-byte UTF8 — use CHAR_LENGTH for true character count.
TRIM, LTRIM and RTRIM — Killing the Invisible Spaces
Spaces you can't see are some of the sneakiest bugs in data work. When a user fills in a web form and accidentally hits the spacebar before their name, that space gets stored in the database. Now ' Alice' and 'Alice' are two different strings. Searches fail. Joins don't match. Reports double-count.
TRIM(text) removes spaces from both the left and right sides of a string. LTRIM only removes spaces on the left (the leading side). RTRIM only removes spaces on the right (the trailing side). Think of it like cutting the bread crusts — you can cut both ends, just the left, or just the right.
This comes up constantly in data imports. When you load data from a CSV file or an external API, strings almost always arrive with extra whitespace. Running TRIM before you process the data is like shaking the crumbs off a tablecloth before you sit down — a small step that prevents a lot of mess.
Modern SQL (SQL:2003 standard) lets you TRIM other characters too, not just spaces — for example trimming a dollar sign off a price string. We'll show that in the code below. PostgreSQL and SQL Server both support this; MySQL added it in version 8.
trim_functions.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
-- ============================================================-- Basic TRIM: remove spaces from both sides of full_name-- ============================================================SELECT
customer_id,
full_name, -- original, messy versionTRIM(full_name) AS clean_name, -- spaces removed from both sidesLENGTH(full_name) AS original_length,
LENGTH(TRIM(full_name)) AS trimmed_length -- shows how many spaces were hidingFROM customers;
-- ============================================================-- LTRIM vs RTRIM: when you only want to cut one side-- Useful if you're processing fixed-width data exports where-- trailing spaces are padding but leading spaces mean something.-- ============================================================SELECT' hello world 'AS original,
LTRIM(' hello world ') AS left_trimmed, -- 'hello world 'RTRIM(' hello world ') AS right_trimmed, -- ' hello world'TRIM(' hello world ') AS fully_trimmed; -- 'hello world'-- ============================================================-- TRIM a specific character (not just spaces)-- Imagine prices were imported with a leading dollar sign-- ============================================================SELECTTRIM(LEADING '$' FROM '$49.99') AS price_no_prefix, -- '49.99'TRIM(TRAILING '!' FROM 'Sale!') AS message_no_bang, -- 'Sale'TRIM(BOTH '*' FROM '***HOT***') AS product_clean; -- 'HOT'-- ============================================================-- PRACTICAL USE: clean names before inserting into a report-- Combine TRIM + UPPER to fully normalise a string-- ============================================================SELECT
customer_id,
UPPER(TRIM(full_name)) AS display_name -- ' alice johnson ' → 'ALICE JOHNSON'FROM customers;
Instead of trimming on every SELECT query (which runs the function millions of times on a large table), use a CHECK constraint or a BEFORE INSERT trigger to strip whitespace when data first enters the database. Clean at the gate, not at every door.
Production Insight
TRIM is not applied automatically on insert — users regularly paste values with surrounding whitespace.
Add a database-level CHECK constraint: CHECK (email = TRIM(email)) to prevent dirty data entering.
For batch imports from CSV or external systems, always run UPDATE table SET col = TRIM(col) as a post-import step.
Key Takeaway
TRIM at write time beats TRIM at read time — prevent dirty data, don't just work around it.
CHECK (col = TRIM(col)) is an elegant constraint that rejects values with leading/trailing spaces.
For imports: UPDATE table SET col = TRIM(col) WHERE col != TRIM(col) as a data quality step.
SUBSTRING, CONCAT and REPLACE — Slicing, Joining and Swapping Text
These three functions let you reshape text rather than just clean it. SUBSTRING pulls a piece out of the middle of a string. CONCAT sticks strings together. REPLACE swaps one piece of text for another. Together they cover almost every text-transformation task you'll face.
SUBSTRING(text, start, length) takes three arguments: the text to work on, the position to start cutting, and how many characters to take. Positions start at 1, not 0 — that trips up programmers coming from JavaScript or Python. So SUBSTRING('London', 1, 3) gives you 'Lon', not 'ond'.
CONCAT(text1, text2, ...) glues strings together. This is useful for building full names from first and last name columns, constructing URLs from parts, or building human-readable labels from codes. The || operator does the same job in PostgreSQL and standard SQL — but MySQL requires the CONCAT function.
REPLACE(text, old_value, new_value) finds every occurrence of old_value inside text and swaps it for new_value. This is powerful for data cleaning — swapping phone number formats, removing unwanted characters, or fixing a batch of mis-entered values all in one UPDATE statement.
substring_concat_replace.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
-- ============================================================-- SUBSTRING: extract just the domain from an email address.-- We know the '@' is always there — find where it is, then-- grab everything after it.---- INSTR(string, search) returns the position of the first-- match. We add 1 to skip past the '@' itself.-- ============================================================SELECT
email,
SUBSTRING(
email,
INSTR(email, '@') + 1, -- start one char AFTER the '@'LENGTH(email) -- grab the rest (generous upper bound)
) AS email_domain -- 'alice@example.com' → 'example.com'FROM customers;
-- ============================================================-- CONCAT: build a full display label from separate columns.-- Notice we add a literal string ' — ' between city and name.-- ============================================================SELECTCONCAT(
TRIM(UPPER(city)), -- 'NEW YORK'
' | ', -- literal separatorTRIM(full_name) -- 'alice johnson'
) AS customer_label
FROM customers;
-- PostgreSQL / standard SQL alternative (same result):-- SELECT TRIM(UPPER(city)) || ' | ' || TRIM(full_name) AS customer_label-- FROM customers;-- ============================================================-- REPLACE: the marketing team re-branded 'example.com' to-- 'mycompany.com' — fix all the email addresses in one shot.-- ============================================================SELECT
email,
REPLACE(
LOWER(email), -- normalise case first so the match is reliable
'example.com', -- the old text to find
'mycompany.com' -- the new text to put in its place
) AS updated_email
FROM customers;
-- ============================================================-- REAL UPDATE using REPLACE (not just a SELECT preview):-- Uncomment only when you're ready — always preview first!-- ============================================================-- UPDATE customers-- SET email = REPLACE(LOWER(email), 'example.com', 'mycompany.com')-- WHERE email LIKE '%example.com%'; -- safety net: only touch matching rows-- ============================================================-- SUBSTRING to extract area code from a stored phone number-- Assumes format: '212-555-0100' — area code is always first 3 chars-- ============================================================SELECT'212-555-0100'AS raw_phone,
SUBSTRING('212-555-0100', 1, 3) AS area_code; -- '212'
Output
-- Email domain extraction:
email | email_domain
--------------------|-------------
ALICE@EXAMPLE.COM | EXAMPLE.COM
bob@example.com | example.com
Carol@Example.com | Example.com
-- CONCAT customer label:
customer_label
--------------------------
NEW YORK | alice johnson
LOS ANGELES | BOB SMITH
CHICAGO | carol white
-- REPLACE email domain:
email | updated_email
--------------------|----------------------
ALICE@EXAMPLE.COM | alice@mycompany.com
bob@example.com | bob@mycompany.com
Carol@Example.com | carol@mycompany.com
-- Area code extraction:
raw_phone | area_code
-------------|----------
212-555-0100 | 212
Watch Out — SUBSTRING Positions Start at 1, Not 0:
Coming from Python or JavaScript? SUBSTRING in SQL counts from position 1, not 0. SUBSTRING('Hello', 1, 1) returns 'H'. If you write SUBSTRING('Hello', 0, 1) you'll get an empty string or unexpected result depending on the database. This bites almost every developer the first time.
Production Insight
SUBSTRING indexes are 0-based in some databases (SQL Server) and 1-based in others (PostgreSQL, MySQL) — verify for your database.
For extracting structured components (area code from phone number), REGEXP_REPLACE/REGEXP_SUBSTR is more robust than SUBSTRING with hardcoded positions.
For large text manipulation, do it in application code — pushing heavy string processing into SQL runs on the database server which is usually a shared resource.
Key Takeaway
SUBSTRING position is 1-based in PostgreSQL/MySQL, 0-based in SQL Server — test your database.
For structured text parsing (extracting domain from email), REGEXP_SUBSTR is more robust than SUBSTRING(col, POSITION('@' IN col) + 1).
Heavy string transformation belongs in application code, not SQL — databases are optimized for data retrieval, not string processing.
● Production incidentPOST-MORTEMseverity: high
User Lookup Failed for 30% of Users Due to Trailing Spaces in Email Column
Symptom
30% of users were locked out after a system migration. Password reset emails reached them (sent to the stored email) but login lookups returned 'user not found'.
Assumption
The login query used WHERE email = $input. The team assumed emails in the database were clean.
Root cause
A legacy CSV import inserted emails with trailing spaces. The login form trimmed user input; the database values had spaces. 'alice@example.com ' != 'alice@example.com' — exact string comparison failed silently.
Fix
UPDATE users SET email = TRIM(email) to clean existing data. Added TRIM() in the login query as an interim fix. Added a CHECK constraint: CHECK (email = TRIM(email)) to prevent future imports from introducing spaces.
Key lesson
Always TRIM text columns at write time, not read time — prevent the dirty data from entering the database
Add CHECK constraints for format rules: CHECK (email = TRIM(email)) prevents leading/trailing spaces
When doing exact string comparisons in authentication, TRIM both sides defensively
Production debug guideDiagnosing silent comparison failures and encoding issues3 entries
Symptom · 01
String comparison returns no rows despite visually matching values
→
Fix
Check for invisible characters: SELECT LENGTH(col), TRIM(col), col FROM table WHERE id = x. If LENGTH != CHAR_LENGTH(TRIM(col)), there are leading/trailing spaces or multi-byte characters. Also check case: col = 'london' won't match 'London' without LOWER(col).
Symptom · 02
LIKE '%pattern%' is slow on a large table
→
Fix
LIKE with a leading wildcard forces a full table scan. For contains-search, create a full-text index: CREATE INDEX idx_fts ON products USING GIN(to_tsvector('english', description)) in PostgreSQL. Use MySQL FULLTEXT index with MATCH...AGAINST for large text search.
Symptom · 03
CONCAT returns NULL when some inputs are NULL
→
Fix
CONCAT in PostgreSQL and standard SQL returns NULL if any argument is NULL. Use COALESCE: COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''). MySQL's CONCAT_WS(separator, ...) ignores NULLs automatically.
Function
What It Does
MySQL Syntax
PostgreSQL Syntax
SQL Server Syntax
Uppercase
Converts all letters to uppercase
UPPER(col)
UPPER(col)
UPPER(col)
Lowercase
Converts all letters to lowercase
LOWER(col)
LOWER(col)
LOWER(col)
String length
Returns character count of string
LENGTH(col)
LENGTH(col)
LEN(col)
Trim both sides
Removes leading and trailing spaces
TRIM(col)
TRIM(col)
TRIM(col)
Trim left only
Removes leading spaces only
LTRIM(col)
LTRIM(col)
LTRIM(col)
Trim right only
Removes trailing spaces only
RTRIM(col)
RTRIM(col)
RTRIM(col)
Extract substring
Pulls a slice of text by position
SUBSTRING(col, start, len)
SUBSTRING(col, start, len)
SUBSTRING(col, start, len)
Concatenate strings
Joins two or more strings together
CONCAT(a, b)
CONCAT(a, b) or a || b
CONCAT(a, b) or a + b
Find position
Returns index of substring in string
INSTR(col, search)
POSITION(search IN col)
CHARINDEX(search, col)
Replace text
Swaps all occurrences of a substring
REPLACE(col, old, new)
REPLACE(col, old, new)
REPLACE(col, old, new)
Key takeaways
1
String functions never change the stored data
they return a transformed copy for that query only. To permanently change data you need an UPDATE statement.
2
Always normalise BOTH sides of a string comparison
WHERE LOWER(city) = 'london' works, but WHERE city = 'london' silently fails when the stored value is 'London' or 'LONDON'.
3
SQL Server uses LEN() and CHARINDEX() where MySQL/PostgreSQL use LENGTH() and INSTR()/POSITION()
the concept is identical, the spelling is different. Check your dialect before debugging.
4
Chain functions from the inside out
UPPER(TRIM(full_name)) trims first, then uppercases. Read nested functions like onion layers — innermost runs first.
Common mistakes to avoid
3 patterns
×
Comparing strings without normalizing case
Symptom
WHERE city = 'london' returns zero rows when the data contains 'London' or 'LONDON' — exact string comparison is case-sensitive in PostgreSQL and SQL Server
Fix
Use LOWER() on both sides: WHERE LOWER(city) = 'london'. For repeated case-insensitive searches, create a functional index: CREATE INDEX idx_lower_city ON addresses (LOWER(city)). Store the data normalized (always lowercase) at write time to avoid the overhead entirely.
×
Using LIKE '%pattern%' for search on large tables
Symptom
Search query is slow — EXPLAIN shows Seq Scan on the full table regardless of existing indexes on the searched column. Performance degrades linearly with table size.
Fix
Full-text search: PostgreSQL: CREATE INDEX idx_fts ON table USING GIN(to_tsvector('english', col)); query with WHERE to_tsvector('english', col) @@ plainto_tsquery('pattern'). MySQL: CREATE FULLTEXT INDEX; query with MATCH(col) AGAINST ('pattern'). LIKE 'prefix%' (no leading wildcard) can still use a regular B-tree index.
×
Concatenating strings without handling NULLs
Symptom
CONCAT(first_name, ' ', last_name) returns NULL for any user where first_name or last_name is NULL — the full name column shows NULL instead of the partial name
Fix
COALESCE every nullable component: COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''). Or use CONCAT_WS(' ', first_name, last_name) in MySQL which skips NULL arguments automatically.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01SENIOR
What is the difference between CHAR_LENGTH and LENGTH in SQL?
Q02SENIOR
Why does LIKE '%pattern%' not use an index, and how would you optimize a...
Q03JUNIOR
How do you safely concatenate a full name from first_name and last_name ...
Q01 of 03SENIOR
What is the difference between CHAR_LENGTH and LENGTH in SQL?
ANSWER
LENGTH in MySQL returns the number of bytes, not characters. For ASCII strings these are equal. For multi-byte UTF-8 characters (Chinese, Arabic, emoji), LENGTH may return a higher number because each character uses 2-4 bytes. CHAR_LENGTH always returns the number of characters regardless of encoding. In PostgreSQL, LENGTH returns characters by default. Example: a string containing a single emoji (4 bytes in UTF-8) — MySQL LENGTH returns 4, CHAR_LENGTH returns 1. For user-facing character counting (form validation, display truncation), always use CHAR_LENGTH.
Q02 of 03SENIOR
Why does LIKE '%pattern%' not use an index, and how would you optimize a full-text search?
ANSWER
A B-tree index stores values in sorted order. LIKE 'prefix%' can use the index because the sorted structure allows a range scan from 'prefix' onwards. LIKE '%pattern%' (leading wildcard) cannot — there's no way to find all values containing 'pattern' in a sorted structure without scanning every entry. Solutions: (1) PostgreSQL: create a GIN index on to_tsvector() and use the @@ operator for full-text search. (2) MySQL: CREATE FULLTEXT INDEX and query with MATCH ... AGAINST. (3) Elasticsearch or Typesense for production search with ranking, typo-tolerance, and faceting. (4) pg_trgm extension in PostgreSQL creates a trigram index that supports LIKE '%pattern%' and is significantly faster than a full scan.
Q03 of 03JUNIOR
How do you safely concatenate a full name from first_name and last_name columns that might be NULL?
ANSWER
CONCAT in standard SQL (and PostgreSQL || operator) returns NULL if any argument is NULL. Safe options: (1) COALESCE each component: COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') — but this produces a leading or trailing space if either is NULL. (2) CONCAT_WS(' ', first_name, last_name) in MySQL — CONCAT_WS skips NULL arguments and doesn't produce extra separators. (3) CASE logic: CASE WHEN first_name IS NOT NULL AND last_name IS NOT NULL THEN first_name || ' ' || last_name WHEN first_name IS NOT NULL THEN first_name ELSE COALESCE(last_name, '') END. The cleanest production pattern is usually to store display names normalized at write time rather than computing them in every query.
01
What is the difference between CHAR_LENGTH and LENGTH in SQL?
SENIOR
02
Why does LIKE '%pattern%' not use an index, and how would you optimize a full-text search?
SENIOR
03
How do you safely concatenate a full name from first_name and last_name columns that might be NULL?
JUNIOR
FAQ · 3 QUESTIONS
Frequently Asked Questions
01
What is the difference between CHAR_LENGTH and LENGTH in SQL?
In MySQL, LENGTH() counts bytes not characters — so a single emoji or accented character (which uses multiple bytes in UTF-8) can count as 2, 3 or 4. CHAR_LENGTH() counts actual characters regardless of byte size. For modern text data with international characters always prefer CHAR_LENGTH() in MySQL. PostgreSQL's LENGTH() counts characters directly so this distinction doesn't apply there.
Was this helpful?
02
Can I use SQL string functions inside a WHERE clause?
Yes, absolutely. WHERE LOWER(email) = 'alice@example.com' is perfectly valid. The function runs on each row as the database filters. The trade-off is performance: wrapping a column in a function usually prevents the database from using an index on that column. For a large table, consider storing a pre-normalised version of the column or using a function-based index if your database supports it.
Was this helpful?
03
Why does CONCAT return NULL when one of my columns is NULL?
In standard SQL and most databases, any string operation involving NULL returns NULL — because NULL means 'unknown', and 'Alice' joined to 'unknown' is still 'unknown'. Fix this with COALESCE: CONCAT(first_name, ' ', COALESCE(last_name, '')) replaces a NULL last_name with an empty string so the full name still returns correctly instead of vanishing.