Aggregate functions collapse multiple rows into a single summary value — COUNT, SUM, AVG, MIN, MAX
COUNT(*) counts every row including NULLs; COUNT(column) skips NULLs silently
AVG also skips NULLs — AVG of (10, 20, NULL) returns 15, not 10
GROUP BY groups rows by a column before aggregating — required when mixing aggregate and non-aggregate columns in SELECT
HAVING filters groups after aggregation; WHERE filters rows before aggregation
Biggest mistake: selecting a non-aggregated column without including it in GROUP BY — most databases throw an error
✦ Definition~90s read
What is SQL Aggregate Functions?
Aggregate functions are SQL operations that collapse multiple rows into a single summary value — they exist because raw row-level data is rarely useful for decision-making. Instead of scanning 10 million transaction records, you want to know the average order value, total revenue, or highest-spending customer.
★
Imagine you have a spreadsheet with 10,000 sales records.
These functions (COUNT, SUM, AVG, MIN, MAX) are the foundation of every reporting query, dashboard, and business metric. Without them, you'd be writing application-level loops over database cursors, which is both slow and error-prone.
In practice, aggregate functions are almost always paired with GROUP BY to produce per-category summaries — average revenue per customer cohort, total sales per region, or max transaction per day. The HAVING clause then filters those grouped results, acting as a WHERE for aggregates.
This trio (aggregate + GROUP BY + HAVING) is the core pattern for any analytical query in SQL, from PostgreSQL to Snowflake to BigQuery.
The critical trap — and the reason your LTV report was off by 23% — is NULL handling. AVG ignores NULLs entirely, dividing only by non-NULL rows. If your revenue column has NULLs for free trials or churned users, AVG silently computes a misleading average.
COUNT(*) counts all rows; COUNT(column) excludes NULLs. SUM returns NULL if all values are NULL, not zero. These behaviors are consistent across every major SQL dialect, but they routinely destroy reports when data pipelines produce unexpected NULLs. Understanding this is the difference between a dashboard that looks right and one that actually is right.
Plain-English First
Imagine you have a spreadsheet with 10,000 sales records. Aggregate functions are the 'Totals' row at the bottom — COUNT tells you how many rows you have, SUM adds them all up, AVG gives you the average, and MIN/MAX find the smallest and largest values. GROUP BY is what lets you get a separate total row for each category, region, or customer rather than just one grand total for everything.
Every real application — whether it's a shopping site showing 'Average rating: 4.7 stars', a bank dashboard showing 'Total spent this month: $1,240', or an HR tool reporting 'Headcount: 342 employees' — uses aggregate functions behind the scenes. They're the bridge between raw row-level data and the summary numbers that drive business decisions.
The rules are simple until NULL gets involved. COUNT(*) and COUNT(column_name) behave completely differently. AVG silently ignores NULL values. GROUP BY requires every non-aggregated column in your SELECT. These aren't obscure edge cases — they're the cause of a surprisingly large percentage of wrong report numbers in production.
What Aggregate Functions Actually Do (And Why They Exist)
A regular SQL query like SELECT name FROM employees returns one row per employee — every single row that matches your WHERE clause. An aggregate function changes that contract entirely. Instead of 'give me all the rows', it says 'give me one summarised value across all the rows.' This is what makes them aggregate — they combine many rows into one.
Five aggregate functions cover the vast majority of real-world analytics: COUNT (how many rows), SUM (total value), AVG (mean value), MIN (smallest value), MAX (largest value). All five share a critical characteristic: they handle NULL values by ignoring them — except COUNT(*), which is the one exception.
aggregate_basics.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Setup: a simple orders tableCREATETABLEorders (
order_id INTPRIMARYKEY,
customer_id INT,
total DECIMAL(8,2),
status VARCHAR(20)
);
INSERTINTO orders VALUES
(1, 101, 49.99, 'completed'),
(2, 102, 120.00, 'completed'),
(3, 101, NULL, 'pending'), -- NULL total: not yet charged
(4, 103, 75.50, 'completed'),
(5, 102, 30.00, 'refunded');
-- Five aggregates on the same columnSELECTCOUNT(*) AS total_rows, -- 5: counts every rowCOUNT(total) AS rows_with_total, -- 4: skips the NULL rowSUM(total) AS revenue, -- 275.49: skips NULLAVG(total) AS avg_order, -- 68.87: 275.49/4, not /5!MIN(total) AS smallest,
MAX(total) AS largest
FROM orders;
AVG Skips NULLs — This Is Almost Never What Stakeholders Expect
AVG(total) on these 5 rows divides by 4, not 5, because NULL is excluded. If the pending order should count as zero revenue, use AVG(COALESCE(total, 0)). If it should genuinely be excluded from the average, the implicit NULL skip is correct — but document it explicitly so the next engineer reading the query understands the intent.
Production Insight
COUNT(*) vs COUNT(column) is the source of countless wrong dashboard numbers in production.
Always run both side by side when debugging a count discrepancy — the difference reveals your NULL row count.
For revenue calculations, always COALESCE NULL numeric columns explicitly rather than relying on implicit skip behavior.
Key Takeaway
Aggregate functions collapse many rows into one summary — that's their entire purpose.
All five functions skip NULLs, except COUNT(*) which counts every row regardless.
Document NULL treatment explicitly in any aggregate query that will be read by business stakeholders.
thecodeforge.io
SQL AVG NULL Behavior — LTV Report Error
Sql Aggregate Functions
COUNT, SUM and AVG — The Most-Used Trio
COUNT comes in two forms with very different behavior. COUNT(*) counts every row in the result set, including rows with NULL values in every column. COUNT(column_name) counts only rows where that specific column is not NULL. This distinction bites constantly in production.
SUM adds all non-NULL values. NULL rows contribute nothing. SUM of a column with all NULLs returns NULL, not zero — this surprises developers who expect 0. Use COALESCE(SUM(column), 0) when zero is the correct return for an all-NULL column.
AVG computes the mean of non-NULL values only. It's mathematically equivalent to SUM(column) / COUNT(column) — both numerator and denominator exclude NULLs.
count_sum_avg_queries.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- Demonstrating the COUNT(*) vs COUNT(column) differenceSELECTCOUNT(*) AS all_orders,
COUNT(total) AS charged_orders,
COUNT(*) - COUNT(total) AS pending_not_charged,
SUM(total) AS total_revenue,
COALESCE(SUM(total), 0) AS revenue_safe, -- 0 if all NULLsAVG(total) AS avg_charged, -- excludes NULLsAVG(COALESCE(total, 0)) AS avg_all -- includes NULLs as zeroFROM orders
WHERE status != 'refunded';
-- COUNT DISTINCT: unique customer countSELECTCOUNT(DISTINCT customer_id) AS unique_customers,
COUNT(customer_id) AS total_orders
FROM orders
WHERE status = 'completed';
COUNT(DISTINCT customer_id) gives you unique customers; COUNT(customer_id) gives you total orders. If Alice placed 3 orders, COUNT includes her 3 times but COUNT DISTINCT includes her once. This is the pattern behind 'unique visitors', 'active users', and 'distinct products ordered'.
Production Insight
SUM of an all-NULL column returns NULL, not zero — wrap with COALESCE(SUM(col), 0) for dashboard totals.
COUNT DISTINCT is expensive on large tables — it requires sorting or hashing all values before counting.
For approximate unique counts at scale, consider HyperLogLog (available in PostgreSQL, Redis, BigQuery) which trades 1% accuracy for O(1) memory.
Key Takeaway
COUNT(*) ≠ COUNT(column) — the difference is exactly the number of NULL rows in that column.
SUM returns NULL (not zero) when all input values are NULL — always COALESCE for dashboard totals.
COUNT DISTINCT is the right tool for unique user/product counts but is expensive — index the column.
MIN and MAX — Finding the Extremes in Your Data
MIN and MAX are the simplest aggregate functions to understand, and also the most underestimated. They don't just work on numbers — they work on dates, times, and text too. MIN on a date column gives the earliest date. MAX on a VARCHAR column gives the last value alphabetically. This makes them useful for pattern like 'first order date', 'most recent login', and 'last employee hired'.
min_max_all_five_aggregates.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
-- MIN/MAX on dates: first and most recent order per customerSELECT
customer_id,
COUNT(*) AS order_count,
MIN(created_at) AS first_order_date,
MAX(created_at) AS most_recent_order,
SUM(total) AS lifetime_value,
MAX(total) AS largest_single_order
FROM orders
WHERE status = 'completed'GROUPBY customer_id
ORDERBY lifetime_value DESC;
Like all aggregate functions, MIN and MAX skip NULL values. If a column has 4 values and one NULL, MIN and MAX operate on the 4 non-NULL values. This is usually correct behavior for date columns (NULLs typically mean 'not set yet') but worth being explicit about in documentation.
Production Insight
MIN(created_at) per customer_id is the standard pattern for 'acquisition date' in cohort analysis.
MAX(updated_at) identifies which rows were most recently modified — useful for incremental ETL pipelines.
For the single row with the minimum or maximum value (not just the value itself), use ORDER BY + LIMIT 1 or a window function — MIN/MAX alone can't return the full row.
Key Takeaway
MIN and MAX work on dates and text, not just numbers — first_order_date and last_login are MIN/MAX patterns.
Neither function returns the full row — use ORDER BY + LIMIT 1 or ROW_NUMBER() when you need the whole record.
Both skip NULLs — document explicitly if NULL means something specific in your column.
GROUP BY — Where Aggregate Functions Become Truly Powerful
Everything so far has given us a single row summarising the whole table. GROUP BY changes that: instead of one grand total, it calculates a separate aggregate for each unique value in the grouping column. This is what produces 'revenue by category', 'orders per customer', 'signups per month'.
The rule is strict: every column in SELECT must either be in GROUP BY or wrapped in an aggregate function. You cannot SELECT a column that isn't in GROUP BY and isn't aggregated — it's undefined which row's value should appear for the group.
group_by_with_aggregates.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Revenue and order count by statusSELECT
status,
COUNT(*) AS order_count,
SUM(total) AS total_revenue,
AVG(total) AS avg_order_value,
MIN(total) AS smallest_order,
MAX(total) AS largest_order
FROM orders
GROUPBY status
ORDERBY total_revenue DESC;
-- HAVING: only show statuses with 2 or more ordersSELECT
status,
COUNT(*) AS order_count,
SUM(total) AS revenue
FROM orders
GROUPBY status
HAVINGCOUNT(*) >= 2-- filter AFTER aggregationORDERBY revenue DESC;
Output
status | order_count | total_revenue | avg_order_value
WHERE runs before GROUP BY — it removes individual rows. HAVING runs after GROUP BY — it removes entire groups. You cannot use aggregate functions in WHERE. HAVING COUNT() > 5 is valid; WHERE COUNT() > 5 throws an error. Use both in the same query when you need to: WHERE to pre-filter rows, HAVING to post-filter groups.
Production Insight
GROUP BY without an ORDER BY returns groups in undefined order — always add ORDER BY for consistent report output.
Grouping by date truncated to a period (DATE_TRUNC('month', created_at)) is the standard pattern for monthly trend reports.
For large tables, GROUP BY on an unindexed column forces a hash aggregate or sort — add an index on the GROUP BY column for repeated report queries.
Key Takeaway
GROUP BY produces one aggregate row per unique value — the foundation of every summary report.
Every SELECT column must be in GROUP BY or an aggregate function — this is a hard rule, not a style preference.
WHERE filters rows before grouping; HAVING filters groups after aggregation — they serve different purposes.
HAVING — The WHERE Clause for Aggregated Data
You can't use WHERE on aggregate results. That's not a bug, it's a SQL evaluation order constraint. WHERE filters rows before aggregation happens. Once COUNT(), SUM(), or AVG() collapse your rows into a single value per group, WHERE has already finished its job.
HAVING is your escape hatch. It runs after GROUP BY and aggregation, letting you filter on the computed result. Think of it as WHERE for groups, not rows. If you want departments where average salary exceeds $80,000, HAVING AVG(salary) > 80000 is the only way. Writing that filter in WHERE would silently fail or return nonsense.
The catch: HAVING forces a full scan of all groups before filtering. On large datasets with many groups, this hurts. Use WHERE to eliminate rows early whenever possible. Push filtering down to WHERE before aggregation, then only use HAVING for post-aggregation conditions. Your query planner will thank you.
DepartmentSalaryAnalysis.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// io.thecodeforge — database tutorial
-- Find departments with above-average total payrollSELECT
department_id,
COUNT(employee_id) AS headcount,
SUM(salary) AS total_payroll,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
-- WHERE happens here, before aggregationWHERE termination_date ISNULLGROUPBY department_id
-- HAVING happens after aggregationHAVINGSUM(salary) > 500000ORDERBY total_payroll DESC;
HAVING without GROUP BY is valid SQL but rarely what you want. It treats the entire result set as one group. You'll get a single row back with no way to slice further. Debugging that at 2 AM is not fun.
Key Takeaway
WHERE filters rows, HAVING filters groups. Mix them up and your queries either break silently or run 10x slower than they should.
NULL Handling — The Silent Data Poison
Aggregate functions and NULL have a complicated relationship. COUNT(*) counts everything, including rows where every column is NULL. COUNT(column_name) skips NULLs entirely. SUM(), AVG(), MIN(), MAX() all silently discard NULL values before computing. Your average might be calculated over 80% of your data and you'd never know.
This becomes a production nightmare when NULLs represent missing data versus actual zeros. A NULL salary could mean "not yet entered" or "contractor with no salary." Either way, AVG(salary) treats both identically — it skips them. The result looks fine but misrepresents reality. Suddenly your quarterly compensation report shows an average that's 15% too high because entry-level positions had NULLs in the pipeline.
Mitigation: Always check NULL percentages before running aggregates. Use COUNT(*) vs COUNT(column) to gauge nullability. For critical reports, consider COALESCE() or explicit NULL handling. Better yet, enforce NOT NULL constraints at the schema level if NULLs aren't semantically valid. Database design debt shows up first in aggregate queries.
NullExposureCheck.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
// io.thecodeforge — database tutorial
-- Spot NULL poisoning before it hits your aggregatesSELECTCOUNT(*) AS total_rows,
COUNT(employee_id) AS ids_present,
COUNT(salary) AS salaries_present,
COUNT(*) - COUNT(salary) AS missing_salaries,
ROUND(100.0 * COUNT(salary) / COUNT(*), 1) AS salary_completeness_pct,
ROUND(AVG(salary), 2) AS naive_average,
ROUND(AVG(COALESCE(salary, 0)), 2) AS zero_filled_average
FROM employees
WHERE department_id = 17;
Run a NULL audit once per quarter on columns you aggregate. A 5% NULL rate on a 10-million-row table means your averages are built on only 9.5M rows. That's a half-million phantom rows of uncertainty.
Key Takeaway
Aggregate functions are NULL-agnostic by default. That's a feature until it's a bug. Always know your NULL ratio before trusting an average.
Related Content
Understanding SQL aggregate functions is essential, but they don't exist in isolation. To build robust data pipelines, you should pair them with window functions like ROW_NUMBER() and RANK() for running totals without collapsing rows. JOINs are critical: aggregates often query multiple tables, so mastering INNER JOIN and LEFT JOIN ensures your SUM or COUNT reflects correct relationships. Subqueries let you nest aggregates — for instance, finding departments above average salary. Finally, indexes on grouped columns drastically improve performance; without them, large datasets turn aggregate queries into bottlenecks. Explore these topics to move beyond basic reporting into performant, analytic SQL.
Related_Aggregates.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
// io.thecodeforge — database tutorial
// 25 lines max
WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUPBY department_id
)
SELECT d.department_name, da.avg_salary
FROM departments d
JOIN dept_avg da ON d.id = da.department_id
WHERE da.avg_salary > (
SELECTAVG(salary) FROM employees
);
Output
department_name | avg_salary
Engineering | 95000.00
Sales | 82000.00
Production Trap:
Aggregates on unindexed columns in JOINs cause full table scans. Always verify execution plans before deploying.
Key Takeaway
Pair aggregates with window functions and proper indexing for scalable queries.
Feedback
Your input directly shapes TheCodeForge.io's database content. Did the NULL handling section clarify how COUNT(*) differs from COUNT(column)? Was the HAVING example with real sales data intuitive, or did it need more edge cases? I want to know if the code snippets run cleanly in your environment — particularly the MIN/MAX on timestamps, which often trips up junior engineers. Drop a comment below or open an issue in our GitHub repo. Specific feedback like 'Show how AVG ignores NULLs in salary calculations' helps us refine future tutorials. Every response gets a direct reply from our senior engineering team. We build these guides for you, not for search engines.
Feedback_Example.sqlSQL
1
2
3
4
5
6
7
8
9
10
// io.thecodeforge — database tutorial
// 25 lines max
SELECT
department_id,
COUNT(*) AS total_rows,
COUNT(salary) AS non_null_salaries,
AVG(salary) AS avg_salary
FROM employees
WHERE department_id IN (10, 20)
GROUPBY department_id;
COUNT(*) includes rows with all-NULL columns; COUNT(column) does not. Misunderstanding this skews reporting metrics.
Key Takeaway
Test aggregate behavior with NULL data in your staging environment before production deployment.
Conclusion
SQL aggregate functions transform raw rows into actionable insights — but only when you apply them with precision. We've covered the mechanics of COUNT, SUM, AVG, MIN, MAX, and the critical grouping duo GROUP BY and HAVING. The key lesson: aggregates collapse data, so always ask 'What am I losing?' when grouping. Beware NULLs: they silently exclude rows from AVG and SUM unless you use COALESCE. Start your next query by identifying the business question, then pick the aggregate that answers it directly. Run EXPLAIN ANALYZE on heavy aggregations. Master these tools, and you'll move from writing queries to designing data narratives.
Conclusion_Check.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
// io.thecodeforge — database tutorial
// 25 lines max
SELECTEXTRACT(YEARFROM order_date) AS year,
COUNT(DISTINCT customer_id) AS unique_buyers,
ROUND(AVG(total_amount)::numeric, 2) AS avg_order_value
FROM orders
WHERE status = 'completed'GROUPBY year
HAVINGCOUNT(*) > 100ORDERBY year DESC;
Output
year | unique_buyers | avg_order_value
2024 | 342 | 125.50
2023 | 298 | 118.75
Production Trap:
HAVING without GROUP BY treats entire table as one group — rarely intended. Always verify your GROUP BY logic.
Key Takeaway
Start with the business question, then choose your aggregate. Always test with edge cases.
● Production incidentPOST-MORTEMseverity: high
Customer Lifetime Value Report Was 23% Higher Than Reality
Symptom
Monthly business review showed customer LTV 23% above the finance team's number. Both teams were querying the same database. No data discrepancy — just different queries.
Assumption
The analytics team used AVG(lifetime_value) and assumed NULL lifetime_value meant the customer had zero value. Finance used SUM(lifetime_value) / COUNT(*) which treated NULLs as zero in the denominator.
Root cause
AVG(lifetime_value) skips NULL rows entirely — it only averages the rows where a value exists. Customers who had never completed a purchase had NULL lifetime_value. The analytics query divided by the number of paying customers only; finance divided by all customers including nulls.
Fix
Changed to AVG(COALESCE(lifetime_value, 0)) to treat non-purchasing customers as zero value. Added a comment explaining the NULL treatment so the query is unambiguous to future readers.
Key lesson
AVG skips NULLs — this is mathematically valid but rarely what business stakeholders expect
Explicitly COALESCE NULL numeric columns before aggregating when zero is the intended value
When two queries on the same data return different aggregates, NULL treatment is the first thing to check
Production debug guideDiagnosing wrong counts, inflated averages, and GROUP BY errors4 entries
Symptom · 01
COUNT returns fewer rows than expected
→
Fix
You are using COUNT(column_name) on a column with NULLs. Switch to COUNT() to count all rows, or run SELECT COUNT(), COUNT(column_name) side by side to see the gap. The difference is your NULL row count.
Symptom · 02
AVG is higher than expected
→
Fix
NULLs are being excluded from the average. Run SELECT COUNT(), COUNT(column_name), SUM(column_name) to verify. If COUNT() != COUNT(column_name), you have NULLs. Use AVG(COALESCE(column_name, 0)) if zero is the correct treatment.
Symptom · 03
Error: column must appear in GROUP BY or aggregate function
→
Fix
You have a non-aggregated column in SELECT that isn't in GROUP BY. Either add it to GROUP BY, wrap it in an aggregate (MAX, MIN), or remove it from SELECT. PostgreSQL and SQL Server are strict about this; MySQL's ONLY_FULL_GROUP_BY mode controls it.
Symptom · 04
HAVING filter isn't working as expected
→
Fix
Check if you meant WHERE. HAVING filters groups after aggregation — it can use aggregate conditions like HAVING COUNT(*) > 5. If you're filtering on a non-aggregated column, use WHERE before GROUP BY.
Function
What It Computes
NULL Behavior
Works On
COUNT(*)
Total row count
Includes NULLs
Any table
COUNT(col)
Non-NULL row count
Skips NULLs
Any column
SUM(col)
Total of all values
Skips NULLs; returns NULL if all NULL
Numeric
AVG(col)
Mean of all values
Skips NULLs — divides by non-NULL count
Numeric
MIN(col)
Smallest value
Skips NULLs
Numeric, date, text
MAX(col)
Largest value
Skips NULLs
Numeric, date, text
Key takeaways
1
COUNT(*) counts every row including NULLs; COUNT(column) skips NULLs
the difference equals your NULL row count.
2
All aggregate functions except COUNT(*) skip NULL values
document NULL treatment explicitly for any business metric.
3
AVG skips NULLs in the denominator
use AVG(COALESCE(col, 0)) when NULL should mean zero.
4
GROUP BY requires every non-aggregated SELECT column to be in the GROUP BY clause
this is a hard rule.
5
WHERE filters rows before GROUP BY; HAVING filters groups after
aggregate functions belong in HAVING, not WHERE.
Common mistakes to avoid
3 patterns
×
Using COUNT(column) when COUNT(*) is intended
Symptom
Row count is lower than expected — the discrepancy equals the number of NULL rows in that column, but no error is thrown
Fix
Use COUNT(*) to count all rows regardless of NULLs. Only use COUNT(column_name) when you explicitly want to count non-NULL entries in that specific column. Run both side by side to expose the difference.
×
Selecting a non-aggregated column without adding it to GROUP BY
Symptom
Error: column must appear in GROUP BY clause or be used in an aggregate function (PostgreSQL) or column is not in SELECT list (MySQL in strict mode)
Fix
Every column in SELECT must either appear in GROUP BY or be wrapped in an aggregate function (MAX, MIN, COUNT, SUM, AVG). Add the column to GROUP BY if you want one row per unique combination, or remove it from SELECT if you don't need it.
×
Using WHERE to filter aggregate results instead of HAVING
Symptom
Error: aggregate functions are not allowed in WHERE — WHERE SUM(amount) > 1000 throws an error immediately
Fix
Use HAVING to filter on aggregate results: HAVING SUM(amount) > 1000. WHERE filters individual rows before grouping; HAVING filters groups after aggregation. Both can appear in the same query.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01JUNIOR
What is the difference between COUNT(*) and COUNT(column_name)?
Q02JUNIOR
What is the difference between WHERE and HAVING? Why can't you use WHERE...
Q03SENIOR
A column contains five values — 10, 20, NULL, 30, NULL. What does AVG re...
Q01 of 03JUNIOR
What is the difference between COUNT(*) and COUNT(column_name)?
ANSWER
COUNT() counts every row in the result set, including rows where every column is NULL. COUNT(column_name) counts only rows where that specific column has a non-NULL value — NULL rows are silently skipped. If a table has 100 rows and 20 have NULL in the email column, COUNT() returns 100 and COUNT(email) returns 80. The difference tells you the NULL count: COUNT() - COUNT(email) = 20. In practice, use COUNT() when you want row counts and COUNT(column) only when you explicitly want to exclude NULLs from the count.
Q02 of 03JUNIOR
What is the difference between WHERE and HAVING? Why can't you use WHERE to filter aggregate results?
ANSWER
WHERE filters individual rows before GROUP BY runs — it operates on raw row data. HAVING filters groups after GROUP BY and aggregation — it operates on aggregate results. SQL executes in logical order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. When WHERE runs, GROUP BY hasn't happened yet, so aggregate values don't exist — this is why WHERE SUM(amount) > 100 throws an error. HAVING runs after GROUP BY, so HAVING SUM(amount) > 100 works correctly. You can use both in the same query: WHERE filters rows before grouping, HAVING filters groups after.
Q03 of 03SENIOR
A column contains five values — 10, 20, NULL, 30, NULL. What does AVG return?
ANSWER
AVG returns 20, not 12. AVG skips NULL values and computes the mean of non-NULL values only: (10 + 20 + 30) / 3 = 20. It does not treat NULL as zero. If zero is the intended value for NULL rows, use AVG(COALESCE(column, 0)) which returns (10 + 20 + 0 + 30 + 0) / 5 = 12. This distinction matters enormously for business metrics — AVG customer lifetime value, average order value, and average response times can all be significantly inflated if NULL rows are silently excluded rather than counted as zero.
01
What is the difference between COUNT(*) and COUNT(column_name)?
JUNIOR
02
What is the difference between WHERE and HAVING? Why can't you use WHERE to filter aggregate results?
JUNIOR
03
A column contains five values — 10, 20, NULL, 30, NULL. What does AVG return?
SENIOR
FAQ · 3 QUESTIONS
Frequently Asked Questions
01
Can I use multiple aggregate functions in the same SELECT?
Yes. SELECT COUNT(*), SUM(total), AVG(total), MIN(total), MAX(total) FROM orders is perfectly valid. All five aggregate functions run over the same set of rows and return their respective summary values in a single result row.
Was this helpful?
02
What does SUM return if all values in the column are NULL?
SUM returns NULL, not zero. This catches developers off guard when building dashboards — an empty period returns NULL rather than 0. Wrap with COALESCE: COALESCE(SUM(column), 0) to return zero when no non-NULL rows exist.
Was this helpful?
03
How do I count only rows that match a condition within an aggregate?
Use COUNT with a CASE expression: COUNT(CASE WHEN status = 'completed' THEN 1 END) counts only completed orders. This is the conditional count pattern — also works with SUM: SUM(CASE WHEN status = 'completed' THEN total ELSE 0 END). Some databases support COUNT(column) FILTER (WHERE condition) syntax as a cleaner alternative.