Home Database SQL for Beginners: Query, Filter and Manage Data From Scratch

SQL for Beginners: Query, Filter and Manage Data From Scratch

In Plain English 🔥
Imagine a massive filing cabinet full of folders, and each folder holds a spreadsheet of information — one for customers, one for orders, one for products. SQL is simply the language you use to talk to that filing cabinet: 'Hey, give me every customer who spent more than $100 last month.' The cabinet hands you exactly that list. No programming degree needed — SQL reads almost like plain English, and that's entirely by design.
⚡ Quick Answer
Imagine a massive filing cabinet full of folders, and each folder holds a spreadsheet of information — one for customers, one for orders, one for products. SQL is simply the language you use to talk to that filing cabinet: 'Hey, give me every customer who spent more than $100 last month.' The cabinet hands you exactly that list. No programming degree needed — SQL reads almost like plain English, and that's entirely by design.

Every app you've ever used — Instagram, Netflix, your bank — runs on a database humming quietly in the background. When Netflix recommends a show, a database is being asked a question. When your bank balance updates, a database is being changed. SQL (Structured Query Language) is the universal language for doing both of those things, and it has been for over 50 years. Learning it is one of the highest-ROI skills a developer — or honestly anyone who works with data — can pick up.

Before SQL existed, retrieving data from a computer system meant writing complex custom programs for every single question you wanted to ask. IBM researchers in the 1970s asked a radical question: what if non-programmers could just ask the data a question in something close to plain English? SQL was born from that idea. It solves the problem of making structured data accessible — whether you're a data analyst, a backend engineer, a product manager, or a founder trying to understand your users.

By the end of this article you'll understand what a database and a table actually are, you'll be able to write SELECT queries to fetch data, use WHERE clauses to filter it, sort results with ORDER BY, and combine conditions with AND and OR. You'll also know the most common traps beginners fall into and exactly how to sidestep them. Let's build this up from the ground floor.

What a Database and a Table Actually Are (No Hand-Waving)

A database is an organised collection of data stored on a computer so it can be retrieved quickly and reliably. Think of it as the filing cabinet from the analogy — but one that can hold millions of folders and find any single piece of information in milliseconds.

Inside a database, data lives in tables. A table is exactly what the word suggests: rows and columns, just like a spreadsheet. Each column has a name and a data type — for example, a 'price' column stores numbers, a 'customer_name' column stores text. Each row is one complete record — one customer, one product, one transaction.

When tables are related to each other — for example, an 'orders' table references a 'customers' table — the whole system is called a Relational Database. SQL is the language designed specifically for relational databases.

Popular relational databases you'll hear about include PostgreSQL, MySQL, SQLite, and Microsoft SQL Server. The good news: SQL is nearly identical across all of them. Learn it once, use it everywhere. The small differences (called 'dialects') only matter for advanced features.

create_bookstore_tables.sql · SQL
1234567891011121314151617181920212223242526272829303132333435363738
-- ============================================================
-- We're building a tiny bookstore database from scratch.
-- Two tables: 'authors' and 'books'.
-- Run this in any SQL environment (SQLite, MySQL, PostgreSQL).
-- ============================================================

-- Create the authors table first, because books will reference it.
CREATE TABLE authors (
    author_id   INTEGER PRIMARY KEY,   -- unique identifier for each author
    full_name   TEXT    NOT NULL,       -- author's full name; NOT NULL means it's required
    country     TEXT                    -- country of origin; optional field
);

-- Create the books table.
CREATE TABLE books (
    book_id     INTEGER PRIMARY KEY,
    title       TEXT    NOT NULL,
    author_id   INTEGER NOT NULL,       -- links back to the authors table
    genre       TEXT,
    price       REAL    NOT NULL,       -- REAL means a decimal number (e.g. 12.99)
    published   INTEGER                -- year published, stored as a number
);

-- Insert four authors into the authors table.
INSERT INTO authors (author_id, full_name, country) VALUES
    (1, 'George Orwell',        'United Kingdom'),
    (2, 'Gabriel García Márquez', 'Colombia'),
    (3, 'Chimamanda Ngozi Adichie', 'Nigeria'),
    (4, 'Haruki Murakami',      'Japan');

-- Insert six books into the books table.
INSERT INTO books (book_id, title, author_id, genre, price, published) VALUES
    (1, 'Nineteen Eighty-Four',         1, 'Dystopian',  9.99,  1949),
    (2, 'Animal Farm',                  1, 'Satire',     7.49,  1945),
    (3, 'One Hundred Years of Solitude',2, 'Magical Realism', 12.99, 1967),
    (4, 'Americanah',                   3, 'Literary Fiction', 10.99, 2013),
    (5, 'Purple Hibiscus',              3, 'Literary Fiction',  8.99, 2003),
    (6, 'Norwegian Wood',               4, 'Literary Fiction', 11.50, 1987);
▶ Output
-- No visible output for CREATE TABLE and INSERT statements.
-- The database engine confirms each statement silently.
-- In most SQL tools you'll see: 'Query OK' or '4 rows affected' etc.
-- The data now lives in the database, ready to be queried.
🔥
Why PRIMARY KEY matters:A PRIMARY KEY is a column (or set of columns) that uniquely identifies every row. Think of it as a social security number for your data — no two rows can share the same value, and it can never be empty. Without it, you'd have no reliable way to point to a specific row later.

Your First SQL Query — SELECT, FROM, and WHERE Explained

SELECT is the most important SQL command you'll ever learn. It answers the question: 'Show me data.' The basic structure is almost a sentence: SELECT [what columns] FROM [which table].

The asterisk (*) is a wildcard meaning 'all columns'. Use it when exploring data. In production code, always name the columns explicitly — it's faster and far easier to read six months later.

The WHERE clause is how you filter. Without WHERE, SQL returns every single row in the table. WHERE adds a condition that each row must pass to be included in the results. Think of it as a bouncer at the door: only rows that satisfy the condition get in.

Conditions in WHERE can use these comparison operators: = (equals), != or <> (not equals), > (greater than), < (less than), >= (greater than or equal), <= (less than or equal). You can also combine multiple conditions using AND (both must be true) and OR (either can be true).

select_and_filter_books.sql · SQL
123456789101112131415161718192021222324252627282930313233343536
-- ============================================================
-- Querying the bookstore database we created in the last step.
-- Run each query block one at a time to see the output.
-- ============================================================

-- Query 1: Get every column from every row in the books table.
-- Good for exploration, but avoid SELECT * in real apps.
SELECT * FROM books;

-- Query 2: Get only the title and price — we don't need everything.
SELECT title, price
FROM books;

-- Query 3: Filter with WHERE — only books cheaper than $10.00
SELECT title, price
FROM books
WHERE price < 10.00;

-- Query 4: Combine conditions with AND
-- Only books under $10 that were also published after 1980
SELECT title, price, published
FROM books
WHERE price < 10.00
  AND published > 1980;

-- Query 5: Use OR — books that are either Satire OR Dystopian genre
SELECT title, genre
FROM books
WHERE genre = 'Satire'
   OR genre = 'Dystopian';

-- Query 6: A tidier way to write the OR above, using IN
-- IN checks if a value matches any item in a list
SELECT title, genre
FROM books
WHERE genre IN ('Satire', 'Dystopian');
▶ Output
-- Query 1 output (all rows, all columns):
-- book_id | title | author_id | genre | price | published
-- 1 | Nineteen Eighty-Four | 1 | Dystopian | 9.99 | 1949
-- 2 | Animal Farm | 1 | Satire | 7.49 | 1945
-- 3 | One Hundred Years of Solitude | 2 | Magical Realism | 12.99 | 1967
-- 4 | Americanah | 3 | Literary Fiction | 10.99 | 2013
-- 5 | Purple Hibiscus | 3 | Literary Fiction | 8.99 | 2003
-- 6 | Norwegian Wood | 4 | Literary Fiction | 11.50 | 1987

-- Query 3 output (price < 10.00):
-- title | price
-- Nineteen Eighty-Four | 9.99
-- Animal Farm | 7.49
-- Purple Hibiscus | 8.99

-- Query 4 output (price < 10.00 AND published > 1980):
-- title | price | published
-- Purple Hibiscus | 8.99 | 2003

-- Query 6 output (Satire OR Dystopian):
-- title | genre
-- Nineteen Eighty-Four | Dystopian
-- Animal Farm | Satire
⚠️
Watch Out: AND vs OR Operator PrecedenceSQL evaluates AND before OR, just like multiplication before addition in maths. If you write WHERE price < 10 OR genre = 'Satire' AND published > 1960, SQL reads the AND part first. Always use parentheses to make your intent explicit: WHERE price < 10 OR (genre = 'Satire' AND published > 1960). This single habit will save you hours of debugging wrong results.

Sorting, Limiting and Shaping Results with ORDER BY and LIMIT

Fetching data is one thing — getting it back in a useful order is another. ORDER BY sorts your results by one or more columns. ASC means ascending (A→Z, 0→9) and is the default. DESC means descending (Z→A, 9→0).

You can sort by multiple columns: ORDER BY genre ASC, price DESC sorts alphabetically by genre first, then within each genre sorts by price from highest to lowest. This is enormously useful for leaderboards, product listings, and reports.

LIMIT controls how many rows come back. Databases can hold millions of rows — you almost never want all of them at once. LIMIT 5 gives you the top five. Pair it with ORDER BY and you can answer questions like 'What are the 3 most expensive books?' in a single clean query.

ALIAS (the AS keyword) lets you rename a column in the output. This is purely cosmetic — it doesn't change the database — but it makes results far more readable, especially when column names are long or ambiguous.

sort_and_limit_books.sql · SQL
123456789101112131415161718192021222324252627282930313233343536
-- ============================================================
-- Sorting, limiting and renaming output columns.
-- These techniques are used in virtually every real-world query.
-- ============================================================

-- Query 1: Sort all books by price, cheapest first (ASC is default).
SELECT title, price
FROM books
ORDER BY price ASC;

-- Query 2: Sort by price, most expensive first.
SELECT title, price
FROM books
ORDER BY price DESC;

-- Query 3: The 3 most expensive books — combine ORDER BY with LIMIT.
SELECT title, price
FROM books
ORDER BY price DESC
LIMIT 3;

-- Query 4: Sort by genre alphabetically, then by price within each genre.
-- This is multi-column sorting.
SELECT title, genre, price
FROM books
ORDER BY genre ASC, price ASC;

-- Query 5: Use AS to give columns friendlier names in the output.
-- 'title' becomes 'Book Title', 'price' becomes 'Price (USD)'.
-- Notice: multi-word aliases need to be wrapped in double quotes.
SELECT title   AS "Book Title",
       price   AS "Price (USD)",
       published AS "Year"
FROM books
ORDER BY price DESC
LIMIT 4;
▶ Output
-- Query 1 output (cheapest first):
-- title | price
-- Animal Farm | 7.49
-- Purple Hibiscus | 8.99
-- Nineteen Eighty-Four | 9.99
-- Americanah | 10.99
-- Norwegian Wood | 11.50
-- One Hundred Years of Solitude | 12.99

-- Query 3 output (top 3 most expensive):
-- title | price
-- One Hundred Years of Solitude | 12.99
-- Norwegian Wood | 11.50
-- Americanah | 10.99

-- Query 5 output (aliased columns, top 4 by price):
-- Book Title | Price (USD) | Year
-- One Hundred Years of Solitude | 12.99 | 1967
-- Norwegian Wood | 11.50 | 1987
-- Americanah | 10.99 | 2013
-- Nineteen Eighty-Four | 9.99 | 1949
⚠️
Pro Tip: Always LIMIT when exploring unknown dataIf you're querying a production table for the first time and you have no idea how many rows it contains, always add LIMIT 10 to your SELECT. Running SELECT * FROM transactions on a table with 50 million rows without a LIMIT will either time out, crash your client, or make your DBA very unhappy. Explore first, then remove the limit once you know what you're working with.

Aggregate Functions — COUNT, SUM, AVG, MIN, MAX and GROUP BY

So far every query has returned individual rows. But sometimes you don't want individual records — you want a summary. How many books does each author have? What's the average price per genre? What's the most expensive book? Aggregate functions answer these questions by collapsing many rows into a single calculated value.

The five core aggregate functions: COUNT() counts rows, SUM() adds up numeric values, AVG() calculates the average, MIN() finds the smallest value, MAX() finds the largest.

Aggregate functions become truly powerful when combined with GROUP BY. GROUP BY splits your table into groups (one group per unique value in a column) and then runs the aggregate function on each group separately. The result is one summary row per group.

HAVING is like WHERE but for aggregated results. You can't use WHERE to filter on a COUNT or SUM because those values don't exist until after grouping happens. HAVING runs after the groups are formed, so it can filter on them.

aggregate_and_group_books.sql · SQL
123456789101112131415161718192021222324252627282930313233343536373839404142434445
-- ============================================================
-- Aggregate functions and GROUP BY on our bookstore data.
-- This is where SQL starts to feel genuinely powerful.
-- ============================================================

-- Query 1: How many books are in the entire table?
-- COUNT(*) counts every row regardless of NULL values.
SELECT COUNT(*) AS total_books
FROM books;

-- Query 2: What is the average price of all books?
SELECT ROUND(AVG(price), 2) AS average_price
FROM books;
-- ROUND(value, 2) rounds the result to 2 decimal places.

-- Query 3: Most expensive and cheapest single book.
SELECT MAX(price) AS most_expensive,
       MIN(price) AS cheapest
FROM books;

-- Query 4: How many books does each author have?
-- GROUP BY splits the table into one group per author_id,
-- then COUNT(*) counts the rows in each group.
SELECT author_id,
       COUNT(*) AS number_of_books
FROM books
GROUP BY author_id
ORDER BY number_of_books DESC;

-- Query 5: Average price per genre.
SELECT genre,
       COUNT(*)            AS book_count,
       ROUND(AVG(price),2) AS avg_price
FROM books
GROUP BY genre
ORDER BY avg_price DESC;

-- Query 6: HAVING — only show genres that have MORE than 1 book.
-- We cannot use WHERE here because book_count doesn't exist yet
-- at the point WHERE runs. HAVING runs AFTER GROUP BY.
SELECT genre,
       COUNT(*) AS book_count
FROM books
GROUP BY genre
HAVING book_count > 1;
▶ Output
-- Query 1 output:
-- total_books
-- 6

-- Query 2 output:
-- average_price
-- 10.16

-- Query 3 output:
-- most_expensive | cheapest
-- 12.99 | 7.49

-- Query 4 output (books per author):
-- author_id | number_of_books
-- 3 | 2 (Chimamanda — Americanah + Purple Hibiscus)
-- 1 | 2 (Orwell — 1984 + Animal Farm)
-- 2 | 1
-- 4 | 1

-- Query 5 output (avg price per genre):
-- genre | book_count | avg_price
-- Magical Realism | 1 | 12.99
-- Literary Fiction | 3 | 10.49
-- Dystopian | 1 | 9.99
-- Satire | 1 | 7.49

-- Query 6 output (genres with more than 1 book):
-- genre | book_count
-- Literary Fiction | 3
🔥
Interview Gold: WHERE vs HAVINGThis is one of the most common SQL interview questions at every level. The rule is clean and memorable: WHERE filters individual rows BEFORE grouping happens. HAVING filters groups AFTER grouping happens. You literally cannot use WHERE to filter on an aggregate like COUNT(*) or AVG() — the database engine will throw an error because those values don't exist yet at the WHERE stage.
FeatureWHEREHAVING
What it filtersIndividual rowsGroups of rows (post-aggregation)
When it runsBefore GROUP BYAfter GROUP BY
Works with aggregates?No — causes an errorYes — designed for it
Example use caseWHERE price > 10HAVING COUNT(*) > 2
Required clauseNo — optionalNo — only needed with GROUP BY
Can replace the other?NoNo — they serve different stages

🎯 Key Takeaways

  • SQL reads like English on purpose — SELECT [columns] FROM [table] WHERE [condition] is the skeleton of almost every query you'll ever write.
  • NULL is not a value — it means 'unknown'. Never compare it with = or !=. Always use IS NULL or IS NOT NULL, or you'll silently filter out data that should be there.
  • WHERE filters rows before grouping; HAVING filters groups after grouping — confusing the two is one of the most common SQL errors at every experience level.
  • Always add LIMIT when exploring an unfamiliar table — running an unbounded SELECT * on a production table with millions of rows can lock up your client, slow the database, and make you very unpopular very fast.

⚠ Common Mistakes to Avoid

  • Mistake 1: Using single quotes for column/table names and double quotes for strings — SQL is the opposite of most languages. String values go in single quotes: WHERE genre = 'Satire'. Table and column names that need quoting use double quotes: SELECT "Book Title". Mixing them up gives you either wrong results or a syntax error like 'column Satire does not exist'.
  • Mistake 2: Forgetting that NULL is not a value — you cannot compare NULL with = or !=. Writing WHERE country = NULL will always return zero rows, silently and without error. The correct syntax is WHERE country IS NULL or WHERE country IS NOT NULL. This is one of the trickiest beginner traps because SQL won't warn you — it just quietly returns nothing.
  • Mistake 3: Selecting a non-aggregated column without including it in GROUP BY — if you write SELECT genre, title, COUNT(*) FROM books GROUP BY genre, most SQL databases will throw an error like 'column title must appear in the GROUP BY clause or be used in an aggregate function'. Every column in your SELECT list must either be in the GROUP BY or wrapped in an aggregate function like MAX(title). PostgreSQL enforces this strictly; older MySQL versions silently return random values, which is even more dangerous.

Interview Questions on This Topic

  • QWhat's the difference between WHERE and HAVING in SQL, and can you give a concrete example of when you'd use each?
  • QIf I run SELECT * FROM users WHERE last_login = NULL and get zero rows back even though I know null values exist in that column, what's wrong and how do you fix it?
  • QWhat does GROUP BY actually do to the rows in a table, and why does every column in the SELECT list need to either be in the GROUP BY clause or wrapped in an aggregate function?

Frequently Asked Questions

Do I need to install anything to start learning SQL?

No installation needed to get started. SQLiteOnline.com and DB Fiddle both let you write and run SQL directly in your browser for free. If you want a local setup, SQLite is a single file with no server required — DB Browser for SQLite gives it a visual interface. All examples in this article run in any of these environments without modification.

What's the difference between SQL and MySQL (or PostgreSQL)?

SQL is the language — a standard defined by ISO. MySQL, PostgreSQL, SQLite, and SQL Server are database engines: software systems that store data and understand SQL. It's the same relationship as English being a language and Google Docs being a tool that accepts English. The core SQL you've learned here works across all of them; the differences only appear in advanced or edge-case features.

Is SQL still worth learning in 2025 with NoSQL and AI tools around?

Absolutely yes. SQL is the single most in-demand data skill across engineering, analytics, product, and business roles — consistently appearing in more job postings than any other technical skill. NoSQL databases handle unstructured or highly scalable data, but relational databases remain the backbone of most applications. Even AI data tools like Copilot for data analysis generate SQL under the hood — understanding what they produce is still your responsibility.

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.

Next →SQL CREATE TABLE and Data Types
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged