SQL CREATE TABLE Explained: Data Types, Constraints & Common Mistakes
Every app you use stores data somewhere. Instagram stores your username, follower count, and the exact timestamp of every post. Your bank stores your account number, balance, and transaction history. That data doesn't just float around freely — it lives in structured tables inside a database, and someone had to design those tables before a single row of data was ever saved. That someone could be you, and the tool for the job is SQL's CREATE TABLE statement.
Before CREATE TABLE existed, developers had to manage data in flat files — think giant text documents where structure was implied but never enforced. Nothing stopped someone from saving the word 'banana' into a field meant to hold a bank balance. CREATE TABLE solves that by letting you declare a blueprint upfront: here are the columns, here is what type of data each column accepts, and here are the rules that every single row must follow, no exceptions.
By the end of this article you'll be able to write a complete CREATE TABLE statement from scratch, choose the right data type for any column, add constraints that protect your data integrity, and avoid the three mistakes that trip up almost every beginner. You'll also walk away knowing how to answer the CREATE TABLE questions that come up in real SQL interviews.
What a Table Actually Is — and Why You Design It First
A database table is a grid of rows and columns, almost identical to a spreadsheet tab. Each column has a name and a fixed data type. Each row is one record — one user, one order, one product. The critical difference from a spreadsheet is that you lock down the structure before any data goes in. You can't just add a random column mid-row the way you might type anywhere in Excel.
This design-first approach is what makes relational databases so reliable. When your app inserts a new user record, the database already knows exactly what shape that record must have. It rejects anything that doesn't fit — a text value in a numeric column, a date that doesn't exist, a username that's too long. That enforcement happens automatically, 24/7, without any code on your part.
Think of it like designing a parking garage before you open it. You decide the width of each space (data type), whether a space is reserved (constraints), and how many spaces exist (columns). You can't add a half-space later without a redesign. SQL works the same way — invest in the design upfront and the database protects your data forever.
The CREATE TABLE statement is literally you handing that blueprint to the database engine and saying: 'Build this structure and enforce these rules from now on.'
-- Step 1: Create a simple users table for a social media app -- This is the most basic form of CREATE TABLE — no constraints yet, -- just column names and data types. We'll add constraints in the next section. CREATE TABLE users ( user_id INT, -- A whole number to uniquely identify each user username VARCHAR(50), -- Variable-length text, max 50 characters email VARCHAR(100), -- Emails can be longer, so we allow up to 100 chars age INT, -- Whole number — ages don't have decimal points profile_bio TEXT, -- Unlimited text for longer content like bios created_at DATETIME, -- Stores both the date AND the time of registration is_verified BOOLEAN -- TRUE or FALSE — is this account email-verified? ); -- To confirm the table was created, run: DESCRIBE users; -- MySQL syntax -- or -- \d users -- PostgreSQL syntax
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| user_id | int | YES | | NULL | |
| username | varchar(50) | YES | | NULL | |
| email | varchar(100) | YES | | NULL | |
| age | int | YES | | NULL | |
| profile_bio | text | YES | | NULL | |
| created_at | datetime | YES | | NULL | |
| is_verified | tinyint(1) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
7 rows in set (0.00 sec)
SQL Data Types Explained — Choosing the Right One Every Time
Choosing a data type is one of the most consequential decisions you make when designing a table. Get it wrong and you either waste storage, break functionality, or both. Think of data types as choosing the right container in a kitchen: you wouldn't store soup in a paper bag or carry sand in a colander. The container has to match what's going in.
SQL data types fall into four families: Numbers, Text, Dates/Times, and Boolean. Within each family there are options that trade off range versus storage size.
Numbers: Use INT for whole numbers (user IDs, quantities, ages). Use DECIMAL(precision, scale) for money — never use FLOAT for currency because floating-point arithmetic introduces tiny rounding errors that will cost real money. DECIMAL(10, 2) means up to 10 total digits, with exactly 2 after the decimal point — perfect for prices like 1299.99.
Text: VARCHAR(n) is your default for text — it only uses as much space as the actual string, up to n characters. CHAR(n) always uses exactly n characters (padded with spaces) and is faster for fixed-length values like country codes ('US', 'UK'). TEXT holds unlimited text and is ideal for blog posts or descriptions.
Dates: DATE stores only the calendar date. TIME stores only the clock time. DATETIME stores both. TIMESTAMP also stores both but automatically converts to UTC — useful for apps with global users.
Boolean: BOOLEAN (or TINYINT(1) in MySQL) stores TRUE/FALSE — perfect for flags like is_active or is_verified.
-- A real e-commerce products table showing deliberate data type choices. -- Read the inline comments to understand WHY each type was chosen. CREATE TABLE products ( product_id INT, -- Whole number ID — products don't have fractional IDs product_name VARCHAR(150), -- Names vary in length; VARCHAR saves storage vs CHAR sku_code CHAR(10), -- SKUs are always exactly 10 chars — CHAR is faster here description TEXT, -- Product descriptions can be long — no character limit needed price DECIMAL(10, 2), -- CRITICAL: use DECIMAL for money, never FLOAT -- DECIMAL(10,2) = up to 10 digits, always 2 decimal places -- e.g. 99999999.99 is the max value here stock_quantity INT, -- You can't have 2.5 items in stock — whole number only weight_kg DECIMAL(6, 3), -- Weight needs decimals: 1.250 kg is valid is_available BOOLEAN, -- Simple flag: is the product currently for sale? listed_on DATE, -- We only care about the calendar date, not the time last_updated TIMESTAMP -- Auto-tracks date+time; great for audit trails ); -- Let's also look at what FLOAT looks like to understand why we avoid it for money: -- SELECT 0.1 + 0.2; -- In many systems this returns 0.30000000000000004 — not 0.30! -- DECIMAL avoids this entirely. Always use DECIMAL for financial values. DESCRIBE products;
+----------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------------+------+-----+---------+-------+
| product_id | int | YES | | NULL | |
| product_name | varchar(150) | YES | | NULL | |
| sku_code | char(10) | YES | | NULL | |
| description | text | YES | | NULL | |
| price | decimal(10,2) | YES | | NULL | |
| stock_quantity | int | YES | | NULL | |
| weight_kg | decimal(6,3) | YES | | NULL | |
| is_available | tinyint(1) | YES | | NULL | |
| listed_on | date | YES | | NULL | |
| last_updated | timestamp | YES | | NULL | |
+----------------+---------------+------+-----+---------+-------+
10 rows in set (0.00 sec)
Constraints — The Rules That Make Your Data Trustworthy
A table without constraints is just a suggestion. You could insert a user with a NULL email address, a negative price, or two products with the same ID. Constraints are the rules you bolt onto columns (or the whole table) that the database enforces automatically, on every single INSERT and UPDATE, forever.
Here are the six constraints you'll use constantly:
PRIMARY KEY — Uniquely identifies each row. No two rows can share the same value, and it can never be NULL. Every table should have one. Combine with AUTO_INCREMENT (MySQL) or SERIAL (PostgreSQL) so the database generates the ID for you.
NOT NULL — This column must always have a value. NULL is not allowed. Use it on any column that genuinely must exist for every record — like email or username.
UNIQUE — Values in this column must be unique across all rows, but unlike PRIMARY KEY, a UNIQUE column can contain NULLs. Use it for things like email addresses or usernames — no two users should share one.
DEFAULT — If no value is provided during INSERT, use this fallback. Great for created_at timestamps or boolean flags that default to FALSE.
CHECK — Define a custom rule. CHECK (age >= 18) will reject any row where age is below 18. The database enforces it, not your application code.
FOREIGN KEY — Links a column in this table to the PRIMARY KEY of another table, enforcing referential integrity. You can't insert an order for a user_id that doesn't exist in the users table.
-- First, we need a users table to reference from orders. -- Notice how we apply constraints at the column level — clean and readable. CREATE TABLE users ( user_id INT AUTO_INCREMENT PRIMARY KEY, -- AUTO_INCREMENT: DB generates 1, 2, 3... -- PRIMARY KEY: unique + never NULL username VARCHAR(50) NOT NULL UNIQUE, -- Must exist, must be unique per user email VARCHAR(100) NOT NULL UNIQUE, -- Every user needs a unique email age INT CHECK (age >= 13), -- Reject anyone under 13 (e.g. COPPA compliance) country CHAR(2) DEFAULT 'US', -- If not specified, assume USA created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP -- Auto-fills with NOW() on every INSERT ); -- Now create an orders table that references users. -- A FOREIGN KEY means: every order must belong to a real, existing user. CREATE TABLE orders ( order_id INT AUTO_INCREMENT PRIMARY KEY, user_id INT NOT NULL, -- Cannot be NULL — every order needs an owner order_total DECIMAL(10,2) NOT NULL CHECK (order_total > 0), -- Must be positive order_status VARCHAR(20) NOT NULL DEFAULT 'pending', -- Sensible default when order is first created placed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, -- FOREIGN KEY defined at the bottom — links orders.user_id to users.user_id -- ON DELETE CASCADE means: if the user is deleted, their orders are deleted too CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users (user_id) ON DELETE CASCADE ); -- Test: try inserting a valid user and order INSERT INTO users (username, email, age) VALUES ('maya_dev', 'maya@example.com', 25); -- user_id = 1 is auto-generated, country = 'US' by default, created_at = now INSERT INTO orders (user_id, order_total) VALUES (1, 149.99); -- order_id = 1, status = 'pending' by default, placed_at = now -- Test: try inserting an order for a non-existent user INSERT INTO orders (user_id, order_total) VALUES (9999, 50.00); -- This WILL FAIL — user_id 9999 doesn't exist in users table
Query OK, 0 rows affected (0.06 sec)
-- After CREATE TABLE orders:
Query OK, 0 rows affected (0.04 sec)
-- After INSERT valid user:
Query OK, 1 row affected (0.01 sec)
-- After INSERT valid order:
Query OK, 1 row affected (0.01 sec)
-- After INSERT order with invalid user_id:
ERROR 1452 (23000): Cannot add or update a child row:
a foreign key constraint fails (`mydb`.`orders`,
CONSTRAINT `fk_orders_user` FOREIGN KEY (`user_id`)
REFERENCES `users` (`user_id`) ON DELETE CASCADE)
Defensive Table Creation — IF NOT EXISTS and Real-World Patterns
In production, you rarely run CREATE TABLE just once. You run migration scripts, set up test environments, redeploy applications, and onboard new developers. If you run a CREATE TABLE statement on a database where that table already exists, you get an error that stops your entire script. That's why every professional CREATE TABLE statement includes IF NOT EXISTS.
IF NOT EXISTS tells the database: 'Create this table only if it doesn't already exist. If it does exist, skip this statement silently and move on.' This makes your setup scripts idempotent — you can run them ten times in a row and they'll only ever create the table once, without errors.
The other real-world pattern worth knowing is CREATE TABLE ... LIKE and CREATE TABLE ... AS SELECT. LIKE copies the structure of an existing table (columns, types, constraints) without any data — perfect for making a test copy. AS SELECT creates a new table and fills it with the results of a query — useful for snapshots or archiving.
Finally, always use DROP TABLE IF EXISTS before CREATE TABLE in setup scripts where you want a clean slate. The order matters: drop first, then create.
-- PATTERN 1: IF NOT EXISTS — the safe default for any script you'll run more than once CREATE TABLE IF NOT EXISTS blog_posts ( post_id INT AUTO_INCREMENT PRIMARY KEY, author_id INT NOT NULL, title VARCHAR(200) NOT NULL, slug VARCHAR(200) NOT NULL UNIQUE, -- URL-friendly version of title, must be unique body TEXT NOT NULL, is_published BOOLEAN NOT NULL DEFAULT FALSE, -- Drafts are unpublished by default published_at DATETIME NULL, -- NULL until the post is actually published created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, -- ON UPDATE: automatically refreshes this column whenever the row changes CONSTRAINT fk_post_author FOREIGN KEY (author_id) REFERENCES users (user_id) ON DELETE RESTRICT -- RESTRICT: prevent deleting a user who has blog posts ); -- PATTERN 2: Clean-slate setup for development or testing -- Drop if it exists, then create fresh. Use only in dev — never in production! DROP TABLE IF EXISTS blog_posts; -- Silently skips if table doesn't exist CREATE TABLE blog_posts ( post_id INT AUTO_INCREMENT PRIMARY KEY, title VARCHAR(200) NOT NULL -- ... rest of columns ); -- PATTERN 3: Copy a table's structure for testing (no data copied) CREATE TABLE blog_posts_test LIKE blog_posts; -- blog_posts_test now has identical columns and constraints but zero rows -- PATTERN 4: Create a table from a query result (useful for reporting snapshots) CREATE TABLE published_posts_snapshot AS SELECT post_id, title, published_at FROM blog_posts WHERE is_published = TRUE; -- Note: AS SELECT does NOT copy constraints like PRIMARY KEY — just structure + data
Query OK, 0 rows affected (0.05 sec)
-- IF NOT EXISTS run a second time (table already exists):
Query OK, 0 rows affected, 1 warning (0.00 sec)
-- Warning: Table 'mydb.blog_posts' already exists — but the script keeps running!
-- DROP TABLE IF EXISTS (table exists, drops it):
Query OK, 0 rows affected (0.02 sec)
-- CREATE TABLE blog_posts (fresh):
Query OK, 0 rows affected (0.04 sec)
-- CREATE TABLE blog_posts_test LIKE blog_posts:
Query OK, 0 rows affected (0.03 sec)
-- CREATE TABLE published_posts_snapshot AS SELECT ...:
Query OK, 0 rows affected (0.06 sec)
-- (0 rows because no published posts exist yet in this example)
| Data Type | Best Used For | Storage | Example Value | Watch Out For |
|---|---|---|---|---|
| INT | IDs, counts, ages, quantities | 4 bytes | 42 | Can't store decimals — 3/2 becomes 1 |
| DECIMAL(p,s) | Prices, salaries, measurements | Variable, exact | 1299.99 | Always specify precision and scale explicitly |
| FLOAT / DOUBLE | Scientific data, approximations | 4–8 bytes | 3.14159265 | Never use for money — rounding errors guaranteed |
| VARCHAR(n) | Names, emails, URLs, short text | Actual length + 1-2 bytes | 'john@example.com' | n is a max cap, not a fixed size — set it generously |
| CHAR(n) | Fixed-length codes (country, SKU) | Always n bytes | 'US' | Pads shorter strings with spaces — can cause subtle bugs in comparisons |
| TEXT | Blog posts, bios, descriptions | Up to 65,535 bytes | 'Once upon a time...' | Cannot have a DEFAULT value; can't be fully indexed |
| DATE | Birthdays, event dates | 3 bytes | '2024-03-15' | No time component — use DATETIME if time matters |
| DATETIME | Timestamps, scheduled events | 8 bytes | '2024-03-15 14:30:00' | Stored as-is — no timezone awareness |
| TIMESTAMP | Created/updated audit columns | 4 bytes | '2024-03-15 14:30:00' | Auto-converts to UTC; range is 1970–2038 |
| BOOLEAN | Flags, toggles, yes/no fields | 1 byte | TRUE / FALSE | MySQL stores as TINYINT(1) — 0 is false, anything else is true |
🎯 Key Takeaways
- CREATE TABLE is a blueprint, not data — it defines structure and rules before a single row exists; the database enforces those rules on every INSERT and UPDATE automatically, forever.
- Always use DECIMAL(precision, scale) for money columns — FLOAT introduces binary rounding errors that compound into real financial discrepancies at scale.
- Every production table needs a PRIMARY KEY with AUTO_INCREMENT — without it you have no reliable way to identify, update, or delete individual rows, and duplicates become invisible.
- Add IF NOT EXISTS to every CREATE TABLE in scripts you'll run more than once — it makes your setup scripts idempotent and prevents them from crashing on re-runs in CI/CD pipelines or fresh dev environments.
⚠ Common Mistakes to Avoid
- ✕Mistake 1: Using VARCHAR(255) for every text column by default — Your email column doesn't need 255 characters (most are under 80) and your country code certainly doesn't (it's exactly 2). Over-provisioning VARCHAR wastes nothing on storage since it's variable-length, but it misleads future developers about what the data should look like and can cause index bloat on long VARCHAR columns. Fix it: use a realistic max based on what the data actually is. VARCHAR(100) for emails, CHAR(2) for country codes, VARCHAR(200) for titles.
- ✕Mistake 2: Forgetting PRIMARY KEY entirely — This happens constantly with beginners who just want to 'get data in quickly.' Without a primary key, you have no reliable way to identify, update, or delete a specific row — and your database can store identical duplicate rows with no way to distinguish them. Fix it: every table needs a primary key. The simplest pattern is id INT AUTO_INCREMENT PRIMARY KEY as your first column, always.
- ✕Mistake 3: Using FLOAT or DOUBLE for monetary values — Floating-point types are binary approximations. They cannot represent 0.10 (ten cents) exactly in binary, leading to errors like 0.1 + 0.2 = 0.30000000000000004. In financial software this causes rounding errors that compound over millions of transactions into real money lost or reconciliation nightmares. Fix it: always use DECIMAL(precision, scale) for any column that stores currency, e.g., DECIMAL(12, 2) for a balance that can reach ten billion dollars.
Interview Questions on This Topic
- QWhat is the difference between CHAR and VARCHAR? When would you deliberately choose CHAR over VARCHAR even though VARCHAR seems more flexible?
- QExplain the difference between a PRIMARY KEY and a UNIQUE constraint. Can a table have multiple UNIQUE constraints? Can it have multiple PRIMARY KEYs?
- QIf you define a FOREIGN KEY with ON DELETE CASCADE versus ON DELETE RESTRICT, what happens in each case when you delete a parent row? Give a real-world example of when you'd choose each option.
Frequently Asked Questions
What is the difference between VARCHAR and TEXT in SQL?
VARCHAR(n) stores variable-length text up to a defined maximum (n) and can be fully indexed and have default values — it's your go-to for most text columns. TEXT stores unlimited text (up to 65,535 bytes in MySQL) but cannot have a DEFAULT value and can only be partially indexed. Use VARCHAR for names, emails, and titles; use TEXT for blog posts, product descriptions, and anything open-ended in length.
Can I change a table's structure after I've already created it?
Yes — use the ALTER TABLE statement. ALTER TABLE users ADD COLUMN phone VARCHAR(20) adds a new column. ALTER TABLE users MODIFY COLUMN age SMALLINT changes a column's type. ALTER TABLE users DROP COLUMN age removes a column entirely. Be careful in production though: changing a column type or dropping a column on a large, live table can lock the table for minutes and cause downtime.
What does NULL actually mean in SQL, and is it the same as zero or an empty string?
NULL means 'no value exists' — it represents the absence of data, not zero and not an empty string. Zero is a valid integer value. An empty string '' is a valid text value. NULL is neither. This matters because NULL behaves differently in comparisons: you can't use = NULL to check for null, you must use IS NULL. A column with NOT NULL means a value is mandatory; NULL means the value is optional and might not be known.
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.