SQLAlchemy Basics: Models, Queries, and Sessions Explained
Every serious Python application eventually needs to store data. Whether you're building a REST API, a web app, or an internal tool, you'll hit a point where a dictionary just doesn't cut it and you need a real database. The instinct for many developers is to write raw SQL strings scattered across their codebase — and that works until it really doesn't. Maintenance becomes a nightmare, security holes (hello, SQL injection) creep in, and switching databases feels like a full rewrite.
SQLAlchemy solves this by giving you two powerful tools in one library. First, its Core layer lets you build and execute SQL expressions using Python objects — it's still SQL-flavored thinking, but type-safe and composable. Second, and more importantly for most projects, its ORM (Object-Relational Mapper) layer lets you define your database tables as Python classes and interact with rows as if they were plain Python objects. The database becomes an implementation detail, not the center of your universe.
By the end of this article, you'll know how to define database models as Python classes, create and manage a database session, insert and query records using both the ORM and filter expressions, and set up a one-to-many relationship between two tables. You'll also know the mistakes that trip up almost every developer the first time — and exactly how to avoid them.
Setting Up SQLAlchemy: The Engine and the Session Factory
Before you can talk to a database, SQLAlchemy needs two things: an Engine and a Session. Think of the Engine as the phone line — it knows the database's address and how to connect to it. The Session is a single phone call on that line — it's where your actual work happens, and it tracks every change you make until you decide to commit them.
The Engine is created once, at app startup, using a connection string. SQLAlchemy supports PostgreSQL, MySQL, SQLite, and more; switching databases is mostly a matter of changing the connection string. For development and learning, SQLite is perfect because it's a file-based database that requires zero setup.
The Session is created via a sessionmaker factory bound to your Engine. You never create Sessions manually in production — you use the factory. This separation matters: the Engine is a long-lived, shared, thread-safe object; Sessions are short-lived and should be opened and closed per request or per task.
The declarative_base() function creates a base class that all your ORM model classes inherit from. This base class is what registers your models with SQLAlchemy's metadata system, so it knows which Python class maps to which database table.
```python
from contextlib import contextmanager

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base

# The connection string tells SQLAlchemy which database to use.
# 'sqlite:///bookstore.db' creates a file called bookstore.db in the current directory.
# For PostgreSQL it would be: 'postgresql://user:password@localhost/dbname'
DATABASE_URL = "sqlite:///bookstore.db"

# Create the engine — this is the one-time setup that manages the connection pool.
# echo=True prints every SQL statement SQLAlchemy generates, great for debugging.
engine = create_engine(DATABASE_URL, echo=True)

# sessionmaker returns a class (a factory), not an instance.
# Every time you call SessionLocal(), you get a fresh, independent Session.
# (SQLAlchemy 2.0 removed the old autocommit flag; commits are always explicit.)
SessionLocal = sessionmaker(bind=engine, autoflush=False)

# Base is the parent class all our ORM models will inherit from.
# It holds the metadata (table definitions) SQLAlchemy needs to create the schema.
Base = declarative_base()


# Helper for safely opening and closing sessions.
# @contextmanager turns this generator into a context manager, so you can write
# 'with get_db_session() as session:' anywhere you need database access.
@contextmanager
def get_db_session():
    session = SessionLocal()
    try:
        yield session       # Hand the session to the caller
        session.commit()    # Auto-commit if no exception was raised
    except Exception:
        session.rollback()  # Roll back ALL changes if anything went wrong
        raise
    finally:
        session.close()     # Always close the session to return the connection to the pool


print("Engine and session factory created successfully.")
print(f"Database URL: {DATABASE_URL}")
```
```
Engine and session factory created successfully.
Database URL: sqlite:///bookstore.db
```
Defining ORM Models: Mapping Python Classes to Database Tables
An ORM model is a Python class that represents a database table. Each class attribute that uses SQLAlchemy's Column type maps to a column in that table. When you define a model, you're doing two things at once: describing the database schema AND defining the Python object you'll work with in your code.
The __tablename__ attribute is mandatory — it tells SQLAlchemy the exact name of the table in the database. Column types like Integer, String, and DateTime are SQLAlchemy's type system, which maps to the appropriate native type for whatever database you're using.
Relationships between tables are defined using relationship(), which tells SQLAlchemy how two models connect logically. The ForeignKey defines the database-level link, while relationship() adds the Python-level convenience — letting you access related objects directly as attributes instead of writing join queries manually.
Once your models are defined, calling Base.metadata.create_all(engine) inspects all classes that inherit from Base and creates the corresponding tables in the database if they don't already exist. It's safe to call repeatedly — it won't overwrite existing tables.
```python
from datetime import datetime, timezone

from sqlalchemy import Column, Integer, String, Float, DateTime, ForeignKey, Text
from sqlalchemy.orm import relationship

# Import Base from our setup file
from database_setup import Base, engine


class Author(Base):
    """Represents an author in the bookstore database."""

    __tablename__ = "authors"  # Exact table name in the database

    # Integer primary key — SQLAlchemy auto-increments this for SQLite
    id = Column(Integer, primary_key=True, index=True)

    # nullable=False means this column is required — INSERT will fail without it
    name = Column(String(150), nullable=False)
    email = Column(String(255), unique=True, nullable=False)
    bio = Column(Text, nullable=True)  # Optional field

    # relationship() adds a Python-level attribute — NOT a database column.
    # 'books' will be a list of Book objects belonging to this author.
    # back_populates='author' means the Book model has a matching 'author' attribute.
    books = relationship("Book", back_populates="author", cascade="all, delete-orphan")

    def __repr__(self):
        # __repr__ makes debugging so much easier — you see useful info, not memory addresses
        return f"<Author(id={self.id}, name='{self.name}')>"


class Book(Base):
    """Represents a book in the bookstore database."""

    __tablename__ = "books"

    id = Column(Integer, primary_key=True, index=True)
    title = Column(String(300), nullable=False)
    isbn = Column(String(13), unique=True, nullable=False)
    price = Column(Float, nullable=False)
    published_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))

    # ForeignKey creates the actual database constraint linking books.author_id to authors.id
    author_id = Column(Integer, ForeignKey("authors.id"), nullable=False)

    # This gives us book.author — a direct reference to the Author object
    author = relationship("Author", back_populates="books")

    def __repr__(self):
        return f"<Book(id={self.id}, title='{self.title}', price=${self.price})>"


# Create all tables defined above in the actual database file.
# If the tables already exist, this does nothing — it won't destroy existing data.
Base.metadata.create_all(bind=engine)
print("Tables created: 'authors' and 'books'")
```
```
2024-01-15 10:23:01,441 INFO sqlalchemy.engine.Engine CREATE TABLE authors (
    id INTEGER NOT NULL,
    name VARCHAR(150) NOT NULL,
    email VARCHAR(255) NOT NULL,
    bio TEXT,
    PRIMARY KEY (id),
    UNIQUE (email)
)
2024-01-15 10:23:01,443 INFO sqlalchemy.engine.Engine CREATE TABLE books (
    id INTEGER NOT NULL,
    title VARCHAR(300) NOT NULL,
    isbn VARCHAR(13) NOT NULL,
    price FLOAT NOT NULL,
    published_at DATETIME,
    author_id INTEGER NOT NULL,
    PRIMARY KEY (id),
    UNIQUE (isbn),
    FOREIGN KEY(author_id) REFERENCES authors (id)
)
Tables created: 'authors' and 'books'
```
Sessions in Action: Inserting, Querying, and Filtering Records
The Session is where all the action happens. Think of it as a scratchpad — you add objects to it, modify them, and delete them, and SQLAlchemy tracks every change. Nothing actually touches the database until you call session.commit(). This is called the Unit of Work pattern, and it's one of SQLAlchemy's most powerful ideas.
When you add an object with session.add(), it moves into a 'pending' state — tracked by the session but not yet written. After commit(), it moves to 'persistent' — it exists in the database AND in the session's identity map. If you query for it again, SQLAlchemy returns the same Python object from memory, not a fresh copy from the database.
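You can see the identity map at work with a tiny self-contained sketch. This example uses an in-memory SQLite database and a hypothetical `User` model (not the bookstore models from this article): two different lookups for the same primary key hand back the very same Python object.

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(User(name="Ada"))
    session.commit()

    # Two lookups, two different code paths, same session...
    first_lookup = session.query(User).filter_by(name="Ada").one()
    second_lookup = session.get(User, first_lookup.id)  # lookup by primary key

    # ...and the identity map hands back the SAME object, not a copy.
    print(first_lookup is second_lookup)  # True
```

Note that this guarantee only holds within a single session; a different session has its own identity map and will return a different object for the same row.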
Querying uses the session.query() method (classic ORM style) or the newer select() construct. Both work, but select() is the standard style in SQLAlchemy 2.0. Filters work like WHERE clauses: filter_by() accepts keyword arguments for simple equality checks, while filter() accepts more expressive comparison expressions for anything complex like greater-than, LIKE, or IN.
Understanding the difference between .all(), .first(), and .one() matters: .all() returns a list (empty if no results), .first() returns the first result or None, and .one() raises an exception if the result count isn't exactly one — useful when you absolutely expect a unique record.
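Before the full bookstore example, here is a minimal sketch of those three methods against an empty result set. It assumes an in-memory SQLite database and a throwaway `Tag` model, both invented for this illustration:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.exc import NoResultFound
from sqlalchemy.orm import sessionmaker, declarative_base

Base = declarative_base()


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    label = Column(String(50))


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    # No Tag rows exist, so every variant below hits zero results.
    query = session.query(Tag).filter_by(label="missing")

    empty_list = query.all()    # [] — always a list, never None
    nothing = query.first()     # None — no exception on zero rows

    try:
        query.one()             # zero rows -> raises
        raised = False
    except NoResultFound:
        raised = True

    print(empty_list, nothing, raised)  # [] None True
```

The mirror-image case also matters: `.one()` raises `MultipleResultsFound` when more than one row matches, which is exactly why it's the right choice for unique-key lookups.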
```python
from database_setup import SessionLocal
from models import Author, Book


def seed_bookstore_data():
    """Insert sample authors and books into the database."""
    with SessionLocal() as session:
        # --- INSERT: Create Author objects and add them to the session ---
        author_rowling = Author(
            name="J.K. Rowling",
            email="jk@rowling.com",
            bio="Author of the Harry Potter series."
        )
        author_martin = Author(
            name="George R.R. Martin",
            email="grrm@westeros.com",
            bio="Author of A Song of Ice and Fire."
        )

        # session.add_all() is more efficient than calling session.add() multiple times
        session.add_all([author_rowling, author_martin])

        # flush() writes changes to the DB within this transaction but doesn't commit yet.
        # We need this so author_rowling.id gets populated before we create the books.
        session.flush()
        print(f"Authors flushed — Rowling ID: {author_rowling.id}, Martin ID: {author_martin.id}")

        # --- INSERT: Create Book objects, linking them to authors via author_id ---
        books = [
            Book(title="Harry Potter and the Philosopher's Stone",
                 isbn="9780747532699", price=12.99, author_id=author_rowling.id),
            Book(title="Harry Potter and the Chamber of Secrets",
                 isbn="9780747538493", price=13.99, author_id=author_rowling.id),
            Book(title="A Game of Thrones",
                 isbn="9780553103540", price=15.99, author_id=author_martin.id),
        ]
        session.add_all(books)

        session.commit()  # Now EVERYTHING above is written permanently to the database
        print("All data committed successfully.")


def query_bookstore_data():
    """Demonstrate various query and filter patterns."""
    with SessionLocal() as session:
        # --- QUERY ALL: Returns a list of all Author objects ---
        all_authors = session.query(Author).all()
        print(f"\nAll authors ({len(all_authors)} total):")
        for author in all_authors:
            print(f"  {author}")  # Uses our __repr__ method

        # --- FILTER BY: Simple equality filter, returns first match or None ---
        rowling = session.query(Author).filter_by(name="J.K. Rowling").first()
        print(f"\nFound author: {rowling}")

        # --- RELATIONSHIP ACCESS: Access books via the relationship attribute ---
        # SQLAlchemy issues a second query here automatically (lazy loading)
        print(f"\nRowling's books ({len(rowling.books)} total):")
        for book in rowling.books:
            print(f"  {book}")

        # --- FILTER with expression: Find all books priced above $13 ---
        # Book.price > 13.00 generates a SQL WHERE clause: WHERE books.price > 13.0
        expensive_books = session.query(Book).filter(Book.price > 13.00).all()
        print("\nBooks over $13.00:")
        for book in expensive_books:
            print(f"  {book.title} — ${book.price}")

        # --- UPDATE: Modify an attribute and commit ---
        cheap_book = session.query(Book).filter_by(isbn="9780747532699").one()
        cheap_book.price = 14.99  # SQLAlchemy detects this change automatically
        session.commit()
        print(f"\nPrice updated — {cheap_book.title} now costs ${cheap_book.price}")


if __name__ == "__main__":
    seed_bookstore_data()
    query_bookstore_data()
```
```
Authors flushed — Rowling ID: 1, Martin ID: 2
All data committed successfully.

All authors (2 total):
  <Author(id=1, name='J.K. Rowling')>
  <Author(id=2, name='George R.R. Martin')>

Found author: <Author(id=1, name='J.K. Rowling')>

Rowling's books (2 total):
  <Book(id=1, title='Harry Potter and the Philosopher's Stone', price=$12.99)>
  <Book(id=2, title='Harry Potter and the Chamber of Secrets', price=$13.99)>

Books over $13.00:
  Harry Potter and the Chamber of Secrets — $13.99
  A Game of Thrones — $15.99

Price updated — Harry Potter and the Philosopher's Stone now costs $14.99
```
Joins and Eager Loading: Avoiding the N+1 Query Problem
The N+1 query problem is the most common performance mistake developers make with ORMs, and SQLAlchemy's lazy loading makes it easy to fall into. Here's how it happens: you fetch 100 authors with one query, then loop over them accessing author.books — and SQLAlchemy fires 100 separate queries to load each author's books. One query becomes 101. That's N+1.
The fix is eager loading — telling SQLAlchemy to fetch the related data upfront in the same query. SQLAlchemy offers two main strategies: joinedload() uses a SQL JOIN to get everything in one query, and subqueryload() uses a second optimized query to load all related records at once. Both avoid N+1; which one to use depends on your data shape.
joinedload() is ideal when each parent has a small number of related children — the join stays manageable. subqueryload() is better when you have many parents each with many children, because joining would create a large Cartesian product that duplicates the parent rows. For most one-to-many relationships with moderate data, joinedload() is your default.
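Since the full example in this section demonstrates joinedload(), here is a hedged sketch of the subqueryload() alternative on its own. It uses throwaway Parent/Child models and an in-memory SQLite database (names invented for illustration; the same pattern applies to Author/Book):

```python
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import (
    sessionmaker, declarative_base, relationship, subqueryload,
)

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parents"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "children"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    parent_id = Column(Integer, ForeignKey("parents.id"))
    parent = relationship("Parent", back_populates="children")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(Parent(name="p1", children=[Child(name="c1"), Child(name="c2")]))
    session.commit()

    # Two queries total, regardless of how many parents there are:
    #   1. SELECT ... FROM parents
    #   2. SELECT ... FROM children WHERE parent_id IN (subquery of parent ids)
    parents = session.query(Parent).options(subqueryload(Parent.children)).all()
    for parent in parents:
        print(parent.name, len(parent.children))  # children already loaded
```

Recent SQLAlchemy versions also offer selectinload(), which solves the same problem with an IN-list query and is often recommended over subqueryload(); the choice between these strategies follows the same parent/child shape reasoning described above.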
You can also write explicit joins using .join() and filter across tables — essential when you need to filter the parent based on a child's column, like finding all authors who have at least one book priced over $15.
```python
from sqlalchemy.orm import joinedload, subqueryload
from sqlalchemy import select
from database_setup import SessionLocal
from models import Author, Book


def demonstrate_n_plus_1_problem():
    """Shows what NOT to do — this fires one query per author."""
    print("=== N+1 Problem (BAD) ===")
    with SessionLocal() as session:
        # This fires ONE query: SELECT * FROM authors
        all_authors = session.query(Author).all()

        for author in all_authors:
            # EACH iteration fires ANOTHER query: SELECT * FROM books WHERE author_id = ?
            # With 1000 authors, that's 1001 total queries!
            book_count = len(author.books)
            print(f"  {author.name} has {book_count} book(s)")


def demonstrate_eager_loading():
    """Shows the right approach — loads everything in one query."""
    print("\n=== Eager Loading with joinedload (GOOD) ===")
    with SessionLocal() as session:
        # joinedload tells SQLAlchemy: 'when you fetch authors, also JOIN and fetch their books'
        # This produces ONE query with a JOIN instead of N+1 separate queries
        all_authors = (
            session.query(Author)
            .options(joinedload(Author.books))  # Pre-load the books relationship
            .all()
        )

        for author in all_authors:
            # author.books is already loaded — no extra database query fires here
            book_count = len(author.books)
            print(f"  {author.name} has {book_count} book(s)")
            for book in author.books:
                print(f"    -> {book.title} (${book.price})")


def find_authors_with_expensive_books():
    """Join across tables to filter parents based on child attributes."""
    print("\n=== Cross-Table Join Filter ===")
    with SessionLocal() as session:
        # .join(Book) tells SQLAlchemy to JOIN books on the foreign key relationship
        # .filter(Book.price > 15.00) then filters using the joined table's columns
        # .distinct() prevents the same author appearing multiple times if they have
        # multiple books matching the filter
        authors_with_pricey_books = (
            session.query(Author)
            .join(Book)                   # JOIN books ON authors.id = books.author_id
            .filter(Book.price > 15.00)   # WHERE books.price > 15.00
            .distinct()                   # Deduplicate if author has multiple matches
            .all()
        )

        print("Authors with at least one book over $15.00:")
        for author in authors_with_pricey_books:
            print(f"  {author.name}")


if __name__ == "__main__":
    demonstrate_n_plus_1_problem()
    demonstrate_eager_loading()
    find_authors_with_expensive_books()
```
```
=== N+1 Problem (BAD) ===
  J.K. Rowling has 2 book(s)
  George R.R. Martin has 1 book(s)

=== Eager Loading with joinedload (GOOD) ===
  J.K. Rowling has 2 book(s)
    -> Harry Potter and the Philosopher's Stone ($14.99)
    -> Harry Potter and the Chamber of Secrets ($13.99)
  George R.R. Martin has 1 book(s)
    -> A Game of Thrones ($15.99)

=== Cross-Table Join Filter ===
Authors with at least one book over $15.00:
  George R.R. Martin
```
| Feature / Aspect | SQLAlchemy ORM | Raw SQL (psycopg2/sqlite3) |
|---|---|---|
| Code style | Python classes and objects | String-based SQL queries |
| Database portability | Change connection string to switch DBs | Rewrite queries for each DB dialect |
| SQL injection protection | Built-in via parameterized bindings | Manual — developer's responsibility |
| Learning curve | Higher upfront, faster long-term | Lower upfront, harder to maintain |
| Complex joins | Possible but can get verbose | Natural and expressive |
| Performance tuning | Inspect generated SQL with echo=True | You control every query directly |
| Relationship traversal | author.books works out of the box | Write JOIN queries manually every time |
| Schema migrations | Use Alembic alongside SQLAlchemy | Write ALTER TABLE statements by hand |
🎯 Key Takeaways
- The Engine is created once and shared; the Session is created per-request or per-task and must always be closed — treat it like an open file handle.
- SQLAlchemy's lazy loading is convenient but deadly at scale — always use joinedload() or subqueryload() when you know you'll loop over related objects.
- ForeignKey and relationship() do different jobs: ForeignKey creates the database constraint, relationship() creates the Python-level convenience attribute — you need both for full ORM functionality.
- Set echo=True on your engine during development to see every SQL query SQLAlchemy generates — it's the fastest way to catch N+1 problems and understand what your ORM code actually does.
⚠ Common Mistakes to Avoid
- ✕ Mistake 1: Not closing sessions — If you create a session manually with SessionLocal() and never call session.close(), the database connection stays open and gets leaked back into the pool in a dirty state. Over time this exhausts your connection pool and your app hangs. Fix: Always use a context manager (with SessionLocal() as session:) or the generator pattern shown in the setup section so close() is guaranteed to run.
- ✕ Mistake 2: Accessing lazy-loaded relationships after the session closes — You load an object, the session goes out of scope, then you try to read a relationship attribute somewhere else in your code. SQLAlchemy raises DetachedInstanceError because there's no active session to fire the lazy query. Fix: Either use joinedload() to eagerly fetch related data before the session closes, or restructure your code to access all needed attributes while the session is still open.
- ✕ Mistake 3: Calling Base.metadata.create_all() before importing all model files — If your models are spread across multiple files and you call create_all() before importing them, SQLAlchemy doesn't know they exist and won't create their tables. No error is raised — the tables are simply absent. Fix: Make sure every model module is imported before calling create_all(). A common pattern is a central models/__init__.py that imports all model classes, which you then import before calling create_all().
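Mistake 2 in particular is easy to reproduce. The following minimal sketch (hypothetical Folder/Note models and an in-memory SQLite database, invented for this demo) shows the error firing the moment a lazy relationship is touched on a detached object:

```python
from sqlalchemy import create_engine, Column, Integer, ForeignKey
from sqlalchemy.orm import sessionmaker, declarative_base, relationship
from sqlalchemy.orm.exc import DetachedInstanceError

Base = declarative_base()


class Folder(Base):
    __tablename__ = "folders"
    id = Column(Integer, primary_key=True)
    notes = relationship("Note")  # lazy-loaded by default


class Note(Base):
    __tablename__ = "notes"
    id = Column(Integer, primary_key=True)
    folder_id = Column(Integer, ForeignKey("folders.id"))


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(Folder(notes=[Note()]))
    session.commit()

with Session() as session:
    folder = session.query(Folder).first()
# The session is closed here — 'folder' is now a detached instance.

try:
    folder.notes  # lazy load with no session behind it
    caught = False
except DetachedInstanceError:
    caught = True
    print("DetachedInstanceError: no active session for the lazy load")
```

Swapping the query for `session.query(Folder).options(joinedload(Folder.notes)).first()` would load the collection while the session is still open, and the access after the with-block would succeed.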
Interview Questions on This Topic
- Q: What's the difference between session.flush() and session.commit() in SQLAlchemy, and when would you use flush() over commit()?
- Q: Explain the N+1 query problem in the context of SQLAlchemy's ORM. How do you detect it and what are the two main strategies to fix it?
- Q: If you define a relationship() on a model but forget to add the corresponding ForeignKey column, what happens — and why does SQLAlchemy need both?
Frequently Asked Questions
What is the difference between SQLAlchemy Core and SQLAlchemy ORM?
SQLAlchemy Core is the lower-level layer — it lets you build and execute SQL expressions using Python objects, but you still think in terms of tables and rows. The ORM layer sits on top of Core and lets you work with Python classes and objects instead, mapping each class to a database table automatically. Most applications use the ORM, but Core is useful for bulk operations or when you need fine-grained SQL control.
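As a rough illustration of what Core-level code looks like, here is a short sketch using a hypothetical users table and an in-memory SQLite database (the 1.4/2.0-style API is assumed):

```python
from sqlalchemy import (
    create_engine, MetaData, Table, Column, Integer, String, insert, select,
)

# In Core you describe tables directly — no mapped classes involved.
metadata = MetaData()
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

engine = create_engine("sqlite:///:memory:")
metadata.create_all(engine)

# engine.begin() opens a transaction and commits it on successful exit.
with engine.begin() as conn:
    conn.execute(insert(users).values(name="Ada"))

with engine.connect() as conn:
    # You still think in tables and rows: this is SELECT users.name FROM users
    stored_name = conn.execute(select(users.c.name)).scalar_one()
    print(stored_name)  # Ada
```

Notice there is no Session and no identity map here: you execute statements against a Connection and get back plain row results, which is exactly the trade-off that makes Core attractive for bulk operations.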
Do I need to know SQL to use SQLAlchemy?
You don't need to write SQL, but understanding it makes you significantly more effective with SQLAlchemy. When you use filter(), join(), or order_by(), you're describing SQL operations in Python syntax. Knowing what SQL is being generated (use echo=True) helps you debug slow queries and understand why certain ORM patterns cause performance problems like the N+1 issue.
When should I use SQLAlchemy instead of a simpler library like sqlite3?
Use SQLAlchemy when your application has multiple related tables, when you want the option to switch databases without rewriting queries, or when your codebase will grow beyond a handful of queries. For a quick one-off script that reads a single table, sqlite3 is perfectly fine. The moment you're modeling relationships between entities or building something that will be maintained over time, SQLAlchemy pays for its learning curve quickly.
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.