Bidirectional relationships need both sides set via a helper method
@OneToMany defaults to LAZY; accessing it outside a transaction throws LazyInitializationException
N+1 queries occur when lazy collections are accessed in a loop; fix with JOIN FETCH
First-level cache is per EntityManager; second-level is shared across sessions
What is JPA?
JPA (Jakarta Persistence API, formerly Java Persistence API) is a specification for object-relational mapping (ORM) in Java, standardizing how you map Java objects to database tables and persist them without writing raw SQL. It solves the impedance mismatch between object-oriented domain models and relational databases, letting you work with entities, relationships, and queries in a type-safe, database-agnostic way.
Under the hood, JPA delegates to an implementation like Hibernate (the de facto standard and by far the most widely used provider) or EclipseLink, which generates SQL, manages connections, and handles caching. You'd use JPA when you want rapid development with complex object graphs and don't need fine-grained SQL control, but you must understand its hidden costs, especially the infamous N+1 query problem, where a single findAll on 100 orders can silently issue 101 SQL statements (1 for the orders plus 100 for the line items) if you lazily load each order's line items.
Alternatives include raw JDBC (full control, no magic), jOOQ (type-safe SQL with explicit joins), or MyBatis (SQL mapping, less abstraction). Don't use JPA for read-heavy analytics, massive batch operations, or when your team lacks ORM expertise — the convenience tax is real.
Plain-English First
Imagine you have a filing cabinet full of paper forms (your database), but your job requires working with sticky notes on a whiteboard (Java objects). JPA is the assistant who automatically transfers information between the two without you having to manually copy each field. You work with your sticky notes, and JPA keeps the filing cabinet in sync. That's it — it's a translation layer between your Java world and your database world.
JPA isn't an ORM. It's a spec that forces you to think in objects while your database thinks in sets. Ignore that friction, as most do, and you'll ship a system that runs fine on three test rows, then collapses under production load with N+1 queries, stale data, and deadlock timeouts. Master JPA properly, and you get transactional persistence, queries that are validated at startup or compile time instead of in production, and a consistency model that survives concurrent writes.
Why JPA's Convenience Hides a Query Bomb
JPA (Jakarta Persistence API) is a Java specification for object-relational mapping (ORM) that lets you work with relational data using plain Java objects. The core mechanic: you define entities (Java classes) mapped to database tables, and the JPA provider (typically Hibernate) translates your object operations into SQL queries. This abstraction eliminates boilerplate JDBC code but introduces a critical hidden cost: the N+1 query problem.
In practice, JPA uses lazy loading by default for collections. When you fetch a parent entity (e.g., 100 orders), each child collection (e.g., line items) is fetched on demand via a separate SQL query. Accessing order.getItems() inside a loop triggers 100 additional queries: 1 query for the parents plus N queries for the children. That's 101 queries for 100 orders, and 1,001 for 1,000, turning a simple page load into a database meltdown.
Use JPA when your application has clear entity relationships and you need rapid development. But never trust default fetch strategies in production. Always profile query counts, use JOIN FETCH or @EntityGraph for read paths, and batch collections with @BatchSize. The abstraction is a tool, not a shield — you must understand the SQL it generates.
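The query arithmetic is easy to check with a plain-Java simulation. This is only a sketch: it does not use JPA at all, and `FakeDatabase`, `loadOrderIds`, `loadItemsFor`, and `loadOrdersWithItems` are hypothetical stand-ins that do nothing but count round trips.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a database: it only counts round trips.
class FakeDatabase {
    int queryCount = 0;
    private final int orderCount;

    FakeDatabase(int orderCount) {
        this.orderCount = orderCount;
    }

    // One query returning all order ids (the "1" in N+1).
    List<Integer> loadOrderIds() {
        queryCount++;
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= orderCount; i++) {
            ids.add(i);
        }
        return ids;
    }

    // One query per order: roughly what a lazy collection access costs.
    List<String> loadItemsFor(int orderId) {
        queryCount++;
        List<String> items = new ArrayList<>();
        items.add("item-of-order-" + orderId);
        return items;
    }

    // One query for orders and items together: roughly what JOIN FETCH costs.
    Map<Integer, List<String>> loadOrdersWithItems() {
        queryCount++;
        Map<Integer, List<String>> result = new HashMap<>();
        for (int id = 1; id <= orderCount; id++) {
            List<String> items = new ArrayList<>();
            items.add("item-of-order-" + id);
            result.put(id, items);
        }
        return result;
    }
}

class NPlusOneCount {
    // Load the parents, then touch each child collection in a loop.
    static int lazyLoop(int orders) {
        FakeDatabase db = new FakeDatabase(orders);
        for (int id : db.loadOrderIds()) {
            db.loadItemsFor(id); // one extra round trip per order
        }
        return db.queryCount;
    }

    // Load parents and children in a single round trip.
    static int singleFetch(int orders) {
        FakeDatabase db = new FakeDatabase(orders);
        db.loadOrdersWithItems();
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy loop:    " + lazyLoop(100) + " queries");    // 101
        System.out.println("single fetch: " + singleFetch(100) + " queries"); // 1
    }
}
```

The point of the sketch is the growth rate, not the absolute numbers: the lazy loop scales with the number of parents, the single fetch does not.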
Lazy Loading Is Not Free
Lazy loading defers queries, not costs. Each lazy access fires a new SQL statement — in a loop, that's O(n) queries instead of O(1).
Production Insight
Teams migrating from JDBC to JPA often see REST endpoints that returned in 50ms suddenly taking 5 seconds after adding a single @OneToMany relationship.
The exact symptom: a single HTTP request generates hundreds of SELECT queries visible in slow query logs, with database connection pool exhaustion under moderate load.
Rule of thumb: if you see more than 5 SQL queries for a single business operation, you have an N+1 — fix it before it hits production.
Key Takeaway
JPA is an ORM specification, not a query optimizer — you must audit the SQL it generates.
The N+1 problem is the #1 performance killer in JPA applications; always verify with logging or a tool like datasource-proxy.
Use JOIN FETCH, @EntityGraph, or DTO projections for any read path that returns a list of entities with associations.
[Figure: JPA N+1 query disaster (thecodeforge.io)]
The Persistence Context: The One Concept That Unlocks Everything
Most JPA tutorials throw annotations at you immediately. That's backwards. Before you write a single @Entity, you need to understand the persistence context — because every confusing JPA behaviour you'll ever encounter traces back to it.
Think of the persistence context as a short-lived, in-memory snapshot of your database. It's a first-level cache managed by the EntityManager. Any entity you load, persist, or merge within the same EntityManager instance is tracked. JPA watches those objects. The moment you change a field — even without calling any save method — JPA will automatically flush that change to the database at the right moment. This is called 'dirty checking'.
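To make dirty checking less magical, here is a toy version of the idea in plain Java. It is a sketch under assumptions, not how any provider is actually implemented: `ToyPersistenceContext`, `TrackedProduct`, `manage`, and `flush` are hypothetical names, and the only behaviour modelled is "remember the state at load time, compare at flush time".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity: a plain object with one mutable field.
class TrackedProduct {
    final long id;
    double price;

    TrackedProduct(long id, double price) {
        this.id = id;
        this.price = price;
    }
}

// Toy persistence context: remembers the state each entity had when it
// became managed, then compares it against the current state at flush time.
class ToyPersistenceContext {
    private final Map<TrackedProduct, Double> snapshots = new HashMap<>();

    void manage(TrackedProduct p) {
        snapshots.put(p, p.price); // snapshot taken when the entity becomes managed
    }

    // Returns the UPDATE statements a flush would need to issue.
    List<String> flush() {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<TrackedProduct, Double> e : snapshots.entrySet()) {
            TrackedProduct p = e.getKey();
            if (Double.compare(p.price, e.getValue()) != 0) {
                statements.add("UPDATE product SET price=" + p.price + " WHERE id=" + p.id);
                e.setValue(p.price); // the entity is clean again after the flush
            }
        }
        return statements;
    }
}

class DirtyCheckingSketch {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        TrackedProduct keyboard = new TrackedProduct(1L, 149.99);
        ctx.manage(keyboard);

        keyboard.price = 129.99;         // a plain field change, no save call
        System.out.println(ctx.flush()); // one UPDATE statement
        System.out.println(ctx.flush()); // empty: nothing changed since the last flush
    }
}
```

Notice that nothing in the calling code asks for an UPDATE. The change is discovered by comparison, which is exactly why an accidental field modification inside a transaction ends up in the database.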
This matters enormously. In Spring, the persistence context is bound to the transaction by default: each @Transactional call gets its own. (Spring Boot web applications also enable Open Session in View by default, which keeps the context open for the whole HTTP request.) Load an Order object, change its status, and JPA will write the UPDATE for you when the transaction commits. No save() call required. This feels like magic until something updates unexpectedly, and then it's a nightmare to debug if you didn't know this was happening.
Entities can be in one of four states: Transient (new object, JPA doesn't know about it), Managed (inside the persistence context, being tracked), Detached (was managed, transaction ended), or Removed (scheduled for deletion). Knowing which state your object is in is the difference between confident JPA usage and guesswork.
PersistenceContextDemo.java (Java)
import jakarta.persistence.EntityManager;
import jakarta.persistence.EntityManagerFactory;
import jakarta.persistence.Persistence;

// A minimal runnable JPA example using a persistence.xml
// Works with Hibernate 6+ and H2 in-memory database
// Assumes a Product entity (not shown) with an IDENTITY-generated Long id, a name, and a price.
public class PersistenceContextDemo {

    public static void main(String[] args) {
        // Bootstrap JPA (in Spring Boot this happens automatically)
        EntityManagerFactory emFactory =
                Persistence.createEntityManagerFactory("demo-unit");
        EntityManager em = emFactory.createEntityManager();
        em.getTransaction().begin();

        // --- STATE 1: TRANSIENT ---
        // 'newProduct' is just a regular Java object. JPA has no idea it exists.
        Product newProduct = new Product();
        newProduct.setName("Mechanical Keyboard");
        newProduct.setPrice(149.99);
        System.out.println("State: TRANSIENT — id is null: " + newProduct.getId());

        // --- STATE 2: MANAGED ---
        // persist() hands the object to the persistence context.
        // JPA now tracks every field change on 'newProduct'.
        em.persist(newProduct);
        System.out.println("State: MANAGED — id assigned: " + newProduct.getId());

        // Dirty checking in action: we change a field WITHOUT calling any save method.
        // JPA will detect this change and generate an UPDATE automatically on commit.
        newProduct.setPrice(129.99); // <-- no em.save(), no em.update() needed
        System.out.println("Price changed to 129.99 — JPA will auto-flush this on commit");

        // Flush happens here: the pending UPDATE is sent to the DB
        // (the INSERT too, if the provider has not already executed it at persist time).
        em.getTransaction().commit();

        // --- STATE 3: DETACHED ---
        // After the EntityManager is closed, the entity is still in memory
        // but JPA is no longer tracking it.
        em.close(); // closing the EntityManager detaches all entities
        System.out.println("State: DETACHED — object still in memory, but JPA ignores changes");

        // Changing a detached entity does NOT touch the database
        newProduct.setPrice(99.99); // silently ignored by JPA
        System.out.println("Price changed to 99.99 — but the DB still shows 129.99!");

        // To persist changes on a detached entity, you must merge() it
        // in a new EntityManager session:
        EntityManager em2 = emFactory.createEntityManager();
        em2.getTransaction().begin();
        // merge() copies the detached state onto a managed instance and returns that instance
        Product reattached = em2.merge(newProduct);
        em2.getTransaction().commit();
        System.out.println("After merge and commit — DB now shows: " + reattached.getPrice());

        em2.close();
        emFactory.close();
    }
}
Output
State: TRANSIENT — id is null: null
State: MANAGED — id assigned: 1
Price changed to 129.99 — JPA will auto-flush this on commit
State: DETACHED — object still in memory, but JPA ignores changes
Price changed to 99.99 — but the DB still shows 129.99!
After merge and commit — DB now shows: 99.99
Watch Out: Silent Updates
In a @Transactional Spring method, loading an entity and modifying it generates an UPDATE on commit, even if you never call save(). This surprises developers who add a logging field or increment a counter inside a method they think of as read-only. Annotate genuinely read-only methods with @Transactional(readOnly = true): with Hibernate, Spring then switches the session to manual flush mode, so no dirty-checking flush runs at commit, improving both correctness and performance.
Production Insight
Silent updates from dirty checking caused a production outage when a background job incremented a 'version' field on every entity it touched.
Read-only methods must be marked @Transactional(readOnly=true) to prevent accidental changes from flushing.
Rule: if you're not writing data, disable dirty checking explicitly.
Key Takeaway
The persistence context is the core of JPA.
Managed entities auto-track changes via dirty checking.
Always know your entity state: transient, managed, detached, or removed.
Mapping Real-World Relationships: @OneToMany, @ManyToOne, and the Ownership Rule
Relationships are where JPA gets genuinely powerful — and genuinely tricky. Let's use a real domain: an e-commerce order system. An Order has many OrderItems. An OrderItem belongs to one Order. That's a classic bidirectional @OneToMany / @ManyToOne.
The single most important concept here is the 'owning side'. In a bidirectional relationship, exactly one side must be the owner. The owner is the side that holds the foreign key column in the database. In a One-To-Many, the 'many' side (@ManyToOne) is ALWAYS the owner. This matters because JPA only looks at the owning side to decide what to write to the database. If you only update the 'mappedBy' side (the @OneToMany list) without setting the @ManyToOne reference, JPA writes nothing. This is one of the most common bugs in JPA code.
Fetch strategy is the other critical decision. @OneToMany defaults to LAZY loading — the list of items isn't fetched until you access it. @ManyToOne defaults to EAGER — the parent Order is fetched immediately. Changing these defaults without understanding the impact causes either N+1 query problems (too many small queries) or Cartesian product problems (one massive query that multiplies rows).
The golden rule: model relationships on both sides for object graph consistency, always set both sides in a helper method, and let the owning side drive persistence.
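The ownership rule is easier to believe once you see it fail. The following plain-Java sketch models only the one fact that matters: the foreign key value is derived from the owning side and nothing else. All names here (`SketchOrder`, `SketchItem`, `foreignKeyFor`) are hypothetical, and no JPA provider is involved.

```java
import java.util.ArrayList;
import java.util.List;

class SketchOrder {
    final long id;
    final List<SketchItem> items = new ArrayList<>(); // inverse ("mappedBy") side

    SketchOrder(long id) {
        this.id = id;
    }

    // Helper that keeps both sides of the relationship in sync.
    void addItem(SketchItem item) {
        items.add(item);
        item.order = this;
    }
}

class SketchItem {
    final String sku;
    SketchOrder order; // owning side: the only place the foreign key comes from

    SketchItem(String sku) {
        this.sku = sku;
    }
}

class OwningSideSketch {
    // Mimics what would be written to the order_id column: only item.order
    // is consulted; membership in order.items is never looked at.
    static String foreignKeyFor(SketchItem item) {
        return item.order == null ? "NULL" : String.valueOf(item.order.id);
    }

    public static void main(String[] args) {
        SketchOrder order = new SketchOrder(42L);

        SketchItem forgotten = new SketchItem("KB-MX-RED");
        order.items.add(forgotten); // inverse side only: back reference never set
        System.out.println(forgotten.sku + " order_id=" + foreignKeyFor(forgotten)); // NULL

        SketchItem linked = new SketchItem("MP-XL-BLK");
        order.addItem(linked);      // helper sets both sides
        System.out.println(linked.sku + " order_id=" + foreignKeyFor(linked));       // 42
    }
}
```

Both items sit in the order's list, so the in-memory object graph looks fine in either case. Only the second one would get a usable foreign key, which is why the bug tends to surface late.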
OrderEntityRelationship.java (Java)
import jakarta.persistence.*;
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// ─── Order.java ───────────────────────────────────────────────
@Entity
@Table(name = "orders") // 'order' is a reserved SQL keyword: always quote or rename
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false)
    private String customerEmail;

    @Column(nullable = false)
    private LocalDateTime placedAt;

    // mappedBy = "order" means: "the 'order' field on OrderItem owns this relationship."
    // cascade = PERSIST, MERGE: saving/updating an Order auto-saves its items.
    // orphanRemoval = true: removing an item from this list deletes it from the DB.
    @OneToMany(
        mappedBy = "order",
        cascade = {CascadeType.PERSIST, CascadeType.MERGE},
        orphanRemoval = true,
        fetch = FetchType.LAZY // default, explicitly written here for clarity
    )
    private List<OrderItem> items = new ArrayList<>();

    // ── Helper methods to keep BOTH sides of the relationship in sync ──
    // This is the pattern senior devs use. Never call items.add() directly.
    public void addItem(OrderItem item) {
        items.add(item);     // update the 'one' side (in-memory list)
        item.setOrder(this); // update the 'many' side (the foreign key owner)
    }

    public void removeItem(OrderItem item) {
        items.remove(item);
        item.setOrder(null); // orphanRemoval will delete it from the DB
    }

    // Read-only view: prevents callers from bypassing addItem()
    public List<OrderItem> getItems() {
        return Collections.unmodifiableList(items);
    }

    // getters / setters
    public Long getId() { return id; }
    public String getCustomerEmail() { return customerEmail; }
    public void setCustomerEmail(String customerEmail) { this.customerEmail = customerEmail; }
    public LocalDateTime getPlacedAt() { return placedAt; }
    public void setPlacedAt(LocalDateTime placedAt) { this.placedAt = placedAt; }
}

// ─── OrderItem.java ───────────────────────────────────────────
@Entity
@Table(name = "order_items")
public class OrderItem {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // @ManyToOne is the OWNING side: it holds the foreign key column 'order_id'.
    // EAGER is the default for @ManyToOne; it is overridden to LAZY here
    // to avoid unnecessary joins.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "order_id", nullable = false) // defines the FK column name
    private Order order;

    @Column(nullable = false)
    private String productSku;

    @Column(nullable = false, precision = 10, scale = 2)
    private BigDecimal unitPrice;

    @Column(nullable = false)
    private int quantity;

    // getters / setters
    public Long getId() { return id; }
    public Order getOrder() { return order; }
    public void setOrder(Order order) { this.order = order; }
    public String getProductSku() { return productSku; }
    public void setProductSku(String productSku) { this.productSku = productSku; }
    public BigDecimal getUnitPrice() { return unitPrice; }
    public void setUnitPrice(BigDecimal unitPrice) { this.unitPrice = unitPrice; }
    public int getQuantity() { return quantity; }
    public void setQuantity(int quantity) { this.quantity = quantity; }
}

// ─── Usage example (inside a @Transactional service) ──────────
public class OrderService {

    private final EntityManager em;

    public OrderService(EntityManager em) {
        this.em = em;
    }

    public Order createOrder(String customerEmail) {
        Order order = new Order();
        order.setCustomerEmail(customerEmail);
        order.setPlacedAt(LocalDateTime.now());

        OrderItem keyboard = new OrderItem();
        keyboard.setProductSku("KB-MX-RED");
        keyboard.setUnitPrice(new BigDecimal("149.99"));
        keyboard.setQuantity(1);

        OrderItem mousepad = new OrderItem();
        mousepad.setProductSku("MP-XL-BLK");
        mousepad.setUnitPrice(new BigDecimal("29.99"));
        mousepad.setQuantity(2);

        // Using the helper method: both sides stay consistent
        order.addItem(keyboard);
        order.addItem(mousepad);

        // cascade PERSIST means JPA will also INSERT both OrderItems
        em.persist(order);

        System.out.println("Order created with id: " + order.getId());
        System.out.println("Items count: " + order.getItems().size());
        return order;
    }
}
Output
Hibernate: insert into orders (customer_email, placed_at) values (?, ?)
Pro Tip: Always Override equals() and hashCode() on Entities
Use the database ID for equals/hashCode, but guard against null IDs (transient state). The safest pattern: use instanceof checks and only compare by id if both ids are non-null, otherwise fall back to object identity. Using Lombok's @EqualsAndHashCode without thought will break Set-based collections when entities transition from transient to managed state — the hashCode changes as the id goes from null to a value.
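Here is one way that advice can look in code. It is a sketch, not the only valid pattern: `SketchEntity` is a hypothetical stand-in with no JPA annotations so it runs on its own, and the constant per-class `hashCode()` is an assumption added here (a commonly used way to keep the hash stable while the id goes from null to a value), not something the text above prescribes.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for an entity; in real code 'id' would carry @Id and @GeneratedValue.
class SketchEntity {
    private Long id; // null while the entity is transient

    Long getId() {
        return id;
    }

    void setId(Long id) { // simulates the id being assigned on persist
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true; // object identity covers the transient case
        }
        if (!(other instanceof SketchEntity)) {
            return false;
        }
        SketchEntity that = (SketchEntity) other;
        // Two different instances are equal only when this id is set and matches.
        return id != null && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        // Constant per class, so the hash does not move when the id is assigned.
        return SketchEntity.class.hashCode();
    }
}

class EntityEqualitySketch {
    public static void main(String[] args) {
        Set<SketchEntity> set = new HashSet<>();
        SketchEntity entity = new SketchEntity();
        set.add(entity);   // added while transient (id == null)
        entity.setId(1L);  // id assigned later, as after persist
        System.out.println(set.contains(entity)); // true: still found in its bucket
    }
}
```

The trade-off is that every instance of the class lands in the same hash bucket, so very large sets of one entity type lose some lookup efficiency. For typical association sizes that cost is usually small compared with the bugs a moving hash code causes.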
Production Insight
A common production bug: adding items to an order without setting the back reference.
The foreign key column stays null because JPA only persists the owning side.
Fix: always use a helper method that calls item.setOrder(this) alongside items.add(item).
Key Takeaway
In bidirectional relationships, the @ManyToOne side owns the foreign key.
Always set both sides of the relationship.
A helper method is the only safe way to add child entities.
Querying with JPQL and the N+1 Problem You Must Know How to Spot
JPQL (Java Persistence Query Language) lets you write queries against your entity model instead of your database tables. That's the key difference from SQL — you write FROM Order o, not FROM orders o. JPA translates it. This means your queries stay valid even if you rename a column, as long as you update the entity mapping.
But the query you write isn't always the query JPA executes. This gap is where the infamous N+1 problem lives. It happens when you fetch a list of N entities (1 query), and then as you iterate and access a lazy collection on each one, JPA fires an additional query per entity (N queries). Fetch 50 orders and touch each order's items — you've just fired 51 database round trips instead of 1.
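The fix is JOIN FETCH. It tells JPA to retrieve the parent and its collection in a single JOIN query. But JOIN FETCH has its own trap: if you join-fetch multiple collections at once, you get a Cartesian product in the result set (and Hibernate refuses outright to fetch two List-typed bags together, throwing MultipleBagFetchException). An @EntityGraph that names several collections produces the same joined query, so it does not avoid the problem. The safe pattern for multi-collection fetches is to join-fetch one collection and load the others with separate queries or batch fetching.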
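The Cartesian product is just multiplication, and a few lines of arithmetic show why it hurts. This sketch assumes every order has the same number of rows in each collection, and it invents a second collection, "shipments", purely for illustration; the domain model in this article only has items.

```java
class CartesianProductSketch {
    // Rows returned when ONE query joins both collections of each parent:
    // every item row is paired with every shipment row of the same order.
    static long rowsForCombinedJoin(int orders, int itemsPerOrder, int shipmentsPerOrder) {
        return (long) orders * itemsPerOrder * shipmentsPerOrder;
    }

    // Rows returned when each collection is fetched by its own query.
    static long rowsForSeparateQueries(int orders, int itemsPerOrder, int shipmentsPerOrder) {
        return (long) orders * itemsPerOrder + (long) orders * shipmentsPerOrder;
    }

    public static void main(String[] args) {
        // 50 orders, 10 items each, 5 (hypothetical) shipments each
        System.out.println(rowsForCombinedJoin(50, 10, 5));    // 2500
        System.out.println(rowsForSeparateQueries(50, 10, 5)); // 750
    }
}
```

Two queries moving 750 rows beat one query moving 2,500, and the gap widens with every additional collection you join.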
For complex reporting queries where you don't need full entity hydration, use JPQL projections or constructor expressions — they fetch only the columns you need and skip the overhead of building full entity objects.
OrderRepository.java (Java)
import jakarta.persistence.*;
import java.util.List;

public class OrderRepository {

    private final EntityManager em;

    public OrderRepository(EntityManager em) {
        this.em = em;
    }

    // ─── PROBLEM: N+1 Query ───────────────────────────────────────
    // This loads all orders in 1 query.
    // But the moment we call order.getItems() in a loop, Hibernate fires
    // a separate SELECT for each order's items. 50 orders = 51 queries.
    public void demonstrateNPlusOne() {
        List<Order> orders = em.createQuery("SELECT o FROM Order o", Order.class)
                .getResultList();
        // This loop is the trap: each getItems() access hits the database
        for (Order order : orders) {
            System.out.println(order.getCustomerEmail()
                    + " — items: " + order.getItems().size()); // N queries fired here
        }
    }

    // ─── SOLUTION 1: JOIN FETCH ────────────────────────────────────
    // Fetches orders AND their items in a single SQL JOIN.
    // Use DISTINCT to prevent duplicate Order objects from the join result set.
    public List<Order> findAllOrdersWithItems() {
        return em.createQuery(
                "SELECT DISTINCT o FROM Order o JOIN FETCH o.items",
                Order.class
        ).getResultList();
        // Generated SQL (roughly): SELECT DISTINCT o.*, oi.* FROM orders o
        //                          INNER JOIN order_items oi ON oi.order_id = o.id
    }

    // ─── SOLUTION 2: Entity graph ──────────────────────────────────
    // Spring Data JPA exposes this as @EntityGraph on repository methods.
    // Shown here with a graph built programmatically on the plain EntityManager API.
    public List<Order> findOrdersWithItemsViaEntityGraph() {
        EntityGraph<Order> graph = em.createEntityGraph(Order.class);
        graph.addAttributeNodes("items"); // tell JPA to eagerly load 'items'
        return em.createQuery("SELECT o FROM Order o", Order.class)
                .setHint("jakarta.persistence.fetchgraph", graph)
                .getResultList();
    }

    // ─── JPQL Projection: fetch only what you need ─────────────────
    // For a summary dashboard, you don't need full Order objects.
    // A DTO projection is faster: no entity tracking overhead.
    public List<OrderSummary> findOrderSummaries() {
        return em.createQuery(
                // Constructor expression: JPA calls new OrderSummary(email, count, total).
                // Assumes OrderSummary lives in the package com.example.
                "SELECT new com.example.OrderSummary(o.customerEmail, COUNT(i), SUM(i.unitPrice * i.quantity)) "
                        + "FROM Order o JOIN o.items i "
                        + "GROUP BY o.customerEmail",
                OrderSummary.class
        ).getResultList();
    }

    // ─── Named Query (defined on the entity with @NamedQuery) ───────
    // Validated at startup: typos fail fast, not at runtime.
    // On Order entity: @NamedQuery(name="Order.findByEmail",
    //     query="SELECT o FROM Order o WHERE o.customerEmail = :email")
    public List<Order> findByCustomerEmail(String email) {
        return em.createNamedQuery("Order.findByEmail", Order.class)
                .setParameter("email", email) // always bind parameters: prevents SQL injection
                .getResultList();
    }
}

// ─── DTO for projection queries ────────────────────────────────
class OrderSummary {

    private final String customerEmail;
    private final long itemCount;
    private final java.math.BigDecimal totalValue;

    // JPA calls this constructor via the JPQL constructor expression
    public OrderSummary(String customerEmail, long itemCount, java.math.BigDecimal totalValue) {
        this.customerEmail = customerEmail;
        this.itemCount = itemCount;
        this.totalValue = totalValue;
    }

    @Override
    public String toString() {
        return customerEmail + " | Items: " + itemCount + " | Total: $" + totalValue;
    }
}
Output
// demonstrateNPlusOne() with 3 orders — Hibernate log shows:
Hibernate: select o1_0.id, o1_0.customer_email, o1_0.placed_at from orders o1_0
Hibernate: select items0_.order_id ... from order_items where order_id=1
Hibernate: select items0_.order_id ... from order_items where order_id=2
Hibernate: select items0_.order_id ... from order_items where order_id=3
// findAllOrdersWithItems(): a single query
Hibernate: select ... from orders o1_0 join order_items i1_0 on i1_0.order_id=o1_0.id
// findOrderSummaries() output:
jane@example.com | Items: 2 | Total: $209.97
bob@example.com | Items: 1 | Total: $29.99
Interview Gold: N+1 Is Always About Lazy Loading in a Loop
When interviewers ask about JPA performance, N+1 is the answer they're fishing for 80% of the time. Know how to spot it (enable Hibernate SQL logging with spring.jpa.show-sql=true and count the SELECT statements), and know the three fixes: JOIN FETCH for single collections, @EntityGraph for flexibility, and separate queries or batch fetching (hibernate.default_batch_fetch_size=25) for multiple collections.
Production Insight
N+1 queries are the #1 JPA performance killer in production.
Always enable SQL logging in dev to see real query count.
If you see 1 + N queries, apply JOIN FETCH or @EntityGraph immediately.
Key Takeaway
N+1 happens when you access a lazy collection in a loop.
Count SQL statements to detect it.
Fix with JOIN FETCH or @EntityGraph (single collection), or with batch fetching and separate queries (multiple collections).
Caching: First-Level and Second-Level
JPA defines two caching layers. The first-level cache is tied to the EntityManager (persistence context). Every entity you load or persist within a transaction is stored in this cache. Subsequent lookups of the same entity by primary key within the same transaction avoid a database round trip. This cache is always enabled and you can't disable it.
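A first-level cache is essentially an identity map keyed by primary key. The toy below shows that behaviour in isolation. It is a plain-Java sketch with hypothetical names (`CountingStore`, `ToyEntityManager`), not a model of any provider's internals.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the database: counts how often it is actually asked.
class CountingStore {
    int hits = 0;

    String load(long id) {
        hits++;
        return "Product#" + id; // a fresh object per load, like a freshly mapped row
    }
}

// Toy EntityManager: an identity map keyed by primary key.
class ToyEntityManager {
    private final Map<Long, String> firstLevelCache = new HashMap<>();
    private final CountingStore store;

    ToyEntityManager(CountingStore store) {
        this.store = store;
    }

    String find(long id) {
        String cached = firstLevelCache.get(id);
        if (cached != null) {
            return cached;              // served from the persistence context
        }
        String loaded = store.load(id); // only reached on the first lookup
        firstLevelCache.put(id, loaded);
        return loaded;
    }
}

class FirstLevelCacheSketch {
    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        ToyEntityManager em1 = new ToyEntityManager(store);

        String a = em1.find(1L);
        String b = em1.find(1L);
        System.out.println((a == b) + ", hits=" + store.hits); // true, hits=1

        ToyEntityManager em2 = new ToyEntityManager(store);
        em2.find(1L);
        System.out.println("hits=" + store.hits);              // hits=2
    }
}
```

The second EntityManager asks the store again because each one has its own map. That is the gap a second-level cache is meant to fill.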
The second-level cache is optional and shared across EntityManager instances. When enabled, entities loaded in one session are cached so the next session can retrieve them without hitting the database. This is useful for reference data that rarely changes (country codes, product categories). The second-level cache must be explicitly configured and needs a cache provider: Ehcache (used in the example below) or Caffeine on a single node, or a distributed option such as Hazelcast, Infinispan, or Redis when several application nodes share data.
A common mistake is assuming the second-level cache will work without configuring the cache provider and without enabling cacheable on entities. Even if you add @Cacheable, Hibernate requires a cache region configuration. Without it, the annotation is silently ignored.
Cache invalidation is another trap. When you update an entity directly via SQL or via another application, the second-level cache becomes stale. Use a cache TTL or trigger a manual eviction using the EntityManagerFactory cache API.
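The staleness trap can be reproduced in a few lines. This plain-Java sketch uses hypothetical names (`ToyTable`, `ToySharedCache`) and models only one thing: a shared cache that is not told about an out-of-band update keeps serving the old value until the entry is evicted.

```java
import java.util.HashMap;
import java.util.Map;

// Pretend database table: id -> display name.
class ToyTable {
    final Map<Long, String> rows = new HashMap<>();
    int reads = 0;

    String read(long id) {
        reads++;
        return rows.get(id);
    }
}

// Toy shared cache that sits in front of the table for every "session".
class ToySharedCache {
    private final Map<Long, String> entries = new HashMap<>();
    private final ToyTable table;

    ToySharedCache(ToyTable table) {
        this.table = table;
    }

    String get(long id) {
        // Reads the table only when there is no cached entry for this id.
        return entries.computeIfAbsent(id, table::read);
    }

    void evict(long id) {
        entries.remove(id);
    }
}

class StaleCacheSketch {
    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        table.rows.put(1L, "Books");
        ToySharedCache cache = new ToySharedCache(table);

        System.out.println(cache.get(1L));   // Books (loaded from the table)
        table.rows.put(1L, "Books & Media"); // out-of-band update, cache not told
        System.out.println(cache.get(1L));   // Books (stale)
        cache.evict(1L);
        System.out.println(cache.get(1L));   // Books & Media (fresh after eviction)
    }
}
```

A TTL does the same job as the explicit `evict` call, just on a timer: it bounds how long the stale value can survive.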
SecondLevelCacheConfig.java (Java)
// ─── Step 1: Add Hibernate caching dependencies (Maven) ───────
// <dependency>
//   <groupId>org.hibernate.orm</groupId>
//   <artifactId>hibernate-jcache</artifactId>
// </dependency>
// <dependency>
//   <groupId>org.ehcache</groupId>
//   <artifactId>ehcache</artifactId>
//   <classifier>jakarta</classifier>
// </dependency>

// ─── Step 2: Configure in application.properties ──────────────
// spring.jpa.properties.hibernate.cache.use_second_level_cache=true
// spring.jpa.properties.hibernate.cache.region.factory_class=org.hibernate.cache.jcache.internal.JCacheRegionFactory
// spring.jpa.properties.hibernate.javax.cache.provider=org.ehcache.jsr107.EhcacheCachingProvider

// ─── Step 3: Enable caching on an entity ───────────────────────
import jakarta.persistence.*;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY) // for reference data
public class ProductCategory {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(unique = true, nullable = false)
    private String code;

    @Column(nullable = false)
    private String displayName;

    // getters and setters
}

// ─── Step 4: Manual cache eviction ─────────────────────────────
import jakarta.persistence.EntityManagerFactory;
import org.hibernate.SessionFactory;
import org.springframework.stereotype.Service;

@Service
public class CacheService {

    private final EntityManagerFactory emf;

    public CacheService(EntityManagerFactory emf) {
        this.emf = emf;
    }

    public void evictAllSecondLevelCache() {
        emf.getCache().evictAll();
    }

    // The standard JPA Cache API evicts by entity class (or class + id)
    public void evictEntity(Class<?> entityClass) {
        emf.getCache().evict(entityClass);
    }

    // Evicting by region name is Hibernate-specific.
    // Region name is typically the fully qualified entity class name.
    public void evictRegion(String regionName) {
        emf.unwrap(SessionFactory.class).getCache().evictRegion(regionName);
    }
}
Output
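// First lookup by id, e.g. em.find(ProductCategory.class, 1L): SQL is issued
Hibernate: select pc1_0.id,pc1_0.code,pc1_0.display_name from product_category pc1_0 where pc1_0.id=?
// Second lookup of the same id from a new EntityManager: no SQL is logged, the entity is served from the second-level cache.
// Note: the entity cache is keyed by id. Queries on other columns (such as code) still hit the database unless the query cache is enabled too.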
Cache Strategy Selection
Use READ_ONLY for immutable reference data, NONSTRICT_READ_WRITE for data that changes rarely and tolerates brief staleness, READ_WRITE for read-mostly data that is occasionally updated and must stay consistent (it uses soft locks on cache entries), and TRANSACTIONAL for strict consistency (requires JTA and a transaction-aware cache provider).
Production Insight
Second-level cache can mask database performance issues and cause stale data bugs.
Always configure a TTL for cache regions to force periodic refreshes.
For distributed systems, use a replicated or distributed cache (Redis, Hazelcast) to avoid stale reads across nodes.
Key Takeaway
First-level cache is per-session and always on.
Second-level cache requires explicit configuration and a cache provider.
Use READ_ONLY for reference data; evict on updates.
Transaction Management and Isolation Levels
JPA transactions are managed through the EntityTransaction API or declaratively via @Transactional. In Spring, @Transactional opens a transaction before the method starts and commits (or rolls back) after it returns. The propagation and isolation level behaviours are defined on this annotation.
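The isolation level determines how concurrent transactions see each other's changes. Spring's default (Isolation.DEFAULT) simply uses whatever the database is configured with: READ_COMMITTED on PostgreSQL, Oracle, and SQL Server, but REPEATABLE_READ on MySQL InnoDB. READ_COMMITTED prevents dirty reads but allows non-repeatable reads and phantom reads. REPEATABLE_READ also prevents non-repeatable reads (phantom reads remain possible under the SQL standard) but can cause more deadlocks. SERIALIZABLE is the safest but has the worst concurrency. Choosing the wrong isolation level leads to data consistency bugs that are hard to reproduce.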
Another important concept is transaction propagation. REQUIRED (default) joins an existing transaction or creates a new one. REQUIRES_NEW suspends the current transaction and creates a new one — useful for audit logging where you want to commit independently. NESTED uses savepoints (if supported) to allow partial rollbacks.
A common pitfall is calling a @Transactional method from within the same class. Spring's AOP proxies don't intercept internal calls, so the annotation on the called method is ignored. The method runs inside whatever transaction the caller already has, or with no transaction at all if the caller has none.
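You can watch the self-invocation problem happen with nothing but the JDK. The sketch below uses a `java.lang.reflect.Proxy` as a stand-in for Spring's transactional proxy; `ShippingService`, `ShippingServiceImpl`, and `proxyOf` are hypothetical names, and the handler only records which calls it intercepts instead of opening a real transaction.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface ShippingService {
    void shipAll();
    void shipOne(long orderId);
}

class ShippingServiceImpl implements ShippingService {
    @Override
    public void shipAll() {
        shipOne(1L); // really 'this.shipOne(...)': goes to the target, not the proxy
    }

    @Override
    public void shipOne(long orderId) {
        // business logic would go here
    }
}

class SelfInvocationSketch {
    // Wraps the target in a JDK dynamic proxy that records each intercepted call,
    // the way a transactional proxy would begin a transaction around it.
    static ShippingService proxyOf(ShippingService target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            intercepted.add(method.getName());
            return method.invoke(target, methodArgs);
        };
        return (ShippingService) Proxy.newProxyInstance(
                ShippingService.class.getClassLoader(),
                new Class<?>[] { ShippingService.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> intercepted = new ArrayList<>();
        ShippingService service = proxyOf(new ShippingServiceImpl(), intercepted);

        service.shipAll();
        System.out.println(intercepted); // [shipAll]
    }
}
```

Only `shipAll` shows up in the list. The nested `shipOne` call never passes through the handler, so any behaviour the proxy was supposed to add around it, such as a new transaction, never happens.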
TransactionConfig.java (Java)
import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import java.util.List;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Propagation;

// Assumes Order has a 'status' property and that an AuditLog entity exists (neither is shown).
@Service
public class OrderService {

    @PersistenceContext
    private EntityManager em;

    // ─── Basic Read-Only Transaction ─────────────────────────────
    @Transactional(readOnly = true)
    public List<Order> findAllOrders() {
        // No flush at commit, so no unnecessary UPDATEs
        return em.createQuery("SELECT o FROM Order o", Order.class).getResultList();
    }

    // ─── Transaction with Custom Isolation ────────────────────────
    @Transactional(isolation = Isolation.REPEATABLE_READ)
    public Order updateOrderStatus(Long orderId, String status) {
        Order order = em.find(Order.class, orderId);
        order.setStatus(status);
        // Flush and commit happen automatically on method exit
        return order;
    }

    // ─── REQUIRES_NEW for Independent Audit ───────────────────────
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void logAudit(String action, Long entityId) {
        // This runs in its own transaction; it commits or rolls back independently
        AuditLog log = new AuditLog();
        log.setAction(action);
        log.setEntityId(entityId);
        em.persist(log);
    }

    // ─── Pitfall: Self-Invocation ─────────────────────────────────
    public void selfInvocationProblem() {
        // This call does NOT apply @Transactional from updateOrderStatus
        // because it's called from within the same class.
        updateOrderStatus(1L, "SHIPPED");
    }
}
Output
// No output — but demonstrates the self-invocation issue:
// Calling updateOrderStatus() within OrderService bypasses the AOP proxy.
// Transactional annotations on self-invoked methods are ignored.
// Fix: inject a separate service bean or use AspectJ weaving.
Self-Invocation Trap
Never call a @Transactional method from another method in the same class. Spring's proxy won't intercept it. Move transactional logic to a separate @Service bean and inject it. Otherwise your transaction boundaries are silently ignored — leading to data corruption that's hard to track down.
Production Insight
In one production incident, an audit log was never persisted because the calling method lacked @Transactional.
The audit method used REQUIRES_NEW but was called from a loop inside the same service, so the proxy never saw the call.
Fix: extract audit logic into its own service bean.
Key Takeaway
Self-invocation of @Transactional methods bypasses the AOP proxy. Extract the method to a separate bean.
The Criteria API: Why You Should Write Queries That Compile-Check
You've been burned by a JPQL typo at 2 AM. We all have. The JPA Criteria API exists to shrink that risk. It's not about being fancy: it's about letting the compiler catch wrong attribute names before they hit production.
The core idea: instead of writing JPQL strings, you build queries programmatically with Java objects. Paired with the generated static metamodel (for example Invoice_.status), your IDE autocompletes entity attributes, and renaming a field breaks every Criteria query referencing it at compile time, not runtime. That's the whole point. String lookups such as root.get("status") give up most of that safety: they still compile after a rename and only fail when the query runs.
Here's the production pattern you'll actually use. Get a CriteriaBuilder from EntityManager.getCriteriaBuilder(). Build a query with createQuery(YourEntity.class). Define your root, which is your FROM clause. Then chain predicates, joins, and order by clauses with type safety. It's verbose, yes. But when you're maintaining a query that joins five tables and takes a dozen optional filters, you'll worship the compile-time validation.
The performance trap: lazy-loading doesn't care how you build your query. Criteria queries still suffer from N+1 if you don't explicitly fetch joins. Always call root.fetch() for relationships you plan to traverse. Never assume it's handled.
CriteriaApiDynamicFilter.sqlSQL
// io.thecodeforge: database tutorial
// Dynamic filtering with the Criteria API: no string concatenation.
// Attribute names passed as strings (root.get("status")) are checked only at runtime;
// use the static metamodel (e.g. Invoice_.status) to get compile-time checking.
// Types come from jakarta.persistence.criteria (javax.persistence.criteria before JPA 3.0).
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Invoice> query = cb.createQuery(Invoice.class);
Root<Invoice> root = query.from(Invoice.class);

// Build filters dynamically: 2019 production incident, forgot to handle null status
List<Predicate> predicates = new ArrayList<>();
if (statusFilter != null) {
    predicates.add(cb.equal(root.get("status"), statusFilter));
}
if (totalMin != null) {
    predicates.add(cb.greaterThanOrEqualTo(root.get("totalAmount"), totalMin));
}
// A conjunction of zero predicates is true, so an empty filter list returns every row
query.where(cb.and(predicates.toArray(new Predicate[0])));
query.orderBy(cb.desc(root.get("createdAt")));
List<Invoice> results = entityManager.createQuery(query).getResultList();
Output
No direct SQL output: returns a typed List<Invoice>.
Generated SQL (approximate; Hibernate lists the mapped columns explicitly and binds the filter values as parameters):
SELECT i.* FROM invoices i WHERE i.status = ? AND i.total_amount >= ? ORDER BY i.created_at DESC
Never Do This:
Never mix Criteria API with string-based JPQL fragments. You lose type safety. If you need dynamic filtering, commit fully to Criteria — or use QueryDSL. Half-measures cause the worst bugs.
Key Takeaway
If your query has more than two dynamic filters, use Criteria API. Paired with the static metamodel, the compile-time safety saves your ass when a column rename slips into a hotfix.
Locking: Optimistic vs Pessimistic — The Production-Proof System
Here's the cold truth: JPA's default locking strategy is 'hope nothing breaks.' That works until two services update the same row at the same millisecond. Then you're debugging ghost writes at 3 AM.
Optimistic locking is your first line of defense. Add a @Version column — JPA checks it on every update. If your data hasn't changed since you read it, the update goes through. If another transaction modified it, you get an OptimisticLockException. Catch it, retry, move on. This handles 90% of concurrent-write scenarios without database-level locks.
Pessimistic locking is for when the other 10% matters: financial transactions, inventory decrements, reservation systems. You tell the database 'lock this row until I'm done.' Use LockModeType.PESSIMISTIC_WRITE inside a transaction. The tradeoff: throughput goes down (other transactions wait), but correctness goes up. Pick your poison.
Production pattern: always optimistic by default. Wrap write operations in a retry loop — three retries with exponential backoff. Only reach for pessimistic locks when you can prove optimistic isn't enough. The deadlock stories you hear? 9/10 times someone used pessimistic locking when they didn't need to.
LockingProductInventory.sqlSQL
// io.thecodeforge: database tutorial
// Optimistic locking: version field on the Product entity
@Version
private Integer version; // Incremented by the JPA provider on every update

// Pessimistic locking for critical stock deduction (method on a service class)
@Transactional
public void deductStock(Long productId, int quantity) {
    Product product = entityManager.find(
        Product.class, productId,
        LockModeType.PESSIMISTIC_WRITE // Locks the row; other writers wait
    );
    if (product.getStock() < quantity) {
        throw new InsufficientStockException("Only " + product.getStock() + " available");
    }
    product.setStock(product.getStock() - quantity);
    entityManager.flush(); // Forces the SQL UPDATE while the lock is held
}
Output
Optimistic: UPDATE products SET stock = 45, version = 3 WHERE id = 101 AND version = 2
Affected rows: 1 (success) or 0 (stale data — throw OptimisticLockException)
Pessimistic: SELECT * FROM products WHERE id = 101 FOR UPDATE
UPDATE products SET stock = 45 WHERE id = 101
Senior Shortcut:
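If you're using Spring Data JPA, annotate your repository method with @Lock(LockModeType.OPTIMISTIC_FORCE_INCREMENT) to force a version bump at commit even when the entity itself was only read. This is useful when a change to a child should count as a change to its parent, so concurrent edits under the same parent collide instead of silently interleaving.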
Key Takeaway
Start with @Version and a retry loop. Only pessimistically lock rows when a single cent matters. Deadlocks are harder to debug than stale data.
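The retry loop with exponential backoff recommended above could look roughly like this. It is a minimal sketch in plain Java: StaleVersionException is a hypothetical stand-in for jakarta.persistence.OptimisticLockException (in Spring the same failure usually surfaces as ObjectOptimisticLockingFailureException), and OptimisticRetry.withRetry and its parameters are illustrative names, not part of any library.

```java
import java.util.function.Supplier;

// Stand-in for jakarta.persistence.OptimisticLockException so the sketch compiles without JPA.
class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

class OptimisticRetry {
    /**
     * Runs the write, retrying on a version conflict with exponential backoff.
     * maxAttempts includes the first try; the last failure is rethrown to the caller.
     */
    static <T> T withRetry(Supplier<T> write, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return write.get(); // each attempt should run in its own fresh transaction
            } catch (StaleVersionException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: let the caller decide what to do
                }
                sleep(backoff);
                backoff *= 2; // double the wait before the next attempt
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", ie);
        }
    }
}
```

One design point worth keeping in mind: the retry has to wrap the transactional call from the outside (for example from a different bean), because once the version check fails the current transaction can only roll back. Each attempt must re-read the entity so it sees the new version.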
Advanced Mappings: Inheritance Strategies and Composite Keys
When your domain model demands inheritance or composite primary keys, JPA's defaults only get you so far. You must choose an inheritance strategy: SINGLE_TABLE (the default; fast, but subclass columns must be nullable), JOINED (normalized, but reads join the subclass tables), or TABLE_PER_CLASS (no nullable subclass columns, but polymorphic queries turn into UNIONs across all tables). Each trades storage cost for query performance. The trap: SINGLE_TABLE looks clean in code but creates column explosion and rules out NOT NULL constraints on subclass-specific columns. JOINED looks relational but adds a join per subclass table on polymorphic fetches. Always benchmark your actual query patterns.
Composite keys need @IdClass or @EmbeddedId; the latter keeps the key as a single value object and avoids duplicating the key fields on the entity. Either way, the key class must implement equals() and hashCode(), otherwise lookups by primary key and persistence-context identity checks can misbehave at runtime.
AdvancedMapping.sqlSQL
-- io.thecodeforge: database tutorial
-- SINGLE_TABLE inheritance: one table, discriminator column
CREATE TABLE payment (
    id BIGINT PRIMARY KEY,
    payment_type VARCHAR(20) NOT NULL,
    amount DECIMAL(10,2),
    card_number VARCHAR(16) NULL,
    check_number VARCHAR(10) NULL
);
-- JOINED strategy: normalized tables, joins required
CREATE TABLE subscription (
    id BIGINT PRIMARY KEY,
    plan VARCHAR(50)
);
CREATE TABLE trial_subscription (
    id BIGINT PRIMARY KEY,
    trial_end DATE,
    FOREIGN KEY (id) REFERENCES subscription(id)
);
SINGLE_TABLE with many subclasses forces every subclass-specific column to be nullable, so NOT NULL constraints cannot be declared on them. Your DB allows nulls where business logic forbids them.
Key Takeaway
Pick inheritance strategy by query pattern, not by purity of object orientation.
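The equals()/hashCode() requirement for composite keys is easy to illustrate in plain Java. In the sketch below, OrderItemId and its fields are hypothetical; in a real mapping the class would be annotated @Embeddable and referenced from the entity through an @EmbeddedId field. The annotations are left out so the example compiles without a JPA dependency.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite key: (orderId, lineNumber).
// JPA key classes need a no-arg constructor, Serializable, and value-based equals/hashCode.
class OrderItemId implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long orderId;
    private Integer lineNumber;

    protected OrderItemId() {
        // Required by JPA providers; not meant for application code.
    }

    OrderItemId(Long orderId, Integer lineNumber) {
        this.orderId = orderId;
        this.lineNumber = lineNumber;
    }

    Long getOrderId() {
        return orderId;
    }

    Integer getLineNumber() {
        return lineNumber;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof OrderItemId)) {
            return false;
        }
        OrderItemId that = (OrderItemId) other;
        return Objects.equals(orderId, that.orderId)
                && Objects.equals(lineNumber, that.lineNumber);
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderId, lineNumber);
    }
}
```

Two key objects built from the same values now compare equal and hash alike, which is what any map keyed by primary key relies on. Without these overrides, a freshly constructed key would never match the one already held for a loaded entity.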
Custom Type Mapping: Converters and Embeddables Beyond Primitives
JPA natively maps only common Java types. For enums, monetary values, or encrypted strings, you need @Enumerated, @Convert, or @Embeddable. @Enumerated(ORDINAL) is brittle: inserting a new enum value in the middle shifts ordinals, corrupting the meaning of persisted data. Prefer STRING. An AttributeConverter (annotated with @Converter and applied with @Convert) lets you write custom logic, e.g., mapping a Money object to a decimal column or encrypting on write and decrypting on read. The catch: converters run while the provider loads and flushes entities, so an exception thrown from one can leave the persistence context in an inconsistent state. @Embeddable groups columns without creating a separate table, but beware of null semantics: a null embeddable writes NULL to all of its columns, and Hibernate by default returns a null embeddable when all of those columns are NULL on read. Avoid embedding large value objects; they flood the row with nullable columns that are hard to constrain and index.
CustomType.sqlSQL
-- io.thecodeforge: database tutorial
-- Persistent table with embedded value object
CREATE TABLE customer (
    id BIGINT PRIMARY KEY,
    name VARCHAR(100),
    street VARCHAR(200),
    city VARCHAR(100),
    zip VARCHAR(10),
    account_status VARCHAR(20)
);
-- @Embeddable Address maps to street, city, zip
-- @Enumerated(STRING) maps account_status
INSERT INTO customer VALUES (1, 'Alice', '123 Oak', 'Springfield', '01101', 'ACTIVE');
Output
Row inserted. Enums stored as strings survive reordering.
Embeddable groups columns but all become nullable.
Production Trap:
A converter that throws on read can leave the persistence context in an inconsistent state. Wrap the conversion in try-catch and mark the transaction for rollback explicitly.
Key Takeaway
Prefer @Enumerated(STRING) and @Embeddable for clean schemas, but handle exceptions to avoid corrupted persistence contexts.
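The converter logic described above can be sketched without a JPA dependency. This is an illustration under assumptions: ColumnConverter is a local stand-in that mirrors the two method names of jakarta.persistence.AttributeConverter, and the AccountStatus enum with its one-letter codes is hypothetical (the table above stores the enum name via @Enumerated(STRING) instead).

```java
import java.util.Arrays;

// Local stand-in with the same two method names as jakarta.persistence.AttributeConverter.
interface ColumnConverter<X, Y> {
    Y convertToDatabaseColumn(X attribute);

    X convertToEntityAttribute(Y dbData);
}

// Hypothetical enum: each constant carries a stable code, and the code is what gets stored.
enum AccountStatus {
    ACTIVE("A"), SUSPENDED("S"), CLOSED("C");

    private final String code;

    AccountStatus(String code) {
        this.code = code;
    }

    String code() {
        return code;
    }
}

// With JPA on the classpath this would implement AttributeConverter<AccountStatus, String>
// and carry the @Converter annotation.
class AccountStatusConverter implements ColumnConverter<AccountStatus, String> {
    @Override
    public String convertToDatabaseColumn(AccountStatus attribute) {
        return attribute == null ? null : attribute.code();
    }

    @Override
    public AccountStatus convertToEntityAttribute(String dbData) {
        if (dbData == null) {
            return null;
        }
        return Arrays.stream(AccountStatus.values())
                .filter(status -> status.code().equals(dbData))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException(
                        "Unknown account status code: " + dbData));
    }
}
```

Because the stored value is an explicit code, constants can be reordered or renamed in Java without touching existing rows. The unknown-code branch is the read-time failure the trap above warns about, so it fails loudly with the offending value in the message.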
Batch Operations: Bulk Updates and Inserts Without the Memory Blowout
JPA's entity model is built for incremental changes. Doing 100,000 updates in a loop loads every entity into the persistence context, causing heap exhaustion. Use JPQL UPDATE and DELETE for bulk changes: they translate to single SQL statements without loading entities. For inserts, enable JDBC batching with hibernate.jdbc.batch_size and hibernate.order_inserts=true (in Spring Boot, prefix both with spring.jpa.properties.). The gotcha: bulk operations bypass the persistence context, so entities already loaded go stale, and depending on the provider and query type the second-level cache may too, unless you clear or evict the affected data. Also, Hibernate batches inserts only when you avoid IDENTITY ID generation (use SEQUENCE or TABLE). With IDENTITY, each INSERT must execute immediately so the generated key can be read back, which disables batching. Test the batch size under production-like load; 20-50 is typical. Batches much larger than that hold locks longer and can increase deadlock risk under concurrent writes.
BatchOperations.sqlSQL
-- io.thecodeforge: database tutorial
-- Bulk update without loading entities
UPDATE account SET status = 'ARCHIVED' WHERE last_login < '2020-01-01';
-- Batch insert requires SEQUENCE generator
CREATE SEQUENCE user_seq START 1 INCREMENT 50;
CREATE TABLE "user" (
    id BIGINT DEFAULT nextval('user_seq') PRIMARY KEY,
    name VARCHAR(100)
);
-- JDBC batching enabled; IDENTITY generation would disable it
Output
Bulk update completes in single SQL (O(1) queries).
Batch insert with SEQUENCE achieves 50 rows per roundtrip.
Production Trap:
Bulk operations leave the second-level cache stale. Always evict affected entity regions or clear the cache after execution.
Key Takeaway
Use JPQL bulk statements for mass updates and configure batch inserts with SEQUENCE generation to avoid memory bloat.
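On the Java side, batched inserts are usually paired with a flush-and-clear loop so the persistence context stays small. The sketch below shows that loop shape only. BatchContext is a hypothetical interface declaring the three EntityManager operations the loop needs (persist, flush, clear), so the example compiles and can be exercised without JPA; BatchInserter is likewise an illustrative name.

```java
import java.util.List;

// The three EntityManager operations the loop relies on, declared locally for this sketch.
interface BatchContext {
    void persist(Object entity);

    void flush();

    void clear();
}

class BatchInserter {
    /**
     * Persists the entities in chunks, flushing and clearing after each chunk so the
     * persistence context never holds more than batchSize managed entities.
     * Returns the number of flushes performed.
     */
    static int insertInBatches(List<?> entities, int batchSize, BatchContext em) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int flushes = 0;
        int pending = 0;
        for (Object entity : entities) {
            em.persist(entity);
            pending++;
            if (pending == batchSize) {
                em.flush(); // send the accumulated INSERTs to the database
                em.clear(); // detach them so memory use stays flat
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            em.flush(); // final partial chunk
            em.clear();
            flushes++;
        }
        return flushes;
    }
}
```

In a real service the whole loop would run inside one transaction, and batchSize would normally match hibernate.jdbc.batch_size so each flush maps onto whole JDBC batches.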
Overview
JPA (Java Persistence API) is the standard specification for object-relational mapping in Java, allowing developers to map Java objects directly to database tables without writing verbose JDBC code. At TheCodeForge.io, we treat JPA not just as a library but as a design philosophy: it bridges the object-oriented world with relational storage by managing entity lifecycle, persistence context, and transaction boundaries. Understanding JPA means understanding the state transitions (new, managed, detached, removed) that every entity passes through, and how these states align with database commits. The specification is implemented by providers like Hibernate, EclipseLink, and OpenJPA, each adding performance optimizations while adhering to the core contract. This section establishes the foundational vocabulary: EntityManager, EntityManagerFactory, persistence unit, and the crucial concept of identity fields. Without this base, the advanced mappings and query strategies in later sections lack context. We emphasize that JPA is a tool for managing complexity, not a magic wand—its proper use requires disciplined design and awareness of when to bypass it for native queries or batch operations.
Persistence unit 'TheCodeForgePU' validated with Hibernate provider. Tables must exist externally.
Production Trap:
Never set hibernate.hbm2ddl.auto to 'create-drop' in production. Use 'validate' together with a schema migration tool such as Flyway or Liquibase; 'update' changes the schema without review.
Key Takeaway
JPA is a specification, not an implementation; always know your provider's quirks.
Defining the Domain Models (with AuditModel)
Domain models in JPA represent the business entities that map to database tables. Beyond simple data containers, enterprise applications demand audit trails for every change. The AuditModel is a reusable, mapped superclass that captures creation and modification timestamps plus the user responsible for the change. In the context of a CodeForge system, every Instructor and Course entity inherits from AuditModel, ensuring consistent auditing without duplicating fields. The @MappedSuperclass annotation tells JPA not to generate a separate table for AuditModel but to include its columns (created_at, updated_at, created_by, updated_by) in each child entity's table. This design enforces the WHY: auditing is a cross-cutting concern, not an afterthought. By centralizing it in a superclass, we reduce boilerplate and guarantee that every repository query returns temporal context for debugging or compliance. The Instructor and Course models then focus solely on their business attributes—name, email, title, credits—while the audit fields are transparently persisted. This separation of concerns is the hallmark of maintainable enterprise JPA code.
Tables 'instructors' and 'courses' created (or validated) with audit columns: created_at, updated_at, created_by, updated_by.
Production Trap:
Do not rely on @PreUpdate alone for audit timestamps. Use database defaults or triggers if the app bypasses JPA (e.g., bulk JPQL updates or batch jobs), because lifecycle callbacks do not fire for those writes.
Key Takeaway
AuditModel centralizes temporal tracking; inheritance enforces consistency across all domain entities.
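As a rough picture of the superclass described above, the sketch below shows the timestamp half of an AuditModel in plain Java. It is an assumption-laden outline: the JPA annotations (@MappedSuperclass, @PrePersist, @PreUpdate, @Entity) appear only as comments so the code compiles on its own, the method names onCreate and onUpdate are illustrative, and created_by/updated_by are left out because the text does not say where the current user comes from.

```java
import java.time.Instant;

// Would be annotated @MappedSuperclass: no table of its own, columns land in each child table.
abstract class AuditModel {
    private Instant createdAt; // created_at
    private Instant updatedAt; // updated_at

    // Would be annotated @PrePersist: runs once, before the first INSERT.
    void onCreate() {
        Instant now = Instant.now();
        createdAt = now;
        updatedAt = now;
    }

    // Would be annotated @PreUpdate: runs before each UPDATE issued through JPA.
    void onUpdate() {
        updatedAt = Instant.now();
    }

    Instant getCreatedAt() {
        return createdAt;
    }

    Instant getUpdatedAt() {
        return updatedAt;
    }
}

// Hypothetical child (would be annotated @Entity); it carries only business attributes.
class Course extends AuditModel {
    private final String title;
    private final int credits;

    Course(String title, int credits) {
        this.title = title;
        this.credits = credits;
    }

    String getTitle() {
        return title;
    }

    int getCredits() {
        return credits;
    }
}
```

The child class never mentions the audit fields, which is the point of the design: every entity that extends the superclass gets the same timestamps with no extra code.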
Defining the Repositories (Instructor & Course Repository)
Repositories in JPA abstract the data access layer, providing a clean contract between business logic and persistence. For InstructorRepository and CourseRepository, we use Spring Data JPA's JpaRepository which supplies default CRUD methods (save, findById, findAll, delete) plus pagination and sorting without boilerplate. The WHY behind separate repositories: each entity has distinct query requirements. Instructors may need search by email or department; Courses require filtering by credits or instructor ID. Defining custom query methods via method naming conventions (e.g., findByEmailIgnoreCase) keeps the interface declarative. Additionally, we extend the repository with a custom interface for bulk operations that bypass the persistence context for performance. The CourseRepository, for example, includes a method to find all courses for a given instructor, leveraging JPQL to avoid N+1 queries. This design enforces the rule that repositories are not mere DAOs—they are strategic boundaries that encapsulate query logic and transaction scoping. Both repositories share a common base (JpaRepository<Entity, Long>) and the project's package structure (io.thecodeforge.repository) ensures consistent bean discovery.
Instructor_Course_Repository.sqlSQL
// io.thecodeforge: database tutorial
// Instructor and Course repository definitions.
// Each public interface lives in its own file (InstructorRepository.java, CourseRepository.java).
package io.thecodeforge.repository;

import io.thecodeforge.model.Instructor;
import io.thecodeforge.model.Course;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.List;

public interface InstructorRepository extends JpaRepository<Instructor, Long> {
    Instructor findByEmailIgnoreCase(String email);
}

public interface CourseRepository extends JpaRepository<Course, Long> {
    // JOIN FETCH loads each course's instructor in the same query, avoiding N+1
    @Query("SELECT c FROM Course c JOIN FETCH c.instructor WHERE c.credits > :minCredits")
    List<Course> findCoursesWithInstructorByMinCredits(@Param("minCredits") int minCredits);
}
Output
Repositories instantiated by Spring Data. Custom JPQL queries compiled at bootstrap—errors caught early.
Production Trap:
Avoid returning entities directly from repositories to controllers—use DTOs or projections to prevent lazy loading exceptions and over-fetching.
Key Takeaway
Separate repositories per entity ensure focused query logic and prevent cross-entity coupling.
CRUD Restful Web Services (Instructor & Course Resources)
Exposing JPA repositories as RESTful web services requires converting entity operations into HTTP semantics. For Instructor and Course resources, we build a REST controller layer that translates POST (create), GET (read), PUT (update), and DELETE actions into repository calls. The WHY of this pattern: separating REST endpoints from JPA logic allows independent scaling, security (e.g., @PreAuthorize annotations), and versioning without touching the persistence layer. The InstructorResource handles mapping from DTOs to entities, avoiding direct exposure of entity IDs in requests. The CourseResource includes a nested endpoint (e.g., /instructors/{id}/courses) to retrieve courses by instructor, demonstrating relationship traversal in REST. Error handling uses @ControllerAdvice to return consistent JSON error responses (404 for not found, 400 for validation). We use @Transactional on service methods to ensure atomicity when creating an Instructor with nested Course objects. Importantly, we never return entity objects directly—we use DTOs (InstructorDTO, CourseDTO) to decouple the API contract from the JPA model. This prevents accidental lazy loading in serialization and allows field-level deprecation without schema changes.
CRUD_REST_Controller.sqlSQL
// io.thecodeforge: database tutorial
// REST controller for Instructor resource.
// InstructorDTO, InstructorCreateRequest and NotFoundException are project classes not shown here.
import jakarta.validation.Valid; // javax.validation.Valid on older stacks
import org.modelmapper.ModelMapper;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/instructors")
public class InstructorResource {
    private final InstructorRepository repo;
    private final ModelMapper mapper;

    public InstructorResource(InstructorRepository repo, ModelMapper mapper) {
        this.repo = repo;
        this.mapper = mapper;
    }

    @GetMapping("/{id}")
    public ResponseEntity<InstructorDTO> get(@PathVariable Long id) {
        Instructor entity = repo.findById(id)
            .orElseThrow(() -> new NotFoundException("Instructor not found"));
        return ResponseEntity.ok(mapper.map(entity, InstructorDTO.class));
    }

    @PostMapping
    @ResponseStatus(HttpStatus.CREATED)
    public InstructorDTO create(@Valid @RequestBody InstructorCreateRequest request) {
        Instructor entity = mapper.map(request, Instructor.class);
        return mapper.map(repo.save(entity), InstructorDTO.class);
    }
}
Output
HTTP 201 on POST /api/instructors with body. HTTP 404 on GET /api/instructors/9999. All responses in JSON.
Production Trap:
Never use entity classes directly as request/response bodies—use DTOs to avoid security leaks (e.g., hashed passwords) and lazy loading issues.
Key Takeaway
REST APIs must map to DTOs, not entities, to decouple the HTTP contract from JPA internals.
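The entity-to-DTO boundary can also be written by hand, which makes it obvious which fields cross it. The sketch below is an alternative to the ModelMapper call in the controller above, not what the project uses: InstructorEntity, InstructorView and the passwordHash field are hypothetical names chosen for illustration.

```java
// Hypothetical entity shape; passwordHash stands for any field that must never leave the server.
class InstructorEntity {
    private final Long id;
    private final String name;
    private final String email;
    private final String passwordHash;

    InstructorEntity(Long id, String name, String email, String passwordHash) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.passwordHash = passwordHash;
    }

    Long getId() {
        return id;
    }

    String getName() {
        return name;
    }

    String getEmail() {
        return email;
    }

    String getPasswordHash() {
        return passwordHash;
    }
}

// API-facing view: only the fields the HTTP contract exposes.
record InstructorView(Long id, String name, String email) {
    /** Copies the exposed fields explicitly, so a new entity field is never serialized by accident. */
    static InstructorView from(InstructorEntity entity) {
        return new InstructorView(entity.getId(), entity.getName(), entity.getEmail());
    }
}
```

Because the view is a plain immutable value with no link back to the entity, serializing it cannot trigger lazy loading, and adding a column to the entity does not change the API response until the mapping is updated on purpose.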
● Production incident · POST-MORTEM · severity: high
N+1 Queries Caused a 30x Database Load Spike
Symptom
An order listing API that returned 100 orders took 12 seconds to respond. Database CPU hit 100%. The endpoint previously responded in 20ms.
Assumption
The team assumed that accessing order.getItems() inside a loop would fetch data in a single query because the relationship was annotated with @OneToMany.
Root cause
The @OneToMany relationship used default FetchType.LAZY. Each call to order.getItems() inside the loop triggered a separate SELECT statement. With 100 orders, this produced 1 SQL query for the list + 100 individual queries for items = 101 queries. The team also had a lazily fetched status history collection that was loaded row by row in the same way, bringing the reported total to 1,001 queries.
Fix
Replaced the loop-based iteration with a JPQL query using JOIN FETCH for the items collection. Loaded the second collection through a separate @EntityGraph query to avoid a Cartesian product. Set hibernate.default_batch_fetch_size=25 as a safety net for other lazy collections. Response time dropped back to 25ms.
Key lesson
Always enable SQL logging (spring.jpa.show-sql=true) during development to see the actual number of queries.
Never assume that a lazy collection will be efficient — always verify with logs or a profiler.
JOIN FETCH works well for one collection; for multiple collections, fetch each one in its own query (JOIN FETCH or @EntityGraph) or rely on batch fetching to avoid a Cartesian product.
Production debug guide
Quick reference for diagnosing JPA issues in production (4 entries)
Symptom · 01
LazyInitializationException thrown when accessing a collection or lazily loaded attribute
Fix
Ensure the access happens inside an active @Transactional context. If outside, use JOIN FETCH in the query to load the data eagerly, or call Hibernate.initialize() before closing the session.
Symptom · 02
Unexpected UPDATE statement on commit even though no save() was called
Fix
This is dirty checking. If the method is supposed to be read-only, annotate it with @Transactional(readOnly = true) to disable dirty checking. If a field is changed accidentally, inspect the entity's setters and any detach/merge logic.
Symptom · 03
Foreign key column is null after persisting a child entity in a bidirectional relationship
Fix
Check that the owning side (@ManyToOne) was set. You likely only updated the @OneToMany side. Always use a helper method that sets both sides of the relationship.
Symptom · 04
Duplicate entities in the result set after using JOIN FETCH
Fix
Add DISTINCT to the JPQL query. Without it, the SQL join returns one row per child, so the same parent appears several times in the result list. Hibernate 6 removes these duplicate parent references automatically; on earlier versions you need DISTINCT.
★ Quick Debug Cheat Sheet for JPA Issues
Commands and configurations to diagnose JPA behaviour fast
Too many SQL queries (suspect N+1)
Immediate action
Enable SQL logging in application.properties
Commands
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
Fix now
Count the SELECT statements in the log. Each extra SELECT per entity confirms N+1. Apply JOIN FETCH or @EntityGraph in the data access layer.
LazyInitializationException
Immediate action
Check if the code is accessing a lazy property outside a @Transactional boundary
Commands
Wrap the calling method with @Transactional
If you cannot keep the transaction open, use JOIN FETCH in the query or call Hibernate.initialize(entity.getCollection()) within the transaction.
Fix now
Move the collection access inside the service method that was already annotated with @Transactional, or refactor to use a DTO projection that avoids lazy loading.
Entity updates are not persisted
Immediate action
Check if the entity is in detached state
Commands
Call entityManager.merge(entity) to reattach it
Alternatively, load the entity again within the transaction using find() and modify it there.
Fix now
Redesign the workflow to keep the entity managed for the duration of the transaction, or use a DTO-based approach that reattaches on merge.
JPA Query Approaches Comparison
| Aspect | JPQL (JPA Standard) | Criteria API | Native SQL |
| --- | --- | --- | --- |
| Syntax style | String-based, entity-aware | Type-safe Java builder API | Raw SQL strings |
| SQL injection safety | Safe with named params (:param) | Safe by design (values are bound as parameters) | Safe only with bound parameters; risky if values are concatenated |
| Compile-time checking | None; fails at runtime | Yes, with the JPA static metamodel | None |
| Readability | High for simple queries | Verbose for complex queries | High for SQL experts |
| Dynamic query building | Painful (string concatenation) | Excellent (built for this) | Possible but messy |
| Portability across DBs | Yes (JPA translates) | Yes (JPA translates) | No (vendor-specific SQL) |
| Best for | Static, readable queries | Search/filter forms with optional criteria | Performance-critical bulk ops or stored procedures |
Key takeaways
1. The persistence context is JPA's beating heart: every managed entity is auto-tracked for changes via dirty checking, so unexpected UPDATEs are always a state management issue, not a bug in your save logic.
2. In any bidirectional relationship, the @ManyToOne side owns the foreign key: always set it, always use a helper method on the parent to keep both sides consistent.
3. N+1 is caused by accessing a LAZY collection in a loop: detect it by counting SQL statements in logs, fix it with JOIN FETCH or @EntityGraph, and prefer batch fetch size for multiple collections.
4. Use JPQL for readable static queries, Criteria API for dynamic filter queries, and native SQL only when you need database-specific features or bulk performance that JPA can't optimise, and document exactly why.
5. Second-level cache is optional and requires explicit setup: never assume @Cacheable is enough without a provider and region configuration.
6. Transaction isolation and propagation are not just theoretical: choose them carefully based on concurrency requirements, and avoid self-invocation of @Transactional methods.
Common mistakes to avoid
5 patterns
×
Only updating the 'mappedBy' side of a bidirectional relationship
Symptom
You call order.getItems().add(item) but the foreign key column in order_items is never populated; the item is saved with order_id = null or silently ignored.
Fix
Always use a helper method that sets both sides: order.addItem(item) which internally does items.add(item) AND item.setOrder(order). The owning side (@ManyToOne) must always be set for JPA to write the foreign key.
×
Using CascadeType.ALL on @ManyToOne
Symptom
Deleting an OrderItem accidentally deletes its parent Order, cascading deletes up the relationship tree and wiping data.
Fix
CascadeType.ALL (which includes REMOVE) should almost never go on @ManyToOne. It belongs on @OneToMany from parent to children. Think of cascade as 'parent controls children', not 'child controls parent'.
×
Calling entity getters on a LAZY collection outside a transaction
Symptom
LazyInitializationException: could not initialize proxy — no Session
Fix
Ensure the collection is accessed within an active @Transactional boundary, use JOIN FETCH to load it eagerly when you know you'll need it, or use a DTO projection instead. Never rely on the 'Open Session in View' anti-pattern in production: it hides N+1 problems and holds a database connection for the entire request.
×
Assuming second-level cache works without configuration
Symptom
No performance improvement even after adding @Cacheable. Entities are always fetched from the database.
Fix
Add the Hibernate cache provider dependencies (e.g., hibernate-jcache + Ehcache), set spring.jpa.properties.hibernate.cache.use_second_level_cache=true, configure the region factory, and set the cache concurrency strategy on the entity.
×
Self-invoking @Transactional methods inside the same class
Symptom
Transaction settings (isolation, propagation, readOnly) are ignored. The method runs without any transaction boundary, causing inconsistent data or missing rollbacks.
Fix
Move transactional methods to a separate Spring bean and inject it. Never call @Transactional methods from another method in the same class.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 03 · SENIOR
What is the difference between persist(), merge(), and save() in JPA — and when would using the wrong one cause a bug?
ANSWER
persist() makes a transient entity managed and schedules an INSERT. merge() copies state from a detached entity onto a managed entity (loading or creating one as needed) and returns the managed instance. save() is not a JPA method: Hibernate's Session.save() (deprecated in Hibernate 6) behaves like persist but returns the generated ID, while Spring Data's repository save() calls persist for new entities and merge otherwise. Using persist on a detached entity fails with a PersistenceException (Hibernate reports "detached entity passed to persist"). Using merge on a brand-new entity inserts a copy and returns it; if you keep working with the original instance, later changes are not tracked, and merging it again can insert a duplicate row when IDs are generated. The rule: use persist for new entities you create, merge for reattaching detached entities you received from a client.
Q02 of 03 · SENIOR
Explain the N+1 query problem in JPA. How do you detect it in a running application, and what are your three options to fix it?
ANSWER
N+1 occurs when you fetch N entities (1 query) and then iterate over them, accessing lazy-loaded relationships (N more queries) => total N+1 SQL statements. Detect it by enabling SQL logging (spring.jpa.show-sql=true) and counting the SELECT statements. Fixes: (1) JOIN FETCH in JPQL to eagerly load the collection in one query; (2) @EntityGraph to define a fetching strategy without modifying the query; (3) batch fetching (hibernate.default_batch_fetch_size=25) to reduce queries by batching collection loads.
Q03 of 03 · SENIOR
What is the difference between FetchType.LAZY and FetchType.EAGER, and why is changing @ManyToOne from EAGER to LAZY considered a performance improvement in most production systems?
ANSWER
LAZY means the related entity is loaded only when accessed; EAGER means it is loaded immediately along with the owner. @ManyToOne defaults to EAGER. In most production systems, loading the parent Order every time you load an OrderItem is wasteful — you rarely need both. Changing it to LAZY reduces the initial query size and prevents unnecessary joins. However, you must then ensure that when you do need the parent, you access it within a transaction or use JOIN FETCH. The trade-off is avoiding the EAGER join for the 90% of cases where it's not needed, at the cost of extra queries for the 10% where it is.
FAQ · 5 QUESTIONS
Frequently Asked Questions
01
What is the difference between JPA and Hibernate?
JPA is a specification — a set of interfaces and rules defined by Jakarta EE. Hibernate is the most popular implementation of that specification. You code against JPA interfaces (@Entity, EntityManager, @OneToMany), and Hibernate is the engine doing the actual work underneath. This means you can theoretically swap Hibernate for EclipseLink without changing your application code.
02
When should I use JPA instead of plain JDBC or jOOQ?
Use JPA when your application has a rich domain model with complex object relationships and you want the ORM to handle CRUD boilerplate. Use JDBC or jOOQ when you need maximum SQL control, are doing heavy batch operations, or your queries are so complex that the ORM abstraction becomes a hindrance. Many serious production apps use both — JPA for the domain layer, jOOQ or JDBC for reporting queries.
03
Why does JPA update my entity in the database even though I never called save()?
This is JPA's dirty checking mechanism. Any entity that is in the 'managed' state within an active persistence context is automatically tracked. When the transaction commits, JPA compares each managed entity's current state against a snapshot taken at load time and generates UPDATE statements for any fields that changed. Mark methods as @Transactional(readOnly = true) when you don't intend to modify data — this disables dirty checking and improves performance.
04
What is the difference between first-level and second-level cache?
First-level cache is per EntityManager (persistence context) and is always enabled. It ensures you get the same Java object instance within the same session. Second-level cache is shared across EntityManagers and must be explicitly configured with a cache provider like Ehcache or Redis. It caches entities across sessions and is best for read-only reference data. When an entity is updated in one session, the second-level cache must be invalidated.
05
What isolation level should I use for JPA transactions?
READ_COMMITTED is the default in many databases (PostgreSQL, Oracle, SQL Server); MySQL's InnoDB defaults to REPEATABLE_READ. READ_COMMITTED prevents dirty reads but allows non-repeatable reads and phantom reads. For most applications this is sufficient. If you need to prevent changes during a transaction (e.g., reading a count then using it to update), use REPEATABLE_READ. Use SERIALIZABLE only when absolute consistency is required and concurrency is low, as it severely reduces throughput.