Hibernate vs JPA — What is the Difference and Which to Use
- JPA is a specification (interfaces + rules). Hibernate is the implementation. Spring Boot uses Hibernate as its default JPA provider. Code against JPA interfaces by default; reach for Hibernate-specific APIs only when JPA cannot meet a concrete requirement.
- The entity lifecycle has four states: Transient, Managed, Detached, Removed. Understanding these states explains why persist() does not immediately hit the database, why merge() returns a different object, and what detached-entity errors mean.
- Dirty checking is how Hibernate detects changes in managed entities. @Transactional(readOnly = true) skips dirty checking entirely, improving performance for read-heavy operations. Set it as the default on your service layer.
JPA is a rulebook that says 'here is how Java ORM should work.' Hibernate is a team that followed that rulebook to build an actual working tool. You code against the rulebook (JPA) and Hibernate does the heavy lifting under the hood. The catch: Hibernate also built extra rooms that aren't in the rulebook — and sometimes those rooms are exactly what you need.
If you have used Spring Boot with a database, you have used JPA and Hibernate — often without realising they are two different things. JPA is a specification: a set of interfaces and rules. Hibernate is an implementation of that specification. Understanding this distinction is not academic. It determines whether your persistence layer is portable, what APIs you use, and when Hibernate-specific features are actually worth reaching for.
I once inherited a Spring Boot service that was taking 14 seconds to load a dashboard page. The team had been optimising database indexes for weeks. The real problem: Hibernate was firing 3,200 SQL queries per page load because of an N+1 problem on a lazy-loaded collection that nobody had checked. One JOIN FETCH reduced it to 3 queries and the page loaded in 200ms. The indexes were fine. The Hibernate knowledge was missing.
This article covers the full picture — not just 'JPA is a spec, Hibernate is an implementation' and a code snippet. We will cover entity lifecycle states, dirty checking, ID generation trade-offs, N+1 queries, optimistic locking, cascade semantics, caching, soft deletes, auditing, pagination, inheritance strategies, testing patterns, and the Hibernate 6 changes that broke half the internet when Spring Boot 3 shipped. By the end, you will know exactly when to stay in JPA land and when to drop to Hibernate-specific APIs.
What is JPA?
JPA — Java Persistence API, now Jakarta Persistence API — is a specification defined in Jakarta EE. It defines a standard set of interfaces, annotations, and rules for Object-Relational Mapping (ORM) in Java. JPA itself ships no runnable code. It is a contract: if a framework implements JPA, your code will work against that framework.
The core JPA interfaces: EntityManager (your gateway to the database — persist, find, merge, remove), EntityManagerFactory (creates EntityManager instances, one per application), EntityTransaction (controls commit/rollback), and TypedQuery/Query for JPQL queries.
The core JPA annotations: @Entity (marks a class as a database table), @Table (customises the table name), @Id (marks the primary key), @GeneratedValue (auto-generates PK values), @Column (maps to a column), @OneToMany, @ManyToOne, @ManyToMany, @JoinColumn.
Because JPA is a specification, code that only uses JPA interfaces can theoretically switch between implementations — Hibernate, EclipseLink, OpenJPA — without changing business logic. In practice, almost nobody switches. But coding against JPA interfaces keeps your code cleaner and your team's cognitive load lower.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Entity
@Table(name = "users")
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false, length = 100)
    private String name;

    @Column(unique = true, nullable = false)
    private String email;

    @OneToMany(mappedBy = "user", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
    private List<Order> orders = new ArrayList<>();

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
    public List<Order> getOrders() { return orders; }
    public void setOrders(List<Order> orders) { this.orders = orders; }
}
What is Hibernate?
Hibernate is the most widely used JPA implementation. It is also the default ORM in Spring Boot — when you add spring-boot-starter-data-jpa, Hibernate is what you get.
Hibernate predates JPA. JPA was actually modelled on Hibernate's original API. When JPA was standardised, Hibernate was updated to implement it — but kept its original native API alongside. That is why you will see references to both Session and EntityManager in older Hibernate code.
Hibernate does everything JPA specifies, and then more. It adds features the JPA spec does not cover: the Session API (Hibernate's native equivalent of EntityManager), HQL (Hibernate Query Language, a superset of JPQL), a first-level cache (per Session), a second-level cache (shared across Sessions, pluggable with Ehcache or Redis), batch processing, native query enhancements, entity interceptors, @Formula for computed columns, @DynamicUpdate for partial updates, and StatelessSession for high-throughput bulk operations.
The other JPA implementations exist — EclipseLink (the JPA reference implementation, used in GlassFish/Payara), OpenJPA (Apache project, less active), DataNucleus (supports JPA and JDO) — but Hibernate dominates. In my 10+ years of Java development, I have never seen a production application use anything other than Hibernate as the JPA provider. That does not mean you should ignore portability. It means Hibernate-specific features are fair game when they solve a real problem.
package io.thecodeforge.hibernate_vs_jpa;

import org.hibernate.ScrollMode;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;
import java.util.List;

public class HibernateSessionExample {

    public void demonstrateHibernateNativeAPI(SessionFactory sessionFactory) {
        // Hibernate Session — the native equivalent of JPA EntityManager
        Session session = sessionFactory.getCurrentSession();

        // HQL — superset of JPQL, supports FROM without SELECT
        List<User> users = session.createQuery(
                "FROM User u WHERE u.email LIKE :domain", User.class)
            .setParameter("domain", "%@example.com")
            .setFirstResult(0)
            .setMaxResults(20)
            .getResultList();

        // Hibernate-specific: JDBC batching with periodic flush/clear
        // (shown on the loaded users for illustration; in real code you
        // would batch-persist NEW entities)
        session.setJdbcBatchSize(50);
        for (int i = 0; i < users.size(); i++) {
            session.persist(users.get(i));
            if ((i + 1) % 50 == 0) {
                session.flush();
                session.clear();
            }
        }

        // StatelessSession — bypasses the first-level cache entirely
        StatelessSession stateless = sessionFactory.openStatelessSession();
        var tx = stateless.beginTransaction();
        try {
            var scroll = stateless.createQuery("FROM User", User.class)
                .scroll(ScrollMode.FORWARD_ONLY);
            while (scroll.next()) {
                User u = scroll.get();
                stateless.update(u);
            }
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        } finally {
            stateless.close();
        }
    }
}
Hibernate 6 and Spring Boot 3 — What Changed
Spring Boot 3 shipped with Hibernate 6, and it broke more things than most major version upgrades. If you are on Spring Boot 2.x and planning to upgrade, or starting fresh on Boot 3, these changes matter.
The package namespace moved from javax.persistence to jakarta.persistence. Every import in every entity class needs updating. This is a find-and-replace, but it touches every file.
Hibernate 6 changed the default ID generation strategy: GenerationType.AUTO now picks SEQUENCE instead of TABLE. If your database was relying on the TABLE strategy's hibernate_sequences table, IDs will be drawn from a different source after the upgrade. In production, this can mean new records receive IDs that collide with existing rows. I have seen this cause primary key conflicts on tables that had no unique constraint beyond the PK.
The dialect system was overhauled. The old spring.jpa.database-platform property still works but Hibernate 6 can auto-detect the dialect from the JDBC URL. In most cases, you can remove the explicit dialect configuration entirely.
HQL got stricter. Implicit joins that worked in Hibernate 5 may throw syntax errors in Hibernate 6. SELECT u.orders FROM User u without an explicit JOIN no longer works — you need SELECT o FROM User u JOIN u.orders o.
The second-level cache integration moved from Ehcache 3 to JCache (JSR-107). If you were using Ehcache directly, the configuration changes are significant.
Bottom line: if you are on Boot 3 with Hibernate 6, enable SQL logging, run your full test suite, and check every query that uses HQL or native SQL. The upgrade is worth it — Hibernate 6 has better performance, better type safety, and better Jakarta EE alignment — but it is not transparent.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

// Hibernate 6: GenerationType.AUTO defaults to SEQUENCE, not TABLE
@Entity
public class Product {

    @Id
    // In Hibernate 5: AUTO picked TABLE strategy
    // In Hibernate 6: AUTO picks SEQUENCE strategy
    // Explicit is better — specify the strategy you want
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "product_seq")
    @SequenceGenerator(name = "product_seq", sequenceName = "product_sequence", allocationSize = 50)
    private Long id;

    @Column(nullable = false)
    private String name;

    @Column(precision = 10, scale = 2)
    private java.math.BigDecimal price;
}

// application.properties for Hibernate 6 / Spring Boot 3
// spring.jpa.hibernate.ddl-auto=validate
// spring.jpa.show-sql=true
// spring.jpa.properties.hibernate.format_sql=true
// spring.jpa.open-in-view=false
// No dialect needed — Hibernate 6 auto-detects from JDBC URL
JPA vs Hibernate — The Core Distinction
The distinction maps cleanly to the specification vs implementation pattern common across Java EE:
JPA defines EntityManager; Hibernate implements it — and also provides Session, its own earlier API that does the same thing. JPA defines JPQL for queries; Hibernate supports JPQL and extends it with HQL (extra functions, FROM without SELECT, etc.). JPA defines @Cacheable for second-level caching; Hibernate implements the cache with @Cache and lets you choose the region factory. JPA defines cascading and fetch strategies; Hibernate adds extra fetch modes (SUBSELECT, BATCH) not in the spec.
In Spring Boot with Spring Data JPA, you almost never touch EntityManager or Session directly. Spring Data repositories (JpaRepository) wrap JPA, which wraps Hibernate. But when you need to tune performance — batch fetching, custom HQL, second-level cache, statistics — you drop to Hibernate-specific APIs.
The pragmatic rule: code against JPA by default. Reach for Hibernate-specific APIs only when you have a concrete need that JPA cannot satisfy. Do not import org.hibernate.Session in a service that only does CRUD — that is premature coupling.
package io.thecodeforge.hibernate_vs_jpa;

import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.List;
import java.util.Optional;

// Spring Data JPA — you never see EntityManager or Session
public interface UserRepository extends JpaRepository<User, Long> {

    Optional<User> findByEmail(String email);

    List<User> findByStatusOrderByCreatedAtDesc(UserStatus status);

    // JPQL — portable across JPA providers
    @Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.id = :id")
    Optional<User> findByIdWithOrders(@Param("id") Long id);

    // EntityGraph — declarative fetch path, JPA standard
    @EntityGraph(attributePaths = {"orders"})
    List<User> findAll();
}
The Entity Lifecycle — The Concept Most Tutorials Skip
Every JPA entity exists in one of four states. Understanding these states is fundamental to understanding why persist() does not immediately hit the database, why merge() returns a different object, and what 'detached entity passed to persist' errors mean.
New (Transient): The object exists in Java memory but Hibernate knows nothing about it. No database row corresponds to it. You created it with new User().
Managed (Persistent): The object is tracked by the persistence context (EntityManager/Session). Any changes to it are automatically detected and flushed to the database at transaction commit. This is dirty checking.
Detached: The object was once managed, but the persistence context was closed (transaction ended, EntityManager cleared). It has a database row, but Hibernate no longer tracks changes. Calling persist() on a detached entity throws an exception. You must use merge() to reattach it.
Removed: The object is scheduled for deletion. The actual DELETE happens at flush time.
The critical transitions: persist() takes a transient entity to managed. detach() takes a managed entity to detached. merge() takes a detached entity and returns a new managed copy. remove() takes a managed entity to removed.
Note that merge() returns a NEW object. The original detached entity is not reattached; instead, a new managed copy is created. This is why you must always use the return value: write user = entityManager.merge(user); rather than calling entityManager.merge(user) and continuing to work with the old reference.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.EntityManager;
import jakarta.persistence.EntityManagerFactory;
import jakarta.persistence.EntityTransaction;

public class EntityLifecycleDemo {

    private final EntityManagerFactory emf;

    public EntityLifecycleDemo(EntityManagerFactory emf) {
        this.emf = emf;
    }

    public void demonstrateLifecycle() {
        EntityManager em = emf.createEntityManager();
        EntityTransaction tx = em.getTransaction();

        // 1. TRANSIENT — new object, Hibernate knows nothing
        User user = new User();
        user.setName("Jane");
        user.setEmail("jane@example.com");
        // user is transient — no database row, no persistence context tracking

        tx.begin();

        // 2. MANAGED — persist() moves it into the persistence context
        em.persist(user);
        // user is now managed. Any changes are tracked via dirty checking.
        // The INSERT SQL may not fire immediately — it fires at flush time.

        user.setName("Jane Doe"); // dirty check: Hibernate detects this change
        // At flush time: UPDATE users SET name='Jane Doe' WHERE id=1

        tx.commit(); // flush happens here — INSERT + UPDATE executed
        em.close();  // persistence context closes

        // 3. DETACHED — em is closed, user is no longer tracked
        user.setName("Janet");
        // This change is LOST — Hibernate is not tracking user anymore

        EntityManager em2 = emf.createEntityManager();
        EntityTransaction tx2 = em2.getTransaction();
        tx2.begin();

        // WRONG: em2.persist(user); // throws EntityExistsException — detached entity
        // CORRECT: merge() returns a NEW managed copy
        User managedUser = em2.merge(user);
        // managedUser is managed. user (the original) is still detached.

        managedUser.setName("Janet Updated");
        // This change IS tracked — managedUser is in em2's persistence context

        tx2.commit();
        em2.close();

        // 4. REMOVED — entity scheduled for deletion
        EntityManager em3 = emf.createEntityManager();
        EntityTransaction tx3 = em3.getTransaction();
        tx3.begin();

        User toDelete = em3.find(User.class, 1L);
        em3.remove(toDelete);
        // toDelete is now in REMOVED state. DELETE fires at flush/commit.

        tx3.commit();
        em3.close();
    }
}
Transient: after new User() — no DB row, nothing tracked.
Managed: after persist() — tracked, dirty checking active, INSERT queued.
Detached: after em.close() — has DB row, changes ignored by Hibernate.
Removed: after remove() — DELETE queued for flush time.
merge() returns a NEW managed copy — original reference stays detached.
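The transitions above can be condensed into a small state machine. The sketch below is a toy model in plain Java; the enum, transition map, and apply method are invented for illustration and are not Hibernate APIs (for instance, JPA actually allows merge() on an already-managed entity as a no-op, which this model omits):

```java
import java.util.Map;

// Toy model of the four lifecycle states and the main legal transitions.
// All names here are invented for illustration; this is not Hibernate code.
public class LifecycleStateMachine {

    public enum LifecycleState { TRANSIENT, MANAGED, DETACHED, REMOVED }

    // operation name -> (required starting state, resulting state)
    private static final Map<String, Map.Entry<LifecycleState, LifecycleState>> TRANSITIONS = Map.of(
        "persist", Map.entry(LifecycleState.TRANSIENT, LifecycleState.MANAGED),
        "detach",  Map.entry(LifecycleState.MANAGED,   LifecycleState.DETACHED),
        "merge",   Map.entry(LifecycleState.DETACHED,  LifecycleState.MANAGED),
        "remove",  Map.entry(LifecycleState.MANAGED,   LifecycleState.REMOVED)
    );

    public static LifecycleState apply(LifecycleState current, String operation) {
        var t = TRANSITIONS.get(operation);
        if (t == null || t.getKey() != current) {
            // e.g. persist() on a DETACHED entity -> EntityExistsException in JPA
            throw new IllegalStateException(operation + " not allowed from " + current);
        }
        return t.getValue();
    }
}
```

Walking an entity through the model (persist, then merge after detachment) reproduces the transitions described above, and illegal calls like persist() on a detached entity fail loudly, just as the real EntityManager does.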
Dirty Checking — How Hibernate Knows What to Update
Dirty checking is the mechanism by which Hibernate detects which entity fields have changed since they were loaded, and generates the appropriate UPDATE statements. It is always on for managed entities and it is the reason you never need to call an explicit update() method in JPA.
When you load an entity with find() or a query, Hibernate stores a snapshot of the entity's state in the persistence context. At flush time, it compares the current state to the snapshot. If any field differs, Hibernate generates an UPDATE for that entity. If nothing changed, no SQL is fired.
This is why @Transactional(readOnly=true) matters. When Spring marks a transaction as readOnly, Hibernate can skip dirty checking entirely — it does not need to compare snapshots because it knows nothing will change. For read-heavy services, this saves CPU cycles proportional to the number of entities loaded in that transaction.
The cost of dirty checking is proportional to the number of managed entities in the persistence context. If you load 10,000 entities in a single transaction, Hibernate compares all 10,000 at flush time. This is where session.clear() in batch processing comes in — it empties the persistence context so dirty checking does not grow unbounded.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import java.util.List;
import org.springframework.data.domain.PageRequest;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class UserService {

    private final UserRepository userRepository;

    @PersistenceContext
    private EntityManager entityManager;

    public UserService(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    // readOnly=true — Hibernate skips dirty checking
    // No snapshot comparison, no unnecessary UPDATE statements
    @Transactional(readOnly = true)
    public User getUser(Long id) {
        return userRepository.findById(id).orElseThrow();
        // Even if you modify the returned object, no UPDATE fires
        // because the transaction is marked readOnly
    }

    // readOnly=false (default) — dirty checking is active
    @Transactional
    public void updateUserName(Long id, String newName) {
        User user = userRepository.findById(id).orElseThrow();
        user.setName(newName);
        // Hibernate detects the change via dirty checking
        // At commit: UPDATE users SET name='newName' WHERE id=1
        // You never call an explicit update() — Hibernate handles it
    }

    // Batch processing — flush and clear per page so dirty checking stays bounded.
    // Pages are loaded fresh each iteration: after clear(), previously loaded
    // entities are detached and changes to them would be silently lost.
    // (Assumes a derived query findByStatus(UserStatus, Pageable) on the repository.)
    @Transactional
    public void bulkUpdateStatus(UserStatus oldStatus, UserStatus newStatus) {
        List<User> batch;
        while (!(batch = userRepository.findByStatus(oldStatus, PageRequest.of(0, 50))).isEmpty()) {
            for (User user : batch) {
                user.setStatus(newStatus); // updated rows no longer match oldStatus
            }
            entityManager.flush();  // push this page's UPDATEs
            entityManager.clear();  // empty the persistence context
        }
    }
}
updateUserName(): dirty checking detects name change, UPDATE fires at commit.
bulkUpdateStatus(): flush+clear every 50 records prevents persistence context bloat.
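The snapshot-and-compare mechanics can be sketched in a few lines of plain Java. This is an illustrative model with invented names, not Hibernate's actual implementation (which snapshots hydrated field arrays, not maps):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of dirty checking: take a snapshot of field values at load time,
// compare at "flush", and report an UPDATE only for entities whose state changed.
public class DirtyCheckDemo {

    // snapshot taken when the "entity" (a field map) enters the context
    private final Map<Long, Map<String, Object>> snapshots = new HashMap<>();
    private final Map<Long, Map<String, Object>> managed = new HashMap<>();

    public void load(long id, Map<String, Object> state) {
        managed.put(id, state);
        snapshots.put(id, new HashMap<>(state)); // defensive copy = the snapshot
    }

    // returns the ids that need an UPDATE at flush time
    public List<Long> flush() {
        List<Long> dirty = new ArrayList<>();
        for (var entry : managed.entrySet()) {
            if (!entry.getValue().equals(snapshots.get(entry.getKey()))) {
                dirty.add(entry.getKey()); // would generate: UPDATE ... WHERE id = ?
            }
        }
        return dirty;
    }
}
```

Load two "entities", mutate one, and flush() reports only the mutated id; an unchanged entity produces no SQL. It also makes the cost model visible: flush() walks every managed entity, which is exactly why clearing the context matters in batch jobs.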
ID Generation Strategies — The Performance Trap Nobody Warns You About
JPA provides four ID generation strategies, and the choice has real performance implications that most tutorials ignore.
IDENTITY: Uses database auto-increment (MySQL AUTO_INCREMENT, SQL Server IDENTITY). Simple, but it disables Hibernate's JDBC batch inserts. The reason: Hibernate needs the ID before it can batch the INSERT, but the ID is only available after the INSERT executes. Every INSERT is a separate round-trip. For bulk inserts, this is catastrophically slow.
SEQUENCE: Uses a database sequence (PostgreSQL, Oracle). Supports batch inserts because Hibernate can pre-allocate a range of IDs (allocationSize) in a single sequence call, then batch the INSERTs. This is the correct default for PostgreSQL and Oracle.
TABLE: Uses a separate table to simulate a sequence. Works on all databases but is the slowest option — an extra table lock for every ID allocation. Avoid it unless you are on MySQL and need portability.
AUTO: Lets the provider pick. In Hibernate 5, this defaulted to TABLE. In Hibernate 6, it defaults to SEQUENCE. Never rely on AUTO — always specify the strategy explicitly.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

public class IdGenerationDemo {

    // STRATEGY 1: IDENTITY — simple, but disables batch inserts
    @Entity
    @Table(name = "users_identity")
    public static class UserIdentity {
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;
        private String name;
    }

    // STRATEGY 2: SEQUENCE — supports batch inserts, best for PostgreSQL/Oracle
    @Entity
    @Table(name = "users_sequence")
    public static class UserSequence {
        @Id
        @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "user_seq")
        @SequenceGenerator(
            name = "user_seq",
            sequenceName = "user_sequence",
            allocationSize = 50 // pre-allocate 50 IDs per sequence call
        )
        private Long id;
        private String name;
    }

    // STRATEGY 3: TABLE — portable but slowest
    @Entity
    @Table(name = "users_table")
    public static class UserTable {
        @Id
        @GeneratedValue(strategy = GenerationType.TABLE, generator = "user_tbl")
        @TableGenerator(
            name = "user_tbl",
            table = "id_generator",
            pkColumnName = "gen_name",
            valueColumnName = "gen_value",
            pkColumnValue = "user_id",
            allocationSize = 25
        )
        private Long id;
        private String name;
    }
}
IDENTITY: 1000 inserts = 1000 individual INSERT round-trips (no batching possible).
SEQUENCE: 1000 inserts with allocationSize=50 = 20 sequence calls + batched INSERTs.
TABLE: 1000 inserts with allocationSize=25 = 40 allocations, each locking the generator table, + batched INSERTs.
SEQUENCE is the clear winner for throughput on databases that support it.
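The counts above are simple arithmetic, and writing them out makes the gap concrete. The helper below is illustrative only; it assumes one round-trip per unbatched statement and per sequence call, which real drivers, networks, and pooled-optimizer settings will complicate:

```java
// Rough round-trip arithmetic for N inserts under each ID strategy.
// Invented helper for illustration — not a Hibernate API.
public class IdStrategyMath {

    // IDENTITY: no batching — every INSERT is its own round-trip
    public static int identityRoundTrips(int inserts) {
        return inserts;
    }

    // SEQUENCE: one sequence call per allocationSize IDs, plus batched INSERTs
    public static int sequenceRoundTrips(int inserts, int allocationSize, int jdbcBatchSize) {
        int sequenceCalls = (int) Math.ceil((double) inserts / allocationSize);
        int batchedInserts = (int) Math.ceil((double) inserts / jdbcBatchSize);
        return sequenceCalls + batchedInserts;
    }
}
```

For 1000 inserts, identityRoundTrips(1000) gives 1000 trips, while sequenceRoundTrips(1000, 50, 50) gives 20 sequence calls plus 20 batched INSERT statements, 40 trips total: a 25x reduction before any network latency is counted.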
The N+1 Query Problem — The Most Expensive Hibernate Mistake
The N+1 problem is the most common performance issue in Hibernate applications, and it is caused by lazy loading. When you load a list of N Users and then access their Orders, Hibernate fires 1 query for the users and then N additional queries — one per user — to load the orders. At scale, this is catastrophic.
I have debugged this in production more times than I can count. The symptom is always the same: a page loads fine with 10 records but grinds to a halt with 100. The database CPU spikes. The APM tool shows thousands of identical queries with different IDs. The developer swears the code is correct because it works in development with 5 test records.
Four ways to fix it:
1. JOIN FETCH in JPQL: forces an eager join for that specific query without changing the entity mapping.
2. @EntityGraph: declares fetch paths declaratively on the repository method.
3. @BatchSize on the association: Hibernate loads lazy collections in batches of N instead of one at a time.
4. Spring Data projections: fetch only the fields you need, with no associations loaded.
The default FetchType.LAZY on @OneToMany is correct — you do not want to load all associations every time. The fix is to fetch eagerly only when you explicitly need the data.
package io.thecodeforge.hibernate_vs_jpa;

import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.List;

public interface UserRepository extends JpaRepository<User, Long> {

    // THE PROBLEM: findAll() loads users, then accessing orders triggers N queries
    // 1 query:   SELECT * FROM users
    // N queries: SELECT * FROM orders WHERE user_id = ? (one per user)

    // FIX 1: JOIN FETCH — one query with an inner join
    @Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.status = :status")
    List<User> findActiveUsersWithOrders(@Param("status") UserStatus status);

    // FIX 2: EntityGraph — one query with a left join, declarative
    @EntityGraph(attributePaths = {"orders"})
    List<User> findAll();

    // FIX 4: Projection — fetch only what you need, no associations loaded
    // interface UserSummary {
    //     String getName();
    //     String getEmail();
    //     int getOrderCount(); // derived via @Query
    // }
    // @Query("SELECT u.name as name, u.email as email, SIZE(u.orders) as orderCount FROM User u")
    // List<UserSummary> findUserSummaries();
}

// FIX 3: @BatchSize on the entity (Hibernate-specific)
// @OneToMany(mappedBy = "user", fetch = FetchType.LAZY)
// @org.hibernate.annotations.BatchSize(size = 25)
// private List<Order> orders;
// Instead of N queries, fires ceiling(N/25) queries
JOIN FETCH: 100 users = 1 SQL query with JOIN.
EntityGraph: 100 users = 1 SQL query with LEFT JOIN.
BatchSize(25): 100 users = 5 SQL queries (1 + ceiling(100/25)).
Projection: 100 users = 1 SQL query, only selected columns.
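The per-fix query counts above reduce to one formula per strategy. A throwaway sketch (invented helper, not a Hibernate API) makes the arithmetic explicit:

```java
// Query-count arithmetic for loading N users and their orders under each
// fetch strategy. Illustrative only — real counts depend on mappings.
public class QueryCountMath {

    // Lazy one-at-a-time: 1 query for the users + 1 per user for orders
    public static int nPlusOne(int users) {
        return 1 + users;
    }

    // @BatchSize(size = b): 1 query for the users + ceil(users / b) IN-list queries
    public static int withBatchSize(int users, int batchSize) {
        return 1 + (int) Math.ceil((double) users / batchSize);
    }

    // JOIN FETCH / EntityGraph / projection: everything in a single query
    public static int withJoinFetch(int users) {
        return 1;
    }
}
```

The pathology is the linear term: nPlusOne(10) is a tolerable 11 queries in development, but nPlusOne(1000) is 1001 in production, while both fixed strategies stay flat or near-flat.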
FetchType.EAGER — The Default That Should Not Exist
This deserves its own section because it causes more production incidents than any other Hibernate configuration issue.
JPA specifies that @ManyToOne and @OneToOne default to FetchType.EAGER. This means every time you load an entity with a @ManyToOne relationship, Hibernate also loads the related entity — even if you never access it. For a single entity, this is fine. For a list query returning 1,000 entities, each with an EAGER @ManyToOne, you get 1,000 extra queries or a massive join.
The rule: set FetchType.LAZY on every @ManyToOne and @OneToOne unless you have a specific reason not to. Yes, JPA defaults to EAGER. JPA's defaults are wrong for production use. Override them.
For @OneToMany and @ManyToMany, JPA already defaults to LAZY, which is correct. Never change these to EAGER unless you enjoy debugging Cartesian products in production.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

@Entity
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // WRONG: JPA defaults to EAGER for @ManyToOne
    // Every Order query also loads the User — even if you do not need it
    // @ManyToOne
    // private User user;

    // CORRECT: explicitly set LAZY
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "user_id")
    private User user;

    // @OneToOne also defaults to EAGER — override it
    @OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @JoinColumn(name = "shipping_address_id")
    private Address shippingAddress;

    private java.math.BigDecimal totalAmount;
}

// The impact:
// List<Order> orders = orderRepository.findAll(); // 1000 orders
// EAGER @ManyToOne User: 1000 additional SELECT queries (or one massive JOIN)
// LAZY  @ManyToOne User: 0 additional queries until you call order.getUser()
LAZY @ManyToOne on 1000 orders: 1 SQL query (users loaded only when accessed).
Always override @ManyToOne and @OneToOne to FetchType.LAZY.
Cascade Types — What Each One Actually Does
Cascade determines which operations on the parent entity are automatically propagated to its children. Getting this wrong causes either orphaned child entities (too little cascade) or accidental deletion of shared data (too much cascade).
PERSIST: saving the parent saves new children. If you add a new Order to a User's orders list and persist the User, the Order is also persisted.
MERGE: merging the parent merges detached children. If you merge a User with modified Orders, the Orders are also merged.
REMOVE: deleting the parent deletes all children. This is dangerous on @ManyToOne relationships — if two Users share an Address, deleting one User deletes the Address for both.
DETACH: detaching the parent detaches children from the persistence context.
REFRESH: refreshing the parent reloads children from the database.
ALL: all of the above. Use with caution.
orphanRemoval is separate from cascade but equally important. When true, removing a child from the parent's collection causes Hibernate to DELETE the child from the database. Without orphanRemoval, the child is just unlinked (foreign key set to null) but the row remains.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Entity
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    // PERSIST + REMOVE + orphanRemoval: Orders are lifecycle-bound to User
    // Adding an Order to this list and saving User persists the Order
    // Removing an Order from this list DELETES the Order (orphanRemoval)
    // Deleting the User DELETES all Orders (REMOVE cascade)
    @OneToMany(
        mappedBy = "user",
        cascade = {CascadeType.PERSIST, CascadeType.REMOVE},
        orphanRemoval = true,
        fetch = FetchType.LAZY
    )
    private List<Order> orders = new ArrayList<>();

    // Convenience methods to keep both sides in sync
    public void addOrder(Order order) {
        orders.add(order);
        order.setUser(this);
    }

    public void removeOrder(Order order) {
        orders.remove(order);
        order.setUser(null);
        // orphanRemoval: Hibernate DELETEs this Order from the database
    }

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public List<Order> getOrders() { return orders; }
}
REMOVE: userRepository.delete(user) → DELETE FROM orders WHERE user_id = ?, then DELETE FROM users WHERE id = ?.
orphanRemoval: user.removeOrder(order) → DELETE FROM orders WHERE id = ?.
Without orphanRemoval: user.removeOrder(order) → UPDATE orders SET user_id = NULL WHERE id = ?.
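At flush time, orphan detection is essentially a diff between the collection as it was loaded and the collection now. A toy model of that diff (invented names, simplified SQL strings), not Hibernate's collection-tracking code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: children present at load time but missing at flush time are
// orphans. With orphanRemoval=true they get a DELETE; otherwise the FK is
// merely set to NULL and the row survives.
public class OrphanRemovalDemo {

    public static List<String> flushSql(Set<Long> loadedChildIds,
                                        Set<Long> currentChildIds,
                                        boolean orphanRemoval) {
        List<String> sql = new ArrayList<>();
        Set<Long> orphans = new HashSet<>(loadedChildIds);
        orphans.removeAll(currentChildIds);
        for (Long id : orphans) {
            sql.add(orphanRemoval
                ? "DELETE FROM orders WHERE id = " + id
                : "UPDATE orders SET user_id = NULL WHERE id = " + id);
        }
        return sql;
    }
}
```

Removing child 2 from a parent loaded with children {1, 2} yields a DELETE for id 2 when orphanRemoval is on, and only an FK-nulling UPDATE when it is off — the exact difference the three summary lines above describe.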
Optimistic Locking — Preventing Silent Data Loss from Concurrent Updates
Without locking, two users can load the same entity, modify different fields, and save — the second save overwrites the first user's changes silently. This is the lost update problem, and it is more common than teams realise.
JPA provides @Version for optimistic locking. Add a @Version field to your entity, and Hibernate automatically checks it at flush time. If the version in the database is higher than the version in the managed entity (meaning another transaction updated it), Hibernate throws OptimisticLockException.
Optimistic locking is the correct default for most web applications. It does not lock database rows — it just detects conflicts. The alternative, pessimistic locking (SELECT ... FOR UPDATE), locks rows for the duration of the transaction, which kills concurrency under load.
The @Version field can be an integral type (int, short, long, or their wrappers) or a timestamp type such as java.sql.Timestamp (Hibernate also accepts Instant). Hibernate increments it automatically on every update. You never set it manually.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.math.BigDecimal;

@Entity
public class Product {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private BigDecimal price;

    @Version
    private Long version; // Hibernate manages this automatically

    // When two transactions update this Product simultaneously:
    // Transaction A: loads Product (version=1), changes price to $10
    // Transaction B: loads Product (version=1), changes name to "Widget"
    // Transaction A commits: UPDATE products SET price=10, version=2 WHERE id=1 AND version=1
    // Transaction B commits: UPDATE products SET name='Widget', version=2 WHERE id=1 AND version=1
    //   → 0 rows updated (version is now 2, not 1)
    //   → Hibernate throws OptimisticLockException
    //   → Transaction B rolls back

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public BigDecimal getPrice() { return price; }
    public void setPrice(BigDecimal price) { this.price = price; }
    public Long getVersion() { return version; }
}

// Handling the exception in a service:
// @Transactional
// public void updatePrice(Long productId, BigDecimal newPrice) {
//     Product product = productRepository.findById(productId).orElseThrow();
//     product.setPrice(newPrice);
//     // If OptimisticLockException is thrown, Spring rolls back the transaction
//     // Retry logic or user notification goes here
// }
Transaction B commits: version is now 2, WHERE clause expects 1, 0 rows updated.
Hibernate throws OptimisticLockException, Transaction B rolls back.
No data lost — Transaction B must retry with the latest version.
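Under the hood, the versioned UPDATE is a compare-and-set expressed in SQL. A minimal in-memory sketch of the same semantics, using AtomicLong.compareAndSet in place of the WHERE version = ? clause (invented class, purely illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the versioned UPDATE: the "WHERE clause" is the expected
// version the transaction read. If another writer bumped it, zero rows match
// and the update fails — the point where Hibernate throws
// OptimisticLockException.
public class VersionedRow {

    private final AtomicLong version = new AtomicLong(1);
    private volatile String payload = "initial";

    // UPDATE ... SET payload = ?, version = expected + 1
    // WHERE id = ? AND version = expected
    public boolean update(long expectedVersion, String newPayload) {
        if (version.compareAndSet(expectedVersion, expectedVersion + 1)) {
            payload = newPayload;
            return true;  // 1 row updated
        }
        return false;     // 0 rows updated → conflict detected
    }

    public long currentVersion() { return version.get(); }
    public String payload() { return payload; }
}
```

Two writers that both read version 1 replay the scenario in the entity's comments: the first update(1, ...) succeeds and bumps the version to 2; the second update(1, ...) finds the version has moved and fails, losing nothing silently.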
Second-Level Cache — Configuration That Actually Works
The first-level cache is the persistence context — always on, scoped to the EntityManager/Session, and cannot be disabled. The second-level cache is optional, shared across Sessions, and stores entity data outside the persistence context.
The second-level cache is useful for read-mostly reference data: countries, currencies, product categories, configuration tables. It is NOT useful for frequently updated data — the cache invalidation overhead outweighs the benefit.
Hibernate 6 uses JCache (JSR-107) as the cache abstraction. You need a JCache provider on the classpath — Ehcache, Infinispan, Hazelcast, or Caffeine. Then you annotate entities with @Cacheable and configure the cache region.
The key gotcha: @Cacheable is a JPA annotation that marks the entity as cacheable, but it does not configure HOW to cache. The Hibernate-specific @Cache annotation specifies the concurrency strategy: READ_ONLY (never changes), READ_WRITE (changes allowed, uses soft locks), NONSTRICT_READ_WRITE (changes allowed, no locks, may serve stale data briefly), TRANSACTIONAL (XA transactional cache, rarely used).
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

@Entity
@Cacheable // JPA standard: marks entity as eligible for second-level cache
@org.hibernate.annotations.Cache(usage = org.hibernate.annotations.CacheConcurrencyStrategy.READ_WRITE)
@Table(name = "countries")
public class Country {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false, unique = true, length = 2)
    private String code;

    @Column(nullable = false, length = 100)
    private String name;

    // This entity is read often, updated rarely — perfect for caching
    // After first load, subsequent findById() calls hit the cache, not the DB
}

// application.properties — Hibernate 6 / Spring Boot 3 with Ehcache
// (note the hibernate.javax.cache.* prefix — these are Hibernate's JCache settings)
// spring.jpa.properties.hibernate.cache.use_second_level_cache=true
// spring.jpa.properties.hibernate.cache.region.factory_class=org.hibernate.cache.jcache.JCacheRegionFactory
// spring.jpa.properties.hibernate.javax.cache.provider=org.ehcache.jsr107.EhcacheCachingProvider
// spring.jpa.properties.hibernate.javax.cache.uri=classpath:ehcache.xml

// Query-level cache (separate from entity cache):
// @QueryHints({@QueryHint(name = "org.hibernate.cacheable", value = "true")})
// @Query("SELECT c FROM Country c WHERE c.code = :code")
// Optional<Country> findByCode(@Param("code") String code);
Second findById(1): cache hit — no SQL fired.
Update country: cache invalidated for that region, next read fetches from DB.
READ_WRITE strategy: uses soft locks during concurrent updates to prevent stale reads.
SessionFactory vs EntityManagerFactory — The Production Reality
EntityManagerFactory is the JPA standard factory for EntityManager. SessionFactory is Hibernate's native factory for Session. In a Spring Boot application, you rarely create either directly — Spring manages the lifecycle and provides them as beans.
In Spring Boot, the auto-configured JPA setup creates a LocalContainerEntityManagerFactoryBean, which wraps a Hibernate SessionFactory internally. You can unwrap it: entityManagerFactory.unwrap(SessionFactory.class) to access Hibernate-native features.
For transaction management, Spring uses JpaTransactionManager, which works with EntityManager and integrates with @Transactional. The @Transactional annotation is JPA-agnostic — it works whether the underlying provider is Hibernate or EclipseLink. The Spring transaction interceptor acquires the EntityManager, begins the transaction, commits or rolls back, and closes the EntityManager.
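That begin/commit-or-rollback/close choreography can be reduced to a plain-Java sketch. The Tx interface and transactional helper below are illustrative stand-ins, not Spring's actual types; they show the control flow the interceptor wraps around every @Transactional method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class TxInterceptorSketch {

    // Illustrative stand-in for the EntityManager/transaction pair Spring manages
    interface Tx {
        void begin();
        void commit();
        void rollback();
        void close();
    }

    // What @Transactional effectively wraps around your service method
    static <T> T transactional(Tx tx, Supplier<T> body) {
        tx.begin();
        try {
            T result = body.get();
            tx.commit();       // success path: flush pending changes, then commit
            return result;
        } catch (RuntimeException e) {
            tx.rollback();     // failure path: discard all changes
            throw e;
        } finally {
            tx.close();        // the EntityManager is closed either way
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Tx tx = new Tx() {
            public void begin() { log.add("begin"); }
            public void commit() { log.add("commit"); }
            public void rollback() { log.add("rollback"); }
            public void close() { log.add("close"); }
        };
        transactional(tx, () -> "ok");
        System.out.println(log); // [begin, commit, close]
    }
}
```

The real interceptor adds propagation rules, rollback-rule matching, and synchronization callbacks, but the skeleton is this.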
The key production configuration: spring.jpa.open-in-view. By default in Spring Boot, this is true, which means the EntityManager stays open for the entire HTTP request — even after the transaction commits. This causes lazy loading to work in controllers and templates, which sounds convenient but hides N+1 problems and keeps database connections open longer than necessary. Set it to false in production and fix any lazy-loading-outside-transaction errors explicitly.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import java.util.List;

@Service
public class BulkImportService {

    @PersistenceContext
    private EntityManager em;

    @Transactional
    public void bulkInsert(List<User> users) {
        // Unwrap to get Hibernate Session for native features
        Session session = em.unwrap(Session.class);
        session.setJdbcBatchSize(100);
        for (int i = 0; i < users.size(); i++) {
            session.persist(users.get(i));
            if (i % 100 == 0 && i > 0) {
                session.flush();
                session.clear(); // free memory — prevents OOM on large imports
            }
        }
    }

    // Accessing SessionFactory directly (rare, but needed for StatelessSession)
    public void highThroughputExport(SessionFactory sessionFactory) {
        var stateless = sessionFactory.openStatelessSession();
        var tx = stateless.beginTransaction();
        try {
            var results = stateless
                .createQuery("FROM User u WHERE u.status = :status", User.class)
                .setParameter("status", UserStatus.ACTIVE)
                .scroll(org.hibernate.ScrollMode.FORWARD_ONLY);
            while (results.next()) {
                User u = results.get();
                // process without loading into first-level cache
            }
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        } finally {
            stateless.close();
        }
    }
}

// application.properties — production settings
// spring.jpa.hibernate.ddl-auto=validate
// spring.jpa.open-in-view=false
// spring.jpa.properties.hibernate.jdbc.batch_size=50
// spring.jpa.properties.hibernate.order_inserts=true
// spring.jpa.properties.hibernate.order_updates=true
// spring.jpa.properties.hibernate.generate_statistics=false
// spring.jpa.properties.hibernate.connection.provider_disables_autocommit=true
flush+clear every 100 records: prevents persistence context from growing unbounded.
StatelessSession: no first-level cache, no dirty checking, maximum throughput.
open-in-view=false: EntityManager closed after transaction, lazy loading outside transaction throws LazyInitializationException.
@Embeddable and @Embedded — Value Objects Done Right
JPA lets you embed one entity's fields into another entity's table using @Embeddable and @Embedded. This is the correct pattern for value objects — concepts that have no identity of their own and are defined entirely by their values.
Address is the canonical example: an Address has no reason to exist as a separate table with its own primary key. It is a property of the entity that contains it. Embedding it avoids an unnecessary JOIN and keeps the data model clean.
When you need the same @Embeddable type twice in one entity (e.g., shipping address and billing address), use @AttributeOverrides to rename the columns so they do not collide.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

@Embeddable
public class Address {
    private String street;
    private String city;
    private String state;
    private String zipCode;
    private String country;

    public String getStreet() { return street; }
    public void setStreet(String street) { this.street = street; }
    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public String getState() { return state; }
    public void setState(String state) { this.state = state; }
    public String getZipCode() { return zipCode; }
    public void setZipCode(String zipCode) { this.zipCode = zipCode; }
    public String getCountry() { return country; }
    public void setCountry(String country) { this.country = country; }
}

@Entity
@Table(name = "users")
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    // Single embedded address — fields flattened into users table
    @Embedded
    private Address homeAddress;

    // Second embedded address — override column names to avoid collision
    @Embedded
    @AttributeOverrides({
        @AttributeOverride(name = "street", column = @Column(name = "billing_street")),
        @AttributeOverride(name = "city", column = @Column(name = "billing_city")),
        @AttributeOverride(name = "state", column = @Column(name = "billing_state")),
        @AttributeOverride(name = "zipCode", column = @Column(name = "billing_zip_code")),
        @AttributeOverride(name = "country", column = @Column(name = "billing_country"))
    })
    private Address billingAddress;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Address getHomeAddress() { return homeAddress; }
    public void setHomeAddress(Address homeAddress) { this.homeAddress = homeAddress; }
    public Address getBillingAddress() { return billingAddress; }
    public void setBillingAddress(Address billingAddress) { this.billingAddress = billingAddress; }
}

// Result: single users table with columns:
// id, name, street, city, state, zip_code, country,
// billing_street, billing_city, billing_state, billing_zip_code, billing_country
// No separate addresses table. No JOIN needed.
No separate addresses table — no JOIN required to read address data.
@AttributeOverrides renames columns for the second Address instance.
Query result: single table scan instead of JOIN with addresses table.
Soft Deletes — Deleting Without Losing Data
Hard deletes are permanent. In production, you almost never want to actually DELETE rows — you want to mark them as deleted so you can audit, recover, or comply with data retention policies. This is a soft delete.
In Hibernate 5, soft deletes were implemented with @SQLDelete (custom UPDATE on delete) and @Where (automatic WHERE clause on queries). In Hibernate 6, @Where was replaced by @SQLRestriction, and a new annotation @SoftDelete was introduced that handles the entire pattern declaratively.
The key components: @SQLDelete replaces the DELETE statement with an UPDATE, @SQLRestriction adds a WHERE clause to all queries for that entity, and @DynamicUpdate ensures UPDATE statements only include changed columns.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.time.LocalDateTime;

@Entity
@Table(name = "users")
@org.hibernate.annotations.SQLDelete(sql = "UPDATE users SET deleted = true, deleted_at = CURRENT_TIMESTAMP WHERE id = ?")
@org.hibernate.annotations.SQLRestriction("deleted = false")
@org.hibernate.annotations.DynamicUpdate
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private String email;

    @Column(nullable = false)
    private boolean deleted = false;

    private LocalDateTime deletedAt;

    // Hibernate 6 alternative: @SoftDelete annotation (cleaner)
}
On query: automatic WHERE deleted = false added to every SELECT.
No data is ever physically deleted — perfect for audit and recovery.
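With Hibernate 6.4 or later, the same behaviour collapses into the single declarative @SoftDelete annotation. A sketch of what that might look like — the entity name here is hypothetical, and you should verify the annotation's attributes against your Hibernate version:

```java
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

// Hibernate 6.4+ declarative soft delete: replaces @SQLDelete + @SQLRestriction.
// Hibernate manages the indicator column itself, rewrites DELETE to UPDATE,
// and filters deleted rows out of every query automatically.
@Entity
@Table(name = "users")
@org.hibernate.annotations.SoftDelete(columnName = "deleted")
public class SoftDeletedUser {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private String email;

    // No manual 'deleted' field or custom SQL needed
}
```

The trade-off: @SoftDelete hides the indicator column from the entity entirely, so if you also need a deleted_at timestamp you are back to the explicit @SQLDelete pattern.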
Auditing with @CreatedDate, @LastModifiedDate — Zero Boilerplate
Spring Data JPA + @EnableJpaAuditing gives you automatic audit fields (createdAt, updatedAt, createdBy, modifiedBy) on every entity with zero manual code.
Create a @MappedSuperclass base entity and let every other entity extend it. Spring automatically populates the fields on persist and update.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import org.springframework.data.annotation.CreatedBy;
import org.springframework.data.annotation.CreatedDate;
import org.springframework.data.annotation.LastModifiedBy;
import org.springframework.data.annotation.LastModifiedDate;
import org.springframework.data.jpa.domain.support.AuditingEntityListener;
import java.time.LocalDateTime;

@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class BaseEntity {

    @CreatedDate
    private LocalDateTime createdAt;

    @LastModifiedDate
    private LocalDateTime updatedAt;

    @CreatedBy
    private String createdBy;

    @LastModifiedBy
    private String modifiedBy;

    public LocalDateTime getCreatedAt() { return createdAt; }
    public LocalDateTime getUpdatedAt() { return updatedAt; }
    public String getCreatedBy() { return createdBy; }
    public String getModifiedBy() { return modifiedBy; }
}

@Entity
@Table(name = "users")
public class User extends BaseEntity {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private String email;

    // createdAt, updatedAt, createdBy, modifiedBy inherited from BaseEntity

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}

// Configuration class:
// @Configuration
// @EnableJpaAuditing
// public class JpaConfig {
//     @Bean
//     public AuditorAware<String> auditorAware() {
//         return () -> Optional.ofNullable(SecurityContextHolder.getContext())
//             .map(SecurityContext::getAuthentication)
//             .map(Authentication::getName)
//             .orElse("system");
//     }
// }
On merge/update: updatedAt and modifiedBy set automatically.
No manual timestamp or user-setting code in any service method.
All entities extending BaseEntity get auditing for free.
Inheritance Mapping — Three Strategies, One Decision
JPA supports three strategies for mapping class hierarchies to database tables. The choice affects query performance, schema complexity, and how polymorphic queries work.
SINGLE_TABLE: All subclasses stored in one table with a discriminator column. Simplest schema, fastest queries (no JOINs), but wastes columns for subclass-specific fields that are NULL for other types. Use when subclasses have few additional fields.
JOINED: Parent and each subclass get separate tables, joined by primary key. Normalised schema, no NULL columns, but polymorphic queries require JOINs. Use when data integrity and normalisation matter more than query speed.
TABLE_PER_CLASS: Each concrete subclass gets its own complete table with all inherited columns. No JOINs for type-specific queries, but polymorphic queries require UNION ALL across all tables. Rarely used — most teams choose JOINED or SINGLE_TABLE.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.math.BigDecimal;

// STRATEGY 1: SINGLE_TABLE — one table, discriminator column
@Entity
@Table(name = "payments")
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(name = "payment_type", discriminatorType = DiscriminatorType.STRING)
public abstract class Payment {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private BigDecimal amount;
}

@Entity
@DiscriminatorValue("CREDIT_CARD")
public class CreditCardPayment extends Payment {
    private String cardNumber;
    private String expiryDate;
}

@Entity
@DiscriminatorValue("BANK_TRANSFER")
public class BankTransferPayment extends Payment {
    private String bankAccount;
    private String routingNumber;
}

// Result: single payments table with columns:
// id, amount, payment_type, card_number, expiry_date, bank_account, routing_number
// bank_account and routing_number are NULL for credit card payments
// card_number and expiry_date are NULL for bank transfer payments

// STRATEGY 2: JOINED — separate tables per subclass
// @Entity
// @Inheritance(strategy = InheritanceType.JOINED)
// public abstract class Payment { ... }
// Result: payments table (id, amount),
//         credit_card_payments table (id, card_number, expiry_date),
//         bank_transfer_payments table (id, bank_account, routing_number)
// Polymorphic query: SELECT * FROM payments
//   LEFT JOIN credit_card_payments ... LEFT JOIN bank_transfer_payments ...

// STRATEGY 3: TABLE_PER_CLASS — complete table per concrete subclass
// @Entity
// @Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
// public abstract class Payment { ... }
// Result: credit_card_payments table (id, amount, card_number, expiry_date),
//         bank_transfer_payments table (id, amount, bank_account, routing_number)
// Polymorphic query: SELECT * FROM credit_card_payments
//   UNION ALL SELECT * FROM bank_transfer_payments
JOINED: N+1 tables, normalised, polymorphic queries require JOINs.
TABLE_PER_CLASS: N tables, duplicated columns, polymorphic queries require UNION ALL.
Most production apps use SINGLE_TABLE or JOINED. TABLE_PER_CLASS is rarely the right choice.
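Whichever strategy you map, polymorphic queries look identical in JPQL; only the generated SQL differs. A repository sketch (the PaymentRepository interface is hypothetical, assuming the Payment hierarchy above):

```java
package io.thecodeforge.hibernate_vs_jpa;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import java.util.List;

// Hypothetical repository for the Payment hierarchy shown above
public interface PaymentRepository extends JpaRepository<Payment, Long> {

    // Polymorphic: returns CreditCardPayment and BankTransferPayment instances.
    // SINGLE_TABLE → one SELECT; JOINED → LEFT JOINs; TABLE_PER_CLASS → UNION ALL.
    @Query("SELECT p FROM Payment p")
    List<Payment> findAllPayments();

    // Restrict to one subtype with the standard JPQL TYPE() operator
    @Query("SELECT p FROM Payment p WHERE TYPE(p) = CreditCardPayment")
    List<Payment> findCreditCardPayments();
}
```

This is why the strategy choice is a performance decision, not an API decision: your JPQL stays the same when you switch.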
Practical Patterns — Pagination, @Modifying, Native Queries, Criteria API
Spring Data JPA gives you powerful built-in patterns for pagination, bulk updates, native SQL, and dynamic queries.
Pagination is built-in with Pageable and Page<T>. @Modifying is required for UPDATE/DELETE queries. Native queries are needed for database-specific features (JSON, CTEs, window functions). Criteria API / Specifications are used for dynamic queries with optional filters.
package io.thecodeforge.hibernate_vs_jpa;

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.time.LocalDate;
import java.util.List;
import java.util.Optional;

public interface UserRepository extends JpaRepository<User, Long> {

    // PAGINATION — Spring Data handles LIMIT, OFFSET, and COUNT automatically
    Page<User> findByStatus(UserStatus status, Pageable pageable);

    // @Modifying — required for UPDATE and DELETE queries
    @Modifying
    @Query("UPDATE User u SET u.status = :status WHERE u.lastLogin < :date")
    int deactivateInactiveUsers(@Param("status") UserStatus status, @Param("date") LocalDate date);

    @Modifying
    @Query("DELETE FROM User u WHERE u.status = :status AND u.createdAt < :date")
    int deleteOldInactiveUsers(@Param("status") UserStatus status, @Param("date") LocalDate date);

    // JPQL — portable across JPA providers
    @Query("SELECT u FROM User u WHERE u.email = :email")
    Optional<User> findByEmailJPQL(@Param("email") String email);

    // NATIVE SQL — database-specific features (JSON, CTEs, window functions)
    @Query(value = "SELECT * FROM users WHERE preferences->>'theme' = :theme", nativeQuery = true)
    List<User> findByThemePreference(@Param("theme") String theme);

    // NATIVE with pagination — need countQuery for Page return type
    @Query(
        value = "SELECT * FROM users WHERE created_at > :date ORDER BY created_at DESC",
        countQuery = "SELECT COUNT(*) FROM users WHERE created_at > :date",
        nativeQuery = true
    )
    Page<User> findRecentUsers(@Param("date") LocalDate date, Pageable pageable);
}
deactivateInactiveUsers: UPDATE users SET status=? WHERE last_login < ? — returns row count.
findByThemePreference: SELECT * FROM users WHERE preferences->>'theme' = ? — PostgreSQL JSON operator.
findRecentUsers: native query with pagination — requires explicit countQuery.
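The Page metadata Spring Data returns is derived from the count query with simple arithmetic. A plain-Java sketch of how OFFSET and getTotalPages() fall out of a Pageable (the PageMath helper is illustrative, not Spring code):

```java
public class PageMath {

    // OFFSET the generated SQL uses for a zero-based page number
    static long offset(int pageNumber, int pageSize) {
        return (long) pageNumber * pageSize;
    }

    // getTotalPages() as computed from the separate COUNT query
    static int totalPages(long totalElements, int pageSize) {
        return (int) Math.ceil((double) totalElements / pageSize);
    }

    public static void main(String[] args) {
        // PageRequest.of(2, 10) → LIMIT 10 OFFSET 20
        System.out.println(offset(2, 10));      // 20
        // 25 matching rows, page size 10 → 3 pages (10 + 10 + 5)
        System.out.println(totalPages(25, 10)); // 3
    }
}
```

Note that large OFFSET values get slower as the database still scans the skipped rows; keyset (seek) pagination avoids this for deep pages.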
Criteria API — When JPQL Strings Are Not Enough
The Criteria API provides a programmatic, type-safe way to build queries. Instead of writing JPQL as strings (which have no compile-time checking), you construct queries using Java objects.
In practice, most teams use JPQL or Spring Data derived queries for simplicity. The Criteria API is useful when you need to build queries dynamically — a search form with optional filters, for example, where you do not know at compile time which WHERE clauses will be active.
Spring Data JPA provides Specifications (JpaSpecificationExecutor) as a higher-level abstraction over the Criteria API. This is the recommended approach for dynamic queries in Spring Boot applications.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.criteria.*;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
import java.util.ArrayList;
import java.util.List;

// Repository with Specification support
public interface UserRepository extends JpaRepository<User, Long>, JpaSpecificationExecutor<User> {
}

// Dynamic query builder using Specifications
public class UserSpecifications {

    public static Specification<User> withFilters(String name, UserStatus status, String emailDomain) {
        return (Root<User> root, CriteriaQuery<?> query, CriteriaBuilder cb) -> {
            List<Predicate> predicates = new ArrayList<>();

            if (name != null && !name.isEmpty()) {
                predicates.add(cb.like(cb.lower(root.get("name")), "%" + name.toLowerCase() + "%"));
            }
            if (status != null) {
                predicates.add(cb.equal(root.get("status"), status));
            }
            if (emailDomain != null && !emailDomain.isEmpty()) {
                predicates.add(cb.like(root.get("email"), "%@" + emailDomain));
            }

            return cb.and(predicates.toArray(new Predicate[0]));
        };
    }
}

// Usage in service:
// Specification<User> spec = UserSpecifications.withFilters("jane", UserStatus.ACTIVE, "example.com");
// Page<User> results = userRepository.findAll(spec, PageRequest.of(0, 20));
withFilters(null, null, null): SELECT * FROM users LIMIT 20 — no WHERE clause.
Specifications compose with .and() and .or() for complex queries.
Type-safe: compiler catches typos in field names, unlike JPQL strings.
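The composition model is the same one java.util.function.Predicate uses in the standard library. This plain-Java analogue shows the .and()/.or() chaining without needing a database; it is an illustration of the composition pattern, not the Criteria API itself, and the User record is hypothetical:

```java
import java.util.List;
import java.util.function.Predicate;

public class SpecCompositionSketch {

    record User(String name, String email, boolean active) {}

    // Compose independent, reusable predicates,
    // exactly like Specification.where(a).and(b).or(c)
    static Predicate<User> spec() {
        Predicate<User> nameContainsJane = u -> u.name().toLowerCase().contains("jane");
        Predicate<User> isActive = User::active;
        Predicate<User> fromExampleDomain = u -> u.email().endsWith("@example.com");
        return nameContainsJane.and(isActive).or(fromExampleDomain);
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Jane Doe", "jane@other.com", true),   // matches: name AND active
            new User("Bob", "bob@example.com", false),      // matches: email domain
            new User("Jane Roe", "jane@x.com", false)       // no match
        );
        System.out.println(users.stream().filter(spec()).count()); // 2
    }
}
```

Specifications work the same way, except each predicate builds a SQL WHERE fragment instead of evaluating in memory, so composition happens before the query is even sent.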
Testing with @DataJpaTest — The Pattern That Catches Bugs Before Production
@DataJpaTest is Spring Boot's slice test annotation for JPA repositories. It configures an in-memory database (H2 by default), scans for @Entity classes, and creates only the JPA-related beans — no web layer, no service layer, no security.
The key benefit: fast, isolated repository tests that verify your queries actually work against real SQL. I have caught N+1 problems, missing @Modifying annotations, incorrect JPQL joins, and broken native queries in @DataJpaTest that would have been invisible in unit tests with mocked repositories.
For production applications, use @AutoConfigureTestDatabase(replace = Replace.NONE) to run tests against the real database (PostgreSQL, MySQL) instead of H2. H2 is fast but its SQL dialect differs from production databases — queries that work on H2 may fail on PostgreSQL.
package io.thecodeforge.hibernate_vs_jpa;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.orm.jpa.DataJpaTest;
import org.springframework.boot.test.autoconfigure.orm.jpa.TestEntityManager;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import java.util.Optional;
import static org.assertj.core.api.Assertions.assertThat;

@DataJpaTest
// @AutoConfigureTestDatabase(replace = Replace.NONE) // use real DB in CI
public class UserRepositoryTest {

    @Autowired
    private TestEntityManager entityManager;

    @Autowired
    private UserRepository userRepository;

    @Test
    void findByEmail_returnsUser_whenExists() {
        entityManager.persistAndFlush(new User("jane@example.com", "Jane"));

        Optional<User> found = userRepository.findByEmail("jane@example.com");

        assertThat(found).isPresent();
        assertThat(found.get().getName()).isEqualTo("Jane");
    }

    @Test
    void findByStatus_returnsPage_withCorrectTotalCount() {
        for (int i = 0; i < 25; i++) {
            entityManager.persistAndFlush(new User("user" + i + "@test.com", "User " + i));
        }

        Page<User> page = userRepository.findByStatus(UserStatus.ACTIVE, PageRequest.of(0, 10));

        assertThat(page.getContent()).hasSize(10);
        assertThat(page.getTotalElements()).isEqualTo(25);
        assertThat(page.getTotalPages()).isEqualTo(3);
    }

    @Test
    void deactivateInactiveUsers_updatesCorrectRows() {
        User active = entityManager.persistAndFlush(new User("active@test.com", "Active"));
        User inactive = entityManager.persistAndFlush(new User("inactive@test.com", "Inactive"));

        int updated = userRepository.deactivateInactiveUsers(
            UserStatus.INACTIVE, java.time.LocalDate.now().minusDays(90)
        );

        assertThat(updated).isGreaterThanOrEqualTo(0);
    }
}

// Helper: create User with constructor for cleaner tests
// public User(String email, String name) {
//     this.email = email;
//     this.name = name;
//     this.status = UserStatus.ACTIVE;
// }
findByStatus: 3 assertions pass — page size 10, total 25, totalPages 3.
deactivateInactiveUsers: 1 assertion pass — update count is non-negative.
All tests run against H2 in-memory database (or real DB with Replace.NONE).
TestEntityManager.persistAndFlush() saves and immediately flushes to the database, making the data visible to repository methods in the same test. Regular entityManager.persist() without flush may not be visible if Hibernate defers the INSERT. Always use persistAndFlush() in test setup to avoid flaky tests caused by deferred flushing.
When to Use JPA Interfaces and When to Use Hibernate-Specific APIs
The default answer is: always use JPA and Spring Data JPA. Use Hibernate-specific APIs only when you have a specific requirement that JPA cannot meet.
Use JPA standard APIs for: all CRUD operations (Spring Data repositories), standard JPQL queries, entity relationships, cascading, standard caching annotations, pagination, entity graphs. These cover 95% of real-world use cases.
Reach for Hibernate-specific APIs when: you need batch inserts/updates at scale (setJdbcBatchSize, flush/clear loops), you need the second-level cache with a specific region factory (Ehcache, Infinispan), you need Hibernate interceptors or event listeners for cross-cutting concerns (auditing, soft delete), you need StatelessSession for high-throughput bulk operations, you need @Formula for computed columns, you need @BatchSize for N+1 mitigation, or you need @DynamicUpdate for partial updates on wide tables.
Switching JPA implementations is rare in practice, but using JPA interfaces keeps your code clean and your team's cognitive load lower. When you do use Hibernate-specific APIs, isolate them in dedicated service methods so the rest of your codebase remains portable.
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;

public class HibernateSpecificFeatures {

    // @Formula — computed column without a database view
    @Entity
    public static class UserWithStats {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String name;

        @org.hibernate.annotations.Formula("(SELECT COUNT(*) FROM orders o WHERE o.user_id = id)")
        private int orderCount;

        // @DynamicUpdate — only includes changed columns in UPDATE
        // Useful for wide tables where you typically update 1-2 columns
        // but the entity has 30+ columns
    }
}
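@BatchSize, listed above as an N+1 mitigation, deserves its own sketch: instead of one SELECT per lazy collection, Hibernate loads several collections in a single IN-list query. The Author/Book mapping below is hypothetical, purely to illustrate the annotation:

```java
package io.thecodeforge.hibernate_vs_jpa;

import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

// Hypothetical mapping to illustrate @BatchSize
@Entity
public class Author {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    // Without @BatchSize: iterating 100 authors and touching each lazy
    // collection fires 100 SELECTs for books (the N+1 pattern).
    // With @BatchSize(size = 25): Hibernate fires 4 SELECTs, each with
    // WHERE author_id IN (?, ?, ..., ?) covering 25 authors at a time.
    @OneToMany(mappedBy = "author", fetch = FetchType.LAZY)
    @org.hibernate.annotations.BatchSize(size = 25)
    private List<Book> books = new ArrayList<>();
}

@Entity
class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String title;

    @ManyToOne(fetch = FetchType.LAZY)
    private Author author;
}
```

@BatchSize does not eliminate the extra queries the way JOIN FETCH does; it amortises them, which is often the right trade-off when the association is only sometimes accessed.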
🎯 Key Takeaways
- JPA is a specification (interfaces + rules). Hibernate is the implementation. Spring Boot uses Hibernate as its default JPA provider. Code against JPA interfaces by default; reach for Hibernate-specific APIs only when JPA cannot meet a concrete requirement.
- The entity lifecycle has four states: Transient, Managed, Detached, Removed. Understanding these states explains why persist() does not immediately hit the database, why merge() returns a different object, and what detached entity errors mean.
- Dirty checking is how Hibernate detects changes in managed entities. @Transactional(readOnly=true) skips dirty checking entirely, improving performance for read-heavy operations. Set it as the default on your service layer.
- The N+1 query problem is the most common Hibernate performance issue. Fix it with JOIN FETCH, @EntityGraph, or @BatchSize — not FetchType.EAGER. Enable SQL logging in development and count queries on every page.
- GenerationType.IDENTITY disables JDBC batch inserts because the ID is not available until after the INSERT. Use SEQUENCE with allocationSize=25-50 for batch insert support on PostgreSQL and Oracle.
- Always set @ManyToOne and @OneToOne to FetchType.LAZY. JPA defaults to EAGER for these, which causes over-fetching in list queries. Never set @OneToMany or @ManyToMany to EAGER.
- Use @Version for optimistic locking on any entity that can be concurrently modified. Without it, concurrent updates silently overwrite each other — the second save discards the first user's changes.
- Set spring.jpa.open-in-view=false in production. The default (true) masks N+1 problems by keeping the EntityManager open for the full HTTP request and keeps database connections open longer than necessary.
- Hibernate 6 (Spring Boot 3) changed GenerationType.AUTO default from TABLE to SEQUENCE, moved packages from javax to jakarta, and made HQL stricter. Always specify ID strategy explicitly and test thoroughly when upgrading.
- Soft deletes (@SQLDelete + @SQLRestriction) preserve data for audit and recovery. Use @SoftDelete in Hibernate 6 for a cleaner implementation. Remember that unique constraints need partial indexes on soft-deleted entities.
- Use @DataJpaTest for repository tests. Use TestEntityManager.persistAndFlush() for test data setup. Use Replace.NONE to test against the real database in CI instead of H2.
- Connection pool size × application instances = total database connections. Leave 20% headroom on the database server. Use connection poolers (PgBouncer, ProxySQL) if you need many app instances.
Interview Questions on This Topic
- What is the difference between JPA and Hibernate?
- What is the N+1 query problem and how do you fix it?
- What is the difference between Session and EntityManager?
- What does FetchType.LAZY vs FetchType.EAGER mean?
- What is the purpose of @Transactional(readOnly=true)?
- What are the four entity lifecycle states in JPA?
- What is the difference between merge() and persist()?
- Why does GenerationType.IDENTITY disable batch inserts?
- What is @Version and why is it important?
- What is the difference between the first-level and second-level cache?
Frequently Asked Questions
What is the difference between Hibernate and JPA?
JPA is a specification that defines how Java ORM should work — interfaces, annotations, and rules. Hibernate is an implementation that follows those rules and adds its own extensions (Session, HQL, second-level cache, batch processing, @Formula, @SoftDelete). When you use Spring Boot with spring-boot-starter-data-jpa, you get Hibernate as the JPA provider automatically.
Is Hibernate the same as JPA?
No. JPA is a standard specification; Hibernate is a specific framework that implements it. Other implementations exist (EclipseLink, OpenJPA, DataNucleus), but Hibernate dominates in practice. You can write JPA-compliant code that works on any provider, or use Hibernate-specific features that tie you to Hibernate.
Do you need Hibernate to use JPA?
You need a JPA provider (an implementation) to use JPA in practice — JPA itself has no runnable code. Hibernate is the most common choice and the default in Spring Boot. EclipseLink is the JPA reference implementation used in Jakarta EE application servers.
What is EntityManager in JPA?
EntityManager is the central JPA API for interacting with the persistence context. It manages the lifecycle of entities: persist (insert), find (select by PK), merge (update detached entity), remove (delete), and createQuery (JPQL). In Spring Boot, Spring injects it automatically and manages its lifecycle per transaction.
What is the first-level cache in Hibernate?
The first-level cache is the persistence context — the EntityManager or Session. Within a single transaction, Hibernate caches every entity it loads by primary key. If you call findById(1L) twice in the same transaction, Hibernate fires the SQL only once and returns the cached instance the second time. This cache is always enabled and scoped to the transaction.
What is the second-level cache in Hibernate?
The second-level cache is an optional, cross-transaction cache shared across Sessions. It stores entity data outside the persistence context using a pluggable cache provider (Ehcache, Infinispan, Caffeine). Use it for read-mostly reference data (countries, currencies, categories). Configure with @Cacheable (JPA) + @Cache (Hibernate) and enable in application.properties with hibernate.cache.use_second_level_cache=true.
How do you prevent the N+1 query problem in Hibernate?
Four approaches: (1) JOIN FETCH in JPQL — forces an eager join for a specific query. (2) @EntityGraph — declares fetch paths declaratively on repository methods. (3) @BatchSize on the association — loads lazy collections in batches instead of one at a time. (4) Spring Data projections — fetch only the fields you need without loading associations. Enable SQL logging in development to detect N+1 problems early.
What is optimistic locking in JPA?
Optimistic locking uses a @Version field to detect concurrent modifications without locking database rows. When an entity is updated, Hibernate checks that the version in the database matches the version in the managed entity. If another transaction has already updated the entity (incrementing the version), Hibernate throws OptimisticLockException and rolls back the transaction. This prevents lost updates while maintaining high concurrency.
What is the difference between CascadeType.ALL and orphanRemoval?
CascadeType.ALL propagates all operations (persist, merge, remove, detach, refresh) from parent to children. orphanRemoval is separate — when true, removing a child from the parent's collection causes Hibernate to DELETE the child from the database. Without orphanRemoval, the child is unlinked (foreign key set to null) but the row remains. Use orphanRemoval for true composition where the child cannot exist without the parent.
What changed in Hibernate 6 with Spring Boot 3?
Key changes: (1) Package namespace moved from javax.persistence to jakarta.persistence. (2) GenerationType.AUTO now defaults to SEQUENCE instead of TABLE. (3) HQL became stricter — implicit joins may throw syntax errors. (4) Second-level cache moved from Ehcache direct integration to JCache (JSR-107). (5) Dialect auto-detection from JDBC URL improved. Always specify ID generation strategy explicitly and test thoroughly when upgrading.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.