Spring Boot with MySQL and JPA: The Definitive Persistence Guide
- Spring Boot with MySQL and JPA maps Java objects to MySQL tables via annotations — Hibernate generates SQL, Spring Data generates CRUD repository implementations, and Spring Boot auto-configures HikariCP. What took 300 lines of JDBC boilerplate now takes 30 lines of entity class and a repository interface.
- Spring Boot 3.2.x uses the jakarta.persistence namespace — if you are migrating from 2.x, update every @Entity, @Column, @Transactional, and @Id import from javax to jakarta. This is a breaking change with no behavioral difference.
- Spring Data JPA repositories eliminate DAO boilerplate — method-name-derived queries (findByNameContainingIgnoreCase) are parsed and validated at startup, not at runtime. A startup failure means your method name is wrong — a runtime failure means your logic is wrong. This distinction saves significant debugging time.
- JPA maps Java objects (Entities) to MySQL tables — Hibernate generates SQL so you don't write raw JDBC
- Spring Data JPA Repositories provide CRUD methods from an interface — no implementation class needed
- HikariCP is the default connection pool — misconfigured pool size is the #1 cause of database-related outages
- The N+1 problem is the most common performance killer — one query for parents, N queries for children; fix it with JOIN FETCH or @EntityGraph
- spring.jpa.open-in-view=false is non-negotiable in production — OSIV holds connections open until view rendering completes, not until your SQL finishes
- Use BigDecimal for money, never Double — floating-point precision errors corrupt financial data silently and accumulate across thousands of transactions
- ddl-auto=validate in production, Flyway for migrations — never let Hibernate touch your production schema autonomously
Production Debug Guide

When Spring Boot with MySQL and JPA behaves unexpectedly in production, here is how to go from observable symptom to root cause to resolution. These are the paths I have walked through actual incidents — not hypothetical scenarios.

Connection pool exhaustion — 'HikariPool-1 - Connection is not available, request timed out' errors:

```shell
# Threads waiting on the pool: the leading indicator of exhaustion
curl -s http://localhost:8080/actuator/metrics/hikaricp.connections.pending | jq '.measurements[] | select(.statistic=="VALUE") | .value'

# What MySQL is actually executing right now
mysql -u forge_user -p -e "SELECT command, state, time, info FROM information_schema.processlist WHERE command != 'Sleep' ORDER BY time DESC LIMIT 20;"
```

N+1 query problem — hundreds of SQL queries for a single API request, response time 10x slower than expected:

```shell
# Total SELECT count in the log; linear growth with result size means N+1
grep -c 'Hibernate: select' /var/log/app/application.log

# The most frequently repeated SELECT statements
grep 'Hibernate: select' /var/log/app/application.log | sort | uniq -c | sort -rn | head -10
```

Slow queries — API p99 latency spiked from 50ms to 5 seconds, database CPU elevated:

```shell
# The slowest statements MySQL has logged
mysql -u forge_user -p -e "SELECT query_time, sql_text FROM mysql.slow_log ORDER BY query_time DESC LIMIT 10;"

# Check the execution plan of a suspect query
mysql -u forge_user -p -e "EXPLAIN SELECT * FROM forge_products WHERE category_id = 42 AND status = 'ACTIVE';"
```

OutOfMemoryError during batch processing — heap fills up during bulk inserts or exports:

```shell
# Capture a heap dump for offline analysis
jcmd $(pgrep -f spring-boot) GC.heap_dump /tmp/heapdump.hprof

# Quick check: are entities and Persistence Context internals dominating the heap?
jcmd $(pgrep -f spring-boot) GC.class_histogram | grep -E 'ForgeProduct|EntityEntry|StatefulPersistence' | head -10
```

Production Incident

Duplicate rows after calling save() twice in the same @Transactional block — in some cases the Persistence Context treats the entity as new both times. Check for concurrent requests hitting the same insert path: add a UNIQUE constraint at the database level and handle DataIntegrityViolationException at the service layer rather than assuming the application prevents duplicates.

Connecting a Java application to a database used to be a nightmare of boilerplate JDBC code, manual connection pooling, and error-prone SQL strings embedded in Java classes. Spring Boot with MySQL and JPA eliminates that friction — it is one of the most frequently used combinations in the Java ecosystem for a reason.
By leveraging the Repository pattern and Object-Relational Mapping (ORM), Spring Boot allows you to interact with your MySQL database using plain Java objects. Hibernate, the default JPA provider, handles SQL generation, connection management, transaction boundaries, and schema validation. You focus on the business problem; the framework handles the plumbing.
In this guide we break down exactly what Spring Boot with MySQL and JPA does, why it was designed this way, and how to configure it correctly for production. That means going beyond the getting-started tutorial — we cover HikariCP tuning, the N+1 problem and its fixes, OSIV and why it will eventually cause a connection pool outage if you leave it enabled, and the right way to manage schema changes without letting Hibernate near your production tables.
We target Spring Boot 3.2.x with Jakarta Persistence 3.1 (the post-javax namespace migration) and MySQL 8.0+, ensuring the configuration and code in this guide works on the stack you are actually running in 2026, not the stack from a five-year-old tutorial.
By the end you will have both the conceptual understanding and practical code to build a production-ready persistence layer — one that performs well under load, fails safely when things go wrong, and does not surprise you with silent data corruption six months after launch.
What Is Spring Boot with MySQL and JPA and Why Does It Exist?
Spring Boot with MySQL and JPA exists to solve the 'Impedance Mismatch' — the fundamental structural difference between how Java models data (objects with fields and relationships) and how MySQL stores data (tables with rows and foreign keys). Bridging these two models manually through JDBC is possible but expensive: you write SQL strings, manage ResultSet iteration, map columns to fields, handle null values explicitly, manage connection lifecycle, and implement transactions by hand. Every CRUD operation is dozens of lines of infrastructure code that contributes nothing to the business problem.
JPA — the Java Persistence API, implemented by Hibernate — makes this mapping declarative. You annotate a Java class with @Entity and its fields with @Column, and Hibernate generates the SQL at startup. Spring Data JPA wraps Hibernate with a Repository abstraction that generates CRUD operations from an interface definition with zero implementation code. Spring Boot auto-configures HikariCP (the fastest connection pool in the JVM ecosystem), a Hibernate SessionFactory, and the transaction manager — all from the properties you provide in application.properties.
The result: what took 300 lines of JDBC boilerplate takes 30 lines of annotated entity class and a one-line repository interface. That is the reason this stack is the default choice for Java database persistence in 2026.
With Spring Boot 3.2.x, the namespace has migrated from javax.persistence to jakarta.persistence — all imports in this guide use the jakarta package. If you are migrating from Spring Boot 2.x, this is a breaking change that requires updating every @Entity, @Column, @Transactional, and @Id import in your codebase. The behavior is identical; only the package name changed.
```java
package io.thecodeforge.persistence.model;

import jakarta.persistence.*;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.AllArgsConstructor;
import org.hibernate.annotations.CreationTimestamp;
import org.hibernate.annotations.UpdateTimestamp;
import java.math.BigDecimal;
import java.time.LocalDateTime;

@Entity
@Table(
    name = "forge_products",
    indexes = {
        // Declare indexes here so Hibernate schema validation knows about them.
        // For production, create indexes in Flyway migrations, not via ddl-auto.
        @Index(name = "idx_product_name", columnList = "name"),
        @Index(name = "idx_product_category", columnList = "category_id")
    }
)
@Data
@NoArgsConstructor
@AllArgsConstructor
public class ForgeProduct {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // nullable = false maps to NOT NULL in the schema — Hibernate validates this
    @Column(nullable = false, length = 100)
    private String name;

    // ALWAYS BigDecimal for money — never Double or Float.
    // precision = total digits, scale = digits after decimal point.
    // This maps to DECIMAL(12,2) in MySQL.
    @Column(nullable = false, precision = 12, scale = 2)
    private BigDecimal price;

    @Column(name = "category_id")
    private Long categoryId;

    // @CreationTimestamp — Hibernate sets this on INSERT, never on UPDATE.
    // updatable = false enforces this at the JPA column level too.
    @CreationTimestamp
    @Column(name = "created_at", nullable = false, updatable = false)
    private LocalDateTime createdAt;

    // @UpdateTimestamp — Hibernate updates this on every UPDATE automatically.
    @UpdateTimestamp
    @Column(name = "updated_at", nullable = false)
    private LocalDateTime updatedAt;
}

// --- Repository Interface ---
// No implementation class. Spring Data generates the implementation at startup
// via JDK dynamic proxies. The method names are parsed into JPQL at boot time.
```
```java
package io.thecodeforge.persistence.repository;

import io.thecodeforge.persistence.model.ForgeProduct;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;
import java.math.BigDecimal;
import java.util.List;

@Repository
public interface ProductRepository extends JpaRepository<ForgeProduct, Long> {

    // Method-name-derived query — Spring Data parses this at startup.
    // Generates: SELECT * FROM forge_products WHERE LOWER(name) LIKE LOWER(?)
    List<ForgeProduct> findByNameContainingIgnoreCase(String name);

    // Custom JPQL for anything method names cannot express cleanly.
    // JPQL uses entity class names and field names, not table/column names.
    @Query("SELECT p FROM ForgeProduct p WHERE p.price > :minPrice ORDER BY p.price DESC")
    List<ForgeProduct> findExpensiveProducts(@Param("minPrice") BigDecimal minPrice);

    // Use @Modifying for UPDATE and DELETE — without it, Hibernate treats the
    // query as a SELECT and throws an exception.
    // clearAutomatically = true evicts affected entities from the Persistence
    // Context so subsequent reads reflect the updated values.
    @Modifying(clearAutomatically = true)
    @Query("UPDATE ForgeProduct p SET p.price = p.price * :multiplier WHERE p.categoryId = :categoryId")
    int applyPriceMultiplierByCategory(
            @Param("multiplier") BigDecimal multiplier,
            @Param("categoryId") Long categoryId
    );
}
```
```
INFO HikariPool-1 - Start completed. Pool size: 20, Idle connections: 10
INFO Hibernate: create table forge_products (
    id bigint not null auto_increment,
    name varchar(100) not null,
    price decimal(12,2) not null,
    category_id bigint,
    created_at datetime(6) not null,
    updated_at datetime(6) not null,
    primary key (id)
) engine=InnoDB
INFO Hibernate: create index idx_product_name on forge_products (name)
INFO Hibernate: create index idx_product_category on forge_products (category_id)
```
- @Entity maps a Java class to a MySQL table — field names become column names by default, @Column overrides naming and adds constraints
- JpaRepository gives you save(), findById(), findAll(), deleteById() and pagination for free — Spring Data generates the implementation at startup via reflection
- Method-name-derived queries (findByNameContainingIgnoreCase) are parsed and validated at startup — if the field name does not exist on the entity, the application will not start
- @Query with JPQL operates on entity class names and field names, not table and column names — it is database-agnostic but Hibernate still validates it at startup
- The Persistence Context (first-level cache) tracks managed entities — every entity you load or save within a @Transactional method is tracked, and changes are flushed automatically at commit via dirty checking
- @Modifying is required for UPDATE and DELETE queries — without it, Hibernate refuses to execute the mutation and throws an exception at runtime
Common Mistakes and How to Avoid Them
The two most damaging mistakes with Spring Boot JPA are both invisible during development and only reveal themselves under production load. The first is the N+1 Select Problem — the application makes one query to fetch a list of parent entities and then N additional queries to fetch the associated child entities for each parent. With 10 records in your development database, this executes 11 queries and is unnoticeable. With 500 records in production, it executes 501 queries and the endpoint takes 8 seconds instead of 50 milliseconds.
The second is OSIV (Open Session in View) left at its default value of true. During development, this feels like a feature — you can access lazy-loaded associations anywhere in your code without worrying about transaction boundaries. In production, it means every HTTP request holds a database connection from the moment it enters the DispatcherServlet until the moment the HTTP response bytes are flushed to the client. With a pool of 20 connections and serialization taking 500ms per request, your effective throughput ceiling is 20 / 0.5 = 40 requests per second regardless of how fast your SQL executes.
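To make the N+1 fix concrete, here is a sketch of the two remedies, assuming a hypothetical ForgeOrder entity with a lazy @OneToMany items collection (neither ForgeOrder nor OrderRepository appears elsewhere in this guide):

```java
import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.List;

public interface OrderRepository extends JpaRepository<ForgeOrder, Long> {

    // Fix 1: JOIN FETCH loads orders and their items in a single SQL query.
    // DISTINCT prevents duplicate parent rows produced by the join.
    @Query("SELECT DISTINCT o FROM ForgeOrder o JOIN FETCH o.items WHERE o.status = :status")
    List<ForgeOrder> findWithItemsByStatus(@Param("status") String status);

    // Fix 2: @EntityGraph keeps the derived query but tells Hibernate to
    // fetch the items association eagerly for this method only.
    @EntityGraph(attributePaths = "items")
    List<ForgeOrder> findByStatus(String status);
}
```

Either approach turns 1 + N queries into a single query; JOIN FETCH gives you explicit control over the JPQL, while @EntityGraph keeps the convenience of derived method names.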
Below is the full production application.properties configuration. Every setting is here for a reason — the comments explain the non-obvious ones.
```properties
# =========================================================
# MySQL Connection — all sensitive values from environment
# variables, never hardcoded in application.properties
# =========================================================
spring.datasource.url=jdbc:mysql://${DB_HOST:localhost}:3306/${DB_NAME:forge_db}?useSSL=true&requireSSL=true&serverTimezone=UTC&allowPublicKeyRetrieval=false&rewriteBatchedStatements=true
spring.datasource.username=${DB_USER:forge_user}
spring.datasource.password=${DB_PASS:}
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver

# =========================================================
# HikariCP Connection Pool
# Formula: maximum-pool-size = (core_count * 2) + effective_spindle_count
# For a 4-core server with SSD storage: (4 * 2) + 1 = 9, round up to 10.
# Going above 20-30 on a single MySQL instance rarely helps and can cause
# mutex contention on MySQL's internal lock manager.
# =========================================================
spring.datasource.hikari.pool-name=ForgeHikariPool
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.minimum-idle=10
# How long (ms) a caller waits for a connection before failing with an exception.
# 20 seconds is generous for most workloads — consider 5-10s for latency-sensitive APIs.
spring.datasource.hikari.connection-timeout=20000
# How long (ms) an idle connection stays in the pool before being closed.
# 5 minutes is a reasonable default — shorter for serverless/bursty workloads.
spring.datasource.hikari.idle-timeout=300000
# Maximum lifetime of a connection in the pool regardless of idle state.
# Must be shorter than MySQL's wait_timeout (default 8 hours, commonly set to
# 10-30 minutes on managed cloud instances). Set to 25 minutes to be safe.
spring.datasource.hikari.max-lifetime=1500000
# Keepalive: validates idle connections so MySQL does not silently drop them.
# With a JDBC4-compliant driver (MySQL Connector/J 8+), HikariCP uses
# Connection.isValid() and recommends omitting connection-test-query;
# keep it only if isValid() is unreliable in your environment.
spring.datasource.hikari.connection-test-query=SELECT 1
spring.datasource.hikari.keepalive-time=60000

# =========================================================
# JPA / Hibernate
# =========================================================
# validate: check entities match schema at startup; fail fast if they do not.
# NEVER use create, create-drop, or update in production.
# Schema changes belong in Flyway migrations, not in Hibernate's DDL generation.
spring.jpa.hibernate.ddl-auto=validate
# false in production — show-sql floods your log with SQL and has measurable
# throughput impact under load. Enable temporarily for N+1 debugging.
spring.jpa.show-sql=false
spring.jpa.properties.hibernate.format_sql=false
# Hibernate 6 (Spring Boot 3.x) auto-detects the dialect from JDBC metadata;
# version-specific dialects like MySQL8Dialect are deprecated. If you set it
# explicitly, use the generic MySQLDialect.
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQLDialect

# =========================================================
# OSIV — Open Session in View
# Default is true. This is wrong for production.
# Set to false: database connections are released when @Transactional ends,
# not when the HTTP response is written. This is the single highest-impact
# JPA configuration change for connection pool efficiency.
# =========================================================
spring.jpa.open-in-view=false

# =========================================================
# Hibernate Batch Processing
# Enables JDBC-level batching: multiple INSERT/UPDATE statements are sent
# to MySQL in a single network round-trip instead of one per statement.
# Requires rewriteBatchedStatements=true in the JDBC URL (already set above).
# =========================================================
spring.jpa.properties.hibernate.jdbc.batch_size=500
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
spring.jpa.properties.hibernate.generate_statistics=false
```
```
INFO ForgeHikariPool - Start completed. Pool size: 20, Idle connections: 10
INFO HibernateJpaVendorAdapter - Hibernate validation mode: validate
INFO SchemaValidator - Schema validation successful — all entities match the database schema
WARN spring.jpa.open-in-view is disabled — LazyInitializationException will be thrown if lazy associations are accessed outside @Transactional
```
| Aspect | Traditional JDBC | Spring Boot JPA |
|---|---|---|
| Boilerplate | High — manual Connection acquisition from pool, PreparedStatement creation, ResultSet iteration, and field-by-field mapping for every query. A simple findById is 20+ lines. | Zero — @Entity annotation and JpaRepository interface. The save(), findById(), findAll(), deleteById() implementations are generated at startup. A complete CRUD repository is 5 lines. |
| SQL Writing | Required for every operation including the simplest CRUD. You write, maintain, and test every SQL string — including handling dialect differences between MySQL versions. | Optional for standard CRUD — methods like save() and findById() are dynamically generated. Custom queries use JPQL (database-agnostic) or native SQL for MySQL-specific features. |
| Database Portability | Low — SQL strings are often written for a specific MySQL version or syntax. Migrating to PostgreSQL or another engine requires rewriting queries. | High — Hibernate dialects translate JPQL to the correct SQL for each database engine. Switching from MySQL to PostgreSQL in tests is a configuration change, not a code change. |
| Transaction Management | Manual — conn.setAutoCommit(false), conn.commit(), and conn.rollback() on every method that requires transactional semantics. Easy to forget, easy to leak. | Declarative — add @Transactional to the service method and Spring handles commit on success and rollback on any RuntimeException. Transaction boundaries are visible and auditable. |
| Performance Visibility | High — you write every SQL statement, so you know exactly what runs, when it runs, and what parameters it uses. No hidden queries, no surprise N+1 patterns. | Lower by default — Hibernate generates SQL behind the scenes. N+1 queries, unnecessary full-column SELECTs, and redundant dirty-check flushes are invisible without explicit SQL logging. |
| Connection Pooling | Manual setup required — you choose HikariCP, DBCP2, or C3P0, add the dependency, and configure it explicitly. Nothing is auto-configured. | Auto-configured — HikariCP is on the classpath via spring-boot-starter-data-jpa and configured from spring.datasource.hikari.* properties. Zero code required for basic pooling. |
| Batch Operations | Full control — PreparedStatement.addBatch() and executeBatch() give you explicit JDBC-level batching. Best raw throughput for bulk inserts and updates. | Requires explicit configuration — hibernate.jdbc.batch_size, hibernate.order_inserts, and rewriteBatchedStatements=true in the JDBC URL. Also requires entityManager.flush()/clear() to prevent Persistence Context heap exhaustion. |
| Best for | High-throughput bulk operations, reporting queries returning millions of rows, MySQL-specific features (window functions, JSON operations), and teams that need deterministic SQL output. | Transactional CRUD operations, REST API backends, applications where development velocity matters more than maximum query throughput, and teams that want schema portability. |
🎯 Key Takeaways
- Always use BigDecimal for monetary and financial fields, mapped to MySQL DECIMAL(precision, scale). Float and Double use IEEE 754 binary floating-point which cannot represent decimal fractions exactly — the resulting rounding errors accumulate silently across thousands of transactions.
- The N+1 problem is invisible in development and catastrophic in production. Enable spring.jpa.show-sql=true during development, count SELECT statements per request, and fix any linear growth in query count with JOIN FETCH or @EntityGraph before the code ships.
- Set spring.jpa.open-in-view=false unconditionally in production. OSIV holds database connections for the entire HTTP request lifecycle — including serialization time — which reduces your effective connection pool throughput by orders of magnitude under real load.
- Set spring.jpa.hibernate.ddl-auto=validate in production and use Flyway for schema migrations. Hibernate should never autonomously modify your production schema — validation gives you the startup-time schema mismatch detection without the risk of data loss.
- Monitor HikariCP metrics in production: hikaricp.connections.pending is the leading indicator of pool exhaustion and fires minutes before requests start timing out. Alert on any non-zero value sustained for 60 seconds.
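The BigDecimal takeaway is easy to verify in plain Java. This standalone sketch (the class name MoneyDemo is illustrative, not from the application code) accumulates a 10-cent fee a thousand times both ways:

```java
import java.math.BigDecimal;

public class MoneyDemo {

    // double accumulates binary rounding error: 0.10 has no exact
    // base-2 representation, so each addition drifts slightly.
    static double sumDoubles() {
        double total = 0.0;
        for (int i = 0; i < 1000; i++) {
            total += 0.10; // a 10-cent fee, 1000 transactions
        }
        return total; // close to, but not exactly, 100.0
    }

    // BigDecimal with the String constructor stores the exact decimal value.
    static BigDecimal sumDecimals() {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal fee = new BigDecimal("0.10"); // never new BigDecimal(0.10)
        for (int i = 0; i < 1000; i++) {
            total = total.add(fee);
        }
        return total; // exactly 100.00
    }

    public static void main(String[] args) {
        System.out.println(sumDoubles() == 100.0); // false: silent drift
        System.out.println(sumDecimals());          // 100.00
    }
}
```

In a ledger, that drift is invisible until a reconciliation job fails months later, which is why the entity maps price to DECIMAL(12,2).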
Interview Questions on This Topic
- Q: What is the internal difference between FetchType.LAZY and FetchType.EAGER? In what scenario would LAZY still cause an exception? (Mid-level)
- Q: Describe three different ways to solve the N+1 Select Problem in Spring Data JPA. When would you choose each? (Mid-level)
- Q: How does Hibernate's First Level Cache work, and how does it differ from a Second Level Cache like Hazelcast or Ehcache? (Senior)
- Q: What is LazyInitializationException and why does it occur after a transaction has closed? (Mid-level)
- Q: Explain Hibernate's Dirty Checking mechanism. How does Hibernate know which fields to include in an UPDATE statement? (Senior)
- Q: How do you configure HikariCP for optimal performance, and what metrics should you monitor in production? (Senior)
Frequently Asked Questions
When should I use JPA vs JdbcTemplate?
Use JPA for transactional CRUD operations where you benefit from: dirty checking (no explicit save() for managed entities), relationship management (@OneToMany, @ManyToOne with cascading), schema validation at startup, and database portability via Hibernate dialects. JPA is the right tool for the majority of Spring Boot application persistence — REST API backends, user data management, order processing, and any domain where you model relationships between entities.
Use JdbcTemplate or NamedParameterJdbcTemplate for: high-throughput bulk read operations returning millions of rows (reporting, analytics, data exports), bulk inserts where raw JDBC batching outperforms Hibernate batch configuration, MySQL-specific features where native SQL is required and JPQL cannot express the query, and cases where you need deterministic SQL output without the Persistence Context overhead.
In practice, many production applications use both. JPA handles the transactional write path and standard read queries. JdbcTemplate handles reporting endpoints and bulk processing. Spring Boot auto-configures a JdbcTemplate bean using the same HikariCP datasource — there is no additional configuration required to use both in the same application.
What is the difference between JPQL and native SQL in Spring Data JPA?
JPQL (Java Persistence Query Language) operates on entity class names and field names — not table and column names. SELECT p FROM ForgeProduct p WHERE p.price > :min is valid JPQL. Hibernate translates JPQL to the correct SQL dialect for the configured database at startup — the query is database-agnostic. Hibernate also validates JPQL at startup: if you reference a field that does not exist on the entity, the application fails to start. This is a significant advantage during development.
Native SQL in Spring Data JPA uses @Query(nativeQuery = true) and operates on table names and column names as they exist in MySQL. SELECT * FROM forge_products WHERE price > :min is native SQL. It is MySQL-specific and will not work on PostgreSQL without modification. It is not validated by Hibernate at startup — a typo in a column name produces a runtime exception, not a startup failure.
When to use native SQL: FULLTEXT search (MATCH(name) AGAINST(:term IN BOOLEAN MODE)), MySQL JSON functions (JSON_EXTRACT, JSON_ARRAYAGG), window functions (ROW_NUMBER() OVER, RANK() OVER), and performance-critical queries where you need to control the execution plan and JPQL generates suboptimal SQL. Always run the native query directly in MySQL Workbench first and verify the EXPLAIN plan before wiring it into the repository.
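As a sketch, a FULLTEXT native query would be declared like this in a repository such as ProductRepository (hypothetical method; it assumes a FULLTEXT index on name, which the earlier schema does not create):

```java
// Requires, e.g.: CREATE FULLTEXT INDEX idx_product_name_ft ON forge_products (name);
@Query(
    value = "SELECT * FROM forge_products WHERE MATCH(name) AGAINST(:term IN BOOLEAN MODE)",
    nativeQuery = true
)
List<ForgeProduct> fullTextSearch(@Param("term") String term);
```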
How do I handle database schema migrations in production?
Use Flyway for versioned, auditable, SQL-based schema migrations. Add the flyway-core dependency and place SQL migration files in src/main/resources/db/migration following the naming convention V{version}__{description}.sql — for example, V1__create_forge_products.sql, V2__add_category_index.sql, V3__add_updated_at_column.sql.
Spring Boot auto-configures Flyway when flyway-core is on the classpath. On startup, Flyway checks the flyway_schema_history table in MySQL to determine which migrations have already run, then executes any pending migrations in version order — before Hibernate schema validation runs. This ordering matters: Flyway applies the schema change, then Hibernate validates that the entity annotations match the updated schema.
Key practices: (1) migration files are append-only and immutable once merged to the main branch — never modify a migration that has run in any environment, (2) every schema change goes through a migration file, never through ddl-auto=update, (3) test migrations locally against a copy of production schema before deploying, (4) for large table alterations on production MySQL, use pt-online-schema-change (Percona Toolkit) or gh-ost to apply the change without table locks.
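For example, the first two migrations for the forge_products table might look like this (the DDL mirrors the schema shown earlier; file contents are illustrative):

```sql
-- src/main/resources/db/migration/V1__create_forge_products.sql
CREATE TABLE forge_products (
    id          BIGINT        NOT NULL AUTO_INCREMENT,
    name        VARCHAR(100)  NOT NULL,
    price       DECIMAL(12,2) NOT NULL,
    category_id BIGINT,
    created_at  DATETIME(6)   NOT NULL,
    updated_at  DATETIME(6)   NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE INDEX idx_product_name ON forge_products (name);

-- src/main/resources/db/migration/V2__add_category_index.sql
CREATE INDEX idx_product_category ON forge_products (category_id);
```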
What is the difference between @Transactional(readOnly=true) and @Transactional()?
@Transactional(readOnly = true) tells Spring and Hibernate that the method performs only read operations. Three performance benefits: (1) Hibernate skips dirty checking entirely at flush time — no snapshot comparison, no UPDATE generation, measurably faster for methods that load many entities, (2) the database connection may be set to read-only mode (MySQL: SET SESSION TRANSACTION READ ONLY), which allows the MySQL optimizer to skip undo log overhead for read-only transactions, (3) if you have configured read-write splitting (a primary MySQL for writes, read replicas for reads), Spring's @Transactional(readOnly = true) is the signal that routes the query to a read replica.
@Transactional() (default) allows both reads and writes — Hibernate performs dirty checking at flush time, the transaction can contain INSERT/UPDATE/DELETE statements, and any RuntimeException triggers a rollback.
Annotate every service method @Transactional(readOnly = true) by default and override with @Transactional() (or no annotation) on methods that perform writes. This is a low-cost, high-value optimization that requires no architectural changes.
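One way to apply that pattern, using the ProductRepository from earlier (ProductService itself is a sketch):

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import java.util.List;

@Service
@Transactional(readOnly = true) // class-level default: every method runs read-only
public class ProductService {

    private final ProductRepository repository;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    public List<ForgeProduct> search(String name) {
        return repository.findByNameContainingIgnoreCase(name); // read-only transaction
    }

    @Transactional // method-level override: read-write, dirty checking enabled
    public ForgeProduct create(ForgeProduct product) {
        return repository.save(product);
    }
}
```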
How do I batch insert 100,000 entities without running out of memory?
JPA's Persistence Context caches every entity loaded or persisted in memory as a managed object plus a snapshot copy for dirty checking. Inserting 100,000 entities in a single transaction without clearing the Persistence Context means 200,000 object references (100,000 entities + 100,000 snapshots) live in heap simultaneously — OutOfMemoryError is the predictable outcome.
The solution requires four steps working together: (1) set spring.jpa.properties.hibernate.jdbc.batch_size=500 to group INSERT statements into batches of 500 before sending to MySQL, (2) set spring.jpa.properties.hibernate.order_inserts=true to ensure inserts of the same entity type are grouped together — enabling the batch, (3) add rewriteBatchedStatements=true to the JDBC URL — this is the MySQL JDBC driver-level setting that rewrites individual INSERT statements into a multi-row INSERT, (4) call entityManager.flush() followed by entityManager.clear() every 500 entities in your batch loop. flush() sends the current batch to MySQL, clear() evicts all managed entities from the Persistence Context, making them eligible for garbage collection.
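Put together, the flush/clear loop from step (4) might look like this (a sketch; BulkImportService is a hypothetical class, ForgeProduct is the entity from earlier):

```java
import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import java.util.List;

@Service
public class BulkImportService {

    // Must match spring.jpa.properties.hibernate.jdbc.batch_size
    private static final int BATCH_SIZE = 500;

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public void importProducts(List<ForgeProduct> products) {
        for (int i = 0; i < products.size(); i++) {
            entityManager.persist(products.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                entityManager.flush();  // send the current batch of INSERTs to MySQL
                entityManager.clear();  // evict managed entities so the heap stays flat
            }
        }
        // Any remainder is flushed automatically at commit.
    }
}
```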
If memory pressure persists after these changes, or if maximum throughput is required, consider bypassing JPA entirely for the bulk insert and using JdbcTemplate's batchUpdate() method — it has zero Persistence Context overhead and is typically 3–5x faster than Hibernate batch inserts for large volumes.
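A sketch of that alternative (ProductBulkDao is hypothetical; note that raw SQL bypasses Hibernate's @CreationTimestamp and @UpdateTimestamp, so the timestamps are set in SQL):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import java.util.List;

@Repository
public class ProductBulkDao {

    private final JdbcTemplate jdbcTemplate;

    public ProductBulkDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public void bulkInsert(List<ForgeProduct> products) {
        jdbcTemplate.batchUpdate(
            "INSERT INTO forge_products (name, price, category_id, created_at, updated_at) "
                + "VALUES (?, ?, ?, NOW(6), NOW(6))",
            products,
            500, // JDBC batch size; pairs with rewriteBatchedStatements=true in the URL
            (ps, product) -> {
                ps.setString(1, product.getName());
                ps.setBigDecimal(2, product.getPrice());
                ps.setObject(3, product.getCategoryId());
            });
    }
}
```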
How do I monitor JPA and MySQL performance in production?
Four layers of monitoring, each catching different classes of problems:
(1) HikariCP metrics via Actuator and Prometheus — add spring-boot-starter-actuator and micrometer-registry-prometheus. Key metrics: hikaricp.connections.active (current usage), hikaricp.connections.pending (leading indicator of pool exhaustion — alert at > 0 for 60 seconds), hikaricp.connections.timeout (count of timed-out requests — alert at any non-zero value), hikaricp.connections.acquire (p99 acquisition latency — alert above 100ms).
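As an illustration, the pending-connections alert could be expressed as a Prometheus rule (metric and tag names as exported by Micrometer's Prometheus registry; thresholds from the guidance above):

```yaml
groups:
  - name: hikaricp
    rules:
      - alert: HikariConnectionsPending
        # hikaricp.connections.pending is exported as hikaricp_connections_pending
        expr: hikaricp_connections_pending{pool="ForgeHikariPool"} > 0
        for: 60s
        labels:
          severity: warning
        annotations:
          summary: "Threads are waiting for a DB connection; pool exhaustion is imminent"
```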
(2) Hibernate statistics — enable spring.jpa.properties.hibernate.generate_statistics=true in non-production environments only (statistics collection has CPU overhead). This logs query counts, entity load counts, cache hit ratios, and flush counts per session — invaluable for identifying N+1 patterns and unnecessary dirty-check flushes during load testing.
(3) Datasource proxy or P6Spy — add datasource-proxy-spring-boot-starter to log every SQL statement with execution time, parameter values, and the calling method. More precise than spring.jpa.show-sql=true because it shows actual bound parameter values and millisecond execution times. Use this during performance testing, not in production (log volume is significant).
(4) MySQL slow query log — SET GLOBAL slow_query_log = 'ON'; SET GLOBAL long_query_time = 1; Queries taking longer than 1 second are logged to the slow query log with execution plan details. Combine with pt-query-digest (Percona Toolkit) to aggregate slow queries and identify the most expensive patterns across millions of log entries.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.