JPA and Hibernate effective query

Opens question about keep clean the layers separation and balancing the querying strategy in relation databases

TIP 1: Entity or DTO

Entity and DTO (Data Transfer Object) serve different purposes, and understanding their distinctions can help you write more effective queries and design a well-structured application. Here’s a breakdown:

Entity

  • An Entity is a Java object representing a database table in JPA (Java Persistence API) and Hibernate.
  • It’s typically annotated with @Entity and is directly mapped to the database table, where each instance of an Entity represents a row in the table.
  • Purpose: Entities are used for data persistence and to represent the structure of database tables, with attributes corresponding to columns in the table.
  • Lifespan: Entities are managed by Hibernate and exist within a persistence context (session). Changes to entities are synchronized with the database when the session is flushed.
  • Annotations: Entities often have Hibernate-specific annotations (e.g., @Table, @Column, @Id) to define mappings, constraints, and relationships.
  • Overhead: Since entities are tied to the persistence context and may have cascading and lazy-loading settings, working with them in detached states or over networks can sometimes be inefficient.

DTO (Data Transfer Object)

  • A DTO is a plain Java object created specifically for transferring data between layers (often between the service and presentation layers).
  • DTOs are not tied to the persistence context and don’t have annotations for ORM (Object-Relational Mapping) like @Entity. Instead, they are simple objects with getters and setters.
  • Purpose: DTOs are used to reduce the data sent over the network by holding only the necessary information, which is especially useful for optimizing API responses.
  • Lifespan: DTOs are usually short-lived and created as needed, based on specific requests.
  • Annotations: DTOs don’t usually contain JPA annotations; instead, they may have annotations related to serialization or validation if needed.
  • Overhead: Since DTOs aren’t tied to the persistence context, they can reduce performance overhead, as they avoid unnecessary data access and session management issues.

Practical Usage and Effective Queries

  • When to Use Entities: Use entities in the data access layer when interacting directly with the database for CRUD operations. You can leverage the Hibernate session context for transaction management and lazy loading.
  • When to Use DTOs: Use DTOs in the service or controller layer to optimize performance by transferring only necessary data. They’re helpful in cases where you need to aggregate or transform data from multiple entities without loading unnecessary fields.

Example Scenario

Imagine you have an Order entity with fields like orderId, customer, product, date, and status. If you need only the orderId and status for a summary view, using an OrderSummaryDTO with these two fields would be more efficient than returning the entire Order entity.

Writing Effective Queries

When writing Hibernate queries to retrieve data for a DTO, you can use:

  • JPQL or HQL: Query the exact fields for the DTO using SELECT new ..., which allows you to create DTOs directly in the query:
List<OrderSummaryDTO> summaries = entityManager.createQuery(
    "SELECT new com.example.OrderSummaryDTO(o.id, o.status) FROM Order o", 
    OrderSummaryDTO.class)
    .getResultList();

Native SQL Queries: When working with complex queries or aggregations, native SQL queries can map results to a DTO as well. Using DTOs in this way helps reduce data transfer overhead and keeps your application performant, especially in high-load scenarios or with complex data structures.

Tip 2 : FetchType.EAGER or FetchType.LAZY

In Hibernate, the FetchType defines how and when associated entities are loaded from the database. There are two types of fetching strategies:

1. Eager Fetching (FetchType.EAGER)

  • Description: With FetchType.EAGER, the associated entity or collection is loaded immediately along with the main entity in a single query, regardless of whether it’s needed right away.
  • Usage: Eager fetching can be useful when you know that you’ll always need the associated entity or collection as part of your processing.
  • Trade-Offs:
  • Pros: Ensures that the data is loaded in a single query, avoiding lazy-loading issues and possible LazyInitializationException.
  • Cons: Can cause performance issues because it always fetches the associated entities, even if they’re not used. For example, if you fetch a list of parent entities with @OneToMany or @ManyToMany associations set to EAGER, it could lead to the N+1 problem or large result sets, slowing down performance.
  • Example:
@ManyToOne(fetch = FetchType.EAGER)
private Customer customer;

2. Lazy Fetching (FetchType.LAZY)

  • Description: With FetchType.LAZY, the associated entity or collection is not loaded immediately; instead, it is fetched on demand when it’s accessed for the first time. Hibernate uses proxies to load the association only when it’s needed.
  • Usage: Lazy fetching is the default strategy for @OneToMany and @ManyToMany relationships. It’s recommended when you have large or complex associations and don’t always need to load them.
  • Trade-Offs:
  • Pros: Improves performance by loading associated entities only when required. Reduces the initial query load, especially for large or deeply nested relationships.
  • Cons: May lead to LazyInitializationException if the associated entity is accessed outside the Hibernate session context (e.g., in a detached or closed session). This often occurs in web applications where lazy-loaded fields are accessed after the session is closed.
  • Example:
@OneToMany(fetch = FetchType.LAZY, mappedBy = "order")
private List<OrderItem> orderItems;

Choosing Between EAGER and LAZY

When choosing between EAGER and LAZY, consider the following best practices:

  • Default to LAZY: In most cases, FetchType.LAZY is preferred, as it improves performance by deferring data loading until it’s necessary. This approach also minimizes unnecessary data retrieval.
  • Use EAGER Sparingly: Use FetchType.EAGER only when you are certain the associated entity or collection will always be needed along with the main entity.
  • Handling Lazy Loading in Web Applications: Since lazy loading can cause issues in web applications, consider using DTOs or fetch joins in queries (e.g., JPQL JOIN FETCH) to pre-load the necessary associations in a controlled way, especially if they are to be accessed outside the Hibernate session.

Example: JPQL with Fetch Join

If you prefer lazy loading but want specific associations eagerly loaded for certain queries, use a fetch join in JPQL:

List<Order> orders = entityManager.createQuery(
    "SELECT o FROM Order o JOIN FETCH o.orderItems WHERE o.customer.id = :customerId", 
    Order.class)
    .setParameter("customerId", customerId)
    .getResultList();

By defaulting to FetchType.LAZY and selectively using fetch joins or custom DTOs, you gain more control over data loading and application performance.

TIP 3: N+1 selection issue

The N+1 selection issue is a common performance problem in applications using ORMs like Hibernate. It arises when the application makes one initial query to fetch a list of entities (the “1” query), and then, for each of these entities, it makes an additional query (the “N” queries) to load related entities due to lazy loading. This can lead to a huge number of database queries, which significantly slows down performance.

Example Scenario of the N+1 Issue

Imagine you have two entities: Author and Book, where an Author can have multiple Books.

@Entity
public class Author {
    @Id
    private Long id;
    private String name;

    @OneToMany(mappedBy = "author", fetch = FetchType.LAZY)
    private List<Book> books;
}

@Entity
public class Book {
    @Id
    private Long id;
    private String title;

    @ManyToOne
    private Author author;
}

Now, if you fetch a list of Author entities, and then in your code, you access the books collection for each Author, you may encounter the N+1 issue:

List<Author> authors = entityManager.createQuery("SELECT a FROM Author a", Author.class).getResultList();

for (Author author : authors) {
    // This line triggers an additional query for each author to fetch the books
    List<Book> books = author.getBooks();
}

How the N+1 Issue Occurs

  1. The first query (SELECT a FROM Author a) retrieves all Author entities.
  2. Then, for each Author, accessing author.getBooks() triggers a new query due to the FetchType.LAZY configuration:
SELECT * FROM Book WHERE author_id = ?;

If there are 10 authors, this results in 1 query to get authors + 10 additional queries to get the books for each author.

This pattern of multiple queries can be detrimental to performance, especially when the number of authors (or primary entities) grows.

Solutions to the N+1 Issue

Using JOIN FETCH in JPQL or HQL Queries

  • You can use a fetch join to load the Author entities along with their Book associations in a single query.
  • Example:
List<Author> authors = entityManager.createQuery(
    "SELECT a FROM Author a JOIN FETCH a.books", Author.class
).getResultList();
  • This will load all authors and their books in one query, avoiding additional queries for each author’s books.

Changing FetchType to EAGER

  • Another option is to set the FetchType of the association to EAGER:
@OneToMany(mappedBy = "author", fetch = FetchType.EAGER)
private List<Book> books;
  • However, this is not generally recommended as it makes Hibernate load this relationship eagerly everywhere it’s accessed, potentially affecting performance in other parts of the application.
  • This solution can cause more problems if your entities have large collections, so use it cautiously.

Using Batch Fetching with @BatchSize

Hibernate supports batch fetching, which allows related entities to be loaded in batches rather than one at a time. Add the @BatchSize annotation to specify how many associations should be loaded in a single query:

@OneToMany(mappedBy = "author", fetch = FetchType.LAZY)
@BatchSize(size = 10)
private List<Book> books;

Now, when the first author’s books are accessed, Hibernate will load the books for multiple authors (up to the batch size) in fewer queries. This can improve performance when working with a large number of entities.

Using a Secondary Query with a Fetch Join

In some cases, it’s more efficient to load entities in separate stages. First, retrieve the main entities (e.g., Author), and then use a second query to load the associated entities with a fetch join.

Example:

List<Author> authors = entityManager.createQuery("SELECT a FROM Author a", Author.class).getResultList();
List<Author> authorsWithBooks = entityManager.createQuery(
    "SELECT DISTINCT a FROM Author a JOIN FETCH a.books WHERE a IN :authors", Author.class
).setParameter("authors", authors)
.getResultList();

EntityGraph

An EntityGraph is a JPA feature that specifies which attributes or associations of an entity should be fetched eagerly. Instead of modifying the default FetchType in entity mappings, EntityGraphs allow you to dynamically define fetching strategies at query time.

EntityGraphs can be defined at two levels:

  1. Static: Defined on the entity using annotations.
  2. Dynamic: Created at runtime and passed into queries.

Why Use EntityGraph?

EntityGraphs are beneficial for: Avoiding the N+1 problem: They allow you to specify which related entities to fetch in a single query, reducing the number of database queries. Dynamic fetch control: You can control eager and lazy loading dynamically without altering entity mappings or using JOIN FETCH in JPQL, which keeps your codebase clean. Better performance: Fetching only what is necessary improves the efficiency of queries.

Defining an EntityGraph

  1. Static EntityGraph with Annotations

Define an EntityGraph using @NamedEntityGraph and @NamedAttributeNode annotations on an entity.

For example, let’s say you have Author and Book entities, and you want to fetch an author along with their books:

@Entity
@NamedEntityGraph(
    name = "Author.books",
    attributeNodes = @NamedAttributeNode("books")
)
public class Author {
    @Id
    private Long id;
    private String name;

    @OneToMany(mappedBy = "author", fetch = FetchType.LAZY)
    private List<Book> books;
}

Here:

  • name = "Author.books" is the identifier for this EntityGraph.
  • attributeNodes = @NamedAttributeNode("books") specifies that the books association should be fetched eagerly when this EntityGraph is used.
  1. Dynamic EntityGraph

Alternatively, you can create and configure an EntityGraph dynamically at runtime. This is useful when you need different fetch strategies in different contexts.

Example:

EntityGraph<Author> graph = entityManager.createEntityGraph(Author.class);
graph.addAttributeNodes("books");

Using an EntityGraph in a Query

Once you have defined an EntityGraph (either statically or dynamically), you can apply it to a JPA query. Here’s how:

Example with Static EntityGraph

TypedQuery<Author> query = entityManager.createQuery("SELECT a FROM Author a WHERE a.id = :id", Author.class);
query.setParameter("id", 1L);
query.setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph("Author.books"));
Author author = query.getSingleResult();

Example with Dynamic EntityGraph

EntityGraph<Author> graph = entityManager.createEntityGraph(Author.class);
graph.addAttributeNodes("books");

TypedQuery<Author> query = entityManager.createQuery("SELECT a FROM Author a WHERE a.id = :id", Author.class);
query.setParameter("id", 1L);
query.setHint("javax.persistence.fetchgraph", graph);
Author author = query.getSingleResult();

EntityGraph Types: fetchgraph vs. loadgraph

When applying an EntityGraph to a query, there are two modes you can use:

  1. Fetch Graph (javax.persistence.fetchgraph): Specifies that only the attributes defined in the EntityGraph should be eagerly fetched, while others remain lazy. This is the most commonly used mode for avoiding the N+1 problem selectively.
  2. Load Graph (javax.persistence.loadgraph): Specifies that attributes in the EntityGraph should be eagerly fetched, but it also respects any default FetchType.EAGER annotations in the entity. This mode can be useful when you want to extend the default fetching behavior but respect existing mappings.

Example of FetchGraph and LoadGraph Usage

// Using fetchgraph to control eager fetching for only specified attributes
query.setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph("Author.books"));

// Using loadgraph to load specified attributes and follow default fetch types
query.setHint("javax.persistence.loadgraph", entityManager.getEntityGraph("Author.books"));

Advantages of Using EntityGraph

  • Performance Optimization: Load only the required data in a single query, which minimizes database round-trips.
  • Dynamic Fetching: Control fetching strategies without changing the entity mappings, keeping your domain model cleaner.
  • Better than JPQL JOIN FETCH: EntityGraph is more flexible than using JPQL JOIN FETCH, as it can be applied selectively and reused without modifying queries.

The N+1 selection issue is a performance problem in ORMs like Hibernate. It occurs when an application makes one initial query to fetch a list of entities (the “1” query) and then, for each entity, makes N additional queries to load related data due to lazy loading.

Example:

  • Initial Query: Retrieve a list of authors.
  • Additional Queries: For each author, retrieve their books in a separate query.

Yes, pagination can also be a solution to mitigate the N+1 selection issue, especially when dealing with large datasets. By limiting the number of entities fetched in the initial query, you reduce the number of associated queries triggered.

TIP 5: Pagination

Instead of loading all entities at once, use pagination to fetch a smaller subset of records and process them incrementally. This prevents the ORM from issuing too many follow-up queries for related entities in a single operation.

Example:

Suppose you are fetching authors and their books:

  1. Without Pagination: The initial query fetches all authors, triggering N additional queries for books if lazy loading is enabled.
  2. With Pagination: The initial query fetches only a limited number of authors (e.g., 10 per page), reducing the number of subsequent queries.

How to Implement Pagination in Hibernate

Use setFirstResult and setMaxResults for pagination:

TypedQuery<Author> query = entityManager.createQuery("SELECT a FROM Author a", Author.class);
query.setFirstResult(0); // Start index
query.setMaxResults(10); // Page size
List<Author> authors = query.getResultList();

Combine Pagination with Fetch Strategies

To optimize further:

  • Use Fetch Join or EntityGraph in combination with pagination to ensure that associations are loaded in a single query for the current page.
  • Example with Fetch Join:
TypedQuery<Author> query = entityManager.createQuery(
    "SELECT a FROM Author a JOIN FETCH a.books", Author.class);
query.setFirstResult(0);
query.setMaxResults(10);
List<Author> authors = query.getResultList();

Benefits of Pagination

  1. Limits the number of entities fetched, reducing the number of associated queries.
  2. Prevents memory overload when dealing with large datasets.
  3. Works well with other solutions like fetch joins or EntityGraphs.

By combining pagination with other techniques like batch fetching or fetch joins, you can effectively minimize the impact of the N+1 selection issue while keeping your application performant.

This leads to N+1 queries instead of a single optimized query, which can degrade performance, especially with a large number of entities.

Solutions:

  1. Fetch Join: Use JPQL JOIN FETCH to load related entities in a single query.

  2. EntityGraph: Define and use an EntityGraph to dynamically control eager loading.

  3. Batch Fetching: Configure Hibernate to load related entities in batches, reducing the number of queries.

  4. Selective Eager Fetching: Set FetchType.EAGER only for associations that are always needed, but use cautiously to avoid unnecessary data loading.

  5. Pagination: Limit the number of entities fetched in the initial query, reducing the number of follow-up queries for related data.


Links:

All rights reserved by Borg.Net community & Technosphere.