Java Persistence Performance: Objects vs Data, and Filtering a JOIN FETCH

JPA deals with objects, and SQL deals with rows. Some developers love JPA because it allows them to use objects in an object-oriented fashion. However, other developers have trouble understanding JPA precisely because it is object-oriented.

Even though object-oriented programming has been popular for several decades, there are still a lot of procedural languages out their. Developers used to procedural programming have a different mindset than object-oriented developers. This is also true of developers with a lot of experience with SQL. JPA and JPQL can be difficult to understand for developers used to SQL and JDBC.

This can lead to some odd usages and misunderstandings of JPA. In this blog post, I would like to highlight a couple of these misunderstandings, and provide a solution to my favorite, which I call Filtering a JOIN FETCH.

Dobject Model

One JPA usage that I find aggravating is what I call the Dobject model. This is a data model that has been made into an object model. Sometimes this comes across as a class that has the same name as the database table, with all of the same field names, and no relationships. Sometimes there are relationships, but they have the same name as the foreign key columns.

@Entity
public class EMP {
  @Id
  long EMP_ID;
  String F_NAME;
  String L_NAME;
  long MGR_ID;
  long ADDR_ID;
}

The above is an unusual class, and not very object-oriented. It is probably not as useful as it could if it had relationships instead of foreign keys.

@Entity
public class employees {
  @Id
  long empId;
  String fName;
  String lName;
  @ManyToOne
  Employee mgrId;
  @OneToOne
  Address addrId;
}

This one is very confused. First of all, it seems to be named after its table, where the name employees might make sense, but an object is a single entity, so should not be pluralized. Also, classes in Java should start with an upper case letter, as classes are proper names of the real world entity that they represent.

This class at least has relationships, which I suppose is an improvement, but they are named like they are foreign keys, not objects. This normally leads the user to try to query them as values instead of as objects.

Select e from employees e where e.mgrId = 4

Which does not work, because mgrId is an Employee object, not an Id value. The relationship should be named after what it represents, i.e. manager not the foreign key.

A more object-oriented way to define the class would be:

@Entity
public class Employee {
  @Id
  long id;
  String firstName;
  String lastName;
  @ManyToOne(fetch=FetchType.LAZY)
  Employee manager;
  @OneToMany(mappedBy="manager")
  List<Employee> managedEmployees;
  @OneToOne
  Address address;
}

How not to write a DAO

In JPA you do not normally execute queries to insert, update or delete objects. To update and object you just find or query it, change it through its set methods, and commit the transaction. JPA automatically keeps track of what changed and updates what is required to be updated, in the order that it is required to be updated.

To insert an object you call persit() on the EntityManager. To delete an object you call remove() on the EntityManager. This is different than SQL or JDBC that requires you to execute a query to perform any modification.

JPA does allow UPDATE and DELETE queries through JPQL. These are for batch updates and deletes, not for the deletion or updating or single objects. This can lead to the very confused Data Access Object below:

public class EmployeeDOA {
  public void insert(Employee employee) {
    em.persist(employee);
  }
  public void update(Employee employee) {
    Query query = em.createQuery("Update Employee e set e.firstName = :firstName, e.lastName = :lastName where e.id = :id");
    query.setParameter("id", employee.getId());
    query.setParameter("firstName", employee.getFirstName());
    query.setParameter("lastName", employee.getLastName());
    query.executeUpdate();
  }
  public void delete(Employee employee) {
    Query query = em.createQuery("Delete from Employee e where e.id = :id");
    query.setParameter("id", employee.getId());
    query.executeUpdate();
  }
}

This is wrong. You do not execute queries to update or delete objects. This will leave the objects in your persistence context in an invalid state, as they are not aware of the query updates. In is also not using JPA correctly, or as it was intended, and not benefiting from its full functionality. A better Data Access Object would be:

public class EmployeeDOA {
  public void persist(Employee employee) {
    em.persist(employee);
  }
  public Object merge(Employee employee) {
    return em.merge(employee);
  }
  public void remove(Employee employee) {
    em.remove(employee);
  }
}

Note that there is no update(). JPA does not have or require and update(), merge() can be used for detached objects, but is not the equivalent of update(), you do not need to call update in JPA, this is one of its benefits.

JPQL vs SQL

JPQL is not SQL. It looks a lot like SQL, has similar syntax and uses the same standard naming for operators and functions, but it is not SQL. This can be very confusing for someone experienced with SQL. When they try to use their SQL in place of JPQL it does not work.

Select * from Employee e join Address a on e.addressId = a.id where a.city = 'Ottawa'

This is SQL, not JPQL.
The equivalent JPQL would be:

Select e from Employee e where e.address.city = 'Ottawa'

Of coarse if you prefer SQL, JPA fully allows you to use SQL, you just need to call createNativeQuery instead of createQuery. However, most users prefer to use JPQL. I suppose this is because JPQL lets them deal with objects, and even if they don't quite understand objects, they do understand there is some benefit there.

JPQL also defines the JOIN syntax, but it does not have a ON clause, and JOIN is based on relationships, not foreign keys.

Select e from Employee e join e.address a where a.city = 'Ottawa'

The JOIN syntax in JPQL allows you do query collection relationships, and use OUTER joins and FETCH. A join FETCH allows you read an object an its relationship in a single query (as appose to a possible dreaded N+1 queries).

Select e from Employee e left join fetch e.address where e.address.city = 'Ottawa'

Notice that I did not use an alias on this JOIN FETCH. This is because JPQL does not allow this, which I will get into later. There is also no ON clause in JPQL because the relationship is always joined by the foreign keys defined in the relationship's join columns in its mapping.

Sometimes it is desirable to place additional conditions in a JOIN ON clause. This is normally in the case of OUTER joins, where placing the condition in the WHERE clause would result in empty joins being filtered. A ON clause is something that is part of the JPA 2.1 draft, so it is coming. EclipseLink already supports the ON clause, as well as aliasing JOIN FETCH in its 2.4 development milestones, see:

www.eclipse.org/eclipselink/downloads/milestones

Filtering a JOIN FETCH

A common misunderstanding I see users make occurs when querying a OneToMany relationship.

Select d from Department d join d.employees e where e.address.city = 'Ottawa'

This query results in all department objects that have any employee living in Ottawa. The confusion comes when they access the department's employees. Each department contains all of its employees, however they were expecting just the employees that live in Ottawa.

This is because JPA deals with objects, and a Department object represents a specific department, and has a specific identity. If I issue two queries for a particular department, I get back the same identical (==) instance (provided both queries use the same EntityManager). A specific department always has the same employees, it represent the real world department, and does not change, just because you queried it differently. This is important for caching, but also within the same persistence context, if you query the same department, two different ways, you should also get back the same exact department.

If you really want the department, and only the employees of the department that live in Ottawa, you can use the following query:

Select d, e from Department d join d.employees e where e.address.city = 'Ottawa'

This will give you an List of Object[] that contain the Department and the Employee. For each employee you will get back n Object[] (rows), where n is the number of departments in the employee. The same employee will be duplicated n times, each with its different department. If you access the departments any of the employees they will contain all of the employee's departments, not just the ones in Ottawa.

But, if you really, really want the department to only have the employees that live it Ottawa, you can do this. I'm not sure I would recommend it, but it is possible, at least in EclipseLink 2.4 it will be. EclipseLink allows you to use an alias on a JOIN FETCH. This support was intended for OneToOne and ManyToOne relationships, to avoid having to join it twice just to get an alias, as well as to allow using it in an ORDER BY or in other ways that would not filter the results and alter how the objects are built. But, there is nothing stopping you from using it with a OneToMany to filter the contents of the fetched OneToMany results.

If you are going to do this, you should be careful, and at least ensure you set the "javax.persistence.cache.storeMode" and "javax.persistence.cache.retrieveMode" to BYPASS, to avoid corrupting the shared cache.

Query query = em.createQuery("Select d from Department d join fetch d.employees e where e.address.city = 'Ottawa'");
query.setHint("javax.persistence.cache.storeMode", "BYPASS");
query.setHint("javax.persistence.cache.retrieveMode", "BYPASS");
List departmentsWithFilteredEmployees = query.getResultList();

EclipseLink 2.4 is currently under development, but its milestone builds already contain this functionality. The EclipseLink 2.4 milestone builds can be download from:
www.eclipse.org/eclipselink/downloads/milestones

For more information of the many JPQL extensions and enhancements in EclipseLink 2.4 see:
wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Querying/JPQL

Java Persistence Performance

Pages

Monday, April 23, 2012

Objects vs Data, and Filtering a JOIN FETCH

Dobject Model

How not to write a DAO

JPQL vs SQL

Filtering a JOIN FETCH

1 comment :