Monday, April 2, 2012

EclipseLink JPA supports MongoDB

EclipseLink 2.4 will support JPA access to NoSQL databases. This support is already part of the EclipseLink development trunk and can be tried out using the milestone or nightly builds.  Initial support is provided for MongoDB and Oracle NoSQL. A plug-able platform and adapter layer allows for other databases to be supported.

NoSQL is a classification of database systems that do not conform to the relational database or SQL standard. They have various roots, from distributed internet databases, to object databases, XML databases and even legacy databases. They have become recently popular because of their use in large scale distributed databases in Google, Amazon, and Facebook.

There are various NoSQL databases including:
  • Mongo DB
  • Oracle NoSQL
  • Cassandra
  • Google BigTable
  • Couch DB

EclipseLink's NoSQL support allows the JPA API and JPA annotations/xml to be used with NoSQL data. EclipseLink also supports several NoSQL specific annotations/xml including @NoSQL that defines a class to map NoSQL data.

EclipseLink's NoSQL support is based on previous EIS support offered since EclipseLink 1.0. EclipseLink's EIS support allowed persisting objects to legacy and non-relational databases. EclipseLink's EIS and NoSQL support uses the Java Connector Architecture (JCA) to access the data-source similar to how EclipseLink's relational support uses JDBC. EclipseLink's NoSQL support is extendable to other NoSQL databases, through the creation of an EclipseLink EISPlatform class and a JCA adapter.

Let's walk through an example of using EclipseLink's NoSQL support to persist an ordering system's object model to a MongoDB database.

The source for the example can be found here, or from the EclipseLink SVN repository.

Ordering object model

The ordering system consists of four classes, Order, OrderLine, Address and Customer. The Order has a billing and shipping address, many order lines, and a customer.

public class Order implements Serializable {
private String id;
private String description;
private double totalCost = 0;
private Address billingAddress;
private Address shippingAddress;
private List orderLines = new ArrayList();
private Customer customer;
...
}
public class OrderLine implements Serializable {
private int lineNumber;
private String description;
private double cost = 0;
...
}
public class Address implements Serializable {
private String street;
private String city;
private String province;
private String country;
private String postalCode;
....
}
public class Customer implements Serializable {
private String id;
private String name;
...
}

Step 1 : Decide how to store the data

There is no standard on how NoSQL databases store their data. Some NoSQL databases only support key/value pairs, others support structured hierarchical data such as JSON or XML.

MongoDB stores data as BSON (binary JSON) documents. The first decision that must be made is how to store the objects. Normally each independent object would compose a single document, so a single document could contain Order, OrderLine and Address. Since customers can be shared amongst multiple orders, Customer would be its own document.

Step 2 : Map the data

The next step is to map the objects. Each root object in the document will be mapped as an @Entity in JPA. The objects that are stored by being embedded within their parent's document are mapped as @Embeddable. This is similar to how JPA maps relational data, but in NoSQL embedded data is much more common because of the hierarchical nature of the data format. In summary, Order and Customer are mapped as @Entity, OrderLine and Address are mapped as @Embeddable.

The @NoSQL annotation is used to map NoSQL data. This tags the classes as mapping to NoSQL data instead of traditional relational data. It is required in each persistence class, both entities and embeddables. The @NoSQL annotation allows the dataType and the dataFormat to be set.

The dataType is the equivalent of the table in relational data, its meaning can differ depending on the NoSQL data-source being used. With MongoDB the dataType refers to the collection used to store the data. The dataType is defaulted to the entity name (as upper case), which is the simple class name.

The dataFormat depends on the type of data being stored. Three formats are supported by EclipseLink, XML, Mapped, and Indexed. XML is the default, but since MongoDB uses BSON, which is similar to a Map in structure, Mapped is used. In summary, each class requires the @NoSql(dataFormat=DataFormatType.MAPPED) annotation.

@Entity
@NoSql(dataFormat=DataFormatType.MAPPED)
public class Order

@Embeddable
@NoSql(dataFormat=DataFormatType.MAPPED)
public class OrderLine

Step 3 : Define the Id

JPA requires that each Entity define an Id. The Id can either be a natural id (application assign id) or a generated id (id is assign by EclipseLink). MongoDB also requires an _id field in every document. If no _id field is present, then Mongo will auto generate and assign the _id field using an OID (object identifier) which is similar to a UUID (universally unique identifier).

You are free to use any field or set of fields as your Id in EclipseLink with NoSQL, the same as a relational Entity. To use an application assigned id as the Mongo id, simply name its field as "_id". This can be done through the @Field annotation, which is similar to the @Column annotation (which will also work), but without all of the relational details, it has just a name. So, to define the field Mongo will use for the id include @Field(name="_id") in your mapping.

To use the generated Mongo OID as your JPA Id, simply include @Id, @GeneratedValue, and @Field(name="_id") in your object's id field mapping. The @GeneratedValue tells EclipseLink to use the Mongo OID to generate this id value. @SequenceGenerator and @TableGenerator are not supported in MongoDB, so these cannot be used. Also the generation types of IDENTITY, TABLE and SEQUENCE are not supported. You can use the EclipseLink @UUIDGenerator if you wish to use a UUID instead of the Mongo OID. You can also use your own custom generator. The id value for a Mongo OID or a UUID is not a numeric value, it can only be mapped as String or byte[].

@Id
@GeneratedValue
@Field(name="_id")
private String id;

Step 4 : Define the mappings

Each attribute in your object has too be mapped. If no annotation/xml is defined for the attribute, then it mapping will be defaulted. Defaulting rules for NoSQL data, follow the JPA defaulting rules, so most simple mappings do not require any configuration if defaults are used. The field names used in the Mongo BSON document will mirror the object attribute names (as uppercase). To provide a different BSON field name, the @Field annotation is used.

Any embedded value stored in the document is persisted using the @Embedded JPA annotation. An embedded collection will use the JPA @ElementCollection annotation. The @CollectionTable of the @ElementCollection is not used or supported in NoSQL, as the data is stored within the document, no separate table is required. The @AttributeOverride is also not required nor supported with NoSQL, as the embedded objects are nested in the document, and do not require unique field names. The @Embedded annoation/xml is normally not required, as it is defaulted, the @ElementCollection is required, as defaulting does not currently work for @ElementCollection in EclipseLink.

The relationship annotations/xml @OneToOne, @ManyToOne, @OneToMany, and @ManyToMany are only to be used with external relationships in NoSQL. Relationships within the document use the embedded annotations/xml. External relationships are supported to other documents. To define an external relationship a foreign key is used. The id of the target object is stored in the source object's document. In the case of a collection, a collection of ids is stored. To define the name of the foreign key field in the BSON document the @JoinField annotation/xml is used.

The mappedBy option on relationships is not supported for NoSQL data, for bi-directional relationships, the foreign keys would need to be stored on both sides. It is also possible to define a relationship mapping using a query, but this is not currently supported through annotations/xml, only through a DescriptorCustomizer.

@Basic
private String description;
@Basic
private double totalCost = 0;
@Embedded
private Address billingAddress;
@Embedded
private Address shippingAddress;
@ElementCollection
private List orderLines = new ArrayList();
@ManyToOne(fetch=FetchType.LAZY)
private Customer customer;

Step 5 : Optimistic locking

Optimistic locking is supported with MongoDB. It is not required, but if locking is desired, the @Version annotation can be used.

Note that MongoDB does not support transactions, so if a lock error occurs during a transaction, any objects that have been previously written will not be rolled back.

@Version
private long version;

Step 6 : Querying

MongoDB has is own JSON based query by example language. It does not support SQL (i.e. NoSQL), so querying has limitations.

EclipseLink supports both JPQL and the Criteria API on MongoDB. Not all aspects of JPQL are supported. Most basic operations are supported, but joins are not supported, nor sub-selects, group bys, or certain database functions. Querying to embedded values, and element collections are supported, as well as ordering, like, and selecting attribute values.

Not all NoSQL database support querying, so EclipseLink's NoSQL support only supports querying if the NoSQL platform supports it.

Query query = em.createQuery("Select o from Order o where o.totalCost > 1000");
List orders = query.getResultList();

Query query = em.createQuery("Select o from Order o where o.description like 'Pinball%'");
List orders = query.getResultList();

Query query = em.createQuery("Select o from Order o join o.orderLines l where l.description = :desc");
query.setParameter("desc", "shipping");
List orders = query.getResultList();

Native queries are also supported in EclipseLink NoSQL. For MongoDB the native query is in MongoDB's command language.

Query query = em.createNativeQuery("db.ORDER.findOne({\"_id\":\"" + oid + "\"})", Order.class);
Order order = (Order)query.getSingleResult();

Step 7 : Connecting

The connection to a Mongo database is done through the JPA persistence.xml properties. The "eclipselink.target-database" property must define the Mongo platform "org.eclipse.persistence.nosql.adapters.mongo.MongoPlatform". A connection spec must also be defined through "eclipselink.nosql.connection-spec" to be "org.eclipse.persistence.nosql.adapters.mongo.MongoConnectionSpec". Other properties can also be set such as the "eclipselink.nosql.property.mongo.db", "eclipselink.nosql.property.mongo.host" and "eclipselink.nosql.property.mongo.port". The host and port can accept a comma separated list of values to connect to a cluster of Mongo databases.

<persistence-unit name="mongo-example" transaction-type="RESOURCE_LOCAL">
<class>model.Order</class>
<class>model.OrderLine</class>
<class>model.Address</class>
<class>model.Customer</class>
<properties>
<property name="eclipselink.target-database" value="org.eclipse.persistence.nosql.adapters.mongo.MongoPlatform">
<property name="eclipselink.nosql.connection-spec" value="org.eclipse.persistence.nosql.adapters.mongo.MongoConnectionSpec">
<property name="eclipselink.nosql.property.mongo.port" value="27017">
<property name="eclipselink.nosql.property.mongo.host" value="localhost">
<property name="eclipselink.nosql.property.mongo.db" value="mydb">
<property name="eclipselink.logging.level" value="FINEST">
</property>
</property>

Summary

The full source code to this demo is available from SVN.

To run the example you will need a Mongo database, which can be downloaded from, http://www.mongodb.org/downloads.

EclipseLink also support NoSQL access to other data-sources including:
  • Oracle NoSQL
  • XML files
  • JMS
  • Oracle AQ

4 comments :

  1. HI,
    i am trying to get a project running with JPA and a mongDB database. All the persistence and everything seems to work fine. However, i am not able to perform native queries using find(), only with findOne()... could you explain we how/why this happens.
    Thanks

    ReplyDelete
  2. Hi,

    Wery good article ! But I try to use this in a JEE (JBoss 7.1) container without success.
    I can't deploy the application because of persistence.xml problems.
    Can you show me the way to make this work in a JEE environment please ?

    Thank's

    ReplyDelete
  3. Eclipselink support to MongoDB seems to be a half-solution for Java EE since there is not a Java datasource implementation for MongoDB and Java EE aplications can't be deployed using RESOURCE_LOCAL transaction type.

    A better solution for now would be using a library such as morphia to deal with mongodb mapping/queries.

    ReplyDelete
  4. Interesting how everyone seems to want to standardise heterogeneous "data stores", pretending that a single API will make it easier to interact with those stores. MongoDB and RDBMS are very different beasts, each with sophisticated and dedicated querying languages. Wouldn't it be much simpler if people started using MongoDB's querying DSL and SQL instead of EclipseLink?

    The standardisation of data stores is the next big impedance mismatch after ORM themselves.

    ReplyDelete