Hibernate, EJB and the @Unique Constraint

By Chris Wilson on 26 November 2010

What are Hibernate and EJB?

As background, Hibernate is a Java library that allows Java objects to be loaded from and saved to a database. (It is other things as well, but for simplicity I can ignore those for now.) It handles loading, creating, updating and searching for objects by generating SQL queries for us.

Hibernate is an implementation of the persistence part (the Java Persistence API, or JPA) of IBM and Sun's Enterprise Java Beans (EJB) specification; I'll just say EJB for the rest of this post. You can argue about which came first, Hibernate or EJB, but the Hibernate team is a key participant in the EJB standards process: most new EJB-related standards follow Hibernate's de facto lead, and key Hibernate developers like Emmanuel Bernard are leaders of the EJB specification teams.

Insanity Rules

Let's start with the theory. I'm going to argue that EJB is insane. I mean it. I've been telling people that for nearly a year, and nobody has been able to prove me wrong.

It's insane because it's trying to solve the wrong problem, an impossible problem. It's trying to keep your in-memory Java objects perfectly in sync with the database contents at all times. If you don't believe me, check out the Hibernate manual (under "Do not treat exceptions as recoverable").

Of course that's impossible, because other people, and other instances of the application, can be modifying the database under your feet. You have no way to know until you try to save the object, which is when it fails. And the only way to find out for sure is to commit your transaction, which you might not want to do: you might not be ready to actually save the object yet, or you might want to recover gracefully if it fails (see below).

Instead, EJB forces you to pretend that everything is just rosy, and let it throw an exception when the inevitable happens: someone has modified the record under your feet, or some other constraint (such as uniqueness) is violated. The worst thing about this exception is that you can't recover from it. That's because the faulty object is still managed by EJB.

If you try to recover from the exception (for example to display a nice message to the user instead of dumping core all over the shop floor), and you touch the database session in any way, you risk EJB trying to save the object again... which fails again... which throws another exception.

If you discard the session, you'd better not dare touch any object that you loaded from the database before, because it might be lazily loaded (not yet loaded), and throw another exception when you try to actually do anything with it.

There is no way out of this trap, at least officially:

Do not treat exceptions as recoverable: This is more of a necessary practice than a "best" practice. When an exception occurs, roll back the Transaction and close the Session. If you do not do this, Hibernate cannot guarantee that in-memory state accurately represents the persistent state.

We've implemented a workaround in RITA, which tries to identify the offending object when an exception occurs and evicts it from the cache, but it's pretty scary and will never be officially supported by Hibernate.

Performance

The other problem with this approach is that it forces the EJB implementation to constantly check all of your objects to see if any of them have changed, and if so try to persist them to the database. Your application knows which objects have changed and is best placed to handle any errors in trying to save them to the database, but apparently the designers of EJB know better. Or something.
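As a rough illustration of that cost, here is a minimal sketch of snapshot-based dirty checking, the general technique of comparing every managed object against a copy of its state taken when it was loaded. This is not Hibernate's code: Account and SnapshotSession are hypothetical names invented for this example, and a real implementation is far more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity used only for this illustration.
class Account {
    String owner;
    int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}

// Hypothetical stand-in for a persistence session; not a Hibernate class.
class SnapshotSession {
    // Copy of each managed object's state as it was when it was attached.
    private final Map<Account, Object[]> snapshots = new IdentityHashMap<>();

    void attach(Account a) {
        snapshots.put(a, stateOf(a));
    }

    private static Object[] stateOf(Account a) {
        return new Object[] { a.owner, a.balance };
    }

    // Compares EVERY managed object with its snapshot, including objects
    // that the application never touched, and returns the changed ones.
    List<Account> flush() {
        List<Account> dirty = new ArrayList<>();
        for (Map.Entry<Account, Object[]> e : snapshots.entrySet()) {
            if (!Arrays.equals(stateOf(e.getKey()), e.getValue())) {
                dirty.add(e.getKey()); // a real session would issue an UPDATE here
            }
        }
        for (Account a : dirty) {
            snapshots.put(a, stateOf(a)); // the new state becomes the baseline
        }
        return dirty;
    }
}
```

Every call to flush() walks every attached object, whether or not the application changed it, which is exactly the work that the application could have avoided by saying which objects it modified.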

Validation

One of the nicer things about EJB is that it lets you annotate your data-storage classes with extra information that controls how they are saved to the database by EJB. For example, this allows you to specify the table and column names for your classes and properties, as well as information about indexes.

A companion standard to EJB is Bean Validation (JSR 303), which allows you to write code that checks whether your object is valid before trying to save it to the database. In some cases, this can save you from falling into the trap above, because it allows you to validate your object when you want, before saving it, rather than following the whims of your EJB implementation.

So, what can you do to validate your object? Well, you can check that some fields are not null, or that their value follows a certain pattern. You can write your own custom validator that's adapted to your specific objects, by checking for invalid combinations of field values. And... that's about it.

Notably missing from this list is the ability to check anything in the database. The philosophical reason for this is that EJB is completely database-agnostic. In fact it's trying to pretend that there is no database, and that apart from one magical call to a save() method, your objects live forever in some kind of implementation-independent limbo. Of course there's no universal way to access that limbo, so if you want to do it then you can kiss your platform-independence goodbye.

Validating Uniqueness

There's not even a generic way to implement something like the @Unique validation, which is so obvious that people keep asking for it. It would simply ensure that a property is unique for that kind of object, so that for example you don't have two User objects with the same name or emailAddress. But it doesn't exist in the Bean Validation specification. The official reason is that:

@Unique cannot be tested at the Java level reliably but could generate a database unique constraint generation. @Unique is not part of the BV spec today.

In other words, "life isn't perfect so we're not going to bother trying to make it better." Perhaps they're trying to save us from our own foolishness (moral hazard): the risk that we might actually believe that this check is enough, and that we'll never fail when writing to the database, the server will never crash or explode in flames, etc...
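To be fair, the unreliability they mention is real. Here is a minimal sketch of it, assuming a hypothetical in-memory UserTable that stands in for a database table with a unique index on the email column. Two requests both pass the uniqueness check before either of them inserts, and the second insert still fails. All the names here are invented for the illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a table with a unique index on "email".
class UserTable {
    private final Set<String> emails = new HashSet<>();

    // What a validator can do: look at the rows that exist right now.
    boolean isEmailFree(String email) {
        return !emails.contains(email);
    }

    // What the database does on insert: enforce the unique index.
    void insert(String email) {
        if (!emails.add(email)) {
            throw new IllegalStateException("unique constraint violated: " + email);
        }
    }
}

class UniqueRaceDemo {
    // Simulates two requests that both validate before either one inserts.
    // Returns true if the second insert was rejected by the constraint.
    static boolean secondInsertFails(UserTable table, String email) {
        boolean firstCheck = table.isEmailFree(email);  // request 1 validates
        boolean secondCheck = table.isEmailFree(email); // request 2 validates
        if (!firstCheck || !secondCheck) {
            throw new IllegalArgumentException("email must be unused at the start");
        }
        table.insert(email); // request 1 saves successfully
        try {
            table.insert(email); // request 2 saves after passing validation
            return false;
        } catch (IllegalStateException e) {
            return true; // only the constraint caught the duplicate
        }
    }
}
```

So the check cannot replace the database constraint. It can still catch the common case early and produce a friendly message, with the constraint as the backstop for the rare race.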

Incidentally, committing the transaction to check for uniqueness might force your code to go through unreadable contortions to avoid saving an invalid object or inconsistent state in the database (a preference for academic perfection over clean, maintainable code seems to be common among designers of Java standards). And committing a transaction early is also dangerous because the database will no longer detect and warn you about conflicting changes in a concurrent transaction, so you could end up silently overwriting someone else's changes.

Hibernate is more pragmatic, but @Unique doesn't even exist there, at least not officially, although there is a sample implementation on the community wiki. I'm not sure exactly why it's not official, although that page says that "accessing the Session/EntityManager during a validation is opening yourself up for potential phantom reads", whatever that means. It is true that:

  • executing many individual reads would be difficult to manage efficiently, if you had many of these annotations;
  • reading while a flush() is in progress may trigger another flush(), leading to an infinite loop;
  • it requires you to jump through hoops in an extremely ugly way just to get a usable Hibernate session object.

Anyway, even if Sun and Hibernate don't want to write this validator because it's not technically perfect, many people are going ahead and writing it themselves, even complaining that it's "harder than you think".

Our Implementation

So I wanted to talk about what we've done in RITA to work around this, apart from me swearing never again to use EJB or Hibernate. I don't like the community wiki's approach because we wrap an object of our own around the Hibernate session, to keep some of this craziness locked away in a well-guarded cellar of the application. Their approach gives us no way to access our wrapper object. And it's not a standard or anything, so it doesn't matter much if we ignore it.

We already have a Hibernate Interceptor, which does the following:

  • logs object changes in the audit log; (this appears to be the most common use of interceptors in Hibernate)
  • uploads modified records from a local instance to the master, if working online;
  • goes offline if the upload fails;
  • updates version numbers of owned objects on a local instance;
  • updates version numbers of all objects on a master.

We added a checkUniqueConstraints function, which is about 40 lines long (much shorter than the example on the community wiki), that looks for @UniqueInDatabase annotations on properties and runs a quick and dirty Criteria query to verify that no conflicting values are present in the database at that time.
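The general shape of such a check can be sketched as follows. This is not the RITA code: the annotation's definition, the Member class and the ConflictLookup interface are assumptions made so that the example runs without a database. In a Hibernate application the lookup would be a Criteria query along the lines of session.createCriteria(type).add(Restrictions.eq(property, value)), plus a condition to exclude the object being saved.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the marker annotation; only its name comes from the post.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface UniqueInDatabase {}

// Hypothetical stand-in for the database query.
interface ConflictLookup {
    // True if another stored object of this type already has this value.
    boolean exists(Class<?> type, String property, Object value);
}

// Hypothetical entity with one property that must be unique.
class Member {
    @UniqueInDatabase
    String email;
    String displayName;

    Member(String email, String displayName) {
        this.email = email;
        this.displayName = displayName;
    }
}

class UniqueChecker {
    // Returns the names of annotated fields whose values are already taken.
    static List<String> checkUniqueConstraints(Object entity, ConflictLookup lookup) {
        List<String> conflicts = new ArrayList<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (!field.isAnnotationPresent(UniqueInDatabase.class)) {
                continue; // unannotated fields are not checked
            }
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(entity);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            // Null values are skipped here; that is a choice made for this sketch.
            if (value != null && lookup.exists(entity.getClass(), field.getName(), value)) {
                conflicts.add(field.getName());
            }
        }
        return conflicts;
    }
}
```

This sketch only looks at annotated fields declared directly on the class; annotations on getters or on superclass fields would need extra handling.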

Further Work

It might be a good idea in future to split these responsibilities into separate layers using something like Listeners. For now, Interceptors are more convenient because they have access to both the state of the object when it was loaded and its current state, which is handy for audit logging.

I think it would be handy if Java (or Hibernate) would provide an easy way to iterate over an object's properties (whether annotated on their fields or getters or setters) and retrieve a specific annotation, class of annotations, or all annotations. I think this code already exists in Hibernate's AnnotationConfiguration, and it's a shame to have to write it again. Our method would be half as long if it could reuse this.
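For what it's worth, the JDK's java.beans.Introspector gets part of the way there. Below is a minimal sketch of a helper that, for each bean property, looks for a given annotation on the field, then the getter, then the setter. It is not the code from Hibernate's AnnotationConfiguration or from RITA; Marked, Book and PropertyAnnotations are hypothetical names, and the helper only considers fields declared directly on the class.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.AnnotatedElement;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotation used only to demonstrate the lookup.
@Retention(RetentionPolicy.RUNTIME)
@interface Marked {
    String value();
}

// Hypothetical bean: one annotation on a field, one on a getter.
class Book {
    @Marked("on field")
    private String title;
    private int pages;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    @Marked("on getter")
    public int getPages() { return pages; }
    public void setPages(int pages) { this.pages = pages; }
}

class PropertyAnnotations {
    // Maps property name to the annotation found on its field, getter or setter.
    static <A extends Annotation> Map<String, A> find(Class<?> type, Class<A> annotation) {
        PropertyDescriptor[] properties;
        try {
            // Stop at Object so that getClass() is not reported as a property.
            properties = Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors();
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        Map<String, A> found = new TreeMap<>();
        for (PropertyDescriptor p : properties) {
            A a = firstPresent(annotation,
                    fieldOf(type, p.getName()), p.getReadMethod(), p.getWriteMethod());
            if (a != null) {
                found.put(p.getName(), a);
            }
        }
        return found;
    }

    // Returns the declared field with the property's name, or null if there is none.
    private static AnnotatedElement fieldOf(Class<?> type, String name) {
        try {
            return type.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            return null;
        }
    }

    // Returns the annotation from the first non-null candidate that carries it.
    private static <A extends Annotation> A firstPresent(Class<A> annotation,
                                                         AnnotatedElement... candidates) {
        for (AnnotatedElement candidate : candidates) {
            if (candidate != null) {
                A a = candidate.getAnnotation(annotation);
                if (a != null) {
                    return a;
                }
            }
        }
        return null;
    }
}
```

With a helper like this, a uniqueness check could ask for PropertyAnnotations.find(entityClass, SomeAnnotation.class) and loop over the result, instead of walking fields and methods by hand.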