A Short Primer On Fetching Strategies

Note: All of this is also explained in more detail in the reference documentation. However, some Hibernate users prefer to collect information from Wiki pages or need a pointer if they post on the forum. This page describes Hibernate 3.1.

 

Hibernate3 loads a single entity instance if you retrieve it by identifier through get() or load(). None of the collections mapped for this entity are loaded, and neither are any associated entities, whether they are reached through to-many or to-one associations. Collections are replaced by wrappers, and single-ended associations use a proxy by default. Collection wrappers are Hibernate implementations of JDK collection interfaces that can initialize their data on demand. A proxy is a placeholder instance of a runtime-generated subclass (through cglib or Javassist) of a mapped persistent class. It initializes itself as soon as any method other than the mapped database identifier getter method is called.
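
The proxy behavior described here can be illustrated without Hibernate. The following is only a minimal sketch: it uses a JDK dynamic proxy on an interface as a stand-in for the runtime-generated subclass that Hibernate creates, and the names Desk, LoadedDesk and LazyProxySketch are made up for this illustration. The identifier getter answers from the proxy itself; any other method triggers the (simulated) load exactly once.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical entity contract. Hibernate proxies classes; a JDK proxy needs an interface.
interface Desk {
    Long getId();
    String getLocation();
}

// Stand-in for a fully loaded entity instance.
final class LoadedDesk implements Desk {
    private final Long id;
    private final String location;

    LoadedDesk(Long id, String location) {
        this.id = id;
        this.location = location;
    }

    @Override public Long getId() { return id; }
    @Override public String getLocation() { return location; }
}

final class LazyProxySketch {
    // Counts how often the simulated SELECT was executed.
    static int loads = 0;

    // Returns a placeholder that knows only the identifier until it is really used.
    static Desk lazyDesk(final Long id, final Supplier<Desk> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private Desk target; // stays null until the first non-identifier call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (method.getName().equals("getId")) {
                    return id; // the identifier is known without loading anything
                }
                if (target == null) {
                    target = loader.get(); // simulated SELECT, executed once
                }
                return method.invoke(target, args);
            }
        };
        return (Desk) Proxy.newProxyInstance(
                Desk.class.getClassLoader(), new Class<?>[] { Desk.class }, handler);
    }

    public static void main(String[] args) {
        Desk desk = lazyDesk(42L, () -> {
            loads++;
            return new LoadedDesk(42L, "Room 101");
        });
        System.out.println(desk.getId() + " loads=" + loads);       // 42 loads=0
        System.out.println(desk.getLocation() + " loads=" + loads); // Room 101 loads=1
    }
}
```

In real Hibernate code the loader would be the Session that created the proxy, which is why the proxy depends on that Session staying open.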

 

(The exception is a one-to-one association with a shared primary key: a proxy for the target of the association can only be used if the target is required. If a Person has a one-to-one association to a Desk entity, the primary key value of the Desk must be the same as the primary key value of the Person. The Desk can only be proxied if the Person primary key column has a foreign key constraint referencing the primary key column of Desk, which makes the association non-optional: a Person needs a Desk. If this foreign key constraint is not present, or if it is not mapped in Hibernate with constrained="true" in XML or optional="false" in annotations, no proxy for a Desk can be used. Hibernate then has to hit the database, either with a join or a secondary immediate query, to find out whether the association is really null. You can use bytecode-injected interception to get lazy loading in that case. Read this page for an alternative explanation of the same issue.)

 

If you then navigate from the loaded entity instance to a wrapped collection or proxied associated entity and access it, an additional SELECT is executed to fetch exactly what you just hit in your code. Applied to a list of n entities loaded by one query, this effectively produces n+1 selects: one for the list and one for each collection or proxy you touch. It also exposes you to another problem: initializing unloaded collections and proxies requires their persistence context, the Session, to be open and the objects to be attached to it. Hibernate therefore cannot fetch data anymore once you have ended your unit of work and closed the Session, or once you have detached the objects from it.
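
Both effects, the n+1 selects and the failure after the Session is closed, can be sketched with a small simulation. Nothing below is Hibernate API: FakeSession and LazyBids are hypothetical stand-ins that only count statements, and the IllegalStateException takes the place of the LazyInitializationException that Hibernate throws in this situation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Session: it only counts SELECTs and can be closed.
final class FakeSession {
    int selects = 0;
    boolean open = true;

    // Simulates one query returning n owning entities whose collections are not loaded yet.
    List<LazyBids> loadItems(int n) {
        selects++;
        List<LazyBids> items = new ArrayList<LazyBids>();
        for (int i = 0; i < n; i++) {
            items.add(new LazyBids(this, i));
        }
        return items;
    }

    void close() {
        open = false;
    }
}

// Hypothetical collection wrapper that initializes itself on first access.
final class LazyBids {
    private final FakeSession session;
    private final int ownerId;
    private List<String> data; // null means "not initialized yet"

    LazyBids(FakeSession session, int ownerId) {
        this.session = session;
        this.ownerId = ownerId;
    }

    int size() {
        if (data == null) {
            if (!session.open) {
                // Real Hibernate throws org.hibernate.LazyInitializationException here.
                throw new IllegalStateException("session is closed, cannot initialize");
            }
            session.selects++; // one additional SELECT per touched collection
            data = new ArrayList<String>();
            data.add("bid for item " + ownerId);
        }
        return data.size();
    }
}

final class NPlusOneSketch {
    public static void main(String[] args) {
        FakeSession session = new FakeSession();
        List<LazyBids> items = session.loadItems(5);
        for (LazyBids bids : items) {
            bids.size(); // each first access costs one SELECT
        }
        System.out.println("selects = " + session.selects); // selects = 6

        FakeSession second = new FakeSession();
        LazyBids detached = second.loadItems(1).get(0);
        second.close();
        try {
            detached.size();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```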

 

Obviously, the fetching behavior for navigational access can be changed and tuned (a complementary solution is the Open Session in View pattern). There are two notions: what should be fetched, and how it should be fetched. There are also two different and complementary tuning strategies: global and per use case.

 

Change the global behavior of a particular collection or association in mapping metadata:

  • You can switch to an immediate non-lazy second SELECT by setting lazy="false" on a collection or single-valued association mapping.
  • You can switch to an immediate non-lazy second SELECT for all single-ended associations to a particular entity class by disabling lazy fetching of the target entity using <class ... lazy="false">. This is only really useful if there are a small number of instances of the entity, and we expect to be able to pull them from the second-level cache.

These settings let you define what part of the persistent graph should be loaded at all times. The "how" defaults to fetch="select".

  • With fetch="join" on a collection or single-valued association mapping, you avoid the second SELECT altogether (which also makes the association or collection non-lazy). Hibernate uses just one "bigger" join SELECT to get both the owning entity and the referenced entity or collection: an outer join for collections and nullable many-to-one foreign keys, an inner join for not-null many-to-one foreign keys. If you use fetch="join" for more than one collection role of a particular entity instance (in "parallel"), you create a Cartesian product (also called cross join), and two (lazy or non-lazy) SELECTs would probably be faster.
  • With batch-size="N" on a collection or an entity class mapping, you tell Hibernate to optimize the second SELECT (either lazy or non-lazy) by fetching up to N other collections (or entity instances) when you hit one in Java. Pick N depending on how many "owning" entities you expect to be in the Session already. This is a blind-guess optimization technique, but very nice for nested tree node loading.
  • With fetch="subselect" on a collection you can tell Hibernate to not only load this collection in the second SELECT (either lazy or non-lazy), but also all other collections for all "owning" entities you loaded in the first SELECT. This is especially useful for fetching multiple collections in parallel.
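
To get a feeling for how these options compare, the statement counts can be estimated with simple arithmetic. The sketch below is an idealized model, not a measurement: it assumes you load a number of owning entities with one query and then touch one collection on every one of them, and that batch fetching always fills its batches (the real number of statements can be higher). The class and method names are invented for this illustration.

```java
final class FetchCostSketch {
    // fetch="select": one SELECT for the owners, then one per touched collection.
    static int selectFetch(int owners) {
        return 1 + owners;
    }

    // batch-size="N": one SELECT for the owners, then one per full or partial batch (idealized).
    static int batchFetch(int owners, int batchSize) {
        return 1 + (owners + batchSize - 1) / batchSize;
    }

    // fetch="subselect": one SELECT for the owners, one for all their collections.
    static int subselectFetch(int owners) {
        return owners == 0 ? 1 : 2;
    }

    // fetch="join": owners and collection arrive in a single SELECT.
    static int joinFetch() {
        return 1;
    }

    // Rows returned when two collections are join-fetched in parallel,
    // assuming every owner has exactly a and b elements (both non-empty).
    static long cartesianRows(int owners, int a, int b) {
        return (long) owners * a * b;
    }

    public static void main(String[] args) {
        System.out.println("select:    " + selectFetch(25));           // 26
        System.out.println("batch(10): " + batchFetch(25, 10));        // 4
        System.out.println("subselect: " + subselectFetch(25));        // 2
        System.out.println("join:      " + joinFetch());               // 1
        System.out.println("product:   " + cartesianRows(25, 10, 10)); // 2500
    }
}
```

The last line shows why joining two collections in parallel is rarely a good idea: under these assumptions the single join returns 2500 rows, while separate SELECTs would transfer 25 owner rows plus 250 rows for each collection.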

 

Programmatically override or completely redefine fetching at runtime through API and queries:

  • Criteria respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one Criteria query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use setFetchMode() to enable or disable outer join fetching for a particular collection or association. Criteria queries also completely respect the fetching strategy (join vs select vs subselect).
  • HQL respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one HQL query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use LEFT JOIN FETCH to enable outer-join fetching for a particular collection or nullable many-to-one or one-to-one association, or JOIN FETCH to enable inner join fetching for a non-nullable many-to-one or one-to-one association. HQL queries do not respect any fetch="join" defined in the mapping document.
  • At the moment no batch or subselect fetching override is available dynamically. Both Criteria and HQL respect the settings defined in the mapping document if additional SELECT statements must be executed to load the non-lazy mapped associations and collections.

 

Finally, there are more advanced options:

  • Use lazy="true" on <component> and <property> mappings to enable lazy loading of individual scalar value-typed properties (a somewhat exotic case). This requires bytecode instrumentation of compiled persistent classes for the injection of interception code. It can be overridden in HQL with FETCH ALL PROPERTIES.
  • Use lazy="no-proxy" on single-valued associations to enable lazy fetching without the use of a proxy. Requires bytecode instrumentation for the injection of interception code.
  • Use lazy="extra" on collections for "smart" collection behavior, i.e. some collection operations such as size(), contains(), get(), etc. do not trigger collection initialization. This is only sensible for very large collections.