Version 2

    Compatible with Hibernate 3.2.4 SP1.

    Last rewritten on July, 2007, by Pietro Polsinelli. In the last update (for version 3.2.4), the code has been simplified.


     

    I propose here a pagination solution, which took as starting point what can be found in hibernate team weblog. It is structured in an interface, Page, and two implementation classes:

    • HibernatePage: uses the method suggested in the guide to feed the page. Has a (sample) factory method,
    public static HibernatePage getHibernatePageInstance(Query query,
    int pageNumber,
    int pageSize,
    String driverClass)

     

    that in base of the driver class returns a HibernatePage that gets the total elements by scrolling, for databases that support scrollable resultsets: not many do it really (see below). For jdbc drivers that do not support scrollable resultsets, finally I just did query.list().size(). This can be optimized in particular instances; in general, a Hibernate user proposed "I was planning on converting the Hibernate query to SQL and then replacing the 'select .... from ...' with 'select count from ..' to get the real count'" but in order to make it work in all cases, you have to take into account that you may be returning arrays of objects, hence you have to modify the Hibernate query, which is not trivial. In fact I did some testing on Oracle 10 with driver 10, and the way it is done in the sample code, using query.list().size() resulted FASTER than the count(*) done in sql!

     

    If you seriously consider performance, real problems are got by making wrong joins, forgetting to index columns and so on, where you may get multiplicative groth of required times. If anyway you are troubled by this, you can use proprietary sql tools (in particular in case of Oracle) to develop an OraclePage that avoids even that (assuming that Oracle will not release support for real scrollable resultsets): see the paragraph on Oracle below.

    Anyway I don't see what's the use of displaying huge paged resultsets to the user.

     

    • ListPage: in this case the page is fed by a list.

    It is nice to have the same interface handle paging of lists got by query and lists built in other ways, so if for example you build a web component that manages pagination, you can build one for both cases.

     

    This approach (also called "Query approach") is compatible with a stateless handling of pages, as you can create the Page in each request by passing the page number required. Being stateless, if you ask as page number Integer.MAX_VALUE, it will return the last page for the query; this way you may ask for the last page even without having already constructed the Page, which may be the case in web based contexts.

     

    A different approach (also called "Cache approach") would be to build a list of all ids queried, and put them in user session; then for each page you will instantiate the set of objects whose ids are in the page. I think that this is not a good idea as it

    • feeds user session, i.e. introduces global variables;
    • it’s uselessly complex: you need a way to tell pages what to take down from session (a key), to make it expiry..;
    • may have stale data;
    • is "working against the framework": you will get no advantages if say you enable a second level cache, because you are buildin caches at the aplication level.

     

    Another discussion of the two approaches in: http://www.devx.com/Java/Article/21383

     


    The Api

    It is structured so simply that it hardly needs explanations; and simplicity is very important. One criteria is that the names of methods should speak as much as possible by themselves. So for example it is a bad idea to introduce a method called "getElements": which elements are meant ? The page or the entire list ?

    An approach distinctively different: http://java.sun.com/blueprints/corej2eepatterns/Patterns/ValueListHandler.html http://valuelist.sourceforge.net/

     


    Is this approach performant ?

    To know the total number of elements, we use net.sf.hibernate.ScrollableResults, on which in the Hibernate 2 documentation:

    "9.3.3. Scrollable iteration

    If your JDBC driver supports scrollable ResultSets, the Query interface may be used to obtain a ScrollableResults which allows more flexible navigation of the query results."

    Now, would the stateful approach be better in this respect ? Well, assuming the above improvement, for performance I doubt so: to get the ids of all the objects, you still have to get them all; only once, but there in all databases there is a (very well made: see the excellent "SQL Tuning" by Dan Tow) cache, so our stateless query is bound to hit the cache most of the time, and hence will be performant too. Starting to develop your own cache, and making your solution complex will just worsen things: see for a completely different aproach, which simply ignores the main problem (i.e. are scrollable resultsets implemented efficiently):

    http://www.theserverside.com/articles/content/DataListHandler/DataListHandler_Revised.pdf

     


    Oracle

    Always put an order by clause when paging on Oracle: we have found that on some instances the records are resorted between one page fetch and the subsequent one if no order by is specified. Actually we found this behavior in case of a distributed db load, so that may be the cause. Anyway, if you use a order by clause, you'll be fine.

    This version does NOT scale well on Oracle 9 (so use 10 if you can):

    http://download-west.oracle.com/docs/cd/B10501_01/java.920/a96654/resltset.htm

    in particular read

    "Because the underlying server does not support scrollable cursors, Oracle JDBC must implement scrollability in a separate layer. It is important to be aware that this is accomplished by using a client-side memory cache to store rows of a scrollable result set.

    Important: Because all rows of any scrollable result set are stored in the client-side cache, a situation where the result set contains many rows, many columns, or very large columns might cause the client-side Java virtual machine to fail. Do not specify scrollability for a large result set.

    Scrollable cursors in the Oracle server, and therefore a server-side cache, will be supported in a future Oracle release."

    http://opensource.atlassian.com/projects/hibernate/browse/HB-1181

    in particular read

    "Oracle 9i's JDBC driver caches scrollable results sets in the JVM, meaning that scroll() is impractical for large result sets:"

    See? Oracle emulates (or.. fakes ;-) )a scrollable result set, hence maximizing impact on memory.

    If you can't wait or patch, you could develop a OraclePage extending hibernate page where you use Native SQL Queries, in particular you will have to insert Oracle's proprietary rownum modifiers into the generated SQL: by using Cached RowSet and cachedRowset.getRow() you get the row count of a query without scanning all the rows or issue a separate SELECT COUNT(*). Myself may have to that in the near future, if so I will integrate this page.

    Oracle 10 fixed many issues, and is positively faster (it still is from 3 to 10 times slower then jtds on sql server on the same hardware, but that may come from my oracle incompetence).

     


    Sql server on Jtds

    On SQL Server server-side cursors are used.

    On SQL Server jTDS uses server-side cursors for scrollable ResultSets, and only fetches a limited number of rows at a time (by default 100 rows, you can alter that if your rows contain LOBs or if memory is an issue for any other reason). So no problem here. Thanks to Alin Sinpalean of the jtds team for help.

     


    Sybase on Jtds

    When jTDS connects to Sybase it uses client-side cursors, so the whole ResultSet is cached on the client. So no good. Thanks to Alin Sinpalean of the jtds team for help.

     


    MySql

    using http://dev.mysql.com/doc/connector/j/en/index.html#id2424811

    "ResultSet

    By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to 'stream' the results back one row at-a-time.

    To enable this functionality, you need to create a Statement instance in the following manner:

    stmt = conn.createStatement( java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY );

    stmt.setFetchSize(Integer.MIN_VALUE);

    The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to "stream" result sets row-by-row. After this any result sets created with the statement will be retrieved row-by-row."

    So here too the FORWARD_ONLY parameter should help.

     


    Hsqldb

    Found on Google: "HSQLDB 1.7.1 doesn't support scrollable insensitive ResultSets"; no infos on the website.

     


    Firebird JDBC driver

    (kindly supplied by Roman Rokytskyy rrokytskyy@users.sourceforge.net)

    ResultSet.TYPE_SCROLL_FORWARD_ONLY uses server-side cursor, each fetch call

    fetches as many records as fit in the single network packet unless FOR

    UPDATE clause is specified in SELECT statement (in this case only one record

    will be send per fetch operation). JDBC driver will do as many fetches as

    needed to fill the record number specified in Statement.setFetchSize (a bug

    in latest 1.5.5 version ignores the ResultSet.setFetchSize(int) value, will

    be fixed in next release).

    ResultSet.TYPE_SCROLL_INSENSITIVE will cache complete result set in JVM (so

    we "emulate"/"fake" the scrollable result sets since server does not support

    this feature).

    ResultSet.TYPE_SCROLL_SENSITIVE is not supported and is "downgraded" into

    TYPE_SCROLL_INSENSITIVE with appropriate warning in the Connection object.

    Also Firebird supports SELECT FIRST n SKIP m <column list> FROM ...

    statements. Firebird 2.0 will support ROWS clause (SELECT * FROM .. WHERE

    ... ORDER BY ... ROWS 10 TO 20) as an alternative to FIRST/SKIP syntax.

     


    PostgreSQL JDBC driver

    I am informed by Mr. Moisei that on PostgreSQL the scrolling is quite slow. One could in such cases inject a specific counting query to speed up things. Note that one cannot in general do a "select count", as the hql may involve arbitrarily complex joins for the select.

     


    Informix JDBC driver

    (kindly supplied by Bob Copeland)

    Scroll Sensitivity

    The Informix database server implementation of scroll cursors places the rows fetched in a temporary table. If another process changes a row in the original table (assuming the row is not locked) and the row is fetched again, the changes are not visible to the client.

    This behavior is similar to the SCROLL_INSENSITIVE description in the JDBC 2.0 specification. Informix JDBC Driver does not support SCROLL_SENSITIVE cursors. To see updated rows, your client application must close and reopen the cursor.

    Client-Side Scrolling

    The JDBC specification implies that the scrolling can happen on the client-side result set. Informix JDBC Driver supports the scrolling of the result set only according to how the database server supports scrolling.

     


    Other jdbc drivers

    Help me !

     


    In conclusion, for the moment only jtds supplies optimal support for this technique; this is not a sufficient reason not to use it anyway.

    In order to make this page readable, the code is on a separate one: paginationCode.

    Some readers have asked for a sample of the code usage; actually it is as simple as this: given a Hibernate query hibQuery, just call

    Page page = HibernatePage.getHibernatePageInstance(hibQuery,pageNumber,pageSize);
    for (Object yourType : page.getThisPageElements()) {
          YourType yourType = ...
    ...
    }
    

    If you have updated information, please feed this page (not the remarks below) or send me (mailto:ppolsinelli (at) open-lab (dot) com) infos to put here.