11 Replies Latest reply on Jan 24, 2012 1:49 PM by hvico

Cache eviction with Lucene / HSearch

hvico Jan 23, 2012 6:39 AM

Hi!

I've recently released an HSearch application that stores its indexes in Infinispan 4.x with a MySQL JDBC-backed cacheloader. After a full index rebuild, my cacheloader "LuceneIndexData" table has around 30,000 registers (the indexing process takes around 20 minutes). The number of documents to index grows really slowly in this application, around 30 documents per day, so the indexes should not grow considerably.

The application was doing well until some days had passed. Then I got out-of-memory errors, and noticed the cacheloader LuceneData table had grown to over 120,000 registers, so I had to truncate it and reindex from scratch.

After that, I started to keep track of the LuceneData table size, and noticed that it grows even on weekends when no new documents are stored, so this must be related to Infinispan. My Lucene config is set to optimize indexes after 1000 transactions, so it should not grow this way.

Possibly my problem is related to having a wrong eviction policy, because the Infinispan config I used had this configuration:

At first, I changed that to:

But then I could not rebuild my indexes: in the middle of the process I got "read past EOF" and "could not acquire lock" exceptions. I think Infinispan is trying to evict entries that are in use by the indexing process.

Now I am trying the following configuration in order to allow the indexing process to finish without any eviction occurring in the middle of it:

Using this configuration I got no errors, but I am blindly configuring something I do not really understand.

So, as you can see, I am confused and do not really know how the eviction process works. I need all my entities to be available when I do a full-text search; I cannot afford to get "partial" results, and of course I need the cache not to blow up with out-of-memory errors. Could you please explain how Infinispan handles eviction when acting as a Lucene index store, and which eviction policy I should use?

Many thanks in advance,

1. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 8:24 AM (in response to hvico)

Please, I really need some help here!

After setting that last configuration:

<eviction maxEntries="30000" strategy="LIRS" wakeUpInterval="1800000"/>

My in-memory cache seems to be stable in size (the cacheloader table still grows really fast).

But now I have this problem: I started to get these search exceptions:

Caused by: org.hibernate.search.SearchException: Unable to query Lucene index
.....
Caused by: java.io.IOException: Read past EOF: Chunk value could not be found for key _94.tis|167|org.somepackage.SomeEntity

Thanks!
2. Re: Cache eviction with Lucene / HSearch

sannegrinovero Jan 24, 2012 9:01 AM (in response to hvico)

Hi Horacio,
Is passivation enabled? If you don't enable passivation, when an entry is selected for eviction it's not stored anyway, but thrown away.

What do you mean by "registers" in your first post? Is it indexed elements? Entities? Entries in the JDBC cacheloader table?

Which version of Hibernate Search? And please I'll need the exact version of Infinispan too, "4.x" is a long range.

In all cases, the SearchException you have might mean two things:
1- some segments were lost, which is possible if you were using a very old "4.x" version of Infinispan, since the Lucene Directory was experimental back then; I remember fixing several bugs.
2- you have more than one node writing at the same time, and one is deleting a segment being read by another. This is an unlikely race condition which is prevented by the ReadLocks, or by using a big enough chunkSize. By default the chunkSize is very small (4k), but the ReadLocks are enabled.

What's the size of chunkSize being used? Is the cache clustered or is it a single node?

3. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 9:28 AM (in response to sannegrinovero)

Hi Sanne!

1) Passivation is disabled. I read this on Infinispan documentation:

Passivation is also a popular option when using eviction, so that only a single copy of an entry is maintained - either in memory or in a cache store, but not both. The main benefit of using passivation over a regular cache store is that updates to entries which exist in memory are cheaper since the update doesn't need to be made to the cache store as well.

Also, this information and the example with and without passivation show that the entry is always stored in the cachestore:

http://docs.redhat.com/docs/en-US/JBoss_Enterprise_Web_Platform/5/html/JBoss_Cache_User_Guide/cl.pass.html

Is this wrong and I should enable passivation or did I misunderstand something?

2) By registers I meant entries/rows in the JDBC cacheloader table.

3) HSearch 3.4.1.FINAL and Infinispan 4.2.1-Final

4) I did not configure any chunksize, so it must have the default value.

About my cluster configuration, it has 4 nodes, and this is what I am trying to achieve:

Node 1) Runs the backend application where documents are generated; this node has the JDBC cachestore configured.

Nodes 2, 3 and 4) Run the frontend read-only search application. Identical Infinispan configuration EXCEPT that these nodes do not have the JDBC cachestore configured.

Initially I tried to configure a shared cachestore between all 4 nodes, but under stress I got "read past EOF" exceptions, so I decided that only the backend application should interact with the cachestore, and the frontend nodes should work with an in-memory cache. The exception I got after configuring eviction arose in the "frontend" nodes, not in the backend, so maybe these nodes are not getting evicted entries from the cachestore through the backend node (that is how I supposed it would work; maybe it does not work this way and every node has to point to the JDBC cachestore).

Many MANY thanks for your interest!

4. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 9:50 AM (in response to hvico)

Further info: I enabled passivation in the backend node, which has the cachestore configured, and now I get "chunk not found" and "read past EOF" exceptions when I reindex all my entities.

I changed my chunk size to 10 MB.
5. Re: Cache eviction with Lucene / HSearch

sannegrinovero Jan 24, 2012 10:46 AM (in response to hvico)

Right, sorry about the passivation, I take that back: I confused myself while answering you; the docs and example are correct. I often use passivation, but you shouldn't need it in this configuration.
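
To make the difference concrete, here is a toy model of the behaviour the quoted documentation describes. It is plain Python, not Infinispan's API: the class ToyCache, its methods and the dict standing in for the JDBC table are all hypothetical, and the only thing it tries to capture is where a copy of the entry lives after each step.

```python
class ToyCache:
    """Toy model of an in-memory cache in front of a cache store.

    Hypothetical illustration only; none of these names exist in Infinispan.
    """

    def __init__(self, passivation):
        self.passivation = passivation
        self.memory = {}  # entries currently held in memory
        self.store = {}   # stands in for the JDBC cache store table

    def put(self, key, value):
        self.memory[key] = value
        if not self.passivation:
            # Without passivation every write also goes to the store.
            self.store[key] = value

    def evict(self, key):
        value = self.memory.pop(key)
        if self.passivation:
            # With passivation the store is only written at eviction time.
            self.store[key] = value

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        value = self.store[key]  # KeyError if the entry exists nowhere
        self.memory[key] = value
        if self.passivation:
            # Activation: the entry moves back to memory, so only one copy remains.
            del self.store[key]
        return value


for passivation in (False, True):
    cache = ToyCache(passivation)
    cache.put("k", "v")
    stored_after_put = "k" in cache.store
    cache.evict("k")
    stored_after_evict = "k" in cache.store
    print(passivation, stored_after_put, stored_after_evict, cache.get("k"))
```

In both modes the evicted entry can be read back. The difference is only whether the store already held a copy before the eviction, which is why passivation is not needed here for evicted index chunks to stay reachable.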

I'd suggest to keep the 10MB chunkSize, that looks good.

Did you remove all data from the CacheLoader before attempting other fixes? You might have an issue which is now permanently stored in the database, and Infinispan may be reloading the corrupt index at each startup. I'd remove all data and re-index from scratch.

The CacheLoader configuration is expected to be the same on each node: if your frontend nodes have no CacheLoader, they don't know the backend node has one and they won't ask it for the data. If a specific Infinispan key is not found where they expect it, they're going to assume the key does not exist.
If a single index segment is bigger than your chunksize, and it's not finding the next chunk when an InputStream is open, you'll get the "read past EOF" problem.
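
A rough sketch of why a missing chunk shows up as "read past EOF". This is a toy model in Python, not the real Lucene Directory code: the function names are hypothetical, the cache is a plain dict, and the key layout "file|chunk number|index name" is an assumption read off the error message earlier in this thread.

```python
import math


class ChunkNotFound(IOError):
    """Raised when a reader asks for a chunk the local cache cannot supply."""


def chunk_key(file_name, chunk_no, index_name):
    # Key layout copied from the error message in this thread:
    # "<file>|<chunk number>|<index name>". Treat it as an assumption.
    return f"{file_name}|{chunk_no}|{index_name}"


def write_file(cache, index_name, file_name, data, chunk_size):
    """Split data into chunk_size pieces and store each under its own key."""
    count = math.ceil(len(data) / chunk_size)
    for n in range(count):
        piece = data[n * chunk_size:(n + 1) * chunk_size]
        cache[chunk_key(file_name, n, index_name)] = piece
    return count


def read_file(cache, index_name, file_name, length, chunk_size):
    """Reassemble a file; a missing chunk surfaces as a read-past-EOF error."""
    out = bytearray()
    for n in range(math.ceil(length / chunk_size)):
        key = chunk_key(file_name, n, index_name)
        if key not in cache:
            raise ChunkNotFound(f"Read past EOF: chunk not found for key {key}")
        out += cache[key]
    return bytes(out)


cache = {}
data = bytes(range(10))
# A small chunk size spreads one file over several cache entries.
print(write_file(cache, "SomeEntity", "_94.tis", data, chunk_size=4))
# A chunk size larger than the file keeps it in a single entry.
print(write_file({}, "SomeEntity", "_94.tis", data, chunk_size=16))

# Simulate a node that cannot see one of the entries.
del cache[chunk_key("_94.tis", 1, "SomeEntity")]
try:
    read_file(cache, "SomeEntity", "_94.tis", len(data), chunk_size=4)
except ChunkNotFound as err:
    print(err)
```

The more entries a file is split into, the more keys have to be visible to the reading node at the same time, which is one way to see why a larger chunkSize and a CacheLoader on every node both reduce the chance of this error.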

Is the cluster using replication or distribution? I'd suggest replication since you have very limited number of writes and the cluster is small.
Make sure you use "fetchInMemoryState=true" as well, or each node won't download the current index from the peers at startup
[http://docs.jboss.org/infinispan/5.1/apidocs/config.html#ce_clustering_stateRetrieval or http://docs.jboss.org/infinispan/4.2/apidocs/config.html#ce_clustering_stateRetrieval ]
6. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 11:25 AM (in response to sannegrinovero)

Sanne,

My cluster is using replication.

So you made clear I should have the cacheloader config in the frontend nodes, but I cannot make this work.

This is my LuceneData cache config (backend node):

<namedCache name="LuceneIndexesData">
   <clustering mode="replication">
      <stateRetrieval fetchInMemoryState="true" logFlushTimeout="300000" />
      <sync replTimeout="50000" />
      <l1 enabled="false" />
   </clustering>
   <locking lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="5000" useLockStriping="false" />
   <loaders shared="true" preload="true">
      <loader class="org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore" fetchPersistentState="true" ignoreModifications="false" purgeOnStartup="false">
         <properties>
            <property name="key2StringMapperClass" value="org.infinispan.lucene.LuceneKey2StringMapper" />
            <property name="createTableOnStart" value="true" />

            <property name="datasourceJndiLocation" value="java:/MyDatasource" />
            <property name="connectionFactoryClass" value="org.infinispan.loaders.jdbc.connectionfactory.ManagedConnectionFactory" />
            <property name="dataColumnType" value="BLOB" />

            <property name="idColumnType" value="VARCHAR(256)" />
            <property name="idColumnName" value="idCol" />
            <property name="dataColumnName" value="dataCol" />
            <property name="stringsTableNamePrefix" value="LuceneIndexesData" />

            <property name="timestampColumnName" value="timestampCol" />
            <property name="timestampColumnType" value="BIGINT" />
         </properties>
         <async enabled="true" flushLockTimeout="2500" shutdownTimeout="7200" threadPoolSize="5" />
      </loader>
   </loaders>
   <eviction maxEntries="60000" strategy="LIRS" wakeUpInterval="18000000" />
   <expiration maxIdle="-1" />
</namedCache>

When I add the cacheloader block to the frontend nodes I get "could not acquire cluster-wide sync after 5 minutes" exceptions at startup in the backend (which tries to send state), and "read past EOF" exceptions at the frontend node (trying with just one node now in order to find a config that works).

I need the frontend to access the data which is evicted to the cachestore. Could you please suggest something based on my config?

Thanks,
7. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 11:39 AM (in response to hvico)

Edit:

I could start a frontend node with the cachestore enabled, but I had to disable fetchInMemoryState in the stateRetrieval block. The node seems to work fine for now (I have to test it further, because the eviction problems occurred after some use). I tried some queries on the frontend and it retrieves documents.

I also enabled "ignoreModifications" on the frontend, since I don't want these nodes to modify the cachestore.

Should I expect side-effects because I disabled that couple of settings in the frontend?

Thanks,
8. Re: Cache eviction with Lucene / HSearch

sannegrinovero Jan 24, 2012 12:10 PM (in response to hvico)

Hi Horacio,
No, I wouldn't expect side-effects from "ignoreModifications". So all is working fine now?

If you can, I'd suggest updating to more recent libraries; we solved a long list of bugs in the last year.

Hibernate Search 4.0.0.Final expects Infinispan 5.0.1.Final (and Hibernate Core 4), today we're releasing Infinispan 5.1.0.Final and a new version of Search will be out soon as well; 4.1.0.Alpha1 is out already and is already compatible with Infinispan 5.1. Even if you can't update your application right away, I'd appreciate if you could try it in tests as I need feedback: I plan with this release cycle to make it easier to setup clustering; let's see also if we can provide some automatic validation of the Infinispan configuration.

I hope you solved your issue.
9. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 12:23 PM (in response to sannegrinovero)

Now it is working, but I need more testing; especially, I need to watch this over time (a day or two).

Forgive my ignorance, but how would this combination of eviction and ignoreModifications=true work?

Backend and frontend nodes have eviction configured.

What happens when a node evicts an entry which is needed later? This node cannot modify the cachestore because of the ignoreModifications flag. Would it retrieve that entry from the backend? Is eviction "synchronized" over the cluster (a cluster-wide operation), so that when a frontend node evicts an entry the backend saves it in the cachestore (since the backend has the ignoreModifications flag set to false)?

Thanks,
10. Re: Cache eviction with Lucene / HSearch

sannegrinovero Jan 24, 2012 12:54 PM (in response to hvico)

No, eviction is not synchronized across the cluster, but since you have passivation=false, as soon as your backend node writes something to the cache, a copy is written into the CacheLoader and it's replicated to the other nodes to update their in-memory content; when any node needs it, it will look into the CacheLoader if it's not in the local Infinispan Cache.

When it's evicted, it's removed from the Cache, but it's not removed from the CacheStore so when it's needed again it will be reloaded again from the database.
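
The flow just described can be sketched as a toy model. This is Python, not Infinispan code: the Node class, its method names and the shared dict standing in for the database table are hypothetical, and replication is reduced to calling apply_write on every node.

```python
class Node:
    """Toy cluster node: local memory plus an optional shared cache store.

    Hypothetical illustration only; none of these names exist in Infinispan.
    """

    def __init__(self, store=None, ignore_modifications=False):
        self.memory = {}
        self.store = store  # shared dict standing in for the JDBC table; None = no loader
        self.ignore_modifications = ignore_modifications

    def apply_write(self, key, value):
        """Called on every node when a write is replicated."""
        self.memory[key] = value
        if self.store is not None and not self.ignore_modifications:
            # passivation=false: the store is written together with memory.
            self.store[key] = value

    def evict(self, key):
        # Eviction is local and touches memory only; the store keeps its copy.
        self.memory.pop(key, None)

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        if self.store is not None and key in self.store:
            # Reload the evicted entry from the store.
            self.memory[key] = self.store[key]
            return self.memory[key]
        return None  # this node concludes the key does not exist


table = {}
backend = Node(table)
frontend = Node(table, ignore_modifications=True)
no_loader = Node()  # a frontend configured without any loader

for node in (backend, frontend, no_loader):
    node.apply_write("chunk-0", b"data")

frontend.evict("chunk-0")
no_loader.evict("chunk-0")

print(frontend.get("chunk-0"))   # reloaded from the shared table
print(no_loader.get("chunk-0"))  # nothing to fall back on
```

The read-only frontend never writes to the table, yet it can still reload what it evicted because the backend wrote the entry there at write time. The node without a loader has nowhere to look once the entry leaves its memory.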
11. Re: Cache eviction with Lucene / HSearch

hvico Jan 24, 2012 1:49 PM (in response to sannegrinovero)

OK, thank you very much. I will monitor my app some days and see how it works.

About my libs and versions, I considered upgrading my application some time ago, but the compatibility matrix is too complex (JBoss AS 4, SEAM 2.x, JSF 1, Richfaces 3, Hibernate 3.x, H-Search 3.4 and its Lucene dependencies, Infinispan 4, and so on). It would be a huge project to upgrade and test everything now.

However, I am doing some research for my master's thesis, which will probably be related to these topics (full-text search and scalability across clusters), so I will be testing new releases in a couple of months.