
    Issues faced during clustered caching

    pranavk

      Hi,

       

      Just wanted to point to some observations while testing clustered caching. The cluster includes 2 machines running Teiid with the same VDB deployed, which contains a view model mapped to a single JDBC source (no joins/federation). The view model is deployed with internal materialization switched on. The source contains about 10MN (million) rows and 14 columns.
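
      (For reference, a minimal sketch of the kind of view definition involved, expressed as DDL with internal materialization switched on; the names below are placeholders, not the actual VDB:)

      CREATE VIEW viewmodel (
        id integer,
        col2 string
        -- ...remaining columns, 14 in total
      ) OPTIONS (MATERIALIZED 'TRUE')
      AS
      SELECT id, col2 FROM srcmodel.src_table;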

      When I query node1 (say, select * from viewmodel), I see that the buffer folder gets filled up on node1 (for cached entries), and then the cache sync takes place with node2. Now, while the syncing is taking place, if the network gets cut (buffer folder sync stops), after a while node1 stops attempting to sync and serves the results of the query. Now, when I run the same query on node2 (select * from viewmodel), it is not able to return any results (it returns neither fresh results nor results from the partially synced cache buffer).

       

      Apart from this, I also noticed that when node1 starts to return its results (after the network break and the failure to sync with node2), it was returning a random record count at times. For example, for a 10MN-row table, it sometimes returned 9.87MN, 9.82MN, 8.79MN, and at times even the correct 10MN, on the first run where the network break took place. But on each subsequent run (fetching from cache), it always returned 10MN records.

       

      This seems a fairly plausible scenario in a clustered environment, so I thought I would open a discussion about why these two issues could be occurring.

       

      Thanks,

      Pranav

        • 1. Re: Issues faced during clustered caching
          rareddy

          Pranav,

           

          In the first case, did "node2" restart, or was it a case of network failure? It looks like we need to check how Teiid tries to re-sync. The second case

           

          Apart from this, I also noticed that when node1 starts to return its results (after the network break and the failure to sync with node2), it was returning a random record count at times. For example, for a 10MN-row table, it sometimes returned 9.87MN, 9.82MN, 8.79MN, and at times even the correct 10MN, on the first run where the network break took place. But on each subsequent run (fetching from cache), it always returned 10MN records.

          sounds a little confusing to me. Are you saying that the first time the results were fetched they were 10M, and then on subsequent runs they were different counts? Or does this behavior occur *only* after the network sync failure?

           

          Ramesh..

          • 2. Re: Issues faced during clustered caching
            pranavk

            Hi Ramesh,

            >In the first case, did "node2" restart, or was it a case of network failure?

            It was a case of network failure. The network comes back up after, say, 5 minutes; until then node1 runs its usual course and returns the results (as the initial query was to node1), while querying node2 with the same query leads to a hang and no results are ever returned. Node2 should probably re-sync the cache using the cached results from node1 (the code checks the state of the mat tables as FAILED_LOAD but was not able to sync), OR retrieve fresh results from the source in the worst case.
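
            (By the way, the load state I am referring to can be checked on each node through the SYSADMIN.MatViews system table; a quick sketch, using the placeholder view name from the original post:)

            SELECT Name, Valid, LoadState, Updated, Cardinality
            FROM SYSADMIN.MatViews
            WHERE Name = 'viewmodel';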

             

             

            > sounds a little confusing to me. Are you saying that the first time the results were fetched they were 10M, and then on subsequent runs they were different counts? Or does this behavior occur *only* after the network sync failure?

            This scenario is only related to the query on node1, against which the initial query was fired. I am talking about the case where the network failure happened while node1 and node2 were syncing, and node1 finally starts to return the result of the query (when it is not able to sync). In this scenario, I saw a random count of records being returned (on that particular run where the network failure happened). On any subsequent query to node1 (which now has the results cached for itself), it returns the correct count of 10MN.

            • 3. Re: Issues faced during clustered caching
              pranavk

              For the first case, it seems that jgroups is not able to re-establish communication properly (something the code expects it to be able to do), hence the query hangs and no results are returned. Some of the messages from jgroups that appeared on the console (node1):

              WARN  [TCP] null: no physical address for NODE2-30937, dropping message

              WARN  [TCP] null: no physical address for Node-A-43526, dropping message

              WARN  [TCP] null: no physical address for Node-A-43526, dropping message

              WARN  [FD] I was suspected by NODE2-30937; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [FD] I was suspected by Node-A-43526; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [FD] I was suspected by NODE2-30937; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [FD] I was suspected by Node-A-43526; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [NAKACK] [JGRP00011] Node-A-61391: dropped message 4 from non-member Node-A-43526 (view=[Node-A-61391|4] [Node-A-61391])

              WARN  [NAKACK] [JGRP00011] NODE1-34424: dropped message 4 from non-member NODE2-30937 (view=[NODE1-34424|4] [NODE1-34424])

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::5 not found in retransmission table of Node-A-61391:

              [6 (6)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::5 not found in retransmission table of Node-A-61391:

              [6 (6)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::5 not found in retransmission table of Node-A-61391:

              [6 (6)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::5 not found in retransmission table of Node-A-61391:

              [6 (6)]

              WARN  [FD] I was suspected by NODE2-30937; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [FD] I was suspected by Node-A-43526; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [FD] I was suspected by Node-A-43526; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK

              WARN  [TCP] null: no physical address for Node-A-43526, dropping message

              WARN  [TCP] null: no physical address for NODE2-30937, dropping message

              WARN  [TCP] null: no physical address for NODE2-30937, dropping message

              WARN  [TCP] null: no physical address for NODE2-30937, dropping message

              WARN  [TCP] null: no physical address for Node-A-43526, dropping message

              WARN  [Merger] NODE1-34424: java.lang.Exception: did not get any merge responses from partition coordinators, merge is cancelled

              WARN  [TCP] null: no physical address for NODE2-30937, dropping message

              WARN  [NAKACK] [JGRP00011] Node-A-61391: dropped message 8 from non-member Node-A-43526 (view=[Node-A-61391|9] [Node-A-61391])

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::7 not found in retransmission table of Node-A-61391:

              [8 (8)]

              WARN  [TCPConnectionMap] [JGRP00006] failed accepting connection from peer: java.net.SocketTimeoutException: Read timed out

              WARN  [NAKACK] (requester=NODE2-30937, local_addr=NODE1-34424) message NODE1-34424::8 not found in retransmission table of NODE1-34424:

              [9 (9)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::9 not found in retransmission table of Node-A-61391:

              [10 (10)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::9 not found in retransmission table of Node-A-61391:

              [10 (10)]

              WARN  [NAKACK] (requester=NODE2-30937, local_addr=NODE1-34424) message NODE1-34424::8 not found in retransmission table of NODE1-34424:

              [9 (9)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::9 not found in retransmission table of Node-A-61391:

              [10 (10)]

              WARN  [NAKACK] (requester=NODE2-30937, local_addr=NODE1-34424) message NODE1-34424::8 not found in retransmission table of NODE1-34424:

              [9 (9)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::9 not found in retransmission table of Node-A-61391:

              [10 (10)]

              WARN  [NAKACK] (requester=Node-A-43526, local_addr=Node-A-61391) message Node-A-61391::9 not found in retransmission table of Node-A-61391:

              INFO   | jvm 1    | 2015/12/30 14:54:54 | [10 (10)]

              • 4. Re: Issues faced during clustered caching
                shawkins

                Is there a teiid context log related to the network failure, or is it only these messages at the jgroups context?

                • 5. Re: Issues faced during clustered caching
                  shawkins

                  > Now, when I run the same query on node2 (select * from viewmodel), it is not able to return any results (it returns neither fresh results nor results from the partially synced cache buffer).

                   

                  I am able to reproduce this behavior.  Node2 is determining that it should not reload the matview based upon the state on node1, and is effectively waiting for node1 to reload.  This is obviously not desirable if node1 is not actually in a loading state.
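
                  (In the meantime, if node2 needs to be unblocked manually, one thing that might be worth trying there is forcing a reload via the SYSADMIN procedure; just a sketch with a placeholder schema.view name, and it may well run into the same wait:)

                  EXEC SYSADMIN.refreshMatView('viewschema.viewmodel', false);  -- 'false' keeps any existing contents valid while the reload runs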

                   

                  I'm still working on reproducing the other behavior.  It seems odd in that if there is any error in the transfer, the whole thing should be invalidated.

                  • 6. Re: Issues faced during clustered caching
                    pranavk

                    The following is the error that shows up on node2 (the node that is syncing/pulling the cached data from node1):

                     

                      [Asynch Worker1] ERROR org.teiid.RUNTIME  - TEIID40101 error setting state #MAT_MV10.TABLE_1

                    org.teiid.core.TeiidRuntimeException: TEIID30222 java.io.IOException: java.util.concurrent.TimeoutException

                      at org.teiid.query.tempdata.GlobalTableStoreImpl.setState(GlobalTableStoreImpl.java:597)

                      at org.teiid.query.tempdata.GlobalTableStoreImpl.setState(GlobalTableStoreImpl.java:86)

                      at org.teiid.replication.jgroups.JGroupsObjectReplicator$StreamingRunner.run(JGroupsObjectReplicator.java:234)

                      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

                      at java.util.concurrent.FutureTask.run(FutureTask.java:266)

                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

                      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

                      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

                      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

                      at java.lang.Thread.run(Thread.java:745)

                    Caused by: java.io.IOException: java.util.concurrent.TimeoutException

                      at org.teiid.replication.jgroups.JGroupsInputStream.read(JGroupsInputStream.java:57)

                      at java.io.InputStream.read(InputStream.java:170)

                      at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2313)

                      at java.io.ObjectInputStream$BlockDataInputStream.refill(ObjectInputStream.java:2537)

                      at java.io.ObjectInputStream$BlockDataInputStream.read(ObjectInputStream.java:2618)

                      at java.io.DataInputStream.readInt(DataInputStream.java:387)

                      at java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2823)

                      at java.io.ObjectInputStream.readInt(ObjectInputStream.java:972)

                      at org.teiid.client.BatchSerializer$IntColumnSerializer.readObject(BatchSerializer.java:567)

                      at org.teiid.client.BatchSerializer$ColumnSerializer.readColumn(BatchSerializer.java:543)

                      at org.teiid.client.BatchSerializer.readBatch(BatchSerializer.java:924)

                      at org.teiid.common.buffer.STree.readValuesFrom(STree.java:186)

                      at org.teiid.query.tempdata.TempTable.readFrom(TempTable.java:828)

                      at org.teiid.query.tempdata.GlobalTableStoreImpl.loadTable(GlobalTableStoreImpl.java:625)

                      at org.teiid.query.tempdata.GlobalTableStoreImpl.setState(GlobalTableStoreImpl.java:592)

                      ... 9 more

                    Caused by: java.util.concurrent.TimeoutException

                      ... 24 more

                    • 7. Re: Issues faced during clustered caching
                      pranavk

                      > I'm still working on reproducing the other behavior.  It seems odd in that if there is any error in the transfer, the whole thing should be invalidated.

                       

                      Are you saying that if there is any error in the transfer of cached data (cache sync) from node1 to node2, the whole data should be invalidated on node1 too?

                      I feel the current behavior in this context is correct: node1 still returns the query results even if it sees that the transfer to node2 has failed. The issue I am pointing to here is that node1 at times returns an arbitrary number of records, a bit less than the total. Any subsequent query to node1 (which I believe now serves from the cache/buffer) always returns the correct number of records.


                      As far as invalidation is concerned, it should happen in the case of node2, where the state of the transfer is only partial.

                      • 8. Re: Issues faced during clustered caching
                        shawkins

                        > Are you saying that if there is any error in the transfer of cached data (cache sync) from node1 to node2, the whole data should be invalidated on node1 too?

                         

                        No, I'm saying that the node2 state should be invalid. I misread and thought you were indicating that node2 was returning inconsistent results.

                         

                        > The issue I am pointing to here is that node1 at times returns an arbitrary number of records, a bit less than the total. Any subsequent query to node1 (which I believe now serves from the cache/buffer) always returns the correct number of records.

                         

                        I'm not sure what you are seeing there.  Does this ever happen when there is a single node or no errors?  Are the source results always consistent in row count?  Are any updates being performed against the materialization table?

                        • 9. Re: Issues faced during clustered caching
                          pranavk

                          > I'm not sure what you are seeing there.  Does this ever happen when there is a single node or no errors?  Are the source results always consistent in row count?  Are any updates being performed against the materialization table?

                           

                          No, I haven't noticed these issues in the case of a single node, or in cases where no network break has occurred yet.

                          And no updates are being performed on the materialization table. I'll see if I can narrow down the occurrence of this second issue further.
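
                          (One way I can try to narrow it down, sketched with the placeholder view name: on the run where the break happens, compare the count served from the materialized table against a count that bypasses it with OPTION NOCACHE.)

                          SELECT COUNT(*) FROM viewmodel;                           -- served from the internal materialized table
                          SELECT COUNT(*) FROM viewmodel OPTION NOCACHE viewmodel;  -- bypasses the materialized table and computes from the source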

                           

                          Thanks

                          Pranav

                          • 10. Re: Issues faced during clustered caching
                            shawkins

                            I've logged TEIID-3879 (Error during state transfer prevents a node from loading later) in the JBoss Issue Tracker to capture the first issue.

                             

                            I'll keep trying to reproduce the second one as well.

                             

                            Thanks,

                            Steve

                            • 11. Re: Issues faced during clustered caching
                              shawkins

                              TEIID-3879 will be addressed for 8.12.4 and 8.13, which should both be released in the next two weeks.

                              • 12. Re: Issues faced during clustered caching
                                pranavk

                                Thanks Steve!