-
60. Re: Journaling errors
clebert.suconic Aug 24, 2010 5:12 PM (in response to clebert.suconic)
I committed a fix that should resolve some of these issues. I'm still trying to reproduce the issue with your restarts (I haven't managed to yet).
I also found a leak on trunk that will affect your tests, so you should stick with 2.1.2 for now.
Or you can use Branch_2_1, which is basically 2.1.2 plus the journal fix. I have been able to run your test for long periods without any warnings so far.
I'm still testing it, though (and still waiting for some information on the properties you're using).
(I will identify the leak on trunk in the next few days, which will make trunk stable again.)
-
61. Re: Journaling errors
ronnys Aug 25, 2010 3:04 AM (in response to clebert.suconic)
Hi Clebert,
Thanks a lot. I'll do some tests with the branch today and let you know. Will send the requested information ASAP.
Best regards,
Ronny
-
62. Re: Journaling errors
clebert.suconic Aug 29, 2010 4:14 PM (in response to ronnys)
Ronny,
I have had your MultiThreadLoadTest running with flips for 24 hours now without any missing messages (# msg missing: 0), using Branch_2_1, r9607.
I do get duplicates in the test, but that's because the test does not cope well with failures, as we discussed before.
However, I want to change the test so that it copes well with failures on receiving, maybe by adding XA, etc. I will work on it around this Monday.
Thanks for all the help so far.
-
63. Re: Journaling errors
ronnys Aug 30, 2010 9:21 AM (in response to clebert.suconic)
Hi Clebert,
the new version looks extremely good. I ran the multithreaded diverts test w/o server restarts for 3h without any errors. The same test with continuous server restarts has been running for 3.5h without any server errors as well. I will keep it running overnight. Excellent job! Thanks! I hope you didn't spend your whole weekend on it.
Best regards,
Ronny
-
64. Re: Journaling errors
ronnys Sep 3, 2010 3:18 AM (in response to ronnys)
Hi Clebert,
the test ran for the past ~96h (~780 million inbound messages, ~1.56 billion outbound messages, ~3600 restarts). No issues detected. Thanks again for all your help!
Best regards,
Ronny
-
65. Re: Journaling errors
clebert.suconic Sep 3, 2010 8:23 AM (in response to ronnys)
That's great news, Ronny...
I've also added a soak test inspired by your test.
/Branch_2_1/examples/soak/tx-restarts
This test will start/stop the server, making sure no messages are lost or duplicated.
It's also playing with XA directly.
I've run the test for 12 hours with journal file size = 100K, which would make it fail much earlier, and saw no issues.
WARNING: tx-restarts is not an example of how you should handle XA. The way to go is to use a TransactionManager. I'm dealing directly with XIDs, Start, End, Prepare and Rollback only as a way to test it.
(@Ronny: I know you know that. I'm just adding the WARNING note in case someone is reading the thread and getting access to the test.)
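For readers following the thread, here is a minimal sketch of the kind of "no messages lost or duplicated" check a soak test like this can apply after a run. It is not the code in tx-restarts. The class name SoakCheck, the method check, and the assumption that the producer numbers its messages 0..sent-1 are all hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SoakCheck {

    /** Outcome of comparing the IDs that were sent against the IDs that were received. */
    public static final class Result {
        public final List<Long> missing = new ArrayList<>();
        public final List<Long> duplicates = new ArrayList<>();
    }

    /**
     * Assumption: the producer numbered its messages 0..sent-1 and the consumer
     * logged the ID of every message it received, in any order.
     */
    public static Result check(int sent, List<Long> received) {
        int[] seen = new int[sent]; // how many times each ID arrived
        for (long id : received) {
            if (id < 0 || id >= sent) {
                throw new IllegalArgumentException("unexpected id " + id);
            }
            seen[(int) id]++;
        }
        Result r = new Result();
        for (int id = 0; id < sent; id++) {
            if (seen[id] == 0) {
                r.missing.add((long) id);      // never delivered
            } else if (seen[id] > 1) {
                r.duplicates.add((long) id);   // delivered more than once
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // Five messages sent; ID 1 arrived twice, IDs 2 and 4 never arrived.
        Result r = check(5, List.of(0L, 1L, 1L, 3L));
        System.out.println("# msg missing: " + r.missing.size()
                + ", duplicates: " + r.duplicates.size());
    }
}
```

Counting arrivals per ID keeps missing messages and duplicates as separate results, which matters here because the earlier runs in this thread showed duplicates without any missing messages.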
-
66. Re: Journaling errors
clebert.suconic Sep 3, 2010 5:09 PM (in response to clebert.suconic)
And to close the thread with a golden key: I've also run the same tests with kills, and it was all good as well.
-
67. Re: Journaling errors
timfox Sep 13, 2010 5:25 AM (in response to clebert.suconic)
Clebert Suconic wrote:
That's great news Ronny...
I've also added a soak test inspired by your test.
/Branch_2_1/examples/soak/tx-restarts
Why is this not in TRUNK?
-
68. Re: Journaling errors
clebert.suconic Sep 13, 2010 9:09 AM (in response to timfox)
It is on trunk.