mlange
2016-09-23 11:39:33 UTC
Recently I installed Apache ActiveMQ in a few different ways. One of those is
using ReplicatedLevelDB for a master/slave/slave setup.
Yesterday I did a bit of load testing: I sent 100,000 messages using 100
producer threads in JMeter (so each thread produced 1,000 messages). Another
process moved the messages from one broker to another and back again (the
queues have the same names on each broker, so moving them was easy). I then
processed the messages, which caused them to flow across various queues.
All seemed fine, everything looked okay... until I stopped the active
broker (this was hours after the last message had been consumed and
processed). Then I noticed a few bouncing brokers: one comes up but crashes
on an EOFException; a bit later the other broker does the same.
In the log I see many messages like this:
[quote]
2016-09-23 13:38:52,950 | WARN | No reader available for position: 0,
log_infos:
{11534500540=LogInfo(/data/activemq/broker1-db/00000002af8282bc.log,11534500540,104858130),
12163654223=LogInfo(/data/activemq/broker1-db/00000002d502a24f.log,12163654223,104858162),
12897666570=LogInfo(/data/activemq/broker1-db/0000000300c2c60a.log,12897666570,104859912),
13002526482=LogInfo(/data/activemq/broker1-db/000000030702cf12.log,13002526482,104859038),
18455209795=LogInfo(/data/activemq/broker1-db/000000044c042743.log,18455209795,104859837),
22020442500=LogInfo(/data/activemq/broker1-db/0000000520854984.log,22020442500,104859288),
23173898306=LogInfo(/data/activemq/broker1-db/000000056545a042.log,23173898306,104860684),
24641928389=LogInfo(/data/activemq/broker1-db/00000005bcc5fcc5.log,24641928389,0)}
| org.apache.activemq.leveldb.RecordLog | Thread-1039
[/quote]
Then I see messages like this:
[quote]
2016-09-23 13:38:46,324 | WARN | Invalid log position: 11409726550 |
org.apache.activemq.leveldb.LevelDBClient | ActiveMQ BrokerService[broker1]
Task-3
[/quote]
After that, the broker starts and logs a few messages like this:
[quote]
2016-09-23 13:40:49,041 | WARN | Invalid log position: 0 |
org.apache.activemq.leveldb.LevelDBClient | Thread-1040
[/quote]
and then we get an exception:
[quote]
2016-09-23 13:41:09,748 | INFO | Stopping BrokerService[broker1] due to
exception, java.io.EOFException: File
'/data/activemq/broker1-db/000000030702cf12.log' offset: 110647192 |
org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException
handler.
java.io.EOFException: File '/data/activemq/broker1-db/000000030702cf12.log'
offset: 110647192
[/quote]
This is a rinse-and-repeat situation; both surviving brokers are now
alternating through this sequence.
It seems like the load I generated caused corruption in the database, but
this should not be possible... What information can I provide to help find
out how this situation can be avoided?
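
As a starting point, the numbers in the log excerpts above can be
cross-checked against each other. The following is only a rough sketch in
Python, not how RecordLog itself works internally. It assumes the three fields
in each LogInfo(...) entry are file, start position and length in bytes, and
that the offset in the EOFException is relative to the start of that file;
neither assumption is confirmed here. The helper names find_log and
name_matches_start are made up for this illustration.

```python
import bisect

# start position -> (file name, length in bytes), copied from the log_infos
# line quoted above. Field meanings are an assumption, see the text.
LOG_INFOS = {
    11534500540: ("00000002af8282bc.log", 104858130),
    12163654223: ("00000002d502a24f.log", 104858162),
    12897666570: ("0000000300c2c60a.log", 104859912),
    13002526482: ("000000030702cf12.log", 104859038),
    18455209795: ("000000044c042743.log", 104859837),
    22020442500: ("0000000520854984.log", 104859288),
    23173898306: ("000000056545a042.log", 104860684),
    24641928389: ("00000005bcc5fcc5.log", 0),
}


def find_log(position, log_infos):
    """Return (start, name, length) of the file covering position, else None."""
    starts = sorted(log_infos)
    # Index of the last file whose start position is <= position.
    i = bisect.bisect_right(starts, position) - 1
    if i < 0:
        return None  # position lies before the first known file
    start = starts[i]
    name, length = log_infos[start]
    if position >= start + length:
        return None  # position falls past the end of that file
    return start, name, length


def name_matches_start(start, name):
    """Check whether the file name is the start position written in hex."""
    return int(name.split(".")[0], 16) == start


# The two positions reported as invalid in the WARN messages.
for pos in (0, 11409726550):
    print(pos, "->", find_log(pos, LOG_INFOS))

# The file and offset named in the EOFException.
eof_start, eof_offset = 13002526482, 110647192
eof_name, eof_length = LOG_INFOS[eof_start]
print(eof_name, "offset beyond recorded length:", eof_offset > eof_length)
```

Under these assumptions, both positions from the "Invalid log position"
warnings lie below the start of the oldest remaining file, and the offset in
the EOFException is larger than the length recorded for that file.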
--
View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-ReplicatedLevelDB-corruption-tp4716831.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.