sravan
2017-06-21 18:51:53 UTC
Our batch processing applications process about 10 billion messages a day.
For the past two months we have been experiencing an issue where ActiveMQ
delivers messages very late, sometimes as much as 3 days late. Every day,
typically 1% (and in the worst case 5%) of messages are delivered a day
late. Incidentally, we do not have a message expiration policy that controls
what should happen when a message is not delivered within a certain period
of time. We monitored our Splunk-based logs and do not see any exceptions or
errors that indicate any issues with consumer connections.
We could not turn on additional logging on ActiveMQ as that would cause a
huge performance hit, so we mostly relied on monitoring the ActiveMQ
consoles and Dynatrace. There are no server resource utilization issues on
the AMQ nodes. We have 4 AMQ nodes active in the cluster. When we monitored
the ActiveMQ consoles, we sometimes saw messages stuck in the network
bridge. By stuck, I mean that message draining was extremely slow: over a
period of 2 hours I saw only a handful of messages drained.
Whenever we restart an AMQ node, any stuck messages on that node are
released (I think this is well known to AMQ users). What is most
frustrating is that the number of stuck messages we have observed does not
correlate with the number of messages delivered late (i.e. delivered the
next day). So we are under the impression that there could be invisible
stuck messages. This issue started happening after we applied a Linux patch
(redhat-release-5Server-5.11.0.9 /
autofs-5.0.1-0.rc2.186.el5_11) on the ActiveMQ nodes. We searched the forums
for any incompatibility between AMQ 5.11 and this Linux patch but could not
find anything. Does anyone here have any ideas or suggestions on how we can
troubleshoot this issue further? Any input would be greatly appreciated.
--
View this message in context: http://activemq.2283324.n4.nabble.com/Messages-are-stuck-in-ActiveMQ-5-11-and-delivered-for-after-more-than-24-hours-tp4727694.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.