July 23, 2015 – 10:48 am
by Grant Keller
Starting at 10am this morning, we began noticing a performance-impacting issue on our IMAP cluster. We are currently diagnosing the problem and hope to have the system back up and running as soon as possible. In the meantime, mail retrieval may be affected on local clients as well as webmail.
Update: We’ve traced the problem back to high CPU utilization on one of the head units in one of the two NFS filer clusters used to store mail spools. At this time, it isn’t entirely clear what is causing the problem. We’re working to trace it to the source and determine what options we have to resolve the situation. In the meantime, POP, IMAP and Webmail services may have intermittent availability problems. No stored mail, or inbound mail, is being deferred or lost. -Kelsey, William and the rest of the SOC.
Update: All services have been back to normal since 11:45AM. The problem filer is still showing elevated usage but is no longer running one of the CPUs at 100%. We still do not know the root cause but suspect it may have been one or more mail server processes stuck in a byzantine failure. We’re continuing to investigate and monitor the situation and have contacted our vendor for assistance. -Kelsey and William