digital.forest Technical Support
Mail loop causes delivery delays

Tonight, a little before midnight, a client created a forwarding mail loop back to themselves via an external address. A single message queued on the server "treehouse" went out via our mail hub to an external mail address, which then forwarded it back via Postini and into treehouse. Note that this loop is asynchronous, which prevented the built-in mail loop detection features from stopping it.
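
For readers unfamiliar with loop detection: one common technique is to count the "Received" headers on a message, since each relay normally adds one, and to refuse anything over a hop limit. The Python below is only a rough sketch of that general idea. The function names and the threshold of 25 are illustrative assumptions, not the configuration of treehouse or the mail hub. A check like this depends on every hop preserving the headers, so it can miss a loop in which an intermediate system hands the message back as a fresh one.

```python
from email import message_from_string

# Hypothetical hop limit, chosen for illustration only.
MAX_HOPS = 25


def hop_count(raw_message: str) -> int:
    """Return the number of Received headers found on the message."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", []))


def looks_like_loop(raw_message: str, max_hops: int = MAX_HOPS) -> bool:
    """Treat a message that has passed through too many relays as a loop."""
    return hop_count(raw_message) > max_hops
```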

Within seconds, this loop started clogging our outbound SMTP queues, and it was finally detected by our monitoring systems as the disks of our mail servers began to fill.

It took us well over an hour to get this under control, and we had to stop processing mail for several minutes at a time. Because it was a forwarding loop, the message grew in size on every pass, so we ended up with tens of thousands of looped messages, each queued on multiple servers here. We were able to delete them from the SMTP queue on treehouse, but not on our outbound spam/virus filtering mail hub, due to limitations of that device's software.

The loop will have lingering effects, and we're taking the following steps to mitigate them:

* We have removed the filtering mail hub from its primary task of handling all of our outbound mail. It will take some time to unload its outbound and inbound (bounces) queues.

* We have configured our mail servers to relay mail directly outbound. This will slow normal delivery because we cannot filter outbound mail, and any pollution of the mail stream with spam (usually via forwards to external addresses) may cause remote servers to temporarily reject our mail via "greylisting".

* We have scoured the queues for copies of this looping message and deleted them. Some are inevitably still "out there" on external servers, so we have created filters to reject them.

* We will contact the client who created the mail loop and explain to them how NOT to do that in the future.
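
As an illustration of the kind of reject filter mentioned in the list above, here is a minimal Python sketch. It assumes the looping message can be recognized by its Message-ID header, which may not hold for every copy, and the ID shown is a made-up placeholder. The names `BLOCKED_MESSAGE_IDS` and `should_reject` are hypothetical and do not reflect the actual filters on our servers.

```python
from email import message_from_string

# Placeholder identifier for illustration; not the real looping message.
BLOCKED_MESSAGE_IDS = {"<looping-message@example.invalid>"}


def should_reject(raw_message: str, blocked=BLOCKED_MESSAGE_IDS) -> bool:
    """Return True if the message's Message-ID is on the block list."""
    msg = message_from_string(raw_message)
    # A message with no Message-ID header is never matched.
    message_id = (msg.get("Message-ID") or "").strip()
    return message_id in blocked
```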

We apologize for any inconvenience this may cause you. NO MAIL WAS LOST during this event, but we do expect that delivery will be delayed throughout the day today as queues clear out. There is no way for us to prioritize some mail over other mail; delivery will depend on the willingness of remote servers to accept our mail in a timely fashion as the queues for external domains rotate to the head of the line.

In the "good old days" before the spam problem, this issue was solved via automated technical means, but now the ubiquitous deployment of spam filtering technologies has complicated the environment significantly. We took every possible measure to detect and correct this issue before it escalated into an actual outage or crash of the servers involved. We must now ask for your patience as the resulting backlog clears itself.

Please note: For obvious reasons, I have elected not to use the email notification system of this blog. If you rely on email to be notified of digital.forest support updates, I suggest switching to an RSS reader. The link for doing so is just over to the right side of this page. -->

Regards,
Chuck Goolsbee
VP, Technical Operations
digital.forest, Inc.

posted by Chuck G. at 01:41 AM on Thursday, March 8, 2007
Categories: Mail