This is a reproduction of an article written
for, and commissioned by, the Shropshire Chamber of Commerce concerning a major
email problem that affected businesses in Shropshire during the
months of January and February 2006.
E-Mail Replication - Root Causes & Lessons To Be Learned
Manic Monday
Eliminating the Chaos
Another Manic Monday
Lessons to be Learned
The Chamber's Role
Many of us Chamber members have suffered from the effects of the now infamous
email replication that caused chaos throughout Shropshire during January and
February of this year. For many people the events have been a source of extreme
frustration, and some have suffered not inconsiderable interference to their
business operations. In recognising the severity of the situation, and following
my efforts to halt the chaos, the Shropshire Chamber commissioned me to deal
with the issues arising and to consider ways of reducing the chances of such an
event happening again.
Having already spoken with a number of you most affected by the replication,
I would like to explain here the multiple events of January that combined to
create a mushroom effect, how they were handled and why, though apparently
resolved, the chaos returned again in February. Finally, I would like to
summarise the lessons that have been learned and consider actions that we can
all take to lessen the impact on ourselves should such an event happen again.
Around close of business on Monday 23rd January, 2006 a Shropshire-based
company sent out a promotional email to 630 or so addresses, placing each of those addresses into the 'To' field of the email.
What happened next is that the company's ISP mail server created the necessary
copies of the original email and relayed them out to each of the domains
specified by the 'To' addresses.
Consider now the instance of the email that is routed to the address
joe@bloggs.com, say. As it arrives at the
mail server which serves the domain bloggs.com, that server becomes responsible
for relaying the email to its user 'joe'. Even so, the email copy it handles
still has the original 'To' line (with all 630 addresses) fully intact. In
relaying the email to joe (and any other of its own internal users also
addressed by the email), the bloggs.com mail server does not - indeed, must not
- relay the email for any of its external addresses (i.e., addresses for which
it is not directly responsible).
Let us now suppose the bloggs.com email server is faulty (let's call this
Server1), and it decides not only to pass the message onto 'joe', but also to
relay it out again to every address it finds on the 'To' line. Now we have a
problem, but so far not a big one. Server1 will cause everyone to receive two
copies of the original mail instead of one. Now, add a second faulty email
server (Server2) into the mix which is also addressed by the same email.
What happens now is disastrous. Server1 incorrectly relays the email to all the
addressees, including one or more served by Server2. Server2 does exactly the
same. Each time it receives the email from Server1 it relays this message back
out to all concerned, including Server1. There is nothing to stop this process,
as Server1 and Server2 mutually replicate the email, relaying it to all 630 of
us every time.
In fact, there were three company email servers that started to
replicate the email in exactly this way.
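The mushroom effect described above can be illustrated with a small simulation. The Python sketch below is a deliberately simplified model, not a reconstruction of the actual servers: the domain names, the `simulate` function and the idea of discrete relay "rounds" are all assumptions made for illustration. A correct server only delivers to its own users; a faulty server also relays every copy it receives to each other domain named on the 'To' line.

```python
from collections import Counter


def simulate(to_addresses, faulty_domains, rounds):
    """Return how many copies each domain's addressees receive.

    Hypothetical model: one copy per domain per relay, and every relay
    takes exactly one "round". Real mail queues are not this tidy.
    """
    domains = {addr.split("@")[1] for addr in to_addresses}
    delivered = Counter()

    # Round 0: the sender's ISP hands one copy to every addressed domain.
    arriving = Counter({domain: 1 for domain in domains})

    for _ in range(rounds + 1):
        next_arriving = Counter()
        for domain, copies in arriving.items():
            # Every server delivers to its own local users.
            delivered[domain] += copies
            # A faulty server also relays each copy to all external domains.
            if domain in faulty_domains:
                for other in domains:
                    if other != domain:
                        next_arriving[other] += copies
        arriving = next_arriving
        if not arriving:
            break  # no faulty server is left holding mail, so it stops

    return dict(delivered)


if __name__ == "__main__":
    addresses = ["joe@a.example", "ann@b.example",
                 "bob@c.example", "sue@d.example"]
    for faulty in (set(), {"a.example"}, {"a.example", "b.example"},
                   {"a.example", "b.example", "c.example"}):
        print(sorted(faulty), simulate(addresses, faulty, rounds=3))
```

In this model a single faulty server merely duplicates the email for everyone else and then stops. Two faulty servers keep feeding each other indefinitely, and with three the number of copies in flight doubles every round, which is why the simulation has to be cut off after a fixed number of rounds.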
All this mushroomed out of control during the evening of Monday 23rd January,
and most people were back at home with their families blissfully unaware of
events until returning to work on the Tuesday. However, some of us were affected
early on by the replication and one person, in his frustration at events,
responded to all addressees asking for them to stop. All this achieved, however,
was to add another email into the mix, effectively doubling the number of
replications that we all were to receive.
It was late on the Monday that I started to look into what was happening, and at
first I was as frustrated as everyone else. It seemed initially that we were
being subjected to a rather large 'spam' attack and I set about attempting to
track down the location of the perpetrators. In analysing the headers of the
many messages arriving, though, what I found suggested that this wasn't a spam
attack of the traditional kind, where the perpetrators put efforts into hiding
their tracks. Instead, there had been no attempt at obfuscation, and it became
clear that all of the emails were originating from three distinct sources. Each
source was a Microsoft Small Business Server (which runs a cut-down version of
the Microsoft Exchange Server), which I further determined to be operated by
legitimate companies based in Shropshire.
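For readers curious about what this kind of header analysis involves, the following Python sketch tallies the hosts named in the 'Received' headers of a batch of messages. It is only an illustration of the general idea, not a record of the method actually used: the function name `count_relay_hosts` and the sample host names are hypothetical, and the approach assumes (as was the case here) that nobody has forged the headers.

```python
import re
from collections import Counter
from email import message_from_string


def count_relay_hosts(raw_messages):
    """Count how often each host appears in a 'Received: from ...' header."""
    tally = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Each server that handles a message adds its own Received header.
        for header in msg.get_all("Received", []):
            match = re.match(r"\s*from\s+(\S+)", header)
            if match:
                tally[match.group(1).lower()] += 1
    return tally


if __name__ == "__main__":
    # Hypothetical sample data; real headers carry more detail.
    sample = [
        "Received: from sbs1.example.org (sbs1.example.org [192.0.2.10])\n"
        "\tby mx.example.net; Mon, 23 Jan 2006 18:05:00 +0000\n"
        "Subject: Promotion\n\nBody\n",
        "Received: from sbs2.example.org (sbs2.example.org [192.0.2.20])\n"
        "\tby mx.example.net; Mon, 23 Jan 2006 18:06:00 +0000\n"
        "Subject: Promotion\n\nBody\n",
    ]
    for host, count in count_relay_hosts(sample).most_common():
        print(host, count)
```

When thousands of copies all name the same handful of hosts, those hosts stand out at the top of the tally.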
Over the following day or two, I liaised with the ISPs serving those
companies, the companies themselves, and those companies' outsourced IT support
personnel, and it became clear what was actually happening. The fault in the
email servers was a known one, and the manufacturer (Microsoft) had issued a
patch to fix the problem some two years ago. In the case of the three servers
involved, it turned out that the necessary updates had not been made.
In most cases those involved were very cooperative, and worked to resolve
the problem as quickly as they could. As a result the replication stopped, but
it took some time for the large numbers of emails to filter through recipients'
mail queues. While it is difficult to estimate exactly how many replications
were made in January, from talking with a number of people affected it seems the
replication may have created well over 1,000,000 separate copies in all. Staggering.
Monday 20th February. Aren't Mondays fun? I was as surprised as all of you
when the emails suddenly started up again in earnest. This time, another
hitherto unknown email server started to send out large numbers of the same
emails. Why, you may ask? It turns out that this server was actually involved
back in January, only it wasn't visible because, after it created the replicated emails,
they became stuck in the server's own outgoing queue and were not transmitted at the
time. The company operating the server recently changed the outsourced IT
support they were using, and the first thing the new personnel did (unaware of
the January event) was to open up the outgoing queue...
... and now, one of the servers involved back in January got involved again.
It turned out that this particular server had not in fact been patched properly.
Oh, and just to make matters worse, another (i.e., fifth) server also got
involved this time round.
Given our experiences of January, I was able to react more quickly to the
situation and liaised with those responsible once more to have the situation
resolved. This time, it seems, there were around 500,000 more replications in
all.
Will all this happen again? Probably not, but there are no guarantees.
The chaos that has happened was a combination of events, triggered mostly by
ignorance and poor IT support policies. The best we can do is to learn from the
events and act to prevent them from happening again. A number of lessons can be
learned from the root causes:
- Be careful about allowing people access to
mailing lists including large numbers of email addresses. When you do, be
sure the users of the list understand how to use the list responsibly. The
Chamber withdrew their mailing list quickly after the first outbreak in
January (more about this below).
- Do not expose multiple email addresses in the 'To'
or 'cc' fields of an email. While the original email was a catalyst for all that
has happened, the sender could not have predicted the consequences that would
ensue. That said, exposing email addresses in such a way is a violation of
privacy, and can make those email addresses subject to abuse by others. Further,
rather than use the 'bcc' field, use a 'kosher' e-mailing service instead.
- If you operate an email server, make sure you (or
your IT support personnel) verify it is configured to relay emails properly.
Ensure SMTP authentication is turned on (to prevent the server from being
used to relay spam on behalf of others), and ensure that all available manufacturer-supplied
patches are applied in a timely manner. It was the failure of IT
administrators to patch the systems they are responsible for that was the
primary cause of the chaos. While I do understand that IT administrators often
work under pressure to keep company facilities running 24/7, if you fail to take
systems down for the short time it takes to apply patches you will get bitten
sooner or later. The companies operating servers (email, or otherwise) should
also appreciate that short downtimes are a necessary part of maintenance, and
allow their IT administrators the freedom to judge when this is necessary.
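To make the second lesson above more concrete, here is a minimal Python sketch of the principle a proper mailing service works on: each recipient gets their own copy, with only their own address in the 'To' field. The function name `build_individual_messages` and the addresses are hypothetical, and the sketch only builds the messages; actually sending them (for instance with Python's `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_individual_messages(sender, recipients, subject, body):
    """Build one message per recipient so no address list is ever exposed."""
    messages = []
    for recipient in recipients:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient  # only this one recipient is visible
        msg["Subject"] = subject
        msg.set_content(body)
        messages.append(msg)
    return messages


if __name__ == "__main__":
    # Hypothetical addresses for illustration only.
    batch = build_individual_messages(
        "promotions@sender.example",
        ["joe@bloggs.example", "ann@other.example"],
        "Special offer",
        "Details of our offer...",
    )
    for message in batch:
        print(message["To"])
```

Because no copy carries the full address list, even a faulty server that relays to everything it finds on the 'To' line has only one address to work with.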
For those of us who are the victims of recent events, there is little we can
do to stop the problem, but there are actions we can take to considerably lessen
the effects:
- When you subscribe to emailing lists of any
kind, don't use your primary email address. Instead, use an address created
especially for the purpose. Then, retrieve emails arriving at that address
on a less regular basis, rather than as a routine part of your emailing
send/receive. You will then always be able to access your primary emails
without having to review or download list-based email. Most email client
software offers you the facility to manage multiple addresses (each with
different send/receive schedules), and your company mail server and/or ISP
will most likely offer a facility for you to set up multiple addresses.
- If you receive what you believe to be spam emails,
do not respond to them. If the emails are spam, all you achieve is to
confirm your email address to the spammer and almost guarantee you will receive
more. In this case, it wasn't spam emails (in the traditional sense) that were
the problem, but this advice would have prevented the replication from
mushrooming quite so fast.
- Implement effective spam filters. This is best
done at the server level - either on your company server, or your ISP's - as it
stops you having to download much spam to your computer or mobile device.
- If you still find that you are being bombarded by
spam, especially from specific senders, use your spam filters to (at least
temporarily) block emails from those defined senders. Most spam filters will
allow you to do this, and again try to do this as far 'upstream' as possible: if
your ISP or company provides you a facility to do this on their servers (most
will), do it there, as this will stop your download link becoming overburdened
and help prevent the 'quota exceeded' problem some people have experienced with
the replication.
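The sender-blocking idea in the last point above amounts to a very simple rule, sketched here in Python. Real spam filters are configured through their own settings rather than written by hand, so treat this purely as an illustration: the function name `is_blocked` and the addresses are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def is_blocked(raw_message, blocked_senders):
    """Return True if the message's 'From' address is on the block list.

    blocked_senders is expected to be a set of lower-case addresses.
    """
    msg = message_from_string(raw_message)
    _, address = parseaddr(msg.get("From", ""))
    return address.lower() in blocked_senders


if __name__ == "__main__":
    blocked = {"promotions@sender.example"}  # hypothetical block list
    raw = "From: Promotions <Promotions@sender.example>\nSubject: Offer\n\nHi\n"
    print(is_blocked(raw, blocked))
```

Applied on the mail server rather than on your own computer, a rule like this discards the unwanted copies before they count against your mailbox quota.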
The Shropshire Chamber is a victim in all this, just like the rest of us.
While it was their email list which was used to send the original email, they
did provide that list to their members in good faith, and as a service to assist
those members in creating new business. The Chamber did stop offering the list
soon after the events of January, and they did move to commission someone (me,
in this case) to address the situation once it was apparent that further action
was necessary.
Withdrawing the email list, though, does remove a facility that some
members have said they find useful, and that they would like to see
continue in some form or other. The Chamber are, therefore, looking to set up an
opt-in email list server to be hosted on the Chamber's web facilities, whereby
members can subscribe confidentially. A member wishing to promote their business
will then be able to send a single email to the list (using a password to be
supplied by the Chamber), and the list server will then take care of relaying to
the subscribers on a confidential basis.
I and the Chamber sincerely hope that this article will help you understand
the issues involved, and assist you in dealing with any repetitions
that may (but hopefully will not!) occur.
Steve Moss,
IT Consultant,
FreeYourNet.