CEAS 2007 Live Spam Challenge
Competition Guidelines
Version 1.0
Overview
The CEAS Live Spam Challenge is an online anti-spam filtering
competition in which anti-spam filters are tested on a live stream of
spam and ham messages. Each filter will be asked to process a live
e-mail stream and label each incoming message as spam or ham. The
competition environment has been designed to allow most existing
anti-spam systems to be used essentially out of the box by accurately
simulating typical e-mail installations.
Competition Model
Each participant will be assigned a subdomain of ceas-challenge.cc to
filter. The competition e-mail stream will be multiplexed to each
participating filter such that each filter receives an essentially
identical set of messages. The messages received by each filter will
only differ in that e-mail addresses will be rewritten to match the
subdomain assigned to the destination filter.
The test e-mail stream will be collected from several production SMTP
servers. The collection process records the original SMTP envelope
and the original message contents and relays that data to the
Competition Controller. The Competition Controller modifies the
message to appear to be addressed to each contestant's subdomain and
then relays it to each contestant as appropriate. The relayed message
is identical to what the anti-spam filter would have received if the
message had actually been sent to the perimeter server for the
simulated domain and relayed to the anti-spam filter. The SMTP "MAIL
From" and "RCPT To" addresses will be presented to each filter as they
were presented to the capturing server. Exactly one Received header
will be added to the captured message. The "from part" of the
Received header will record the SMTP "HELO" and connection information
using a Sendmail-style received line. For example,
Received: from spammer.bulkmail.com (openrelay.com [1.2.3.4])
by ceas-challenge.cc (8.13.1/8.13.1)
with ESMTP ID l68FDvK031975; Mon, 16 Jul 2007 11:13:57 -0400
indicates a HELO of "spammer.bulkmail.com", a connecting IP address of
1.2.3.4, and the results of a reverse DNS check on 1.2.3.4 that
returned "openrelay.com". The first Received line can safely be used
for IP blacklisting or SPF checks.
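The fields described above can be pulled out of the first Received
header with a short script. The following is a minimal sketch in
Python, not part of the official competition tooling. The function
name parse_first_received and the regular expression are illustrative
assumptions, and the pattern assumes the Sendmail-style layout
"from HELO (rdns [ip])" with a parenthesis before the reverse DNS name.

```python
import re

# Assumed layout of the "from part" of a Sendmail-style Received header:
#   Received: from <HELO> (<reverse-DNS name> [<IP>])
# The reverse-DNS name is treated as optional, since it may be absent
# when the lookup fails (an assumption, not stated in the guidelines).
RECEIVED_FROM = re.compile(
    r"^Received:\s+from\s+(?P<helo>\S+)\s+"
    r"\((?P<rdns>[^\s\[\]]+)?\s*\[(?P<ip>[^\]]+)\]\)"
)


def parse_first_received(header):
    """Return (helo, rdns, ip) for a Received header, or None if it does not match."""
    # Unfold continuation lines so the header is a single line.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", header.strip())
    match = RECEIVED_FROM.match(unfolded)
    if match is None:
        return None
    return match.group("helo"), match.group("rdns"), match.group("ip")


example = (
    "Received: from spammer.bulkmail.com (openrelay.com [1.2.3.4])\n"
    "\tby ceas-challenge.cc (8.13.1/8.13.1)\n"
    "\twith ESMTP ID l68FDvK031975; Mon, 16 Jul 2007 11:13:57 -0400"
)

helo, rdns, ip = parse_first_received(example)
print(helo, rdns, ip)
```

The extracted IP address is the value a filter would pass to its own
blacklist or SPF lookup; how that lookup is performed is left to each
filter.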
Most messages will be delivered within minutes of their original
receipt. The test stream may contain some previously recorded
messages. Recorded messages will be updated to contain appropriate
dates, IP addresses, and DKIM signatures so that most widely known
anti-spam algorithms will behave correctly.
Competition Rules
- The competition will occur on August 2nd. The start time will be
12:15pm PST and the competition will last for 24 hours.
- The first hour of the competition will be used to train learning
filters and will not be scored.
- All contestants must register to participate in the competition.
- Each group can enter up to two anti-spam filters.
- All contestants must sign and return the contestant agreement to
participate in the competition. All signed agreements must be
received by July 31, 2007.
- Filters can be located anywhere on the internet.
- Filters must not send bounce messages, challenge-response messages,
or make any attempt to contact the message originator. Failure to
adhere to this requirement may result in disqualification.
- Filters can use any resource available to make filtering decisions.
This includes both public and private resources, as well as teams
of human labelers. We ask that each contestant report what
resources they use, especially any human labelers.
- Filters should not update any public resources such as a shared
signature database. Any updates made to public databases may be
available to competing filters before they are required to
classify the message. Any contestant found to have intentionally
tampered with a public resource will be disqualified.
- Filters will be given one minute to classify each message. If a
response is not received in one minute, the message will be scored
using the filter's default response.
- The first valid response received from a filter will be scored.
Subsequent responses will be dropped.
- Invalid responses will be discarded. If no valid response is
received in the one-minute time limit, the message will be scored
using the filter's default response.
- Contestants are responsible for ensuring the continuous operation
of their anti-spam filter. The one-minute timeout will be
strictly enforced.
- Learning-based filters can (and should) be pre-trained with any or
all public and private data available to the contestant.
- Feedback on all messages will be provided for the first hour of
the competition. Thereafter, feedback will be provided for a
subset of the e-mail stream.
- Feedback will be sent immediately for the first hour of the
competition. Thereafter, feedback will be delayed to simulate
real-user behavior. The delay may range from minutes to hours.
- Feedback will contain the official judgment for a message. There
will be no attempt to simulate user labeling errors.
- Filters will be evaluated using the lam() metric. Lam()
calculates the average of a filter's false-negative and
false-positive rates, but performs the calculation in logit space.
The exact formula used for the competition will be:
FPrate = (#ham-errors + 0.5) / (#ham + 0.5)
FNrate = (#spam-errors + 0.5) / (#spam + 0.5)
lam = invlogit((logit(FPrate) + logit(FNrate)) / 2)
where
logit(p) = log(p/(1-p))
invlogit(x) = (e^x)/(1 + e^x)
The winner will be the filter with the lowest lam() score.
- All practical measures will be taken to ensure the accuracy of the
official message judgments. We reserve the right to change the
official judgment of a message or remove a message from the
competition as needed. Any reasonable disagreement on whether a
message is spam will cause a message to be dropped.
- Feedback for incorrectly judged messages may be sent to all
competing anti-spam filters before the error can be corrected.
Participating filters are responsible for smoothly handling
any erroneous feedback received.
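
The lam() formula given in the rules above can be computed directly.
The following is a minimal sketch in Python that transcribes the
stated formula. The function and parameter names are illustrative
assumptions, and this is not the official scoring code.

```python
import math


def logit(p):
    """logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def invlogit(x):
    """invlogit(x) = e^x / (1 + e^x)."""
    return math.exp(x) / (1.0 + math.exp(x))


def lam(ham_errors, ham_total, spam_errors, spam_total):
    """Logistic average misclassification rate, as written in the rules.

    ham_errors:  ham messages labeled as spam (false positives)
    spam_errors: spam messages labeled as ham (false negatives)

    Note: with the 0.5 smoothing as stated, a filter that misclassifies
    every message of one class gets a rate of exactly 1, for which
    logit is undefined; this sketch does not handle that case.
    """
    fp_rate = (ham_errors + 0.5) / (ham_total + 0.5)
    fn_rate = (spam_errors + 0.5) / (spam_total + 0.5)
    return invlogit((logit(fp_rate) + logit(fn_rate)) / 2.0)


# Hypothetical counts, chosen only to illustrate the call.
score = lam(ham_errors=1, ham_total=1000, spam_errors=20, spam_total=4000)
print(score)
```

Because the average is taken in logit space, the score always lies
between the two smoothed error rates, and it equals them when they are
identical.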