Relax greylisting with automatically created whitelists of reputable MTAs; share those whitelists with peer MTAs.

The idea

The idea behind the application is to automatically identify reputable SMTP hosts and whitelist them, i.e. let them skip greylisting. The greylisting delay is the major shortcoming of an otherwise very effective anti-spam technique: long delays (1 hour) frustrate users, up to "where is my mail?" support calls, while short delays can be overcome by spammers (and a short delay may easily turn into a long one if the sending MTA does not retry promptly).

Unlike greylisting implementations that delay mail from a given MTA only once, automatic whitelists use more reliable criteria. To get whitelisted, an MTA has to:

  • successfully transmit mail multiple times (i.e. retry, assuming greylisting is installed)
  • be known for some time (days or weeks)

This policy favors established, well-known MTAs and makes it possible to set longer greylisting delays for unknown ones. (Some greylisting implementations, e.g. postgrey, auto-whitelist clients after e.g. 5 successful transactions. The effect of P2PWL is similar, except that whitelists are shared.)
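
The two criteria above can be sketched as a small check. This is an illustrative Python sketch, not P2PWL's own code: the thresholds MIN_TRANSACTIONS and MIN_AGE and the shape of the input (a list of timestamps of successful transactions per client IP) are assumptions, since the text only says "multiple times" and "days or weeks".

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; the text only says "multiple times" and "days or weeks".
MIN_TRANSACTIONS = 5
MIN_AGE = timedelta(days=7)


def is_reputable(seen, now):
    """Return True if an MTA meets both criteria.

    seen: datetimes of successful SMTP transactions from one MTA.
    now:  the moment the whitelist is compiled.
    """
    if len(seen) < MIN_TRANSACTIONS:
        return False  # has not delivered mail often enough
    return now - min(seen) >= MIN_AGE  # first seen long enough ago


def build_whitelist(history, now):
    """history maps client IP -> list of datetimes; returns sorted whitelisted IPs."""
    return sorted(ip for ip, seen in history.items() if is_reputable(seen, now))
```

An MTA that retried six times within the last few hours passes the first test but fails the second, so it stays greylisted until it has been known long enough.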

The resulting whitelists are shared among mutually trusting MTAs over HTTP. Ideally, an MTA recognized by one peer will not be greylisted by another peer that trusts it.

Note: you may also use the P2PWL scripts to derive whitelists from your existing postfix logs, to soften the effects of deploying greylisting.

The agenda

The agenda behind P2PWL is to build a Web of Trust involving every notable MTA worldwide. In theory, this might be achieved if every participating MTA maintains forward and back cones ("whom I trust" and "who trusts me", respectively) of size O(N^0.5) each, where N is the number of MTAs worldwide (so if N = 10^6, each participant needs lists of size ~1000, next to nothing). Such a global system will also need a sophisticated reputation calculation model to prevent abuse; see the stages below.
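
One way to read the square-root estimate, offered here as an interpretation rather than the author's stated derivation: if the two cones behaved like independent random samples of k MTAs each out of N, a sender's "who trusts me" cone and a receiver's "whom I trust" cone would share at least one MTA with probability of about 1 - exp(-k^2/N), which becomes substantial once k reaches the square root of N. The Python sketch below computes that probability exactly under this random-sampling assumption.

```python
import math


def overlap_probability(n, k):
    """Probability that two independent random k-subsets of n items intersect."""
    disjoint = 1.0
    for i in range(k):
        # The (i+1)-th member of the second set must avoid all k members of the first.
        disjoint *= (n - k - i) / (n - i)
    return 1.0 - disjoint


n = 10**6
k = math.isqrt(n)                         # 1000, the list size quoted above
p_one = overlap_probability(n, k)         # cones of size sqrt(N)
p_three = overlap_probability(n, 3 * k)   # cones three times larger
```

Under this assumption, cones of exactly sqrt(N) entries overlap in roughly 63% of cases, and tripling the cone size pushes that above 99.9%. Real trust relations are not random samples, so this is only an order-of-magnitude argument.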

The code

The latest version is always available (3kB). At this moment, the only supported MTA is postfix.

How to install

First, unpack the tarball. Code and configuration are expected to reside under $P2PWL, which defaults to /usr/local/p2pwl, so you may simply move p2pwl to /usr/local/. All data is expected to reside in $P2PWL_DATA, which defaults to /var/mail/p2pwl. Create that directory and make sure it is accessible to the user you will run the scripts as.

You may also need to install some Perl modules from CPAN, such as HTTP::Daemon and Date::Manip; use perl -MCPAN -e shell. Next, configure the data inputs and outputs. (P2PWL has two inputs and two outputs.)

Configuring inputs

SMTP transaction logs

...by parsing postfix logs

Make a special logrotate entry for postfix logs, such as

/var/log/maillog {
  daily
  prerotate
    cat /var/log/maillog | /usr/local/p2pwl/bin/postfix2trans > \
       /var/mail/p2pwl/history.`date +%d%b%Y`
  endscript
  sharedscripts
  postrotate
    /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` \
      2> /dev/null || true
  endscript
}
Thus, postfix logs are translated into transaction logs. These logs are stored in $P2PWL_DATA.

Note: most probably, you will need to tweak postfix2trans code to match your postfix configuration. Check whether cat /var/log/maillog | $P2PWL/bin/postfix2trans produces anything.
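
As a rough illustration of what the translation step involves, the Python sketch below extracts client addresses from postfix smtpd log lines of the form "QUEUEID: client=host[ip]", which smtpd typically writes when it assigns a queue ID to an incoming message. It is not the postfix2trans script itself: the output record (timestamp, client IP) is a hypothetical stand-in for the trans format, which is not documented here, and the sample log line in the comment is made up.

```python
import re

# Matches smtpd lines such as (made-up example):
#   Jan 12 10:15:01 mx postfix/smtpd[1234]: 4F1A2B3C4D: client=mail.example.com[192.0.2.10]
CLIENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s\S+\spostfix/smtpd\[\d+\]:\s"
    r"[0-9A-Za-z]+:\sclient=(?P<host>[^\[\s]+)\[(?P<ip>[^\]]+)\]"
)


def parse_transactions(lines):
    """Yield (timestamp, client_ip) for every matching smtpd line in the log."""
    for line in lines:
        match = CLIENT_RE.match(line)
        if match:
            yield match.group("stamp"), match.group("ip")
```

If your syslog prefix or postfix service name differs (for example a custom syslog_name), the pattern has to be adjusted, which is the same kind of tweaking the note above refers to.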

...or by logging transactions from postgrey

This variant is a bit easier than the previous one. If you are using postgrey, you may patch it to log transactions in trans format.

One may ask why we don't move all the logic into postgrey. That is implementable, but undesirable. First, we don't need to update this information in real time, so it is better not to place it inside the SMTP transaction processing code (for reasons of reliability, response time and fault tolerance). Second, whitelists are accessed by wl-server, so it is better to keep them in immutable standalone files, not in a concurrently accessed database. Third, a standalone postfix lookup table is simply cheaper. Fourth, such an approach contradicts the "simple parts connected by clean interfaces" engineering pattern and thus leads to complexity. Enough.

Whitelists imported from peer MTAs

List peer MTAs in $P2PWL/etc/peers.conf, one line per entry. Add $P2PWL/bin/check-peers to crontab, like this:
0  5  *  *  *  /usr/local/p2pwl/bin/check-peers
	
If your peer MTAs run whitelist HTTP servers, fresh versions of their whitelists will be stored in $P2PWL_DATA.
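
The import step amounts to one HTTP GET per peer. The Python sketch below is an assumption-laden illustration, not the check-peers script: it assumes that each line of peers.conf holds a bare host name, that blank lines and # comments are skipped, that the whitelist URL follows the http://host:24024/whitelist pattern shown under "Shared whitelists", and that each download is saved under a hypothetical import.<host> file name in $P2PWL_DATA.

```python
import os
import urllib.request

PORT = 24024  # port used in the whitelist URL in this document


def peer_urls(conf_text):
    """Map each peer host listed in peers.conf text to its whitelist URL."""
    urls = {}
    for line in conf_text.splitlines():
        host = line.split("#", 1)[0].strip()  # assumed comment syntax
        if host:
            urls[host] = f"http://{host}:{PORT}/whitelist"
    return urls


def fetch_peers(conf_text, data_dir, timeout=30):
    """Download every peer whitelist; a failing peer is skipped, not fatal."""
    for host, url in peer_urls(conf_text).items():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                body = response.read()
        except OSError as err:  # URLError and socket timeouts derive from OSError
            print(f"skipping {host}: {err}")
            continue
        # Hypothetical file name, chosen to match the import* pattern.
        with open(os.path.join(data_dir, f"import.{host}"), "wb") as out:
            out.write(body)
```

Skipping an unreachable peer instead of aborting keeps one broken peer from blocking the nightly import of all the others.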

Configuring outputs

Whitelists for own MTA

$P2PWL/bin/compile-wl merges the whitelist derived from your own logs ($P2PWL_DATA/my-compliant) with the imported whitelists ($P2PWL_DATA/import*) to create the resulting whitelist for postfix ($P2PWL_DATA/compliant-mtas). Add compile-wl to your crontab and add the resulting access map to your postfix configuration just before greylisting, e.g.

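# The P2PWL whitelist lookup comes first, then the greylisting policy service.
# (main.cf treats "#" as a comment only at the start of a line.)
smtpd_recipient_restrictions = ... ,
    check_client_access hash:/var/mail/p2pwl/compliant-mtas,
    check_policy_service inet:127.0.0.1:10023,
    permit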
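
The merge itself is essentially a set union. The Python sketch below shows the idea, not the compile-wl script: it assumes every input whitelist is a text file with one client IP address per line, and it emits lines in postfix access(5) table form, "address OK". A hash: table additionally has to be indexed with postmap before postfix can read it; whether compile-wl does that for you is not stated here.

```python
import glob
import os


def read_addresses(path):
    """Return the set of non-empty, non-comment lines of one whitelist file."""
    with open(path) as handle:
        lines = (line.strip() for line in handle)
        return {line for line in lines if line and not line.startswith("#")}


def compile_whitelist(data_dir):
    """Merge my-compliant and import* into access-map text for postfix."""
    sources = [os.path.join(data_dir, "my-compliant")]
    sources += sorted(glob.glob(os.path.join(data_dir, "import*")))
    merged = set()
    for path in sources:
        if os.path.exists(path):
            merged |= read_addresses(path)
    # "OK" is the access(5) action that accepts the client, so the
    # greylisting policy service listed after it is never consulted.
    return "".join(f"{address} OK\n" for address in sorted(merged))
```

Writing the result to a temporary file and renaming it over compliant-mtas would keep postfix from ever seeing a half-written map.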

Shared whitelists

Execute $P2PWL/bin/wl-server; it will start an HTTP server that sends out your whitelist in response to any request for http://yourmta.com:24024/whitelist.
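
For orientation, a whitelist server of this kind fits in a few lines. The Python sketch below is a stand-in for wl-server, not its source: it serves one file at /whitelist, and the function names, the plain-text content type and the 404 reply for other paths are all assumptions.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(whitelist_path):
    """Build a request handler that serves the file at whitelist_path."""

    class WhitelistHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/whitelist":
                self.send_error(404)
                return
            # Re-read on every request so a freshly compiled list is picked up.
            with open(whitelist_path, "rb") as handle:
                body = handle.read()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; a real server would log requests

    return WhitelistHandler


def make_server(whitelist_path, port=24024):
    """Create (but do not start) the HTTP server; call serve_forever() on it."""
    return HTTPServer(("", port), make_handler(whitelist_path))
```

Calling make_server(path).serve_forever() with the path of the file you want to share (for instance $P2PWL_DATA/my-compliant, assuming that is the list being exported) would answer peers on port 24024.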

Done!

TODO, recent changes

TODO

Recent changes

Stages of P2PWL development

Stage I. Plain whitelists

The only raw information involved is MTA logs reflecting successful SMTP transactions. Well-known MTAs are whitelisted. Peers exchange plain whitelists, like this (scrambled one). An administrator manually lists all peers to get whitelists from.

Status: DONE

Stage II. Multilevel (transitive) lists

Let whitelist entries propagate via the network of P2PWL-enabled MTAs. (Include imported entries into the exported whitelist.)

Status: PLANNING

Stage III. Reputation model, weighted lists

to be written

Status: HAVE SOME THEORY