Mailarchive has been discontinued

I launched mailarchives.bentasker.co.uk back in 2014. As well as hosting mirrors of mailing lists such as tor-talk and cypherpunks, it also hosted mail-based notifications derived from multiple sources (RSS feeds, lists, etc.), such as my CVEs list.

However, I've taken the decision to take mailarchives.bentasker.co.uk offline - this post details the rationale behind that choice.

To explain the decision, it's first worth looking at the relevant parts of the journey up to this point.


Content Filtering

When the archive was originally set up, it was a direct mirror: anything that hit the mailing list was mirrored.

However, in February 2017, things hit a bit of a tipping point for me: a number of the mailing lists were being targeted with antisemitic and white supremacist rhetoric. At the time, I pushed a notification into the mirror:

When I originally set up a mirror of the Cyberpunks list, my intention was to mirror "as is".

That, unfortunately, needs to change because the nature of the speech happening on the list has similarly changed.

A number of posters are repeatedly posting opinions and positions that I'm very strongly opposed to, and frankly seem to be becoming all too often repeated on the net.

They are, of course, at liberty to speak their mind, just as I'm at liberty to decide that I won't mirror their Neo-Nazi white supremacist bullshit under my domain.

As of now, future messages from the following addresses will no longer be included in the mirror.

This decision was also later recorded in MAILARCHIV-1.

Essentially, the position I took was: whilst freedom of speech allows you to air your objectionable views, there is absolutely no reason that I should be required to host it under bentasker.co.uk.

Over time, additional addresses were filtered for similar things (MAILARCHIV-2, MAILARCHIV-4, MAILARCHIV-12, MAILARCHIV-13 and MAILARCHIV-14).

I tried to keep the bar for blocking high, but that also meant there was still a lot of stuff going through, and the reactive nature of the blocking only increased that.
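The filtering described above boils down to dropping messages by sender address before they reach the mirror. A minimal sketch of that idea follows; it is not the code the archive actually ran, and the `BLOCKED_SENDERS` set, the `should_mirror` name and the example address are all hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical blocklist: the addresses actually filtered are not reproduced here.
BLOCKED_SENDERS = {"blocked@example.invalid"}


def should_mirror(raw_message: str, blocked=BLOCKED_SENDERS) -> bool:
    """Return False if the message's From address is on the blocklist."""
    msg = message_from_string(raw_message)
    # parseaddr splits "Display Name <user@host>" into (name, address)
    _, address = parseaddr(msg.get("From", ""))
    return address.lower() not in blocked
```

A check like this only sees messages as they arrive, which is why blocking of this kind is inherently reactive: anything posted before an address is added has already been mirrored.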

Eventually, I reached the point that I decided the quality/value of the Cypherpunks mailing list (in particular) had dropped so far that there simply was no point in continuing to mirror it. The list had ceased to be about the laudable aims of Cypherpunks and degraded entirely into a list of conspiracy theories, racism and general shitheadedness (looking in now, it doesn't seem to have improved, with most of the same players still active).


You seem to have mentioned...

Anyone who runs a website has probably received at least one mail like this:

You seem to have written about VPNs at "your address/some/path", I recently wrote a post that I thought your visitors might find really helpful, perhaps you could link to "their site/covered/in/ads"?

Now, imagine just how many of these spammy mails you get when you're hosting arbitrary user-generated content.

It's not just keyword-related spam either, you also get mails about outbound links:

You are linking to [some domain] on your page at https://mailarchives.bentasker.co.uk/Mirrors/[some message]. Could you consider linking to [some other domain]

These come in because people link out to useful references in their mailing list contributions.

Inevitably, given the sheer number of unsolicited "suggestions" being sent, some of them slip through spam filters.


Is anyone actually using it?

The archive was always busy serving something. If you tail -f the access logs, there's a relentless march of log lines.

But closer examination shows that this activity is basically all bot-originated, consisting of:

  • Bots looking for links/keywords so they can spam you
  • Search engine spiders

A mailing list archive, by its very nature, consists of millions upon millions of small pages - so it takes a lot of requests for even a single spider to achieve full index coverage.

Because I dual-home my sites onto Tor, I get Onion search engines (such as Ahmia.fi) crawling too, increasing the load further.

Obviously, it's expected that search engine spiders will account for most page views - most humans just don't try to click through every page, they'll read a few and then go elsewhere. But, even accounting for that, the number of human visitors to the archives looks negligible.
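As a rough illustration of that kind of examination, the sketch below tallies access log lines by whether the User-Agent looks like a crawler. It assumes the common "combined" log format (User-Agent as the final quoted field), and the marker list is my own guess rather than anything exhaustive; the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed substrings often seen in crawler User-Agent strings; not exhaustive.
BOT_MARKERS = ("bot", "spider", "crawl", "slurp")

# In the combined log format the User-Agent is the last double-quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line: str) -> str:
    """Label a log line as 'bot', 'other' or 'unknown' from its User-Agent."""
    match = UA_PATTERN.search(line)
    if not match:
        return "unknown"
    user_agent = match.group(1).lower()
    return "bot" if any(marker in user_agent for marker in BOT_MARKERS) else "other"


def summarise(lines):
    """Count how many log lines fall into each label."""
    return Counter(classify(line) for line in lines)
```

Note that "other" is not the same as "human": link-hunting spam bots often present browser-like User-Agents, so a count like this can only ever be an upper bound on real visitors.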


The decision to discontinue

Ultimately, I put mailarchives.bentasker.co.uk online as a service to others, but there was no point in maintaining it if not enough people were using it.

Despite there not being many human visits, the bandwidth cost of maintaining the archive was quite high - lots of search engines x lots of pages = lots of bandwidth. Any human traffic coming into the site would likely come via a search engine, so restricting SE spiders could only be counterproductive.

Even were that not the case, maintaining the archive would also mean continuing to host examples of some pretty objectionable speech under my domain. The old adage that "bad" speech should be allowed so it can be challenged only works if there's someone challenging it, and in many of these cases there either was no dissenting voice, or it was buried under a pile-on. I'm really not interested in hosting a portrayal of the worst of humanity on an ongoing basis.

I raised an internal ticket about a month ago, laying my thoughts out, and then left it over the Christmas period to see whether my position shifted/softened.

Conclusion

Which brings us to today.

I've removed content from mailarchives.bentasker.co.uk and lxjwnnwvbp25jt3q44bcyulhkzr2e344tnkyh3qrqneukmshik3qotyd.onion - the vast majority of pages will now redirect to a page called gone.html which provides links out to more mainstream mirrors/archives.

The domains themselves will remain online, so that anything/anyone that is following links into them at least gets an explanation of where it's gone.