Since I moved a couple of my websites to a new server, I’ve been taking a closer look at the logs.  I was surprised to see that my marga.org site (my personal site, which mostly holds my recipes, food blog and restaurant reviews) is getting a healthy 6,500 hits a day. Yay!  A closer look at the logs, however, paints a less rosy picture.  About a quarter to a third of my hits are from search engine robots, and probably about as many again come from spammers.
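For anyone who wants to run the same kind of check, here is a minimal Python sketch of how the robot share of a log could be estimated.  It assumes the server writes the common "combined" log format, where the user agent is the last quoted field, and the `BOT_HINTS` keywords are my own illustrative guesses rather than a complete list of crawlers.

```python
import re

# Assumption: "combined" log format, which ends with
#   "request" status bytes "referrer" "user-agent"
LINE_RE = re.compile(
    r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)

# Hypothetical keyword list; real crawlers identify themselves in many ways.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def is_robot(agent):
    """Guess whether a user-agent string belongs to a search engine robot."""
    agent = agent.lower()
    return any(hint in agent for hint in BOT_HINTS)


def robot_share(lines):
    """Return the fraction of parseable log lines whose user agent looks like a robot."""
    total = 0
    robots = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        total += 1
        if is_robot(match.group("agent")):
            robots += 1
    return robots / total if total else 0.0
```

Calling `robot_share(open("access.log"))` on a log file (the file name is just a placeholder) gives a rough fraction.  Robots that pretend to be ordinary browsers will not be counted, so the real share is probably higher.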

Spam traffic falls into two main categories: referrer spammers and comment spammers.  Referrer spam robots hit your website repeatedly while pretending to come from a given site, so that the site appears high in your stats file as a “referrer”.  If your web stats file is public (mine is password protected), it will be spidered by Google, and the listed referrers will count as links from your site to theirs.  That helps the referrer appear higher in Google searches.

It’s amazing just how many of these junk referrer sites there are.  I’ve only been blocking them for a week, and then only the top junk referrers to my sites, and I already have 80 sites blocked by my .htaccess file, in addition to all referrer websites from .ru (Russia) and .pl (Poland).  I anticipate that for every junk referrer I block, another will take its place at the top of my referrer stats.  I’m not sure there is anything I can do about this beyond manually blocking them.  Google, on the other hand, could just stop indexing stats files and make this problem moot.
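The blocking logic itself is simple enough to sketch in Python.  This is not the actual .htaccess rule set, just an illustration of the same decision: refuse a hit if its referrer is on a hand-maintained list or comes from a blocked country domain.  The host name in `BLOCKED_HOSTS` is a made-up placeholder, and the function name is hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical placeholder; a real list would hold the junk referrers seen in the stats.
BLOCKED_HOSTS = {"junk-referrer.example"}

# Country domains blocked wholesale, as described above.
BLOCKED_TLDS = (".ru", ".pl")


def is_blocked_referrer(referrer):
    """Return True if the referrer URL points at a blocked host or country domain."""
    host = (urlparse(referrer).hostname or "").lower()
    if not host:
        return False  # empty or "-" referrer: nothing to block on
    if host.endswith(BLOCKED_TLDS):
        return True
    # Match the listed host itself and any of its subdomains.
    return any(host == blocked or host.endswith("." + blocked)
               for blocked in BLOCKED_HOSTS)
```

Run over the referrer column of a log, a check like this would also show how many hits the current list catches, and which unblocked referrers are next in line.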

Comment spam is significantly less annoying now that I have moved to WordPress as my blog software.  Comment spam is just comments left on blog posts whose main purpose is to link to the spammer’s site.  WordPress has a very useful plugin called Akismet, which identifies and blocks most of the comment spam.  It’s amazingly accurate.