The Dark Side of the Web: Unmasking Email Harvesters

Jan 2, 2024, 23:14

Richard Lowe

This article delves into the murky depths of the internet, exposing one of its most insidious threats. If you're brave enough to confront this menace, you could significantly reduce the amount of spam you receive. Prepare yourself for a journey into the world of unscrupulous spammers, their tactics for stealing your email address, and the steps you can take to protect yourself.

The Hidden Threat in Your Website Logs

If you have access to your website's log files, you might be surprised to discover that your site is visited far more frequently than you realize. More alarmingly, your HTML files could be used against you. These log files may contain traces of the tools used by unscrupulous spammers to pilfer your email addresses.

Let's take a step back and explain a few things. Every time someone visits a website, the web server keeps a record of every page, graphic, sound file, video, or other element that visitor requests. This record is known as a log file. Each line within the log file represents one "hit", which could be an image, an HTML page, a video, a sound file, or anything else.

One crucial piece of information recorded in these log files is the "user agent", which typically contains the browser name or the name of a spider (a tool used by search engines to crawl the web). For example, "googlebot" is the spider for the Google search engine.
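
To make this concrete, here is a minimal Python sketch that pulls the user agent out of a single log line. It assumes your server writes the widely used "combined" log format (as Apache and nginx commonly do); your host may use a different layout, in which case the pattern would need adjusting. The names LOG_PATTERN and parse_log_line are illustrative, not part of any standard tool.

```python
import re

# Assumed layout: the "combined" access log format
# host ident user [time] "request" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # A made-up sample line for illustration only.
    sample = (
        '192.0.2.1 - - [02/Jan/2024:23:14:00 +0000] '
        '"GET /contact.html HTTP/1.1" 200 5120 "-" '
        '"Googlebot/2.1 (+http://www.google.com/bot.html)"'
    )
    hit = parse_log_line(sample)
    print(hit["agent"])
```

Each parsed line gives you one hit, and the "agent" field is the piece to watch in the sections that follow.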

Unmasking the Culprits

Upon examining these user agent fields, you'll discover that your site is visited by a variety of entities with peculiar names:

  • Googlebot
  • Slurp (the Inktomi crawler, whose index powered several search engines including HotBot)
  • Scooter (AltaVista robot)
  • Lycos Spider (used by the Lycos search engine)
  • And many others.

Most of these are benign bots, used by major search engines to keep their indexes up to date. They play a crucial role in maintaining your site's visibility and driving traffic. However, buried within your log files, you may also find names like EmailSiphon and Cherry Picker. These are malicious bots used by spammers to harvest email addresses.
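
If you want to see how your own visitors break down, a rough tally like the following Python sketch may help. The name fragments are assumptions drawn from the bots mentioned above; the exact user agent strings these tools send vary by version, so treat the lists as a starting point to extend from your own logs rather than as a complete catalogue.

```python
from collections import Counter

# Assumed name fragments, matched case-insensitively. Real user agent
# strings differ between versions, so extend these from your own logs.
HARVESTER_NAMES = ("emailsiphon", "cherrypicker", "cherry picker")
CRAWLER_NAMES = ("googlebot", "slurp", "scooter", "lycos")


def classify_agent(agent):
    """Label a user agent string as 'harvester', 'search engine', or 'other'."""
    lowered = agent.lower()
    if any(name in lowered for name in HARVESTER_NAMES):
        return "harvester"
    if any(name in lowered for name in CRAWLER_NAMES):
        return "search engine"
    return "other"


def summarize(agents):
    """Count how many hits fall into each category."""
    return Counter(classify_agent(agent) for agent in agents)


if __name__ == "__main__":
    hits = ["Googlebot/2.1", "EmailSiphon", "Mozilla/5.0", "EmailSiphon"]
    print(summarize(hits))
```

Keep in mind that this only catches bots that announce themselves honestly; a harvester can send any user agent it likes.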

How Email Harvesters Operate

Email harvesters scan every single page on your website, looking for email addresses, particularly "mailto:" links. Such links are common on websites because they give visitors a convenient way to send email.

Email addresses are also often left in guestbooks, message boards, and other online communities, making them a goldmine for spam harvesters. A harvester can quickly and easily gather dozens, hundreds, or even thousands of valid and usable email addresses.
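
You can turn the same idea around and audit your own pages to see what a harvester would find. The Python sketch below, using only the standard library, lists the addresses exposed in a page's "mailto:" links and visible text. The class and function names are hypothetical, and the address pattern is deliberately simple, so it may miss unusual addresses.

```python
import re
from html.parser import HTMLParser

# A deliberately simple address pattern; an assumption, not a full validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ExposedAddressFinder(HTMLParser):
    """Collect addresses from mailto: links and from visible page text."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the scheme and any "?subject=..." suffix.
                self.found.add(value[len("mailto:"):].split("?")[0])

    def handle_data(self, data):
        self.found.update(EMAIL_PATTERN.findall(data))


def find_exposed_addresses(html_text):
    """Return a sorted list of the addresses visible in one HTML document."""
    finder = ExposedAddressFinder()
    finder.feed(html_text)
    finder.close()
    return sorted(finder.found)


if __name__ == "__main__":
    page = (
        '<p>Write to <a href="mailto:owner@example.com?subject=Hi">me</a> '
        'or sales@example.com.</p>'
    )
    print(find_exposed_addresses(page))
```

Running this over your own HTML files gives you a list of the addresses worth removing or protecting.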

Defending Against Email Harvesters

So, what can you do to protect yourself against these email harvesters? Here are a few strategies:

  • Ask them politely: With most "good" spiders, you can create a robots.txt file or use the robots meta tag to request that they not crawl your site. Unfortunately, email harvesters typically ignore such requests.
  • Block them: On some web servers, such as Apache, you can use special directives in a file called .htaccess to block certain robots. However, this only works for robots that clearly identify themselves, and not all web hosts allow this.
  • Confuse them: Some webmasters create pages filled with fake email addresses to trick email harvesters. While this can occasionally work, it doesn't prevent spammers from getting your other email addresses and still consumes resources.
  • Cloak your email addresses: You can make your email addresses look like something else, such as a graphic image or JavaScript code. However, these solutions can make your site harder to maintain and may eventually be outsmarted by spam harvesters.
  • Strip your site of email addresses: The most effective solution currently is to remove all email addresses from your web pages. If you need to receive information from your visitors, use a form that doesn't include your email address.
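
As an illustration of the cloaking strategy above, here is a minimal Python sketch that rewrites an address as decimal HTML character entities. Browsers display the result normally, while a harvester doing a naive text search for "@" will not see it. This is one assumed cloaking technique among several, and the function name cloak_email is hypothetical.

```python
import html


def cloak_email(address):
    """Encode every character of the address as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in address)


if __name__ == "__main__":
    cloaked = cloak_email("owner@example.com")
    print(cloaked)
    # Any tool that decodes entities gets the original address back.
    print(html.unescape(cloaked))
```

The second print shows the weakness: decoding the entities takes a single library call, which is why cloaking of this kind may eventually be outsmarted and why removing addresses altogether remains the stronger option.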

By understanding the threat of email harvesting and taking steps to protect yourself, you can fight back against this internet menace.