Email Address Obfuscation on HTML Pages

2014-03-24 13:35:31 | Post by: Yuri

Our website was released about a year ago. Our contact info is clearly visible in the footer of every page, and anyone can click it to instantly send us a message using the email client of their preference. Normally, when companies expose their email addresses like that, they end up getting a huge amount of spam at that address.

What happens is that there are special programs called spambots that navigate from page to page on the Internet, harvesting any email addresses they can find. In this fashion the owner of a spambot can collect a database of email addresses of formidable size, which can then be used for any illicit purpose, such as sending huge amounts of spam. Why they do it is a different question, since spam is not a particularly effective marketing medium, especially given that sending spam can damage the reputation of the company that advertises this way.

The trouble for a company that has its contact info exposed is not even the spam itself. The problem is that it is too easy to lose a legitimate email from a prospective customer among a large number of messages that have nothing to do with what the company does. People often use some sort of sophisticated spam filtering software to clear the spam from their mail; however, it is not always very effective and always runs the risk of removing a legitimate email from the incoming heap of messages.

However, in the time the website has been up we have not received any spam messages. Not a single one. None. How is this possible? Frankly, it is not that difficult to hide your email address from spambots on the Internet. All it takes is a bit of JavaScript to generate your email address on the page dynamically. Apparently, spambots do not execute JavaScript, since all they need is the static data stored on the page, which in our case would be an email address that in HTML looks something like:

<a href="mailto:my@email.com">My Email Here</a>

When a spambot comes across a statement like that in HTML, it knows straight away to look into the href="mailto:xxx" part of it, because it contains an email address that can be extracted. If a developer inserts the email address into HTML like that, nothing is really going to prevent it from being acquired. To deal with this, people use different tricks, like writing their email address as my.email[at]email.com or replacing the @ symbol with an image, in order to prevent the address from being harvested, at least automatically.
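To illustrate why a static mailto: link is so easy to harvest, here is a deliberately naive sketch of the extraction step. It is only an assumption about how a simple harvester might work, not a description of any real spambot, and the function name extractMailtoAddresses is hypothetical. It scans a string of raw HTML for mailto: links with a regular expression.

```javascript
// Hypothetical function name; a deliberately naive sketch of what a
// harvester might do with the raw HTML of a page.
function extractMailtoAddresses(html) {
  // Capture everything after "mailto:" up to the closing quote
  // (or a "?" that starts extra parameters such as a subject).
  var pattern = /href\s*=\s*["']mailto:([^"'?]+)/gi;
  var found = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// The static link from the example above is picked up immediately.
var staticPage = '<a href="mailto:my@email.com">My Email Here</a>';
console.log(extractMailtoAddresses(staticPage));

// An anchor with no address in its markup gives the scan nothing to find.
var emptyAnchorPage = '<a>Contact us</a>';
console.log(extractMailtoAddresses(emptyAnchorPage));
```

The first call returns an array containing my@email.com, while the second returns an empty array, because there is no address in the markup to match.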

While this works to an extent to hide the real email address from spambots, this kind of concealment is inconvenient for the people who legitimately need the address. You can't click it to compose a message in your favourite email client, and you can't simply copy and paste it either. You have to copy the parts of the address manually and put them together, restoring the missing @ sign.

So the problem is this: you need to have your email address on the page, but you can't have it in your HTML as it loads. What would be a solution? The solution, strangely enough, is quite simple: do not put the email address in the HTML code that is sent from the server, but rather include a bit of programming code that recreates it in the user's browser. Since spambots apparently read the raw HTML rather than rendering the page in a browser, they will not see the address this way. The code itself is literally a handful of lines of JavaScript:

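<script type="text/javascript">

// Keep the address in pieces so the full string never appears in the page source.
var string1 = "my.email";
var string2 = "@";
var string3 = "email.com";
var out = string1 + string2 + string3;

// The anchor must already exist when this runs, so place the script after it.
var outputId = document.getElementById("email-output");
outputId.setAttribute('href', "mailto:" + out);
outputId.innerHTML = out;

</script>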

The code is quite simple. First it creates three string variables, string1, string2, and string3, which contain the three parts of the email address: the part before the @, the at sign itself, and the domain name. The variable out contains the concatenated email address that is used in the mailto: link as well as for display on the page.

The variable outputId holds the element on the page (an anchor tag with the id email-output in our case) that should contain the email address. The href attribute of that element is then set to mailto: followed by the newly created email address, and its innerHTML, the anchor text between the opening and closing <a> tags, is set to the address itself.
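As a variation on the snippet above, the same steps can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the names assembleEmail and fillEmailLink are hypothetical and not part of the original code, the page is assumed to contain an anchor such as <a id="email-output"></a> placed before the script, and textContent is used instead of innerHTML so that the address is inserted as plain text.

```javascript
// Hypothetical helper names; not part of the original snippet.

// Build the address from its parts so the complete string
// never appears literally in the page source.
function assembleEmail(localPart, domain) {
  return localPart + "@" + domain;
}

// Set both the mailto: link target and the visible text of an anchor.
function fillEmailLink(anchor, localPart, domain) {
  var address = assembleEmail(localPart, domain);
  anchor.setAttribute("href", "mailto:" + address);
  // textContent inserts the address as plain text, not as markup.
  anchor.textContent = address;
  return anchor;
}

// In a browser, run this after the anchor exists, for example from a
// script placed at the end of <body>. The guards simply skip the step
// when there is no document or no matching element.
if (typeof document !== "undefined") {
  var link = document.getElementById("email-output");
  if (link) {
    fillEmailLink(link, "my.email", "email.com");
  }
}
```

Wrapping the logic in a function makes it easy to fill several links on one page, and the null check avoids a script error on pages that do not contain the anchor.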

That is pretty much all there is to it. This technique, while simple, has proven very reliable: we have not had a single spam email in months, even though our email address is displayed openly on our website. Of course, spammers may get smarter over time, and if this method of concealing email addresses becomes widespread they will eventually find ways around it. But for now we have had much success with it and would highly recommend it.