In their never-ending pursuit to evade spam filters, malevolent mailers have deployed a number of techniques to obfuscate their true intent. One of those techniques is using facsimiles of legitimate brand names in Web addresses to redirect victims to outlaw Internet sites where the scammers can work their mischief on their targets, a practice called spoofing.
Spoofing Web addresses isn’t a new tool in the phishing community’s black bag of tricks, but the introduction of internationalized domain names (IDNs) has fertilized the poison fruit of the scam artists.
An IDN is a domain name that contains characters outside the basic ASCII set of unaccented Latin letters, digits and hyphens. The rationale behind the idea is a noble one. It’s intended to broaden the appeal of the Web by allowing domains to be registered in the alphabet of their native countries. However, some of those letters are indistinguishable from Latin characters when displayed in a URL, which allows phishers to create phony Web addresses that look exactly like the Real McCoy.
Before IDNs, clever Black Hats played the homograph game by taking advantage of typographic foibles. In some fonts, for instance, zeros (00) look like o’s (OO) so spoofing a domain like google.com as g00gle.com could reap rewards. But with IDNs, there may be no discernible difference to the eye between the two Google addresses, yet one is real, whilst the other is bogus with letters from a foreign alphabet that perfectly emulate the Latin one.
To the user, a Cyrillic ‘o’ looks exactly like a Latin ‘o’ because most fonts don’t make a distinction between the two characters when they display them. A computer, though, does make such a distinction when it processes the character string as a URL.
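The distinction is easy to see once the characters are inspected by code point rather than by eye. The following is a minimal Python sketch, not part of any browser or mail filter; the helper name describe and the spoofed string are illustrative assumptions.

```python
import unicodedata

latin_o = "o"          # U+006F
cyrillic_o = "\u043e"  # U+043E, drawn like a Latin o in most fonts

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

real = "google.com"
# Same visible shape, but both o's are swapped for the Cyrillic letter.
fake = "g" + cyrillic_o * 2 + "gle.com"

print(describe(latin_o))     # Latin letter
print(describe(cyrillic_o))  # Cyrillic letter
print(real == fake)          # the strings are different to a computer
```

Both strings have the same length and, in most fonts, the same appearance, yet a string comparison treats them as two different addresses.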
Inside an email, the spoofed address will look legitimate, but when rubes click the URL, they’re taken to a straw site where their personal information can be filched or their computer infected with malware. They’re then passed along to the genuine Web address, often unaware that they’ve just been electronically mugged.
The IDN dodge has been known for quite some time. In 2002, for example, two computer science students, Evgeniy Gabrilovich and Alex Gontmakher, at Technion, the Israel Institute of Technology, illustrated how letters from the Cyrillic alphabet that look exactly like ‘o’ and ‘e’ could be used to register a look-alike of the domain name bloomberg.com. Better yet, they used the same technique to spoof microsoft.com. One can only imagine how many unfortunates might be fleeced of their personal information by spoofing that URL.
A number of foreign alphabets lend themselves to homograph spoofing.
One of the best is Cyrillic. That’s because it contains 11 characters that mimic or closely mimic the letters a, c, h, e, i, j, o, p, s, x and y.
In Greek, only lowercase omicron mirrors a Latin letter, o. But that changes if uppercase letters are used. Then the letters A, B, E, H, I, K, M, N, O, P, T, X, Y and Z all have Greek twins.
Armenian has six letters that look like g, h, n, o, q and u. Two other letters may slip by as a j or p, depending on the display font. However, while Cyrillic and Greek are supported by most standard fonts, Armenian isn’t. So in Windows, for instance, Armenian is displayed in a separate font, Sylfaen, which supports the language. That means any Armenian characters mixed in with Latin ones will be as conspicuous as a Turk at a Viking convention.
Although it’s rare, Hebrew can be used for spoofing, too. Three members of its alphabet look very similar to o, i and n.
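Because these spoofs usually mix a few foreign letters into an otherwise Latin name, one rough way to flag them is to check whether a single domain label draws on more than one alphabet. The sketch below is only an approximation under a stated assumption: Python’s standard library doesn’t expose the Unicode script property, so it uses the first word of each character’s Unicode name (LATIN, CYRILLIC, GREEK, ARMENIAN, HEBREW) as a stand-in. The function names are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Return the rough set of alphabets used by the letters in a label.

    Assumption: the first word of a character's Unicode name is treated
    as its script, which is a heuristic rather than the real property.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label):
    """True when the letters of a label come from more than one alphabet."""
    return len(scripts_in(label)) > 1

print(is_mixed_script("google"))            # all Latin
print(is_mixed_script("g\u043e\u043egle"))  # Latin with Cyrillic o's
```

A check like this says nothing about a name written entirely in one foreign alphabet, so it can only catch the mixed variety of spoof.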
When IDNs first began to appear, the popular browsers were practically defenseless against homograph spoofs. Now there are a number of defense strategies that can be used to thwart the practice.
The most extreme is to turn off IDN support. That may block access to IDN sites, but most likely the browser will just display the foreign URL in Punycode. Punycode converts the non-ASCII characters in an address to an ASCII equivalent. The result looks a little weird. If a URL has a part that begins with xn-- in it, chances are it’s in Punycode.
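To see what that conversion does to a spoofed name, here is a short Python sketch that leans on the interpreter’s built-in idna codec, which implements the older IDNA 2003 rules. The helper names to_ascii and has_punycode_label are illustrative assumptions, not browser functions.

```python
def to_ascii(hostname):
    """Convert a hostname to its ASCII (Punycode) form, label by label."""
    return hostname.encode("idna").decode("ascii")

def has_punycode_label(hostname):
    """True when any dot-separated label carries the xn-- prefix."""
    return any(label.lower().startswith("xn--")
               for label in hostname.split("."))

spoof = "g\u043e\u043egle.com"  # Latin g, two Cyrillic o's, Latin gle
ascii_form = to_ascii(spoof)

print(ascii_form)                       # no longer resembles google.com
print(has_punycode_label(ascii_form))   # the xn-- prefix gives it away
print(to_ascii("google.com"))           # a pure ASCII name is unchanged
```

Once the name is shown in its ASCII form, the look-alike stops looking alike, which is the whole point of falling back to Punycode.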
By default, Firefox and Opera will display Punycode for IDNs, unless the Top Level Domain (TLD) is one that counters homograph attacks by restricting the characters that can be used in a domain name. They also allow TLDs to be manually added to a whitelist.
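The logic of that policy can be sketched in a few lines of Python. This is only an illustration of the idea, not the browsers’ actual code: the set TRUSTED_TLDS and its entries are hypothetical, and the function name display_form is an assumption.

```python
# Hypothetical whitelist; the entries are placeholders, not a real browser list.
TRUSTED_TLDS = {"de", "jp"}

def display_form(hostname, trusted_tlds=TRUSTED_TLDS):
    """Pick the form of a hostname to show in the address bar.

    Names under a whitelisted TLD are shown in their native characters;
    everything else is shown in its ASCII (Punycode) form.
    """
    ascii_name = hostname.encode("idna").decode("ascii")
    tld = ascii_name.rsplit(".", 1)[-1].lower()
    if tld in trusted_tlds:
        return ascii_name.encode("ascii").decode("idna")
    return ascii_name

print(display_form("b\u00fccher.de"))        # trusted TLD, native form
print(display_form("g\u043e\u043egle.com"))  # untrusted TLD, Punycode
```

Adding a TLD to the set corresponds to the manual whitelisting step: it tells the sketch to trust that registry’s own character restrictions.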
Another way to counter homograph spoofing is to turn on the anti-phishing feature found in the major browsers. It alerts users when they’re about to access a dubious Web site.