Thursday, January 11, 2007

A tour of the Google blacklist

I recently decided to devote a day to walking through the Google Blacklist. While some of the findings were to be expected, others proved somewhat surprising. The Google Blacklist is a listing of URLs suspected to be phishing sites. It is used by the Google Safe Browsing for Firefox extension, which is now part of the Google Toolbar for Firefox. It is also leveraged by the Firefox 2 web browser. Google maintains a number of different safe browsing lists to combat phishing, including a URL blacklist, an encoded/hashed blacklist, a URL whitelist, a domain whitelist and a sandbox text list, which contains keywords included in URLs. While Google doesn't reveal exactly how these lists are developed, it's clear that user input is an important variable, given that both the Google Toolbar and Firefox 2 allow for optional user feedback when phishing sites are encountered.

My hope was that this exercise would provide some insight into current phishing attacks and it certainly did. The blacklist is continuously updated and specific versions can be requested by including the required major:minor version in the GET request. The full listing (1:1) contained primarily outdated URLs, as 86% of the pages or sites were no longer available. While I would like to think that the existence of Google's blacklist had contributed to the demise of these sites, phishing sites tend to emerge and disappear quickly, so I suspect that this is just a natural part of the phishing cycle. I had expected to see a combination of social engineering attacks, known vulnerabilities and 0-day attacks used on the sites, with the majority falling into the first category. I was therefore somewhat surprised to find virtually all sites using straight social engineering attacks. I was also surprised to see that the top three targets (eBay, PayPal and Bank of America) accounted for 63% of the active phishing sites. One amusing finding was that Yahoo! commonly hosts pages that phish...wait for it...Yahoo! credentials. A breakdown of the full findings can be found below.

Summary of All URLs

| Category | # of URLs | % of URLs |
|---|---|---|
| Server not found | 2315 | 76.71% |
| Page unavailable (Note 1) | 283 | 9.38% |
| False positives | 79 | 2.62% |
| Active phishing sites | 341 | 11.29% |
| Total | 3018 | 100.00% |

Note 1: Includes 404 Not Found errors, 503 Service Unavailable errors, domain registrar placeholders, under construction pages, generic search pages, error messages, etc.

Summary of Active Phishing Sites

| Target | # of URLs | % of URLs |
|---|---|---|
| eBay | 80 | 23.46% |
| PayPal | 79 | 23.17% |
| Bank of America | 56 | 16.42% |
| Generic (Note 2) | 27 | 7.92% |
| Yahoo! | 21 | 6.16% |
| MySpace | 15 | 4.40% |
| Wachovia | 7 | 2.05% |
| e-gold | 4 | 1.17% |
| HSBC | 4 | 1.17% |
| Wells Fargo | 4 | 1.17% |
| Barclays Bank | 3 | 0.88% |
| Citi | 3 | 0.88% |
| Nationwide Bank | 3 | 0.88% |
| Bank of Scotland | 2 | 0.59% |
| Development Bank | 2 | 0.59% |
| Geocities | 2 | 0.59% |
| Hotmail | 2 | 0.59% |
| IRS | 2 | 0.59% |
| Orkut | 2 | 0.59% |
| Others | 23 | 6.73% |
| Total | 341 | 100.00% |

Note 2: Includes pages with web forms designed to harvest email addresses, social security numbers etc., but the sites were generic in nature and did not target any one company. Most were sites promising financial advice in exchange for providing personal information.
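For anyone who wants to check the two headline figures quoted in the introduction (86% unavailable, 63% for the top three targets), the arithmetic can be reproduced from the counts in the two tables. This is only a quick sketch; the variable names are my own and nothing here comes from Google's tooling.

```python
# Counts copied from the "Summary of All URLs" table.
all_urls = {
    "Server not found": 2315,
    "Page unavailable": 283,
    "False positives": 79,
    "Active phishing sites": 341,
}
total = sum(all_urls.values())

# URLs that no longer served a page: unresolvable servers plus unavailable pages.
unavailable = all_urls["Server not found"] + all_urls["Page unavailable"]
unavailable_share = 100 * unavailable / total

# Counts copied from the "Summary of Active Phishing Sites" table.
top_three = {"eBay": 80, "PayPal": 79, "Bank of America": 56}
top_three_share = 100 * sum(top_three.values()) / all_urls["Active phishing sites"]

print(f"No longer available: {unavailable_share:.1f}%")
print(f"Top three targets:   {top_three_share:.1f}%")
```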

In almost all cases, these phishing sites are fake login pages, password reset pages or various other web forms that require the user to input sensitive data such as a password, credit card number or social security number. Surprisingly, of the 341 active phishing pages that I looked at, only one attempted to use a known vulnerability. All others simply employed social engineering attacks. The pages are generally exact replicas of the original web page and typically pull graphics (*.jpg, *.gif, etc.) from the legitimate web site.
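Because these pages tend to pull their graphics straight from the legitimate site, one rough heuristic is to list the hosts a page loads images from and compare them with the host serving the page. The sketch below is not how I reviewed the sites (that was done by hand); the class and function names, and the example host `paypal.evilsite.example`, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ImageHostCollector(HTMLParser):
    """Collects the hostnames that a page's <img> tags load from."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL before reading the host.
            host = urlsplit(urljoin(self.page_url, src)).hostname
            if host:
                self.hosts.add(host)


def foreign_image_hosts(page_url, html_text):
    """Return image hosts that differ from the host serving the page."""
    collector = ImageHostCollector(page_url)
    collector.feed(html_text)
    return collector.hosts - {urlsplit(page_url).hostname}


sample = '<img src="http://www.paypal.com/images/logo.gif"><img src="local.gif">'
print(foreign_image_hosts("http://paypal.evilsite.example/login.html", sample))
```

Plenty of legitimate pages load images from other hosts, so a hit here is only a hint that a page deserves a closer look.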

Free Web Hosting Sites

Phishers are apparently cheap, as many utilize free hosting sites such as Geocities, Tripod and FreeSpaces to host their phishing sites. While the hosting providers are catching some of these sites, they're clearly not working hard enough, as several remained active. Somewhat amusing is that Yahoo! Geocities is commonly used to host pages designed to harvest Yahoo! login IDs. Why Yahoo! can't catch phishing pages that it hosts itself is beyond me. Below are a handful of such pages which remained active as of this posting:

* http://uk.geocities.com/hardtimekennel/
* http://uk.geocities.com/sexy_pa_4u/
* http://uk.geocities.com/jenny_for_play_2/
* http://uk.geocities.com/lilac_rain111/
* http://uk.geocities.com/pics_for_you_0nly16/
* http://www.geocities.com/l0ts_0f_laffs.com52/
* http://www.geocities.com/sweet_blond3s_1nc/

All of the aforementioned sites appear to be set up by the same group as they all forward login credentials to http://www2.fiberbit.net/form/mailto.cgi. However, it is likely that they were set up at different times as they exhibit different properties. Some display an error page after a failed login while others redirect the user to a porn site or alternate Yahoo! page. Some also utilize JavaScript to prevent the user from accessing the right-click context menu, presumably in an attempt to hinder viewing the web page source code. Others also HTML encode the action attribute.
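HTML encoding the action attribute only hides the destination from someone skimming the raw source; any HTML parser decodes it. As a small illustration (the helper names are my own, and the encoded form is generated here rather than copied from one of the pages), Python's standard `html.parser` hands back attribute values with character references already resolved:

```python
from html.parser import HTMLParser

# The collection endpoint named above, used as the value to hide.
target = "http://www2.fiberbit.net/form/mailto.cgi"

# Encode every character as a numeric character reference, e.g. "h" -> "&#104;".
encoded = "".join(f"&#{ord(ch)};" for ch in target)
page = f'<form method="post" action="{encoded}"><input name="login"></form>'


class FormActionCollector(HTMLParser):
    """Collects the action attribute of every <form> tag."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            action = dict(attrs).get("action")
            if action is not None:
                # The parser has already decoded character references here.
                self.actions.append(action)


collector = FormActionCollector()
collector.feed(page)
print(collector.actions)
```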

URL Obfuscation

The majority of active phishing sites took minimal steps to obfuscate the URL. The most common practice was to simply prefix the attacker URL with the targeted domain name (e.g. paypal.evilsite.com). Obviously even a casual inspection of the targeted URL would arouse suspicion but sadly this simple attack appears to fool many users. Other sites utilized IP addresses as opposed to fully qualified domain names and others went a step further by using decimal or hexadecimal forms of the IP addresses which are properly interpreted by most browsers. Hexadecimal encoding of at least portions of the URL itself was also a popular technique.
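To make these tricks concrete, here is a minimal sketch that undoes three of them: a target name used as a subdomain prefix, an IP address written as a single decimal or hexadecimal number, and percent-encoded (hexadecimal) URL text. The function names are hypothetical, and the subdomain check naively assumes a two-label registered domain, so it would misjudge names under suffixes such as .co.uk.

```python
import ipaddress
from urllib.parse import unquote, urlsplit


def decode_ip_host(host):
    """Return dotted-quad form if host is an IPv4 address written in
    decimal or hex (whole or per-octet), otherwise None."""
    try:
        nums = [int(part, 0) for part in host.split(".")]
    except ValueError:
        return None  # not purely numeric, e.g. a normal domain name
    if len(nums) == 1 and 0 <= nums[0] < 2**32:
        return str(ipaddress.IPv4Address(nums[0]))
    if len(nums) == 4 and all(0 <= n <= 255 for n in nums):
        return ".".join(str(n) for n in nums)
    return None


def brand_in_subdomain(url, brand):
    """True if brand appears as a label left of the last two labels."""
    labels = (urlsplit(url).hostname or "").split(".")
    return brand.lower() in labels[:-2]


# 81.113.212.146 packed into one 32-bit number, then written both ways.
packed = (81 << 24) | (113 << 16) | (212 << 8) | 146
print(decode_ip_host(str(packed)))   # decimal form
print(decode_ip_host(hex(packed)))   # hexadecimal form
print(brand_in_subdomain("http://paypal.evilsite.com/login", "paypal"))
print(unquote("%70%61%79%70%61%6c"))  # percent-encoded "paypal"
```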

Cybersquatting

Cybersquatting, although far from a new technique, appears to remain in vogue. The unfortunate part is that domain registrars are allowing these names to be registered in the first place. In the sampling of domain names below, it is clear that these were registered with questionable intentions.

* mail-yahoo-us.tk
* wellsfargo-newupdate.com
* e-wachovia.org
* wachovia-bank.org
* wachovia24.net
* wachovia24.org
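Names like the ones in the list above are easy to flag mechanically, which is partly why it is frustrating that they get registered at all. A naive sketch, assuming a hand-maintained table of brand keywords and their official domains (the table and the function name are mine, purely for illustration):

```python
# Hypothetical lookup table: brand keyword -> official domain.
OFFICIAL_DOMAINS = {
    "yahoo": "yahoo.com",
    "wellsfargo": "wellsfargo.com",
    "wachovia": "wachovia.com",
}


def squatting_hits(domain):
    """Return brand keywords that appear in a domain that is neither the
    brand's official domain nor a subdomain of it."""
    domain = domain.lower().rstrip(".")
    hits = []
    for brand, official in OFFICIAL_DOMAINS.items():
        is_official = domain == official or domain.endswith("." + official)
        if brand in domain and not is_official:
            hits.append(brand)
    return hits


for name in ["mail-yahoo-us.tk", "wellsfargo-newupdate.com", "e-wachovia.org",
             "wachovia-bank.org", "wachovia24.net", "wachovia24.org"]:
    print(name, squatting_hits(name))
```

A plain substring match like this will also catch innocent names that happen to contain a brand, so it is a starting point for review rather than a verdict.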

Multiple Scams, One Domain

A single domain is commonly used for multiple phishing scams targeting different companies, as noted below:

* Domain: mujweb.cz
* Targets:
  o eBay
  o PayPal
  o Wachovia

[Update: 01.05.07 - A couple of people have kindly pointed out that mujweb.cz is a free web-hosting service in the Czech Republic and this situation would be more comparable to phishing sites hosted on Geocities as discussed in the 'Free Web Hosting Sites' section.]

* Domain: 81.113.212.146
* Targets:
  o eBay
  o Barclays Bank
  o Citi Bank

URL Redirection

Another surprising finding was that few of the phishing scams utilized open URL redirectors. This is a known technique whereby phishers identify redirection functionality at a popular website (e.g. Google) and use that functionality to redirect the victim to the targeted phishing site in order to minimize suspicion. Combing through the blacklist did however reveal the following redirection attack using Google AdWords:

http://www.google.com/pagead/iclk?sa=l&ai=x&adurl=http://www.spidynamics.com

I've seen this technique mentioned in past threads. However, one thing that I noticed which I had not seen before is that the 'ai' parameter can actually be any value rather than the 50+ character unique string that is typically used by AdWords. This would naturally make for a more compact, natural looking URL.
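Spotting this pattern in a list of URLs is straightforward: look for a query parameter whose value is itself an absolute URL pointing at a different host. The sketch below does that with the standard library; the function name is hypothetical, and it only covers redirect targets passed in the query string.

```python
from urllib.parse import parse_qsl, urlsplit


def redirect_targets(url):
    """Return (parameter, host) pairs for query values that are absolute
    http(s) URLs on a different host than the outer URL."""
    outer = urlsplit(url)
    targets = []
    for name, value in parse_qsl(outer.query, keep_blank_values=True):
        inner = urlsplit(value)
        if (inner.scheme in ("http", "https")
                and inner.hostname
                and inner.hostname != outer.hostname):
            targets.append((name, inner.hostname))
    return targets


example = "http://www.google.com/pagead/iclk?sa=l&ai=x&adurl=http://www.spidynamics.com"
print(redirect_targets(example))
```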

DOM Manipulation

Once again, few phishing sites did anything beyond present a reasonable forgery of a legitimate web page. Some did however perform basic DOM manipulation, such as one PayPal phishing site, which uses JavaScript to create a fake address bar at the top of the page, displaying a PayPal URL. The fake address bar is not displayed in Firefox 2.0, but IE 6 and 7 will both display two address bars, the real and the fake.

Conclusions

Based on all of the sites that I looked at, the majority of phishing scams are less sophisticated than I had predicted. This is however somewhat concerning as simple attacks must still be working and attackers have not been forced to upgrade their skills in order to make a profit. Phishing is an attack that will be around for some time and there is no magic bullet to stop it. Many parties must work together to reduce phishing attacks and that simply isn't happening. When domain name registrars permit cybersquatting and popular websites allow open URL redirection, it is clear that we have a long way to go.
