Friday, October 03, 2008

Fake blogs make Blog search risky

IT World reported a comparison of rival blog search engines ("Is Google Blog Search a Techmeme killer? No way.", by Ian Lamont, October 2, 2008), so I did some ego surfing to see who said what about me. But the search returned so many scam blogs that blog searching is a risky business and not very useful.

A search for "Tom Worthington", taking out the references to my own site and other well known people of the same name (in the USA there is an attorney and a fish seller who frequently feature in news web sites), left only 122 references. Some of these were by me, others were just relays of postings from my own blog, but some were thoughtful, if not always positive, comments on my work. Some are from people I know, but most are from people I don't. Even where I know the author, I was not aware of the postings.

One worrying aspect is that about one quarter of the postings seem to be pieces of random text copied from web pages to produce fake blogs, mostly on blogspot.com. These are then used to lure people to web sites packed with dubious advertising, redirections and pop-ups. One document which seems popular with scammers is Jim Byrne's summary of the web discrimination case "Bruce Maguire versus Sydney Organising Committee for the Olympic Games (SOCOG)", in which I get a mention. It is not clear why this would be used to promote sex web sites, but perhaps the document is very popular and so useful for attracting web traffic.

The blog search engine designers need to improve their algorithms to reduce the risk of recommending fake blogs. The problem does not seem to occur with normal web searches, so a solution should not be too difficult. The blog hosting sites, particularly blogspot.com, need to put in tests for such sites. This is a serious problem: it is so likely you will end up at a dubious web site that it is not worth using a blog search at all until it is fixed.
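As a rough illustration of the kind of test a blog host or search engine might apply, the sketch below flags a posting when a large share of its five-word sequences also appear in a known page. This is only one possible approach, not how any particular search engine works. The function names, the five-word window and the 0.5 threshold are all assumptions made for the example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(candidate, original, size=5):
    """Fraction of the candidate's shingles that also occur in the original."""
    cand = shingles(candidate, size)
    if not cand:
        return 0.0
    return len(cand & shingles(original, size)) / len(cand)


def looks_scraped(candidate, known_pages, threshold=0.5, size=5):
    """True if the candidate shares at least `threshold` of its shingles
    with any one of the known pages (hypothetical cut-off)."""
    return any(
        copied_fraction(candidate, page, size) >= threshold
        for page in known_pages
    )
```

A posting which copies a page and appends a few words of advertising would score close to 1.0 against that page, while an original comment on the same topic would normally score near zero. A real system would also have to allow for legitimate quotation and syndication.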

1 comment:

James Dellow said...

I've noticed the same problem with my own content being scraped into dubious "honeypot" sites that lure in readers in order to get the online ad impressions, but I think in those cases the online ad mechanisms are as much to blame as the search.