Thursday, March 16, 2006

Blogger Hack

I was looking through the source code of this blog when I noticed this line:
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://www.blogger.com/rsd.g?blogID=11708271" />

So I opened it up and found that the XML data actually contained a link to my website! And then... I decided to try some other plausible numbers after "blogID" in the URL, and found that I could get links to other blogs hosted on Blogger as well.
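To make the idea concrete, here is a minimal sketch of pulling the blog's address out of an RSD document. The sample XML is illustrative only: it follows the general shape of the RSD 1.0 format (a `homePageLink` element inside `service`) and is not the actual response Blogger returned. The function name `home_page_link` is my own.

```python
import xml.etree.ElementTree as ET

# Illustrative RSD document (an assumption, not a captured Blogger response).
SAMPLE_RSD = """<?xml version="1.0"?>
<rsd version="1.0" xmlns="http://archipelago.phrasewise.com/rsd">
  <service>
    <engineName>Blogger</engineName>
    <homePageLink>http://example.blogspot.com/</homePageLink>
  </service>
</rsd>"""


def home_page_link(rsd_xml):
    """Return the text of the first homePageLink element, or None."""
    root = ET.fromstring(rsd_xml)
    for element in root.iter():
        # Drop any "{namespace}" prefix so the match works with or
        # without an XML namespace on the document.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "homePageLink" and element.text:
            return element.text.strip()
    return None


print(home_page_link(SAMPLE_RSD))
```

Matching on the local tag name rather than a fixed namespace keeps the sketch tolerant of small differences between RSD producers.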

Implications:
  1. That line is inserted by Blogger.
  2. That line is very useful. For example, web crawlers (or crawlers building up a blog directory, news page, etc.) can capitalise on this information. By varying the number after "blogID", it is possible to get hold of millions of valid weblog URLs. And Blogspot is a really good place to start: each blog found this way links to other blogs in turn, so a list of blogs leads to millions more valid URLs.
  3. It means that your blog, even if you don't publicly list it or show it to anyone, can end up in the hands of any web crawler or search engine for that matter.
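As a rough illustration of point 2, a crawler could generate candidate RSD URLs by varying the blogID value. This is only a sketch under assumptions: the helper `candidate_rsd_urls` is hypothetical, walking the IDs sequentially is my own simplification, and nothing here says which IDs are actually valid. It builds the URLs without fetching anything.

```python
from urllib.parse import urlencode

# Endpoint taken from the <link rel="EditURI"> line in the blog's source.
RSD_ENDPOINT = "http://www.blogger.com/rsd.g"


def candidate_rsd_urls(start_id, count):
    """Yield RSD URLs for `count` consecutive blog IDs from `start_id`.

    Sequential IDs are an assumption made for illustration; many of the
    generated URLs may not correspond to a real blog.
    """
    for blog_id in range(start_id, start_id + count):
        yield f"{RSD_ENDPOINT}?{urlencode({'blogID': blog_id})}"


for url in candidate_rsd_urls(11708271, 3):
    print(url)
```

A real crawler would then request each URL, skip the ones that return nothing useful, and read the blog address out of the XML that comes back.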
