Why PageRank is broken and how it’s being fixed

January 16, 2008
Patrick Altoft

The original PageRank algorithm looked at links and web pages through the eyes of a web surfer following the random walk model.

The surfer would visit a page, click at random on one of its links, and arrive at another page. Eventually the surfer would have visited every page on the web, and the pages they visited most often (broadly, the ones with the most inbound links, especially links from other frequently visited pages) would be deemed the most important.
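To make the random walk model concrete, here is a minimal sketch that computes where a random surfer ends up spending their time on a tiny made-up site. It uses power iteration rather than simulating individual clicks, which gives the same long-run visit frequencies. The page names and the `toy_web` graph are invented for illustration, and the damping factor of 0.85 (the chance the surfer follows a link rather than jumping to a random page) is an assumption taken from the commonly cited value, not something this post specifies.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Every link on a page is treated as equally likely to be clicked,
    which is exactly the simplification the random walk model makes.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Baseline: the surfer occasionally jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # A page with no links hands its rank to every page equally.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
            else:
                # Each outbound link gets an equal share of the page's rank.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site, for illustration only.
toy_web = {
    "home": ["news", "about"],
    "news": ["home", "story"],
    "about": ["home"],
    "story": ["home", "news"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "home" comes out on top because every other page links to it. Note that the link's position on the page plays no part in the calculation.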

Google is trying to create an algorithm that mirrors human behaviour and displays the results that real people want to see. This becomes a problem because humans don’t follow the random walk model.

Consider a person viewing a news story or blog post. How likely is it that they click a navigation link or a link in the footer compared with a link in the middle of the article? What is the chance they click on a link at the top of the post compared with a link right down at the bottom? Treating all links equally isn’t a viable way to decide which sites are most important.

Today Bill Slawski analyses a patent from Yahoo that proposes a new method of handling PageRank. The patent was filed by Yahoo, but I’m sure Google is using the exact same methods.

The document discusses how a search engine could use user data (for example toolbar or AdSense data) to see which links were clicked on most from different web pages. Links that received the most clicks are deemed to be more important and are given a higher weight. This weight can be combined with a user “satisfaction” figure determined by the amount of time the person stays on a certain page before moving on. Pages with a higher satisfaction score are deemed more important.
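As a rough sketch of the idea, the snippet below weights each link by its share of the clicks from its source page and then scales that by a satisfaction score for the page the link leads to. Everything specific here is an assumption made for illustration: the patent's actual formula is not given in this post, so multiplying the two figures, measuring dwell time on the target page, capping it at 60 seconds, and the sample click and dwell numbers are all hypothetical.

```python
def link_weights(clicks, dwell_seconds, dwell_cap=60.0):
    """Weight links by observed clicks and a dwell-time satisfaction score.

    clicks maps (source, target) to the number of observed clicks.
    dwell_seconds maps a target page to the average time spent on it.
    Assumption: weight = click share * satisfaction, where satisfaction
    is average dwell time capped at dwell_cap and scaled to 0..1.
    """
    # Total clicks leaving each source page.
    totals = {}
    for (source, _target), count in clicks.items():
        totals[source] = totals.get(source, 0) + count

    weights = {}
    for (source, target), count in clicks.items():
        click_share = count / totals[source] if totals[source] else 0.0
        dwell = min(dwell_seconds.get(target, 0.0), dwell_cap)
        satisfaction = dwell / dwell_cap
        weights[(source, target)] = click_share * satisfaction
    return weights


# Made-up click and dwell data for a single blog post.
clicks = {
    ("post", "in-article-link"): 80,
    ("post", "nav-link"): 15,
    ("post", "footer-link"): 5,
}
dwell = {"in-article-link": 45.0, "nav-link": 30.0, "footer-link": 6.0}

weights = link_weights(clicks, dwell)
for link, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(link, round(weight, 3))
```

With these invented numbers the in-article link ends up carrying far more weight than the footer link, which is the behaviour the patent is described as aiming for.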

Looking at how search engines hope to mirror human behaviour makes it very easy for webmasters to insulate themselves from algorithmic fluctuations and updates. Buying footer links for traffic is a bad idea, so why should it be a good idea to buy them for search engine rankings?

Google is only going to get better at mirroring human behaviour, so why not make sure you are ahead of the game?

