Google+ and the Usage Statistics in Document Retrieval patent

  • July 21, 2011
Patrick Altoft

Director of Strategy

On 24th February 2011 Google was busy with two things: rolling out the Panda update and filing a patent entitled Methods and Apparatus for Employing Usage Statistics in Document Retrieval.

This patent covers methods which could solve one of Google's biggest problems: older pages with lots of links outrank newer and more relevant pages, which have fewer links simply because they are new.

As long as Google relies on traditional links, it is always going to struggle to keep its results up to date and relevant.

The patent details how documents could be ranked based on factors such as:

  • Number of visits
  • Frequency of visits over a recent time period
  • Nature of the visit
  • Country of the visitor

It’s quite clear that a document that is getting lots of attention and visits from users in the UK should rank higher in the UK search results than a document that is getting very little attention.

Methods and apparatus consistent with the invention employ usage information to aid in organizing documents. Based purely on raw visit frequency, the documents may be organized into the following order: 610 (40 visits), 620 (30 visits), and 630 (4 visits). If these raw visit frequency numbers are refined to filter automated agents and to assign double weight to visits from Germany, the documents may be organized in the following order: 620 (effectively 40 visits, since the 10 from Germany count double), 610 (effectively 25 visits after filtering the 15 visits from automated agents), and 630 (effectively 4 visits).
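The arithmetic in that example is easy to check with a few lines of Python. This is only a rough sketch of the idea, not Google's implementation: the `Visit` record, the `automated` flag, the country codes and the helper names are all made up for illustration, and the visit breakdown is the only one that fits the quoted totals.

```python
from collections import namedtuple

# Hypothetical visit record; the patent does not define a data format.
Visit = namedtuple("Visit", ["doc_id", "country", "automated"])


def make_visits(doc_id, count, country="US", automated=False):
    """Build `count` identical visit records for one document."""
    return [Visit(doc_id, country, automated) for _ in range(count)]


# Visit log matching the quoted example:
#   610: 40 visits, 15 of them from automated agents
#   620: 30 visits, 10 of them from Germany
#   630: 4 visits
visits = (
    make_visits(610, 25) + make_visits(610, 15, automated=True)
    + make_visits(620, 20) + make_visits(620, 10, country="DE")
    + make_visits(630, 4)
)


def raw_scores(visits):
    """Count every visit once, regardless of who made it."""
    scores = {}
    for v in visits:
        scores[v.doc_id] = scores.get(v.doc_id, 0) + 1
    return scores


def refined_scores(visits, country_weights):
    """Drop automated visits and weight the rest by visitor country."""
    scores = {}
    for v in visits:
        if v.automated:
            # Keep the document in the results, but add nothing for bots.
            scores.setdefault(v.doc_id, 0.0)
            continue
        weight = country_weights.get(v.country, 1.0)
        scores[v.doc_id] = scores.get(v.doc_id, 0.0) + weight
    return scores


def rank(scores):
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


raw = raw_scores(visits)
refined = refined_scores(visits, {"DE": 2.0})  # German visits count double

print(rank(raw))      # [610, 620, 630]
print(rank(refined))  # [620, 610, 630]
```

The refined scores come out at 40 for document 620, 25 for 610 and 4 for 630, which matches the order given in the patent text.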

It’s interesting that this patent has surfaced at the same time as Google+ is gaining wider adoption, and as we learn that Google is timestamping every single click from Google+ in order to track both the frequency and the count of all outbound clicks.