Saturday, September 27, 2008

Google quality rating and what it means

Google's philosophy has been oriented toward high-quality search results from the beginning. This is evident from the original papers published by Page and Brin, The Anatomy of a Large-Scale Hypertextual Web Search Engine and The PageRank Citation Ranking: Bringing Order to the Web. They define high quality, basically, as the results that the user would want to get. The various algorithms and tweaks, beginning with Google PageRank, are all intended to provide the highest quality results.
 
Not surprisingly, Google maintains an army of human quality raters who evaluate the results of searches. Enterprising bloggers found a confidential document that describes the rating criteria for Google quality raters and put it on the Web for a while. This provoked a lot of comment based on the mistaken notion that the criteria reflect what the Google algorithm actually does. That is not the case. At best, they reflect what Google wants its algorithm to do. In brief, Google wants the algorithm to carry out the intention of the surfer. Pages that are retrieved are rated for relevance to the search query. No attempt is made to determine whether the information in those pages is correct or reliable, other than the criterion of "authoritative" citations. If other people think it is right, or if the page cites "authorities," then it must be right, according to Google. But the important point is that the algorithm doesn't necessarily carry out the desires of the management. If you design a page according to the quality handbook, it will not necessarily get a high ranking in Google.
 
In fact, we do not know that the ratings are related in any way to the position of the pages retrieved by Google. Evidently, the raters don't know either. They are presented with a lot of information about a page retrieved for a query, but that information doesn't include the position of the page in the results returned by Google or the PageRank of the page retrieved (though raters can usually find both). There is no attempt, at least not by the raters, to determine whether the page returned as number 10 in the search is better or worse than the page returned as number 1, or whether the first 10 pages are better or worse than the next 10 pages.
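To make the missing comparison concrete, here is a minimal sketch of how someone holding both the rater scores and the result positions could check whether the two are related. It uses Spearman rank correlation, which is my choice for illustration and not anything described in the rater document. The function names and the sample numbers are hypothetical; nothing here is Google data or Google's method.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one side is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: positions 1..5 in a result list, and made-up
# rater scores for the same five pages (higher score = better page).
positions = [1, 2, 3, 4, 5]
rater_scores = [4, 5, 3, 3, 1]
print(spearman(positions, rater_scores))
```

A value near -1 would mean the pages at the top of the list (small position numbers) tend to get the best ratings, and a value near 0 would mean the ratings tell us little about position. The raters, as far as we know, are never asked to run any comparison of this kind.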
 
For a detailed discussion of what the raters and Google are looking for, see: Google Quality Rater Secrets.
 
Ami Isseroff
