How Google produces relevant search results

Searching for keywords

For a search engine, the original table is going to have billions of rows, since there are billions of pages on the web. The rows will be relatively short (perhaps fewer than a thousand fields). The inverted table, by contrast, will have far fewer rows, on the order of a million or so, but each row could be huge.
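To make the two tables concrete, here is a minimal Python sketch. It assumes a tiny made-up corpus of three pages; the page ids, the text and the function names `build_inverted_index` and `search` are all hypothetical, and a real search engine would do far more (stemming, positions, compression and so on).

```python
from collections import defaultdict

# Hypothetical miniature "original table": one row per page, holding its words.
pages = {
    "page1": "the cat sat on the mat",
    "page2": "the dog chased the cat",
    "page3": "a dog sat on a log",
}

def build_inverted_index(pages):
    """Build the inverted table: one row per word, listing the pages containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the set of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_inverted_index(pages)
```

Here `search(index, "cat sat")` looks up two rows of the inverted table and intersects them, which leaves only `page1`. Note how the shape flips: the original table has one short row per page, while the inverted table has one row per distinct word, and a common word such as "the" would have an enormous row on a web-sized corpus.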

Figure 3: The PageRank of a website depends on how many other well-cited pages link to it

Figure 3 shows a calculation of PageRank for a small corpus of six documents, A through F. The links between the documents are shown by arrows, and the area of a document is proportional to its PageRank. Notice that although A has more links coming in than B does, B has the higher PageRank, because A has more links from pages that are irrelevant (E and F in particular).
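The same effect can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the link graph below is made up for illustration and is not the graph in Figure 3, the function name `pagerank` is hypothetical, and the damping factor of 0.85 is the value suggested in Brin and Page's paper rather than anything stated in this article.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate PageRank by repeatedly passing each page's rank along its links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical graph (NOT the one in Figure 3): D, E and F all link to A,
# A links to B, and B and C link to each other.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["B"],
    "D": ["A"],
    "E": ["A"],
    "F": ["A"],
}

ranks = pagerank(links)
```

In this made-up graph A has three incoming links and B has only two, yet B finishes with a much higher rank than A. A's links come from D, E and F, which nobody links to, whereas B is cited by the well-ranked C.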

Another way to view PageRank is to imagine a person who, given a random page to start from, clicks at random on the links of each page he visits and never hits the Back button. The PageRank of a page is essentially the probability that he visits that page. The PageRank that Google Toolbar displays for pages is not this probability.
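The random surfer is easy to simulate. The sketch below is an illustration, not Google's method: the graph is a made-up one, the function name `simulate_surfer` is hypothetical, and it adds one assumption the description above leaves out, namely that the surfer occasionally gets bored and jumps to a random page (here 15 per cent of the time, matching the damping factor in Brin and Page's paper).

```python
import random

def simulate_surfer(links, steps=200_000, damping=0.85, seed=0):
    """Follow random links for `steps` clicks and return each page's visit frequency."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)  # start on a random page
    for _ in range(steps):
        visits[current] += 1
        targets = links[current]
        if targets and rng.random() < damping:
            current = rng.choice(targets)  # click a random link on this page
        else:
            current = rng.choice(pages)    # assumption: jump to a random page
    return {p: count / steps for p, count in visits.items()}

# Hypothetical graph: every page must appear as a key, even with no links.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["B"],
    "D": ["A"],
    "E": ["A"],
    "F": ["A"],
}

frequencies = simulate_surfer(links)
```

The fraction of clicks that land on each page approximates that page's PageRank probability, and the estimate gets steadier as `steps` grows.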

Instead, using a formula that remains a trade secret, Google converts a PageRank probability to a whole number between 0 and 10, with 0 meaning 'unranked' (the page is too new to have been cited often) and 10 assigned to the most important or highest-quality pages. There's only one 10 as far as we know: the Google search page.

Sites like Wikipedia, Twitter and Yahoo! manage a 9 for their homepages, whereas Facebook only manages an 8.

Now that we have a method for determining a page's importance, we can fairly easily order the search results for a given query by their importance and quality.
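As a rough illustration of that last step, the Python sketch below sorts a set of matching pages by score. The page names, the scores and the function name `rank_results` are all hypothetical, and a real engine combines PageRank with many other signals rather than sorting on it alone.

```python
def rank_results(matching_pages, pagerank_scores):
    """Order matching pages from highest to lowest score (ties broken by name)."""
    return sorted(
        matching_pages,
        key=lambda page: (-pagerank_scores.get(page, 0.0), page),
    )

# Hypothetical scores, as a PageRank calculation might produce them.
scores = {"news.example": 0.42, "blog.example": 0.07, "shop.example": 0.19}

# Hypothetical set of pages that matched the query in the inverted table.
matches = {"blog.example", "shop.example", "news.example", "new.example"}

ordered = rank_results(matches, scores)
```

Pages with no known score, such as `new.example` here, are treated as having a score of zero and so fall to the bottom of the list.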

This article should have given you some insight into how a modern search engine works. There are more details to understand, though, and a good place to start is Page and Brin's paper.

-------------------------------------------------------------------------------------------------------

First published in PC Plus Issue 281
