Wednesday 17 June 2015

HB Blog 78: Google's Page Rank Algorithm.

Basically, PageRank is an algorithm used by Google Search to rank websites in its search engine results. PageRank was named after Larry Page, one of the founders of Google. It was developed by Larry Page and Sergey Brin as part of their research project at Stanford University. The Google search engine that grew out of that project was designed to crawl and index the web efficiently and produce much more satisfying search results than existing systems.

According to Google:-
    PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.

Algorithm description:-
Usually, the results of the crawling and indexing process were ranked using raw backlink counts, comments, or various sorting algorithms. Google instead came up with the PageRank algorithm, which does not count links from all pages equally and normalizes each link by the number of links on the page it comes from.

Sergey Brin and Lawrence Page explained it in their research paper as follows:-
    "We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

    PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

    Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

    PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web. Also, a PageRank for 26 million web pages can be computed in a few hours on a medium size workstation."

Here,
PR(A)- The PageRank of page "A", the page whose rank is being calculated.
T1...Tn- The pages that link to page "A".
C(A)- The number of links going out of page "A". In the formula, C(T1) is likewise the number of links going out of page T1, and so on.
d- The damping factor. The PageRank model assumes a user who clicks randomly on links and then stops after some time. The probability, at any step, that the user will continue clicking is the damping factor d.
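
The "simple iterative algorithm" mentioned in the paper can be sketched in a few lines of Python. This is only a minimal illustration, not Google's actual implementation: the function name pagerank, the links dictionary, the fixed number of iterations, and the three-page example web are all assumptions made for this example. It also assumes that every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    `links` maps each page to the list of pages it links to.
    Assumption: every page has at least one outgoing link, so C(T) is never 0.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(T)/C(T) over every page T that links to `page`.
            incoming = sum(pr[t] / len(links[t]) for t in pages if page in links[t])
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for page, rank in sorted(ranks.items()):
    print(page, round(rank, 3))
```

In this small example, page C ends up with the highest rank because both A and B link to it, while B gets the lowest because its only incoming link is shared with C. Note that with the formula exactly as quoted, the ranks in this sketch add up to the number of pages rather than to one; dividing the (1-d) term by the number of pages would make them sum to one.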

For more information, see the research paper by Sergey Brin and Lawrence Page, published at Stanford University, Stanford, CA:-
http://infolab.stanford.edu/~backrub/google.html
