Can Anybody Understand This?
-
Hey guys,
These days I'm reading the paper by Sergey Brin and Larry Page, which is the first paper on Google.
I don't get the Ranking part, which is: "Google maintains much more information about web documents than typical search engines. Every hitlist includes position, font, and capitalization information. Additionally, we factor in hits from anchor text and the PageRank of the document. Combining all of this information into a rank is difficult. We designed our ranking function so that no particular factor can have too much influence. First, consider the simplest case -- a single word query. In order to rank a document with a single word query, Google looks at that document's hit list for that word. Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font, ...), each of which has its own type-weight. The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off so that more than a certain count will not help. We take the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document. Finally, the IR score is combined with PageRank to give a final rank to the document.
For a multi-word search, the situation is more complicated. Now multiple hit lists must be scanned through at once so that hits occurring close together in a document are weighted higher than hits occurring far apart. The hits from the multiple hit lists are matched up so that nearby hits are matched together. For every matched set of hits, a proximity is computed. The proximity is based on how far apart the hits are in the document (or anchor) but is classified into 10 different value "bins" ranging from a phrase match to "not even close". Counts are computed not only for every type of hit but for every type and proximity. Every type and proximity pair has a type-prox-weight. The counts are converted into count-weights and we take the dot product of the count-weights and the type-prox-weights to compute an IR score. All of these numbers and matrices can all be displayed with the search results using a special debug mode. These displays have been very helpful in developing the ranking system.
"
-
I can't say I have a complete understanding of what this is explaining, but here's a link to the original paper on Stanford's website if anyone else is interested. http://infolab.stanford.edu/~backrub/google.html
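For anyone who finds code easier to follow than prose, here is a rough Python sketch of the quoted ranking description. It is only an illustration, not Google's implementation: the weight values, the cap of 8, the bin boundaries, the nearest-hit matching rule, and the weighted-sum combination with PageRank are all made-up assumptions, because the paper does not publish any of them.

```python
from collections import Counter

# Hypothetical type-weights; the paper does not publish the real values.
TYPE_WEIGHTS = {"title": 5.0, "anchor": 4.0, "url": 3.0,
                "plain_large": 2.0, "plain_small": 1.0}
COUNT_CAP = 8    # assumed count beyond which extra hits stop helping
NUM_BINS = 10    # the paper mentions 10 proximity bins


def count_weight(count):
    """Assumed taper: linear in the count, then flat once COUNT_CAP is reached."""
    return float(min(count, COUNT_CAP))


def single_word_ir_score(hit_types):
    """Dot product of count-weights and type-weights for one word's hit list."""
    counts = Counter(hit_types)  # hits of unknown types are ignored below
    return sum(TYPE_WEIGHTS[t] * count_weight(counts[t]) for t in TYPE_WEIGHTS)


def final_rank(ir_score, pagerank, alpha=0.5):
    """Assumed weighted sum; the paper does not say how the two are combined."""
    return alpha * ir_score + (1 - alpha) * pagerank


def proximity_bin(distance):
    """Assumed binning: adjacent words give bin 0, 10 or more words apart give bin 9."""
    return min(max(distance - 1, 0), NUM_BINS - 1)


def type_prox_weight(hit_type, bin_index):
    """Hypothetical weight: the type-weight scaled down for farther bins."""
    return TYPE_WEIGHTS[hit_type] * (NUM_BINS - bin_index) / NUM_BINS


def two_word_ir_score(hits_a, hits_b):
    """Hits are (position, type) pairs; each hit of word A is matched
    to the nearest hit of word B (an assumed matching rule)."""
    if not hits_a or not hits_b:
        return 0.0
    counts = Counter()
    for pos_a, type_a in hits_a:
        nearest = min(abs(pos_a - pos_b) for pos_b, _ in hits_b)
        counts[(type_a, proximity_bin(nearest))] += 1
    return sum(type_prox_weight(t, b) * count_weight(c)
               for (t, b), c in counts.items())
```

With these made-up weights, `single_word_ir_score(["title", "plain_small", "plain_small"])` gives 7.0 (5.0 for the one title hit plus 1.0 times two plain-text hits), while a hundred `plain_small` hits score only 8.0 because of the cap. That cap is the "no particular factor can have too much influence" idea from the quote.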