Can Anybody Understand This?
-
Hey guys,
I've been reading the paper by Sergey Brin and Larry Page, which is the first paper about Google.
And I don't get the ranking part, which is: "Google maintains much more information about web documents than typical search engines. Every hitlist includes position, font, and capitalization information. Additionally, we factor in hits from anchor text and the PageRank of the document. Combining all of this information into a rank is difficult. We designed our ranking function so that no particular factor can have too much influence. First, consider the simplest case -- a single word query. In order to rank a document with a single word query, Google looks at that document's hit list for that word. Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font, ...), each of which has its own type-weight. The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off so that more than a certain count will not help. We take the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document. Finally, the IR score is combined with PageRank to give a final rank to the document.
For a multi-word search, the situation is more complicated. Now multiple hit lists must be scanned through at once so that hits occurring close together in a document are weighted higher than hits occurring far apart. The hits from the multiple hit lists are matched up so that nearby hits are matched together. For every matched set of hits, a proximity is computed. The proximity is based on how far apart the hits are in the document (or anchor) but is classified into 10 different value "bins" ranging from a phrase match to "not even close". Counts are computed not only for every type of hit but for every type and proximity. Every type and proximity pair has a type-prox-weight. The counts are converted into count-weights and we take the dot product of the count-weights and the type-prox-weights to compute an IR score. All of these numbers and matrices can all be displayed with the search results using a special debug mode. These displays have been very helpful in developing the ranking system.
"
-
I can't say I have a complete understanding of what this is explaining, but here's a link to the original paper on Stanford's website if anyone else is interested. http://infolab.stanford.edu/~backrub/google.html
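For what it's worth, here is how I read the quoted passage, written out as a rough Python sketch. The paper names the hit types but does not give the actual weights, the shape of the taper, the bin boundaries, or the way the IR score and PageRank are combined, so every number and every function name below is a made-up assumption for illustration only. The multi-word part is also simplified to two words, and it uses the type of the first word's hit for the matched pair, which is my own shortcut.

```python
from collections import Counter

# Hypothetical type-weights: the paper lists hit types, not their values.
TYPE_WEIGHTS = {
    "title": 10.0,
    "anchor": 8.0,
    "url": 6.0,
    "plain_large": 3.0,
    "plain_small": 1.0,
}


def count_weight(count, cap=8):
    """Linear for small counts, flat past `cap` (assumed taper shape)."""
    return float(min(count, cap))


def ir_score_single(hit_types):
    """Single-word query: dot product of count-weights and type-weights.

    hit_types is a list of type names, one per hit of the word in the document.
    """
    counts = Counter(hit_types)  # missing types count as 0
    return sum(TYPE_WEIGHTS[t] * count_weight(counts[t]) for t in TYPE_WEIGHTS)


def proximity_bin(distance, n_bins=10):
    """Map a word distance to a bin: 0 = phrase match, 9 = 'not even close'."""
    return min(max(distance - 1, 0), n_bins - 1)


def type_prox_weight(hit_type, prox_bin, n_bins=10):
    """Hypothetical type-prox-weight: the type-weight scaled down as hits drift apart."""
    return TYPE_WEIGHTS[hit_type] * (n_bins - prox_bin) / n_bins


def ir_score_two_words(hits_a, hits_b):
    """Two-word query: count (type, proximity) pairs, then take the dot product.

    hits_a and hits_b are lists of (type, position) tuples, one list per word.
    Each hit of word A is matched with the nearest hit of word B.
    """
    counts = Counter()
    if hits_b:
        for type_a, pos_a in hits_a:
            nearest = min(abs(pos_a - pos_b) for _, pos_b in hits_b)
            counts[(type_a, proximity_bin(nearest))] += 1
    return sum(
        type_prox_weight(t, b) * count_weight(c) for (t, b), c in counts.items()
    )


def final_rank(ir_score, pagerank, alpha=0.5):
    """Assumed combination: a plain weighted sum (the paper does not specify one)."""
    return alpha * ir_score + (1 - alpha) * pagerank


# One title hit plus many small-font hits: the small-font count stops helping at the cap.
few = ir_score_single(["title"] + ["plain_small"] * 20)
many = ir_score_single(["title"] + ["plain_small"] * 50)
print(few, many)

# Adjacent title hits land in the phrase-match bin; the distant pair barely counts.
print(ir_score_two_words([("title", 3), ("plain_small", 40)],
                         [("title", 4), ("plain_small", 100)]))
```

The point I take from the paper is the cap in `count_weight`: repeating a word 50 times scores the same as repeating it 20 times, which is how "no particular factor can have too much influence" plays out for raw keyword counts.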