Why does OSE (Open Site Explorer) have so little backlink data on Russian sites in the google.ru index?
-
OK, this seems very strange, but google.ru is indexing far more backlinks in its SERPs for a widget than OSE reports. Very little data is found in OSE for Russian-based sites.
Is this a deliberate marketing decision?
(I could send raw data if needed!)
What is filtering this vast google.ru data out? Is OSE only catered to US/UK sites?
-
I don't know the exact composition of the seed sites that SEOmoz uses, but it's believed to be similar, at least in theory, to the process that search engines use.
This would include seed sites such as universities, government agencies, and other highly respected organizations; for example, the Red Cross would be considered a high-trust site that would make a good seed.
The set is updated frequently, since sites can be gamed. But I wouldn't go so far as to say that all newspapers and media sites go in the "bad" category. Some do, some don't.
It's funny that you mention a "cynical" algorithm. It's believed Google does use something similar, a "SpamRank" algorithm.
-
Thanks for your response (last May, a while back I know).
I re-read your answer and it got me thinking: does OSE only use seed sites that are trusted in the eyes of Google? Or does it begin its 'trust' quest with what other engines, or even humans, consider trustworthy?
If Google turned on a 'cynical' algorithm, surely it would consider most of the newspaper and media links 'gamed', turn these off, reverse the polarities, and go find the far more remote, honest, and unbiased independent citations from deeper parts of the web. How does one judge what a seed site should be in each country? Is this a manual choice?
-
Hi Turkey,
Good observation. It's certainly not an intentional bias. OSE is designed as a worldwide link index. That said, there are a couple of forces at play here.
First, because of the vast number of links on the web, the OSE index is designed to find the most significant links: those most likely to influence rankings. With this in mind, OSE usually contains only 40-60% of the links recorded by Google Webmaster Tools. This is true in all regions, including Europe and the United States.
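If you want to estimate that overlap for your own site, you can compare exported backlink lists from the two tools. A minimal sketch (the file format of one linking URL per line is an assumption, not a documented export format):

```python
# Hypothetical comparison of two exported backlink lists, e.g. an OSE
# export versus a Google Webmaster Tools export. Assumes one linking
# URL per line in each file.
def load_links(path):
    """Read one linking URL per line into a normalized set."""
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

def coverage(ose_links, gwt_links):
    """Fraction of GWT-reported links that also appear in the OSE export."""
    if not gwt_links:
        return 0.0
    return len(ose_links & gwt_links) / len(gwt_links)

# Example with in-memory sets: OSE has 2 of the 4 GWT-reported links.
print(coverage({"a.com", "b.com"}, {"a.com", "b.com", "c.com", "d.com"}))  # → 0.5
```

A result in the 0.4-0.6 range would match the 40-60% figure described above; the uncovered remainder is where the low-value links tend to live.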
The good news is that, the majority of the time, the missing links are the same ones that pass little or no value.
It is possible certain "pockets" of the web get less bandwidth from Linkscape's crawlers than other areas, due to natural selection. Linkscape starts with a "seed" set of trusted sites, which includes the top sites from the last index, and then crawls "out" from there. This method means sites that are well linked to from the "seed" sites and other top-ranked sites have a higher likelihood of being crawled each index. If Linkscape doesn't have good metrics for a particular site, it is most likely distant from the seeds.
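The "crawl out from trusted seeds" idea can be sketched as a breadth-first traversal under a crawl budget. This is a toy illustration, not Linkscape's actual algorithm; the link graph, seed list, and budget are all invented for the example:

```python
from collections import deque

def crawl_from_seeds(link_graph, seeds, budget):
    """Breadth-first crawl from a trusted seed set: pages closest to the
    seeds are fetched first, and anything beyond the budget is never
    reached at all."""
    frontier = deque(seeds)
    seen = set(seeds)
    crawled = []
    while frontier and len(crawled) < budget:
        page = frontier.popleft()
        crawled.append(page)
        for linked in link_graph.get(page, []):
            if linked not in seen:
                seen.add(linked)
                frontier.append(linked)
    return crawled

# Toy link graph: a trusted seed links out toward progressively
# "deeper" sites, with a Russian site several hops from the seed.
graph = {
    "university.edu": ["news-site.com", "blog-a.com"],
    "news-site.com": ["blog-b.com"],
    "blog-b.com": ["remote-site.ru"],
}
# With a small budget, remote-site.ru (far from the seed) is never crawled.
print(crawl_from_seeds(graph, ["university.edu"], budget=3))
# → ['university.edu', 'news-site.com', 'blog-a.com']
```

This is why a site's distance from the seed set, not its actual quality, can determine whether it gets crawled in a given index.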
Every so often the seed sites are adjusted, through a kind of natural selection, in order to keep the index fresh and relevant. It's one of the best indexes of its kind in the world, but of course there is always room for improvement.
Hope this explanation helps. Thanks again for the feedback.