Why does OSE (Open Site Explorer) have so little backlink data on Russian sites in the google.ru index?
-
OK, this seems very strange, but google.ru is indexing far more backlinks in its SERPs for a widget than OSE reports. Very little data is found in OSE for Russian-based sites.
Is this a deliberate marketing decision?
(I can send raw data if needed!)
What is filtering out this vast google.ru data? Is OSE catered only to US/UK sites?
-
I don't know the exact composition of the seed sites SEOmoz uses, but it's believed to be similar, at least in theory, to the process the search engines use.
That would include seed sites such as universities, government agencies, and other highly respected sites; the Red Cross, for example, would be considered a high-trust site that makes a good seed.
The set is updated frequently, since sites can be gamed. But I wouldn't go as far as to say that all newspapers and media sites go in the "bad" category. Some do, some don't.
It's funny that you mention a "cynical" algorithm. It's believed Google does use something similar to a "SpamRank" algorithm.
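The "trusted seed" idea is usually associated with TrustRank-style propagation: trust starts on a small, hand-reviewed seed set and flows outward along links, decaying with distance. Here is a minimal sketch of that idea; the link graph, seed set, damping factor, and site names are all made-up illustrations, not SEOmoz's or Google's actual algorithm or data:

```python
# Illustrative only: a simplified TrustRank-style propagation.
# Every site name and parameter below is a hypothetical example.

def trust_rank(graph, seeds, damping=0.85, iterations=20):
    """Propagate trust from hand-picked seed pages through outlinks.

    graph: dict mapping page -> list of pages it links to
    seeds: set of pages judged trustworthy by a human reviewer
    """
    pages = set(graph) | {p for links in graph.values() for p in links}
    # Trust is injected only at the seed set (uniform over seeds).
    base = {p: (1 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(base)
    for _ in range(iterations):
        nxt = {p: (1 - damping) * base[p] for p in pages}
        for page, links in graph.items():
            if links:
                share = damping * trust[page] / len(links)
                for target in links:
                    nxt[target] += share
        trust = nxt
    return trust

# Toy graph: two trusted seeds link to a blog, which links onward.
graph = {
    "redcross.org": ["charity-blog.example"],
    "university.edu": ["charity-blog.example"],
    "charity-blog.example": ["spammy.example"],
    "spammy.example": [],
}
scores = trust_rank(graph, seeds={"redcross.org", "university.edu"})
# A page linked directly from the seeds accumulates more trust than
# a page reachable only through an intermediary.
```

In this toy run, `charity-blog.example` (one hop from both seeds) ends up with more trust than `spammy.example` (two hops out), which is the intuition behind distance-from-seed trust.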
-
Thanks for your response (last May, a while back I know).
I re-read your answer and it got me thinking: does OSE only find seed sites that are trusted in the eyes of Google? Or does it begin its "trust" quest from what other engines, or even humans, consider trustworthy?
If Google turned on a "cynical" algorithm, surely it would consider most newspaper and media links "gamed", turn those off, reverse the polarities, and go find the far-flung, honest, unbiased independent citations from deeper parts of the web. How does one judge what a seed site should be in each country? Is it a manual choice?
-
Hi Turkey,
Good observation. It's certainly not an intentional bias; OSE is designed as a worldwide link index. That said, there are a couple of forces at play here.
First, because of the vast number of links on the web, the OSE index is designed to find the most significant links: the ones most likely to influence rankings. With this in mind, OSE usually contains only 40-60% of the links recorded by Google Webmaster Tools. This is true in all regions, including Europe and the United States.
The good news is that the missing links are usually the same ones that pass little or no value.
It is possible that certain "pockets" of the web get less bandwidth from Linkscape's crawlers than other areas, due to a kind of natural selection. Linkscape starts with a "seed" set of trusted sites, which includes the top sites from the last index, and then crawls "out" from there. This means sites that are well linked to from the seed sites and other top-ranked sites have a higher likelihood of being crawled each index. If Linkscape doesn't have good metrics for a particular site, that site is most likely distant from the seeds.
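That "crawl out from a seed set" behavior can be sketched as a breadth-first crawl with a fixed page budget. This is a hypothetical illustration, since Linkscape's real scheduler isn't public; the function names, site names, and budget are all invented for the example:

```python
# Hypothetical sketch: breadth-first crawling outward from a trusted
# seed set with a fixed page budget. Not Linkscape's actual scheduler.
from collections import deque

def crawl_from_seeds(seeds, fetch_links, max_pages=1000):
    """Return each discovered page's hop count from the nearest seed.

    Pages close to the seeds are reached first, so with a limited
    budget, distant "pockets" of the web may never be visited at all.
    """
    frontier = deque((seed, 0) for seed in seeds)
    distance = {seed: 0 for seed in seeds}
    while frontier and len(distance) < max_pages:
        page, depth = frontier.popleft()
        for link in fetch_links(page):
            if link not in distance:
                distance[link] = depth + 1
                frontier.append((link, depth + 1))
    return distance

# Toy link graph: s.example is the seed; d.example is three hops out.
web = {
    "s.example": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "c.example": ["d.example"],
}
full = crawl_from_seeds({"s.example"}, lambda p: web.get(p, []))
small = crawl_from_seeds({"s.example"}, lambda p: web.get(p, []), max_pages=3)
# With a budget of only 3 pages, the crawl stops before ever reaching
# c.example and d.example -- the "distant pocket" effect.
```

With the full budget every page is found, but with `max_pages=3` the crawl never reaches the pages three hops from the seed, which is why a site far from the seeds can have thin or missing metrics.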
Every so often the seed sites are adjusted, with a natural selection of their own, in order to keep the index fresh and relevant. It's one of the best indexes of its kind in the world, but of course there is always room for improvement.
Hope this explanation helps. Thanks again for the feedback.