External Links from own domain
-
Hi all,
I have a very weird question about external links to our site from our own domain.
According to GWMT we have 603,404,378 links from our own domain to our own domain (see screen 1). When we drilled down we noticed that these come from disabled sub-domains like m.jump.co.za.
In the past we used to redirect all traffic from sub-domains to our primary www domain. It seems that for some time Google had access to crawl some of our sub-domains, but in December 2010 we fixed this so that all sub-domain traffic redirects (301) to our primary domain. For example, http://m.jump.co.za/search/ipod/ redirected to http://www.jump.co.za/search/ipod/
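A sub-domain-to-www redirect like that is usually done at the web-server level. A minimal sketch, assuming an Apache .htaccess with mod_rewrite (the hostname is taken from the example above; adapt to your vhost setup):

```apache
# Sketch: 301 every request on a non-www host of jump.co.za
# to the same path on the primary www domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.jump\.co\.za$ [NC]
RewriteRule ^(.*)$ http://www.jump.co.za/$1 [R=301,L]
```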
The weird part is that the number of external links kept on growing and is now sitting at a massive number.
On 8 April 2011 we took a different approach: we created a landing page for m.jump.co.za, and all other requests generated 404 errors. We added all the directories to robots.txt and also manually removed all the directories from GWMT.
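For reference, the robots.txt entries would look something like this (the directory names here are placeholders, not our actual paths):

```
User-agent: *
Disallow: /search/
Disallow: /category/
```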
Now, 3 weeks later, the number of external links just keeps on growing. Here are some stats:
11-Apr-11 - 543 747 534
12-Apr-11 - 554 066 716
13-Apr-11 - 554 066 716
14-Apr-11 - 554 066 716
15-Apr-11 - 521 528 014
16-Apr-11 - 515 098 895
17-Apr-11 - 515 098 895
18-Apr-11 - 515 098 895
19-Apr-11 - 520 404 181
20-Apr-11 - 520 404 181
21-Apr-11 - 520 404 181
26-Apr-11 - 520 404 181
27-Apr-11 - 520 404 181
28-Apr-11 - 603 404 378
I am now thinking of cleaning out the robots.txt, re-including all the excluded directories in GWMT, and seeing whether Google will be able to get rid of all these links.
What do you think is the best solution to get rid of all these invalid pages?
-
We had 301s in place for about 6 months, and the old URLs did not disappear from Google. That's why we decided to change them to 404s, thinking that Google might remove them quicker. But the number of links from sub-domains just keeps on growing.
I am worried that having these problem URLs listed in robots.txt actually prevents Google from crawling them, so it never sees that they return a 404 and should be removed.
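That worry can be demonstrated with a standard robots.txt parser. A small sketch in Python (the hostname and path come from the example earlier in the thread; the Disallow rule is a hypothetical stand-in for our excluded directories):

```python
from urllib.robotparser import RobotFileParser

# Sketch of the concern: a compliant crawler that honors robots.txt will
# not request a Disallowed URL at all, so it never observes the 404
# status that would get the URL dropped from the index.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /search/",  # hypothetical rule standing in for the excluded directories
])
blocked = not rp.can_fetch("Googlebot", "http://m.jump.co.za/search/ipod/")
print(blocked)  # True: the crawler skips the URL without ever seeing the 404
```

In other words, the robots.txt exclusion and the 404 approach work against each other: the 404 only helps if the crawler is allowed to fetch the URL and see it.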
-
Instead of trying to manage a massive 301 list, can you just customize your 404 page to redirect?
<?php
// Custom 404 handler: after testing the requested page URL, issue a
// permanent redirect to the primary domain instead of serving the 404.
$location = "http://www.YourSite.com/";
header("HTTP/1.1 301 Moved Permanently");
header("Location: {$location}");
exit;
-
Update:
There are 2 things that still puzzle me with this:
If you go to http://www.google.co.za/search?q=site:jump.co.za+-www&hl=en&rlz=1C1GPCK_enZA426ZA426&prmd=ivns&filter=0&biw=1920&bih=979 you notice all sorts of weird sub-domains, and all of these are invalid and have been removed from GWMT.
If you manage the domain m.jump.co.za in GWMT, you also notice that it still reports keywords, queries and all sorts of data, even though the site is disabled and all the URLs generate 404 errors.
There are only a few of these weird sub-domains causing the problems:
0www.
iiiiiwww.
iwww.
m.
wtfwww.
www.www.
wwww.
All these domains feel very familiar to me, and I am almost 100% sure these are domains we used for testing when we found the problem on Apache, meaning Google took the data from toolbar queries and probably started indexing these sub-domains. But now I can't get rid of them, and Google seems to be out of control with these.
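If these really were test hostnames, the likely culprit on the Apache side is a wildcard vhost. A hypothetical sketch of the kind of configuration that makes Apache answer for any sub-domain (the DocumentRoot path is a placeholder):

```apache
# Hypothetical: a wildcard ServerAlias makes Apache serve 0www., iwww.,
# wtfwww. and any other sub-domain with the same content as www.
<VirtualHost *:80>
    ServerName www.jump.co.za
    ServerAlias *.jump.co.za
    DocumentRoot /var/www/jump
</VirtualHost>
```

With a setup like this, any hostname that resolves (e.g. via a wildcard DNS record) returns a 200, which would explain how one-off test sub-domains ended up indexed.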
So the main question is probably: should we just serve 404s, or should we add these to robots.txt as well?