Is there a limit to the number of duplicate pages pointing to a rel='canonical' primary?
-
We have a situation on twiends where a number of our 'dead' user pages have generated links for us over the years. Our options are to 404 them, 301 them to the home page, or just serve back the home page with a canonical tag.
We've been 404'ing them for years, but I understand that we lose all the link juice by doing this. Correct me if I'm wrong.
Our next plan would be to 301 them to the home page. That's probably the best solution, but our concern is that if a user page is only temporarily down (under review, etc.), it could be permanently removed from the index, or at least cached for a very long time.
A final plan is to just serve back the home page on the old URL, with a canonical tag pointing to the home page URL. This is quick, retains most of the link juice, and allows the URL to become active again in future. The problem is that there could be hundreds of thousands of these.
Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)
Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?
Thanks
-
I'll add this article by Rand that I came across too. I'm busy testing the solution presented in it:
https://moz.com/blog/are-404-pages-always-bad-for-seo
In summary: 404 all dead pages, with a good custom 404 page, so as not to waste crawl bandwidth. Then selectively 301 those dead pages that have accrued some good link value.
Thanks, Donna/Tammy, for pointing me in this direction.
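For anyone wanting to picture the "404 by default, selectively 301" approach, here is a minimal sketch in Python. It is not taken from Rand's article or from our actual code: the function name `dead_page_response`, the `redirect_targets` mapping, and the example paths are all hypothetical, and it assumes you maintain your own list of dead pages that have earned worthwhile links.

```python
from typing import Dict, Tuple


def dead_page_response(
    path: str, redirect_targets: Dict[str, str]
) -> Tuple[int, Dict[str, str]]:
    """Pick the HTTP status and headers for a request to a dead page.

    redirect_targets is a hypothetical, hand-maintained mapping from dead
    paths that have accrued good links to the URL they should 301 to.
    """
    target = redirect_targets.get(path)
    if target is not None:
        # Dead page with link value: send a permanent redirect.
        return 301, {"Location": target}
    # Every other dead page: a hard 404 (the custom 404 body is served elsewhere).
    return 404, {}


# Example: only /user/alice has been judged worth redirecting.
redirects = {"/user/alice": "/"}
print(dead_page_response("/user/alice", redirects))
print(dead_page_response("/user/bob", redirects))
```

Keeping the mapping explicit means a page that is only temporarily down never gets a 301 unless someone deliberately adds it to the list.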
-
In this scenario, yes: a customized 404 page with links to a few useful top-level pages would better serve both the user and Google. From a strictly SEO standpoint, 100,000 redirects and/or canonical tags would not benefit your SEO.
-
Thanks Donna, good points.
We return a hard 404, so it's treated correctly by Google. We are just looking at this from an SEO point of view now to see if there's any way to reclaim this lost link juice.
Your point about looking at the value of those incoming links is a good one. I suppose it's not worth making Google crawl 100,000 more pages for the sake of a few links. We've just started seeing these pop up in Moz Analytics as link opportunities, and we can see them as 404s in site explorer too. There are a few hundred of these incoming links that point to a 404, so we feel this could have an impact.
I suppose we could selectively 301 any higher-value links to the home page. It will be an administrative nightmare, but doable.
How do others tackle this problem? Does everyone just hard 404 a page, even though that loses the link juice from incoming links to it?
Thanks
-
Hi David,
When you say "we've been 404'ing them for years", does that mean you've created a custom 404 page that explains the situation to site visitors, or does it mean you've been letting them naturally error and return the appropriate 404 (page not found) error to Google? It makes a difference. If the pages truly no longer exist and there is no equivalent replacement, you should be letting them naturally error (return a 404 status code) so as not to mislead Google's robots and site visitors.
Have you looked at the value of those incoming links? They may be low value anyway. There may be more valuable things you could be doing with your time and budget.
To answer your specific questions:
_Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)_
Yes, if those pages (or valuable replacements) don't actually exist. You'd be wasting valuable crawl budget. This looks like it might be especially true in your case given the size of your site. Check out this article. I think you might find it very helpful. It's an explanation of soft 404 errors and what you should do about them.
_Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?_
If the canonical tag is changed or removed, Google will find and reindex it next time it crawls your site (assuming you don't run out of crawl budget). You don't need to use WMT unless you're impatient and want to try to speed the process up.
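If it helps, below is a rough Python sketch for confirming that a page no longer declares a canonical pointing somewhere else before you wait on a recrawl. It only uses the standard library's `html.parser`. The names `CanonicalFinder`, `find_canonical`, and `is_canonicalized_elsewhere` are made up for this illustration, the example.com URLs are placeholders, and the trailing-slash comparison is a simplifying assumption rather than how Google normalizes URLs.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel can hold several space-separated tokens, so split before checking.
        if "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_canonicalized_elsewhere(page_url: str, html: str) -> bool:
    """True if the page declares a canonical URL other than its own.

    Assumption: URLs are compared after stripping trailing slashes only.
    """
    canonical = find_canonical(html)
    return canonical is not None and canonical.rstrip("/") != page_url.rstrip("/")


# Example with placeholder URLs: a user page still pointing at the home page.
page = '<html><head><link rel="canonical" href="https://example.com/"></head></html>'
print(is_canonicalized_elsewhere("https://example.com/user/bob", page))
```

Running this over the HTML your server actually returns for a reactivated URL is a quick way to check that the tag is really gone before Google's next visit.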
-
Thanks Sandi, I did.
It's a great article and it answered many questions for me, but I couldn't really get clarity on my last two questions above.
-
Hey David
Check this Moz Blog post about rel=canonical, appropriately named "Rel=Confused?"