Is there a limit to the number of duplicate pages pointing to a rel='canonical' primary?
-
We have a situation on twiends where a number of our 'dead' user pages have generated links for us over the years. Our options are to 404 them, 301 them to the home page, or just serve back the home page with a canonical tag.
We've been 404'ing them for years, but I understand that we lose all the link juice from doing this. Correct me if I'm wrong.
Our next plan would be to 301 them to the home page. Probably the best solution, but our concern is that if a user page is only temporarily down (under review, etc.), it could be permanently removed from the index, or at least cached for a very long time.
A final plan is to just serve back the home page on the old URL, with a canonical tag pointing to the home page URL. This is quick, retains most of the link juice, and allows the URL to become active again in the future. The problem is that there could be hundreds of thousands of these.
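To make the three options concrete, here's roughly what they look like in code. This is only a minimal sketch; I'm using a Flask-style handler purely for illustration, and the DEAD_USERS lookup and home URL are placeholders, not our actual implementation:

```python
# Minimal sketch of the three options for a dead user page.
# Flask is used purely for illustration; DEAD_USERS is a placeholder lookup.
from flask import Flask, abort, redirect

app = Flask(__name__)

HOME = "https://twiends.com/"
DEAD_USERS = {"some-old-user"}  # hypothetical set of removed accounts


@app.route("/<username>")
def user_page(username):
    if username not in DEAD_USERS:
        return f"<h1>{username}'s profile</h1>"  # normal live page

    # Option 1: hard 404 -- simple, but any link equity the URL earned is lost.
    # abort(404)

    # Option 2: 301 to the home page -- keeps most link equity, but the old URL
    # may drop out of the index even if the account is only temporarily down.
    # return redirect(HOME, code=301)

    # Option 3: serve the home page content on the old URL with a canonical
    # tag pointing at the home page -- the URL can come back to life later.
    return (
        "<html><head>"
        f'<link rel="canonical" href="{HOME}">'
        "</head><body><!-- home page markup would go here --></body></html>"
    )
```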
Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)
Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?
Thanks
-
I'll add this article by Rand that I came across too. I'm busy testing the solution presented in it:
https://moz.com/blog/are-404-pages-always-bad-for-seo
In summary, 404 all dead pages with a good custom 404 page so as to not waste crawl bandwidth. Then selectively 301 those dead pages that have accrued some good link value.
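For anyone following the same route, this is roughly how the triage step could be scripted. It's only a sketch: the CSV export, its column names, and the authority threshold are all assumptions, not anything tied to a specific tool.

```python
# Sketch: split dead URLs into "301 to home" vs "leave as 404" based on an
# exported CSV of inbound links. The file name, column names, and threshold
# are hypothetical -- adjust to whatever your link export actually contains.
import csv

DA_THRESHOLD = 30  # arbitrary cut-off for "worth redirecting"


def build_redirect_list(links_csv="inbound_links.csv"):
    redirect, leave_as_404 = set(), set()
    with open(links_csv, newline="") as f:
        for row in csv.DictReader(f):
            target = row["target_url"]              # dead URL being linked to
            authority = int(row["domain_authority"])
            if authority >= DA_THRESHOLD:
                redirect.add(target)                # selectively 301 these
            else:
                leave_as_404.add(target)            # custom 404 handles the rest
    return redirect, leave_as_404 - redirect


if __name__ == "__main__":
    to_301, to_404 = build_redirect_list()
    print(f"{len(to_301)} URLs to 301, {len(to_404)} URLs to keep as 404")
```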
Thanks Donna/Tammy for pointing me in this direction.
-
In this scenario, yes: a customized 404 page with a few top-level (useful) links would serve both the user and Google better. From a strictly SEO standpoint, 100,000 redirects and/or canonical tags would not benefit your SEO.
-
Thanks Donna, good points..
We return a hard 404, so it's treated correctly by Google. We are just looking at this from an SEO point of view now to see if there's any way to reclaim this lost link juice.
Your point about looking at the value of those incoming links is a good one. I suppose it's not worth making Google crawl 100,000 more pages for the sake of a few links. We've just started seeing these pop up in Moz Analytics as link opportunities, and we can see them as 404s in Site Explorer too. There are a few hundred of these incoming links that point to a 404, so we feel this could have an impact.
I suppose we could selectively 301 any higher-value links to the home page. It would be an administrative nightmare, but doable.
How do others tackle this problem? Does everyone just hard 404 a page, even though that loses the link juice from incoming links to it?
Thanks
-
Hi David,
When you say "we've been 404'ing them for years," does that mean you've created a custom 404 page that explains the situation to site visitors, or does it mean you've been letting them naturally error and return the appropriate 404 (page not found) status to Google? It makes a difference. If the pages truly no longer exist and there is no equivalent replacement, you should let them naturally error (return a 404 status code) so as not to mislead Google's robots and site visitors.
Have you looked at the value of those incoming links? They may be low value anyway. There may be more valuable things you could be doing with your time and budget.
To answer your specific questions:
*Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)*
Yes, if those pages (or valuable replacements) don't actually exist. You'd be wasting valuable crawl budget. This looks like it might be especially true in your case given the size of your site. Check out this article. I think you might find it very helpful. It's an explanation of soft 404 errors and what you should do about them.
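If you want to spot-check how those dead pages currently respond, something like the quick sketch below (using the Python requests library; the URLs are just examples) will tell you whether each one returns a hard 404 or a soft 404 (a 200 that should really be a 404):

```python
# Quick spot-check: do dead URLs return a real 404, or a "soft 404" (HTTP 200)?
# The example URLs are placeholders.
import requests

DEAD_URLS = [
    "https://twiends.com/some-removed-user",
    "https://twiends.com/another-removed-user",
]

for url in DEAD_URLS:
    resp = requests.get(url, allow_redirects=False, timeout=10)
    if resp.status_code == 404:
        label = "hard 404 (good if the page is truly gone)"
    elif resp.status_code in (301, 302):
        label = f"redirects to {resp.headers.get('Location')}"
    else:
        label = f"HTTP {resp.status_code} -- possible soft 404"
    print(f"{url}: {label}")
```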
*Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?*
If the canonical tag is changed or removed, Google will find and reindex it next time it crawls your site (assuming you don't run out of crawl budget). You don't need to use WMT unless you're impatient and want to try to speed the process up.
-
Thanks Sandi, I did. It's a great article and it answered many questions for me, but I couldn't really get clarity on my last two questions above.
-
Hey David
Check out this Moz Blog post about rel=canonical, appropriately named "Rel=Confused?"