404 Error Pages being picked up as duplicate content
-
Hi,
I recently noticed an increase in duplicate content, but all of the pages are 404 error pages.
For instance, Moz site crawl says this page: https://www.allconnect.com/sc-internet/internet.html has 43 duplicates and all the duplicates are also 404 pages (https://www.allconnect.com/Coxstatic.html for instance is a duplicate of this page).
Looking for insight on how to fix this issue, do I add an rel=canonical tag to these 60 error pages that points to the original error page?
Thanks!
-
I just did a check and you're right, even though these pages are showing up as errors to the user they are actually showing up as 200 OK, which is causing the duplicate content issue.
Thank you!
-
kfallconnect, if the 404 errors are being picked up as duplicate content, then most likely they're not actually showing up as 404 error pages. It's quite possible that it's a 404 error on the site (that's what the user sees) but, in fact, the server header is not displaying a 404 error. It could be showing up as a "200 OK".
First, I would identify the pages. If the user sees an error on the page, then that's fine. Use a server header check tool to see what the response code is when someone goes to the page. You can use something like Rex Swain's HTTP header tool to check it: http://www.rexswain.com/httpview.html . If the page shows a 404 error then you should be fine, it's not duplicate content.
If the page is showing a "200 OK" then it most likely IS duplicate content. If the page is showing an error to users but showing a '200 OK' in the server header, then that needs to be fixed.
But if the page is showing actual content (and not an error to visitors) then you need to look at potentially using the canonical tag or removing the content on the site completely (which is preferred).
-
I have not used Drupal for a couple of years, but there used to be a plugin called Fast 404 for some versions. Need to check whether suitable and if it weighs downs page speed. Zach is right if you can manually handle it do so, but if not perhaps research a plugin and research side effects.
-
We run on Drupal so I'm not sure if there is a 404 plug in
-
I had some issues with 404 errors as well.
Not sure how your site is set up but I just added permanent redirects on all the 404 duplicates I was having. I didn't have as many as you so it was pretty easy to do. I'm not sure if this is the 100% correct way of doing it but it fixed my issue.
Hope it helps!
-
I am not sure of the site is wordpress but have you considered 'smart 404' plugin. Could consider adding to the site, a solution.
Hope that assists.
Regards
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to avoid duplicate content on internal search results page?
Hi, according to Webmaster Tools and Siteliner our website have an above-average amount of duplicate content. Most of the pages are the search results pages, where it finds only one result. The only difference in this case are the TDK, H1 and the breadcrumbs. The rest of the layout is pretty static and similar. Here is an example for two pages with "duplicate content": https://soundbetter.com/search/Globo https://soundbetter.com/search/Volvo Edit: These are legitimate results that happen to have the same result. In this case we want users to be able to find the audio engineers by 'credits' (musicians they've worked with). Tags. We want users to rank for people searching for 'engineers who worked with'. And searching for two different artists (credit tags) returns this one service provider, with different urls (the tag being the search parameter) hence the duplicate content. I guess every e-commerce/directory website faces this kind of issue. What is the best practice to avoid duplicate content on search results page?
Technical SEO | | ShaqD1 -
Drupal duplicate pages
Anyone else encountered massive numbers of duplicate pages being reported on SEO Moz crawls for Drupal based sites? I assumed it was b/c there was no redirect on the print format pages, so I fixed that with a cannonical tag. But still seeing 2 or 3 duplicate pages reported for many pages. Any experience fixing this would be awesome to hear about. Thanks, Kevin
Technical SEO | | kevgrand0 -
Duplicates on the page
Hello SEOMOZ, I've one big question about one project. We have a page http://eb5info.com/eb5-attorneys and a lot of other similar pages. And we got a big list of errors, warnings saying that we have duplicate pages. But in real not all of them are same, they have small differences. For example - you select "State" in the left sidebar and you see a list on the right. List on the right panel is changing depending on the what you selecting on the left. But on report pages marked as duplicates. Maybe you can give some advices how to improve quality of the pages and make SEO better? Thanks Igor
Technical SEO | | usadvisors0 -
Google inconsistent in display of meta content vs page content?
Our e-comm site includes more than 250 brand pages - lrg image, some fluffy text, maybe a video, links to categories for that brand, etc. In many cases, Google publishes our page title and description in their search results. However, in some cases, Google instead publishes our H1 and the aforementioned fluffy page content. We want our page content to read well, be descriptive of the brand and appropriate for the audience. We want our meta titles and descriptions brief and likely to attract CTR from qualified shoppers. I'm finding this difficult to manage when Google pulls from two different areas inconsistently. So my question... Is there a way to ensure Google only utilizes our title/desc for our listings?
Technical SEO | | websurfer0 -
Why do I get duplicate content errors just for tags I place on blog entries?
I the SEO MOZ crawl diagnostics for my site, www.heartspm.com, I am getting over 100 duplicate content errors on links built from tags on blog entries. I do have the original base blog entry in my site map not referencing the tags. Similarly, I am getting almost 200 duplicate meta description errors in Google Webmaster Tools associated with links automatically generated from tags on my blog. I have more understanding that I could get these errors from my forum, since the forum entries are not in the sitemap, but the blog entries are there in the site map. I thought the tags were only there to help people search by category. I don't understand why every tag becomes its' own link. I can see how this falsely creates the impression of a lot of duplicate data. As seen in GWT: Pages with duplicate meta descriptions Pages [Customer concerns about the use of home water by pest control companies.](javascript:dropInfo('zip_0div', 'none', document.getElementById('zip_0zipimg'), 'none', null);)/category/job-site-requirements/tag/cost-of-water/tag/irrigation-usage/tag/save-water/tag/standard-industry-practice/tag/water-use 6 [Pest control operator draws analogy between Children's Day and the state of the pest control industr](javascript:dropInfo('zip_1div', 'none', document.getElementById('zip_1zipimg'), 'none', null);)/tag/children-in-modern-world/tag/children/tag/childrens-day/tag/conservation-medicine/tag/ecowise-certified/tag/estonia/tag/extermination-service/tag/exterminator/tag/green-thumb/tag/hearts-pest-management/tag/higher-certification/tag/higher-education/tag/tartu/tag/united-states
Technical SEO | | GerryWeitz0 -
404-like content
A site that I look after is having lots of soft 404 responses for pages that are not 404 at all but unique content pages. the following page is an example: http://www.professionalindemnitynow.com/medical-malpractice-insurance-clinics This page returns a 200 response code, has unique content, but is not getting indexed. Any ideas? To add further information that may well impact your answer, let me explain how this "classic ASP" website performs the SEO Friendly url mapping: All pages within the custom CMS have a unique ID which are referenced with an ?intID=xx parameter. The custom 404.asp file receives a request, looks up the ID to find matching content in the CMS, and then server.transfers the visitor to the correct page. Like I said, the response codes are setup correctly, as far as Firebug can tell me. any thoughts would be most appreciated.
Technical SEO | | eseyo20 -
Strange duplicate content issue
Hi there, SEOmoz crawler has identified a set of duplicate content that we are struggling to resolve. For example, the crawler picked up that this page www. creative - choices.co.uk/industry-insight/article/Advice-for-a-freelance-career is a duplicate of this page www. creative - choices.co.uk/develop-your-career/article/Advice-for-a-freelance-career. The latter page's content is the original and can be found in the CMS admin area whilst the former page is the duplicate and has no entry in the CMS. So we don't know where to begin if the "duplicate" page doesn't exist in the CMS. The crawler states that this page www. creative-choices.co.uk/industry-insight/inside/creative-writing is the referrer page. Looking at it, only the original page's link is showing on the referrer page, so how did the crawler get to the duplicate page?
Technical SEO | | CreativeChoices0 -
How to fix 404 (Client Error) errors in wordpress blog?
hey A very quick question...after analyzed my wp blog I've found "34" 404 (Client Error) Errors and I don't know how to fix it, do you know how?? *I renew html code of 404 of my wordpress blog.
Technical SEO | | akitmane1