Duplicate content problem
-
Hi there,
I have a couple of related questions about the crawl report finding duplicate content:
We have a number of pages that feature mostly media - just a picture or just a slideshow - with very little text. These pages are rarely viewed and they are identified as duplicate content even though the pages are indeed unique to the user. Does anyone have an opinion about whether or not we'd be better off to just remove them since we do not have the time to add enough text at this point to make them unique to the bots?
The other question is we have a redirect for any 404 on our site that follows the pattern immigroup.com/news/* - the redirect merely sends the user back to immigroup.com/news. However, Moz's crawl seems to be reading this as duplicate content as well. I'm not sure why that is, but is there anything we can do about this? These pages do not exist, they just come from someone typing in the wrong url or from someone clicking on a bad link. But we want the traffic - after all the users are landing on a page that has a lot of content. Any help would be great!
Thanks very much!
George
-
Thanks Don.
The links are external. We do have a general 404 page, but for whatever reason, when he built this site, the developer guided those specific 404s to that news page.
Thanks again.
-
The alternative he suggested was to delete the page.
Before I removed a page I would canonical it.You can always undo the Canonical once content becomes available.
-
I would only use the Canonical tag if you don't want these pages to rank for anything in particular. When you set the authoritative page, you are eliminating the ranking potential of the non authoritative pages. Is it possible to get user generated content for these pages? Like reviews?
-
Hi George,
To the first issue. Have you considered using a REL=CANONICAL tag on this pages with thin to no content (point to them to the parent category). Barring that option you could also just put a NOINDEX / NOFOLLOW. No sense in removing the page just in case it has a few links out there.
For the 404 page, the main reason I would suspect Moz seeing this as duplicate content would be because of parameters on it. It would make more sense to have a universal 404 Page and let the server handle the error for you and redirect where appropriate. Moz site.. www.thisSite.com/catalog/ImGoingNoWhere.php
404 Page are usually pretty simple to setup content wise. Your second goal would be to remove all links on your site that would return a 404.
-
Hi there,
If you have a lot of HTML coding on pages, with no content, your HTML coding is what is triggering the duplicate content. Because you have no content, the HTML is all the bots see and if it is really similar, there is nothing to tell them that there is anything unique on the pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Issues: Duplicate Content
Hi there
Technical SEO | | Kingagogomarketing
Moz flagged the following content issues, the page has duplicate content and missing canonical tags.
What is the best solution to do? Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/industrial-flooring/ Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring/0 -
Query Strings causing Duplicate Content
I am working with a client that has multiple locations across the nation, and they recently merged all of the location sites into one site. To allow the lead capture forms to pre-populate the locations, they are using the query string /?location=cityname on every page. EXAMPLE - www.example.com/product www.example.com/product/?location=nashville www.example.com/product/?location=chicago There are thirty locations across the nation, so, every page x 30 is being flagged as duplicate content... at least in the crawl through MOZ. Does using that query string actually cause a duplicate content problem?
Technical SEO | | Rooted1 -
Development Website Duplicate Content Issue
Hi, We launched a client's website around 7th January 2013 (http://rollerbannerscheap.co.uk), we originally constructed the website on a development domain (http://dev.rollerbannerscheap.co.uk) which was active for around 6-8 months (the dev site was unblocked from search engines for the first 3-4 months, but then blocked again) before we migrated dev --> live. In late Jan 2013 changed the robots.txt file to allow search engines to index the website. A week later I accidentally logged into the DEV website and also changed the robots.txt file to allow the search engines to index it. This obviously caused a duplicate content issue as both sites were identical. I realised what I had done a couple of days later and blocked the dev site from the search engines with the robots.txt file. Most of the pages from the dev site had been de-indexed from Google apart from 3, the home page (dev.rollerbannerscheap.co.uk, and two blog pages). The live site has 184 pages indexed in Google. So I thought the last 3 dev pages would disappear after a few weeks. I checked back late February and the 3 dev site pages were still indexed in Google. I decided to 301 redirect the dev site to the live site to tell Google to rank the live site and to ignore the dev site content. I also checked the robots.txt file on the dev site and this was blocking search engines too. But still the dev site is being found in Google wherever the live site should be found. When I do find the dev site in Google it displays this; Roller Banners Cheap » admin <cite>dev.rollerbannerscheap.co.uk/</cite><a id="srsl_0" class="pplsrsla" tabindex="0" data-ved="0CEQQ5hkwAA" data-url="http://dev.rollerbannerscheap.co.uk/" data-title="Roller Banners Cheap » admin" data-sli="srsl_0" data-ci="srslc_0" data-vli="srslcl_0" data-slg="webres"></a>A description for this result is not available because of this site's robots.txt – learn more.This is really affecting our clients SEO plan and we can't seem to remove the dev site or rank the live site in Google.Please can anyone help?
Technical SEO | | SO_UK0 -
How different should content be so that it is not considered duplicate?
I am making a 2nd website for the same company. The name of the company, our services, keywords and contact info will show up several times within the text of both websites. The overall text and paragraphs will be different but some info may be repeated on both sites. Should I continue this? What precautions should I take?
Technical SEO | | savva0 -
Duplicate Footer Content
A client I just took over is having some duplicate content issues. At the top of each page he has about 200 words of unique content. Below this is are three big tables of text that talks about his services, history, etc. This table is pulled into the middle of every page using php. So, he has the exact same three big table of text across every page. What should I do to eliminate the dup content. I thought about removing the script then just rewriting the table of text on every page... Is there a better solution? Any ideas would be greatly appreciated. Thanks!
Technical SEO | | BigStereo0 -
If two websites pull the same content from the same source in a CMS, does it count as duplicate content?
I have a client who wants to publish the same information about a hotel (summary, bullet list of amenities, roughly 200 words + images) to two different websites that they own. One is their main company website where the goal is booking, the other is a special program where that hotel is featured as an option for booking under this special promotion. Both websites are pulling the same content file from a centralized CMS, but they are different domains. My question is two fold: • To a search engine does this count as duplicate content? • If it does, is there a way to configure the publishing of this content to avoid SEO penalties (such as a feed of content to the microsite, etc.) or should the content be written uniquely from one site to the next? Any help you can offer would be greatly appreciated.
Technical SEO | | HeadwatersContent0 -
Whats with the backslash in the url adding as duplicate content?
Is this a bug or something that needs to be addressed? If so, just use a redirect?
Technical SEO | | Boogily0 -
Duplicate content?
I have a question regarding a warning that I got on one of my websites, it says Duplicate content. I'm canonical url:s and is also using blocking Google out from pages that you are warning me about. The pages are not indexed by Google, why do I get the warnings? Thanks for great seotools! 3M5AY.png
Technical SEO | | bnbjbbkb0