Wordpress Duplicate Content
-
We have recently moved our company's blog to Wordpress on a subdomain (we utilize the Yoast SEO plugin). We are now experiencing an ever-growing volume of crawl errors (nearly 300 4xx now) for pages that do not exist to begin with. I believe it may have something to do with having the blog on a subdomain and/or our yoast seo plugin's indexation archives (author, category, etc) --- we currently have Subpages of archives and taxonomies, and category archives in use.
I'm not as familiar with Wordpress and the Yoast SEO plugin as I am with other CMS' so any help in this matter would be greatly appreciated. I can PM further info if necessary. Thank you for the help in advance.
-
But of course! You're welcome and thanks for the assistance!
-Marty
-
Great Marty! Thanks for letting us know, and glad you got it sorted out.
-Dan
-
Thank you both for your responses! I was actually able to figure out the issue on my own, but I appreciate all the helpful advice. All of our redirects from the past blog domain work perfectly and were added by hand, and we are unable to use .htaccess with our servers (quite annoying believe me). But I greatly appreciate that advice Ben; I'm sure it will help someone with this issue.
The issue that was causing all the errors was our relative path structure on the root domain. When moving the blog to the subdomain we accidentally left 4 links in the footer as relative paths instead of absolute. Therefore the bot were attempting to access the root from the subdomain through those relative paths, which in-turn created multiple 404 pages for every blog page.
I appreciate the help guys. Screaming Frog, SEO Moz, and GWT definitely all helped on this one.
Thanks!
-
Marty
Did you both move to the subdomain and switch to Yoast at the same time. Or is the WordPress setup essentially the same, and all you did is switch to the subdomain?
If you were already using Yoast before the switch, have you changed settings, or did those stay the same too?
Are the crawl errors happening in the Moz tools? Google Webmaster Tools? Can you confirm by manually trying to visit the URLs?
Lastly, when you say "pages that do not exist to begin with" - do they still not exit? Are they at all similar to pages that do exist?
Sorry for all the questions, just trying to nail it down for you and also see if Ben has answered it.
-Dan
-
If you moved the site into a subdomain then all the links that used to point to the old blog (that wasn't on a subdomain) won't work.
You need to add a .htaccess file to the root of your website and put in redirects for broken links. Something like the following should work:
<code>Options -Indexes +FollowSymLinks RewriteEngine On RewriteBase / RewriteCond %{HTTP_HOST} ^example.com [NC] RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301] RedirectMatch 301 ^/blog/(.*)$ http://blog.example.com/$1</code>
This will basically redirect the old links for your blog to the subdomain, which will help Google know that the pages have moved. The whole point of 301 redirects (if you don't already know) is to ensure your pages retain their page rank if you change your site structure. Now its been said that you lose some page rank using a 301 redirect from the old location to the new location, but that's better than Google assuming the page has been removed from your site as this would mean Google will remove the site from its index and you can wave goodbye to that page's good search position.
I hope this helps, if you need me to clarify anything let me know.
Ben
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Defining duplicate content
If you have the same sentences or paragraphs on multiple pages of your website, is this considered duplicate content and will it hurt SEO?
Intermediate & Advanced SEO | | mnapier120 -
Please provide solution for my website? Duplicate content Problem
I have 2 Domains with the same name with same content. How to solve that problem? Do I need to change the content from my main website. My Hosting is having different plans, but with the same features. So many pages were having the same content, and it is not possible to change the content, what is the solution for that? Please let me know how to solve that issue?
Intermediate & Advanced SEO | | Alexa.Hill0 -
Is This Considered Duplicate Content?
My site has entered SEO hell and I am not sure how to fix it. Up until 18 months ago I had tremendous success on Google and Bing and now my website appears below my Facebook page for the term "Direct Mail Raleigh." What makes it even more frustrating is my competitors have done no SEO and they are dominating this keyword. I thought that the issue was due to harmful inbound links and two months ago I disavowed ones that were clearly spam. Somehow my site has actually gone down! I have a blog that I have updated infrequently and I do not know if it I am getting punished for duplicate content. On Google Webmaster Tools it says I have 279 crawled and indexed pages. Yesterday when I ran the MOZ crawl check I was amazed to find 1150 different webpages on my site. Despite the fact that it does not appear on the webmaster tools I have three different webpages due to the format that the Wordpress blog was created: "http://www.marketplace-solutions.com/report/part2leadershi/", "http://www.marketplace-solutions.com/report/page/91/" and "http://www.marketplace-solutions.com/report/category/competent-leadership/page/3/" What does not make sense to me is why Google only indexed 279 webpages AND why MOZ did not identify these three webpages as duplicate content with the Crawl Test Tool. Does anyone have any ideas? Would it be as easy as creating a massive robot.txt file and just putting 2 of the 3 URLs in that file? Thank you for your help.
Intermediate & Advanced SEO | | DR700950 -
Implications of posting duplicate blog content on external domains?
I've had a few questions around the blog content on our site. Some of our vendors and partners have expressed interest in posting some of that content on their domains. What are the implications if we were to post copies of our blog posts on other domains? Should this be avoided or are there circumstances that this type of program would make sense?
Intermediate & Advanced SEO | | Visier1 -
3rd Party hosted whitepapers — bad idea? Duplicate content?
It is common the B2B world to have 3rd parties host your whitepapers for added exposure. Is this a bad practice from an SEO point of view? Is the expectation that the 3rd parties use rel=canonical tags? I doubt most of them do . . .
Intermediate & Advanced SEO | | BlueLinkERP0 -
Duplicate Page Title/Content Issues on Product Review Submission Pages
Hi Everyone, I'm very green to SEO. I have a Volusion-based storefront and recently decided to dedicate more time and effort into improving my online presence. Admittedly, I'm mostly a lurker in the Q&A forum but I couldn't find any pre-existing info regarding my situation. It could be out there. But again, I'm a noob... So, in my recent SEOmoz report I noticed that over 1,000 Duplicate Content Errors and Duplicate Page Title Errors have been found since my last crawl. I can see that every error is tied to a product in my inventory - specifically each product page has an option to write a review. It looks like the subsequent page where a visitor can fill out their review is the stem of the problem. All of my products are shown to have the same issue: Duplicate Page Title - Review:New Duplicate Page Content - the form is already partially filled out with the corresponding product My first question - It makes sense that a page containing a submission form would have the same title and content. But why is it being indexed, or crawled (or both for that matter) under every parameter in which it could be accessed (product A, B, C, etc)? My second question (an obvious one) - What can I do to begin to resolve this? As far as I know, I haven't touched this option included in Volusion other than to simply implement it. If I'm missing any key information, please point me in the right direction and I'll respond with any additional relevant information on my end. Many thanks in advance!
Intermediate & Advanced SEO | | DakotahW0 -
Duplicate content mess
One website I'm working with keeps a HTML archive of content from various magazines they publish. Some articles were repeated across different magazines, sometimes up to 5 times. These articles were also used as content elsewhere on the same website, resulting in up to 10 duplicates of the same article on one website. With regards to the 5 that are duplicates but not contained in the magazine, I can delete (resulting in 404) all but the highest value of each (most don't have any external links). There are hundreds of occurrences of this and it seems unfeasible to 301 or noindex them. After seeing how their system works I can canonical the remaining duplicate that isn't contained in the magazine to the corresponding original magazine version - but I can't canonical any of the other versions in the magazines to the original. I can't delete the other duplicates as they're part of the content of a particular issue of a magazine. The best thing I can think of doing is adding a link in the magazine duplicates to the original article, something along the lines of "This article originally appeared in...", though I get the impression the client wouldn't want to reveal that they used to share so much content across different magazines. The duplicate pages across the different magazines do differ slightly as a result of the different Contents menu for each magazine. Do you think it's a case of what I'm doing will be better than how it was, or is there something further I can do? Is adding the links enough? Thanks. 🙂
Intermediate & Advanced SEO | | Alex-Harford0 -
Accepting RSS feeds. Does it = duplicate content?
Hi everyone, for a few years now I've allowed school clients to pipe their news RSS feed to their public accounts on my site. The result is a daily display of the most recent news happening on their campuses that my site visitors can browse. We don't republish the entire news item; just the headline, and the first 150 characters of their article along with a Read more link for folks to click if they want the full story over on the school's site. Each item has it's own permanent URL on my site. I'm wondering if this is a wise practice. Does this fall into the territory of duplicate content even though we're essentially providing a teaser for the school? What do you think?
Intermediate & Advanced SEO | | peterdbaron0