WordPress Duplicate Content
-
We recently moved our company's blog to WordPress on a subdomain (we use the Yoast SEO plugin). We are now experiencing an ever-growing volume of crawl errors (nearly 300 4xx errors now) for pages that never existed in the first place. I believe it may have something to do with having the blog on a subdomain and/or the Yoast SEO plugin's archive indexation settings (author, category, etc.) --- we currently have subpages of archives and taxonomies, and category archives in use.
I'm not as familiar with WordPress and the Yoast SEO plugin as I am with other CMSes, so any help in this matter would be greatly appreciated. I can PM further info if necessary. Thank you in advance for the help.
-
But of course! You're welcome and thanks for the assistance!
-Marty
-
Great Marty! Thanks for letting us know, and glad you got it sorted out.
-Dan
-
Thank you both for your responses! I was actually able to figure out the issue on my own, but I appreciate all the helpful advice. All of our redirects from the previous blog domain were added by hand and work perfectly, and we are unable to use .htaccess with our servers (quite annoying, believe me). But I greatly appreciate that advice, Ben; I'm sure it will help someone else with this issue.
The issue that was causing all the errors was our relative path structure on the root domain. When moving the blog to the subdomain, we accidentally left four links in the footer as relative paths instead of absolute ones. As a result, bots were attempting to reach the root domain from the subdomain through those relative paths, which in turn created multiple 404 pages for every blog page.
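(Editor's note, not from the thread:) The root cause Marty describes is easy to reproduce, since a relative path resolves against whatever host the current page is on. A minimal Python sketch with hypothetical URLs and a hypothetical footer path:

```python
from urllib.parse import urljoin

# Hypothetical footer link left as a relative path after the move.
footer_link = "/about-us"

# The same markup resolves differently depending on the page it is crawled from.
on_root = urljoin("http://example.com/products", footer_link)
on_blog = urljoin("http://blog.example.com/2013/05/some-post/", footer_link)

print(on_root)  # http://example.com/about-us  (the intended target)
print(on_blog)  # http://blog.example.com/about-us  (page does not exist -> 404)
```

Because the footer appeared on every blog page, each crawled post generated its own batch of these broken subdomain URLs.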
I appreciate the help guys. Screaming Frog, SEO Moz, and GWT definitely all helped on this one.
Thanks!
-
Marty
Did you move to the subdomain and switch to Yoast at the same time? Or is the WordPress setup essentially the same, and all you did was switch to the subdomain?
If you were already using Yoast before the switch, have you changed settings, or did those stay the same too?
Are the crawl errors happening in the Moz tools? Google Webmaster Tools? Can you confirm by manually trying to visit the URLs?
Lastly, when you say "pages that do not exist to begin with" - do they still not exist? Are they at all similar to pages that do exist?
Sorry for all the questions, just trying to nail it down for you and also see if Ben has answered it.
-Dan
-
If you moved the site into a subdomain then all the links that used to point to the old blog (that wasn't on a subdomain) won't work.
You need to add a .htaccess file to the root of your website and put in redirects for broken links. Something like the following should work:
<code>
Options -Indexes +FollowSymLinks
RewriteEngine On
RewriteBase /

# Canonicalize the bare domain to www
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

# Send old /blog/ URLs to the new subdomain
RedirectMatch 301 ^/blog/(.*)$ http://blog.example.com/$1
</code>
This will redirect the old blog links to the subdomain, which helps Google understand that the pages have moved. The whole point of 301 redirects (if you don't already know) is to ensure your pages retain their PageRank when you change your site structure. It's been said that you lose a little PageRank through a 301 redirect from the old location to the new one, but that's better than Google assuming the page has been removed from your site; in that case Google will drop the page from its index and you can wave goodbye to that page's good search position.
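(Editor's note, not part of Ben's answer:) the RedirectMatch rule is just a regex rewrite, so before deploying it you can mirror it in a few lines of Python to sanity-check where old blog URLs will land. The paths below are hypothetical:

```python
import re

# Mirrors `RedirectMatch 301 ^/blog/(.*)$ http://blog.example.com/$1`
OLD_BLOG = re.compile(r"^/blog/(.*)$")

def map_old_blog_url(path):
    """Return the new subdomain URL for an old /blog/ path, or None if the rule does not fire."""
    m = OLD_BLOG.match(path)
    return "http://blog.example.com/" + m.group(1) if m else None

print(map_old_blog_url("/blog/2013/05/site-move/"))  # maps to the subdomain
print(map_old_blog_url("/products/widget"))          # None: unaffected path
```

Running each old blog URL from your sitemap or crawl export through a check like this catches mapping mistakes before Google sees a redirect chain or a 404.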
I hope this helps, if you need me to clarify anything let me know.
Ben
Related Questions
-
Backup Server causing duplicate content flag?
Hi, Google is indexing pages from our backup server. Is this a duplicate content issue? There are essentially two versions of our entire domain indexed by Google. How do people typically handle this? Any thoughts are appreciated. Thanks, Yael
Intermediate & Advanced SEO | yaelslater0
-
SEO for video content that is duplicated accross a larger network
I have a website with lots of content (high-quality video clips for a particular niche). All the content gets fed out to 100+ other sites on various domains/subdomains which are reskinned for a given city, so the content on these other sites is 100% duplicate. I still want to generate SEO traffic, though. So my thought is that we: a) need to have canonical tags on all the other domains/subdomains that point back to the original post on the main site, and b) probably need to disallow search engine crawlers on all the other domains/subdomains. Is this on the right track? Am I missing anything important related to duplicate content? The idea is that after we get search engines crawling the content correctly, from there we'd use the IP address to redirect the visitor to the best-suited domain/subdomain. Any thoughts on that approach? Thanks for your help!
Intermediate & Advanced SEO | PlusROI0
-
Removing duplicate content
Due to URL changes and parameters on our ecommerce sites, we have a massive amount of duplicate pages indexed by google, sometimes up to 5 duplicate pages with different URLs. 1. We've instituted canonical tags site wide. 2. We are using the parameters function in Webmaster Tools. 3. We are using 301 redirects on all of the obsolete URLs 4. I have had many of the pages fetched so that Google can see and index the 301s and canonicals. 5. I created HTML sitemaps with the duplicate URLs, and had Google fetch and index the sitemap so that the dupes would get crawled and deindexed. None of these seems to be terribly effective. Google is indexing pages with parameters in spite of the parameter (clicksource) being called out in GWT. Pages with obsolete URLs are indexed in spite of them having 301 redirects. Google also appears to be ignoring many of our canonical tags as well, despite the pages being identical. Any ideas on how to clean up the mess?
Intermediate & Advanced SEO | AMHC0
-
Magento products and eBay - duplicate content risk?
Hi, We are selling about 1000 sticker products in our online store and would like to expand a large part of our product lineup to eBay as well. There are pretty good modules for this, as I've heard. I'm just wondering if there will be duplicate content problems if I sync the products between Magento and eBay and they get uploaded to eBay with identical titles, descriptions and images? What's the workaround in this case? Thanks!
Intermediate & Advanced SEO | speedbird12290
-
Duplicate content on yearly product models.
TL;DR - Is creating a page where 80% of the content is duplicated from the past year's product model and 20% covers the new model changes going to cause duplicate content problems? Is there a better way to handle minor yearly model changes without duplicated content? Full Question - We create landing pages for yearly products. Some years the models change drastically, and other years there are only a few minor changes. The years where the product features change significantly are not an issue; it's when there isn't much change to the product description and I still want to rank for the new year's searches. Since I don't want duplicate content from just adding last year's model content to a new page and changing the year (2013 to 2014), I thought perhaps we could write a small paragraph describing the changes and then include last year's description of the product. Since 80% of the content on the page would be duplicated from last year's model, how detrimental do you think this would be as a duplicate content issue? The reason I'm leaving the old model up is to maintain the authority that page has and to still rank for the old model, which is still sold. Does anyone have a better idea than rewriting the same information in a different way with the few minor changes to the product added in?
Intermediate & Advanced SEO | DCochrane0
-
Duplicate WordPress Home Page Issue
I have an issue where I've created a site, (www.tntperformance805.com), using WordPress as a CMS. I enabled the option to use a static page as the home page, and created that page as /home. Well, now the issue is that Google is indexing both www.tntperformance805.com and www.tntperformance805.com/home/. I've already set up a 301 redirect pointing /home/ to the main domain, and even have rel=canonical set up automatically, pointing every page to the www version of that particular page. However, Google Webmaster Tools is still reporting the pages as having duplicate page titles and descriptions. I've even had the page removed from Google's cache and index. I'm assuming Google is not honoring the 301 redirect, even though it's set up properly. Should I add <link rel="canonical" href="http://www.tntperformance805.com" /> to the /home/ page, to ensure that it gives credit to the main domain? I am assuming the page is only redirecting to , as that's the www version, but I thought the 301 redirect would ensure the search engines give all credit to the main domain. Thanks in advance for the help, everyone. I look forward to some insightful feedback. Best Regards, Matt Dimock
Intermediate & Advanced SEO | National-Positions0
-
Duplicate content - canonical vs link to original and Flash duplication
Here's the situation for the website in question: The company produces printed publications which go online as a page-turning Flash version and as a separate HTML version. To complicate matters, some of the articles from the publications get added to a separate news section of the website. We want to promote the news section of the site over the publications section. If we were to forget the Flash version completely, would you: a) add a canonical in the publication version pointing to the version in the news section? b) add a link in the footer of the publication version pointing to the version in the news section? c) both of the above? d) something else? What if we add the Flash version into the mix? As Flash still isn't as crawlable as HTML, should we noindex the Flash pages? Is HTML content duplicated in Flash as big an issue as HTML-to-HTML duplication?
Intermediate & Advanced SEO | Alex-Harford0
-
How are they avoiding duplicate content?
One of the largest stores in USA for soccer runs a number of whitelabel sites for major partners such as Fox and ESPN. However, the effect of this is that they are creating duplicate content for their products (and even the overall site structure is very similar). Take a look at: http://www.worldsoccershop.com/23147.html http://www.foxsoccershop.com/23147.html http://www.soccernetstore.com/23147.html You can see that practically everything is the same including: product URL product title product description My question is, why is Google not classing this as duplicate content? Have they coded for it in a certain way or is there something I'm missing which is helping them achieve rankings for all sites?
Intermediate & Advanced SEO | ukss19840