Duplicate content http:// something .com and http:// something .com/
-
Hi,
I've just got a crawl report for a new WordPress blog with the Suffusion theme and the Yoast WordPress SEO plugin, and it reports duplicate content for:
http:// something .com
and
http:// something .com/
I just can't figure out how to handle this. Can I add a redirect from .com/ to .com in .htaccess?
Any help is appreciated!
By the way, the rel canonical tag value is **http:// something .com/** for both.
-
Also remember the canonicalization advice from Matt Cutts:
SEO advice: url canonicalization
by MATT CUTTS on JANUARY 4, 2006 in GOOGLE/SEO
(I got my power back!)
Before I start collecting feedback on the Bigdaddy data center, I want to talk a little bit about canonicalization, www vs. non-www, redirects, duplicate urls, 302 “hijacking,” etc. so that we’re all on the same page.
Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:
www.example.com
example.com/
www.example.com/index.html
example.com/home.asp
But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.
Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.
Q: Is there anything else I can do?
A: Yes. Suppose you want your default url to be http://www.example.com/ . You can make your webserver so that if someone requests http://example.com/, it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).
Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.
Q: I noticed that you don’t do a 301 redirect on your site from the non-www to the www version, Matt. Why not? Are you stupid in the head?
A: Actually, it’s on purpose. I noticed that several months ago but decided not to change it on my end or ask anyone at Google to fix it. I may add a 301 eventually, but for now it’s a helpful test case.
Q: So when you say www vs. non-www, you’re talking about a type of canonicalization. Are there other ways that urls get canonicalized?
A: Yes, there can be a lot, but most people never notice (or need to notice) them. Search engines can do things like keeping or removing trailing slashes, trying to convert urls with upper case to lower case, or removing session IDs from bulletin board or other software (many bulletin board software packages will work fine if you omit the session ID).
Q: Let’s talk about the inurl: operator. Why does everyone think that if inurl:mydomain.com shows results that aren’t from mydomain.com, it must be hijacked?
A: Many months ago, if you saw someresult.com/search2.php?url=mydomain.com, that would sometimes have content from mydomain. That could happen when the someresult.com url was a 302 redirect to mydomain.com and we decided to show a result from someresult.com. Since then, we’ve changed our heuristics to make showing the source url for 302 redirects much more rare. We are moving to a framework for handling redirects in which we will almost always show the destination url. Yahoo handles 302 redirects by usually showing the destination url, and we are in the middle of transitioning to a similar set of heuristics. Note that Yahoo reserves the right to have exceptions on redirect handling, and Google does too. Based on our analysis, we will show the source url for a 302 redirect less than half a percent of the time (basically, when we have strong reason to think the source url is correct).
Q: Okay, how about supplemental results. Do supplemental results cause a penalty in Google?
A: Nope.
Q: I have some pages in the supplemental results that are old now. What should I do?
A: I wouldn’t spend much effort on them. If the pages have moved, I would make sure that there’s a 301 redirect to the new location of pages. If the pages are truly gone, I’d make sure that you serve a 404 on those pages. After that, I wouldn’t put any more effort in. When Google eventually recrawls those pages, it will pick up the changes, but because it can take longer for us to crawl supplemental results, you might not see that update for a while.
That’s about all I can think of for now. I’ll try to talk about some examples of 302's and inurl: soon, to help make some of this more concrete.
http://www.ragepank.com/articles/3/preventing-duplicate-content/
Hope I was of help,
Thomas Von Zickell
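As a rough illustration of the 301 redirect described in the quoted advice (non-www to www), a minimal .htaccess sketch might look like the following. It assumes Apache with mod_rewrite enabled and an .htaccess file in the site root; example.com is a placeholder for the real domain, and the roles can be swapped if the non-www hostname is the preferred one.

```apache
# Sketch only: 301-redirect the bare hostname to the www hostname.
# "example.com" is a placeholder; substitute the real domain.
RewriteEngine On
# Match requests whose Host header is the bare domain (case-insensitive).
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# Send a permanent redirect to the same path on the www hostname.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

After adding a rule like this, it is worth requesting a few URLs on both hostnames and confirming that the response is a single 301 to the expected address.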
-
thanks!
Can somebody please also clarify exactly what should go in the second line?
As eyepaq wrote: RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Should I insert something in/after "[%{HTTP_HOST}...]"?
-
After RewriteEngine, if I'm not wrong.
-
Should I keep the existing WordPress rewrite? If I keep it, should I then place your code before or after?
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
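On the ordering question, one common arrangement is to keep the WordPress block as it is and put custom redirect rules above it. The layout below is only a sketch under that assumption. The placeholder comment marks where custom rules would go. The reasoning is that the catch-all `RewriteRule . /index.php [L]` hands almost every request to WordPress, so rules placed after it would typically not be reached for permalink URLs, and WordPress regenerates the text between its BEGIN/END markers when permalink settings are saved.

```apache
# Custom redirect rules go above the WordPress block, outside its markers.
RewriteEngine On
# ... redirect rules (for example, a trailing-slash 301) go here ...

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
```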
-
Hi,
Google is pretty good at understanding that the trailing-slash version is the same as the non-trailing-slash version, so you are safe on that side.
Even if the crawler flags this as an issue, it's not something you should focus on.
However, if you want to play by the book, you can use .htaccess to 301 redirect one version to the other.
Below is some sample code:
#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Hope it helps.
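Since the redirect target in the sample above is cut off, here is a self-contained sketch of a trailing-slash redirect for reference. It is not a reconstruction of the truncated rule: the `/$1` target and the extra directory check are assumptions, and example.com is a placeholder host. It assumes Apache with mod_rewrite and an .htaccess file in the site root.

```apache
# Sketch only: strip a trailing slash with a 301.
# "example.com" is a placeholder host; "/$1" as the target is an assumption,
# not the truncated target from the rule quoted above.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# Leave real directories alone: mod_dir adds the slash back to directory
# URLs, which would otherwise cause a redirect loop.
RewriteCond %{REQUEST_FILENAME} !-d
# (.+) needs at least one character before the slash, so the bare
# homepage "/" never matches this rule.
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Two caveats. First, a rule like this cannot separate http:// something .com from http:// something .com/, because a browser or crawler asking for either one sends the same request for the path "/"; the two are the same URL as far as the server is concerned. Second, if the WordPress permalink structure itself ends in a slash, WordPress will likely redirect back to the slashed URL, so check that setting before removing slashes site-wide.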