Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Duplicate content on URL trailing slash
-
Hello,
Some time ago, we accidentally made changes to our site which modified the way urls in links are generated. At once, trailing slashes were added to many urls (only in links).
Links that used to send to
example.com/webpage.htmlWere now linking to
example.com/webpage.html/Urls in the xml sitemap remained unchanged (no trailing slash).
We started noticing duplicate content (because our site renders the same page with or without the trailing shash). We corrected the problematic php url function so that now, all links on the site link to a url without trailing slash.
However, Google had time to index these pages. Is implementing 301 redirects required in this case?
-
Yes you want to have it match the canonical tag so most effective method is to 301 redirect so they match the canonical tag site map and robots.txt etc. You can use a Regex code like this at the end of the URL /?$ in the case of category URLs it will allow them when needed.
if you use the proper 301 you will not have to deal with the category issue anyway.
rel="canonical" href="https://moz.com/community/q/duplicate-content-on-url-trailing-slash" />
I hope this is able to shed more light on the issue and great answer Eric.
Hope I was of help,
Tom
-
Hi Eric,
I was at Step 3 of your 3 Step plan, looking for confirmation as to whether or not the 301 redirects were required in this situation.
Thanks!
-
Hi yacpro13! Did Eric or Thomas answer your question, and if so, would you mind marking one or both responses as a "Good Answer?"

Otherwise, what questions do you still have?
-
If you have changed the URLs with trailing slashes, then there are a few things you'll want to do:
-
make sure all the internal links on your site are updated to point to the proper version.
-
make sure that the sitemap.xml file(s) are correct, pointing to the proper version.
-
set up 301 permanent redirects so that the ones with the slash are redirecting to the old URLs.
As long as you have corrected the links internally, updated the sitemap file, and set up the 301 redirects, everything should go "back to normal" within a fairly short period of time. You will need to give it time, though, as Google will need to re-crawl all of those URLs and get it all ironed out.
-
-
I have provided the Apache and Nginx configurations you would need in addition to a URL that will convert
Apache Htaccess to Nginx
The instructions are right here
Remove Trailing Slash
Just like with the WWW example, some prefer to remove the trailing slash. It's a commonly debated question that you'll find around the Internet, but it just depends on what you prefer.
Remember, though, your browser and even your server, by default, add a trailing slash to a directory. It is done for a reason. If you must strip the trailing slash, though, this is how you would do it:
<code class="hljs apache">RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_URI} !(.*)$ RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]</code>For Nginx
nginx configuration location ~ (.)$ { } location / { if (!-e $request_filename){ rewrite ^(.)$ http://www.domain.com/$1 redirect; } }
The explanation for this rule is the same as it is for when we want to add a trailing slash, just in reverse. We can also specify specific directories that we don't want apply this rule over.
<code class="hljs apache">RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_URI} !directory/(.*)$ RewriteCond %{REQUEST_URI} !(.*)$ RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]</code>For Nginx
nginx configuration location ~ directory/(.)$ { } location ~ (.)$ { } location / { if (!-e $request_filename){ rewrite ^(.*)$ http://www.domain.com/$1 redirect; } }
Please see the note about mod_dir and the
DirectorySlashdirective in the previous example. You might need to turn this directive off.HTaccess converter for Apache to Nginx configuration.
http://winginx.com/en/htaccess
https://www.maxcdn.com/one/tutorial/remove-trailing-slash/
https://www.crucialhosting.com/knowledgebase/htaccess-apache-rewrites-examples
https://moz.com/community/q/how-to-remove-trailing-slashes-in-urls-using-htaccess-apache
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to enable lost trailing slash redirection in WordPress with Yoast plugin
Hi, We have lost the non-slash to slash URL redirection in our WP site. We are using Yoast SEO. All the settings are normal and we have enabled the related code in .htaccess too. Still we couldn't able to find why we lost. Please help. Thanks
Intermediate & Advanced SEO | | vtmoz0 -
Duplicate Content through 'Gclid'
Hello, We've had the known problem of duplicate content through the gclid parameter caused by Google Adwords. As per Google's recommendation - we added the canonical tag to every page on our site so when the bot came to each page they would go 'Ah-ha, this is the original page'. We also added the paramter to the URL parameters in Google Wemaster Tools. However, now it seems as though a canonical is automatically been given to these newly created gclid pages; below https://www.google.com.au/search?espv=2&q=site%3Awww.mypetwarehouse.com.au+inurl%3Agclid&oq=site%3A&gs_l=serp.3.0.35i39l2j0i67l4j0i10j0i67j0j0i131.58677.61871.0.63823.11.8.3.0.0.0.208.930.0j3j2.5.0....0...1c.1.64.serp..8.3.419.nUJod6dYZmI Therefore these new pages are now being indexed, causing duplicate content. Does anyone have any idea about what to do in this situation? Thanks, Stephen.
Intermediate & Advanced SEO | | MyPetWarehouse0 -
Case Sensitive URLs, Duplicate Content & Link Rel Canonical
I have a site where URLs are case sensitive. In some cases the lowercase URL is being indexed and in others the mixed case URL is being indexed. This is leading to duplicate content issues on the site. The site is using link rel canonical to specify a preferred URL in some cases however there is no consistency whether the URLs are lowercase or mixed case. On some pages the link rel canonical tag points to the lowercase URL, on others it points to the mixed case URL. Ideally I'd like to update all link rel canonical tags and internal links throughout the site to use the lowercase URL however I'm apprehensive! My question is as follows: If I where to specify the lowercase URL across the site in addition to updating internal links to use lowercase URLs, could this have a negative impact where the mixed case URL is the one currently indexed? Hope this makes sense! Dave
Intermediate & Advanced SEO | | allianzireland0 -
Woocommerce SEO & Duplicate content?
Hi Moz fellows, I'm new to Woocommerce and couldn't find help on Google about certain SEO-related things. All my past projects were simple 5 pages websites + a blog, so I would just no-index categories, tags and archives to eliminate duplicate content errors. But with Woocommerce Product categories and tags, I've noticed that many e-Commerce websites with a high domain authority actually rank for certain keywords just by having their category/tags indexed. For example keyword 'hippie clothes' = etsy.com/category/hippie-clothes (fictional example) The problem is that if I have 100 products and 10 categories & tags on my site it creates THOUSANDS of duplicate content errors, but If I 'non index' categories and tags they will never rank well once my domain authority rises... Anyone has experience/comments about this? I use SEO by Yoast plugin. Your help is greatly appreciated! Thank you in advance. -Marc
Intermediate & Advanced SEO | | marcandre1 -
Avoiding Duplicate Content with Used Car Listings Database: Robots.txt vs Noindex vs Hash URLs (Help!)
Hi Guys, We have developed a plugin that allows us to display used vehicle listings from a centralized, third-party database. The functionality works similar to autotrader.com or cargurus.com, and there are two primary components: 1. Vehicle Listings Pages: this is the page where the user can use various filters to narrow the vehicle listings to find the vehicle they want.
Intermediate & Advanced SEO | | browndoginteractive
2. Vehicle Details Pages: this is the page where the user actually views the details about said vehicle. It is served up via Ajax, in a dialog box on the Vehicle Listings Pages. Example functionality: http://screencast.com/t/kArKm4tBo The Vehicle Listings pages (#1), we do want indexed and to rank. These pages have additional content besides the vehicle listings themselves, and those results are randomized or sliced/diced in different and unique ways. They're also updated twice per day. We do not want to index #2, the Vehicle Details pages, as these pages appear and disappear all of the time, based on dealer inventory, and don't have much value in the SERPs. Additionally, other sites such as autotrader.com, Yahoo Autos, and others draw from this same database, so we're worried about duplicate content. For instance, entering a snippet of dealer-provided content for one specific listing that Google indexed yielded 8,200+ results: Example Google query. We did not originally think that Google would even be able to index these pages, as they are served up via Ajax. However, it seems we were wrong, as Google has already begun indexing them. Not only is duplicate content an issue, but these pages are not meant for visitors to navigate to directly! If a user were to navigate to the url directly, from the SERPs, they would see a page that isn't styled right. Now we have to determine the right solution to keep these pages out of the index: robots.txt, noindex meta tags, or hash (#) internal links. Robots.txt Advantages: Super easy to implement Conserves crawl budget for large sites Ensures crawler doesn't get stuck. After all, if our website only has 500 pages that we really want indexed and ranked, and vehicle details pages constitute another 1,000,000,000 pages, it doesn't seem to make sense to make Googlebot crawl all of those pages. Robots.txt Disadvantages: Doesn't prevent pages from being indexed, as we've seen, probably because there are internal links to these pages. We could nofollow these internal links, thereby minimizing indexation, but this would lead to each 10-25 noindex internal links on each Vehicle Listings page (will Google think we're pagerank sculpting?) Noindex Advantages: Does prevent vehicle details pages from being indexed Allows ALL pages to be crawled (advantage?) Noindex Disadvantages: Difficult to implement (vehicle details pages are served using ajax, so they have no tag. Solution would have to involve X-Robots-Tag HTTP header and Apache, sending a noindex tag based on querystring variables, similar to this stackoverflow solution. This means the plugin functionality is no longer self-contained, and some hosts may not allow these types of Apache rewrites (as I understand it) Forces (or rather allows) Googlebot to crawl hundreds of thousands of noindex pages. I say "force" because of the crawl budget required. Crawler could get stuck/lost in so many pages, and my not like crawling a site with 1,000,000,000 pages, 99.9% of which are noindexed. Cannot be used in conjunction with robots.txt. After all, crawler never reads noindex meta tag if blocked by robots.txt Hash (#) URL Advantages: By using for links on Vehicle Listing pages to Vehicle Details pages (such as "Contact Seller" buttons), coupled with Javascript, crawler won't be able to follow/crawl these links. Best of both worlds: crawl budget isn't overtaxed by thousands of noindex pages, and internal links used to index robots.txt-disallowed pages are gone. Accomplishes same thing as "nofollowing" these links, but without looking like pagerank sculpting (?) Does not require complex Apache stuff Hash (#) URL Disdvantages: Is Google suspicious of sites with (some) internal links structured like this, since they can't crawl/follow them? Initially, we implemented robots.txt--the "sledgehammer solution." We figured that we'd have a happier crawler this way, as it wouldn't have to crawl zillions of partially duplicate vehicle details pages, and we wanted it to be like these pages didn't even exist. However, Google seems to be indexing many of these pages anyway, probably based on internal links pointing to them. We could nofollow the links pointing to these pages, but we don't want it to look like we're pagerank sculpting or something like that. If we implement noindex on these pages (and doing so is a difficult task itself), then we will be certain these pages aren't indexed. However, to do so we will have to remove the robots.txt disallowal, in order to let the crawler read the noindex tag on these pages. Intuitively, it doesn't make sense to me to make googlebot crawl zillions of vehicle details pages, all of which are noindexed, and it could easily get stuck/lost/etc. It seems like a waste of resources, and in some shadowy way bad for SEO. My developers are pushing for the third solution: using the hash URLs. This works on all hosts and keeps all functionality in the plugin self-contained (unlike noindex), and conserves crawl budget while keeping vehicle details page out of the index (unlike robots.txt). But I don't want Google to slap us 6-12 months from now because it doesn't like links like these (). Any thoughts or advice you guys have would be hugely appreciated, as I've been going in circles, circles, circles on this for a couple of days now. Also, I can provide a test site URL if you'd like to see the functionality in action.0 -
Duplicate content on sites from different countries
Hi, we have a client who currently has a lot of duplicate content with their UK and US website. Both websites are geographically targeted (via google webmaster tools) to their specific location and have the appropriate local domain extension. Is having duplicate content a major issue, since they are in two different countries and geographic regions of the world? Any statement from Google about this? Regards, Bill
Intermediate & Advanced SEO | | MBASydney0 -
Copying my Facebook content to website considered duplicate content?
I write career advice on Facebook on a daily basis. On my homepage users can see the most recent 4-5 feeds (using FB social media plugin). I am thinking to create a page on my website where visitors can see all my previous FB feeds. Would this be considered duplicate content if I copy paste the info, but if I use a Facebook social media plugin then it is not considered duplicate content? I am working on increasing content on my website and feel incorporating FB feeds would make sense. thank you
Intermediate & Advanced SEO | | knielsen0 -
How to Remove Joomla Canonical and Duplicate Page Content
I've attempted to follow advice from the Q&A section. Currently on the site www.cherrycreekspine.com, I've edited the .htaccess file to help with 301s - all pages redirect to www.cherrycreekspine.com. Secondly, I'd added the canonical statement in the header of the web pages. I have cut the Duplicate Page Content in half ... now I have a remaining 40 pages to fix up. This is my practice site to try and understand what SEOmoz can do for me. I've looked at some of your videos on Youtube ... I feel like I'm scrambling around to the Q&A and the internet to understand this product. I'm reading the beginners guide.... any other resources would be helpful.
Intermediate & Advanced SEO | | deskstudio0